Senior Data Engineer (I, II, III)

South San Francisco, CA
Engineering – Software /
Full Time /
Hybrid
We are seeking an experienced Senior Data Engineer who will contribute to developing our advanced cell therapy manufacturing platform. This role will lead the design, development, and management of scalable data pipelines and architecture.

The ideal candidate has extensive experience in data engineering, a strong understanding of modern data architectures, and deep knowledge of big data technologies. This role involves working with large datasets, ensuring data quality, and collaborating with cross-functional stakeholders across the organization to deliver data-driven outcomes for our cell therapy business.

We're looking for someone who thrives in a fast-paced, mission-driven environment, is comfortable wearing multiple hats, and is ready to tackle diverse challenges as our company grows.

Responsibilities

    • Architect, design, and implement data pipelines and infrastructure that support data analytics and AI solutions, using Microsoft Azure and Databricks
    • Collaborate with cross-functional teams to understand project requirements, translate them into technical specifications, and develop scalable and efficient data solutions
    • Build and maintain data ingestion, transformation, and storage systems using Azure services such as Azure Database for PostgreSQL, Azure Blob Storage, and Azure Databricks, ensuring data quality, reliability, and security
    • Work closely with cell therapy process developers, manufacturing operators, and scientists to preprocess and prepare data for machine learning models, performing feature engineering, data augmentation, and exploratory data analysis as needed
    • Implement monitoring, logging, and alerting mechanisms to track data pipeline performance and ensure timely identification and resolution of issues
    • Document data engineering processes, workflows, and best practices, and provide guidance and support to project team members as needed

Requirements

    • Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field with at least 6 years of related industry experience
    • Proven experience in data engineering, with a focus on building data pipelines and infrastructure in cloud environments, preferably on Microsoft Azure
    • Strong proficiency in Azure services such as Azure Databricks, Azure Database for PostgreSQL, Azure Table Storage, and Azure Cosmos DB
    • Proficiency in managing and optimizing Elasticsearch clusters and in integrating data with them
    • Experience with data preprocessing, feature engineering, and data modeling techniques, particularly in the context of data analytics, machine learning, and AI applications
    • Proficiency in programming languages such as Python or C#, with experience in developing scalable and efficient data processing code
    • Expertise with data visualization tools such as Microsoft Power BI and Tableau
    • Familiarity with data governance, compliance, and security best practices, especially in regulated industries such as medical devices, cell therapy, bioprocessing, or instrumentation
    • Excellent problem-solving and analytical skills, with the ability to troubleshoot complex data engineering issues and optimize performance
    • Effective communication and collaboration skills, with the ability to work in cross-functional teams and interact with stakeholders at all levels
    • Comfortable working in a fast-paced startup environment with minimal direction and changing priorities
    • Strong problem-solving ability and a team-player mindset
    • Self-awareness, integrity, authenticity, and a growth mindset
    • This will be a full-time onsite position in South San Francisco or Chicago
$90,000 - $210,000 a year

Cellares' total compensation package includes competitive base salaries, highly subsidized medical, dental, and vision plans, 401(k) matching, free EV charging, onsite lunches, and stock options. All displayed pay ranges are approximate, negotiable, and location dependent.