Data Engineer

San Mateo, CA
Engineering
Full-Time
The pace of innovation in cancer treatment has accelerated dramatically in recent years, with new breakthroughs every day and over 600 cancer drugs now on the market. And yet, the technology doctors use to take advantage of these innovations hasn’t evolved in decades. In fact, most doctors still use a software system that was designed in the 1990s for billing and compliance to make life-altering decisions about care.

At Project Ronin, we’re out to change this - fast. Our mission is to dramatically improve cancer care by giving doctors and patients the tools they need to make better decisions about treatment. We’re developing a cancer intelligence platform that provides all the information physicians need, in one place, to assess patient care options and take action. We believe that this technology will allow for truly individualized care and will have an immediate impact on quality of life and survival rates.

We’re building a team of highly motivated, passionate individuals to help us pioneer this new approach to cancer care. Nearly every person will be touched by cancer at some point in their lives, so the potential for our collective impact is vast.

Will you join us?

As one of the first engineers to join the Data Platform Team, you will develop complex distributed streaming systems to ingest and process large volumes of varied data arriving from and delivered to our partner healthcare systems. Additionally, you will develop the Data Insights Platform, which will power ML workloads, in partnership with the Data Science team. Our stack is Python, Kafka, MySQL, Oracle ADWC and Cassandra. You will learn about and build healthcare knowledge graphs that will be central to improving outcomes for cancer patients.

What You Will Do:

    • Develop distributed streaming data systems, services and frameworks to address high-volume complex data collection, processing, transformation, ingestion and reporting.
    • Develop data models, fixture data and multi-stage distributed processing code for the models.
    • Write code and unit tests in Python and conduct code reviews.
    • Drive continuous improvements by taking ownership of technical aspects of software development and identifying opportunities to adopt innovative methods and technologies.
    • Partner with peers to collaboratively build software solutions that address users' pain points.

What We're Looking For:

    • B.S. in Computer Science or 5+ years of experience delivering production-quality software
    • Expert in Python, Ruby or Java; expert in SQL
    • Expert-level skills in building products using distributed technologies
    • Strong and demonstrable experience with more than one of:
        • Relational Stores (e.g., Postgres, MySQL, Oracle)
        • Columnar or NoSQL Stores (Oracle ADWC, Redshift, Cassandra, DynamoDB)
        • Distributed / Async Processing (Apache Spark, Apache Storm, Celery, Sidekiq)
        • Distributed Queues (Apache Kafka, Kinesis, RabbitMQ)
    • Experience working with Oracle OCI, AWS or similar cloud platform technologies
    • Experience with data science or machine learning, especially supervised ML algorithms, clustering, or natural language processing preferred
    • Knowledge of healthcare datasets and data formats preferred
    • Experience working in a regulated industry (e.g., healthcare or finance) preferred
    • Experience with hierarchical, relational and unnormalized data formats preferred

What We Offer:

Our goal is to remove as many obstacles as we can so you are able to do the best work of your life. We offer the following benefits to help you do that:

- Opportunity to make an enormous impact on hundreds of millions of lives, while growing your career
- A team that is passionate about achieving our mission and each other’s success
- Medical, dental, and vision benefits
- 401(k)
- Commuter stipend
- Quarterly learning stipend
- Phenomenal location within walking distance of the San Mateo Caltrain station
- No-meeting Tuesdays for Engineering