ML Platform Engineer

San Mateo, CA

At Roam, our mission is to dramatically improve the health of the world’s population by bringing ever-more complete knowledge to patients, providers, professionals, and companies.

Roam is developing a machine learning and data platform that powers rich analysis of patient journeys and the factors affecting treatment decisions. Analysis built on this platform enables life sciences organizations and healthcare providers to better leverage large, disparate data sources to identify and improve patterns of care.

The Roam platform is powered by machine learning and a proprietary data asset we call the Health Knowledge Graph. The Health Knowledge Graph converts billions of disparate, often unstructured, data elements into a coherent picture of healthcare. The relationships and information captured in the Graph are continuously enriched using machine learning and natural language processing to extract more information, and by making connections to new data sources. The result is an unprecedented, comprehensive view of the healthcare industry that allows life sciences companies to follow information instead of instincts when seeking to improve patient outcomes.

We need a high degree of flexibility in how we ingest and store data. Our ideal candidate is an accomplished data engineer with experience in ingesting, processing and warehousing large amounts of data. They have a burgeoning passion for machine learning and experience developing infrastructure to enable easy feature engineering.


    • 4+ years of experience using Python at a SaaS company (Scala is a plus!)
    • Strong grasp of object-oriented design, computer science, and programming fundamentals
    • Experience building data and machine learning pipelines (Spark)
    • Experience working with big data stores like Elasticsearch, MongoDB, RDBMS, HDFS, Cassandra, Neo4j (Graph DB experience is preferred)
    • Experience using workflow management software like Airflow
    • Experience using CI systems like Jenkins or CircleCI and Docker-based development is a huge plus!