ML Platform Engineer
San Mateo, CA
At Roam, our mission is to dramatically improve the health of the world’s population by bringing ever-more complete knowledge to patients, providers, professionals, and companies.
Roam’s machine learning and data platform powers rich analysis of patient journeys to reveal the factors affecting treatment decisions and outcomes. Analysis built on this platform enables life sciences organizations and health care providers to better leverage large, disparate data sources to identify and improve patterns of care.
The Roam platform is powered by machine learning and a proprietary data asset we call the Health Knowledge Graph. The Health Knowledge Graph converts billions of disparate, often unstructured, data elements into a coherent picture of healthcare. The relationships and information captured in the Graph are continuously enriched using machine learning and natural language processing to extract more information, and by making connections to new data sources. The result is a comprehensive view of the healthcare industry that allows life sciences companies to follow information instead of instincts when seeking to improve patient outcomes.
Roam is looking for an enthusiastic, driven, and accomplished data engineer with experience building large-scale data processing and machine learning platforms. Our ideal candidate excels when given autonomy and is excited by novel technical problems that regularly challenge their abilities. In this role, you will help architect and develop diverse facets of the Health Knowledge Graph and our data infrastructure to facilitate feature engineering, predictive modeling, and responsive web applications.
- 4+ years of experience using Python at a SaaS company
- 2+ years of experience building high-throughput data pipelines using technologies like Spark
- Strong grasp of object-oriented design, computer science, and large-scale data processing
- Extensive experience working with and setting up different big data stores such as Elasticsearch, MongoDB, relational databases, HDFS, Cassandra, and Neo4j
- Experience with Docker-based development
- Exposure to workflow management software like Airflow
- Ability and comfort serving as a subject matter expert responsible for making key architectural and infrastructure decisions
- Experience with graph databases is a plus
- Experience with Scala or Kafka is a plus
- Experience using continuous integration systems like Jenkins or CircleCI is a plus
- Familiarity with Amazon Web Services is a plus