ML Engineer

San Mateo, CA
Engineering
Full-time
About Roam

The modern healthcare system generates enormous quantities of diverse, disconnected data. These data sets present substantial analytic challenges, but can also illuminate new avenues of inquiry that yield unprecedented improvements in global health. Roam is realizing this potential by combining our proprietary data platform with advanced machine learning, empowering life sciences companies, hospital systems, insurers, and governments to make data-driven decisions that improve patient outcomes and guide innovation.

The Roam Health Knowledge Graph is the foundation of Roam's data platform and is central to all our applications. This pre-built data ontology brings the world's vast healthcare information together using a patent-pending graph architecture that structures the data while embracing the uncertainty inherent in health datasets.

Our clients generate insight from this data platform through an application suite we’ve engineered to facilitate efficient, iterative analysis of patient-level data at scale and with unprecedented depth. Analysis performed within the Roam ecosystem bypasses the inefficient data integration processes currently required to modify an existing research question or address a new one. Roam's technologies have been used to improve drug development, bring new drugs to market, demonstrate value to payors, and compute real-world outcomes. These sample use cases, though distinct, all bring Roam closer to achieving our mission: leverage artificial intelligence to bring about sustainable and affordable improvements in patient health.

About the Role

Roam builds web applications that enable healthcare domain specialists to leverage machine learning algorithms and other data-driven tools to make sense of complex data. Our machine learning engineers contribute to every step of this process: intelligently integrating data, working alongside clinical data scientists to design features, building the infrastructure that trains and selects models, and helping to design interfaces that facilitate meaningful interaction with the models for non-ML specialists.

Our machine learning team has excellent academic credentials from top universities, and we support team members who wish to continue to publish research. You’ll work with academics who regularly publish at top venues in machine learning, natural language processing and bioinformatics.

We expect all of our machine learning engineers to meet the ‘required qualifications’ below, and are particularly excited about applicants who meet at least one of our ‘beneficial qualifications’. Applicants who specialize in natural language processing should also see our NLP Engineer job description.

Responsibilities

    • Develop predictive models.
    • Scale machine learning pipelines to work in production systems.
    • Work with structured and unstructured data to engineer interpretable features as inputs to machine learning models.
    • Conduct research into machine learning concepts that would particularly benefit the healthcare domain.
    • Contribute to the Roam Analytics blog and to conference submissions.

Qualifications

    • Deep understanding of the mathematical foundations of Machine Learning algorithms, including linear algebra, vector calculus and probability theory.
    • Experience building end-to-end machine learning systems.
    • Proficiency with Python (including NumPy and scikit-learn) and object-oriented programming. Extensive experience applying statistical algorithms to real-world data.
    • Passion for healthcare and the potential of machine learning to help solve its challenges.
    • MS or PhD in Computer Science, Engineering, Statistics or a related field with a substantial theoretical component.
    • 2 years of professional experience in machine learning, and/or substantial open source contributions.

Beneficial Experience

    • Demonstrated knowledge of causal inference techniques.
    • Demonstrated knowledge of automatic model selection and hyperparameter tuning.
    • Experience of generating synthetic datasets.
    • Specific research into interpretability of ML models.
    • Experience of crowdsourcing annotations for use in classifiers.
    • Experience of productionizing machine learning models.
    • Contributions to relevant open source projects.
    • Publication history, and/or blog posts about machine learning or related topics.