Data Scientist/Biostatistician

Boston, MA
Science Team
Joyn Bio is a joint venture between Bayer and Ginkgo BioWorks dedicated to addressing unmet needs in agriculture by applying synthetic biology approaches to engineer microbes.


Implement experimental design, statistics, and data science methodologies in collaboration with technical functions and program teams to help build engineered microbial products for agriculture.


    • Experimental Design and Statistical Modeling In Planta:  Apply modern data science and biostatistics tools for design and analysis of lab, greenhouse, and field experiments, to evaluate and rank biological performance of engineered microbes.
    • Digital Phenotyping, Sensor-Based Data and Geospatial Analytics:  Collaborate with plant scientists to implement standardized processing and analytical pipelines for complex sensor and imaging data (lab, greenhouse, and field).  This includes working with internal partners, i.e. data engineers, to productionize code. Build and maintain a scientific code base, develop QC and preprocessing pipelines, prototype machine learning algorithms, and create business-relevant visualizations.
    • Strategic Partnerships and Ginkgo Interface:  Build strong ties to Ginkgo’s Data Engineering team to build a production-level data science platform, and evaluate existing data science platforms and technologies to improve Joyn Bio’s technical productivity.
    • Create a Data-Centric Culture:  Help increase data fluency at Joyn Bio by coaching and training others in basic statistical concepts and in how data science can support their work. Communicate effectively through listening, documentation, and presentations, especially by using compelling visualization tools to share analysis and interpretation of data. Continually seek opportunities to improve methods and analyses by applying statistical or quantitative tools.


    • Ph.D. or equivalent with a quantitative focus – examples include but are not limited to Statistical Genetics, Plant Breeding, Quantitative Genetics, Bioinformatics, Data Science, Ecology, Microbial Population Genetics/Ecology, and Evolutionary Genetics.
    • 2-3 years of industry experience in at least three of the following: experimental design [for field trials, greenhouse-based experiments, or lab-based DOE], statistical modeling, optimization and simulation.
    • Supervisory experience is desirable but not required. This is an emerging function; depending on the candidate, the role could expand to include building and supervising a team of 2-3 FTEs.
    • Fluent in Python, R, and QGIS; capable of framing problems as programming tasks; comfortable with command-line usage.
    • Experience with Tableau, Spotfire, R Shiny/Dash/Plotly/Kibana or other web-based interfaces to create graphic-rich customizable plots, charts, data maps, etc.
    • Familiarity and experience with deploying statistical pipelines in the cloud (e.g. AWS, Microsoft Azure).
    • Experience interfacing via APIs with multiple database types, including graph and real-time databases, is a plus.
    • Should have a proven track record of collaborating in a cross-disciplinary functional team.
    • A passion for creating transformative agricultural products and discovering new ways to add value for farmers.


    • This position reports to the Head of Computational Biology.
    • This position will be a part of the Platform and Infrastructure team.

This position can be based either at our Woodland, California or Boston, Massachusetts site and will require 5-20% travel.