Data Engineer / Data Scientist
Job / Candidate
Biobot Analytics measures opioids and other drugs in sewage to map real-time consumption in cities. We are building a product that empowers leaders to improve the health of communities around the world. We are looking for people who are excited about using data for social good: improving the delivery of public health and social services and making a dent in the opioid epidemic.
You’ll be responsible for operationalizing our data analytics platform, which is based on award-winning research our team developed at MIT. Our analytics platform involves:

1. Selecting sampling sites (manholes) in a city through GIS analysis of the sewer network, land use, and demographic data.
2. Converting our chemistry lab results (which quantify drug metabolites excreted in urine and collected in the sewer) into population rates of drug use.
3. Integrating our wastewater-derived data with other relevant datasets (e.g. reported overdoses or treatment centers in a city).
4. Generating static visualizations for our customer reports.

We also hope that you’ll enjoy understanding the models and algorithms we use; even better if you’re excited to help us improve them.
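To make step (2) concrete: converting a measured metabolite concentration into a population consumption rate typically follows the standard wastewater-based epidemiology back-calculation. The sketch below is illustrative only; the function name, parameters, and numbers are hypothetical, not Biobot’s actual model:

```python
def per_capita_consumption(
    metabolite_ng_per_l: float,  # measured metabolite concentration in wastewater
    flow_l_per_day: float,       # daily wastewater flow at the sampling site
    population: int,             # population served by the catchment
    excretion_fraction: float,   # fraction of a dose excreted as this metabolite
    mw_ratio: float,             # molar-mass ratio of parent drug to metabolite
) -> float:
    """Estimated parent-drug consumption in mg per day per 1000 people."""
    daily_load_mg = metabolite_ng_per_l * flow_l_per_day / 1e6  # ng -> mg
    parent_drug_mg = daily_load_mg / excretion_fraction * mw_ratio
    return parent_drug_mg / population * 1000


# Hypothetical example: 100 ng/L metabolite, 10M L/day flow, 50k people,
# 50% excreted as this metabolite, molar-mass ratio 1.0:
per_capita_consumption(100, 1e7, 50_000, 0.5, 1.0)  # → 40.0 mg/day per 1000 people
```

Real pipelines also have to handle flow measurement uncertainty and in-sewer metabolite degradation, which this sketch omits.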
For now, our platform is a combination of legacy workflows in ArcGIS, MATLAB, IPython notebooks, and Python scripts. You’ll be responsible for refactoring and operationalizing these workflows onto a single platform (preferably in Python and on AWS) that our engineers, scientists, and designers can collaborate on. We’re also looking for someone with database skills to organize our company-wide data.
We’re a small, five-person company, and you’ll work directly with everyone. We’re in the very early stages of setting up our data infrastructure, so you’ll get the chance to directly influence our entire company’s tech stack. This is a full-time contract position, with the potential for full-time employment in the future.
- You think in pipelines. You love finding the balance between manual and automated steps throughout an entire data process, from wrangling to compiling reports for our customers.
- You have excellent coding practices. Think: exquisite commenting, thorough READMEs, and clean file organization as standard parts of your workflow.
- You know what it takes to make production-level analytics code. You’re an expert in version control with Git and GitHub, comfortable with continuous integration, and know how to implement the tests needed to ensure the integrity of our data and analyses.
- You are familiar with GIS data. You know how to work with complex spatial datasets and can manipulate and combine them programmatically with scripting languages.
- You have experience creating and managing data lakes, data warehouses, and/or complex databases, and you know the tradeoffs involved in organizing data from different sources and for different purposes. Bonus points if you have experience with AWS.
- Ideally, you’re thrilled about the challenge and opportunity to set up a robust data infrastructure from scratch. Even better if you’re also excited to help our data and engineering team (former science PhDs) get on board, and if you enjoy data analysis and are interested in helping us improve our models.
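On the GIS side, much of the spatial work reduces to operations like assigning points (sampling sites, reported overdoses) to sewer catchment polygons. In practice this would likely be done with libraries such as GeoPandas or Shapely, but as a sketch of the underlying operation, here is a pure-Python ray-casting point-in-polygon test (all names and coordinates are illustrative):

```python
def point_in_polygon(x: float, y: float, polygon: list[tuple[float, float]]) -> bool:
    """Ray-casting test: count crossings of a horizontal ray to the right of (x, y).

    An odd number of edge crossings means the point lies inside the polygon.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        if (y1 > y) != (y2 > y):  # edge straddles the ray's y-level
            # x-coordinate where the edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical unit-square catchment:
catchment = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
point_in_polygon(0.5, 0.5, catchment)  # → True (site falls inside the catchment)
point_in_polygon(2.0, 2.0, catchment)  # → False
```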
- Automate and improve our analysis pipelines and workflows. Convert our existing pipelines into production-level code, and help us improve our data security, code review, and testing practices to ensure a robust data infrastructure.
- Lead the sampling site selection process for each of our customers and generate monthly PDF reports contextualizing our data, preferably via automated pipelines that you develop.
- Set up a data storage system to track our company-wide data, which includes: internally-generated chemistry data, sampling metadata, confidential GIS and public health data provided by cities we work with, results from our R&D experiments, and external datasets that you help download and wrangle.
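As one possible starting point for the storage system described above, a minimal relational schema would separate sites, samples, and lab results. The SQLite sketch below is purely illustrative; all table and column names are hypothetical, and real requirements (GIS layers, access control for confidential city data) would go well beyond it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory DB for illustration
conn.executescript("""
CREATE TABLE site (
    site_id   INTEGER PRIMARY KEY,
    city      TEXT NOT NULL,
    latitude  REAL,
    longitude REAL
);
CREATE TABLE sample (
    sample_id    INTEGER PRIMARY KEY,
    site_id      INTEGER NOT NULL REFERENCES site(site_id),
    collected_at TEXT NOT NULL          -- ISO 8601 timestamp
);
CREATE TABLE lab_result (
    result_id     INTEGER PRIMARY KEY,
    sample_id     INTEGER NOT NULL REFERENCES sample(sample_id),
    analyte       TEXT NOT NULL,        -- e.g. a drug metabolite
    conc_ng_per_l REAL NOT NULL
);
""")

# Hypothetical rows, then a join from a measurement back to its city:
conn.execute("INSERT INTO site VALUES (1, 'Cambridge', 42.37, -71.11)")
conn.execute("INSERT INTO sample VALUES (1, 1, '2018-06-01T08:00:00')")
conn.execute("INSERT INTO lab_result VALUES (1, 1, 'morphine', 120.5)")
row = conn.execute("""
    SELECT site.city, lab_result.analyte, lab_result.conc_ng_per_l
    FROM lab_result
    JOIN sample ON sample.sample_id = lab_result.sample_id
    JOIN site   ON site.site_id     = sample.site_id
""").fetchone()
```

Keeping sampling metadata in its own table makes it straightforward to join chemistry results against GIS and public health datasets later.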
For your portfolio, please highlight 2-3 data pipelines or analyses you’ve worked on that you’re exceptionally proud of. Make sure to call out any examples of geospatial / map-based work you have done. Also send us a link to your GitHub, and, if you want, point us to one or two repos that are most reflective of the kind of work you’ll produce for us.
About Biobot Analytics
Our mission is to transform wastewater infrastructure into public health observatories.
Biobot Analytics uses technology developed at MIT to measure opioids and other drugs in sewage (based on what is excreted in urine) and determine real-time consumption in cities. This data enables governments to assess actual opioid consumption across communities, allocate resources to address this public health crisis, plan interventions, and evaluate their efficacy over time.
Inspired by the potential of wastewater epidemiology, Biobot is the first company in the world to commercialize data from sewage. After winning multiple entrepreneurship competitions at MIT, Biobot completed the Y Combinator accelerator in San Francisco and raised a seed round from investors like Homebrew, Ekistic, Hyperplane, DCVC and Refactor.
Battling the drug epidemic is just the beginning: we’re building a public health action map. Eventually, Biobot will be an early warning system for disease, a map of nutrition disparities, and more. Headquartered in the Boston area, we aim to create the bedrock of human health infrastructure and smart cities in countries across all six continents.