Senior Data Scientist / Computational Biologist

Natick, MA / Software and Bioinformatics / Full time
The Senior Data Scientist / Computational Biologist will lead the computational biology group in driving algorithmic development of NGS cell-free diagnostic assays and be responsible for developing new algorithms that drive breakthroughs in cancer detection.
This exceptional opportunity offers the chance to support research and development efforts at the frontier of cancer genomics, leveraging a state-of-the-art data lake and the latest bioinformatic tools to generate cutting-edge statistical methods and drive insight automation efforts. Algorithm development will require novel use of current and emerging machine learning techniques and efficient approaches to extract insights from terabytes to petabytes of data.

- Understand and explore data patterns, perform feature selection on large quantities of data, and evaluate those features’ performance in classification and regression settings
- Leverage statistical methods including but not limited to Hidden Markov Models, mixed models, hierarchical Bayesian models, and efficient methods for posterior estimation such as MCMC with the No-U-Turn Sampler (NUTS) and Variational Bayes
- Translate product ideas into defined data science challenges and solutions
- Design and deploy algorithms for automating quality control (such as anomaly detection)
- Lead creation of production data analysis software that runs in a cluster, using technologies including, but not limited to: Airflow, Postgres, Parquet (accessed by Spark, Dask, Vaex, etc.), Celery, and Kubernetes
- Help optimize existing variant caller performance and drive down limits of detection at acceptable specificity
- Train computational biologists
- MS or PhD in Bioinformatics, Data Science, Computer Science, Statistics, Engineering, Physics, Mathematics or related field with 2+ years of industry experience
- Proven relevant experience in NGS data analytics
- Familiarity with available open-source tools
- Experience working with NGS data, especially targeted sequencing (WES using amplicon- or hybrid-capture-based technologies) and WGS
- Proficiency in Python/R for numerical/statistical programming, including NumPy, pandas, and standard machine learning libraries
- Proficiency with SQL in a data analysis context
- Strong analytical and problem-solving skills
- Willingness to work hard and be creative in a fast-paced environment
- Able to work both independently and collaboratively with colleagues
- Capable of leading teams
- Working knowledge of cancer genomics

- PhD with 3+ years of relevant industry experience
- Experience with C or other compiled languages

Pillar Biosciences' mission is to create technologies that make precision medicine accessible. We are a growing company with a plan to disrupt the clinical testing market. 

Our focus is on building NGS diagnostic and software solutions from the ground up, leveraging experiences in genomics, data analytics, clinical testing and patient focus to quickly innovate and penetrate the market. If you thrive in a brilliant, fast-paced, and mission-driven environment, Pillar Biosciences is the place for you!

