Machine Learning Scientist

Menlo Park, CA
Engineering
Full-time
Introduction to Aether

Aether was founded on the belief that synthetic biology will fundamentally change the future of manufacturing. To catalyze the next industrial revolution, we are building a fully automated robotic laboratory that will generate enough data to leverage deep learning for biological engineering for the very first time. To do this, we are building a diverse team of software engineers, machine learning engineers, process engineers, roboticists, bioengineers, environmentalists, sci-fi nerds, and world-changers. We hope you can join us.

Job Description

Aether believes the key to unlocking the potential of bioengineering and nanotechnology is the application of deep learning algorithms trained on massive biological datasets. To leverage the data our high-throughput laboratories will generate, Aether requires sophisticated, scalable, and high-performance deep learning models. These models will define our ability to engineer biology and transition humanity to nanotechnology-based manufacturing, and we are looking for the best and brightest to join us in this challenge. 

As an early member of our machine learning team, your passion and ideas will play a significant role in defining the architectures and models that will serve as the foundation for our deep learning efforts. You will be developing software from the ground up with an early-stage team, and as such we are searching for someone who is excited by the autonomy, creative freedom, and rigor that this opportunity represents.

As a machine learning scientist, you will research, design, build, and test the machine learning algorithms that power Aether’s enzyme engineering infrastructure. In addition, you will shape these algorithms and their hyperparameters into modules that will serve as building blocks for a generalizable machine learning platform, which will enable the rapid construction and testing of models for new applications. These applications include the design of enzymes with novel catalytic capabilities; experimental design that optimizes the throughput, data quality, and cost-effectiveness of our robotic laboratories; and the exploration and indexing of enzyme-design space. You will work with the machine learning engineers and infrastructure engineers, drawing on their expertise to design an infrastructure that best enables you to design, build, and test high-performance deep learning models. The performance of Aether’s algorithms will be measured by their predictive capacity, their speed of training and prediction, and additional metrics whose minimum required values will be defined as necessary on a case-by-case basis.

Furthermore, you will bring data analytics expertise to characterize and visualize our databases, both to understand the biases of our data and how they affect the generalizability of our models and to assess the progress of our continual efforts to index biological space. In addition, you will monitor the quality of our datasets and build pipelines for the identification and removal of confounders.

Lastly, you will help grow a team of innovative, high-performing machine learning scientists and play a critical role in defining the technical environment in which the machine learning team executes. Enabling a nanotechnology revolution requires a cutting-edge deep learning platform that is unmatched in its predictive performance, speed, scalability, and extensibility, and you believe you are up to this task. 

Responsibilities

As a member of the technical staff within Aether, you will play a key role in shaping our technical culture, building world-class technology solutions, and evangelizing the use of technology across the organization. Core responsibilities include:

- Build a modular machine learning platform for the rapid construction and testing of models for new applications
- Leverage and improve upon best-in-class models, architectures, and layers to solve our deep learning challenges and integrate them into our machine learning platform
- Design world-first deep learning algorithms to tackle unsolved problems or outcompete current approaches
- Research and stay up to date on new publications relevant to our domain and challenges
- Leverage best practices in data science to interrogate, visualize, and understand the biases and generalizability of our databases
- Monitor the quality of our database and build pipelines to identify and remove experimental confounders
- Communicate with machine learning engineers and infrastructure engineers to help define the infrastructure that best enables you
- Help build a world-class team of ambitious machine learning scientists eager to leverage nanotechnology to solve humanity’s greatest challenges 
- Perform other related duties as assigned and based on Company needs

Minimum Requirements

    • Bachelor's degree in Computer Science, Mathematical Computing, Data Science, Bioinformatics, Machine Learning, or relevant field
    • At least 2 years of industry experience building state-of-the-art deep learning models and machine learning related tools
    • Strong understanding of machine learning frameworks (TensorFlow, Keras, Torch, PyTorch, etc.) and their respective strengths and limitations in data-heavy R&D and production environments
    • Strong understanding of techniques for dramatically increasing training and prediction speed for neural networks
    • Strong understanding of both recurrent neural networks (RNNs) and convolutional neural networks (CNNs) and their respective strengths and limitations for predictive performance, speed, etc.
    • Expert knowledge of approaches for data preprocessing, regularization techniques, optimizers, loss functions, and additional core deep learning concepts
    • Experience with Git and version control
    • Fluency in Python and familiarity with PEP 8

Preferred Requirements

    • MS or PhD in Computer Science, Mathematical Computing, Data Science, Bioinformatics, or relevant field
    • At least 5 years of industry experience building state-of-the-art machine learning models and machine learning related tools
    • Strong knowledge of graph-based machine learning
    • Basic knowledge of biological data/systems (proteins, enzymes, DNA, etc.)
    • Experience with cloud computing (AWS or GCP)
    • Experience in rapidly growing start-ups
    • People management / hiring experience