Data Engineer

Cambridge, MA
Engineering / On-site
Our Mission
Our mission is to solve the most important and fundamental challenges in AI and Robotics to enable future generations of intelligent machines that will help us all live better lives.

Data Engineers will work cross-functionally, creating new technology to improve software development for robots. If you are passionate about developing technology that advances robots' capabilities and usefulness, you will want to join us! We are on-site in our new Cambridge, MA office, where we are building a collaborative and exciting new organization.

Responsibilities

    • Work collaboratively with research scientists and software engineers on software development for a range of different robotic platforms
    • Develop and maintain our data warehouses and data pipelines in cloud and on-premises infrastructure
    • Build event- and batch-driven ingestion systems for machine learning and R&D as needed
    • Develop and administer databases, knowledge bases, and distributed data stores
    • Create and use systems to clean, integrate, or fuse datasets to produce data products
    • Establish and monitor data integrity and value through visualization, profiling, and statistical tools
    • Perform updates, migrations, and administration tasks for data systems
    • Develop and implement data governance and data retention strategies
    • Use Python and SQL to develop, maintain, and scale our data stores

Requirements

    • BS/MS in computer science, robotics, or a related field
    • 10+ years of experience in a data engineering or similar role
    • Demonstrated experience with a variety of relational database and data warehousing technologies such as AWS Redshift, Athena, RDS, or BigQuery
    • Demonstrated experience with big data processing systems and distributed computing technologies such as Databricks, Spark, SageMaker, or Kafka
    • Strong experience with ETL design and implementations in the context of large, multimodal, and distributed datasets

Bonus (Not Required)

    • 5+ years of experience with distributed data/computing tools (MapReduce, Hadoop, Hive, EMR, Kafka, Spark, Gurobi, or MySQL)
    • 5+ years of experience working on real-time data and streaming applications
    • 5+ years of experience with NoSQL implementations (Mongo, Cassandra)
    • 2+ years of experience with Airflow
    • 5+ years of data warehousing experience (Redshift or Snowflake)
    • 5+ years of experience with UNIX/Linux including basic commands and shell scripting

We provide equal employment opportunities to all employees and applicants for employment and prohibit discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws.