Software Engineer - Data Infrastructure

San Francisco
Engineering
Full-time
Our trucks produce over 40 GB of data per hour. The infrastructure team is responsible for ingesting and transforming that data and for supporting all query workloads across the Starsky engineering team. We're looking for someone to take our data efforts to the next level and help us scale both the amount of data we work with and the types of queries we can run. The data engineer in this role will take over our data pipeline, refactoring and maintaining it where necessary, and will own new projects encompassing all parts of the system. From building modern machine learning pipelines to helping make data-driven decisions about how to improve autonomous driving, we're looking for someone who is excited by the prospect of creating new systems and who looks critically for places to improve our current operations. You may not be immediately qualified to own these systems, but you have experience with distributed computing and distributed systems design.

Skills Required:

    • Background in computer science, mathematics, or equivalent experience
    • Deep experience with Python, Java, Scala, or another language. We use Python, but believe deep experience in one language is more important than breadth in many
    • System architecture skills: has prototyped, built, deployed, and maintained large and complex systems
    • Can build fault-tolerant systems and provide visibility into the health of long-running jobs
    • Demonstrated ability to produce production-grade code: writes tests and documentation, and can be on both sides of code reviews
    • Experience developing on Linux platforms
    • Experience with building infrastructure on a cloud provider
    • Proficient in SQL and can assist engineers with their queries

Nice-to-Have:

    • Infrastructure-as-code experience (we use Terraform)
    • Backend or web application experience