Software Engineer - Data Infrastructure
Our trucks produce over 40 GB of data per hour. The infrastructure team is responsible for ingesting and transforming that data and for supporting all query workloads across the Starsky engineering team. We’re looking for someone to own the data pipeline and all data workloads, and to help us scale both the amount of data we work with and the types of queries we can run. Our future data infra eng will continue to architect and deploy the platform that powers all of Starsky’s development work.
The Data Infra eng will take over our data pipeline, refactoring and maintaining it where necessary, and own new projects encompassing all parts of the system. From building modern machine learning pipelines, to helping make data-driven decisions about how to improve autonomous driving, to creating user-focused APIs for interacting with our infrastructure, we’re looking for someone excited by the prospect of creating new systems and looking critically at where our current operations can improve. You may not be immediately qualified to own these systems, but you have experience with distributed computing and distributed systems design.
- Background in computer science, mathematics, or equivalent experience
- Deep experience with Python, Java, Scala, or another language. We use Python, but believe depth in one language matters more than breadth across many
- System architecture skills: has prototyped, built, deployed, and maintained large and complex systems
- Can build fault-tolerant systems and provide visibility into the health of long-running jobs
- Demonstrated ability to produce production-grade code: writes tests and documentation, and can be on both sides of code reviews
- Experience developing on Linux platforms
- Experience building infrastructure on a cloud provider (AWS, GCP, Azure, etc.)
- Proficient in SQL and can assist engineers with their queries
- Infrastructure-as-code experience (we use Terraform)
- Backend or web application experience