Site Reliability Engineer

Foster City, CA
Software – Software & Machine Learning Infrastructure
At Zoox, you'll be responsible for measuring and maintaining the uptime of the many services critical to the development process of an autonomous vehicle. You will also be heavily involved in every phase of rolling out a service, from designing systems that are fault-tolerant and easy to maintain, through deployment and operation, to continual improvement. Zoox is a robotics company, and our ethos of automation extends throughout the infrastructure components we build. Be prepared to work with systems that handle large volumes of data and with data processing pipelines that perform compute-intensive tasks on CPUs and GPUs.

Qualifications

    • Bachelor's degree in engineering, math, or a related field and 2+ years of relevant experience
    • Experience supporting multiple production services
    • Effective use of tools like Ansible, Terraform, or Salt
    • Ability to extract and report useful performance or service metrics
    • Experience with Linux, no matter the flavor
    • Familiarity with Python or C/C++

Bonus Qualifications

    • AWS, ECS, Kubernetes
    • Experience handling large data sets
    • Master's degree in computer science or a related field

Zoox is developing the first ground-up, fully autonomous vehicle fleet and the supporting ecosystem required to bring this technology to market. Sitting at the intersection of artificial intelligence, robotics, and design, Zoox aims to provide the next generation of mobility-as-a-service in urban environments. We’re looking for top talent that shares our passion and wants to be part of a fast-moving and highly execution-oriented team.

A Final Note:
You do not need to match every listed expectation to apply for this position. Here at Zoox, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.