Senior Software Engineer (Infrastructure)
San Francisco, CA
Research and Engineering
This role offers an opportunity for a strong systems engineer to work closely with ML engineers and researchers to support cutting-edge ML research and deployment. You'll touch all parts of our code and infrastructure, whether that’s building large-scale distributed systems, improving the robustness and reliability of large language model training, optimizing network architecture, or improving our developer tooling.
- Own a thousand-node Kubernetes cluster to support ML research
- Pair with ML engineers to design and optimize infrastructure for serving large ML models
- Design and build fault-tolerant infrastructure to support running large-scale jobs reliably despite failures of individual nodes
- Migrate a cloud deployment to Terraform or Pulumi
- Optimize load-balancing strategies to efficiently use multiple zones
- Add alerts and playbooks for cluster monitoring
You might be a good fit if you
- Have significant experience working with cloud infrastructure
- Are comfortable debugging large-scale software systems
- Enjoy close collaboration with engineers and researchers with a variety of backgrounds and expertise
- Care about the societal impacts of your work
- Pick up slack, even if it goes outside your job description
Strong candidates may also have experience with some of the following
- Operating cloud infrastructure
- Terraform / Pulumi
- High-performance networking
- Python internals
- Low-level Linux interfaces and administration
How we're different
- We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact (advancing our long-term goals of steerable, trustworthy AI) rather than working on smaller and more specific puzzles.
- We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're most excited to hire researchers from diverse backgrounds who share this perspective.
- We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.
- We're trying to build a core of knowledge and intuition about the most robustly effective innovations in AI, and so thoroughly documented null results are almost as valuable as positive discoveries.
- We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.
The easiest way to understand our research directions is to read some of our team’s previous work, such as: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.
Come work with us! Anthropic is a public benefit corporation based in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.