Senior DevOps Engineer - Remote US/San Francisco Bay Area

San Francisco Bay Area
Core Systems Engineering
Full-time
About Pachyderm

At Pachyderm, we're building an open-source enterprise-grade data science platform that lets you deploy and manage multi-stage, language-agnostic data pipelines while maintaining complete reproducibility and provenance. If you want to learn more about our grand vision, read what has become our "manifesto."

The Role

Love Go, Kubernetes, cloud deployment, and automation?

Pachyderm is hiring a DevOps expert to be a senior member of our team and help improve our infrastructure, deployment, and testing processes. Pachyderm has a rapidly growing engineering team, and we're long overdue for some major improvements to our internal infra and engineering methodologies. Your major projects will include:

- Develop our internal Go backend for the hosted platform.
- Manage and maintain internal Kubernetes clusters and hosted Pachyderm clusters.
- Optimize Pachyderm's CI to improve our development workflow and increase developer velocity.
- Develop Pachyderm's internal testing/benchmarking framework (probably in Go) to perform large-scale benchmarks on a regular cadence.
- Improve, test, script, and document the multitude of deployment options for Pachyderm's core product, including all cloud providers and various permutations of on-prem k8s and object stores.
- Build standard monitoring, logging, and deployment (e.g. Helm chart) packages so that Pachyderm users can get up and running faster.
- Work closely with our front-end, backend, and systems team to improve hosted cluster stability and uptime.

While your primary focus will be building and maintaining various internal systems, you'll also have the opportunity to contribute to the core product and work directly with users and customers who have complex deployment environments. At Pachyderm, feedback from OSS users and customers is a major driver of our product roadmap, and we believe that everyone within the company should experience that first-hand.

Pachyderm is just a small team right now, so you'd be getting in on the ground floor and would have an enormous impact on the success and direction of the company and product. You can of course check out the product on GitHub.

We offer significant equity, full benefits, and all the usual startup perks.

Qualifications

    • Some Golang or other programming experience is required. While much of the job is automation and scripting, our testing frameworks, product backend, and internal automation work (e.g. k8s operators/CRDs) are all written in Go.
    • 4+ years of experience building, maintaining, and automating distributed systems, data infrastructure, back-end systems or related infrastructure.
    • Expertise running and managing Kubernetes and Docker in one or more cloud providers, preferably as part of a large-scale, enterprise-class product related to storage, processing, networking, and/or virtualization.
    • Expertise running and managing build, test, and release processes for 10+ person engineering orgs.
    • Must have strong communication skills when talking about technical concepts. Our interview process strongly tests for communication as we have a very collaborative work environment where many parts of the codebase interact in complex ways.