Platform Engineer

Remote
Engineering
At Flowmill, we are maniacally focused on eliminating cloud application failure by building tools that quickly and automatically pinpoint service disruptions, whether caused externally by faults in cloud infrastructure and API providers or internally by bugs and configuration errors. The underlying tech (developed at MIT) is unique in its extremely low-overhead collection and analysis, its full coverage, and its ability to be deployed in minutes with no code changes or configuration. Together, these let Flowmill give SREs and DevOps engineers smart alerts and a complete, easy-to-read picture of their deployment, dramatically accelerating fault resolution.

This is a chance to join a small, rockstar team with backgrounds at Facebook, Google, and VMware and change the way engineers achieve high availability and performance in their production applications.

What You’ll Do

    • Work embedded within the engineering team to design and automate our software and infrastructure deployment process using GitOps
    • Improve the availability, durability, cost efficiency, and monitoring of production services
    • Help us build a blameless culture
    • Participate in the on-call rotation
    • Use our own product to debug real problems and provide expert feedback on areas for improvement

Qualifications

    • B.S. in Computer Science
    • 3+ years of experience with one or more general-purpose programming languages such as Go, Java, Python, or C/C++
    • Experience deploying, operating and monitoring large distributed systems
    • Experience automating deployment and monitoring of software services
    • (Highly Desired) Experience with AWS-based infrastructure, Kubernetes, data processing with Kafka, or Prometheus
    • (Highly Desired) Experience leading and growing people and teams

Example Projects

    • Automating AWS / Kubernetes / Helm using GitOps
    • Setting up Preview Environments on Pull Requests
    • Creating a process to deploy Grafana dashboards from Git
    • Replacing existing single-tenant services with scalable multi-tenant services
    • Creating a process for running the full SaaS product on a local Kubernetes cluster for developer productivity