Production Engineer - Remote

Brazil
Engineering / Full-time / Remote
Mattermost provides secure, workflow-centric collaboration for technical and operational teams that need to meet nation-state-level security and trust requirements. We serve technology, public sector, national defense, and financial services industries with customers ranging from tech giants to the world’s largest banks, to the U.S. Department of Defense and governmental agencies around the world. 

Our self-hosted and cloud offerings provide integrated workflow automation, AI acceleration, and ChatOps with team messaging, audio calling, and screen sharing on an open core platform vetted and deployed by the world’s most secure and mission-critical organizations.

We co-build the future of collaboration with over 4,000 open source project contributors who’ve provided over 30,000 code improvements towards our shared product vision, which is translated into 20 languages.
To learn more, visit www.mattermost.com

We are looking for an engineer with demonstrated experience in software development and infrastructure using Kubernetes. You will ensure the reliability and scalability of Mattermost’s new SaaS offering by building tools and deploying infrastructure and automation on Kubernetes.

Here are some of the challenges and work of our SRE team:
- Monitoring Cloud Environments at Scale with Prometheus and Thanos
- How We Use Sloth to do SLO Monitoring and Alerting with Prometheus
- Automate EKS Node Rotation for AMI Releases

Responsibilities:

    • Build services and tools to ensure the stability of Mattermost’s SaaS offering
    • Define infrastructure in code with IaC tools like Terraform
    • Write thoughtful and high-quality code in Go
    • Follow our engineering best practices and ensure alignment with our Leadership Principles
    • Provide technical mentorship for fellow engineers
    • Develop services to handle automatic recovery from incidents and disasters
    • Automate incident or disaster simulations to identify blind spots
    • Set technical vision and innovate to be at the forefront of self-healing SaaS services
    • Implement, maintain and tune monitoring and alerting systems
    • Deploy applications to and manage Kubernetes clusters
    • Participate in our on-call rotation to respond to incidents and resolve problems

Required Background/Skills:

    • Bachelor's degree in Computer Science or a related field, or significant professional DevOps or SRE experience
    • 5+ years of experience as a developer or SRE with operational responsibilities
    • Proven experience responding to incidents while on call, with strong knowledge of incident response processes
    • Strong skills and experience working with Kubernetes inside and out
    • Strong skills and experience working with infrastructure as code tools, such as Terraform
    • Solid programming skills and experience with Go, or the ability to quickly become proficient in it
    • Familiarity with container systems such as Kubernetes and Docker
    • Familiarity with GitOps and Chaos Engineering
    • Ability and willingness to be on-call

Preferences:

    • Experience with distributed application systems using HTTP, WebSockets, RPC, pub/sub, etc. at scale
    • Open source contributions to related projects
    • Knowledge of Grafana and the Prometheus suite
    • Comfort with GitHub, Jira, Jenkins, and CircleCI
    • Experience with WebRTC for real-time communication architectures
    • Experience working in open source communities

Mattermost is an EEO Employer. We are a remote-first, open source company.

We are constantly working towards adding more countries/regions to this list, but first we need to make sure we are compliant with local laws and regulations, which takes time. 

Mattermost is made up of people from a wide variety of backgrounds and lifestyles. We embrace diversity and invite applications from people from all walks of life. We don't discriminate against staff or applicants based on gender identity or expression, sexual orientation, race, religion, age, national origin, citizenship, disability, pregnancy status, veteran status, or any other differences. Also, if you have a disability, please let us know if there's any way we can make the interview process better for you; we're happy to accommodate!