Senior Site Reliability Engineer (V)

Remote Latam
Technology – Software Engineering / Full-time / Remote

Why Blue Coding? 

At Blue Coding, we specialize in hiring excellent developers and amazing people from all over Latin America and other parts of the world. For the past 11 years, we’ve helped cutting-edge companies in the United States and Canada build great development teams and develop great products. Large multinationals, digital agencies, SaaS providers, and software consulting firms are just a few of our clients. Our team of over 150 engineers, project managers, QA specialists, UX/UI designers, and many more is distributed across more than 10 countries in the Americas. We are a fully remote company working with a wide array of technologies, and we have expertise in every stage of the software development process.

Our team is highly connected, united, and culturally diverse, and our collaborators are involved in many initiatives around the world, from wildlife preservation to volunteering at local charities. We stand for honesty, fairness, respect, efficiency, hard work, and cooperation.

What are we looking for?

For this role, we are looking for an experienced Site Reliability Engineer to work with one of our foreign clients, a corporation that, through its subsidiaries, provides life insurance protection targeted at the American middle market. They’re transforming how technology powers the life insurance experience. To support their ambitious growth goals, they’re evolving from a traditional Network Operations Center (NOC) model to a modern, hybrid Site Reliability Engineering (SRE) approach.

This is a rare opportunity to be the very first SRE hire—you won’t just support reliability, you’ll define it. From shaping processes and selecting tools to mentoring engineers and building automation, you’ll help them create a high-performing reliability function from the ground up.


What's unique about this job?

The company seeks to make life insurance more affordable for the middle market through innovation in product design and distribution. This includes call center and web-enabled sales and underwriting processes, quick issuance of policies, and an emphasis on products that are not medically underwritten at the time of sale.

Here are some of the exciting day-to-day challenges you will face in this role:

    • Build the foundation: Design and implement SRE best practices, processes, and tooling.
    • Lead operational transformation: Help transition their NOC into a technically empowered, automation-driven reliability team.
    • Own observability and monitoring: Drive improvements in system monitoring, alerting, and dashboards using tools like Grafana, CloudWatch, and Datadog.
    • Automate everything: Reduce manual effort and increase resilience through Terraform, scripting, and cloud-native automation.
    • Define and measure reliability: Establish SLIs, SLOs, and error budgets that keep the team accountable to high uptime and stability goals.
    • Collaborate and mentor: Work closely with DevOps, SysOps, and engineering teams while helping upskill existing NOC engineers.
    • Be a change agent: Bring a forward-looking mindset, driving cultural and technical change across the organization.

You will shine if you have:

    • 5+ years in SRE, DevOps, or advanced systems engineering roles.
    • Proven experience building or transforming SRE practices—you know what it takes to stand up a new function.
    • Experience creating, managing, reporting on, and analyzing stability metrics.
    • Strong AWS expertise.
    • Experience with secrets management tooling like AWS Secrets Manager, HashiCorp Vault, Keeper, or Infisical (strongly desired).
    • Experience with the Atlassian tool platform (Jira, Confluence, Bitbucket) (strongly desired).
    • Hands-on experience with Terraform and infrastructure-as-code, including configuration management tools like Chef, Puppet, or Ansible.
    • Proficiency in Python and/or Ruby for automation and integrations.
    • Expertise in monitoring, observability, incident response, and service reliability.
    • Ability to define SLIs/SLOs and build data-driven operational metrics.
    • Excellent collaborator with a passion for mentorship and team growth.

It doesn’t hurt if you also have:

    • AWS certifications (highly preferred).
    • Experience in insurance, fintech, or other regulated industries.
    • Familiarity with incident.io, Jira Service Management, or similar ITSM tools.
    • Background in CI/CD pipelines and modern DevOps practices.

Here are some of the perks we offer you:

    • Salary in USD
    • Full-time
    • 100% Remote

Ready to learn more? Apply below!