Site Reliability Engineer (SRE) - Technical Referent
Buenos Aires / Montevideo / Rio de Janeiro / São Paulo / Córdoba / Mendoza / Rosario
DevOps / Contract / Remote
About Us
Coderio designs and delivers scalable digital solutions for global businesses. With a strong technical foundation and a product mindset, our teams lead complex software projects from architecture to execution. We value autonomy, clear communication, and technical excellence. We work closely with international teams and partners, building technology that makes a difference.
🌍 Learn more: http://coderio.com
The SRE Technical Referent is a senior-level role responsible for guiding the design and evolution of reliable, observable, and scalable infrastructure. This position provides technical leadership and direction on best practices related to monitoring, automation, and incident response, playing a key part in improving system resilience across the organization.
What to Expect in This Role (Responsibilities)
Lead the continuous improvement of monitoring and alerting strategies across teams.
Collaborate with engineering and platform teams to identify and resolve performance bottlenecks.
Support incident response efforts, lead root cause analyses, and contribute to knowledge sharing.
Mentor team members and influence engineering culture around reliability and ownership.
Participate in architectural decisions, focusing on system health and long-term maintainability.
Requirements
4+ years of experience as a Site Reliability Engineer or in a similar role focused on observability.
Deep expertise in Kubernetes, including core components, deployment strategies, and monitoring practices.
Cloud experience, especially AWS and ECS-based workloads.
Working knowledge of OpenTelemetry, including collector setup, service instrumentation, and pipeline optimization.
Proficiency with observability tools such as Grafana, Prometheus, Loki, New Relic, or Datadog.
Hands-on experience with Infrastructure-as-Code using Terraform.
Experience with GitOps CI/CD workflows using tools like ArgoCD, GitHub Actions, or similar.
Strong scripting skills in Python, Go, or similar languages for automation and tooling.
Experience integrating incident management platforms such as PagerDuty and Jira with alerting systems.
Nice to Have
Experience managing observability pipelines at scale in high-throughput environments.
Familiarity with Configuration-as-Code (Ansible, Chef, or SaltStack) for managing configurations across legacy instances.
Database performance monitoring experience, particularly in large-scale distributed environments.
Benefits
100% remote
Long-term commitment, with autonomy and impact
Strategic and high-visibility role in a modern engineering culture
Collaborative international team and strong technical leadership
Clear path to growth and leadership within Coderio
Why join Coderio?
At Coderio, we value talent regardless of location. We are a remote-first company, passionate about technology, collaborative work, and fair compensation.
We offer an inclusive, challenging environment with real opportunities for growth.
If you are motivated to build solutions with impact, we are waiting for you.
Apply now.