Senior Site Reliability Engineer (L3)

Engineering – Site Reliability Engineering
Contract
What you'll be doing:
• System Architecture: Review architecture and software components with software engineers and architects. Ensure best practices are consistent across all teams.
• Operational Excellence: Own and ensure SLOs and SLAs are met. Monitor operational metrics and lead improvement plans. Develop tools including infra-as-code resources to scale operations and allow other teams to be autonomous.
• Security and Compliance: Manage and audit security controls to meet enterprise requirements. Implement and maintain best practices and compliance standards. Collaborate with legal and compliance to assess overall risk management.
• Release Planning: Conduct performance tests for large-scale events or critical releases.
• Disaster Recovery: Develop and implement DR plans and procedures, including data recovery and fault injection simulations on production replicas.
• Incident Management: Lead incident response and post-mortems to resolve production issues, identify root causes and prevent future occurrences.
• Documentation: Develop runbooks and other technical assets. Complete periodic technical audits as required. 
• Daily Operations: Perform and improve day-to-day tasks, including access onboarding and offboarding, configuration management and patch management.
• Sharpen the Saw: Stay up-to-date with emerging trends, threats and technologies to propose improvements and proof-of-concepts in technical roadmaps.
• Team Player: Collaborate with cross-functional teams to ensure smooth deployment and operation of software releases. Answer technical questions from other teams or from outside the organization.
• Coaching: Provide feedback on the performance of junior staff and participate in people development initiatives.
Support any ad hoc tasks as required by the company.

What we look for in you:
• Proven Track Record: 3 to 5 years of experience managing software deployments and instrumentation in production environments with defined SLAs and SLOs. Strong knowledge of software delivery and DevOps principles.
• Cloud Operations: Experience with cloud platforms (e.g., AWS, Cloudflare, GCP) and infrastructure-as-code tools (e.g., Terraform, CloudFormation). Strong programming and scripting skills, preferably in languages such as Python, Go, or Ruby.
• Accreditation: Bachelor’s degree in Computer Science, Information Security or a similar field, or professional certificates, e.g., Certified DevOps Professional, Certified Solutions Architect Professional in AWS or GCP.
• Scope of Work: Fully capable of taking substantial features from concept to shipping as a sole contributor. Works effectively in open-ended projects and is self-sufficient to deep dive and evaluate multiple solutions to a problem.
• Problem Solving: Solve hard problems with many constraints, using sound judgment to assess risks and present arguments in a well-structured, data-backed, written narrative. Have passion, creativity and empathy for users.
• Quick Thinking: Able to derive information, think critically and make snap judgements based on measured data in high-pressure situations.
• People Skills: Strong communicator who is able to build positive working relationships between teams and form relationships with key customers. 
• Nice to have:
- Experience working in an early-to-growth-stage startup
- Experience building applications in different tech stacks
- Keen interest in decentralized technologies and their applications, including cryptocurrencies
CoinGecko is an equal employment opportunity employer. Qualified candidates are considered for employment without regard to race, religion, gender, gender identity, sexual orientation, national origin, age, military or veteran status, disability, or any other characteristic protected by applicable law.

Interested? Hit the apply button to get started on your application!