Sr. Site Reliability Engineer (SRE)

Latin America
Engineering – DevOps
Full-Time Contractor (Remote)
Remote
We are looking for a Senior Site Reliability Engineer with strong experience in AWS, system monitoring, and infrastructure automation. The role involves maintaining and improving the reliability and performance of a cloud-based lending platform used by mid-market and large financial institutions. 

The ideal candidate will have a solid background in systems engineering and software development, be comfortable working across teams, and take ownership of operational stability and tooling improvements.

Responsibilities:

    • Develop a deep understanding of the software, its functions, how it fulfills clients’ needs, and how clients use the product.
    • Oversee systems to ensure reliability for customers. 
    • Monitor distributed systems and notify the appropriate people of any potential issues.
    • Run the production environment by monitoring availability and taking a holistic view of system health. 
    • Build software and systems to manage platform infrastructure and applications. 
    • Improve reliability, quality, and time-to-market of our suite of software solutions. 
    • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve. 
    • Partner with development teams to improve services through rigorous testing and release procedures.

Technical Skills:

    • Bachelor's degree (B.A.) in Computer Science, Design, or an equivalent four-year degree, or equivalent related experience.
    • 5-7 years of proven experience in a Site Reliability role or similar.
    • Excellent oral and written communication skills in English, including facilitating group presentations, and consulting skills.
    • Deep technical experience with AWS, containerization technologies, automated deployment frameworks, monitoring, logging, alerting, system internals, networking, databases, distributed systems, and service-oriented architecture.
    • Demonstrate hands-on technical leadership and business impact in combining software engineering skills with systems engineering skills to solve complex automation and reliability challenges. 
    • Experience working with infrastructure and application monitoring tools such as New Relic, Sumo Logic, Pingdom (uptime monitoring), CloudTrail, and CloudWatch Insights, and with AWS provisioning and deployment tools such as CloudFormation, CodePipeline, and CodeDeploy.
    • Extensive working knowledge of managing AWS environments and Linux systems.
    • Experience working with MSSQL and MySQL in cloud-based environments, as well as demonstrable knowledge of and experience with AWS service technologies (e.g., Aurora, MySQL).
    • Experience working with NoSQL database technologies (ideally DynamoDB).
    • Experience working with pipeline automation scripting and tooling (e.g., Jenkins, Terraform).
    • Knowledge of and experience with programming languages (e.g., C++, Java, PHP) and frameworks/systems (e.g., AWS).
    • Ability to learn new languages and technologies strongly preferred. 
    • Broad understanding of the lending industry, with the ability to become a subject matter expert on the job.

Soft Skills:

    • A strong sense of ownership. 
    • Excellent written and verbal communication and interpersonal skills. 
    • Able to collaborate effectively with technical and business partners.
    • Able to take on full projects from beginning to end.
    • Problem solver.
    • Team player.
    • Advanced English level.