Senior DevOps Engineer, Observability & Reliability Engineering

USA /
DevOps – Observability and Reliability Engineering (ORE) /
Full-time Salary
We’re Ada, a brand interaction platform that empowers brands to live up to their promises and have more, and more valuable, interactions with the people who love them. Our AI-powered platform has automated over 3.5 billion brand interactions for the world’s fastest-growing enterprises, including Zoom, Facebook, and Shopify. In May 2021, we raised $130M in Series C funding led by Spark Capital (early investors in Twitter, Slack, and Snapchat), and officially became a global unicorn with a $1.2B valuation!

The passion of our people and dedication to their craft continues to drive our dramatic global growth. 

Our work is rooted in authenticity, courage, empathy, and simplicity. We use these values to create a culture that encourages groundbreaking results, career progression, and community investment. You can learn more about the founding of our company here.  

We are inspired every day by the opportunity to pioneer a new industry, and welcome those who want to join us.

Ada is looking for an experienced Observability and Reliability Engineer to join our team. The Observability & Reliability Engineering team is responsible for setting a standard of operational excellence that allows our customers to trust the availability, reliability, and performance of our systems.

We partner with both internal and external customers to guide, coach, and empower them to independently achieve their goals within their systems and applications.  You will report to our Observability & Reliability Engineering Manager. 

Responsibilities

    • Identify and define the appropriate operational procedures to optimize our systems and services in production
    • Implement and manage observability and monitoring tools, incident management, and SLO/SLI/SLA initiatives
    • Follow cloud-based architectures that meet availability and recoverability requirements
    • Serve in the on-call rotation
    • Execute High Availability, Disaster Recovery, Sustained Resiliency, and Chaos Engineering tests
    • Partner with security engineers to develop plans and automation that respond aggressively and safely to new risks and vulnerabilities
    • Continuously analyze the existing infrastructure from a reliability perspective, with a focus on removing performance bottlenecks and optimizing the infrastructure, the toolkit, and the workflows involved in running it

About You

    • 4+ years of experience in SRE/DevOps/Infrastructure engineering 
    • 2+ years of experience in a cloud environment (preferably AWS)
    • Strong technical leadership skills
    • Experience with modern cloud infrastructure configurations and environments
    • Hands-on application of Configuration Management to live production instances
    • Familiarity with containers, bash, Linux, Kubernetes and infrastructure-as-code principles
    • Ability to program (structured and OO) in one or more higher-level languages, such as Python, Java, C/C++, Ruby, or JavaScript
    • Strong motivation to apply automation, and experience doing so, to reduce operational toil both in your own area and across all engineering teams
    • Experience in Agile software development projects

Outcomes

    • Contribute to and help lead the design of cross-cutting technical solutions that increase efficiency and reduce operational toil within the company
    • Champion a culture of applying software engineering best practices to solve problems in operations
    • Continuously drive improvements in areas like performance, automation, quality, monitoring, and reliability of platforms
    • Influence team-level prioritization and technical direction, and contribute to team roadmaps
    • Work closely with the security team to implement and monitor cloud security policies
    • Communicate ideas effectively to team members and internal stakeholders through technical assessments, meetings, demos, incidents, etc.
    • Mentor other engineers to help foster a cohesive team environment
    • Participate in an on-call rotation for the services the team owns, triaging and addressing production issues

Benefits
• Competitive salary and generous stock option plan
• Unlimited vacation
• Wellness account
• Extended health coverage
• Dental/optical/travel insurance
• Life insurance
• Employee and family assistance plan

Perks
• Flexible work schedule
• Digital-first, fully remote with WFH budget
• In-house social worker
• Paid parental leave for Canadian and U.S. residents
• Development opportunities

About Us
Ada is a rapidly growing digital-first company in a thriving AI ecosystem. We optimize our communication, collaboration, and work ethic for the digital world rather than for in-person work. We are building the workplace of the future to build the customer experience of the future. With flexible working hours, together we'll determine a schedule that fits your style and the requirements of your role.

We are backed by world-class investors, including Spark, Accel, FirstMark, Bessemer Venture Partners, and Version One. We provide our employees with competitive compensation, great health benefits, and ownership in our company.  

Ada is an equal opportunity employer. In fact, diversity is what drives our success—it’s at the core of how we hire, communicate, and work. Like our platform, we are inclusive to all, and combine our diverse backgrounds, skill sets and thinking to build the best experiences for our clients and their customers.
