Lead Engineer - Site Reliability (Metrics and Monitoring)

Wellington, NZ / Auckland, NZ / Hawkes Bay, NZ / Remote, NZ
Technology – Engineering / Permanent / Hybrid
Xero is a beautiful, easy-to-use platform that helps small businesses and their accounting and bookkeeping advisors grow and thrive. 

At Xero, our purpose is to make life better for people in small business, their advisors, and communities around the world. This purpose sits at the centre of everything we do. We support our people to do the best work of their lives so that they can help small businesses succeed through better tools, information and connections. Because when they succeed they make a difference, and when millions of small businesses are making a difference, the world is a more beautiful place.

About the role:

As a Lead Engineer within the Site Reliability Engineering (SRE) Metrics & Monitoring team you’ll have a thorough knowledge of industry-leading observability practices and extensive hands-on experience. You’ll have a proven ability to provide strong technical mentorship, guiding engineers to upskill and enabling a focus on continual improvement. Working closely with the Product Manager and Team Lead, you’ll contribute your technical expertise and leadership to align team deliverables with the wider SRE and Xero initiatives.

You’ll help to adapt and grow observability at Xero, informed by a strong understanding of systems and reliability engineering and modern SRE principles. You’ll drive uplift in observability practices, paving the way for engineering teams to adopt OpenTelemetry. You’ll be a strong advocate for the customer while contributing to the technical direction and roadmap for SRE products. You’ll model a growth mindset and help improve our services by identifying gaps, promoting capability growth, discovering technical solutions to business problems and championing modern practices.

What you'll do:

    • Design systems to improve adoption of Xero's observability tools with a strong focus on reducing toil in managing our monitoring and logging platforms.
    • Have a strong focus on developing and growing engineers through technical mentoring and coaching. 
    • Provide leadership around observability standards and practices.
    • Create systems that support and enable our product teams to uplift their observability practices.
    • Improve the implementation of system instrumentation as and when required.
    • Be a key member of the pod leadership, contributing to technical strategy, feasibility, backlog management and enabling delivery.
    • Participate in the wider SRE team on-call roster responding to Xero-wide incidents.
    • Empower other engineering teams at Xero to achieve a high standard of system awareness so they can create efficient, scalable and reliable applications for Xero's customers.

What you'll bring:

    • Experience with agile software development methodology including continuous integration and delivery.
    • An understanding of how solutions architecture or architecture design works in a large software delivery organisation.
    • Experience building and implementing observability with large distributed cloud environments (ideally AWS).
    • Excellent knowledge of reliability and observability concepts and practices.
    • An understanding of OpenTelemetry and how it works.
    • Experience being on call and helping to resolve production incidents in a complex environment.
    • Experience in instrumenting applications and integrating with monitoring solutions like New Relic, Datadog, Dynatrace, SignalFx, Scalyr, Sumo Logic or Splunk (ideally New Relic).
    • Proficiency in one or more programming languages such as C#, JavaScript, Go or Python.
    • Experience with DevOps tooling, e.g. Linux, Docker, Kubernetes, IaC and CI/CD tools.
    • The ability to help structure work to make optimal use of the team’s resources.
    • The ability to set quarterly and annual objectives for the team in collaboration with the Product Manager and Team Lead.
    • Proven ability to engage, influence and build relationships with internal stakeholders.
    • Experience in managing and maintaining healthy observability platforms for a large user base.

Why Xero:

You’ll do the best work of your life at Xero. We offer very generous paid leave to use however you’d like (plus statutory holidays!), dedicated paid leave to care for your physical and mental wellbeing, and an Employee Assistance Program that gives you and your family access to mental health care. You’ll also get free medical insurance, wellbeing and sports programmes, employee resource groups, 26 weeks of paid parental leave for primary caregivers, an Employee Share Plan, beautiful offices, flexible working, career development, and many other benefits that reflect our human value.

Our collaborative and inclusive culture is one we’re immensely proud of. We know that a diverse workforce is a strength that enables businesses, including ours, to better understand and serve customers, attract top talent and innovate successfully. At Xero we embrace diversity and inclusion and value a #challenge mindset.

Research has shown that women and underrepresented groups are less likely to apply for jobs unless they meet every single competency or experience requirement. If you are excited about this role, but your past experience doesn't align perfectly, we encourage you to apply anyway. You could be just the right person for this role and Xero. If you have any support or access requirements, we encourage you to advise us at the time of application and throughout the interview process.

Xero is an NZ Immigration Accredited Employer and Rainbow Tick certified too.