Senior Site Reliability Engineer
Raleigh/Durham | Remote
Engineering
JupiterOne is a fast-growing cybersecurity company named one of America’s Best Startup Employers for 2023 by Forbes and trusted by cloud-first companies like HashiCorp, Databricks, Marqeta, Divvy, Auth0, and more. JupiterOne has earned the trust of Fortune 100 customers and secured more than $119M in funding and support from investors and advisors like Sapphire Ventures, Bain Capital Ventures, Cisco Investments, Splunk Ventures, Intel Capital, and Alpha Square Group.
As a Site Reliability Engineer (SRE), you'll combine your software and systems engineering experience to help our software engineers build, deploy, and monitor distributed systems. You'll be tasked with increasing overall system reliability and fault tolerance, automating system continuity and recoverability, and improving system observability. Finally, we want you to be comfortable speaking up, identifying problems, and offering suggestions on how to improve.
The Site Reliability team works with internal development teams to maintain and support their services. Our work ranges from system instrumentation and infrastructure design to developing reliability best practices. Ultimately, we want to remove the burden (or at least greatly reduce the friction) of developing more reliable and secure systems. If wearing many different hats and learning new things sounds fun and enticing, then this is the team for you!
What you will do:
- Serve as a subject matter expert for one or more of the SRE team's core initiatives (observability, infrastructure as code, CI/CD, system resilience, etc.).
- Help engineering teams define, measure, and meet Service Level Objectives (SLOs) for availability, reliability, and performance, and decrease mean time to resolution (MTTR).
- Help engineering teams identify Service Level Indicators (SLIs) that will help them meet objectives related to availability, reliability, and performance.
- Drive standardization of service and application instrumentation to help development teams gain system observability.
- Develop and maintain reusable infrastructure components to guide best practices.
- Develop and maintain tooling to make our system more reliable, secure, and performant.
- Create a framework for incident management to standardize how teams respond to and document outages.
- Build proactive monitoring that alerts on symptoms rather than on outages.
What you will bring:
- 3+ years of experience in a Site Reliability or Platform Engineering role.
- You are a relentless advocate for best practices and continual improvements.
- Experience scaling high-performance applications to ensure highly available, resilient services.
- Experience using Prometheus for metric aggregation and alerting.
- Proficient in writing code in one or more languages such as TypeScript, Go (Golang), or Kotlin.
- Experience using Infrastructure as Code tools such as Terraform, Pulumi, or CloudFormation.
- Experience with managing cloud infrastructure in AWS, Azure, or Google Cloud.
- Experience with containerization and container orchestration platforms (ECS, Kubernetes, Nomad, OpenShift).
- Experience using observability tooling such as Grafana, Datadog, CloudWatch, or Honeycomb for diagnosing production issues.
- Experience using security tools to keep infrastructure and services secure.
- Operational experience and insights into best practices with running and scaling Prometheus.
- Well versed in querying Prometheus and creating useful PrometheusRules.
- Experience with eBPF and its ecosystem.
- Experience using Loki for log aggregation.
- Experience using Tempo for trace aggregation.
- Experience with graph databases (e.g. AWS Neptune).
- Experience helping development teams instrument their services to gain deep insights into runtime behavior.
- Experience making Kubernetes a seamless PaaS for software engineers.
- You have an active Certified Kubernetes Administrator (CKA) certification and/or equivalent experience that makes you an expert in this area.
- You have an active Certified Kubernetes Application Developer (CKAD) certification and/or equivalent experience that makes you an expert at deploying applications to Kubernetes.
- You have contributed to one or more CNCF-related projects on GitHub.
JupiterOne's competitive compensation packages for salaried roles include base salary, equity, variable/incentive pay, perks, and benefits. Cash compensation is determined by a variety of factors, including job family, function, level, and geographic location, and is benchmarked against current market data.
The target base salary range for this position is $136,500-$190,000.*
* We are interviewing both senior- and principal-level engineers.
Final compensation packages are determined by multiple factors such as individual experience, education, qualifications, certifications, performance level, specialized expertise, and geographic location, and may vary from the target salary range listed above.
What we offer:
💰 Competitive Salary, plus Annual Bonus eligibility, plus Equity Options
🩺 Medical & Rx Plans with Telemedicine, Mental Health Support, and Fertility Benefits (incl. domestic partners)
🦷 Robust Dental and 🤓 Vision Plans (includes adult orthodontics!)
🆓 Zero-cost medical/dental/vision options for employee-only coverage
🏝 Flexible Paid Time Off (PTO) plus 🇺🇸 10 Paid Holidays, including JupiterOne Day on July 21st
🐣 Paid Maternity & Paternity Leave at 100% of your salary
🦺 Paid Time Off to Volunteer every quarter
🏋🏽‍♂️🏊🏼‍♀️ Wellness Activities Reimbursement
➕ 401(k), Life Insurance, Short Term / Long Term Disability Options
🗣 Generous Employee Referral Program
🎉 Fantastic Company Culture - Fun Company Events - Career Growth Potential
🏳️‍🌈 🏳️‍⚧️ ✊🏿✊🏾✊🏽✊🏼✊🏻✊ All are welcome, celebrated, supported, and appreciated!
We are committed to growing fast with the support of our customers, team, and community.
We enjoy a culture of excellence among an accomplished group of executive leaders, engineers, and sales and marketing professionals.
Here are some awards we’re proud to share:
JupiterOne named one of America’s Best Startup Employers by Forbes in 2023
JupiterOne named Best Software Company and Best Small Company in 2023 by InHerSight
JupiterOne named one of the Forbes Top 20 Cybersecurity Startups to Watch in 2021
JupiterOne Founder and CEO, Erkang Zheng, selected as one of the Top 25 Cybersecurity CEOs of 2021
JupiterOne CISO, Sounil Yu, named Winner of the Top 10 CISOs of 2021 and Finalist for Top 10 Cyber Security Experts in the Black Unicorn Awards at Black Hat 2021
This is an opportunity to join a fully funded startup with incredible prospects for growth (career, financial, and personal).
JupiterOne is an equal opportunity employer. We're an inclusive team that is dedicated to creating a diverse environment. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information.
JupiterOne is committed to providing access, equal opportunity and reasonable accommodation for individuals with disabilities in employment, its services, programs, and activities. To request reasonable accommodation, contact us at email@example.com or 833-578-7663.
We are unable to provide employment sponsorship at this time.