Site Reliability Engineer
Any US - remote
Engineering – Engineering
Coalition’s Insurance and Cybersecurity offerings come together to provide a comprehensive shield from cyber risk. We believe that locking down every system and keeping up with every vulnerability is challenging, and while being proactive is important, it is not enough: breaches and other compromises happen, even to the vigilant.
We proactively help our customers understand active risks and shut them down. When all else fails, we are there for them financially and with services that help them mitigate damage and come back stronger after an incident.
Help us protect the world against cyber risk and give business owners a trusted support system and a fighting chance.
We have over 25,000 customers, ranging from small and mid-sized businesses to Fortune 500 companies. Founded in 2017, Coalition has raised $125M from a number of top-tier global investment firms, including Ribbit Capital, Greenoaks Capital, Valor Equity Partners, Felicis Ventures, and Vy Capital. Headquartered in San Francisco, Coalition’s team is distributed across more than 15 locations globally, including Austin, Washington DC, Denver, Canada, and Portugal.
Our culture is one of character, humility, responsibility, purpose, authenticity, and no a-holes. We are growing rapidly and that growth is enabled by strong teamwork, communication, and mentorship. We want people who are passionate about becoming experts in both the business and the technologies that support it.
Our core platform is written mostly in Python, with some services in Java and Go. We prefer to use the right tool for the job and make pragmatic decisions about how to scale and decouple systems as we continue to grow. We’re looking for someone who can navigate a cloud environment (AWS) with many moving pieces and systems, and help the team understand how they fit into the broader puzzle.
About the Role
We are looking for a Site Reliability Engineer (Remote) who has the experience, ability, and mental fortitude to instrument and monitor the breadth of our full system stack (hosts, applications, and performance). In this role you will work closely with our engineering and information security teams to enhance the automated system provisioning and deployment subsystems within codified infrastructure. You will work with developers to create more robust and scalable services independent of cloud implementations. You will help to isolate, trap, and respond to inevitable system failures, and develop strategies for continuous monitoring and analysis to reduce both downtime and required manual intervention.
Qualifications
- 3+ years of combined experience in SRE/DevOps roles in a full stack engineering environment
- 2+ years of experience in automated system provisioning, configuration, and Infrastructure as Code (CloudFormation, Terraform, Ansible, etc.)
- Demonstrated proficiency with containerization and orchestration tools such as Kubernetes, Swarm, or ECS
- Experience with CI/CD systems such as Jenkins, Travis CI, or CircleCI
- Demonstrated proficiency in Python, Go, or other scripting and systems languages
- Experience working with fault-tolerant services and the iterative development of highly available systems
- Some experience with one or more Infrastructure as a Service cloud providers (AWS/Azure/DigitalOcean/Google Cloud)
- Excellent organizational, verbal, and written communication skills
- Bachelor’s or Master’s degree in Computer Science, related field, or equivalent experience
- Experience with converting monolithic applications to microservices and service discovery technology
- Prior experience with full-stack monitoring from system level metrics to SLOs, failure-based testing approaches, and monitoring strategies
- Understanding of networking, systems engineering, hardware, and data center architecture
- Exposure to systems security requirements and basic information assurance techniques
- Exposure to Kafka, AMQP, Kinesis, job queues, and other pub/sub or queuing systems
- Exposure to vulnerability scan results and reports
- Exposure to the information security domain and data breaches
- Knowledge of Scrum and Agile methodologies
Benefits
- Enjoy a highly fulfilling, mission-driven culture
- Health, dental, and vision benefits for you and your family
- Life insurance and disability benefits
- Paid parental leave
- 401(k) plan
- Wellness and commuter benefits
- Flexible working hours
- Open vacation days
- We embrace distributed work; some benefits will vary by location
- You are an owner! We offer stock options to each of our employees
- More details at https://www.coalitioninc.com/careers
We are all here to build something we believe in and to make a company that will last. We’re also assembling a team of expert incident responders, threat and malware researchers, and security analysts to protect our customers before, during, and after a cyber incident. Our goal is to combine the power of technology with the safety of insurance to provide the first holistic solution to cyber risk. Coalition’s culture is one that strongly values humility, authenticity, and diversity. We want to work with people of different backgrounds and different paths in life, and we trust our team members to take responsibility, share ownership, and work for one another. We are always looking for collaborative, inquisitive, and dedicated individuals to join our team.
Coalition is proud to be an Equal Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.