Site Reliability Engineering Manager
San Francisco, CA
Engineering – Site Reliability
At Carta, our mission is to create more owners. By building an ownership management and execution platform, we’re changing how more than 12,000 companies and more than 800,000 investors, law firms, and employees manage equity around the globe.
We are looking for an SRE Manager to help scale our team and systems. Your primary responsibility will be to keep building the teams, systems, and culture that make our products, organization, and infrastructure resilient, scalable, and reliable.
On the technical side, you should have a strong background in, and opinions on, distributed systems architecture, software development, systems theory and networking, and performance engineering, including OS internals. You should have practical experience being responsible for production systems running at scale. You'll be hands-on and will also represent the SRE perspective in architectural and other engineering discussions. You understand how complex systems fail and invest the time and effort to educate others.
On the non-technical side, you should understand and believe that SRE is about product. You look forward to working closely with every part of the organization to make sure we're delivering the best products and services to all our customers, internal and external. You will be a champion of the SRE mindset, which means spreading the culture and concepts of things like "psychological safety" and "just culture". You'll run blameless learning reviews and help continuously improve our processes across the board. You'll be responsible for growing and mentoring the SRE team to support the organization's expanding initiatives.
- 6+ years of professional experience in an SRE or similar role
- 2+ years of experience managing an SRE team
- You've been responsible for production systems
- Experience managing distributed teams
- Experience with, and strong opinions on, what monitoring and alerting should look like for complex systems
- Hands-on experience hosting distributed systems on a cloud provider (AWS, Google Cloud Platform, etc.)
- Hands-on experience running production systems on Kubernetes
- “Infrastructure as code” tools such as Terraform or CloudFormation
- Configuration management tools such as Ansible, Chef, Puppet, or Salt
- Experience building services in Python, Go, or Java
- Experience with a variety of database types: relational, time series, document
- Consumer-facing web application development and operations experience
- Experience with CI/CD pipelines
- Experience with message-passing architectures
- Distributed tracing
- Running a service-oriented architecture in production
- Any security background (network, application, OS)
Carta is creating the ownership network that maps the world's assets. Check out who we are and how we work here.
At Carta we want to create an environment for Carta's owners - you - to do your best work, by offering competitive benefits and perks:
We are committed to WELLNESS:
- Health, dental, vision, and life insurance
- Competitive PTO and unlimited sick time
We are committed to INVESTING IN YOUR FUTURE:
- 401(k) matching program
We are committed to A SUPPORTIVE WORK ENVIRONMENT:
- Commuter benefits
- Catered lunch and unlimited snacks
- Cell phone stipend
We are committed to LIFELONG LEARNING:
- Unlimited reimbursement for work-related books
- Fast-paced work environment geared towards professional growth