Director, Site Reliability Engineering
San Francisco, CA
Technology & Networking – Software - SRE
Crusoe Energy is on a mission to unlock value in stranded energy resources through the power of computation.
Take a look at what we do! - https://www.youtube.com/watch?v=Rlt8k71Quqw
We aim to align the long-term interests of the climate with the future of global computing infrastructure. As data centers consume an exponentially growing power footprint to deliver technology to all connected devices, we are inspired to make sure that the energy meeting that demand is sourced in an environmentally responsible fashion. Crusoe co-locates mobile data centers with stranded energy resources, like flare gas and underloaded renewables, to deliver low-cost, carbon-negative distributed computing solutions. Crusoe Cloud is a managed cloud services platform powered by stranded energy that enables climate-friendly innovation in computationally intensive fields including artificial intelligence, graphics rendering and computational biology.
Our values drive our work: thinking like a mountaineer, a relentless commitment to resource efficiency, and tapping into our collective genius. At Crusoe, you will be challenged to do the best work of your entire life and to become your best self.
Our teams are empowered to tackle hard problems that benefit our customers, grow our business and create a positive impact in reducing global emissions.
About The Role
In this role, you will be responsible for leading a team of engineers in building and maintaining key infrastructure for the company. You should have a proven track record of building fast, highly available infrastructure at scale, and be able to use your expertise in cloud services, developer tools, and security practices to guide decision-making and ensure a high level of technical excellence.
You will report directly to the CTO and work closely with the engineering team to align the SRE team's work with the company's overall business objectives, while balancing resource trade-offs and maintaining a high degree of technical excellence.
In addition to coordinating cross-functional technical projects and evaluating tools and frameworks for the engineering team, you will mentor engineers and take an active role in their development as the company grows. You will continually refine technical and team-fit assessments and work with engineering leadership to ensure that the team is meeting its goals and making a positive impact on global emissions.
A Day in the Life:
- Grow and lead the SRE organization focused on both the cloud platform and our fleet’s operational expansion
- Develop and implement the SRE strategy and roadmap around supporting network uptime, application deployment, security, and privacy
- Hire, onboard, mentor, and develop SRE team members, while optimizing the resources needed to deliver high-quality software on time
- Collaborate cross-functionally with relevant teams around the org (e.g., product management, operations, business development, customer support) to ensure that software reliability meets stakeholders’ needs and expectations
- Stay up-to-date on modern SRE tools, technologies, processes and best practices
- Hold the SRE team and wider organization accountable for maintaining this standard
- Co-facilitate architecture and infrastructure design decisions
You Will Thrive In This Role If:
- Bachelor's degree in Computer Science or a related field, or 15+ years of relevant work experience
- 12+ years of professional SRE experience
- 8+ years of experience contributing to architecture, design patterns, reliability and scaling of new and current systems
- Experience managing in an organization where you helped define the role and build out the SRE function, including hiring and developing other SREs
- Experience with information security best practices
- Experience running and maintaining production infrastructure hosted on AWS/GCP/Azure
- Experience with logging, monitoring and alerting systems and tools where revenue and business goals were impacted by performance
- Excellent written and verbal communication and interpersonal skills
- Embodies the company values at all times
Exposure across the following domains, with strong expertise in at least four:
- Unix/Linux environments (kernel, networking, virtualization)
- Bare metal provisioning and physical datacenter operations
- Modern cloud infrastructure tools such as Docker, Kubernetes
- Infrastructure-as-code tooling such as Pulumi, Ansible, CloudFormation, Terraform
- Modern CI/CD practices and build systems, such as GitLab CI/CD, CircleCI, GitHub Actions
- TCP/IP and network programming
Benefits:
- Hybrid work schedule
- Industry competitive pay
- Restricted Stock Units in a fast-growing, well-funded technology company
- Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
- Paid Parental Leave
- Paid life insurance, short-term and long-term disability
- Pet-friendly offices
- 401(k) with a 100% match up to 4% of salary
- Unlimited time off
- Cell phone reimbursement
- Tuition reimbursement
- Subscription to the Calm app
- NYDIG - Bitcoin Savings Plan
- Company-paid commuter benefit: $100 per month
- Compensation will be paid in the range of $225,000 - $250,000. RSUs are included in all offers. Compensation will be determined by the applicant's knowledge, education, and abilities, as well as internal equity and alignment with market data.
Crusoe Energy is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.