Site Reliability Engineer
New York, NY /
Product, Data, and Engineering – Engineering /
Full time flexible location
Spruce is the platform enabling modern real estate transactions. We work with forward-thinking mortgage lenders, real estate companies, and investors. We believe the future of real estate will be driven by automation, efficiency, and digital experiences, and we provide the products and services necessary to make that happen.
About the job
The Site Reliability team plays a critical role in developing infrastructure to reliably serve our customers, enhancing developer productivity, and teaching others about SRE best practices. As a Site Reliability Engineer you’ll get hands-on experience building, deploying, and maintaining our internal and external systems, while working closely with the product engineers who use the systems you’re building.
We’re looking for people with a strong background in systems. We’d love to hear from you whether you’re a seasoned systems developer, or whether you’ve just learned you might like working with databases. Our site reliability team works remotely, and we’d be happy to talk to you about the possibility of working remotely.
You might be a fit for this role if
- You think about systems design, edge cases, and fault tolerance
- You enjoy working with Kubernetes every day to build scalable infrastructure systems
- You want to build automated release pipelines with CI/CD tools (e.g. CircleCI, Jenkins)
- You can use a Unix shell to debug complex bugs across the whole stack
- You focus on the needs of both internal and external users
- You can write high-quality code in a programming language (e.g. Go, Python, Node.js)
- You have experience with at least one major cloud provider (AWS, GCP, Azure)
Projects you could work on
- Develop features for our automated GitHub Actions QA system
- Upgrade Kubernetes clusters with modern cloud-native tooling and systems design
- Enhance our GitOps-style CI/CD pipelines to make better production releases
- Implement monitoring tools to better observe our systems and understand if we’re meeting our service level objectives (SLOs)
- Add request tracing to our systems and services
- Test our systems’ fault tolerance with load testing and Chaos Engineering tools
We are proud of the team we’re building. We’re committed to equal opportunity employment, and beyond. We believe diverse experiences and perspectives build a stronger team and a better product. We welcome fresh perspectives and challenge our own assumptions to make Spruce better. The more inclusive we are as a company, the better we can serve our customers.