SRE (Site Reliability Engineer)
Lisbon, Portugal
Who Are We?
Obol Labs is a remote-first research and software development team focused on Proof of Stake infrastructure for public blockchain networks. Specific topics of focus are Internet Bonds, Distributed Validator Technology, and Multi-Operator Validation. The core team includes 14 members spread across 8 countries.
The core team is building the Obol Network, a protocol to foster trust-minimized staking through multi-operator validation. This will enable low-trust access to Ethereum staking yield, which can be used as a core building block in various Web3 products.
The network can be best visualized as a work layer that sits directly on top of the base layer consensus. This work layer is designed to provide the base layer with more resiliency and decentralization as it scales. In this chapter of Ethereum, we will move on to the next great scaling challenge, which is stake centralization. Layers like Obol are critical to the long-term viability and resiliency of public networks, especially networks like Ethereum.
Obol as a layer is focused on scaling main chain staking by providing permissionless access to Distributed Validators. The network uses a middleware implementation of Distributed Validator Technology (DVT) to enable the operation of distributed validator clusters that preserve validators' current client and remote signing configurations.
Similar to how roll-up technology laid the foundation for L2 scaling implementations, we believe DVT will do the same for scaling the consensus layer while preserving decentralization. Staking infrastructure is entering its protocol phase of evolution, which must include trust-minimized staking networks that can be plugged into at scale. We believe DVT will evolve into a widely used primitive and will ensure the security, resiliency, and decentralization of public networks.
The Obol Network develops and maintains four core public goods that will eventually work together through circular economics:
- The DV Launchpad, a user interface for bootstrapping and managing Distributed Validators
- Charon, a middleware Golang client that enables validators to run in a fault-tolerant, distributed manner
- Obol Managers, a set of Solidity libraries for the formation of Distributed Validators tailored to different use cases such as DeFi, Liquid Staking, and Fractionalized Deposits
- Obol Testnets, a set of ongoing public incentivized testnets that enable operators of any size to test their deployment before serving on Ethereum mainnet
Sustainable Public Goods
Obol is inspired by previous work on Ethereum public goods and experimenting with circular economics. We believe that to unlock innovation in staking use cases, a credibly neutral layer must exist for innovation to flow and evolve vertically. Without this layer, highly available uptime will continue to be a moat.
The Obol Network will become an open, community-governed, self-sustaining project over the coming months and years. Together we will incentivize, build, and maintain distributed validator technology that makes public networks a more secure and resilient foundation to build on top of.
The Platform Engineering team at Obol is looking for a talented and experienced SRE (site reliability engineer) to help us build and support our global infrastructure and operations.
Join our growing organization and you will get a chance to be in the driving seat of innovation and change at Obol.
As a site reliability engineer, you will be responsible for building, monitoring, securing, and ensuring the reliability of our globally distributed infrastructure that supports Obol's network of thousands of Distributed Validator clusters and deployments.
- Responsible for infrastructure automation and observability on different Cloud and on-prem platforms.
- Monitor and troubleshoot incidents and issues in Obol’s infrastructure, and ensure that standard incident management and post-mortem procedures are followed.
- Collaborate with staking operators to ensure optimal performance of Obol’s DVT and seamless deployments and rollouts, and to identify and fix issues.
- This collaboration includes synchronous communication through calls and asynchronous communication via Discord, Telegram, email, etc.
- Support and enhance the reliability and performance of Obol’s Distributed Validator Client for running Ethereum validators in a fault-tolerant manner.
- Participate in the engineering on-call rotations to ensure systems uptime and incident resolution.
- Apply platform engineering and cloud-native best practices and standards to the software you write.
- At least 2 years of experience in Site Reliability Engineering or a similar role.
- Experience in one of the public cloud platforms GCP, AWS, or Azure.
- Experience in containerization technologies with Docker, Docker Compose, and Kubernetes. Experience in developing infrastructure as code with Terraform.
- Development and scripting experience, preferably with Bash, Python, or Golang. Experience with monitoring using Prometheus and Grafana.
- Prior experience in web3 and blockchain technologies such as Ethereum is highly preferred, particularly an understanding of how an Ethereum proof of stake validator works. (Mandatory)
- Excellent communication and delivery skills to represent Obol externally by working with enterprise staking node operators deploying our software.
Nice to have
- Experience with Ansible, Helm, and Grafana Loki
- Experience in networking and distributed systems
- Experience working with remote teams
- Fully Remote, flexible working hours (Independent contractor)
- Work with a team of talented engineers building amazing stuff
- We care about work-life balance: you are important as a professional and also as a person!
- Annual global offsite
- Unlimited paid time off (based on our company policy)
- Personal hardware & professional training budget
Apply to our role and be part of our growing team!