SRE (Site Reliability Engineer)

Lisbon, Portugal

Engineering / Full Time / Remote

Who Are We?

Obol Labs is a remote-first research and software development team focused on Proof of Stake infrastructure for public blockchain networks. Specific topics of focus are Internet Bonds, Distributed Validator Technology, and Multi-Operator Validation. The core team includes 35 members spread across more than 14 countries.

The core team is building the Obol Network, a protocol to foster trust-minimized staking through multi-operator validation. This will enable low-trust access to Ethereum staking yield, which can be used as a core building block in various Web3 products.

The Network

The network can be best visualized as a work layer that sits directly on top of the base layer consensus. This work layer is designed to provide the base layer with more resiliency and decentralization as it scales. In this chapter of Ethereum, we will move on to the next great scaling challenge, which is stake centralization. Layers like Obol are critical to the long-term viability and resiliency of public networks, especially networks like Ethereum.

Obol as a layer is focused on scaling main chain staking by providing permissionless access to Distributed Validators. The network uses a middleware implementation of Distributed Validator Technology (DVT) to enable the operation of distributed validator clusters that can preserve validators' current client and remote signing configurations.

Similar to how roll-up technology laid the foundation for L2 scaling implementations, we believe DVT will do the same for scaling the consensus layer while preserving decentralization. Staking infrastructure is entering its protocol phase of evolution, which must include trust-minimized staking networks that can be plugged into at scale. We believe DVT will evolve into a widely used primitive and will ensure the security, resiliency, and decentralization of public networks.

The Obol Network develops and maintains three core public goods that will eventually work together through circular economics:

The DV Launchpad, a user interface for bootstrapping and managing Distributed Validators.

Charon, a Golang-based middleware client that enables validators to run in a fault-tolerant, distributed manner.

Obol Splits, a set of Solidity contracts for the formation of Distributed Validators tailored to different use cases such as DeFi, Liquid Staking, and Fractionalized Deposits.

Sustainable Public Goods

Obol is inspired by previous work on Ethereum public goods and is experimenting with circular economics. We believe that to unlock innovation in staking use cases, a credibly neutral layer must exist for innovation to flow and evolve vertically. Without this layer, highly available uptime will continue to be a moat.

The Obol Network will become an open, community-governed, self-sustaining project over the coming months and years. Together we will incentivize, build, and maintain distributed validator technology that makes public networks a more secure and resilient foundation to build on top of.

The Role

The Platform Engineering team at Obol is looking for a talented and experienced site reliability engineer to help us build and support our global infrastructure and operations. Join our growing organization and you will get a chance to be in the driving seat of innovation and change at Obol.

As a site reliability engineer, you will be responsible for building, monitoring, securing, and ensuring the reliability of our globally distributed infrastructure that supports Obol's network of thousands of Distributed Validator clusters and deployments.

Responsibilities

Own the security, observability, and monitoring stack.
Troubleshoot incidents and issues in Obol’s infrastructure.
Support and enhance the reliability and performance of Obol’s Distributed Validator Client for running Ethereum validators in a fault-tolerant manner.
Collaborate with staking node operators to ensure optimal performance of Obol’s DVT and seamless deployments and rollouts, and to identify and fix issues.
Coordinate the on-call rotations to ensure system uptime and incident resolution.
Apply platform engineering and cloud-native best practices and standards to the software you write.

Requirements

At least 4 years of experience in Site Reliability Engineering or a similar role.
Experience with Web3 and blockchains such as Ethereum, in particular an understanding of how proof of stake works and of node operations.
Experience with one or more of the public cloud platforms GCP, AWS, or Azure.
Experience with containerization using Docker and Kubernetes.
Experience with monitoring using Prometheus, Loki, and Grafana.
Experience in developing infrastructure as code with Terraform.
Development experience, preferably with Bash, Python, or Golang.
Excellent communication and delivery skills.

Nice to have

Experience with Helm, Ansible, and DevSecOps.
Experience in networking and distributed systems.
Experience working with remote teams.

Benefits

Fully remote working and flexible hours.
Meet the team at our annual offsites.
Chance to attend crypto and staking conferences.
Working with the purpose of decentralising Ethereum.
Generous paid time off.
Budget for equipment.
Budget for training or education.

Thank you for your interest. Looking forward to building amazing stuff together!
