Senior Site Reliability Engineer
Every day, millions of people rely on WeTransfer to share their creative ideas.
Having made its name in the game of quick and simple file-sharing, WeTransfer has grown into an end-to-end suite of digital solutions with more than 87 million monthly active users in 190 countries. Beyond the WeTransfer.com platform, we have the storytelling platform WePresent, quick slide-making tool Paste, immersive sketching app Paper, and inspiration-capturing tool Collect. We design and deliver delightful experiences that continue to feel obvious and intuitive to millions of people, from our moms to your favorite artists. As a certified B-Corp, WeTransfer aims to be a sustainable and responsible tech company, balancing people, planet, and profit.
So, the work we do matters. Come and be a part of it.
Site Reliability Engineering
SRE is part of the Infrastructure pod (together with Security, Platform, and Authentication & Billing), which is at the core of engineering at WeTransfer. You'll accelerate innovation by providing developers with the tools they need to take full ownership, while ensuring those tools are reliable, performant, and secure. Together with the product teams, you will work on processes, tools, and services to support end-to-end ownership and to improve our services' reliability. SRE at WeTransfer is still very new, and we have only just started building the team, so this is a great time to join.
What you'll be doing:
- Build and maintain our observability stack to help engineers monitor and debug their applications.
- Improve incident management, response and review processes.
- Build and maintain services for engineers to improve the reliability of their products.
- Work with other teams and engineers to understand their needs, help them with their reliability-related queries, and advise on how to improve stability and resilience.
What we will be looking for:
- Above all else we value open communication and good teamwork to achieve a shared vision. We are looking for proactivity in spotting problem areas and proposing ideas for improvements. You value good engineering principles and long-term design.
- Some hands-on experience with monitoring from an SRE or user perspective; experience with DataDog is a plus.
- Experience with incident response and review, preferably in a no-blame culture.
- Some hands-on experience implementing resilience patterns such as circuit breakers, retry mechanisms, and rate limiters.
- Some professional technical experience building and operating platform services at scale in a containerized environment.
- Some hands-on experience building services in a backend language such as Golang or Python.
- Hands-on experience using an infrastructure-as-a-service provider; experience with AWS is a plus.
- Working understanding of networking principles such as routing, load balancing, and the TCP/IP stack.
- Experience with teaching or implementing SLOs and/or chaos engineering is a plus.
WeTransfer is an equal opportunity employer and we pride ourselves on the diversity of our people. We welcome you, and everything that makes you—well, you. That includes your gender identity, sexual orientation, religion, ethnicity, age, or disability status.
A note on remote
Our work environment is hybrid-remote, meaning that we support our employees in working both remotely and in the office. We encourage employees to decide for themselves and with their team whether or when to go to the office. However, we recommend that you don't come to the office more than 2-3 days per week, since that wouldn't be hybrid anymore.
While it is not necessarily a determining or disqualifying factor for any role, you may be required to complete a standard employment background screening.