Site Reliability Engineer

Beijing
IT Department – DevOps / Full time / On-site
The Xsolla DevOps team is looking for a passionate Senior Site Reliability Engineer.

Xsolla Technology Stack: Ubuntu, Kubernetes, GitLab, Terraform, Terragrunt, Puppet, Nginx, Google Cloud Platform, Prometheus, Grafana, New Relic, ELK, Zabbix, Artifactory, and Harbor.

RESPONSIBILITIES

    • Ensure high reliability and availability, meeting SLAs and SLOs as measured by SLIs.
    • Review, provide feedback, and mentor coworkers on changes to maintain reliability.
    • Design and develop infrastructure and operational tasks with scalability and stability in mind.
    • Collaborate with stakeholders to deliver cost-effective, excellent infrastructure solutions and identify areas for improvement.
    • Drive incident resolution and process improvements to minimize downtime and increase operational transparency.
    • Ensure all key services are measured and monitored, and raise alerts when needed.
    • Develop comprehensive monitoring solutions that provide full visibility into the different platform components, using tools and services such as Kubernetes, Prometheus, Grafana, New Relic, and others.
    • Support services before they go live through activities such as capacity planning, monitoring setup, logging, and production readiness reviews.
    • Engage in service capacity planning and demand forecasting, performance analysis, and system tuning.
    • Train and mentor less experienced engineers and set the direction for other engineers.
    • Collaborate with the development teams to enhance the product's operational stability.

REQUIREMENTS

    • 5+ years of proven experience as a Site Reliability Engineer, or in a similar software engineering role, in a large-scale production environment; 8+ years overall in IT (in operations or development).
    • Robust knowledge of and experience in cloud computing (AWS or GCP preferred).
    • Proficiency in Python or Go. Experience with PHP will be a plus.
    • Deep knowledge of monitoring and alerting systems such as Prometheus, Grafana, ELK/EFK, New Relic or Datadog, and OpsGenie.
    • Excellent understanding of continuous integration/continuous delivery (CI/CD) processes and platforms (GitLab preferred), plus experience with Helm.
    • Solid experience with Docker, Kubernetes, or other container orchestration systems.
    • Solid experience with infrastructure automation tools like Terraform.
    • Experience with automation, system administration, and system hardening.
    • Experience with Linux-based infrastructure and Linux/Unix administration.
    • Demonstrated problem-solving skills, particularly debugging and troubleshooting complex software systems. Ability to work under pressure.
    • Excellent communication skills with a capacity to articulate and solve complex technical problems.
    • Proficiency in written and spoken English.

NICE TO HAVE

    • Prometheus Certified Associate (PCA)
    • HashiCorp certifications
    • Certified Kubernetes Administrator or Developer

ABOUT XSOLLA

Xsolla is a global video game commerce company with a robust and powerful set of tools and services designed specifically for the video game industry. Since its founding in 2005, Xsolla has helped thousands of game developers and publishers of all sizes fund, market, launch, and monetize their games globally and across multiple platforms. As an innovative leader in in-game commerce, Xsolla’s mission is to solve the inherent complexities of global distribution, marketing, and monetization to help our partners reach more geographies, generate more revenue, and create relationships with gamers worldwide. Xsolla is headquartered and incorporated in Los Angeles, California, with offices in Berlin, Seoul, and other cities worldwide. Xsolla supports major gaming companies like Valve, Twitch, Roblox, Ubisoft, Epic Games, Take-Two, KRAFTON, Nexters, NetEase, Playstudios, Playrix, miHoYo, and more.

To learn more, please visit xsolla.com.

Longevity Opportunity Vision Enjoy the game.

For more vacancies: https://xsolla.com/careers/vacancies