Senior Site Reliability Engineer

Kuala Lumpur
IT Department – DevOps / Full time / On-site
Xsolla KL is looking for a Senior Site Reliability Engineer in our KL Office.

Xsolla Technology Stack: Ubuntu, Kubernetes, GitLab, Terraform, Terragrunt, Puppet, Nginx, Google Cloud Platform, AWS, Prometheus, Grafana, New Relic, ELK, Zabbix, Artifactory, and Harbor.

RESPONSIBILITIES

    • Ensure high reliability and availability, and meet SLAs and SLOs as tracked by SLIs.
    • Review, provide feedback, and mentor coworkers on changes to maintain reliability.
    • Design and develop infrastructure and operational tasks with scalability and stability in mind.
    • Collaborate with stakeholders to deliver cost-effective, excellent infrastructure solutions and identify areas for improvement.
    • Drive incident resolution and process improvements to minimize downtime and increase operational transparency.
    • Ensure all key services are measured and monitored, and raise alerts when needed.
    • Develop comprehensive monitoring solutions that provide full visibility into the different platform components, using tools and services such as Kubernetes, Prometheus, Grafana, New Relic, and others.
    • Support services before they go live through activities such as capacity planning, monitoring setup, logging, and production readiness reviews.
    • Engage in service capacity planning and demand forecasting, performance analysis, and system tuning.
    • Train and mentor less experienced engineers and set the direction for other engineers.
    • Collaborate with the development teams to enhance the product's operational stability.

REQUIREMENTS

    • Proven experience (5+ years) as a Site Reliability Engineer or in a similar software engineering role in a large-scale production environment, and 8+ years overall in IT (in operations or development).
    • Robust knowledge of and experience in cloud computing (AWS/GCP preferred).
    • Proficiency in Python or Go. Experience with PHP will be a plus.
    • Deep knowledge of monitoring systems such as Prometheus, Grafana, ELK/EFK, New Relic or Datadog, OpsGenie.
    • Excellent understanding of continuous integration/continuous delivery processes and platforms (GitLab preferred). Experience with Helm.
    • Solid experience with Docker, Kubernetes, or other container orchestration systems.
    • Solid experience with infrastructure automation tools like Terraform.
    • Experience with automation, system administration, and system hardening.
    • Experience with Linux-based infrastructures, Linux/Unix administration.
    • Demonstrated problem-solving skills, particularly debugging and troubleshooting complex software systems. Ability to work under pressure.
    • Excellent communication skills with a capacity to articulate and solve complex technical problems.
    • Proficiency in written and verbal English.

NICE TO HAVE

    • IT professional certifications are not required but will be a plus:
    • Prometheus Certified Associate (PCA) 
    • HashiCorp Certifications
    • Certified Kubernetes Administrator or Developer

BENEFITS

Convenient work tools:
    • Latest Mac workstations plus additional hardware to make you more effective at work
    • Google Chat, Gmail, Google Drive, Confluence, Jira, GitLab
Professional growth:
    • Free training and participation in specialized conferences
    • Rich knowledge exchange within the company
More perks:
    • Health insurance
    • Flexible hours: organize your day around your needs and the demands of your sprint and team
    • No dress code
    • Comfortable, new office environment
 
ABOUT XSOLLA
Xsolla is a global video game commerce company with a robust and powerful set of tools and services designed specifically for the video game industry. Since its founding in 2005, Xsolla has helped thousands of game developers and publishers of all sizes fund, market, launch and monetize their games globally and across multiple platforms. As an innovative leader in in-game commerce, Xsolla’s mission is to solve the inherent complexities of global distribution, marketing, and monetization to help our partners reach more geographies, generate more revenue and create relationships with gamers worldwide. Xsolla is headquartered and incorporated in Los Angeles, California, with offices in Berlin, Seoul, and cities worldwide. Xsolla supports major gaming titles like Valve, Twitch, Roblox, Ubisoft, Epic Games, Take-Two, KRAFTON, Nexters, NetEase, Playstudios, Playrix, miHoYo, and more. 

For additional information, please visit xsolla.com.
 
PHYSICAL DEMANDS
The physical demands for this position are sitting, standing, bending, lifting, and moving intermittently during working hours. These physical requirements may be accomplished with or without reasonable accommodations.
The duties of this position may change from time to time so the individual and organization can achieve their results. This job description is intended to describe the general level of work being performed. It is not intended to be all-inclusive.

Xsolla KL Sdn Bhd takes your privacy very seriously and will not sell or externally distribute any data received during the hiring process. Pursuant to the Personal Data Protection Act 2010 (“PDPA”), Xsolla KL Sdn Bhd is mindful of and committed to the protection of your personal information and your privacy. For more information related to the PDPA, please reach out to careers@xsolla.com.

Longevity. Opportunity. Vision. Enjoy the game.
 
**Currently, this position is open only to local applicants: Malaysian citizens or holders of PR status in Malaysia.