Senior Software Engineer - GPU Virtualization

Toronto, ON
Engineering / Full-Time / Hybrid
About Us
We believe AI will fundamentally transform how people live and work. CentML's mission is to massively reduce the cost of developing and deploying ML models so we can enable anyone to harness the power of AI and everyone to benefit from its potential.

Our founding team is made up of experts in AI, compilers, and ML hardware and has led efforts at companies like Amazon, Google, Microsoft Research, Nvidia, Intel, Qualcomm, and IBM. Our co-founder and CEO, Gennady Pekhimenko, is a world-renowned expert in ML systems who holds multiple academic and industry research awards from Google, Amazon, Facebook, and VMware.

Overview: 
 
We are seeking a highly motivated and skilled research software development engineer to join our team in a key role focused on designing, developing, and maintaining the CentML platform, which offers cost-effective infrastructure for serving and training large-scale machine learning models. In this role, you will work with our research team to develop software-level GPU virtualization technologies tailored to ML applications, with the goal of improving hardware resource utilization across our GPU cluster. You will also work with our infrastructure team to integrate these solutions into the CentML platform.

Responsibilities:

    • Take part in the design and development of the CentML platform.
    • Work with our research team to implement efficient and reliable software-level GPU virtualization for ML training and inference workloads, and integrate it into the CentML platform.
    • Work with our product teams to define new features and goals for improving the CentML platform.

Qualifications

    • Graduate degree in Computer Science, Computer Engineering, or a relevant technical field, or equivalent practical experience. A publication record is highly desirable.
    • Strong background in GPU architecture and GPGPU programming. Experience specifically in Nvidia GPU architecture, PTX ISA, and CUDA programming is highly desired.
    • Experience working on GPU hardware and software-level virtualization technologies (e.g., Nvidia MPS, MIG).
    • Experience with OS-level programming (e.g., Linux kernel, Nvidia device driver). Expertise in container runtime technologies such as Docker Engine, containerd, or CRI-O is a big plus.
    • Fluency in C/C++.
Benefits & Perks
- An open and inclusive culture and work environment
- Fully stocked kitchen at the office
- Full health and dental benefits
- Parental Leave top-up for 6 months
- Continuous education budget
- Generous vacation - we're not saying unlimited, but if you need extra time to recharge, just ask

At CentML, we celebrate our differences and value an inclusive environment for all. We welcome applications from people of all backgrounds and are committed to providing an equal-opportunity hiring process.