ML Engineer L4, Consumer Inference

Los Gatos, California
Streaming – Personalization Engineering
Full-time
With more than 230 million members in over 190 countries, Netflix continues to shape the future of entertainment around the world. Machine Learning/Artificial Intelligence is powering innovation in all areas of the business, from helping members choose the right title for them through personalization, to better understanding our audience and our content slate, to optimizing our payment processing and other revenue-focused initiatives.

The Machine Learning Platform (MLP) provides the foundation for all of this innovation. It offers ML/AI practitioners across Netflix the means to achieve the highest possible impact with their work by making it easy to develop, deploy and improve their machine learning models. 

As part of our mission to support the infrastructure for machine learning across the company, we are hiring a Machine Learning Engineer to help bridge the gap between ML research and productization. In this role, you will:

-Develop customer-facing libraries and services to productize machine learning models for efficient and scalable inference.
-Develop and maintain online inference services that provide real-time predictions with low latency and high reliability.
-Optimize and deploy large language models (LLMs) for efficient, scalable inference, ensuring high performance and low latency in production environments.
-Maintain and improve a model registry to facilitate the discovery, versioning, and governance of machine learning models.
-Participate in and improve ML Platform incident management and support workflows.

What we offer
Opportunity for impact. You will work on cutting-edge ML infrastructure use cases and technologies. This role will develop into strategic ownership opportunities for defining MLP’s path from research to production, specifically focusing on building services and tools to accelerate research-to-production velocity for Netflix ML practitioners.

Responsibility. Netflix offers true transparency and autonomy. Our culture is unique and is key to how we innovate. From day one, your expertise and opinion will be respected and valued by the team, and you’ll be given autonomy in deciding the best direction for optimizing the research-to-production path for ML practitioners at Netflix.

Learning. You will be developing libraries, tools and services to ensure an efficient and reliable journey of productizing ML models. You will have the opportunity to work with stunning colleagues who value collaboration and have a wealth of experience you can tap into.

A work environment where you can grow your career. ML Platform offers a wide variety of projects that can help you find the areas you are passionate about.

Who will be successful in this role?

    • You are highly customer-driven / developer-driven and empathic. You strive to always focus on delivering customer / user value with an excellent customer service mentality.
    • You have a strong understanding of building scalable and efficient model serving solutions to support large-scale inference for generative models and large language models (LLMs). You create solutions that your stakeholders love and you drive development success from planning to implementation to delivery.
    • You can successfully execute changes within a team's systems, including developing, testing, deploying, and revising solutions.
    • You can communicate and collaborate effectively (e.g., in project meetings, team meetings, and code reviews) with immediate team peers and cross-functional project teams.
    • You are eager to go both deep and wide on ML-facing projects. When a project needs deep technical expertise in a domain area, you are able to get up to speed quickly. When projects require breadth of focus, you are eager to do what’s needed to deliver value, even if it means going outside of your comfort zone.


    • Strong programming skills, particularly in languages such as Python and Java, and familiarity with ML libraries and frameworks like TensorFlow and PyTorch.
    • Familiarity with tools and techniques for deploying machine learning models into production environments, with a particular emphasis on GPU inference optimization (e.g., Triton Inference Server, TensorRT), as well as containerization (e.g., Docker) and orchestration (e.g., Kubernetes).
    • Experience with data handling, preprocessing, and transformation techniques to prepare data for model inference.
    • Demonstrated industry-leading experience in large-scale build, release, CI/CD and observability techniques, with particular emphasis on multi-language environments including Scala, Java, and Python.
    • Adopt and promote best practices in operations, including observability, logging, reporting, and on-call processes to ensure engineering excellence.

Our compensation structure consists solely of an annual salary; we do not have bonuses. You choose each year how much of your compensation you want in salary versus stock options. To determine your personal top-of-market compensation, we rely on market indicators and consider your specific job family, background, skills, and experience. The range for this role is $100,000 - $464,000.

Netflix provides comprehensive benefits including Health Plans, Mental Health support, a 401(k) Retirement Plan with employer match, Stock Option Program, Disability Programs, Health Savings and Flexible Spending Accounts, Family-forming benefits, and Life and Serious Injury Benefits. We also offer paid leave of absence programs. Full-time hourly employees accrue 35 days of paid time off annually, to be used for vacation, holidays, and sick time. Full-time salaried employees are immediately entitled to flexible time off. See more detail about our Benefits here.

Netflix has a unique culture and environment. Learn more here.

We are an equal-opportunity employer and celebrate diversity, recognizing that diversity of thought and background builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.