LLM Ops Engineer

Pune, India

Pattern Corporate – Engineering / Full-time / Hybrid

Monitor, evaluate, and optimize AI/LLM workflows in production environments. Ensure reliable, efficient, and high-quality AI system performance by building out an LLM Ops platform that is self-serve for the engineering and data science departments.

Key Responsibilities:-

Collaborate with data scientists and software engineers to integrate an LLM Ops platform (Opik by CometML) for existing AI workflows
Identify valuable performance metrics (accuracy, quality, etc.) for AI workflows and create ongoing sampling-based evaluation processes on the LLM Ops platform that alert when metrics drop below thresholds
Collaborate across teams to create datasets and benchmarks for new AI workflows
Run experiments on datasets and optimize performance via model changes and prompt adjustments
Debug and troubleshoot AI workflow issues
Optimize inference costs and latency while maintaining accuracy and quality
Develop automations for LLM Ops platform integration to empower data scientists and software engineers to self-serve integration with the AI workflows they build

Requirements:-

Strong Python programming skills
Experience with generative AI models and tools (OpenAI, Anthropic, Bedrock, etc.)
Knowledge of fundamental statistical concepts and tools in data science, such as heuristic and non-heuristic measurements in NLP (BLEU, WER, sentiment analysis, LLM-as-judge, etc.), standard deviation, and sampling rate, plus a high-level understanding of how modern AI models work (knowledge cutoffs, context windows, temperature, etc.)
Familiarity with AWS
Understanding of prompt engineering concepts
People skills: you will be expected to collaborate frequently with other teams to help perfect their AI workflows
Experience level: 3-5 years of experience in LLM/AI Ops, MLOps, Data Science, or MLE
