LLM Ops Engineer

Pune, India
Pattern Corporate – Engineering / Full-time / Hybrid
Monitor, evaluate, and optimize AI/LLM workflows in production environments. Ensure reliable, efficient, and high-quality AI system performance by building out an LLM Ops platform that is self-serve for the engineering and data science departments.

Key Responsibilities:

    • Collaborate with data scientists and software engineers to integrate an LLM Ops platform (Opik by CometML) for existing AI workflows
    • Identify valuable performance metrics (accuracy, quality, etc.) for AI workflows and create ongoing sampling evaluation processes using the LLM Ops platform that alert when metrics drop below thresholds
    • Collaborate across teams to create datasets and benchmarks for new AI workflows
    • Run experiments on datasets and optimize performance via model changes and prompt adjustments
    • Debug and troubleshoot AI workflow issues
    • Optimize inference costs and latency while maintaining accuracy and quality
    • Develop automations for LLM Ops platform integration to empower data scientists and software engineers to self-serve integration with the AI workflows they build

Requirements:

    • Strong Python programming skills
    • Experience with generative AI models and tools (OpenAI, Anthropic, Bedrock, etc.)
    • Knowledge of fundamental statistical concepts and tools in data science, such as heuristic and non-heuristic measurements in NLP (BLEU, WER, sentiment analysis, LLM-as-judge, etc.), standard deviation, and sampling rate, plus a high-level understanding of how modern AI models work (knowledge cutoffs, context windows, temperature, etc.)
    • Familiarity with AWS
    • Understanding of prompt engineering concepts
    • People skills: you will be expected to collaborate frequently with other teams to help perfect their AI workflows
    • Experience level: 3-5 years of experience in LLM/AI Ops, MLOps, Data Science, or MLE