Software Development Engineer II - AI

India - Hyderabad

Product – Engineering

Full time

Hybrid

About Highspot

Highspot is a software product development company and a recognized global leader in the sales enablement category, with AI and GenAI technologies at the core of its Software-as-a-Service (SaaS) platform. Highspot is changing how millions of individuals work worldwide. Through its AI-powered platform, Highspot drives enterprise transformation and empowers sales teams through intelligent content management, training, contextual guidance, customer engagement, meeting intelligence, and actionable analytics. The Highspot platform delivers advanced features tailored to business needs in a modern design that sales and marketing executives appreciate, and it is the #1 rated sales enablement platform on G2 Crowd.

While headquartered in Seattle, Highspot has expanded its footprint across the United States, Canada, the UK, Germany, Australia, and now India, solidifying its presence in the Asia Pacific markets.

About The Role

You will safeguard the quality of our AI and GenAI features by evaluating model outputs, creating “golden” datasets, and guiding continuous improvements in collaboration with data scientists and engineers. You will also guide the team as it builds a robust methodology and framework for evaluating hundreds of AI agents.

Responsibilities

Evaluation Frameworks – Develop reusable, automated evaluation pipelines using frameworks such as Ragas; integrate LLM-as-a-judge methods for scalable assessments.
Golden Datasets – Build and maintain high-quality benchmark datasets in collaboration with subject matter experts.
AI Output Validation – Evaluate results across text, documents, audio, and video, using both automated metrics and human-in-the-loop judgment.
Metric Evaluation – Implement and track metrics such as precision, recall, F1 score, relevance scoring, and hallucination penalties.
RAG & Embeddings – Design and evaluate retrieval-augmented generation (RAG) pipelines, vector embedding similarity, and semantic search quality.
Error & Bias Analysis – Investigate recurring errors, biases, and inconsistencies in model outputs; propose solutions.
Framework & Tooling Development – Build tools that enable large-scale model evaluation across hundreds of AI agents.
Cross-Functional Collaboration – Partner with ML engineers, product managers, and QA peers to integrate evaluation frameworks into product pipelines.

Required Qualifications

2–4 years of experience as a Software Development Engineer in AI/ML systems.
Strong coding skills in Python (evaluation pipelines, data processing, metrics computation).
Hands-on experience with evaluation frameworks (Ragas or equivalent).
Knowledge of vector embeddings, similarity search, and RAG evaluation.
Familiarity with evaluation metrics (precision, recall, F1, relevance, hallucination detection).
Understanding of LLM-as-a-judge evaluation approaches.
Strong analytical and problem-solving skills; ability to combine human judgment with automated evaluations.
Bachelor’s or Master’s degree in Computer Science, Data Science, or related field.
Strong English written and verbal communication skills.
Good to Have: Experience in data quality, annotation workflows, dataset curation, or golden set preparation.
Sales domain knowledge is a strong plus.

Equal Opportunity Statement

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of age, ancestry, citizenship, color, ethnicity, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or invisible disability status, political affiliation, veteran status, race, religion, or sexual orientation.

Did you read the requirements as a checklist and not tick every box? Don't rule yourself out! If this role resonates with you, hit the ‘apply’ button.
