Data Scientist – Gen AI

India
Artificial Intelligence – Data Science
Permanent - Full Time
Hybrid
XA Group is a global technology company innovating in the automotive and insurance sectors. We're building next-gen GenAI-powered copilots and real-time systems that solve real-world problems at scale.

Role Overview:

    • We’re hiring a skilled and proactive Data Scientist – Gen AI (2–6 years of experience) to join our remote team in building AI copilots, chatbots, and intelligent agent systems using the latest LLM, document AI, and NLP technologies. The role places a strong emphasis on the design, development, and orchestration of agents and multi-agent systems, which are critical to our AI architecture and product strategy.
    • You’ll play a central role in developing POCs and scalable, real-time production applications powered by collaborative agents that drive intelligent automation.

Key Responsibilities:

    • Build advanced GenAI applications using LLMs, advanced RAG, and especially agent-based and multi-agent architectures (e.g., LangGraph, LangChain).
    • Design, orchestrate, and scale agent workflows that coordinate tasks across copilots, document processing, and real-time systems.
    • Evaluate, test, and integrate Hugging Face models (NLP, OCR, NER, etc.) into modular, extensible agent pipelines.
    • Work with OCR, NER, and document extraction and processing pipelines for intelligent document understanding.
    • Build intelligent copilots, chatbots, and backend logic using Python, FastAPI, async programming, WebSockets, and parallel processing.
    • Deploy and manage agent-based applications using Docker, Azure, and modern CI/CD pipelines.
    • Implement and manage vector databases, semantic search, and retrieval workflows for high-quality contextual responses.
    • Conduct prompt engineering, LLM/RAG/Agent evaluation, and continuous system improvement.
    • Collaborate with cross-functional teams – product, engineering, QA – to translate ideas into production-ready, agent-enabled tools.
    • Build POCs, internal tools, and full-fledged production applications based on multi-agent designs.
    • Stay updated with cutting-edge research papers and trends in LLMs, agentic workflows, and GenAI.

Required Skills & Experience:

    • 2–6 years of experience in Python, NLP, machine learning, and transformers.
    • Hands-on experience (required) with LangChain, LangGraph, agent orchestration, multi-agent system design, and retrieval-augmented generation (RAG).
    • Proven experience working with agents and multi-agent collaboration patterns in real-time applications.
    • Experience with OCR, NER, document extraction, and automated document workflows.
    • Proficiency with Hugging Face Transformers and model testing/integration.
    • Hands-on experience deploying scalable applications using Docker, Azure, and CI/CD pipelines.
    • Experience working with FastAPI, asyncio, and WebSockets for building real-time, responsive interfaces.
    • Familiarity with rapid prototyping tools like Streamlit or frontend stacks like React.
    • Strong problem-solving mindset with excellent debugging and optimization skills.
    • Comfortable working with databases for both structured and unstructured data.
    • Knowledge of LLM fine-tuning, semantic caching, and memory-enhanced agents.
    • Excellent communication and planning skills; comfortable working with cross-functional teams.