Machine Learning Intern (Summer 2025)

San Mateo, CA
Machine Learning / Internship / On-site
About zaimler

We are on a mission to bridge the gap between enterprise business knowledge and data, democratizing data discovery and curation to prepare organizations for the era of generative AI. Today's data tools are overly complex, poorly integrated, and siloed, forcing AI practitioners and data scientists alike to spend more time wrestling with tools, relying on tribal knowledge, and navigating data lakes than doing meaningful data science work. The current landscape of data tools and processes is heavily manual and has yet to catch up with the vast amount of data generated daily. With the advent of generative AI and multi-modality, this challenge has only grown more complex.

Come help build a platform that helps enterprises get their data AI-ready! A key part of our platform solution, beyond the data infrastructure, is innovating in the AI/ML space to (semi-)automatically create a semantic layer on top of their data. This includes innovating on LLMs, semantic augmentation, knowledge extraction, and knowledge graph building, as well as discovery (search, ranking, recommendation). In this role, you will get the opportunity to work with, and learn from, world-class AI and data infrastructure leaders.

About the job

We are looking for a few interns to join us either part-time throughout the year or full-time for the summer. The ideal candidate should have interest and some experience in one or more of the following areas: knowledge extraction, natural language understanding, unsupervised learning, information retrieval, and fine-tuning LLMs. In this internship, you'll play a critical part in developing and training the models, pipelines, and methodologies that power our semantic graph systems. You will gain experience working with large-scale, real-world data, making sense of it and giving it structure so that it is discoverable and understandable to end users. You will work with models and techniques that involve LLMs, machine learning, natural language processing, and semantic technologies.

What You Will Be Doing

    • Build and/or use best-in-class models to extract knowledge from heterogeneous sources
    • Develop methods to build and evaluate AI Data Graphs
    • Fine-tune LLMs with domain-specific context
    • Work with data infra engineers to develop the best platform for your needs

Prior Experience

    • Pursuing a Bachelor's/Master's in CS
    • Startup internship experience is highly preferred
    • Interest in working with and fine-tuning language models such as BERT, LLMs, and SLMs
    • Interest in working with NLP tools such as spaCy, OpenNLP, openNER, GLiNER, etc.
    • Interest in working with embedding-based retrieval
    • Strong background in the fundamentals of machine learning
    • Experience deploying and maintaining ML, NLP, or LLM models
    • Strong data manipulation skills using tools such as NumPy and pandas
    • Great communication skills and a team-player mindset

Nice to Have

    • Familiar with the LLM ecosystem and best practices for fine-tuning and prompt engineering
    • Familiar with working on ML and data in the cloud
    • Familiar with GPU optimization
    • Familiar with Docker and Kubernetes (k8s)
    • Familiar with Ray and vLLM