Senior Applied Machine Learning Scientist

European Union

Science / Full-time / Hybrid

Sanas.ai is pioneering the future of human communication. Founded by a team of Stanford researchers and entrepreneurs with deep industry experience, Sanas has developed the world’s first real-time speech transformation platform capable of accent translation, noise elimination, speech enhancement, and cross-language communication.

Sanas makes conversations clearer, more inclusive, and more effective, removing barriers that prevent people from being understood, regardless of accent, background noise, or native language.

Since going to market in 2023, Sanas has scaled at an extraordinary pace, growing from $0 to $32M ARR in under two years, with a projected >$50M ARR by the end of 2025. The company recently recorded its first $10M quarter and is on track to achieve $120M in ARR next year. With a SaaS-based model, Sanas serves some of the world’s largest enterprises, including Comcast, UPS, and UHG. Today, Sanas technology is deployed at more than 17 Fortune 500 companies, and growth continues to accelerate.

The company’s valuation has a clear trajectory toward a multi-billion-dollar market capitalization as it continues to expand into new verticals and product categories. With a TAM that spans all human-in-the-loop communications and beyond, Sanas has the potential to impact every industry and every global interaction.

Sanas is revolutionizing the way we communicate with the world’s first real-time algorithm designed to modulate accents, eliminate background noise, and enhance speech clarity. Pioneered by seasoned startup founders with a proven track record of creating and steering multiple unicorn companies, our groundbreaking GDP-shifting technology sets a gold standard.

Sanas is a 200-strong team, established in 2020. In this short span, we’ve secured over $100 million in funding. Our innovation has been supported by the industry’s leading investors, including Insight Partners, Google Ventures, Quadrille Capital, General Catalyst, and Quiet Capital, among others. Our reputation is further solidified by collaborations with numerous Fortune 100 companies. With Sanas, you’re not just adopting a product; you’re investing in the future of communication.

We are seeking a Senior Applied Machine Learning Scientist with deep expertise in foundational modeling and large-scale speech AI systems. In this role, you will lead the development of advanced models that push the boundaries of speech processing, including self-supervised learning, large-scale pretraining, and multimodal architectures. Your focus will be on scaling models efficiently while ensuring real-time performance, robustness, and adaptability to diverse environments.

This position requires a strong foundation in ML techniques, an innovative mindset, and a deep commitment to continuous improvement of deployed systems.

Key Responsibilities:

Architect, train, and optimize large-scale speech AI models, including speech-to-speech, speech restoration, and speech translation.
Leverage self-supervised learning, contrastive learning, and transformer-based architectures (e.g., wav2vec, Whisper, GPT-style models) to improve model accuracy and adaptability.
Develop efficient model distillation and quantization strategies to deploy large models with low-latency inference.
Innovate on cross-lingual and multilingual speech processing using large-scale pretraining and fine-tuning.
Curate and scale massive, diverse, multilingual, and multimodal datasets for robust model training.
Apply active learning, domain adaptation, and synthetic data generation to overcome data limitations.
Lead efforts in data quality assessment, augmentation, and curation for large-scale training pipelines.
Develop distributed training strategies for large-scale models using cloud-based and on-prem GPU clusters.
Design and implement scalable model evaluation frameworks, tracking WER, MOS, and latency across diverse scenarios.
Optimize real-time inference pipelines to ensure high-throughput, low-latency speech processing.
Stay ahead of advancements in foundational models, generative AI, and large-scale speech modeling.
Collaborate with academia, open-source communities, and research partners to drive innovation.
Work closely with MLOps, Data Engineering, and Product teams to deploy scalable AI systems.
Ensure seamless integration of foundational models with edge devices, real-time applications, and cloud platforms.
Translate cutting-edge research into production-grade models that power real-world communication.

Must-have qualifications:

Bachelor’s, Master’s, or Ph.D. in Computer Science, Electrical Engineering, or a related field with a focus on Machine Learning, Deep Learning, or Speech Processing.
5+ years of hands-on industry experience in developing and implementing the following systems: speech-to-text (ASR); text-to-speech (TTS); voice conversion and speech enhancement; speech translation and multimodal learning.
Strong proficiency in transformer-based architectures (e.g., wav2vec 2.0, Whisper, GPT, BERT).
Expertise in deep learning frameworks such as PyTorch and TensorFlow, and in large-scale training techniques.
Experience with distributed training and optimization across multi-GPU clusters.
Strong understanding of self-supervised learning, contrastive learning, and generative modeling for speech AI.
Hands-on experience with cloud-based AI platforms (AWS, GCP, Azure) and model deployment.

Preferred experience:

Experience in developing multimodal AI models integrating speech, text, and vision.
Track record of publishing in top-tier AI/ML conferences.
Experience optimizing large models for real-time inference on edge devices.
Proficiency with MLOps best practices for deploying and monitoring models in production.
Familiarity with open-source ASR/TTS toolkits.

Joining us means contributing to the world’s first real-time speech understanding platform, which is revolutionizing contact centers and enterprises alike.

Our technology empowers agents, transforms customer experiences, and drives measurable growth. But this is just the beginning. You'll be part of a team exploring the vast potential of an increasingly sonic future.
