Machine Learning Scientist

Paris
1 - Speech
Full-time
Who we are

Automatic Speech Recognition (ASR) research is, sadly, mostly done in big companies. Sure, the talent is there, but it's only in a less crowded space that you can really shine and see the impact you can have. That means an early-stage startup, when you’re still a group of friends with a crazy ambition to change the world.

What if you could change 400M lives by building the first AI system able to transcribe group conversations? What if you could turn a lifetime of frustration into a deep connection?

Ava aims to caption the world to make it fully accessible, 24/7, to deaf & hard-of-hearing people. The mobile-first app is the fastest & most advanced captioning system in the world, beating what tech giants have done. It cleverly uses speech and speaker identification technologies to make conversations between deaf & hard-of-hearing people and hearing people possible.

At Ava, the CEO is the only hearing person in a family of deaf people, and the CTO is deaf and non-speaking. Both were named to the Forbes 30 Under 30 in 2017. We use our ASR-based product every day to communicate. We’re working with companies such as GE, Airbus, and Salesforce, as well as universities, stores, and even churches, to fulfill our mission of making the world truly accessible.

They've written about us: TechCrunch, Wall Street Journal, Forbes, TF1, Le Figaro, Le Monde.


What do we need to get to the next level?

You: someone with significant research exposure to Machine Learning. The core of your mission will be to leverage state-of-the-art Machine Learning techniques to perform online, multi-sensor speaker diarization of group conversations. In other words, to crack the cocktail-party problem.

The signal is acquired via an ad-hoc array of microphones and processed to work out who says what, using a set of techniques: source localization, voice recognition, microphone calibration, speech recognition, source separation… all in real time.

You will need to design, train, and optimize Machine Learning models, and monitor their performance, in order to determine who speaks when and where. The inputs are streams of spatial localization and voice information.

Interested in learning more? Let’s chat.


Especially if:
- You have >3 years of research experience in Machine Learning (including Deep Learning).
- You have experience in audio processing, time series, or real-time systems (a plus).
- You aspire to be a pioneer in the field, and will do what is necessary to make things work in real-world situations.
- You're persistent, yet open-minded and collaborative: you think independently first, but you know that together, we're stronger.

What we offer:
- An opportunity to apply cutting-edge technologies to solve real-world problems, right now.
- An empowering and fast-paced working environment.
- A competitive salary and equity opportunity.
- A position based in our Paris office at Station F.

Interested? Let us know at alex@ava.me.