Founding Machine Learning Applied Scientist - Speech (Stealth Startup)

San Francisco, California
Stealth / Full Time / Hybrid
About Rime 

Rime is a speech technology startup based in San Francisco, CA, founded by linguists and engineers. We work on deep learning speech and language technologies that power incredibly engaging customer experiences. The next wave of ML technologies is here, and Rime is harnessing these advancements to deliver something truly unique: real-time speech synthesis that is natural, lifelike, and infinitely customizable. Rime turns previously boring conversational AI applications into the most compelling experiences possible. Join us and pioneer these changes.

Responsibilities

    • Train speech synthesis networks and evaluate their quality and performance.
    • Evaluate and diagnose the training procedures of both autoregressive and non-autoregressive speech synthesis models, and make architectural decisions driven by product requirements.
    • Own both research-driven experiment pipelines and in-production model quality and capabilities.
    • Take part in the ingestion, preprocessing, and cleaning of speech data, ensuring its quality, relevance, and adequacy for training and testing purposes.
    • Continuously research and experiment with new approaches, methodologies, and algorithms in speech data processing and analysis, staying up to date with the latest advancements in the field.

Requirements:

    • Bachelor's or higher degree in Computer Science, Engineering, Data Science, or a related field (or equivalent practical experience).
    • 5+ years of experience as a Machine Learning Engineer, Data Scientist, or in a similar role with a focus on implementing and training deep neural networks.
    • Strong programming skills, with proficiency in machine learning frameworks (e.g., TensorFlow, PyTorch, scikit-learn).
    • Experience working with speech data, including audio preprocessing, feature extraction, and signal processing techniques.
    • Passion for startups and the ability to thrive in a fast-paced, dynamic environment.

Preferred Qualifications:

    • Hands-on experience with speech synthesis technologies, including training SOTA architectures (both two-stage and end-to-end).
    • Familiarity with large-scale distributed computing frameworks and cloud-based ML platforms (e.g., AWS SageMaker, Google Cloud AI Platform).
    • Knowledge of speech data annotation tools and techniques for training machine learning models.
    • Knowledge of general natural language processing techniques and tools.

Additional details:

    • Hybrid: work from home or from our office in San Francisco (SOMA neighborhood)
    • Full medical, dental, and vision insurance
    • Competitive salary
    • Generous equity package
    • Monthly health & wellness stipend
    • Access to a recording studio and audiophile-grade equipment