Software Engineer - LLM Dataset

San Francisco / ML / Full-time / Remote
Join us to build and safely deploy aligned, superhuman AI. For decades, technology was "just a tool" - could it be a colleague?

We are building an AI pair programmer that feels like a full colleague inside your computer - capable, conversational, and reliable across domains. If this isn't AGI, then what is? Join us if you want to build and safely deploy aligned AGI in products that matter.

As a Software Engineer on our core ML team, you will build internet-scale multimodal datasets, working with petabyte-scale internet crawls, audio and video data, and small high-quality datasets.

Magic's culture

    • Integrity. Words and actions should be aligned.
    • Hands-on. Most of us have previously led engineering teams. At Magic, there are no managers. We all spend the vast majority of our time on engineering. If you want to solve hard problems, Magic is the right place for you.
    • Teamwork. We move as one team, not N individuals.
    • Focus. Ethically deploy AGI. Everything else is noise.
    • Quality. We have high standards for ourselves and our products. Magic should feel like magic.

Responsibilities

    • Write scrapers, crawlers, and parsers for websites and other text and multimodal data sources.
    • Clean large datasets, ensuring data quality and integrity.
    • Identify new data sources and acquire relevant data for dataset creation.
    • Combine various tools such as OCR engines with general data pipelines and custom parsers and scrapers.

What we're looking for in a new teammate

    • Exceedingly bright with strong analytical thinking abilities
    • Exceptionally fast and productive engineer
    • Strong proficiency in Python
    • Knowledge of Rust and familiarity with C would be highly desirable
    • Experience working with ML models on various platforms such as Hugging Face, Transformers, spaCy, or open-source LLMs is preferred
    • Open-source contributions to distributed systems tools like ray.io or to other ML-related systems would be amazing!

Benefits and perks

    • Benchmark-based compensation in the 75th or 90th percentile, including base salary, generous equity, and benefits
    • 401(k) with 6% match
    • Flexible working hours
    • In-person (SF or Vienna) or remote
    • A small, fast-paced, highly focused team

FAQ

What's your motivation?
Automation has led humanity from subsistence farming to becoming a globally connected society. AGI is the ultimate chapter of the story of human tool-building, presenting the potential to decouple productivity and ingenuity from human labor. What if the last 50 years of technological progress happened in 2 days? We want to make this a possibility.

Funding?
We've recently raised $28M.

How do we balance deploying the technology today with ambitions for AGI?
We think deploying AI within the right interfaces is just as important as the technology itself. Building an AI pair programmer helps us do both at the same time. We aim to launch gradually improving AI assistants while pursuing work on what will ultimately become AGI. 

Do you train your own models?
Yes.

Do you care about the product?
It's funny that this is a question, but many AI companies neglect UX and focus only on their model. Yes, we care.

Can I work from anywhere?
We welcome applications from anyone around the world. We'll look at visa requirements case by case.

I don't meet all the criteria. Should I still apply?
If you feel you have something to contribute to the mission and you're a high-energy person, absolutely. We make exceptions for exceptional people. In all hires, we are looking for either 1) difference makers on world-class teams or 2) individuals who would become difference makers very quickly if placed on such a team tomorrow.