Machine learning engineer

San Francisco
Engineering / Full time
Join us in making Large Language Models (LLMs) viable for real-world, high-throughput tasks, starting with finance. While significantly more efficient than humans, LLMs still have cost and latency that are prohibitive for most applications. At Ntropy, our mission is to make LLMs viable at scale. We are developing a new kind of domain-specific wrapper for language models that reduces the number of queries to the base model by 3-5 orders of magnitude and the cost per datapoint by 2-3 orders of magnitude, without impacting accuracy.

We are a team of engineers, mathematicians, physicists, and artists united by a passion for innovation, disdain for over-engineering, and a commitment to disrupting the status quo. If you share our vision for a future where technology empowers every decision, we want you on our team.

Role Responsibilities

As an early ML engineer at Ntropy, you will play a pivotal role in shaping our products, our culture, and the direction of the company. Your work will directly contribute to our mission of making LLMs viable at a scale of 100M+ requests per day. You will develop and enhance our domain-specific caching infrastructure by creating algorithms for query decomposition, caching, and reassembly, and you will develop approaches to extend this technology to new domains beyond finance.

What You Need to be Successful

- At least 5 years of relevant experience, whether academic, professional, or from personal projects. We value diverse experiences that demonstrate your skills and passion for ML engineering.

- Demonstrated proficiency in Python and PyTorch, with a strong background in machine learning concepts and hands-on experience in training and deploying large models.

- Ability to adapt quickly to new technologies and challenges, with a proven track record of solving hard problems.

- Excellent communication skills, with the ability to work effectively in a dynamic team environment and lead projects to successful completion.

Bonus Points

- Experience with cloud infrastructure, multi-GPU environments, other ML frameworks, Kubeflow, or Rust.

- Contributions to open-source projects or participation in competitive programming events.

Joining Ntropy

By joining Ntropy, you become part of a team that values radical honesty, collaboration, and the freedom to challenge norms. We offer competitive salaries, equity, and the opportunity to work on cutting-edge technology alongside some of the top engineers in the field. We have offices in San Francisco, London, and Lisbon, but are also open to remote arrangements in exceptional cases.


Where is Ntropy located?
We are currently setting up our SF office and are hiring there for in-person roles only. We also have hubs in London, UK and Lisbon, Portugal.

Do you consider part-time work?
Not at the moment. Full-time roles only.

How are you funded?
We are backed by some of the top funds in the world and have raised double-digit millions of dollars so far. We can share more details over a call.

Do you already have a product and customers?
Yes. We have been in production since January 1, 2022, and a high double-digit number of customers use our APIs in production.

How big is your team?
We are around 20 people at the moment. Mostly engineering and product.

What is the interview process like?
1. Send us an overview of problems you have encountered before and how you approached solving them. Please include as much detail as possible: code, algorithms, derivations, proofs, etc. We will then do a video call to kick things off and go through it (45 mins).
2. We will give you a take-home project related to whatever we are currently working on (3-4 hours). Alternatively, if you have a relevant project that you worked on previously that demonstrates your skills as an engineer, you are welcome to use that instead.
3. We will then do a deep-dive through the project over a call and discuss the implementation, improvements and bottlenecks.
Above all, we respect your time and commitment and will keep you updated on where things stand throughout the process.

What are your hiring plans?
We expect to be 40-50 people by the end of 2024.

What is your current stack?
Backend - Python, Rust
Compute - AWS, GCP
ML - PyTorch, ONNX, Triton, LLMs

Work / life balance
We are a startup, which requires you to put in a lot more work and soul than a regular job. We believe, however, that nothing easy is worth doing. We will expect a lot from you, and you should expect a lot from us.

What is the compensation?
Approx. $130-200k salary • 0.1-0.3% equity