Evals Research Scientist / Engineer

London

Evals Team /

Full-time /

On-site

Applications deadline: We are reviewing applications on a rolling basis. It might take a few weeks until you hear from us.

ABOUT THE OPPORTUNITY

We’re looking for Research Scientists and Research Engineers who are excited to develop or contribute to safety evaluations to audit frontier AI models.

YOU WILL HAVE THE OPPORTUNITY TO

Build evaluations for scheming-related properties (such as deceptive reasoning, sabotage, and deception tendencies). See our conceptual work on safety cases for scheming, e.g. our work on evaluation-based safety cases for scheming
Conduct evaluations on frontier models and publish the results either to the general public or a target audience such as AI developers or governments, e.g. our work in OpenAI’s o1-preview system card and Anthropics’s Opus 4 and Sonnet 4 system card.
Create model organisms and demonstrations of behavior related to deceptive alignment, e.g. exploring the influence of goal-directedness on scheming.
Design and evaluate AI control protocols. We have not started these efforts yet, but intend to work on them starting Q3 2025.

KEY REQUIREMENTS

We don’t require a formal background or industry experience and welcome self-taught candidates.
Broad experience in empirical research methodologies and a scientific mindset: You have designed and executed experiments. You can identify alternative explanations for findings and test alternative hypotheses to avoid overinterpreting results.This experience can come from academia, industry, or independent research.
Track record of excellent scientific writing and communication: You can understand and communicate complex technical concepts to our target audience and synthesize scientific results into coherent narratives.
Comprehensive experience in Large Language Model (LLM) steering and the supporting Data Science and Data Engineering skills. LLM steering can take many different forms, such as:
Prompting: Elicited specific behavior through clever word choice
LM Agents and Scaffolding: Chained inputs and outputs from various models in a structured way, making them more goal-directed and agentic
Fluent LLM usage: Integrated LLMs heavily into workflows
Supervised fine-tuning: Created datasets and fine-tuned models to improve a specific capability or to study aspects of learning/generalization
Reinforcement Learning (Human Feedback/AI Feedback): Used other models, programmatic reward functions, or custom reward models as a source of feedback for fine-tuning an existing LLM.
Software engineering skills: Developed APIs (ideally around LLMs or eval tasks), system design and front-end development. Generally, you must be able to code in Python.
(Bonus) Experiences related to AI control and cyber security
(Bonus) We have recently switched to Inspect as our primary evals framework and we value those with experience using it.
Depending on your preferred role and how these characteristics weigh up, we can offer either a RS or RE role.

We want to emphasize that people who feel they don’t fulfill all of these characteristics but think they would be a good fit for the position nonetheless are strongly encouraged to apply. We believe that excellent candidates can come from a variety of backgrounds and are excited to give you opportunities to shine.

LOGISTICS

Start Date: Target of 2-3 months after the first interview.
Time Allocation: Full-time
Location: The office is in London, and the building is shared with the London Initiative for Safe AI (LISA) offices. This is an in-person role. In rare situations, we may consider partially remote arrangements on a case-by-case basis.
Work Visas: We can sponsor UK visas

BENEFITS

Salary: a competitive UK-based salary
Flexible work hours and schedule
Unlimited vacation
Unlimited sick leave
Lunch, dinner, and snacks are provided for all employees on workdays
Paid work trips, including staff retreats, business trips, and relevant conferences
A yearly $1,000 (USD) professional development budget

ABOUT APOLLO RESEARCH

The capabilities of current AI systems are evolving at a rapid pace. While these advancements offer tremendous opportunities, they also present significant risks, such as the potential for deliberate misuse or the deployment of sophisticated yet misaligned models. At Apollo Research, our primary concern lies with deceptive alignment, a phenomenon where a model appears to be aligned but is, in fact, misaligned and capable of evading human oversight.

Our approach focuses on behavioral model evaluations, which we then use to audit real-world models. We also combine black-box approaches with applied interpretability. In our evaluations, we focus on LM agents, i.e. LLMs with agentic scaffolding similar to AIDE or SWE agent. We also study model organisms in controlled environments (see our security policies), e.g. to better understand capabilities related to scheming. At Apollo, we aim for a culture that emphasizes truth-seeking, being goal-oriented, giving and receiving constructive feedback, and being friendly and helpful. If you’re interested in more details about what it’s like working at Apollo, you can find more information here.

ABOUT THE TEAM

The current evals team consists of Mikita Balesni,Jérémy Scheurer,Alex Meinke,Rusheb Shah,Bronson Schoen, Andrei Matveiakin, Felix Hofstätter, Axel Højmark, Nix Goldowsky-Dill, Teun van der Weij and Alex Lloyd. Marius Hobbhahn manages and advises the evals team, though team members lead individual projects. You will mostly work with the evals team, but you will likely sometimes interact with the governance team to translate technical knowledge into concrete recommendations. You can find our full team here.

Equality Statement: Apollo Research is an Equal Opportunity Employer. We value diversity and are committed to providing equal opportunities to all, regardless of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, or sexual orientation.

How to apply: Please complete the application form with your CV. The provision of a cover letter is optional but not necessary. Please also feel free to share links to relevant work samples.

About the interview process: Our multi-stage process includes a screening interview, a take-home test (approx. 2 hours), 3 technical interviews, and a final interview with Marius (CEO). The technical interviews will be closely related to tasks the candidate would do on the job. There are no leetcode-style general coding interviews. If you want to prepare for the interviews, we suggest working on hands-on LLM evals projects (e.g. as suggested in our starter guide), such as building LM agent evaluations in Inspect.

Applications deadline: We are reviewing applications on a rolling basis. It might take a few weeks until you hear from us.

* This role is supported by AI Futures Grants, a UK Government program designed to help the next generation of AI leaders meet the costs of relocating to the UK. AI Futures Grants provide financial support to reimburse relocation costs such as work visa fees, immigration health surcharge and travel/subsistence expenses. Successful candidates for this role may be able to get up to £10,000 to meet associated relocation costs, subject to terms and conditions.

Apply for this job