Unique opportunity to contribute to AI safety at Apollo Research, an organisation tackling the risks of deceptive alignment in AI advancements

Overview

Salary

£100,000 - £200,000

Location

London, UK - On-site

Expires

Expires at any time

Visa

Visa sponsor - UK

Role snapshot

Full job description

Organisation Summary

Apollo Research is at the forefront of managing the risks and opportunities associated with rapid AI evolution. Its main focus is deceptive alignment in AI, a phenomenon where models appear aligned but are in fact misaligned and may evade human supervision. The organisation's activities include building evaluations for scheming, researching the science of scheming, and developing mitigations.

Role Summary

Perform safety evaluations, study scheming science, and devise control and monitoring strategies for frontier models.
Collaborate with leading AI labs such as OpenAI, Anthropic, and Google DeepMind on pre-deployment evaluations and mitigations.
Build evaluations for scheming-related properties, thoroughly study scheming, automate the evaluation pipeline, and design AI control methodologies.

Role Requirements

Empirical research skillset related to AI control and evaluations.
Excellent scientific writing and communication skills.
Expertise in Large Language Model steering.
Strong software engineering skills, especially in Python. Familiarity with the Inspect evals framework is a plus.
No formal background or industry experience required; self-taught individuals are welcomed and encouraged to apply.

Application Process Details

Submit the application form along with your CV. A cover letter is optional. Links to relevant work samples are recommended.
The interview process includes a screening interview, a take-home test, three technical interviews, and a final interview with the CEO.
UK visa sponsorship available.

Apollo Research is an Equal Opportunity Employer and encourages equal opportunity irrespective of personal characteristics.

Apollo Research is dedicated to addressing the significant risks and opportunities presented by the rapid advancement in AI capabilities. We focus on the phenomenon of deceptive alignment, or scheming, where models may appear aligned but are actually misaligned and capable of evading human oversight. Our work includes building evaluations for scheming, studying the science of scheming, and developing mitigations.

The role: As a Research Scientist or Engineer at Apollo Research, you will be involved in safety evaluations, the science of scheming, and control/monitoring for frontier models. You will collaborate with leading AI labs like OpenAI, Anthropic, and Google DeepMind on pre-deployment evaluations and mitigations. Your tasks will include building evaluations for scheming-related properties, studying scheming in detail, automating the evals pipeline, and designing AI control protocols.

Job requirements: We welcome candidates without a formal background or industry experience, including self-taught individuals. Required skills include empirical research related to AI control and evaluations, excellent scientific writing and communication, comprehensive experience in Large Language Model steering, and strong software engineering skills, particularly in Python. Experience with the Inspect evals framework is a bonus.

Benefits: Apollo Research offers a salary range of 100k - 200k GBP, flexible work hours, unlimited vacation and sick leave, meals provided on workdays, paid work trips, and a yearly professional development budget of 1,000 USD.

How to apply: Interested candidates should complete the application form with their CV. A cover letter is optional. Links to relevant work samples are encouraged. The interview process includes a screening interview, a take-home test, three technical interviews, and a final interview with the CEO.

Work Visas: We can sponsor UK visas.

Equality Statement: Apollo Research is an Equal Opportunity Employer, committed to providing equal opportunities regardless of various personal attributes.

Research Scientist / Engineer

Apollo Research