AI Physics and Safety Lab

Evaluations and Benchmarks for Physics Agents

This research theme develops new evaluation methods and benchmarks for physics agents. The goal is to use physics as a ground truth: physical laws and known solutions give objectively checkable answers, which provides a solid foundation for assessing the performance and capabilities of AI agents. New interpretability methods are also being explored.
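As a rough illustration of what "physics as a ground truth" can mean in practice, the sketch below scores a predicted trajectory against the exact solution of a simple harmonic oscillator. It is not the lab's benchmark: the function names, the tolerance, and the toy integrator standing in for an agent are all assumptions made for this example.

```python
import math

OMEGA = 2.0  # angular frequency of the oscillator (illustrative value)


def analytic_position(t, x0=1.0, v0=0.0, omega=OMEGA):
    """Exact solution of x'' = -omega^2 x, used here as the ground truth."""
    return x0 * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t)


def score_against_ground_truth(times, predicted, tol=1e-2):
    """Return (max_abs_error, passed) for a predicted trajectory."""
    errors = [abs(p - analytic_position(t)) for t, p in zip(times, predicted)]
    max_err = max(errors)
    return max_err, max_err <= tol


def toy_agent_trajectory(n_samples=10, steps_per_sample=1000, dt=1e-4):
    """Hypothetical stand-in for an agent: semi-implicit Euler integration."""
    x, v = 1.0, 0.0
    times, positions = [0.0], [x]
    for i in range(1, n_samples + 1):
        for _ in range(steps_per_sample):
            v -= OMEGA ** 2 * x * dt  # update velocity from the restoring force
            x += v * dt               # then update position with the new velocity
        times.append(i * steps_per_sample * dt)
        positions.append(x)
    return times, positions


if __name__ == "__main__":
    times, predicted = toy_agent_trajectory()
    max_err, passed = score_against_ground_truth(times, predicted)
    print(f"max error = {max_err:.2e}, passed = {passed}")
```

The same pattern extends to other checkable quantities, such as conserved energy or momentum, wherever an exact or high-precision reference is available.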

Autonomous Computational Physics Agents

This research theme develops autonomous computational physics agents, with a special emphasis on physics simulations. These AI agents act within simulation sandboxes, which allows their capabilities to be explored and characterized thoroughly.

Safety of Science-Capable LM Agents

This research theme explores methods to ensure the alignment and safety of actions taken by AI physics agents. In simulated environments, many safety and alignment issues can be explored without real-world consequences.

Scalable Oversight for Super-Alignment

Funded by OpenAI, this research theme aims to develop scalable oversight methods for super-alignment, using physics as a ground truth. The objective of super-alignment is to ensure that AI systems remain aligned with human values and intentions, even in the limit where they become more capable than humans.