Ajeya Cotra joins us to discuss how artificial intelligence could cause catastrophe. Follow the work of Ajeya and her colleagues: https://www.openphilanthropy.org

Timestamps:
00:00 Introduction
00:53 AI safety research in general
02:04 Realistic scenarios for AI catastrophes
06:51 A dangerous AI model developed in the near future
09:10 Assumptions behind dangerous AI development
14:45 Can AIs learn long-term planning?
18:09 Can AIs understand human psychology?
22:32 Training an AI model with naive safety features
24:06 Can AIs be deceptive?
31:07 What happens after deploying an unsafe AI system?
44:03 What can we do to prevent an AI catastrophe?
53:58 The next episode
Peter Wildeford discusses methods for forecasting AI progress and why he sees AI as neither a bubble nor a normal technology, covering economic effects, national security, cyber capabilities, robotics, export controls, and prediction markets.
Inria researcher Carina Prunkl discusses why AI evaluation struggles to keep pace with general-purpose systems, covering jagged capability profiles, behavior that benchmarks miss in real-world use, misuse risks, de-skilling, red teaming, and layered safeguards.
Li-Lian Ang from BlueDot Impact discusses how to build a workforce that can defend against AI-driven risks, including engineered pandemics, cyber attacks, job loss and disempowerment, and concentration of power, using a defense-in-depth framework suited to uncertain AI progress.