Anders Sandberg from the Future of Humanity Institute joins the podcast to discuss ChatGPT, large language models, and what he's learned about the risks and benefits of AI.

Timestamps:
00:00 Introduction
00:40 ChatGPT
06:33 Will AI continue to surprise us?
16:22 How do language models fail?
24:23 Language models trained on their own output
27:29 Can language models write college-level essays?
35:03 Do language models understand anything?
39:59 How will AI models improve in the future?
43:26 AI safety in light of recent AI progress
51:28 AIs should be uncertain about values
Peter Wildeford discusses methods for forecasting AI progress and why he sees AI as neither a bubble nor a normal technology, covering economic effects, national security, cyber capabilities, robotics, export controls, and prediction markets.
Inria researcher Carina Prunkl discusses why AI evaluation struggles to keep pace with general-purpose systems, covering jagged capabilities, evaluations that miss real-world behavior, misuse risks, de-skilling, red teaming, and layered safeguards.
Li-Lian Ang from Blue Dot Impact discusses how to build a workforce to defend against AI-driven risks, including engineered pandemics, cyber attacks, job loss and disempowerment, and concentrated power, using a defense-in-depth framework suited to uncertain AI progress.