Show Notes
Carina Prunkl is a researcher at Inria. She joins the podcast to discuss how to assess the capabilities and risks of general-purpose AI. We examine why systems can solve hard coding and math problems yet still fail at simple tasks, why pre-deployment tests often miss real-world behavior, and how faster capability gains can increase misuse risks. The conversation also covers de-skilling, red teaming, layered safeguards, and warning signs that AIs might undermine oversight.
CHAPTERS:
(00:00) Episode Preview
(01:04) Introducing the report
(02:10) Jagged frontier capabilities
(05:29) Formal reasoning progress
(12:36) Risks and evaluation science
(19:00) Funding evaluation capacity
(24:03) Autonomy and de-skilling
(31:32) Authenticity and AI companions
(41:00) Defense in depth methods
(48:34) Loss of control risks
(53:16) Where to read report
SOCIAL LINKS:
Website: https://podcast.futureoflife.org
Twitter (FLI): https://x.com/FLI_org
Twitter (Gus): https://x.com/gusdocker
LinkedIn: https://www.linkedin.com/company/future-of-life-institute/
YouTube: https://www.youtube.com/channel/UC-rCCy3FQ-GItDimSR9lhzw/
Apple: https://geo.itunes.apple.com/us/podcast/id1170991978
Spotify: https://open.spotify.com/show/2Op1WO3gwVwCrYHg4eoGyP