Anders Sandberg from the Future of Humanity Institute joins the podcast to discuss ChatGPT, large language models, and what he's learned about the risks and benefits of AI.

Timestamps:
00:00 Introduction
00:40 ChatGPT
06:33 Will AI continue to surprise us?
16:22 How do language models fail?
24:23 Language models trained on their own output
27:29 Can language models write college-level essays?
35:03 Do language models understand anything?
39:59 How will AI models improve in the future?
43:26 AI safety in light of recent AI progress
51:28 AIs should be uncertain about values
Maya Ackerman discusses human and machine creativity, exploring its definition, how AI alignment impacts it, and the role of hallucination. The conversation also covers strategies for human-AI collaboration.
Adam Gleave, CEO of FAR.AI, discusses post-AGI scenarios, risks of gradual disempowerment, defense-in-depth safety strategies, scalable oversight for AI deception, and the challenges of interpretability, as well as FAR.AI's integrated research and policy work.
Beatrice Erkers discusses the AI pathways project, focusing on approaches to maintaining human oversight and control over AI, including tool AI and decentralized development, and examines the trade-offs and strategies involved in steering toward safer AI futures.