Ajeya Cotra joins us to discuss how artificial intelligence could cause catastrophe.

Follow the work of Ajeya and her colleagues: https://www.openphilanthropy.org

Timestamps:
00:00 Introduction
00:53 AI safety research in general
02:04 Realistic scenarios for AI catastrophes
06:51 A dangerous AI model developed in the near future
09:10 Assumptions behind dangerous AI development
14:45 Can AIs learn long-term planning?
18:09 Can AIs understand human psychology?
22:32 Training an AI model with naive safety features
24:06 Can AIs be deceptive?
31:07 What happens after deploying an unsafe AI system?
44:03 What can we do to prevent an AI catastrophe?
53:58 The next episode
Maya Ackerman discusses human and machine creativity, exploring its definition, how AI alignment impacts it, and the role of hallucination. The conversation also covers strategies for human-AI collaboration.
Adam Gleave, CEO of FAR.AI, discusses post-AGI scenarios, risks of gradual disempowerment, defense-in-depth safety strategies, scalable oversight for AI deception, and the challenges of interpretability, as well as FAR.AI's integrated research and policy work.
Beatrice Erkers discusses the AI pathways project, focusing on approaches to maintain human oversight and control over AI, including tool AI and decentralized development, and examines trade-offs and strategies for safer AI futures.