AXRP - the AI X-risk Research Podcast
A podcast by Daniel Filan
59 Episodes

35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization
Published: 24/8/2024
34 - AI Evaluations with Beth Barnes
Published: 28/7/2024
33 - RLHF Problems with Scott Emmons
Published: 12/6/2024
32 - Understanding Agency with Jan Kulveit
Published: 30/5/2024
31 - Singular Learning Theory with Daniel Murfet
Published: 7/5/2024
30 - AI Security with Jeffrey Ladish
Published: 30/4/2024
29 - Science of Deep Learning with Vikrant Varma
Published: 25/4/2024
28 - Suing Labs for AI Risk with Gabriel Weil
Published: 17/4/2024
27 - AI Control with Buck Shlegeris and Ryan Greenblatt
Published: 11/4/2024
26 - AI Governance with Elizabeth Seger
Published: 26/11/2023
25 - Cooperative AI with Caspar Oesterheld
Published: 3/10/2023
24 - Superalignment with Jan Leike
Published: 27/7/2023
23 - Mechanistic Anomaly Detection with Mark Xu
Published: 27/7/2023
Survey, store closing, Patreon
Published: 28/6/2023
22 - Shard Theory with Quintin Pope
Published: 15/6/2023
21 - Interpretability for Engineers with Stephen Casper
Published: 2/5/2023
20 - 'Reform' AI Alignment with Scott Aaronson
Published: 12/4/2023
Store, Patreon, Video
Published: 7/2/2023
19 - Mechanistic Interpretability with Neel Nanda
Published: 4/2/2023
New podcast - The Filan Cabinet
Published: 13/10/2022
AXRP (pronounced axe-urp) is the AI X-risk Research Podcast where I, Daniel Filan, have conversations with researchers about their papers. We discuss the paper, and hopefully get a sense of why it's been written and how it might reduce the risk of AI causing an existential catastrophe: that is, permanently and drastically curtailing humanity's future potential. You can visit the website and read transcripts at axrp.net.
