Episodes

GPT 5.5 Arrives, DeepSeek V4 Drops, and the Compute War Intensifies

Apr 24 2026

GPT 5.5 full analysis, plus DeepSeek V4 paper highlights, comparisons with Mythos, a vibe-coded game w/ GPT Image 2, and 50 data-points you wouldn’t get from just reading the headlines.

Chapters:
01:11 - GPT 5.5 Comparison
06:04 - Mythos Marketing
11:50 - Recursive Self-Improvement?
14:11 - DeepSeek V4
18:03 - VibeCode Experiment Extravaganza
21:44 - The Scarce Compute Era

https://80000hours.org/aiexplained

OpenAI Benchmarks: https://openai.com/index/introducing-gpt-5-5/

5.5 System Card: https://deploymentsafety.openai.com/gpt-5-5/gpt-5-5.pdf

Direct Comparison: https://pbs.twimg.com/media/HGnNm5GWEAAJ1Ob?format=jpg&name=4096x4096

DeepSeek Paper: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro

SWE Bench Pro - benchmark of choice? https://x.com/ChowdhuryNeil/status/2047416077622395025

AA Omniscience: https://artificialanalysis.ai/evaluations/omniscience

Vending Bench: https://x.com/andonlabs/status/2047377260412649967

Opus 4.7 System Card: https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf

Sam Altman Drunk Phase: https://x.com/sama/with_replies

Noam Brown: https://x.com/polynoamial/status/2047387675762802998

DeepSeek Compute Crunch: https://www.bloomberg.com/news/articles/2026-04-24/deepseek-unveils-newest-flagship-a-year-after-ai-breakthrough?srnd=phx-ai

Spreadsheet Bench: https://x.com/nicochristie/status/2047476237464211721

Pattern Recognition: https://arcprize.org/leaderboard

Leader Interviews:
Core Memory: https://www.youtube.com/watch?v=NCKQL0op30E
Knowledge Podcast: https://www.youtube.com/watch?v=6JoUcQ1qmAc
Big Tech Round 1: https://www.youtube.com/watch?v=J6vYvk7R190&t=1116s
Big Tech Round 2: https://www.youtube.com/watch?v=YnoQ8RJbALw&t=8s

Claude Code Limitations: https://x.com/TheAmolAvasare/status/2046724659039932830

ChatGPT 5.4 for Clinicians: https://openai.com/index/making-chatgpt-better-for-clinicians/

Image Arena: https://x.com/arena/status/2046670703311884548

VibeCode Bench: https://www.vals.ai/benchmarks/vibe-code

5.5-made Game +Seedance 2.0: https://rosemere-quest.pages.dev/

25 mins

Claude Opus 4.7 - A New Frontier, in Performance … and Drama

Apr 17 2026

Claude Opus 4.7 just dropped, but behind every headline lies a deeper story. From a bonanza of benchmarks, to seeing the fruits of one of the biggest mega-projects in US history, to sneaky Mythos disclaimers, to Anthropic admitting compute constraints and forcing lower capability for Opus 4.7. Where the new model falls behind Gemini but ahead of GPT 5.4, plus why some users are furious at Anthropic. Ending with a 9-year animus that still affects AI today…

https://assemblyai.com/aiexplained

Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:58 - Benchmarks
05:21 - Market Share + Compute Problems
08:12 - Mythos Exclusives
12:56 - User Frustration + Claude Code Updates
14:03 - Brockman Amodei Rivalry
17:40 - OpenAI vs Anthropic Approach to Code

Claude 4.7 Opus Release Notes: https://www.anthropic.com/news/claude-opus-4-7
vs Mythos: https://pbs.twimg.com/media/HGCGugrXUAAKcHp?format=jpg&name=medium

232-page System Card: https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf

ARC-AGI 2: https://x.com/arcprize/status/2044834615417053305/photo/1

ParseBench: https://x.com/jerryjliu0/status/2044902620746363016/photo/1

GDPVal: https://artificialanalysis.ai/evaluations/gdpval-aa

Vidoc Security Replication: https://blog.vidocsecurity.com/blog/we-reproduced-anthropics-mythos-findings-with-public-models

Boris Cherny Settings: https://x.com/Hesamation/status/2043016923961577516/photo/2

User Frustration: https://x.com/RileyRalmuto/status/2044836116189069660

VibeCode Bench: https://x.com/ValsAI/status/2044791415524471099/photo/1

Verge Memo: https://www.theverge.com/ai-artificial-intelligence/911118/openai-memo-cro-ai-competition-anthropic

5.4 Cyber: https://openai.com/index/scaling-trusted-access-for-cyber-defense/

Data Centers in Absolute $: https://x.com/finmoorhouse/status/2044933442236776794/photo/1

…in % of GDP: https://pbs.twimg.com/media/HGEN8FGWQAAN7Np?format=jpg&name=4096x4096

WSJ Exclusive: https://www.wsj.com/tech/ai/the-decadelong-feud-shaping-the-future-of-ai-7075acde

Brockman Interview: https://www.youtube.com/watch?v=J6vYvk7R190

$1T Valuation: https://x.com/StefanFSchubert/status/2045039686997967082

Emotions: https://www.patreon.com/c/aiexplained/posts

https://lmcouncil.ai/benchmarks

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

20 mins

Claude Mythos: Highlights from 244-page Release

Apr 8 2026

The model, the mythos, the legend. We have a new best AI model, but not all of us. How good is it, and what do its new offensive capabilities mean? Why does its 244-page report card remind me of Her, and why did the creator of Claude Code call it ‘terrifying’? 30+ highlights sourced by reading the paper in full, old-school, no AI summary.

https://80000hours.org/aiexplained

Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:56 - Internal Release + Availability
02:37 - General Capabilities
05:12 - Self-improvement?
06:15 - ‘Terrifying’ Landscape
11:07 - Safety Decision
13:22 - Coding
14:49 - Alignment, Awareness
19:52 - GUI for Agents/Claws + Hallucinations
21:34 - …Emotions?
25:29 - Her connection

244-page System Card: https://www-cdn.anthropic.com/8b8380204f74670be75e81c820ca8dda846ab289.pdf

Project Glasswing: https://www.anthropic.com/glasswing
Zero-Day Details: https://red.anthropic.com/2026/mythos-preview/

Mythos ‘terrifying’: https://x.com/bcherny/status/2041605852382351666

New Yorker Altman/Amodei: https://archive.fo/20260406100412/https://www.newyorker.com/magazine/2026/04/13/sam-altman-may-control-our-future-can-he-be-trusted

Alignment Risk Update: https://www-cdn.anthropic.com/79c2d46d997783b9d2fb3241de43218158e5f25c.pdf

In a Park: https://x.com/sleepinyourhat/status/2041584808514744742

“Uhm” - https://x.com/thsottiaux/status/2041749947385815109

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/

28 mins

OpenAI Spud, a Claude Model set to ‘stir governments’, Beast Mode ARC-AGI-3

Mar 26 2026

First look at exclusive reports about OpenAI's new Spud model, and the model Anthropic think will stir governments to urgency, all in the context of the newly-launched ARC-AGI-3. What do the extreme difficulty of that benchmark, and its quirky scoring metrics, mean for AI in 2026?

https://assemblyai.com/aiexplained

Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:55 - OpenAI Side Quests
01:58 - Claude New Model Coming + Universal Equity?
03:13 - ARC-AGI 3
05:00 - Intentional or Unintentional Gaming?
07:11 - But is it AGI Harbinger? No Harness
09:41 - Not the First
12:32 - Automated Researcher
15:00 - Claw Caveat

Spud: https://www.theinformation.com/articles/openai-ceo-shifts-responsibilities-preps-spud-ai-model?utm_campaign=Editorial&utm_content=Article&utm_medium=organic_social&utm_source=bluesky%2Cfacebook%2Clinkedin%2Cthreads%2Ctwitter&rc=sy0ihq

FT: OpenAI Special Model: https://www.ft.com/content/de9bf0af-b241-424f-8229-5870b1c0d93d?syn-25a6b1a6=1

Jensen Huang: https://www.forbes.com/sites/antoniopequenoiv/2026/03/23/nvidias-jensen-huang-says-he-thinks-weve-achieved-agi/

Axios Article: https://archive.fo/20260326100140/https://www.axios.com/2026/03/26/anthropic-pentagon-ai-deal#selection-827.0-829.257

https://arcprize.org/arc-agi/3

ARC AGI 3 Paper: https://arcprize.org/media/ARC_AGI_3_Technical_Report.pdf

NetHack Leaderboard: https://balrogai.com/
Paper: https://ai.meta.com/research/publications/the-nethack-learning-environment/
https://x.com/_rockt/status/2036864121585438995

Claw Shells: https://x.com/DrJimFan/status/2036494601750716711

OpenAI Automated Researcher: https://www.technologyreview.com/2026/03/20/1134438/openai-is-throwing-everything-into-building-a-fully-automated-researcher/

Patreon Post: https://www.patreon.com/c/aiexplained/posts

Eng Jobs: https://x.com/lennysan/status/2036535460726767793

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/

16 mins

What the New ChatGPT 5.4 Means for the World

Mar 6 2026

Just 48 hours after releasing GPT 5.3 Instant, OpenAI have released GPT 5.4 Thinking, so either there is an imminent singularity or perhaps we are being distracted from other news. This video will give 9 crucial bits of context, not just on the GPT 5.4 drop but on the background to the meltdown between the Pentagon and Anthropic. What does this say about the state of AI progress, your job, and what is next?

Check out my fast-growing (!) app, free to use, and code INSIDER15 for 15% off paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
01:06 - GPT 5.4 Breakdown
05:06 - Closing the Loop
06:35 - Spiky Performance
10:31 - Advice
11:32 - Less Encouraging Developments - Fired Like Dogs
17:45 - But Used in Iran

GPT 5.4: https://openai.com/index/introducing-gpt-5-4/

Hallucinations: https://artificialanalysis.ai/evaluations/omniscience
Investment Banking Bench: https://x.com/bradlightcap/status/2029684672343728452
Move 37: https://x.com/nasqret/status/2029628846518010099
System Card: https://deploymentsafety.openai.com/gpt-5-4-thinking/gpt-5-4-thinking.pdf

Prediction Market Scandal: https://www.wired.com/story/openai-fires-employee-insider-trading-polymarket-kalshi/

GPT 5.3 Instant: https://openai.com/index/gpt-5-3-instant/

GDPVal: https://openai.com/index/gdpval/

Claude in Iran: https://www.washingtonpost.com/technology/2026/03/04/anthropic-ai-iran-campaign

‘Like Dogs’: https://x.com/AndrewCurran_/status/2029605783311470679

Altman leak: https://www.cnbc.com/2026/03/03/sam-altman-tells-openai-staff-operational-decisions-up-to-government.html

Original 2024 Switch: https://archive.fo/20240116172526/https://www.bloomberg.com/news/articles/2024-01-16/openai-working-with-us-military-on-cybersecurity-tools-for-veterans#selection-6173.83-6173.226

Amodei Original Memo: https://www.theinformation.com/articles/read-anthropic-ceos-memo-attacking-openais-mendacious-pentagon-announcement?rc=sy0ihq
Anthropic Apology: https://www.anthropic.com/news/where-stand-department-war
OpenAI Employee Reaction: https://x.com/tszzl/status/2029334980481212820

DoD Supplier Risk: https://www.cnbc.com/amp/2026/03/05/anthropic-pentagon-ai-claude-iran.html
Atlantic Exclusive: https://archive.fo/20260301152646/https://www.theatlantic.com/technology/2026/03/inside-anthropics-killer-robot-dispute-with-the-pentagon/686200/#selection-941.61-941.212
No Negotiation: https://x.com/USWREMichael/status/2029754965778907493

$20B Doubling: https://archive.ph/20260304111124/https://www.bloomberg.com/news/articles/2026-03-03/anthropic-nears-20-billion-revenue-run-rate-amid-pentagon-feud

March 2022 Interview: https://www.youtube.com/watch?v=uAA6PZkek4A

https://lmcouncil.ai/

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

22 mins

Deadline Day for Autonomous AI Weapons & Mass Surveillance

Feb 27 2026

Will Anthropic be forced to make a version of Claude for war? And does a new paper expose the risks of Claude agents, in both OpenClaw and the field of war? Plus, 5 more twists in the story of the Pentagon versus Anthropic + some AI lab employees, and a petition that could change everything, or nothing...

Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:44 - Deadline Day + Petition
02:42 - Twist 1: Existing Deal
03:26 - Twist 2: Existing Policy
04:21 - Twist 3: Twin Threats
05:54 - Twist 4: Interesting Objections
11:32 - Twist 5: Anthropic’s Dropped Policy

Dario Statement: https://www.anthropic.com/news/statement-department-of-war

Google/OpenAI Petition: https://notdivided.org/

Axios on Amodei Rejection: https://www.axios.com/2026/02/26/anthropic-rejects-pentagon-ai-terms

FT on US Threat: https://www.ft.com/content/11d27612-d6c5-4cf7-94dd-f65603549b7f

Politico on Latest: https://archive.ph/20260227013117/https://www.politico.com/news/2026/02/26/incoherent-hegseths-anthropic-ultimatum-confounds-ai-policymakers-00800135

The Verge on Current Deal: https://www.theverge.com/ai-artificial-intelligence/883456/anthropic-pentagon-department-of-defense-negotiations

Anthropic RSP change: https://www.anthropic.com/news/responsible-scaling-policy-v3

Time Magazine on RSP: https://time.com/7380854/exclusive-anthropic-drops-flagship-safety-pledge/

Agent of Chaos Paper: https://x.com/NatalieShapira/status/2026062499599319526

AI Agent Reliability Paper: https://arxiv.org/pdf/2602.16666

My Patreon Video: https://www.patreon.com/posts/real-mystery-ai-151647211

Patreon Documentary: https://www.patreon.com/posts/our-new-age-of-133960279

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/

14 mins

Gemini 3.1 Pro and the Downfall of Benchmarks: Welcome to the Vibe Era of AI

Feb 20 2026

Do we have a new best AI model, or do we have the downfall of benchmarks in general, as a way of capturing machine intelligence? Full breakdown of Gemini 3.1 Pro, guest-starring the new Sonnet 4.6, plus analysis from 7 papers/posts that will give you much-needed context. Oh, and a new record on Simple Bench!

https://epoch.ai/ai-explained-datacenters

Check out my fast-growing (!) app, free to use, and code INSIDER15 for Pro: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:30 - Post-training Dominance
04:00 - ARC-AGI 2 Caveat
05:54 - Simple Bench Record
08:22 - Hallucination Caveat
10:05 - Model Card
11:12 - Exponential Coming
12:20 - Amodei on Generalizing
15:10 - One True Benchmark?
17:02 - Other Metrics…

Gemini 3.1 Model Card: https://storage.googleapis.com/deepmind-media/Model-Cards/Gemini-3-1-Pro-Model-Card.pdf

Release: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-pro/

Where are Agents deployed?: https://www.anthropic.com/research/measuring-agent-autonomy

Newsletter Post: https://signaltonoise.beehiiv.com/p/4-ai-numbers-that-surprised-me-this-week

Hallucination AA: https://artificialanalysis.ai/evaluations/omniscience

Melanie Mitchell: https://x.com/MelMitchell1/status/2022738363548340526
ARC-AGI-2: https://x.com/arcprize/status/2024522812728496470/photo/1

Chollet on Agentic Coding and ML: https://x.com/fchollet/status/2024519439140737442

METR Caveat: https://metr.org/notes/2026-01-22-time-horizon-limitations/

Talaas Fast: https://chatjimmy.ai/

Amodei Interview Continual learning: https://www.dwarkesh.com/p/dario-amodei-2?open=false#%C2%A7002942-is-continual-learning-necessary-how-will-it-be-solved

Metaculus FutureEval: https://www.metaculus.com/futureeval/

Next Vid to Watch: https://www.patreon.com/posts/what-you-need-to-150647292

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/

19 mins

The Two Best AI Models/Enemies Just Got Released Simultaneously

Feb 6 2026

The two models that you will hear discussed for at least the next two months - Claude Opus 4.6 and GPT 5.3 Codex - just got released within 26 mins of each other. The full breakdown of around 250 pages of reports, with just the most interesting moments, from the battle of which is best, Claude personhood, the surprising misbehaviour of Opus 4.6, and much more.

https://assemblyai.com/aiexplained

Check out my fast-growing (!) app, free to use, and code INSIDER15 for Pro: https://lmcouncil.ai

AI Insiders ($9): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:54 - Self-improvement?
02:44 - Knowledge Work
05:30 - Overly agentic behaviour
09:12 - Who Shouldn’t Use Claude Opus
11:39 - Step-change?
15:09 - Claude’s ‘Personhood’

Hassabis Roadmap: https://www.patreon.com/posts/hassabis-roadmap-149750869

Release of Opus 4.6: https://www.anthropic.com/news/claude-opus-4-6
212 Page System Card: https://www-cdn.anthropic.com/0dd865075ad3132672ee0ab40b05a53f14cf5288.pdf
Claude Code Tip: https://x.com/bcherny/status/2019475897691124107

GPT Codex 5.3: https://openai.com/index/introducing-gpt-5-3-codex/
System Card: https://openai.com/index/gpt-5-3-codex-system-card/

Browse Comp: https://arxiv.org/pdf/2504.12516v1
Finance Agent: https://www.vals.ai/benchmarks/finance_agent
Terminal Bench 2: https://arxiv.org/pdf/2601.11868
Vending Bench: https://andonlabs.com/blog/opus-4-6-vending-bench

My X post: https://x.com/AIExplainedYT/status/2016851303436095647

Anthropic Apology: https://x.com/ch402/status/2014066134194995256/photo/1

Altman rebuttal: https://x.com/sama/status/2019139174339928189
https://x.com/sama/status/2019140276246442089

4% of GitHub: https://x.com/dylan522p/status/2019490550911766763

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/

20 mins
