Chain of Thought | AI Agents, Infrastructure & Engineering

Written by: Conor Bronsdon

About this listen

AI is reshaping infrastructure, strategy, and entire industries. Host Conor Bronsdon talks to the engineers, founders, and researchers building breakthrough AI systems about what it actually takes to ship AI in production, where the opportunities lie, and how leaders should think about the strategic bets ahead. Chain of Thought translates technical depth into actionable insights for builders and decision-makers. New episodes weekly.

Conor Bronsdon is an angel investor in AI and dev tools, Technical Ecosystem Lead at Modular, and previously led growth at AI startups Galileo and LinearB.

Disclaimer: All views, opinions and statements expressed on this account are solely my own and are made in my personal capacity. They do not reflect, and should not be construed as reflecting, the views, positions, or policies of Modular. This account is not affiliated with, authorized by, or endorsed by Modular in any way.

Economics

Episodes

Every AI Agent Has an Evaluation Gap | Alex Ratner, Snorkel AI

Apr 29 2026
Alex Ratner co-founded Snorkel AI out of Chris Ré's Stanford lab and helped establish data-centric AI as a field. Today, Snorkel is a $1.3B company shipping thousands of data sets and environments a week to frontier labs and vertical AI teams like Harvey.
Ratner argues that our ability to build AI agents has outpaced our ability to measure them. That gap is what's keeping most enterprise agents stuck in demo purgatory.
If you can't measure it, you can't improve it. And you can't deploy it.
In this conversation:
The three axes of the evaluation gap: input complexity, autonomy horizon, and output complexity
Big Law Bench: how Snorkel and Harvey benchmarked legal agents on deep-research tasks that take lawyers 10-15 hours
What Snorkel's $3M Open Benchmarks Grant is funding, and why "benchmaxxing" critiques don't kill the case for public benchmarks
Why 40-50% of Snorkel's data work is still review and labeling, even with the best models in the loop
The "expert-agentic" era, where domain expertise (law, finance, coding, even woodworking) is the new bottleneck
Why self-supervision is a dead end outside narrow cases like distillation
The false dichotomy between data and environments, and why pure-environment vendors miss how AI actually works
Chapters
(00:00) Intro: Alex Ratner and Snorkel AI
(02:50) What the evaluation gap actually is
(06:05) Moravec's paradox and the jagged frontier
(08:46) Where AI agents fall down in enterprise work
(10:40) Big Law Bench: benchmarking Harvey's legal agents
(12:00) The three axes: input, autonomy horizon, output
(18:31) Snorkel's $3M Open Benchmarks Grant
(22:33) From "janitorial" to epicenter: 15 years of data-centric AI
(29:26) The expert-agentic data era
(34:54) The false dichotomy between data and environments
(40:05) DoorDash Tasks and expert data at scale
Connect with Alex Ratner:
X/Twitter: https://x.com/ajratner
Snorkel AI: https://snorkel.ai
Connect with Conor:
Newsletter: https://newsletter.chainofthought.show/
Twitter/X: https://x.com/ConorBronsdon
LinkedIn: https://www.linkedin.com/in/conorbronsdon/
YouTube: https://www.youtube.com/@ConorBronsdon
More episodes: https://chainofthought.show
Thanks to Galileo — download their free 165-page guide to mastering multi-agent systems at galileo.ai/mastering-multi-agent-systems
43 mins

250,000 Lines of Code/Week: Inside an AMD VP's Agent-First Workflow | Anush Elangovan

Apr 22 2026
What happens when a VP of AI Software at a major chip company goes all-in on AI coding agents for his own team's work?
Anush Elangovan runs 10–12 Claude Code agents across three machines, burns 6.5 billion tokens a week, and rewrote a 25-year-old project (Slurm → Spur in Rust) in a single night.
He does it all with Claude Code's --dangerously-skip-permissions flag enabled.
About Anush
Anush Elangovan is Corporate VP of AI Software at AMD. He founded Nod.ai, where his team built SHARK and was a primary contributor to Torch-MLIR and IREE. AMD acquired Nod.ai in 2023, and Anush now leads AI software strategy across AMD's full silicon portfolio. Before Nod.ai, he shipped the graphics stack on the first ARM Chromebook and led Chrome OS's migration to Gentoo.
We cover:
How Anush runs 10–12 parallel agents with a geo-distributed AMD hardware rig
Why the test harness is the new code review (and why agents are "sneaky and dumb")
Rewriting a 25-year-old project in Rust overnight, without opening the editor
Why every new project is in Rust specifically because he refuses to learn it
The "HR partner fixing engineering bugs" moment and what it says about upskilling
Why normal SDLC is dead and speed is the only durable moat
AMD's fully open-source software stack and how community contributions are accelerating ROCm
"Software is just tokens" and what that means for AMD's bet against CUDA lock-in
Connect with Anush
LinkedIn: linkedin.com/in/anushelangovan
Twitter/X: @AnushElangovan
AMD AI blog: amd.com
AMD AI Developer Program: amd.com/developer
Connect with Conor
Newsletter: newsletter.chainofthought.show
Twitter/X: @ConorBronsdon
LinkedIn: linkedin.com/in/conorbronsdon
YouTube: @ConorBronsdon
More episodes: chainofthought.show
Chapters
0:00 Cold open
0:21 Welcome + guest intro
3:43 250K lines a week, 10–12 parallel agents
7:34 Agent architecture + geo-distributed test rig
9:57 When does AI-generated code become a liability?
14:12 80% tests first: the test harness philosophy
18:24 Dangerously-skip-permissions + testing as code review
19:52 "Normal SDLC is dead in the agentic world"
20:44 Advice for engineers and leaders who feel behind
24:51 Tokens, throughput, and what happens next
26:29 Block layoffs, uneven AI gains, the 25-year Slurm rewrite
32:55 Galileo sponsor break
34:24 When agents go off the rails: sneaky and dumb
37:52 Orchestrator agents vs. focused multi-threading
40:45 Open source, ROCm, AMD's software bet
44:19 "Software is just tokens"
45:24 AMD Developer Program + community contributions
47:09 Where to start with AMD
48:39 Heterogeneous compute
50:13 Outro
Thanks to Galileo. Download their free 165-page guide to mastering multi-agent systems at galileo.ai/mastering-multi-agent-systems
Full show notes: newsletter.chainofthought.show
Disclaimer from our host: All views, opinions and statements expressed on this account are solely my own and are made in my personal capacity. They do not reflect, and should not be construed as reflecting, the views, positions, or policies of my employer. This account is not affiliated with, authorized by, or endorsed by my employer in any way.
51 mins

Hallucinations Are a Data Architecture Problem | Sudhir Hasbe, Neo4j

Apr 16 2026
Sudhir Hasbe is President and Chief Product Officer at Neo4j, the graph database company powering 84 of the Fortune 100 (Walmart, Uber, Airbus) at $200M+ ARR and a $2B+ valuation. Before Neo4j, he ran product for all of Google Cloud's data analytics services (BigQuery, Looker, Dataflow) and led the Looker acquisition.
His thesis: the hallucinations we blame on AI models are really a data architecture problem. LLMs weren't trained on your enterprise knowledge, so handing them a data lake with 10,000 disconnected tables and asking them to reason is the wrong design. The fix is knowledge graphs: feeding the model a structured map of relationships, entities, and context so it can reason over meaning, not just vector similarity.
Sudhir breaks down the five capabilities knowledge graphs unlock for enterprise AI: GraphRAG (moving accuracy from 60% to 97%), semantic mapping across siloed systems, context graphs, agent memory, and multi-hop reasoning. He explains three architecture patterns customers are actually shipping, why giving an LLM hundreds of tools makes it worse, and what Uber, EA Sports, Klarna, and Novo Nordisk are doing differently.
This is the case for treating knowledge as infrastructure.
We cover:
Why enterprise AI needs a different playbook than consumer AI
The five data asset types every agentic system needs: system of record, historical, memory, context, and reference
How GraphRAG combines vector search and graph traversal to move from 60% accuracy to 95%+
Three architecture patterns: semantic layer only, semantic map plus domain data, full consolidation (the Klarna/Kiki model)
What context graphs capture that Salesforce doesn't: the Slack and email negotiation behind every deal
Why giving an LLM hundreds of tools drops accuracy, and how Uber uses knowledge graphs as a business validation layer
What Neo4j's Aura Agent, MCP server, and A2A support mean for developers starting today
Chapters:
(0:00) Why building a self-driving car is hard
(0:22) Intro
(2:03) Hallucinations as a data architecture problem
(4:31) From models-as-core to systems-of-knowledge
(6:13) Why data lakes fail AI agents
(9:15) The five data asset types enterprise agents need
(11:46) Where basic RAG breaks down: the Spotify metadata lesson
(16:00) GraphRAG: 3x accuracy, easier development, explainability
(18:47) Semantic mapping across the enterprise estate
(19:23) Three knowledge-graph architecture patterns
(22:42) Context graphs: capturing the "why" behind decisions
(25:33) Individual vs. organizational agent memory
(28:40) Multi-hop reasoning for fraud rings and AML
(31:52) Why there are no shortcuts in enterprise AI
(36:38) What happens when you give an LLM 100 tools
(39:19) The Uber example: knowledge graph as business validation
(44:42) First mile of a 26-mile marathon
(48:32) Aura Agent, MCP server, and the A2A protocol
(50:43) Where developers should start
Connect with Sudhir Hasbe:
LinkedIn: https://www.linkedin.com/in/shasbe/
Neo4j: https://neo4j.com/
Neo4j Aura: https://neo4j.com/product/auradb/
Connect with Conor:
Newsletter: https://newsletter.chainofthought.show/
Twitter/X: https://x.com/ConorBronsdon
LinkedIn: https://www.linkedin.com/in/conorbronsdon/
YouTube: https://www.youtube.com/@ConorBronsdon
More episodes: https://chainofthought.show
Thanks to Galileo. Download their free 165-page guide to mastering multi-agent systems at galileo.ai/mastering-multi-agent-systems
52 mins
