
Chatbots Behaving Badly


Written by: Markus Brinsa

About this listen

They were supposed to make life easier. Instead, they flirted with your customers, hallucinated facts, and advised small business owners to break the law. We’re not here to worship the machines. We’re here to poke them, question them, and laugh when they break. Welcome to Chatbots Behaving Badly — a podcast about the strange, hilarious, and sometimes terrifying ways AI gets it wrong. New episodes drop every Tuesday, covering the strange, brilliant, and dangerous world of generative AI — from hallucinations to high-stakes decisions in healthcare. This isn’t another hype-fest. It’s a podcast for people who want to understand where we’re really heading — and who’s watching the machines.

markusbrinsa.substack.com
Markus Brinsa
Economics
Episodes
  • Confidently Wrong - The Hallucination Numbers Nobody Likes to Repeat
    Jan 13 2026

    Confident answers are easy. Correct answers are harder.

    This episode takes a hard look at LLM “hallucinations” through the numbers that most people avoid repeating. A researcher from the Epistemic Reliability Lab explains why error rates can spike when a chatbot is pushed to answer instead of admit uncertainty, how benchmarks like SimpleQA and HalluLens measure that trade-off, and why some systems can look “helpful” while quietly getting things wrong.

    Along the way: recent real-world incidents where AI outputs created reputational and operational fallout, why “just make it smarter” isn’t a complete fix, and what it actually takes to reduce confident errors in production systems without breaking the user experience.

    This episode is based on the articles “Hallucination Rates in 2025 - Accuracy, Refusal, and Liability” (https://seikouri.com/hallucination-rates-in-2025-accuracy-refusal-and-liability) and “The Lie Rate - Hallucinations Aren’t a Bug. They’re a Personality Trait.” (https://chatbotsbehavingbadly.com/the-lie-rate-hallucinations-aren-t-a-bug-they-re-a-personali) by Markus Brinsa.



    This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit markusbrinsa.substack.com
    14 mins
  • The Day Everyone Got Smarter and Nobody Did
    Jan 6 2026

    This episode digs into the newest workplace illusion: AI-powered expertise that looks brilliant on the surface and quietly hollow underneath. Generative tools are polishing emails, reports, and “strategic” decks so well that workers feel more capable while their underlying skills slowly erode. At the same time, managers are convinced that AI is a productivity miracle—often based on research they barely understand and strategy memos quietly ghostwritten by the very systems they are trying to evaluate.

    Through an entertaining, critical conversation, the episode explores how this illusion of expertise develops, why “human in the loop” is often just a comforting fiction, and how organizations accumulate cognitive debt when they optimize for AI usage instead of real capability. It also outlines what a saner approach could look like: using AI as a sparring partner rather than a substitute for thinking, protecting spaces where humans still have to do the hard work themselves, and measuring outcomes that actually matter instead of counting how many times someone clicked the chatbot.

    The episode is based on the article “The Day Everyone Got Smarter, and Nobody Did” by Markus Brinsa.

    https://chatbotsbehavingbadly.com/the-day-everyone-got-smarter-and-nobody-did



    This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit markusbrinsa.substack.com
    18 mins
  • Chatbots Crossed The Line
    Dec 9 2025

    This episode of Chatbots Behaving Badly looks past the lawsuits and into the machinery of harm. Together with clinical psychologist Dr. Victoria Hartman, we explain why conversational AI so often “feels” therapeutic while failing basic mental-health safeguards. We break down sycophancy (optimization for agreement), empathy theater (human-like cues without duty of care), and parasocial attachment (bonding with a system that cannot repair or escalate).

    We also cover the statistical and product realities that make crisis detection hard—low base rates, steerable personas, evolving jailbreaks—and outline what a care-first design would require: hard stops at early risk signals, human handoffs, bounded intimacy for minors, external red-teaming with veto power, and incentives that prioritize safety over engagement. Practical takeaways for clinicians, parents, and heavy users close the show: name the limits, set fences, and remember that tools can sound caring—but people provide care.

    The episode is based on the article “Chatbots Crossed the Line” by Markus Brinsa.

    https://chatbotsbehavingbadly.com/chatbots-crossed-the-line



    This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit markusbrinsa.substack.com
    11 mins