Episodes

  • Module 6: RAG | Chunking - Where You Cut Decides What Gets Found
    Apr 29 2026

    This episode is about chunking, the quiet step in a RAG pipeline that decides whether your system retrieves the right answer or a confidently wrong one. It covers why the chunk is the real unit of retrieval, the tradeoff between context and precision, the main strategies teams use to split documents, and why testing your chunks against real questions matters more than picking the perfect size.

    11 mins
  • Module 6: RAG | Data Ingestion - Before Your Documents Can Be Found
    Apr 27 2026

    This episode is about data ingestion, the step that every RAG system depends on. Before meaning can be stored or retrieved, your raw documents have to become clean text. What goes wrong here breaks the entire pipeline in ways that are surprisingly hard to catch.

    12 mins
  • Module 6: RAG | Vector Databases - Where That Meaning Gets Stored
    Apr 27 2026

    This episode is about the infrastructure underneath every RAG system. It covers the purpose-built engine that stores all that meaning and searches millions of vectors in milliseconds, a workload traditional databases were not designed for. This is what makes retrieval fast enough to actually work in production.

    10 mins
  • Module 6: RAG | Embeddings - Teaching Machines to Understand Meaning
    Apr 27 2026

    This episode is about the layer of RAG that makes semantic search possible. It covers how machines turn language into math that clusters similar ideas together, so a question and its answer can find each other even when they share no words. Without this, RAG is just keyword search with extra steps.

    8 mins
  • Module 6: The RAG Pipeline - End to End
    Apr 25 2026

    This episode maps out the full RAG pipeline end to end using one concrete scenario, a defense contractor building an AI assistant for fighter jet maintenance crews. It walks through both phases of the architecture, offline and online, following a real question all the way from a raw document to a grounded answer. It also covers why the architecture is modular and closes with the four failure modes that quietly break RAG systems in production.

    13 mins
  • Module 6: What is RAG and Why it Exists
    Apr 25 2026

    This episode kicks off Module 6 with RAG (Retrieval-Augmented Generation), one of the most widely adopted architectures for enterprise LLM applications. Discover why regular LLMs hallucinate on your private data and high-stakes queries, and how RAG counters this by having the model retrieve real documents before it answers.

    9 mins
  • Module 5: Reasoning Models
    Apr 17 2026

    This episode covers reasoning models, the shift from manually guiding a model's thinking to letting the model reason through complex problems on its own before responding. It explains the concept of test-time compute, why reasoning models take longer but perform dramatically better on hard tasks, and how they change the way you should prompt. It walks through when to reach for a reasoning model versus a standard one, and closes by framing the full prompt engineering toolkit in context, from few-shot examples through reasoning models.

    8 mins
  • Module 5: Structured Output and the Language of Software
    Apr 17 2026

    This episode covers structured output, how you get a model to respond in predictable, machine-readable formats like JSON instead of natural language paragraphs. It walks through three approaches, from simply asking in the prompt, to JSON mode, to schema-based constraints, and explains why each level adds more reliability. It uses real-world examples to show how structured output turns AI from a conversation partner into a software component that can feed databases, trigger workflows, and drive automation. It closes with practical tips for writing schemas and validating output in production.

    8 mins