The Memriq AI Inference Brief – Engineering Edition

Written by: Keith Bourne

About this listen

The Memriq AI Inference Brief – Engineering Edition is a weekly deep dive into the technical guts of modern AI systems: retrieval-augmented generation (RAG), vector databases, knowledge graphs, agents, memory systems, and more. A rotating panel of AI engineers and data scientists breaks down architectures, frameworks, and patterns from real-world projects so you can ship more intelligent systems, faster.

Copyright 2025 Memriq AI
Episodes
  • Recursive Language Models: A Paradigm Shift for Agentic AI Scalability
    Jan 12 2026

    Discover how Recursive Language Models (RLMs) are fundamentally changing the way AI systems handle ultra-long contexts and complex reasoning. In this episode, we unpack how RLMs let models programmatically query corpora two orders of magnitude larger than a traditional transformer's context window, delivering higher accuracy and better cost efficiency for agentic AI applications.

    In this episode:

    - Explore the core architectural shift behind RLMs and how they externalize context via sandboxed Python environments (a minimal sketch follows this list)

    - Compare RLMs against other long-context approaches like Gemini 1.5 Pro, Longformer, BigBird, and RAG

    - Dive into technical trade-offs including latency, cost variability, and verification overhead

    - Hear real-world use cases in legal discovery, codebase analysis, and research synthesis

    - Get practical tips on tooling with the official RLM repo, Modal and Prime sandboxes, and hybrid workflows

    - Discuss open challenges and future research directions for optimizing RLM deployments
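
    For a concrete picture of the recursive-querying pattern discussed above, here is a minimal Python sketch. It is illustrative only and is not the official RLM repo's API: `call_llm` is a hypothetical stand-in for whatever model client (e.g. GPT-5-mini behind an API SDK) a real deployment would use, and the chunking strategy is deliberately simplistic.

    ```python
    from typing import Callable, List

    def recursive_answer(
        question: str,
        corpus: str,
        call_llm: Callable[[str], str],  # hypothetical model client, not a real SDK
        chunk_chars: int = 50_000,
    ) -> str:
        """Answer `question` over `corpus` by recursively condensing chunks."""
        # Base case: the context fits comfortably in a single prompt.
        if len(corpus) <= chunk_chars:
            return call_llm(f"Context:\n{corpus}\n\nQuestion: {question}")

        # Recursive case: split the corpus, extract what each chunk says about
        # the question via cheaper sub-LLM calls, then recurse on the notes.
        chunks: List[str] = [
            corpus[i:i + chunk_chars] for i in range(0, len(corpus), chunk_chars)
        ]
        partials = [
            call_llm(
                f"Context (part {i + 1}/{len(chunks)}):\n{chunk}\n\n"
                f"Extract anything relevant to: {question}"
            )
            for i, chunk in enumerate(chunks)
        ]
        return recursive_answer(question, "\n\n".join(partials), call_llm, chunk_chars)
    ```

    In the paradigm covered in the episode, a loop like this is written and executed by the root model itself inside a sandboxed Python environment (e.g. Modal or Prime), rather than hard-coded by an engineer.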

    Key tools and technologies mentioned:

    - Recursive Language Model (RLM) official GitHub repo

    - Modal and Prime sandboxed execution environments

    - GPT-5 and GPT-5-mini models

    - Gemini 1.5 Pro, Longformer, BigBird architectures

    - Retrieval-Augmented Generation (RAG)

    - Prime Intellect context folding

    - MemGPT, LLMLingua token compression

    Timestamps:

    00:00 - Introduction to Recursive Language Models and agentic AI

    03:15 - The paradigm shift: externalizing context and recursive querying

    07:30 - Benchmarks and performance comparisons with other long-context models

    11:00 - Under the hood: how RLMs orchestrate recursive sub-LLM calls

    14:20 - Real-world applications: legal, code, and research use cases

    16:45 - Technical trade-offs: latency, cost, and verification

    18:30 - Toolbox and best practices for engineers

    20:15 - Future directions and closing thoughts

    Resources:

    "Unlocking Data with Generative AI and RAG" by Keith Bourne - Search for 'Keith Bourne' on Amazon and grab the 2nd edition

    This podcast is brought to you by Memriq.ai - AI consultancy and content studio building tools and resources for AI practitioners.

    Stay tuned and keep pushing the boundaries of AI engineering with Memriq Inference Digest!

    21 mins
  • Evaluating Agentic AI: DeepEval, RAGAS & TruLens Frameworks Compared
    Jan 5 2026

    In this episode of Memriq Inference Digest - Engineering Edition, we explore cutting-edge evaluation frameworks designed for agentic AI systems. Dive into the strengths and trade-offs of DeepEval, RAGAS, and TruLens as we unpack how they address multi-step agent evaluation challenges, production readiness, and integration with popular AI toolkits.

    In this episode:

    - Compare DeepEval’s extensive agent-specific metrics and pytest-native integration for development testing (a minimal pytest example follows this list)

    - Understand RAGAS’s knowledge graph-powered synthetic test generation that slashes test creation time by 90%

    - Discover TruLens’s production-grade observability with hallucination detection via the RAG Triad framework

    - Discuss hybrid evaluation strategies combining these frameworks across the AI lifecycle

    - Learn about real-world deployments in fintech, e-commerce, and enterprise conversational AI

    - Hear expert insights from Keith Bourne on calibration and industry trends
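
    As a concrete illustration of the pytest-native workflow mentioned above, here is a minimal DeepEval-style test. The metric choice, threshold, and hard-coded output are illustrative; check the DeepEval documentation for current class names and signatures before copying this into a project.

    ```python
    from deepeval import assert_test
    from deepeval.metrics import AnswerRelevancyMetric
    from deepeval.test_case import LLMTestCase

    def test_support_bot_answer_relevancy():
        # In a real suite, actual_output would come from your agent or RAG pipeline.
        test_case = LLMTestCase(
            input="How do I reset my password?",
            actual_output="Go to Settings > Security and click 'Reset password'.",
        )
        # Fails the pytest run if the LLM-judged relevancy score is below 0.7.
        assert_test(test_case, [AnswerRelevancyMetric(threshold=0.7)])
    ```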

    Key tools & technologies mentioned:

    DeepEval, RAGAS, TruLens, LangChain, LlamaIndex, LangGraph, OpenTelemetry, Snowflake, Datadog, Cortex AI, DeepTeam

    Timestamps:

    00:00 - Introduction to agentic AI evaluation frameworks

    03:00 - Key metrics and evaluation challenges

    06:30 - Framework architectures and integration

    10:00 - Head-to-head comparison and use cases

    14:00 - Deep technical overview of each framework

    17:30 - Real-world deployments and best practices

    19:30 - Open problems and future directions

    Resources:

    1. "Unlocking Data with Generative AI and RAG" by Keith Bourne - Search for 'Keith Bourne' on Amazon and grab the 2nd edition
    2. This podcast is brought to you by Memriq.ai - AI consultancy and content studio building tools and resources for AI practitioners.

    20 mins
  • Model Context Protocol: The Universal AI Integration Standard Explained
    Dec 15 2025

    Discover how the Model Context Protocol (MCP) is revolutionizing AI systems integration by simplifying complex multi-tool interactions into a scalable, open standard. In this episode, we unpack MCP’s architecture, adoption by industry leaders, and its impact on engineering workflows.

    In this episode:

    - What MCP is and why it matters for AI/ML engineers and infrastructure teams

    - The M×N integration problem (M clients × N tools, each pair needing a bespoke adapter) and how MCP reduces it to M + N protocol implementations

    - Core primitives: Tools, Resources, and Prompts, and their roles in MCP (a minimal server sketch follows this list)

    - Technical deep dive into JSON-RPC 2.0 messaging, transports, and security with OAuth 2.1 + PKCE

    - Comparison of MCP with OpenAI Function Calling, LangChain, and custom REST APIs

    - Real-world adoption, performance metrics, and engineering trade-offs

    - Open challenges including security, authentication, and operational complexity
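
    To make the Tools and Resources primitives concrete, here is a minimal MCP server sketch using the FastMCP Python SDK named below. The decorator-style API follows the official python-sdk quickstart as we understand it; treat the server name, resource URI, and return types as illustrative and verify against the SDK documentation.

    ```python
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("demo-server")  # illustrative server name

    @mcp.tool()
    def add(a: int, b: int) -> int:
        """Exposed to MCP clients as a callable Tool."""
        return a + b

    @mcp.resource("config://app")
    def app_config() -> str:
        """Exposed as a read-only Resource the client can pull in as context."""
        return "environment=staging"

    if __name__ == "__main__":
        # Defaults to the stdio transport, so any MCP-capable client can launch
        # this process and exchange JSON-RPC 2.0 messages with it.
        mcp.run()
    ```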

    Key tools & technologies mentioned:

    - Model Context Protocol (MCP)

    - JSON-RPC 2.0

    - OAuth 2.1 with PKCE

    - FastMCP Python SDK, MCP TypeScript SDK

    - agentgateway by Solo.io

    - OpenAI Function Calling

    - LangChain

    Timestamps:

    00:00 - Introduction to MCP and episode overview

    02:30 - The M×N integration problem and MCP’s solution

    05:15 - Why MCP adoption is accelerating

    07:00 - MCP architecture and core primitives explained

    10:00 - Head-to-head comparison with alternatives

    12:30 - Under the hood: protocol mechanics and transports

    15:00 - Real-world impact and usage metrics

    17:30 - Challenges and security considerations

    19:00 - Closing thoughts and future outlook

    Resources:

    • "Unlocking Data with Generative AI and RAG" by Keith Bourne - Search for 'Keith Bourne' on Amazon and grab the 2nd edition
    • This podcast is brought to you by Memriq.ai - AI consultancy and content studio building tools and resources for AI practitioners.

    20 mins