
Your First AI at Home

Domesticating AI S01E01: Your First AI at Home
Hosts: Miriah Peterson, Matt Sharp, Chris Brousseau

This episode is your practical on-ramp to running AI at home: why inference engines matter, what to install first, and how to make "local AI" feel stable instead of fragile. The hosts start with a hardware and market reality check (tinygrad's tinybox-style "AI server appliance" idea and the ongoing memory/RAM crunch), then break down what an inference engine actually does, how popular runtimes compare (llama.cpp, vLLM, Ollama, TGI), and a sane starter workflow for getting from "downloaded a model" to "usable local AI."

Key takeaways
- Inference engines are the "runtime": model loading, tokenization, KV cache/context handling, and the serving layer.
- Pick your engine based on your goal: tinkering (llama.cpp) vs. serving throughput (vLLM/TGI) vs. it-just-works packaging (Ollama).
- You don't need a brand-new rig to start, but RAM/VRAM constraints will shape everything.
- Use leaderboards as a hint, then validate with your own small eval prompts that match your workload.
- If you're exposing anything beyond your LAN: reverse proxy + TLS, and don't casually open ports.

Timestamps
0:00 Intro + host chaos + what the show is
1:08 News: tinygrad / "AI server appliance" thinking (tinybox vibes)
2:44 News: RAM prices + the memory crunch for builders
8:26 Main: building your first AI at home (why now)
8:49 What is an inference engine?
12:30 Engines compared: llama.cpp vs. vLLM vs. Ollama vs. TGI
15:42 Do you need to buy a new computer? (CPU vs. GPU realities)
25:32 Models for home: fit-to-hardware, quantization, context
34:37 Leaderboards vs. evals: picking models you can trust
44:00 Community + meetups + where to follow
45:22 Outro: "Keep your AI on a leash"

News / context
- Tom's Hardware: TinyBox production + multi-GPU appliance concept (Tom's Hardware)
- Reuters: AI-driven memory shortage / supply-chain crunch (Reuters)
- IDC: 2026 device impacts from the memory shortage (IDC)

Inference engines
- llama.cpp (GGML org) (GitHub)
- vLLM OpenAI-compatible server (docs.vllm.ai)
- Ollama docs (quickstart) (Ollama Documentation)
- Hugging Face Text Generation Inference (TGI) (GitHub)

Hosts
- Miriah Peterson: software engineer, Go educator, and community builder focused on production-first AI. Runs SoyPete Tech (streams + writing + open source).
- Matt Sharp: AI engineer/strategist, co-author of LLMs in Production, MLOps practitioner. Writes The Data Pioneer. (thedatapioneer.substack.com)
- Chris Brousseau: NLP practitioner, co-author of LLMs in Production, VP of AI at VEOX. You can find him as IMJONEZZ. (veox.ai)

Links
- SoyPete Tech (YouTube): (youtube.com)
- SoyPete Tech (Substack): (soypetetech.substack.com)
- Matt's Substack (The Data Pioneer): (thedatapioneer.substack.com)
- Chris on YouTube (IMJONEZZ): (youtube.com)
- LLMs in Production (book): (Manning Publications)
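To make the "downloaded a model" to "usable local AI" step concrete: both vLLM and Ollama can serve an OpenAI-compatible HTTP API, so a few lines of stdlib Python are enough to talk to a model once an engine is running. This is a minimal sketch, not from the episode itself; the base URL, port, and model name below are placeholder assumptions (vLLM defaults to port 8000, Ollama to 11434), and it assumes a server is already listening.

```python
import json
import urllib.request

# Assumptions: an OpenAI-compatible server (e.g. vLLM or Ollama) is already
# running locally. BASE_URL and MODEL are placeholders -- point them at
# whatever your engine actually serves.
BASE_URL = "http://localhost:8000/v1"
MODEL = "your-model-name"

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Build the JSON body for a /v1/chat/completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,  # keep local answers fairly deterministic
    }

def chat(prompt: str) -> str:
    """POST the prompt to the local engine and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

Usage is just `chat("What is an inference engine?")` with the server up. Because the client only speaks the OpenAI-style API, you can swap engines (llama.cpp's server, vLLM, Ollama, TGI) without changing this code.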