Rahul Parundekar on production RAG

Production RAG end to end — when to fine-tune vs. augment, tricks to squeeze accuracy out of retrieval, multimodal RAG, on-device LLMs, and why the open-source wave is reshaping the infrastructure conversation.

Show notes

00:00 — Intro
03:46 — Rahul’s journey in AI
06:51 — Challenges in AI: then and now
12:15 — Understanding RAG systems
22:34 — RAG and fine-tuning: a combined approach
31:09 — Challenges and tricks to improve RAG accuracy
33:16 — Optimizing AI infrastructure and hardware
35:48 — On-device LLMs and future applications
38:33 — Small models for specific tasks
43:51 — Exploring multimodal RAG
48:38 — Open-source vs. closed-source models
54:33 — Future of AI and agentic systems
58:50 — Closing remarks