EP 03 · Sep 18, 2024 · 59 min
Rahul Parundekar on production RAG
With Rahul Parundekar, Founder, AI Hero (ex-LlamaIndex, Toyota)
Production RAG end to end — when to fine-tune vs. augment, tricks to squeeze accuracy out of retrieval, multimodal RAG, on-device LLMs, and why the open-source wave is reshaping the infrastructure conversation.
Show notes
- 00:00 — Intro
- 03:46 — Rahul’s journey in AI
- 06:51 — Challenges in AI: then and now
- 12:15 — Understanding RAG systems
- 22:34 — RAG and fine-tuning: a combined approach
- 31:09 — Challenges and tricks to improve RAG accuracy
- 33:16 — Optimizing AI infrastructure and hardware
- 35:48 — On-device LLMs and future applications
- 38:33 — Small models for specific tasks
- 43:51 — Exploring multimodal RAG
- 48:38 — Open source vs. closed source models
- 54:33 — Future of AI and agentic systems
- 58:50 — Closing remarks