EP 03 · Sep 18, 2024 · 59 min

Rahul Parundekar on production RAG

With Rahul Parundekar, Founder, AI Hero (ex-LlamaIndex, Toyota)

Production RAG end to end: when to fine-tune vs. augment, tricks for squeezing more accuracy out of retrieval, multimodal RAG, on-device LLMs, and why the open-source wave is reshaping the infrastructure conversation.

Show notes

  • 00:00 — Intro
  • 03:46 — Rahul’s journey in AI
  • 06:51 — Challenges in AI: then and now
  • 12:15 — Understanding RAG systems
  • 22:34 — RAG and fine-tuning: a combined approach
  • 31:09 — Challenges and tricks to improve RAG accuracy
  • 33:16 — Optimizing AI infrastructure and hardware
  • 35:48 — On-device LLMs and future applications
  • 38:33 — Small models for specific tasks
  • 43:51 — Exploring multimodal RAG
  • 48:38 — Open source vs. closed source models
  • 54:33 — Future of AI and agentic systems
  • 58:50 — Closing remarks