Improving Real-World RAG Systems
Key Challenges & Practical Solutions
Dipanjan (DJ) Sarkar
Head of Community & Principal AI Scientist at Analytics Vidhya
Published Author, Google Developer Expert & Cloud Champion Innovator
Slides & Code
[Link]
Understanding RAG Systems
What is a RAG System?
[Diagram: raw files, APIs and databases are indexed into vector stores; a user submits a query, the system retrieves context, and returns a response]
RAG System Architecture - Data Indexing
RAG System Architecture - Search and Generation
RAG System Challenges & Practical Solutions
Key Failure or Pain Points in a RAG System
Source: Seven Failure Points When Engineering a Retrieval Augmented Generation System
Problem: Missing Content
• Missing Content means the relevant context
to answer the question is not present in the
database
• Leads to the model giving a wrong answer
and hallucinating
• End users end up being frustrated with
irrelevant or wrong responses
Solutions for Missing Content
• Better data cleaning using tools like [Link] to ensure we extract good quality data
• Better prompting to constrain the model to NOT answer the question if the context is irrelevant
• Agentic RAG with search tools to get live information for questions with no context data
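The "better prompting" fix can be as simple as a refusal clause in the prompt template. A minimal sketch (the exact wording is illustrative, not a prescribed template):

```python
# A grounding prompt that tells the model to refuse rather than hallucinate
# when the retrieved context does not contain the answer.
GROUNDED_PROMPT = """Answer the question using ONLY the context below.
If the context does not contain the information needed, reply exactly:
"I don't know based on the available documents." Do not guess.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(context: str, question: str) -> str:
    """Fill the template; the resulting string is sent to any chat LLM."""
    return GROUNDED_PROMPT.format(context=context, question=question)
```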
Hands-on Demo
• Better Data Cleaning
• Better Prompting
• Agentic RAG with Tools
• Get the notebook from HERE
Problem: Missed Top Ranked
• Missed Top Ranked means context
documents don’t appear in the top retrieval
results
• Leads to the model not being able to answer the
question
• Documents to answer the question are
present but failed to get retrieved due to
poor retrieval strategy
Problem: Not in Context
• Not in Context means documents with the
answer are present during initial retrieval
but did not make it into the final context for
generating an answer
• Bad retrieval, reranking and consolidation
strategies lead to missing out on the right
documents in context
Problem: Not Extracted
• Not extracted means the LLM struggles to
extract the correct answer from the
provided context even if it has the answer
• This occurs when there is too much
unnecessary information, noise or
contradicting information in the context
Problem: Incorrect Specificity
• Output response is too vague and is not
detailed or specific enough
• Vague or generic queries might lead to not
getting the right context and response
• Wrong chunking or bad retrieval can lead to
this problem
Solutions for Missed Top Ranked, Not in Context & Incorrect Specificity
• Use Better Chunking Strategies
• Hyperparameter Tuning - Chunking & Retrieval
• Use Better Embedder Models
• Use Advanced Retrieval Strategies
• Use Context Compression Strategies
• Use Better Reranker Models
Experiment with Various Chunking Strategies
Splitter Type - Description
• RecursiveCharacterTextSplitter - Recursively splits text into chunks based on several defined characters. Tries to keep related pieces of text next to each other. LangChain's recommended way to start splitting text
• CharacterTextSplitter - Splits text based on a user-defined character. One of the simpler text splitters
• tiktoken - Splits text based on tokens using trained LLM tokenizers like GPT-4's
• spaCy - Splits text using the tokenizer from the popular NLP library spaCy
• SentenceTransformers - Splits text based on tokens using trained open LLM tokenizers available from the popular sentence-transformers library
• unstructured [Link] - The unstructured library allows various splitting and chunking strategies, including splitting text based on key sections and titles
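As a rough illustration of what these splitters do, here is a naive fixed-size chunker with overlap. Real splitters such as LangChain's RecursiveCharacterTextSplitter additionally respect separators like paragraphs and sentences, so treat this only as a sketch of the core idea:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Naive fixed-size character chunking with overlap between
    consecutive chunks, so context is not cut off at chunk boundaries."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

The overlap means the last `overlap` characters of each chunk reappear at the start of the next one, which helps retrieval when the answer straddles a boundary.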
Hyperparameter Tuning - Chunking & Retrieval
[Diagram: documents are split into chunks of size C and indexed in a vector DB; for each question, the top K chunks above similarity threshold S form the context the LLM uses to generate an answer, which is scored with eval metrics]
Search space:
• Chunk Size (C): 500, 1,000, 2,000
• Top-K (K): 5, 8, 10
• Similarity Threshold (S): 0.2, 0.3, 0.5
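A grid like this can be searched exhaustively. A minimal sketch, where `evaluate` is a stand-in for your own pipeline (re-index at chunk size C, retrieve top-K above threshold S, score answers on an eval set):

```python
from itertools import product

# Candidate values taken from the grid above.
C_VALUES = [500, 1000, 2000]   # chunk size
K_VALUES = [5, 8, 10]          # top-k retrieved chunks
S_VALUES = [0.2, 0.3, 0.5]     # similarity threshold

def tune(evaluate):
    """Return the (C, K, S) combination with the best eval metric.
    evaluate(c, k, s) -> float, higher is better; in practice it would
    run the full index/retrieve/generate/eval loop for that setting."""
    return max(product(C_VALUES, K_VALUES, S_VALUES),
               key=lambda params: evaluate(*params))
```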
Better Embedder Models - MTEB Leaderboard
Better Embedder Models - Experiment Yourself
[Diagram: an embedding model maps the text "hello world" to a vector, e.g. -0.027 -0.001 -0.020 ... -0.023]
• Newer embedder models are trained on more data and are often better
• Don't just go by benchmarks - use and experiment on your own data
• Do not use commercial models if data privacy is important
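One way to "experiment on your data" is to measure the retrieval hit rate of each candidate embedder on a small eval set. A sketch, where `embed` is any embedding function (e.g. a sentence-transformers model's `encode` method):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieval_hit_rate(embed, eval_pairs, corpus, k=3):
    """Fraction of (query, relevant_doc) pairs where the relevant doc
    lands in the top-k documents ranked by cosine similarity."""
    doc_vecs = {doc: embed(doc) for doc in corpus}
    hits = 0
    for query, relevant in eval_pairs:
        qv = embed(query)
        ranked = sorted(corpus, key=lambda d: cosine(qv, doc_vecs[d]),
                        reverse=True)
        hits += relevant in ranked[:k]
    return hits / len(eval_pairs)
```

Running this with two different embedders on the same `eval_pairs` gives a direct, task-specific comparison instead of relying on leaderboard numbers.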
Advanced Retrieval Strategies
• Semantic Similarity Thresholding
• Multi-query Retrieval
• Hybrid Search (Keyword + Semantic)
• Reranking
• Chained Retrieval
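Hybrid search is often implemented by fusing the keyword and semantic result lists, for example with reciprocal rank fusion. A minimal sketch (k=60 is a commonly used smoothing constant, not a retrieval top-k):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse multiple ranked lists (e.g. one from BM25 keyword search and
    one from vector search) into a single ranking. Each document scores
    1 / (k + rank) per list it appears in; higher total wins."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

Documents that rank well in either list float to the top, which is why hybrid search helps when a query is phrased with exact keywords the embedder underweights.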
Better Reranker Models
• Rerankers are fine-tuned cross-encoder
transformer models
• These models take in a (Query, Document) pair
and return a relevance score
• Models fine-tuned on more pairs and
released recently will usually be better
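The reranking step itself is simple once you have such a model. A sketch, where `score_fn` is a stand-in for a real cross-encoder (e.g. the `predict` method of a sentence-transformers `CrossEncoder`):

```python
def rerank(query, docs, score_fn, top_n=3):
    """Re-order retrieved documents by cross-encoder relevance and keep
    the best top_n. score_fn(query, doc) -> float, higher = more relevant."""
    ranked = sorted(docs, key=lambda d: score_fn(query, d), reverse=True)
    return ranked[:top_n]
```

Typically you over-retrieve (say top 20 by vector similarity) and then rerank down to the handful of documents that actually enter the prompt.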
Context Compression Strategies
• LLM prompt-based Context
Compression
• Extractor: Filters out content from context document
not related to query
• Filter: Filters out whole context documents not
related to query
• Microsoft LLMLingua Prompt
Compression
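To illustrate the extractor idea, here is a toy sentence-level filter based on word overlap with the query. A production system would use an LLM-based extractor or LLMLingua instead; this only shows the shape of the operation:

```python
def compress_context(question, document, min_overlap=1):
    """Keep only sentences sharing at least min_overlap words with the
    question; everything else is dropped before prompting the LLM."""
    q_words = set(question.lower().split())
    kept = []
    for sentence in document.split("."):
        overlap = len(q_words & set(sentence.lower().split()))
        if sentence.strip() and overlap >= min_overlap:
            kept.append(sentence.strip())
    return ". ".join(kept)
```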
Solutions for Missed Top Ranked, Not in Context, Not Extracted & Incorrect Specificity
• Effect of Embedder Models
• Advanced Retrieval Strategies
Hands-on Demo
• Chained Retrieval with Rerankers
• Context Compression Strategies
• Get the notebook from HERE
Problem: Wrong Format
• The output response is in the wrong format
• This happens when you tell the LLM to return the
response in a specific format, e.g. JSON, and it fails
to do so
Solutions for Wrong Format
• Powerful LLMs have native support for response formats, e.g. OpenAI supports JSON outputs
• Better Prompting and Output Parsers
• Structured Output Frameworks
Solutions for Wrong Format - Native LLM Support
Solutions for Wrong Format - Output Parsers & Better Prompting
• LangChain allows you to convert the raw LLM
response into a more consumable format by
using Output Parsers.
• There exists a variety of parsers including:
• String parser
• CSV parser
• Pydantic parser
• JSON parser
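The core job of an output parser can be sketched without any framework: strip the code fences models sometimes wrap JSON in, parse it, and validate the fields. The `Answer` schema below is hypothetical; LangChain's JSON/Pydantic parsers bundle the same steps with retry-on-failure logic:

```python
import json
from dataclasses import dataclass

@dataclass
class Answer:
    answer: str
    confidence: float

def parse_llm_json(raw: str) -> Answer:
    """Parse an LLM response expected to be JSON into a typed object,
    tolerating the ```json fences models often add around output."""
    cleaned = raw.strip().removeprefix("```json").removesuffix("```").strip()
    data = json.loads(cleaned)
    return Answer(answer=str(data["answer"]),
                  confidence=float(data["confidence"]))
```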
Solutions for Wrong Format - Structured Output Frameworks
Solutions for Wrong Format
Hands-on Demo
• Native LLM Support
• Output Parsers
• Get the notebook from HERE
Problem: Incomplete
• Incomplete means the generated response only
partially answers the question
• This could be because of poorly worded questions,
lack of the right retrieved context, or bad LLM reasoning
Solutions for Incomplete
• Use Better LLMs like GPT-4o, Claude 3.5 or Gemini 1.5
• Build Agentic Systems with Tool Use if necessary
• Use Advanced Prompting Techniques like Chain-of-Thought, Self-Consistency
• Rewrite the User Query and Improve Retrieval - HyDE
HyDE - Hypothetical Document Embedding
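The HyDE idea fits in a few lines: instead of embedding the raw question, embed a hypothetical answer generated by the LLM and search with that. Here `generate`, `embed` and `search` are stand-ins for your LLM, embedder and vector store:

```python
def hyde_retrieve(question, generate, embed, search, k=5):
    """HyDE: ask the LLM to draft a plausible answer passage, embed that
    passage, and use its vector to query the store. The hypothetical text
    usually sits closer in embedding space to real answer documents than
    the short question does."""
    hypothetical_doc = generate(f"Write a short passage answering: {question}")
    return search(embed(hypothetical_doc), k=k)
```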
Other Practical Solutions from recent Research Papers
which actually work!
RAG vs. Long Context LLMs
• Long Context LLMs often outperform RAG but are very
expensive in compute and cost
• A hybrid approach uses an LLM to reflect on whether
the RAG answer is good enough, and routes to the
Long Context LLM only when it isn't
RAG vs Long Context LLMs - Self-Router RAG
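The self-router pattern can be sketched as follows; all three callables are stand-ins for a cheap RAG pipeline, an LLM judge, and an expensive long-context model:

```python
def self_router(question, rag_answer_fn, judge, long_context_answer_fn):
    """Try the cheap RAG pipeline first; only escalate to the expensive
    long-context model if the judge deems the RAG answer insufficient."""
    answer = rag_answer_fn(question)
    if judge(question, answer):
        return answer
    return long_context_answer_fn(question)
```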
Agentic Corrective RAG
• Step 1: Retrieve context documents from the vector database for the input query
• Step 2: Use an LLM to check if the retrieved documents are relevant to the input question
• Step 3: If all documents are relevant (Correct), no specific action is needed
• Step 4: If some or all documents are not relevant (Ambiguous or Incorrect), rephrase the query and search the web for relevant context information
• Step 5: Send the rephrased query and context documents to the LLM for response generation
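The five steps above can be sketched as one function; every argument is a stand-in callable (retriever, LLM grader, query rewriter, web search, generator):

```python
def corrective_rag(question, retrieve, grade, rewrite, web_search, generate):
    """Corrective-RAG loop: grade retrieved docs, and if any are judged
    irrelevant, rephrase the query and augment with web search results."""
    docs = retrieve(question)                              # Step 1
    relevant = [d for d in docs if grade(question, d)]     # Step 2
    if len(relevant) < len(docs) or not relevant:          # Steps 3-4
        question = rewrite(question)
        relevant += web_search(question)
    return generate(question, relevant)                    # Step 5
```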
Source: Corrective Retrieval Augmented Generation; [Link]
Agentic Corrective RAG
Source: [Link]
Agentic Self-Reflection RAG
Source: Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection; [Link]
Retrieval Augmented Fine-tuning (RAFT)
Source: RAFT: Adapting Language Model to Domain Specific RAG; [Link]
Recent LLMs to Check Hallucinations
• GPT-4o from OpenAI
• Lynx from PatronusAI
Source: Lynx: An Open Source Hallucination Evaluation Model; [Link]
Key Takeaways
• Build an evaluation dataset and always evaluate your RAG system
• RAG is still very much a retrieval problem
• Explore various chunking and retrieval strategies, don't stick to default settings
• Even with Long Context LLMs, RAG isn't going anywhere (for now)
• Agentic RAG systems and domain-specific fine-tuned RAG systems are the future