Agentic RAG

Overview

Agentic RAG extends the classic RAG pattern by introducing an agent capable of reasoning, planning, and iteratively interacting with the vector store. Instead of a single retrieve → generate pass, Agentic RAG allows the model to:
  • decide what to retrieve next
  • perform multiple retrieval steps
  • refine its understanding iteratively
  • optionally interact with external tools (MCP)
In RAGLight, Agentic RAG is designed to stay:
  • explicit
  • debuggable
  • close to standard RAG semantics

How Agentic RAG differs from standard RAG

Standard RAG

Question
  ↓
Retrieve (once)
  ↓
Generate

  • single retrieval step
  • fixed context
  • no planning

Agentic RAG

Question
  ↓
Agent
  ↳ Retrieve
  ↳ Think
  ↳ Retrieve again
  ↳ Think
  ↓
Final Answer

  • iterative retrieval
  • reasoning between steps
  • adaptive use of context
The agent controls when and how retrieval happens.
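
The contrast between the two flows can be illustrated with a toy example that does not use RAGLight at all. This is a minimal sketch under stated assumptions: the corpus, the word-overlap "similarity", and the follow-up policy (re-querying with the text just retrieved) are all hypothetical stand-ins for a real vector store and a real LLM-driven agent.

```python
import re
from typing import Iterable

# Toy corpus standing in for a vector store (hypothetical data).
DOCS = {
    "doc1": "RAGLight stores embeddings in Chroma.",
    "doc2": "Chroma persists data in the persist_directory folder.",
    "doc3": "Mistral is one of the supported LLM providers.",
}


def tokens(text: str) -> set[str]:
    """Lowercased word set; word overlap stands in for embedding similarity."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def retrieve(query: str, k: int = 1, exclude: Iterable[str] = ()) -> list[str]:
    """Return the ids of the k documents sharing the most words with the query."""
    candidates = [doc_id for doc_id in DOCS if doc_id not in exclude]
    candidates.sort(key=lambda d: len(tokens(query) & tokens(DOCS[d])), reverse=True)
    return candidates[:k]


def standard_rag(question: str) -> list[str]:
    """Single pass: retrieve once, and the context is fixed."""
    return retrieve(question, k=1)


def agentic_rag(question: str, max_steps: int = 2) -> list[str]:
    """Iterative: each step re-queries based on what was just retrieved."""
    context: list[str] = []
    query = question
    for _ in range(max_steps):
        hits = retrieve(query, k=1, exclude=context)
        if not hits:
            break
        context.extend(hits)
        # "Think" step (hypothetical policy): follow up on the retrieved text.
        query = DOCS[hits[0]]
    return context


question = "Where does RAGLight keep embeddings?"
print(standard_rag(question))  # ['doc1']
print(agentic_rag(question))   # ['doc1', 'doc2']
```

The single pass only finds the document that mentions RAGLight, while the second step of the iterative version follows the "Chroma" lead and also surfaces where the data is persisted.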

Agentic RAG in RAGLight

RAGLight provides a dedicated high-level API:
  • AgenticRAGPipeline
This pipeline is built from the same components as standard RAG:
  • embeddings
  • vector store
  • LLM
It wraps them in an agent loop.

Option 1: AgenticRAGPipeline (simple API)

AgenticRAGPipeline is the recommended way to experiment with Agentic RAG. Use it when you want:
  • reasoning-aware retrieval
  • minimal boilerplate
  • fast experimentation

Basic example

from raglight.rag.simple_agentic_rag_api import AgenticRAGPipeline
from raglight.config.agentic_rag_config import AgenticRAGConfig
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.config.settings import Settings
from raglight.models.data_source_model import FolderSource

# Enable logging so the pipeline's activity is visible
Settings.setup_logging()

# Folder(s) that make up the knowledge base
knowledge_base = [
    FolderSource(path="./data"),
]

# Embedding model and vector store settings
vector_store_config = VectorStoreConfig(
    embedding_model=Settings.DEFAULT_EMBEDDINGS_MODEL,
    provider=Settings.HUGGINGFACE,
    database=Settings.CHROMA,
    persist_directory="./defaultDb",
    collection_name=Settings.DEFAULT_COLLECTION_NAME,
)

# Agent settings: LLM, documents per retrieval (k), and iteration cap (max_steps)
config = AgenticRAGConfig(
    provider=Settings.MISTRAL,
    model="mistral-large-2411",
    k=10,
    max_steps=4,
    system_prompt=Settings.DEFAULT_AGENT_PROMPT,
    knowledge_base=knowledge_base,
)

pipeline = AgenticRAGPipeline(config, vector_store_config)

# Build the pipeline before querying it
pipeline.build()

response = pipeline.generate(
    "Explain how Agentic RAG differs from standard RAG"
)
print(response)

What happens during execution

At runtime, the agent:
  1. receives the user question
  2. decides whether retrieval is needed
  3. queries the vector store
  4. reasons over retrieved context
  5. optionally repeats steps 2–4
  6. produces a final answer
The loop stops when either:
  • the agent reaches max_steps
  • the agent decides it has enough context
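
The two stop conditions can be made concrete with a small driver loop. This is an illustrative sketch, not RAGLight's internal implementation: `run_agent_loop`, the `retrieve` callback, and the `enough_context` callback are hypothetical names, and in a real agent the "enough context" decision would come from the LLM rather than a simple rule.

```python
from typing import Callable


def run_agent_loop(
    question: str,
    retrieve: Callable[[str], list[str]],
    enough_context: Callable[[str, list[str]], bool],
    max_steps: int = 4,
) -> tuple[list[str], str]:
    """Run decide -> retrieve iterations and report why the loop ended."""
    context: list[str] = []
    for _ in range(max_steps):
        # Decide whether another retrieval is needed.
        if enough_context(question, context):
            return context, "enough_context"
        # Query the store and add the results to the working context.
        context.extend(retrieve(question))
    # The step budget ran out before the agent was satisfied.
    return context, "max_steps"


# Toy run 1: the agent is satisfied once it holds two chunks.
chunks = iter([["chunk A"], ["chunk B"], ["chunk C"]])
context, reason = run_agent_loop(
    "example question",
    retrieve=lambda q: next(chunks),
    enough_context=lambda q, ctx: len(ctx) >= 2,
)
print(context, reason)  # ['chunk A', 'chunk B'] enough_context

# Toy run 2: the agent is never satisfied, so max_steps ends the loop.
capped_context, capped_reason = run_agent_loop(
    "example question",
    retrieve=lambda q: ["chunk"],
    enough_context=lambda q, ctx: False,
    max_steps=2,
)
print(capped_context, capped_reason)  # ['chunk', 'chunk'] max_steps
```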

Key configuration parameters

Agentic RAG introduces additional parameters compared to standard RAG.

max_steps

max_steps=4
Controls how many reasoning / retrieval iterations the agent can perform.

k

k=10
Defines how many documents are retrieved at each step.
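
Taken together, k and max_steps bound how much material the agent can pull in for one question. The sketch below works through that arithmetic for the values used above. It assumes at most one retrieval per step and no deduplication across steps, and the average chunk size of 200 tokens is an arbitrary illustrative figure, not a RAGLight default.

```python
def retrieval_budget(k: int, max_steps: int, avg_chunk_tokens: int = 200) -> dict[str, int]:
    """Upper bound on retrieved documents and context tokens for one question."""
    max_documents = k * max_steps  # at most k documents on each of max_steps steps
    return {
        "max_documents": max_documents,
        "max_context_tokens": max_documents * avg_chunk_tokens,
    }


budget = retrieval_budget(k=10, max_steps=4)
print(budget)  # {'max_documents': 40, 'max_context_tokens': 8000}
```

Raising either parameter increases this ceiling linearly, which is worth keeping in mind for latency and cost.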

system_prompt

The agent prompt defines:
  • reasoning structure
  • tool usage rules
  • when retrieval should be invoked
RAGLight exposes its default agent prompt as:
Settings.DEFAULT_AGENT_PROMPT
You are encouraged to inspect and customize it.
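
One low-risk way to customize the prompt is to append rules to the default rather than rewrite it. The helper below is a hypothetical sketch, not part of RAGLight, and it assumes the default prompt is a plain string; a placeholder string stands in for Settings.DEFAULT_AGENT_PROMPT so the example runs on its own.

```python
def extend_agent_prompt(base_prompt: str, extra_rules: list[str]) -> str:
    """Append a bulleted list of extra rules to an existing prompt."""
    if not extra_rules:
        return base_prompt
    rules = "\n".join(f"- {rule}" for rule in extra_rules)
    return f"{base_prompt.rstrip()}\n\nAdditional rules:\n{rules}\n"


# Placeholder standing in for Settings.DEFAULT_AGENT_PROMPT (assumed to be a str).
base_prompt = "You are a retrieval agent."

custom_prompt = extend_agent_prompt(
    base_prompt,
    [
        "Retrieve before answering factual questions.",
        "Stop retrieving once the context answers the question.",
    ],
)
print(custom_prompt)
```

Under that assumption, the result could be passed as system_prompt=custom_prompt when constructing AgenticRAGConfig.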

Option 2: Builder-based Agentic RAG

Under the hood, Agentic RAG is still built from the same primitives. Using the Builder API allows:
  • custom agent loops
  • manual control over tools
  • experimentation with reasoning strategies
(This approach is recommended only for advanced users.)

MCP integration (tools)

Agentic RAG can be extended with external tools via MCP servers. This allows the agent to:
  • execute code
  • query databases
  • fetch live data
Example configuration:
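from raglight.config.agentic_rag_config import AgenticRAGConfig
from raglight.config.settings import Settings

config = AgenticRAGConfig(
    provider=Settings.OPENAI,
    model="gpt-4o",
    k=10,
    # One entry per MCP server the agent may call
    mcp_config=[
        {"url": "http://127.0.0.1:8001/sse"}
    ],
)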
Tools are invoked only when the agent decides they are relevant.
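
Because a mistyped server URL only surfaces once the agent tries to use a tool, it can help to check the mcp_config list before building the config. The validator below is a hypothetical helper, not a RAGLight function. It assumes only the shape shown in the example above: a list of dicts, each with an http(s) "url" entry.

```python
from urllib.parse import urlparse


def validate_mcp_config(mcp_config: list[dict]) -> list[dict]:
    """Raise ValueError if any entry lacks a well-formed http(s) URL."""
    for index, server in enumerate(mcp_config):
        url = server.get("url")
        if not isinstance(url, str):
            raise ValueError(f"mcp_config[{index}] is missing a 'url' string")
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"mcp_config[{index}] has an invalid URL: {url!r}")
    return mcp_config


# Same shape as the example configuration above.
servers = validate_mcp_config([{"url": "http://127.0.0.1:8001/sse"}])
print(servers)
```

The validated list can then be passed as mcp_config=servers.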

When to use Agentic RAG

Agentic RAG is useful when:
  • a single retrieval pass is insufficient
  • questions require exploration or refinement
  • the agent must reason over a large knowledge base
  • retrieval needs to be combined with tools
For simple Q&A, standard RAG is often faster and cheaper.

Summary

  • Agentic RAG introduces an agent loop on top of RAG
  • Retrieval becomes iterative and reasoning-driven
  • RAGLight provides a simple AgenticRAGPipeline API
  • Configuration stays explicit and debuggable
  • Agentic RAG shines on complex, multi-step questions