
Query Reformulation

Overview

In a multi-turn conversation, users often ask follow-up questions that reference previous context:
“How does the Builder pattern work?” “And for Bedrock?”
The second question makes no sense to the vector store in isolation. Query reformulation solves this by rewriting the question into a self-contained query before retrieval:
“How does the Builder pattern work with AWS Bedrock in RAGLight?”
This dramatically improves retrieval accuracy in conversational RAG.
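To make the idea concrete, here is a minimal, framework-agnostic sketch of what a reformulation prompt could look like. The prompt wording and the `build_reformulation_prompt` helper are illustrative assumptions, not RAGLight's actual internals:

```python
def build_reformulation_prompt(history, question):
    """Assemble an LLM prompt that rewrites a follow-up question into a
    standalone, retrieval-ready query. Illustrative sketch only."""
    turns = "\n".join(f"{role}: {text}" for role, text in history)
    return (
        "Given the conversation below, rewrite the final question so it is "
        "fully self-contained and suitable for similarity search.\n\n"
        f"Conversation:\n{turns}\n\n"
        f"Follow-up question: {question}\n"
        "Standalone question:"
    )

history = [
    ("user", "How does the Builder pattern work?"),
    ("assistant", "The Builder pattern configures a RAG pipeline step by step."),
]
prompt = build_reformulation_prompt(history, "And for Bedrock?")
```

The generation LLM receives this prompt and returns the standalone question, which is then sent to the vector store instead of the raw follow-up.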

How it works

When reformulation=True, the pipeline adds a reformulate step before retrieval:
User Question
  ↓
Reformulate (LLM call using conversation history)
  ↓
Standalone Question
  ↓
Vector Store (similarity search)
  ↓
Retrieved Documents
  ↓
LLM Generation
  ↓
Final Answer
The same LLM configured for generation is used for reformulation. If there is no conversation history yet (first turn), the question is passed through unchanged — no extra LLM call is made. The reformulated question is logged at INFO level so you can inspect what the model produced.
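The first-turn pass-through and INFO-level logging described above can be sketched as follows. This is a simplified illustration, not RAGLight's actual code; `reformulate_with_llm` stands in for whatever callable wraps the generation LLM:

```python
import logging

logger = logging.getLogger("raglight")

def maybe_reformulate(question, history, reformulate_with_llm):
    """Skip the extra LLM call on the first turn; otherwise reformulate
    and log the result at INFO level. Illustrative sketch only."""
    if not history:  # first turn: no context to resolve, no LLM call
        return question
    standalone = reformulate_with_llm(history, question)
    logger.info("Reformulated question: %s", standalone)
    return standalone

# First turn: the question is passed through unchanged.
assert maybe_reformulate("What is RAG?", [], None) == "What is RAG?"
```

Because the no-history branch returns early, a single-turn session never pays the cost of the extra round trip.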

Configuration

Reformulation is enabled by default. You can disable it explicitly if needed.

Via RAGConfig (simple API)

from raglight.config.rag_config import RAGConfig
from raglight.config.settings import Settings

# Enabled by default — no change needed
config = RAGConfig(
    llm=Settings.DEFAULT_LLM,
    provider=Settings.OLLAMA,
)

# Disable explicitly
config = RAGConfig(
    llm=Settings.DEFAULT_LLM,
    provider=Settings.OLLAMA,
    reformulation=False,
)

Via the Builder API

from raglight.rag.builder import Builder
from raglight.config.settings import Settings

rag = (
    Builder()
    .with_embeddings(Settings.HUGGINGFACE, model_name="all-MiniLM-L6-v2")
    .with_vector_store(
        Settings.CHROMA,
        persist_directory="./myDb",
        collection_name="my_collection",
    )
    .with_llm(Settings.OLLAMA, model_name="llama3.1:8b")
    .build_rag(k=5, reformulation=True)  # True by default
)

When reformulation helps

Scenario                  | Benefit
Multi-turn conversations  | Resolves pronoun/reference ambiguity
Follow-up questions       | Makes implicit context explicit for retrieval
Short or vague queries    | Expands the query for better recall

When to disable it

Scenario                      | Reason
Single-turn Q&A               | No history to leverage; avoids an unnecessary LLM call
Latency-sensitive applications | Reformulation adds one LLM round trip per query
Strict cost control           | Each reformulation consumes tokens

Summary

  • Reformulation rewrites follow-up questions into standalone queries
  • Enabled by default in RAGConfig and Builder.build_rag()
  • Uses the same LLM as generation — no extra model needed
  • No-op on the first turn (no history)
  • Disable via reformulation=False