# Query Reformulation

> Automatically rewrite follow-up questions into standalone queries before retrieval.

## Overview

In a multi-turn conversation, users often ask follow-up questions that reference previous context:

> *"How does the Builder pattern work?"*
> *"And for Bedrock?"*

The second question makes no sense to the vector store in isolation. **Query reformulation** solves this by rewriting the question into a self-contained query before retrieval:

> *"How does the Builder pattern work with AWS Bedrock in RAGLight?"*

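This typically improves retrieval accuracy in conversational RAG, because the query sent to the vector store now names the entities ("Builder pattern", "AWS Bedrock") that the relevant documents actually contain.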

***

## How it works

When `reformulation=True`, the pipeline adds a **reformulate** step before retrieval:

```
User Question
   ↓
Reformulate (LLM call using conversation history)
   ↓
Standalone Question
   ↓
Vector Store (similarity search)
   ↓
Retrieved Documents
   ↓
LLM Generation
   ↓
Final Answer
```

The **same LLM** configured for generation is used for reformulation. On the first turn, when there is no conversation history yet, the question is passed through unchanged and no extra LLM call is made.

The reformulated question is logged at `INFO` level so you can inspect what the model produced.
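
The control flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not RAGLight's actual implementation: `reformulate`, `REFORMULATION_PROMPT`, the logger name, the `(role, text)` history format, and the `llm` callable are all hypothetical names chosen for the example.

```python
import logging
from typing import Callable, List, Tuple

# Hypothetical logger name, used only by this sketch.
logger = logging.getLogger("reformulation_sketch")

# Hypothetical prompt; the prompt RAGLight really uses may differ.
REFORMULATION_PROMPT = (
    "Rewrite the final user question as a standalone question, "
    "using the conversation for context.\n\n"
    "Conversation:\n{history}\n\n"
    "Question: {question}\n"
    "Standalone question:"
)


def reformulate(
    question: str,
    history: List[Tuple[str, str]],
    llm: Callable[[str], str],
) -> str:
    """Return a standalone version of `question` (illustrative only)."""
    if not history:
        # First turn: nothing to resolve, so skip the extra LLM call.
        return question
    transcript = "\n".join(f"{role}: {text}" for role, text in history)
    prompt = REFORMULATION_PROMPT.format(history=transcript, question=question)
    standalone = llm(prompt).strip()
    # Mirrors the INFO-level log mentioned above.
    logger.info("Reformulated question: %s", standalone)
    return standalone


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    return "How does the Builder pattern work with AWS Bedrock in RAGLight?"


history = [
    ("user", "How does the Builder pattern work?"),
    ("assistant", "The Builder chains with_* calls and ends with build_rag()."),
]
first_turn = reformulate("How does the Builder pattern work?", [], fake_llm)
follow_up = reformulate("And for Bedrock?", history, fake_llm)
```

In this sketch, `first_turn` is returned unchanged without calling the model, while `follow_up` goes through the (fake) LLM. To see the log line, configure logging at `INFO` level first, for example with `logging.basicConfig(level=logging.INFO)`. Assuming RAGLight emits its message through Python's standard `logging` module, the same call should surface the real reformulated question as well.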

***

## Configuration

Reformulation is **enabled by default**. You can disable it explicitly if needed.

### Via `RAGConfig` (simple API)

```python
from raglight.config.rag_config import RAGConfig
from raglight.config.settings import Settings

# Enabled by default — no change needed
config = RAGConfig(
    llm=Settings.DEFAULT_LLM,
    provider=Settings.OLLAMA,
)

# Disable explicitly
config = RAGConfig(
    llm=Settings.DEFAULT_LLM,
    provider=Settings.OLLAMA,
    reformulation=False,
)
```

### Via the Builder API

```python
from raglight.rag.builder import Builder
from raglight.config.settings import Settings

rag = (
    Builder()
    .with_embeddings(Settings.HUGGINGFACE, model_name="all-MiniLM-L6-v2")
    .with_vector_store(
        Settings.CHROMA,
        persist_directory="./myDb",
        collection_name="my_collection",
    )
    .with_llm(Settings.OLLAMA, model_name="llama3.1:8b")
    .build_rag(k=5, reformulation=True)  # True by default
)
```

***

## When reformulation helps

| Scenario                 | Benefit                                       |
| :----------------------- | :-------------------------------------------- |
| Multi-turn conversations | Resolves pronoun/reference ambiguity          |
| Follow-up questions      | Makes implicit context explicit for retrieval |
| Short or vague queries   | Expands the query for better recall           |

## When to disable it

| Scenario                       | Reason                                                |
| :----------------------------- | :---------------------------------------------------- |
| Single-turn Q\&A               | No history to leverage, avoids unnecessary LLM call   |
| Latency-sensitive applications | Adds one LLM roundtrip per query after the first turn |
| Strict cost control            | Each reformulation consumes tokens                    |
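
To make the overhead concrete: since the first turn is a pass-through, reformulation adds one LLM call for every turn after the first. The helpers below are hypothetical and only do the arithmetic, assuming one generation call per turn and a single conversation history per session.

```python
def extra_reformulation_calls(turns: int, reformulation: bool = True) -> int:
    """Extra LLM calls added by reformulation over `turns` conversation turns."""
    if not reformulation or turns <= 1:
        # Disabled, or only a first turn: no reformulation call is made.
        return 0
    return turns - 1


def total_llm_calls(turns: int, reformulation: bool = True) -> int:
    """One generation call per turn, plus any reformulation calls."""
    return turns + extra_reformulation_calls(turns, reformulation)


single_turn = total_llm_calls(1)                            # 1
five_turns = total_llm_calls(5)                             # 9
five_turns_disabled = total_llm_calls(5, reformulation=False)  # 5
```

For long conversations the number of LLM calls therefore approaches double, and each reformulation prompt also carries the conversation history as input tokens.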

***

## Summary

* Reformulation rewrites follow-up questions into standalone queries
* Enabled by default in `RAGConfig` and `Builder.build_rag()`
* Uses the same LLM as generation — no extra model needed
* No-op on the first turn (no history)
* Disable via `reformulation=False`
