Hybrid Search

Overview

By default, RAGLight retrieves documents using semantic search — it embeds the query and finds the closest vectors in the store. Hybrid search extends this by combining two complementary retrieval strategies:
  • Semantic search — captures meaning and context (via embeddings + ChromaDB)
  • BM25 — captures exact keyword matches (via the BM25Okapi algorithm)
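The two result lists are merged using Reciprocal Rank Fusion (RRF), a rank aggregation algorithm that uses only each document's position in each list, so the semantic and BM25 scores do not need to be normalized against each other.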
Semantic search alone can struggle when:
  • queries contain rare technical terms or acronyms
  • documents are sparse or domain-specific
  • the user expects an exact term to appear in the answer
BM25 alone misses synonyms and paraphrasing. Hybrid search combines both to improve retrieval quality across a wider range of query types.
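To make the keyword side concrete, here is a minimal pure-Python sketch of Okapi BM25 scoring. It is an illustration only: the whitespace tokenizer, the `k1` and `b` defaults, the smoothed IDF variant, and the toy corpus are all assumptions, and RAGLight's own BM25 implementation may differ on each point.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it positive.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # Term frequency saturates via k1 and is normalized by length via b.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (hypothetical): only the first document contains the rare token.
corpus = [
    "the service returns error code e4012 when the token expires",
    "authentication failures happen when credentials are no longer valid",
    "restart the service after changing the configuration",
]
docs = [text.split() for text in corpus]
scores = bm25_scores("e4012".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

Only the document that literally contains the rare token gets a non-zero score, which is the behavior that complements embedding-based retrieval for acronyms and error codes.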

How it works

At retrieval time, hybrid search follows these steps:
Query
  ↓
Semantic search  →  top 2k documents
BM25 search      →  top 2k documents
  ↓
Reciprocal Rank Fusion (RRF)
  ↓
Top k documents (deduplicated, reranked)
Why 2k? Fetching more candidates from each method before fusion ensures that good results that rank lower in one list still have a chance to surface in the final top k.
RRF score formula:
score(doc) = Σ  1 / (k_rrf + rank)
where k_rrf = 60 (a standard constant that smooths rank decay), and the sum runs over each ranked list the document appears in.
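The formula can be sketched in a few lines of Python. This is an illustrative version, not RAGLight's internal code: the function name, the 1-based ranks, the tie-break on document id, and the sample ids are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=5, k_rrf=60):
    """Fuse several ranked lists of document ids and return the top k."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Assumption: ranks start at 1 for the best result in each list.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k_rrf + rank)
    # Summing per id deduplicates; ties are broken by id for determinism.
    fused = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return fused[:k]


# Hypothetical candidate lists, best match first.
semantic = ["d3", "d1", "d7", "d2"]
bm25 = ["d1", "d9", "d3", "d4"]
top = reciprocal_rank_fusion([semantic, bm25], k=2)
print(top)  # ['d1', 'd3']
```

Here `d1` (ranks 2 and 1) edges out `d3` (ranks 1 and 3), and documents that appear in only one list score lower than both.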

Configuration

Via VectorStoreConfig (simple API)

from raglight.config.vector_store_config import VectorStoreConfig
from raglight.config.settings import Settings

vector_store_config = VectorStoreConfig(
    embedding_model=Settings.DEFAULT_EMBEDDINGS_MODEL,
    provider=Settings.HUGGINGFACE,
    database=Settings.CHROMA,
    persist_directory="./defaultDb",
    collection_name=Settings.DEFAULT_COLLECTION_NAME,
    search_type=Settings.SEARCH_HYBRID,  # "semantic" | "bm25" | "hybrid"
)
Pass this config to RAGPipeline or AgenticRAGPipeline — the rest of the pipeline is unchanged.

Via the Builder API (advanced)

from raglight.rag.builder import Builder
from raglight.config.settings import Settings

rag = (
    Builder()
    .with_embeddings(
        Settings.HUGGINGFACE,
        model_name=Settings.DEFAULT_EMBEDDINGS_MODEL,
    )
    .with_vector_store(
        Settings.CHROMA,
        persist_directory="./defaultDb",
        collection_name=Settings.DEFAULT_COLLECTION_NAME,
        search_type=Settings.SEARCH_HYBRID,
    )
    .with_llm(Settings.OLLAMA, model_name=Settings.DEFAULT_LLM)
    .build_rag(k=5)
)

Search modes

RAGLight supports three search modes, all configured via search_type:
Value        Constant                    Behavior
"semantic"   Settings.SEARCH_SEMANTIC    Vector similarity search only (default)
"bm25"       Settings.SEARCH_BM25        Keyword search only
"hybrid"     Settings.SEARCH_HYBRID      BM25 + semantic, fused with RRF
The default is "semantic" — existing code requires no changes to keep its current behavior.
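As a rough mental model of how the three modes relate, the dispatch could look like the sketch below. Everything here is hypothetical: `retrieve`, the two retriever callables, and the stub data are invented for illustration and are not part of the RAGLight API. The 2k candidate fetch and the `k_rrf = 60` constant follow the description in "How it works".

```python
from typing import Callable, List

# Hypothetical retriever signature: (query, n) -> ranked document ids.
Retriever = Callable[[str, int], List[str]]


def retrieve(query: str, k: int, search_type: str,
             semantic: Retriever, bm25: Retriever, k_rrf: int = 60) -> List[str]:
    """Return the top k document ids for the requested search mode."""
    if search_type == "semantic":
        return semantic(query, k)
    if search_type == "bm25":
        return bm25(query, k)
    if search_type == "hybrid":
        scores = {}
        # Fetch 2k candidates from each method, then fuse by rank.
        for ranking in (semantic(query, 2 * k), bm25(query, 2 * k)):
            for rank, doc_id in enumerate(ranking, start=1):
                scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k_rrf + rank)
        return sorted(scores, key=lambda d: (-scores[d], d))[:k]
    raise ValueError(f"unknown search_type: {search_type!r}")


# Stub retrievers standing in for the vector store and the BM25 index.
def semantic_stub(query: str, n: int) -> List[str]:
    return ["a", "b", "c", "d"][:n]


def bm25_stub(query: str, n: int) -> List[str]:
    return ["c", "a", "e", "f"][:n]


hybrid_top = retrieve("q", 2, "hybrid", semantic_stub, bm25_stub)
```

With these stubs, the hybrid mode returns the two documents that both retrievers agree on, while the single-method modes simply pass through their own top k.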

Full example

from raglight.rag.simple_rag_api import RAGPipeline
from raglight.config.rag_config import RAGConfig
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.config.settings import Settings
from raglight.models.data_source_model import FolderSource

Settings.setup_logging()

vector_store_config = VectorStoreConfig(
    embedding_model=Settings.DEFAULT_EMBEDDINGS_MODEL,
    provider=Settings.HUGGINGFACE,
    database=Settings.CHROMA,
    persist_directory="./defaultDb",
    collection_name=Settings.DEFAULT_COLLECTION_NAME,
    search_type=Settings.SEARCH_HYBRID,
)

config = RAGConfig(
    llm=Settings.DEFAULT_LLM,
    provider=Settings.OLLAMA,
    knowledge_base=[FolderSource(path="./data")],
    k=5,
)

pipeline = RAGPipeline(config, vector_store_config)
pipeline.build()

response = pipeline.generate("What are the key concepts in these documents?")
print(response)

BM25 index persistence

The BM25 index is built from the same documents stored in ChromaDB. It is automatically:
  • populated when documents are ingested via pipeline.build() or vector_store.ingest()
  • saved to {persist_directory}/bm25_{collection_name}.json
  • reloaded on next startup from that file
No additional setup is required. The BM25 index stays in sync with the vector store automatically.
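The save-and-reload cycle can be pictured with the sketch below. Only the file name pattern `{persist_directory}/bm25_{collection_name}.json` comes from the description above; the helper names and the JSON layout (a `"documents"` key holding tokenized texts) are assumptions for illustration, and the real file format may differ.

```python
import json
import tempfile
from pathlib import Path


def bm25_index_path(persist_directory, collection_name):
    """Build the index path using the documented naming pattern."""
    return Path(persist_directory) / f"bm25_{collection_name}.json"


def save_index(path, tokenized_docs):
    """Write the tokenized corpus to disk (hypothetical JSON layout)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"documents": tokenized_docs}), encoding="utf-8")


def load_index(path):
    """Reload the corpus, or return an empty one if nothing was saved yet."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))["documents"]


# Round trip in a temporary directory standing in for persist_directory.
with tempfile.TemporaryDirectory() as tmp:
    path = bm25_index_path(tmp, "default")
    save_index(path, [["hybrid", "search"], ["bm25", "index"]])
    restored = load_index(path)
```

Keeping the file next to the ChromaDB data means that copying or deleting the persist directory moves or resets both stores together.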

Summary

  • Hybrid search combines semantic and BM25 retrieval with Reciprocal Rank Fusion
  • Enable it by setting search_type=Settings.SEARCH_HYBRID in VectorStoreConfig
  • The default remains "semantic" — no breaking change for existing code
  • The BM25 index is persisted automatically alongside ChromaDB data
  • Use hybrid search when your knowledge base contains technical terms, acronyms, or sparse text