Vector Stores
Overview
A vector store is the component responsible for storing embeddings and performing similarity search. In RAGLight, vector stores are a core abstraction that sits between:
- the ingestion pipeline (documents → chunks → embeddings)
- the retrieval step used by RAG / Agentic RAG pipelines

In addition to storage, a RAGLight vector store:
- orchestrates document ingestion
- applies document processors and chunking
- stores embeddings and metadata
- exposes retrieval APIs used by higher-level pipelines

All backends extend the abstract base class VectorStore.
How vector stores work in RAGLight
At a high level, the lifecycle looks like this:
- Documents are ingested from folders or repositories
- Files are processed using document processors (PDF, code, text, …)
- Content is chunked and embedded
- Embeddings are stored in one or more collections
- At query time, the vector store retrieves the most relevant chunks
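The chunking step in this lifecycle can be illustrated with a minimal sketch. The function below is illustrative only, not RAGLight's actual chunking implementation, and the sizes are arbitrary:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks, a common pre-embedding step."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# Each chunk would then be embedded and stored with metadata
# (source path, position, ...) in the vector store.
chunks = chunk_text("word " * 100, chunk_size=100, overlap=20)
```

The overlap ensures that a sentence falling on a chunk boundary is still fully contained in at least one chunk, which tends to improve retrieval quality.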
Collections and data separation
RAGLight separates stored data into two logical collections:
- Main collection: regular document chunks (text, docs, code blocks)
- Class collection: extracted class or signature documents (for code)
Given a configured collection_name, this results in two collections:
- collection_name (main collection)
- collection_name_classes (class collection)
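The naming convention described above can be computed trivially; a small sketch (the helper name is ours, not RAGLight's):

```python
def derived_collections(collection_name: str) -> tuple[str, str]:
    """Return (main, class) collection names following the
    collection_name / collection_name_classes convention."""
    return collection_name, f"{collection_name}_classes"

main, classes = derived_collections("my_docs")
# main == "my_docs", classes == "my_docs_classes"
```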
Available vector stores
RAGLight supports two vector store backends:

| Backend | Constant | Extra | Notes |
|---|---|---|---|
| Chroma (ChromaVS) | Settings.CHROMA | raglight[chroma] | Requires a C++ compiler on Windows |
| Qdrant (QdrantVS) | Settings.QDRANT | raglight[qdrant] | Pure Python; Windows-friendly |
Chroma Vector Store
What is Chroma?
Chroma is a lightweight vector database that works well for:
- local-first experimentation
- persistent on-disk indexes
- client/server deployments
In RAGLight, Chroma is exposed through the ChromaVS implementation.
Local vs Remote usage
Chroma can be used in two distinct ways in RAGLight.

Local mode (embedded)
In local mode, Chroma runs embedded in your application and persists data to disk. This mode is selected when you provide a persist_directory.
Local mode is a good fit when:
- prototyping locally
- iterating quickly on embeddings or chunking
- working on a single machine
Remote mode (client/server)
In remote mode, RAGLight connects to an external Chroma server over HTTP. This mode is selected when you provide both host and port.
In this setup:
- Chroma runs as a standalone service
- RAGLight acts as a client
- persistence is managed by the server

Remote mode is a good fit when:
- multiple services need to share the same index
- you deploy RAGLight in containers
- you want a long-lived vector database
In short, the mode is chosen by what you configure: either persist_directory (local, embedded) or host + port (remote, client/server).
Qdrant Vector Store
What is Qdrant?
Qdrant is a pure-Python vector database that works well for:
- Windows environments (no C++ compiler required)
- containerised or cloud deployments
- client/server (remote) setups
In RAGLight, Qdrant is exposed through the QdrantVS implementation.
Local mode (on-disk)
In local mode, Qdrant stores its data on disk next to your application, analogous to Chroma's embedded mode.

Remote mode (client/server)
In remote mode, RAGLight connects to an external Qdrant server, analogous to Chroma's client/server mode.
Ingestion pipeline (shared logic)
The ingestion pipeline is implemented in the abstract VectorStore base class and reused by all backends.
During ingestion:
- directories are walked recursively
- ignored folders are filtered out
- document processors are selected per file
- documents are chunked and embedded
- chunks are stored in the main collection
- class/signature documents are stored in the class collection
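The first three steps (walking directories, filtering ignored folders, and selecting a processor per file) can be sketched in a few lines. The ignore list and processor mapping below are invented placeholders, not RAGLight's actual configuration:

```python
from pathlib import Path

# Assumed examples; RAGLight's actual ignore list and processors differ.
IGNORED_DIRS = {".git", "node_modules", "__pycache__"}
PROCESSORS = {".pdf": "pdf_processor", ".py": "code_processor", ".txt": "text_processor"}


def collect_files(root: str) -> list[tuple[Path, str]]:
    """Walk a directory tree, skip ignored folders, pick a processor per file."""
    selected = []
    for path in Path(root).rglob("*"):
        # Skip anything inside an ignored folder.
        if any(part in IGNORED_DIRS for part in path.parts):
            continue
        if path.is_file() and path.suffix in PROCESSORS:
            selected.append((path, PROCESSORS[path.suffix]))
    return selected
```

Each selected file would then be processed, chunked, embedded, and written to the appropriate collection.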
Retrieval APIs
Vector stores expose two main retrieval methods:
- Standard retrieval: similarity search over the main collection of document chunks
- Class-based retrieval: search over the class collection of extracted class/signature documents
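At its core, standard retrieval is a nearest-neighbour search over embeddings. A minimal sketch of the idea, using plain cosine similarity over an in-memory list (real backends use optimized indexes, and these function names are ours):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec: list[float], index: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """index: list of (doc_id, embedding). Return top-k doc_ids by similarity."""
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```

Class-based retrieval works the same way, just against the separate class collection.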
How pipelines use vector stores
At the pipeline level, vector stores are treated as a black box with a simple contract: given a query, return relevant documents.
Search modes
Both Chroma and Qdrant support three retrieval strategies, configured via search_type in VectorStoreConfig:
| Value | Behavior |
|---|---|
| "semantic" | Vector similarity search (default) |
| "bm25" | Keyword-based BM25 search |
| "hybrid" | BM25 + semantic, fused with Reciprocal Rank Fusion |
When to rebuild your index
You must rebuild the vector store if you change:
- the embedding model
- the embedding provider
- document processors or chunking logic
- the set of ingested documents
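One practical way to notice that a rebuild is due is to fingerprint the settings that invalidate the index and compare the stored fingerprint on startup. This is a hypothetical helper, not part of RAGLight:

```python
import hashlib
import json


def index_fingerprint(embedding_model: str, provider: str,
                      chunk_size: int, chunk_overlap: int) -> str:
    """Hash the settings whose change invalidates an existing index."""
    payload = json.dumps({
        "model": embedding_model,
        "provider": provider,
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_overlap,
    }, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# Store the fingerprint next to the index; if it differs at startup, rebuild.
```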
Summary
- Vector stores handle ingestion, storage, and retrieval.
- RAGLight separates chunks and class documents into dedicated collections.
- Chroma and Qdrant are both supported; swap via database=Settings.CHROMA or database=Settings.QDRANT.
- Both backends work locally (on-disk) or remotely (client/server).
- Switching between backends does not affect pipeline or retrieval logic.
- All three search modes (semantic, BM25, hybrid) are available on both backends.