Vector Stores

Overview

A vector store is the component responsible for storing embeddings and performing similarity search. In RAGLight, vector stores are a core abstraction that sits between:
  • the ingestion pipeline (documents → chunks → embeddings)
  • and the retrieval step used by RAG / Agentic RAG pipelines
A vector store in RAGLight is not just a passive database. It actively:
  • orchestrates document ingestion
  • applies document processors and chunking
  • stores embeddings and metadata
  • exposes retrieval APIs used by higher-level pipelines
All vector store implementations inherit from a shared base class: VectorStore.

How vector stores work in RAGLight

At a high level, the lifecycle looks like this:
  1. Documents are ingested from folders or repositories
  2. Files are processed using document processors (PDF, code, text, …)
  3. Content is chunked and embedded
  4. Embeddings are stored in one or more collections
  5. At query time, the vector store retrieves the most relevant chunks
This logic is intentionally explicit and shared across implementations.
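The shared lifecycle can be sketched as an abstract base class. This is an illustrative sketch, not RAGLight's actual source: only VectorStore, similarity_search, and similarity_search_class appear in these docs; ingest and the helper methods are hypothetical names standing in for the shared logic.

```python
from abc import ABC, abstractmethod


class VectorStore(ABC):
    """Sketch of the shared base class. Method names other than
    similarity_search / similarity_search_class are illustrative."""

    def ingest(self, folder: str) -> int:
        # Shared logic: walk files, process, chunk, embed, store.
        count = 0
        for chunk in self._chunk_folder(folder):
            self._store(chunk)
            count += 1
        return count

    @abstractmethod
    def similarity_search(self, question: str, k: int = 5) -> list: ...

    @abstractmethod
    def similarity_search_class(self, question: str, k: int = 5) -> list: ...

    def _chunk_folder(self, folder: str):
        # Placeholder for the processor/chunking pipeline described below.
        yield from ()

    def _store(self, chunk) -> None:
        raise NotImplementedError  # backend-specific storage
```

Concrete backends only need to supply storage and search; the ingestion flow stays in the base class.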

Collections and data separation

RAGLight separates stored data into two logical collections:
  • Main collection: regular document chunks (text, docs, code blocks)
  • Class collection: extracted class or signature documents (for code)
For a given collection_name, this results in:
  • collection_name
  • collection_name_classes
This design enables different retrieval strategies depending on the use case.
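The naming convention above is simple enough to sketch; the helper function below is hypothetical, shown only to make the convention concrete.

```python
def collection_names(collection_name: str) -> tuple[str, str]:
    """Return the (main, class) collection names derived from a base name.

    Hypothetical helper illustrating RAGLight's naming convention:
    the class collection is the base name with a "_classes" suffix.
    """
    return collection_name, f"{collection_name}_classes"
```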

Available vector stores

RAGLight is designed to support multiple vector store backends. At the moment, the only available implementation is:
  • Chroma (ChromaVS)
The API is designed so additional backends (e.g. Qdrant, Weaviate) can be added without changing ingestion or RAG logic.

Chroma Vector Store

What is Chroma?

Chroma is a lightweight vector database that works well for:
  • local-first experimentation
  • persistent on-disk indexes
  • client/server deployments
RAGLight integrates Chroma via the ChromaVS implementation.

Local vs Remote usage

Chroma can be used in two distinct ways in RAGLight.

Local mode (embedded)

In local mode, Chroma runs embedded in your application and persists data to disk. This mode is selected when you provide a persist_directory.
```python
vector_store_config = VectorStoreConfig(
    database=Settings.CHROMA,
    persist_directory="./defaultDb",
    collection_name="default",
)
```
Use this mode when:
  • prototyping locally
  • iterating quickly on embeddings or chunking
  • working on a single machine

Remote mode (client/server)

In remote mode, RAGLight connects to an external Chroma server over HTTP. This mode is selected when you provide both host and port.
```python
vector_store_config = VectorStoreConfig(
    database=Settings.CHROMA,
    host="localhost",
    port=8000,
    collection_name="default",
)
```
In this setup:
  • Chroma runs as a standalone service
  • RAGLight acts as a client
  • Persistence is managed by the server
This is useful when:
  • multiple services need to share the same index
  • you deploy RAGLight in containers
  • you want a long-lived vector database
RAGLight enforces this configuration strictly:
  • either persist_directory
  • or host + port
Mixing both is not allowed.
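The mutual-exclusion rule can be sketched as follows. The dataclass and mode() method below are illustrative stand-ins, not VectorStoreConfig's actual implementation:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VectorStoreConfigSketch:
    """Illustrative stand-in for VectorStoreConfig's local/remote rule."""
    collection_name: str
    persist_directory: Optional[str] = None
    host: Optional[str] = None
    port: Optional[int] = None

    def mode(self) -> str:
        local = self.persist_directory is not None
        remote = self.host is not None and self.port is not None
        if local and (self.host is not None or self.port is not None):
            # Mixing both configurations is rejected outright.
            raise ValueError("Use either persist_directory or host+port, not both")
        if local:
            return "local"
        if remote:
            return "remote"
        raise ValueError("Provide persist_directory (local) or host+port (remote)")
```

Failing fast here prevents a config from silently falling back to embedded mode when a remote server was intended.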

Ingestion pipeline (shared logic)

The ingestion pipeline is implemented in the abstract VectorStore base class and reused by all backends. During ingestion:
  • directories are walked recursively
  • ignored folders are filtered out
  • document processors are selected per file
  • documents are chunked and embedded
  • chunks are stored in the main collection
  • class/signature documents are stored in the class collection
Ingestion is parallelized to speed up indexing.

Retrieval APIs

Vector stores expose two main retrieval methods:

Standard retrieval

```python
docs = vector_store.similarity_search(question, k=5)
```
Used for most RAG queries.

Class-based retrieval

```python
docs = vector_store.similarity_search_class(question, k=5)
```
Used when working with codebases or structured symbols. Both methods support filtering and dynamic collection selection when needed.
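Conceptually, both methods rank stored embeddings by closeness to the query embedding and return the top k. A minimal top-k sketch with toy vectors (the real ranking is delegated to the backend, e.g. Chroma):

```python
import math


def top_k(query: list[float], index: dict[str, list[float]], k: int = 5) -> list[str]:
    """Rank stored vectors by cosine similarity to the query.

    Toy sketch of what a similarity search does; not RAGLight's code.
    """
    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    ranked = sorted(index, key=lambda doc_id: cosine(query, index[doc_id]), reverse=True)
    return ranked[:k]
```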

How pipelines use vector stores

At the pipeline level, vector stores are treated as a black box with a simple contract:
  • given a query → return relevant documents
For example, the RAG pipeline performs:
```python
retrieved_docs = vector_store.similarity_search(question, k=self.k)
```
The retrieved documents are then injected into the prompt and passed to the LLM.
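The injection step can be sketched as follows; the prompt template is illustrative, not RAGLight's exact wording:

```python
def build_prompt(question: str, retrieved_docs: list[str]) -> str:
    """Assemble retrieved chunks and the user question into an LLM prompt.

    The template below is a hypothetical example of context injection.
    """
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(retrieved_docs))
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the context above."
    )
```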

When to rebuild your index

You must rebuild the vector store if you change:
  • the embedding model
  • the embedding provider
  • document processors or chunking logic
  • the set of ingested documents
This ensures stored embeddings remain consistent with retrieval.

Summary

  • Vector stores handle ingestion, storage, and retrieval.
  • RAGLight separates chunks and class documents into dedicated collections.
  • Chroma is the current backend, usable locally or as a remote service.
  • Switching between local and remote modes does not affect pipeline logic.
  • The design is extensible to future vector store implementations.