Vector Stores
Overview
A vector store is the component responsible for storing embeddings and performing similarity search. In RAGLight, vector stores are a core abstraction that sits between:
- the ingestion pipeline (documents → chunks → embeddings)
- the retrieval step used by RAG / Agentic RAG pipelines

In addition to storage, a RAGLight vector store:
- orchestrates document ingestion
- applies document processors and chunking
- stores embeddings and metadata
- exposes retrieval APIs used by higher-level pipelines

All backends extend the abstract base class VectorStore.
How vector stores work in RAGLight
At a high level, the lifecycle looks like this:
- Documents are ingested from folders or repositories
- Files are processed using document processors (PDF, code, text, …)
- Content is chunked and embedded
- Embeddings are stored in one or more collections
- At query time, the vector store retrieves the most relevant chunks
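The chunking step in this lifecycle can be illustrated with a minimal sketch. The function below is illustrative only, not RAGLight's actual chunking implementation, and the sizes are arbitrary:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks, a common pre-embedding step."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# Each chunk would then be embedded and stored with metadata
# (source path, position, ...) in the vector store.
chunks = chunk_text("word " * 100, chunk_size=100, overlap=20)
```

The overlap ensures that a sentence falling on a chunk boundary is still fully contained in at least one chunk, which tends to improve retrieval quality.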
Collections and data separation
RAGLight separates stored data into two logical collections:
- Main collection: regular document chunks (text, docs, code blocks)
- Class collection: extracted class or signature documents (for code)
Given a configured collection_name, this results in two collections:
- collection_name (main collection)
- collection_name_classes (class collection)
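The naming convention described above can be computed trivially; a small sketch (the helper name is ours, not RAGLight's):

```python
def derived_collections(collection_name: str) -> tuple[str, str]:
    """Return (main, class) collection names following the
    collection_name / collection_name_classes convention."""
    return collection_name, f"{collection_name}_classes"

main, classes = derived_collections("my_docs")
# main == "my_docs", classes == "my_docs_classes"
```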
Available vector stores
RAGLight supports two vector store backends:

| Backend | Constant | Extra | Notes |
|---|---|---|---|
| Chroma (ChromaVS) | Settings.CHROMA | raglight[chroma] | Requires a C++ compiler on Windows |
| Qdrant (QdrantVS) | Settings.QDRANT | raglight[qdrant] | Pure Python; Windows-friendly |
Chroma Vector Store
What is Chroma?
Chroma is a lightweight vector database that works well for:
- local-first experimentation
- persistent on-disk indexes
- client/server deployments
In RAGLight, Chroma is exposed through the ChromaVS implementation.
Local vs Remote usage
Chroma can be used in two distinct ways in RAGLight.

Local mode (embedded)
In local mode, Chroma runs embedded in your application and persists data to disk. This mode is selected when you provide a persist_directory.
Local mode is a good fit when:
- prototyping locally
- iterating quickly on embeddings or chunking
- working on a single machine
Remote mode (client/server)
In remote mode, RAGLight connects to an external Chroma server over HTTP. This mode is selected when you provide both host and port.
In this setup:
- Chroma runs as a standalone service
- RAGLight acts as a client
- persistence is managed by the server

Remote mode is a good fit when:
- multiple services need to share the same index
- you deploy RAGLight in containers
- you want a long-lived vector database
In short, the mode is chosen by what you configure: either persist_directory (local, embedded) or host + port (remote, client/server).
Qdrant Vector Store
What is Qdrant?
Qdrant is a pure-Python vector database that works well for:
- Windows environments (no C++ compiler required)
- containerised or cloud deployments
- client/server (remote) setups
In RAGLight, Qdrant is exposed through the QdrantVS implementation.
Local mode (on-disk)
In local mode, Qdrant stores its data on disk next to your application, analogous to Chroma's embedded mode.

Remote mode (client/server)
In remote mode, RAGLight connects to an external Qdrant server, analogous to Chroma's client/server mode.
Ingestion pipeline (shared logic)
The ingestion pipeline is implemented in the abstract VectorStore base class and reused by all backends.
During ingestion:
- directories are walked recursively
- ignored folders are filtered out
- document processors are selected per file
- documents are chunked and embedded
- chunks are stored in the main collection
- class/signature documents are stored in the class collection
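The first three steps (walking directories, filtering ignored folders, and selecting a processor per file) can be sketched in a few lines. The ignore list and processor mapping below are invented placeholders, not RAGLight's actual configuration:

```python
from pathlib import Path

# Assumed examples; RAGLight's actual ignore list and processors differ.
IGNORED_DIRS = {".git", "node_modules", "__pycache__"}
PROCESSORS = {".pdf": "pdf_processor", ".py": "code_processor", ".txt": "text_processor"}


def collect_files(root: str) -> list[tuple[Path, str]]:
    """Walk a directory tree, skip ignored folders, pick a processor per file."""
    selected = []
    for path in Path(root).rglob("*"):
        # Skip anything inside an ignored folder.
        if any(part in IGNORED_DIRS for part in path.parts):
            continue
        if path.is_file() and path.suffix in PROCESSORS:
            selected.append((path, PROCESSORS[path.suffix]))
    return selected
```

Each selected file would then be processed, chunked, embedded, and written to the appropriate collection.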
Retrieval APIs
Vector stores expose two main retrieval methods:
- Standard retrieval: similarity search over the main collection of document chunks
- Class-based retrieval: search over the class collection of extracted class/signature documents
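At its core, standard retrieval is a nearest-neighbour search over embeddings. A minimal sketch of the idea, using plain cosine similarity over an in-memory list (real backends use optimized indexes, and these function names are ours):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec: list[float], index: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """index: list of (doc_id, embedding). Return top-k doc_ids by similarity."""
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```

Class-based retrieval works the same way, just against the separate class collection.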
How pipelines use vector stores
At the pipeline level, vector stores are treated as a black box with a simple contract: given a query, return relevant documents.
Search modes
Both Chroma and Qdrant support three retrieval strategies, configured via search_type in VectorStoreConfig:
| Value | Behavior |
|---|---|
| "semantic" | Vector similarity search (default) |
| "bm25" | Keyword-based BM25 search |
| "hybrid" | BM25 + semantic, fused with Reciprocal Rank Fusion |
When to rebuild your index
You must rebuild the vector store if you change:
- the embedding model
- the embedding provider
- document processors or chunking logic
- the set of ingested documents
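One practical way to notice that a rebuild is due is to fingerprint the settings that invalidate the index and compare the stored fingerprint on startup. This is a hypothetical helper, not part of RAGLight:

```python
import hashlib
import json


def index_fingerprint(embedding_model: str, provider: str,
                      chunk_size: int, chunk_overlap: int) -> str:
    """Hash the settings whose change invalidates an existing index."""
    payload = json.dumps({
        "model": embedding_model,
        "provider": provider,
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_overlap,
    }, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# Store the fingerprint next to the index; if it differs at startup, rebuild.
```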
Summary
- Vector stores handle ingestion, storage, and retrieval.
- RAGLight separates chunks and class documents into dedicated collections.
- Chroma and Qdrant are both supported; swap via database=Settings.CHROMA or database=Settings.QDRANT.
- Both backends work locally (on-disk) or remotely (client/server).
- Switching between backends does not affect pipeline or retrieval logic.
- All three search modes (semantic, BM25, hybrid) are available on both backends.