Embeddings
Overview
Embeddings convert text (and sometimes other modalities) into vectors so your vector store can perform similarity search. In RAGLight, embeddings are a first-class configuration choice:

- They are independent from the LLM used for generation
- They can be local or hosted
- They can be swapped without changing your pipeline logic
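To make this concrete, here is what an embedding is, independent of any framework. This sketch uses the sentence-transformers library directly with `all-MiniLM-L6-v2` (the model recommended later on this page); the example texts are illustrative.

```python
# Illustration only: text -> vector -> similarity score.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [
    "RAGLight configures embeddings in the vector store config.",
    "Bananas are a good source of potassium.",
]
query = "Where are embeddings configured?"

doc_vecs = model.encode(docs)   # shape: (num_docs, 384)
q_vec = model.encode(query)     # shape: (384,)

# Cosine similarity between the query and each document; the first
# document should score noticeably higher for this query.
scores = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
print(scores)
```

A vector store does the same comparison at scale: it indexes the document vectors once and finds the nearest ones for each query vector.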
Why it matters
Your embedding model strongly impacts:

- Retrieval quality (what gets retrieved)
- Latency and indexing speed
- Memory footprint and storage size
- Multilingual performance
How embeddings are configured in RAGLight
Embeddings are configured in the vector store configuration, not in the RAG config. In other words:

- `RAGConfig` controls generation (LLM provider, model, prompts)
- `VectorStoreConfig` controls indexing + retrieval (embeddings provider, model, storage)
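The sketch below shows that split. `embedding_model` and `provider` are the parameter names used on this page; the import paths and the remaining parameter names are assumptions, so check them against your installed RAGLight version.

```python
# Sketch: generation and indexing/retrieval are configured separately.
from raglight.config.settings import Settings                      # assumed import path
from raglight.config.rag_config import RAGConfig                   # assumed import path
from raglight.config.vector_store_config import VectorStoreConfig  # assumed import path

# VectorStoreConfig: embeddings provider, model, and storage.
vector_store_config = VectorStoreConfig(
    provider=Settings.HUGGINGFACE,        # embeddings provider
    embedding_model="all-MiniLM-L6-v2",   # embeddings model
    database=Settings.CHROMA,             # storage backend (assumed constant)
    persist_directory="./db",
    collection_name="docs",
)

# RAGConfig: LLM provider and model used for generation.
rag_config = RAGConfig(
    provider=Settings.OLLAMA,
    llm="llama3",   # generation model, not the embedding model
    k=5,            # number of retrieved chunks (assumed parameter)
)
```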
Available embedding providers
RAGLight supports multiple embedding providers:

- HuggingFace
- Ollama
- vLLM
- OpenAI
- Google Gemini
HuggingFace (local, default)
HuggingFace embeddings are a great default for local-first RAG.
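A local-first setup might look like the following sketch (import paths and `Settings.*` constants as assumed above):

```python
# Sketch: local HuggingFace embeddings, no API key required.
from raglight.config.settings import Settings
from raglight.config.vector_store_config import VectorStoreConfig

vector_store_config = VectorStoreConfig(
    provider=Settings.HUGGINGFACE,        # runs the model locally
    embedding_model="all-MiniLM-L6-v2",   # compact sentence-transformers model
    database=Settings.CHROMA,
    persist_directory="./db",
    collection_name="docs",
)
```

Google Gemini (hosted)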
Gemini can be used for embeddings via `Settings.GOOGLE_GEMINI`.
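For example (only `Settings.GOOGLE_GEMINI` comes from this page; the credential name and the model name are assumptions):

```python
# Sketch: hosted Gemini embeddings.
import os
from raglight.config.settings import Settings
from raglight.config.vector_store_config import VectorStoreConfig

os.environ.setdefault("GOOGLE_API_KEY", "<your-key>")  # assumed credential name

vector_store_config = VectorStoreConfig(
    provider=Settings.GOOGLE_GEMINI,
    embedding_model="models/text-embedding-004",  # an example Gemini embedding model
    database=Settings.CHROMA,
    persist_directory="./db",
    collection_name="docs",
)
```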
OpenAI (hosted)
Use OpenAI embeddings by selecting the OpenAI provider and a compatible embedding model.
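A sketch, assuming a `Settings.OPENAI` constant by analogy with `Settings.GOOGLE_GEMINI`:

```python
# Sketch: hosted OpenAI embeddings.
import os
from raglight.config.settings import Settings
from raglight.config.vector_store_config import VectorStoreConfig

os.environ.setdefault("OPENAI_API_KEY", "<your-key>")  # standard OpenAI credential

vector_store_config = VectorStoreConfig(
    provider=Settings.OPENAI,                   # assumed constant
    embedding_model="text-embedding-3-small",   # a compatible OpenAI embedding model
    database=Settings.CHROMA,
    persist_directory="./db",
    collection_name="docs",
)
```

Ollama (local)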
If your Ollama setup provides an embeddings model, you can use it as your embedding provider.
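For example, with an Ollama model that supports embeddings (the model name is illustrative; pull it first with `ollama pull nomic-embed-text`):

```python
# Sketch: local Ollama embeddings.
from raglight.config.settings import Settings
from raglight.config.vector_store_config import VectorStoreConfig

vector_store_config = VectorStoreConfig(
    provider=Settings.OLLAMA,             # assumed constant
    embedding_model="nomic-embed-text",   # an Ollama embeddings-capable model
    database=Settings.CHROMA,
    persist_directory="./db",
    collection_name="docs",
)
```

vLLM (server)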
If you expose an embeddings endpoint through vLLM (typically OpenAI-compatible), you can point RAGLight to your server.
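A sketch; `Settings.VLLM` and the `api_base` parameter are hypothetical names for illustration, so check how your RAGLight version expects the server URL to be passed:

```python
# Sketch: embeddings served by a vLLM OpenAI-compatible endpoint.
from raglight.config.settings import Settings
from raglight.config.vector_store_config import VectorStoreConfig

vector_store_config = VectorStoreConfig(
    provider=Settings.VLLM,                     # hypothetical constant
    embedding_model="BAAI/bge-small-en-v1.5",   # whatever model your server loads
    api_base="http://localhost:8000/v1",        # hypothetical parameter name
    database=Settings.CHROMA,
    persist_directory="./db",
    collection_name="docs",
)
```

Recommended defaults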
If you are starting out, these defaults usually work well:

- Embeddings: `all-MiniLM-L6-v2` (fast and compact)
- Vector store: Chroma (local persistence)
- LLM: Ollama (local)
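Wired together, those defaults look like this sketch (the `RAGPipeline` import path and its `build()`/`generate()` methods are assumptions; verify them against your installed version):

```python
# Sketch: local-first defaults end to end.
from raglight.config.settings import Settings
from raglight.config.rag_config import RAGConfig
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.rag.simple_rag_api import RAGPipeline  # assumed import path

vector_store_config = VectorStoreConfig(
    provider=Settings.HUGGINGFACE,        # local embeddings
    embedding_model="all-MiniLM-L6-v2",   # fast and compact
    database=Settings.CHROMA,             # local persistence
    persist_directory="./db",
    collection_name="docs",
)
rag_config = RAGConfig(provider=Settings.OLLAMA, llm="llama3", k=5)

pipeline = RAGPipeline(rag_config, vector_store_config)
pipeline.build()   # ingest and index your sources
print(pipeline.generate("What do my documents say about embeddings?"))
```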
Common pitfalls
Embeddings and LLM are different models
It’s common to assume that the LLM model also embeds documents. In RAGLight, embeddings are configured separately.

Changing embeddings requires re-indexing
If you change `embedding_model` or `provider`, you must rebuild your vector store:
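A sketch of a rebuild, assuming a Chroma store persisted on disk: delete the persist directory and run the pipeline's build step again so every stored vector comes from the new model. Mixing vectors from two different embedding models in one collection breaks similarity search, since the two vector spaces are not comparable.

```python
# Sketch: rebuild the index after switching embedding_model/provider.
import shutil
from raglight.config.settings import Settings
from raglight.config.rag_config import RAGConfig
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.rag.simple_rag_api import RAGPipeline  # assumed import path

shutil.rmtree("./db", ignore_errors=True)  # drop vectors made by the old model

vector_store_config = VectorStoreConfig(
    provider=Settings.OLLAMA,              # the new provider...
    embedding_model="nomic-embed-text",    # ...and the new model
    database=Settings.CHROMA,
    persist_directory="./db",
    collection_name="docs",
)
rag_config = RAGConfig(provider=Settings.OLLAMA, llm="llama3", k=5)

pipeline = RAGPipeline(rag_config, vector_store_config)
pipeline.build()  # re-embeds and re-indexes every document
```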
Summary
- Embeddings drive retrieval quality.
- They are configured in `VectorStoreConfig`.
- Providers are swappable and independent from the LLM.
- Changing embeddings requires rebuilding the index.