# Embeddings

> Configure embedding models and providers to index and retrieve your documents.

## Overview

**Embeddings** convert text (and sometimes other modalities) into vectors so your vector store
can perform similarity search.
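
As a rough illustration of what "similarity search" means here, the sketch below ranks two chunks against a query using cosine similarity. The three-dimensional vectors are hand-made stand-ins for real embeddings, and cosine similarity is only one common metric; the metric your vector store actually uses depends on its own configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hand-made 3-dimensional "embeddings", purely illustrative.
query = [0.9, 0.1, 0.0]
chunks = {
    "chunk about cats": [0.8, 0.2, 0.1],
    "chunk about taxes": [0.0, 0.1, 0.9],
}

# Rank chunks from most to least similar to the query.
ranked = sorted(
    chunks,
    key=lambda name: cosine_similarity(query, chunks[name]),
    reverse=True,
)
print(ranked[0])  # chunk about cats
```

Real embedding models produce vectors with hundreds or thousands of dimensions, but the ranking step works the same way.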

In RAGLight, embeddings are a first-class configuration choice:

* They are **independent** from the LLM used for generation
* They can be **local or hosted**
* They can be swapped without changing your pipeline logic

This separation is intentional: you may want fast local embeddings for indexing while using a
strong hosted model for generation (or the opposite).

***

## Why it matters

Your embedding model strongly impacts:

* Retrieval quality (what gets retrieved)
* Latency and indexing speed
* Memory footprint and storage size
* Multilingual performance

A great LLM cannot compensate for weak retrieval: if the wrong chunks are retrieved, the answer
will be wrong, even with a strong generation model.
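
To make the storage point concrete, here is a back-of-the-envelope estimate of raw vector size. It assumes vectors are stored as 32-bit floats and ignores index structures, metadata, and the stored text, so treat the result as a lower bound. The helper name `raw_vector_bytes` is made up for this sketch; the dimensions used are 384 for `all-MiniLM-L6-v2` and 1536 for `text-embedding-3-small` (its default output size).

```python
def raw_vector_bytes(num_chunks: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw size of the stored vectors, assuming float32 values by default."""
    return num_chunks * dimensions * bytes_per_value


num_chunks = 100_000

small = raw_vector_bytes(num_chunks, 384)    # all-MiniLM-L6-v2
large = raw_vector_bytes(num_chunks, 1536)   # text-embedding-3-small

print(f"{small / 1_000_000:.1f} MB")  # 153.6 MB
print(f"{large / 1_000_000:.1f} MB")  # 614.4 MB
```

A model with four times the dimensions needs four times the raw vector storage for the same number of chunks.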

***

## How embeddings are configured in RAGLight

Embeddings are configured in the **vector store configuration**, not in the RAG config.

In other words:

* `RAGConfig` controls **generation** (LLM provider, model, prompts)
* `VectorStoreConfig` controls **indexing + retrieval** (embeddings provider, model, storage)

Minimal example:

```python
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.config.settings import Settings

vector_store_config = VectorStoreConfig(
    embedding_model=Settings.DEFAULT_EMBEDDINGS_MODEL,
    provider=Settings.HUGGINGFACE,
    database=Settings.CHROMA,
    persist_directory="./defaultDb",
    collection_name=Settings.DEFAULT_COLLECTION_NAME,
)
```

***

## Available embedding providers

RAGLight supports multiple embedding providers:

* HuggingFace
* Ollama
* vLLM
* OpenAI
* Google Gemini
* AWS Bedrock

Below are concrete examples of how to configure each one.

***

## HuggingFace (local, default)

HuggingFace embeddings are a great default for local-first RAG.

```python
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.config.settings import Settings

vector_store_config = VectorStoreConfig(
    embedding_model="all-MiniLM-L6-v2",
    provider=Settings.HUGGINGFACE,
    database=Settings.CHROMA,
    persist_directory="./defaultDb",
    collection_name=Settings.DEFAULT_COLLECTION_NAME,
)
```

***

## Google Gemini (hosted)

Gemini can be used for embeddings via `Settings.GOOGLE_GEMINI`.

```python
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.config.settings import Settings

vector_store_config = VectorStoreConfig(
    embedding_model=Settings.GEMINI_EMBEDDING_MODEL,
    provider=Settings.GOOGLE_GEMINI,
    database=Settings.CHROMA,
    persist_directory="./defaultDb",
    collection_name=Settings.DEFAULT_COLLECTION_NAME,
)
```

Make sure the API key is available:

```bash
export GEMINI_API_KEY=your_key
```

***

## OpenAI (hosted)

Use OpenAI embeddings by selecting the OpenAI provider and a compatible embedding model.

```python
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.config.settings import Settings

vector_store_config = VectorStoreConfig(
    embedding_model="text-embedding-3-small",
    provider=Settings.OPENAI,
    database=Settings.CHROMA,
    persist_directory="./defaultDb",
    collection_name=Settings.DEFAULT_COLLECTION_NAME,
    api_base=Settings.DEFAULT_OPENAI_CLIENT,
)
```

Make sure the API key is available:

```bash
export OPENAI_API_KEY=your_key
```
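
For hosted providers, a missing API key often only surfaces once indexing starts. The sketch below is a small preflight check you could run before building the configuration. The helper `missing_env_vars` and the keys `"gemini"` and `"openai"` are hypothetical names for this example, not part of RAGLight; the environment variable names are the ones used on this page.

```python
import os

# Hypothetical mapping from a provider label to the variables this page exports.
REQUIRED_ENV = {
    "gemini": ["GEMINI_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
}


def missing_env_vars(provider_key, environ=None):
    """Return the required variables that are unset or empty for a provider."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV.get(provider_key, []) if not env.get(name)]


# Passing an explicit dict makes the check easy to try without touching your shell.
print(missing_env_vars("openai", {}))                          # ['OPENAI_API_KEY']
print(missing_env_vars("openai", {"OPENAI_API_KEY": "sk-x"}))  # []
```

In a real script you would call `missing_env_vars("openai")` and stop with a clear message if the list is not empty.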

***

## Ollama (local)

If you have an embedding model available in Ollama (for example, pulled with `ollama pull nomic-embed-text`), you can use it as your embedding provider.

```python
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.config.settings import Settings

vector_store_config = VectorStoreConfig(
    embedding_model="nomic-embed-text",
    provider=Settings.OLLAMA,
    database=Settings.CHROMA,
    persist_directory="./defaultDb",
    collection_name=Settings.DEFAULT_COLLECTION_NAME,
    api_base=Settings.DEFAULT_OLLAMA_CLIENT,
)
```

***

## vLLM (server)

If you expose an embeddings endpoint through vLLM (typically OpenAI-compatible), you can point
RAGLight to your server.

```python
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.config.settings import Settings

vector_store_config = VectorStoreConfig(
    embedding_model="your-embedding-model",
    provider=Settings.VLLM,
    database=Settings.CHROMA,
    persist_directory="./defaultDb",
    collection_name=Settings.DEFAULT_COLLECTION_NAME,
    api_base="http://localhost:8000",
)
```
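
Before pointing RAGLight at the server, it can help to confirm that the embeddings endpoint responds. The sketch below builds a request for the OpenAI-compatible `/v1/embeddings` route using only the standard library. The helper name `build_embeddings_request` is made up for this example, and whether RAGLight itself expects `api_base` with or without the `/v1` suffix is not covered here, so this only checks the server, not the RAGLight configuration.

```python
import json
import urllib.request


def build_embeddings_request(base_url, model, texts):
    """Build a POST request for an OpenAI-compatible /v1/embeddings endpoint."""
    url = base_url.rstrip("/") + "/v1/embeddings"
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_embeddings_request(
    "http://localhost:8000", "your-embedding-model", ["hello world"]
)
print(request.full_url)  # http://localhost:8000/v1/embeddings

# With the server running, uncomment to send the request and inspect the result:
# with urllib.request.urlopen(request, timeout=10) as response:
#     payload = json.load(response)
#     print(len(payload["data"][0]["embedding"]))  # embedding dimension
```

If the call succeeds, the printed length is the vector dimension your index will use.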

***

## AWS Bedrock (hosted)

Use Amazon Titan or Cohere embeddings via AWS Bedrock. Authentication relies on the standard boto3 credential chain (env vars, `~/.aws/credentials`, or IAM role).

```python
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.config.settings import Settings

vector_store_config = VectorStoreConfig(
    embedding_model=Settings.AWS_BEDROCK_EMBEDDING_MODEL,  # amazon.titan-embed-text-v2:0
    provider=Settings.AWS_BEDROCK,
    database=Settings.CHROMA,
    persist_directory="./defaultDb",
    collection_name=Settings.DEFAULT_COLLECTION_NAME,
)
```

For example, to provide credentials through environment variables:

```bash
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_DEFAULT_REGION=us-east-1
```

Supported models include `amazon.titan-embed-text-v2:0` and `cohere.embed-english-v3`.

***

## Recommended defaults

If you are starting out, these defaults usually work well:

* **Embeddings**: `all-MiniLM-L6-v2` (fast and compact)
* **Vector store**: Chroma (local persistence)
* **LLM**: Ollama (local)

You can then switch providers as your prototype evolves.

***

## Common pitfalls

### Embeddings and LLM are different models

It’s common to assume that the LLM also embeds documents. In RAGLight, embeddings are
configured separately, in `VectorStoreConfig`.

### Changing embeddings requires re-indexing

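If you change `embedding_model` or `provider`, you must rebuild your vector store by running the build step again on your pipeline object:

```python
pipeline.build()
```

Otherwise, queries are embedded with the new model while the stored vectors come from the old
one, so the two no longer share an embedding space and similarity scores stop being meaningful.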
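
One way to avoid forgetting this step is to record which embedding settings an index was built with and compare them on startup. The sketch below is a minimal, standard-library example of that idea. The marker file and the helpers `embedding_fingerprint`, `needs_reindex`, and `record_fingerprint` are hypothetical and not part of RAGLight.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def embedding_fingerprint(provider: str, embedding_model: str) -> str:
    """Stable hash of the settings that define the embedding space."""
    payload = json.dumps(
        {"provider": provider, "embedding_model": embedding_model}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_reindex(marker: Path, provider: str, embedding_model: str) -> bool:
    """True when no fingerprint is recorded or it differs from the current one."""
    if not marker.exists():
        return True
    return marker.read_text().strip() != embedding_fingerprint(provider, embedding_model)


def record_fingerprint(marker: Path, provider: str, embedding_model: str) -> None:
    """Call this after a successful rebuild."""
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text(embedding_fingerprint(provider, embedding_model))


# Demo in a throwaway directory. The provider labels are plain strings chosen
# for illustration, not RAGLight Settings values.
with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "embedding_fingerprint.txt"
    first_run = needs_reindex(marker, "huggingface", "all-MiniLM-L6-v2")
    record_fingerprint(marker, "huggingface", "all-MiniLM-L6-v2")
    unchanged = needs_reindex(marker, "huggingface", "all-MiniLM-L6-v2")
    switched = needs_reindex(marker, "openai", "text-embedding-3-small")

print(first_run, unchanged, switched)  # True False True
```

In practice you would keep the marker file next to your `persist_directory` and call `pipeline.build()` followed by `record_fingerprint(...)` whenever `needs_reindex(...)` returns `True`.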

***

## Summary

* Embeddings drive retrieval quality.
* They are configured in `VectorStoreConfig`.
* Providers are swappable and independent from the LLM.
* Changing embeddings requires rebuilding the index.
