# Observability with Langfuse

> Trace every RAG call end-to-end — retrieve, rerank, and generate — directly in your Langfuse dashboard.

## Overview

RAGLight integrates with **Langfuse** to give you full visibility into your RAG pipeline. Every call to `generate()` or `generate_streaming()` produces a structured trace showing what happened at each step.

<CardGroup cols={3}>
  <Card title="Retrieve" icon="magnifying-glass">
    See which documents were retrieved, from which collection, with which query.
  </Card>

  <Card title="Rerank" icon="arrow-up-wide-short">
    Inspect the reranking step when a CrossEncoder is active.
  </Card>

  <Card title="Generate" icon="bolt">
    Trace the LLM call — prompt, model, latency, and token counts.
  </Card>
</CardGroup>

***

## Installation

```bash theme={null}
pip install "raglight[langfuse]"
```

This installs `langfuse==4.0.0` alongside RAGLight.

***

## Configuration

Tracing is configured via `LangfuseConfig`, a dataclass that holds your Langfuse credentials.

```python theme={null}
from raglight.config.langfuse_config import LangfuseConfig

langfuse_config = LangfuseConfig(
    public_key="pk-lf-...",
    secret_key="sk-lf-...",
    host="http://localhost:3000",  # or your Langfuse Cloud URL
)
```

Pass this config to your pipeline — the rest is automatic.

***

## Usage with RAGPipeline

```python theme={null}
from raglight.rag.simple_rag_api import RAGPipeline
from raglight.config.rag_config import RAGConfig
from raglight.config.vector_store_config import VectorStoreConfig
from raglight.config.langfuse_config import LangfuseConfig
from raglight.config.settings import Settings

langfuse_config = LangfuseConfig(
    public_key="pk-lf-...",
    secret_key="sk-lf-...",
    host="http://localhost:3000",
)

config = RAGConfig(
    llm=Settings.DEFAULT_LLM,
    provider=Settings.OLLAMA,
    langfuse_config=langfuse_config,
)

vector_store_config = VectorStoreConfig(
    embedding_model=Settings.DEFAULT_EMBEDDINGS_MODEL,
    provider=Settings.HUGGINGFACE,
    database=Settings.CHROMA,
    persist_directory="./myDb",
    collection_name="my_collection",
)

pipeline = RAGPipeline(config, vector_store_config)
pipeline.build()

response = pipeline.generate("What is RAGLight?")
print(response)
```

***

## Usage with the Builder API

```python theme={null}
from raglight.rag.builder import Builder
from raglight.config.langfuse_config import LangfuseConfig
from raglight.config.settings import Settings

langfuse_config = LangfuseConfig(
    public_key="pk-lf-...",
    secret_key="sk-lf-...",
    host="http://localhost:3000",
)

rag = (
    Builder()
    .with_embeddings(Settings.HUGGINGFACE, model_name=Settings.DEFAULT_EMBEDDINGS_MODEL)
    .with_vector_store(Settings.CHROMA, persist_directory="./myDb", collection_name="my_collection")
    .with_llm(Settings.OLLAMA, model_name=Settings.DEFAULT_LLM)
    .build_rag(k=5, langfuse_config=langfuse_config)
)

rag.vector_store.ingest(data_path="./docs")
response = rag.generate("Explain the retrieval pipeline")
print(response)
```

***

## Streaming support

Langfuse tracing works identically for streaming. The trace is emitted when the stream completes.

```python theme={null}
for chunk in pipeline.generate_streaming("What is RAGLight?"):
    print(chunk, end="", flush=True)
# → full trace appears in Langfuse once the stream ends
```

All LLM providers support streaming traces: Ollama, OpenAI, Mistral, Gemini, LMStudio, and AWS Bedrock.
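
If you need the complete answer as well as the live output, the loop above can be wrapped in a small helper. The sketch below is illustrative only: `collect_stream` and `fake_stream` are hypothetical names, not part of RAGLight, and it assumes each chunk is a plain string. Because the trace is emitted when the stream completes, the helper always iterates to the end of the stream.

```python
from typing import Iterable, Iterator


def collect_stream(chunks: Iterable[str], echo: bool = True) -> str:
    """Print each chunk as it arrives and return the full response text."""
    parts = []
    for chunk in chunks:
        if echo:
            print(chunk, end="", flush=True)
        parts.append(chunk)
    if echo:
        print()  # finish the line once the stream is exhausted
    return "".join(parts)


def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for pipeline.generate_streaming(...)."""
    yield from ["RAGLight ", "is a ", "RAG library."]


full_text = collect_stream(fake_stream(), echo=False)
# With a real pipeline (assumption: chunks are strings):
# full_text = collect_stream(pipeline.generate_streaming("What is RAGLight?"))
```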

***

## Session ID

By default, a **UUID is generated once per `RAG` instance** and reused for every `generate()` call. This groups all turns of the same conversation under a single Langfuse session.

You can pin a custom session ID:

```python theme={null}
LangfuseConfig(
    public_key="pk-lf-...",
    secret_key="sk-lf-...",
    host="http://localhost:3000",
    session_id="my-session-42",
)
```
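
One way to choose custom session IDs is to derive one per conversation and reuse it for every turn. The following is a minimal sketch using only the standard library; `new_session_id`, `session_for`, and the `sessions` dictionary are hypothetical helpers for illustration, not RAGLight APIs.

```python
import uuid

# Maps a conversation key (for example a user or thread identifier) to its session ID.
sessions: dict[str, str] = {}


def new_session_id(prefix: str = "chat") -> str:
    """Return a unique, readable session ID made of a prefix and a random UUID."""
    return f"{prefix}-{uuid.uuid4().hex}"


def session_for(conversation_key: str) -> str:
    """Reuse the same session ID for every turn of one conversation."""
    if conversation_key not in sessions:
        sessions[conversation_key] = new_session_id(conversation_key)
    return sessions[conversation_key]


alice_id = session_for("alice")
# Pass the result as session_id when constructing LangfuseConfig,
# alongside public_key, secret_key, and host.
```

Since the session ID is part of the config, this approach presumably implies building one config (and pipeline) per conversation.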

***

## Use with `raglight serve`

When using the REST API, pass Langfuse credentials as environment variables:

```bash .env theme={null}
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=http://localhost:3000
```

Then start the server:

```bash theme={null}
raglight serve
```

<Info>
  When `LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, and `LANGFUSE_HOST` (or `LANGFUSE_BASE_URL`) are all set, tracing is enabled automatically. If any of them is missing, RAGLight disables Langfuse entirely: **no connection attempt is made to localhost:3000**.
</Info>
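
The rule in the note above can be mirrored in your own code, for example to check a deployment's environment before starting the server. This is a sketch of the described behavior, not RAGLight's actual implementation: `langfuse_settings_from_env` is a hypothetical helper, and giving `LANGFUSE_HOST` precedence over `LANGFUSE_BASE_URL` is an assumption.

```python
import os
from typing import Mapping, Optional


def langfuse_settings_from_env(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return LangfuseConfig-style keyword arguments, or None if tracing stays off."""
    public_key = env.get("LANGFUSE_PUBLIC_KEY")
    secret_key = env.get("LANGFUSE_SECRET_KEY")
    # Assumption: LANGFUSE_HOST wins when both host variables are set.
    host = env.get("LANGFUSE_HOST") or env.get("LANGFUSE_BASE_URL")
    if not (public_key and secret_key and host):
        return None  # any missing value disables tracing
    return {"public_key": public_key, "secret_key": secret_key, "host": host}


example = langfuse_settings_from_env(
    {
        "LANGFUSE_PUBLIC_KEY": "pk-lf-...",
        "LANGFUSE_SECRET_KEY": "sk-lf-...",
        "LANGFUSE_BASE_URL": "http://localhost:3000",
    }
)
```

A `None` result means at least one variable is unset or empty, which corresponds to the case where tracing is disabled.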

***

## Run Langfuse locally

The fastest way to get Langfuse running locally is Docker Compose:

```bash theme={null}
git clone https://github.com/langfuse/langfuse.git
cd langfuse
docker compose up
```

Langfuse will be available at `http://localhost:3000`.
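
Before pointing RAGLight at the local instance, it can help to confirm the server is reachable. The sketch below uses only the standard library and assumes Langfuse's public health endpoint at `/api/public/health`; `langfuse_is_up` is a hypothetical helper name, and the path may differ between Langfuse versions.

```python
import http.client
import urllib.request


def langfuse_is_up(base_url: str = "http://localhost:3000", timeout: float = 2.0) -> bool:
    """Return True if the assumed health endpoint answers with HTTP 200."""
    url = base_url.rstrip("/") + "/api/public/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    print("Langfuse reachable:", langfuse_is_up())
```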

***

## Summary

* Install with `pip install "raglight[langfuse]"`
* Pass `LangfuseConfig` to `RAGConfig` or `build_rag()`
* Both `generate()` and `generate_streaming()` are traced automatically
* All LLM providers are supported
* A session ID, generated once per `RAG` instance unless you set `session_id`, groups all turns of a conversation together
* For `raglight serve`, set `LANGFUSE_*` env vars — no code changes needed
