
Getting Started

This guide helps you go from zero setup to a working local RAG pipeline in just a few minutes. RAGLight is designed to be simple, explicit, and fast to experiment with. You can start either from the CLI or directly from Python.

Prerequisites

Before getting started, make sure you have:
  • Python 3.9+
  • A local or remote LLM provider
    • Recommended for local use: Ollama
    • Alternatives: LMStudio, vLLM, OpenAI, Mistral
Install Ollama and pull a model:
ollama pull llama3
Make sure Ollama is running:
ollama serve
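To confirm the model was downloaded and the server responds, list the installed models:
ollama list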

Installation

Install RAGLight from PyPI:
pip install raglight
That’s it. No additional setup required.

Option 1 — Instant RAG with the CLI

The fastest way to get started is using the interactive CLI wizard.
raglight chat
The wizard will guide you through:
  • Selecting a local folder containing your documents
  • Choosing an embedding model
  • Choosing a vector store
  • Selecting an LLM provider and model
  • Configuring ignore folders (e.g. .venv, node_modules)
Once the wizard completes, RAGLight will:
  1. Ingest your documents
  2. Build the vector store
  3. Start an interactive chat session
No Python code required.

Option 2 — Your First RAG Pipeline in Python

If you prefer explicit code, here is a minimal Python example, broken into four short steps.

1. Define your knowledge sources

from raglight.models.data_source_model import FolderSource

knowledge_base = [
    FolderSource(path="./data")  # Folder containing your documents
]
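knowledge_base is a plain list, so you can register several sources at once. A minimal sketch reusing only the FolderSource class shown above (the folder paths are placeholders):

knowledge_base = [
    FolderSource(path="./docs"),   # product documentation
    FolderSource(path="./notes"),  # meeting notes, wikis, etc.
]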

2. Create the RAG pipeline

from raglight.rag.simple_rag_api import RAGPipeline
from raglight.config.settings import Settings

pipeline = RAGPipeline(
    knowledge_base=knowledge_base,
    model_name="llama3",
    provider=Settings.OLLAMA,
    k=5
)
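The same constructor drives every setup: swap model_name for any model you have pulled in Ollama, and raise k when answers need more retrieved context. A sketch using only the parameters shown above:

pipeline = RAGPipeline(
    knowledge_base=knowledge_base,
    model_name="mistral",   # any model pulled with `ollama pull`
    provider=Settings.OLLAMA,
    k=10                    # retrieve more chunks per query
)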

3. Build the pipeline

pipeline.build()
This step:
  • Parses your documents
  • Generates embeddings
  • Stores them in the vector database

4. Query your documents

response = pipeline.generate(
    "What is this project about?"
)

print(response)
You now have a fully working local RAG system.
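Because generate() is the only call you need at query time, it is easy to wrap in a small interactive loop. A minimal sketch that reuses the pipeline built above:

# Keep asking questions against the same vector store
while True:
    question = input("Question (or 'exit'): ")
    if question.strip().lower() == "exit":
        break
    print(pipeline.generate(question))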

What happens under the hood?

RAGLight keeps everything explicit:
  1. Documents are ingested from your data sources
  2. Embeddings are generated using the selected model
  3. Vectors are stored in a vector database
  4. Relevant chunks are retrieved at query time
  5. The LLM generates an answer using retrieved context
Nothing is hidden or automatic unless you configure it.
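These five steps are the whole pattern. The toy sketch below is not RAGLight's internal code: it uses a word-count "embedding", a list as the vector store, and a stub in place of the LLM, purely to make the retrieve-then-generate flow concrete:

import math
from collections import Counter

# 1. Ingest: documents would normally come from your data sources
documents = [
    "RAGLight builds local RAG pipelines.",
    "Ollama serves local LLMs such as llama3.",
]

# 2. Embed: a toy bag-of-words vector stands in for a real embedding model
def embed(text):
    return Counter(text.lower().split())

# 3. Store: the "vector database" is just a list of (vector, text) pairs
store = [(embed(doc), doc) for doc in documents]

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# 4. Retrieve: pick the k chunks most similar to the query
def retrieve(query, k=1):
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

# 5. Generate: a real pipeline would send this prompt to the LLM
question = "What does RAGLight do?"
context = "\n".join(retrieve(question))
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
print(prompt)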

Going further

Once you have a basic RAG running, you can explore:
  • Agentic RAG for multi-step retrieval and reasoning
  • RAT (Retrieval-Augmented Thinking) with reflection loops
  • Custom pipelines with the Builder API
  • Multimodal document ingestion (PDFs with images)
  • MCP integration for tool-augmented agents
Check the next sections of the documentation to dive deeper.

Next steps

  • Learn how each component works in the Core Concepts section
  • Explore ready-to-run examples in the examples/ folder
  • Customize your pipeline step by step
RAGLight is built for experimentation — start simple, then iterate.