
Getting Started

This guide helps you go from zero setup to a working local RAG pipeline in just a few minutes. RAGLight is designed to be simple, explicit, and fast to experiment with. You can start either from the CLI or directly from Python.

Prerequisites

Before getting started, make sure you have:
  • Python 3.9+
  • A local or remote LLM provider
    • Recommended for local use: Ollama
    • Alternatives: LMStudio, vLLM, OpenAI, Mistral
Install Ollama and pull a model:
ollama pull llama3
Make sure Ollama is running:
ollama serve
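To confirm the model was downloaded and the server responds, list the installed models:
ollama list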

Installation

Install RAGLight from PyPI:
pip install raglight
That’s it. No additional setup required.

Option 1 — Instant RAG with the CLI

The fastest way to get started is using the interactive CLI wizard.
raglight chat
The wizard will guide you through:
  • Selecting a local folder containing your documents
  • Choosing an embedding model
  • Choosing a vector store
  • Selecting an LLM provider and model
  • Configuring ignore folders (e.g. .venv, node_modules)
Once the wizard completes, RAGLight will:
  1. Ingest your documents
  2. Build the vector store
  3. Start an interactive chat session
No Python code required.

Option 2 — Your First RAG Pipeline in Python

If you prefer explicit code, here is a minimal Python example, broken into four short steps.

1. Define your knowledge sources

from raglight.models.data_source_model import FolderSource

knowledge_base = [
    FolderSource(path="./data")  # Folder containing your documents
]
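knowledge_base is a plain list, so you can register several sources at once. A minimal sketch reusing only the FolderSource class shown above (the folder paths are placeholders):

knowledge_base = [
    FolderSource(path="./docs"),   # product documentation
    FolderSource(path="./notes"),  # meeting notes, wikis, etc.
]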

2. Create the RAG pipeline

from raglight.rag.simple_rag_api import RAGPipeline
from raglight.config.settings import Settings

pipeline = RAGPipeline(
    knowledge_base=knowledge_base,
    model_name="llama3",
    provider=Settings.OLLAMA,
    k=5
)
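The same constructor drives every setup: swap model_name for any model you have pulled in Ollama, and raise k when answers need more retrieved context. A sketch using only the parameters shown above:

pipeline = RAGPipeline(
    knowledge_base=knowledge_base,
    model_name="mistral",   # any model pulled with `ollama pull`
    provider=Settings.OLLAMA,
    k=10                    # retrieve more chunks per query
)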

3. Build the pipeline

pipeline.build()
This step:
  • Parses your documents
  • Generates embeddings
  • Stores them in the vector database

4. Query your documents

response = pipeline.generate(
    "What is this project about?"
)

print(response)
You now have a fully working local RAG system.
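Because generate() is the only call you need at query time, it is easy to wrap in a small interactive loop. A minimal sketch that reuses the pipeline built above:

# Keep asking questions against the same vector store
while True:
    question = input("Question (or 'exit'): ")
    if question.strip().lower() == "exit":
        break
    print(pipeline.generate(question))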

What happens under the hood?

RAGLight keeps everything explicit:
  1. Documents are ingested from your data sources
  2. Embeddings are generated using the selected model
  3. Vectors are stored in a vector database
  4. Relevant chunks are retrieved at query time
  5. The LLM generates an answer using retrieved context
Nothing is hidden or automatic unless you configure it.
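These five steps are the whole pattern. The toy sketch below is not RAGLight's internal code: it uses a word-count "embedding", a list as the vector store, and a stub in place of the LLM, purely to make the retrieve-then-generate flow concrete:

import math
from collections import Counter

# 1. Ingest: documents would normally come from your data sources
documents = [
    "RAGLight builds local RAG pipelines.",
    "Ollama serves local LLMs such as llama3.",
]

# 2. Embed: a toy bag-of-words vector stands in for a real embedding model
def embed(text):
    return Counter(text.lower().split())

# 3. Store: the "vector database" is just a list of (vector, text) pairs
store = [(embed(doc), doc) for doc in documents]

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# 4. Retrieve: pick the k chunks most similar to the query
def retrieve(query, k=1):
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

# 5. Generate: a real pipeline would send this prompt to the LLM
question = "What does RAGLight do?"
context = "\n".join(retrieve(question))
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
print(prompt)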

Going further

Once you have a basic RAG running, you can explore:
  • Agentic RAG for multi-step retrieval and reasoning
  • RAT (Retrieval-Augmented Thinking) with reflection loops
  • Custom pipelines with the Builder API
  • Multimodal document ingestion (PDFs with images)
  • MCP integration for tool-augmented agents
Check the next sections of the documentation to dive deeper.

Next steps

  • Learn how each component works in the Core Concepts section
  • Explore ready-to-run examples in the examples/ folder
  • Customize your pipeline step by step
RAGLight is built for experimentation — start simple, then iterate.