
CLI

RAGLight ships with a full command-line interface. Three commands cover the main use cases:
| Command | Description |
| --- | --- |
| `raglight chat` | Interactive RAG chat session in your terminal |
| `raglight agentic-chat` | Same as `chat`, but with the Agentic RAG pipeline |
| `raglight serve` | Deploy as a REST API (optionally with a Streamlit UI) |

raglight chat

Starts an interactive terminal chat session backed by a RAG pipeline. On first launch, a setup wizard guides you through choosing your vector store, embeddings, and LLM. Subsequent runs can skip the wizard entirely via environment variables.
raglight chat
The wizard walks you through:
  1. Vector database — Chroma or Qdrant, local path or remote host
  2. Embeddings — provider and model
  3. LLM — provider, model, and API base URL
  4. Knowledge source — local folder or GitHub repository
  5. Indexing — option to reuse an existing index
  6. Chat loop — type your questions, get streamed responses
Responses are rendered as markdown in the terminal with streaming output.

Skip the wizard with env vars

Create a .env file (or export the variables in your shell) and the wizard is bypassed entirely:
.env
RAGLIGHT_LLM_PROVIDER=Ollama
RAGLIGHT_LLM_MODEL=llama3.1:8b
RAGLIGHT_EMBEDDINGS_PROVIDER=HuggingFace
RAGLIGHT_EMBEDDINGS_MODEL=all-MiniLM-L6-v2
RAGLIGHT_DB=Chroma
RAGLIGHT_PERSIST_DIR=./myDb
RAGLIGHT_COLLECTION=default
RAGLIGHT_DATA_PATH=./docs
Then just run:
raglight chat
RAGLight prints the active configuration and goes straight to the chat loop.
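A variable can also be set inline for a single run instead of going in .env. This is standard shell behavior rather than a RAGLight feature, and the value here is only an example:

```shell
# Retrieve 10 documents per query for this session only; everything
# else still comes from .env or the built-in defaults.
RAGLIGHT_K=10 raglight chat
```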

Commands in the chat loop

| Input | Action |
| --- | --- |
| Any text | Send a question to the RAG pipeline |
| `bye` / `exit` / `quit` | End the session |

raglight agentic-chat

Same as raglight chat, but uses the Agentic RAG pipeline — the LLM can call tools, reason in multiple steps, and go beyond simple retrieval.
raglight agentic-chat
The setup wizard is identical. The difference is in the pipeline: the agent decides when to retrieve, can combine multiple retrievals, and produces richer answers for complex questions.
Agentic mode requires an LLM that supports tool calling (e.g. llama3.1, gpt-4o, mistral-large).
See the Agentic RAG page for a full explanation of the pipeline.

raglight serve

Starts a FastAPI REST API exposing your RAG pipeline over HTTP. Entirely configured by environment variables — no Python code required.
raglight serve
Add --ui to also launch the Streamlit chat interface:
raglight serve --ui
See the REST API page for the full reference — endpoints, configuration variables, Docker Compose setup, and more.

CLI options

| Option | Default | Description |
| --- | --- | --- |
| `--host` | `0.0.0.0` | Host to bind |
| `--port` | `8000` | Port to listen on |
| `--reload` | `false` | Enable auto-reload (development) |
| `--workers` | `1` | Number of Uvicorn worker processes |
| `--ui` | `false` | Launch the Streamlit chat UI alongside the API |
| `--ui-port` | `8501` | Port for the Streamlit UI |
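As an illustration, a development setup that binds to localhost, auto-reloads on code changes, and serves the Streamlit UI on a non-default port combines the flags above (the port numbers are arbitrary):

```shell
# All flags are from the options table above; adjust ports as needed.
raglight serve --host 127.0.0.1 --port 8080 --reload --ui --ui-port 8600
```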

Common environment variables

All three commands read the same RAGLIGHT_* environment variables:
| Variable | Default | Description |
| --- | --- | --- |
| `RAGLIGHT_LLM_PROVIDER` | `Ollama` | LLM provider |
| `RAGLIGHT_LLM_MODEL` | `llama3` | LLM model name |
| `RAGLIGHT_LLM_API_BASE` | `http://localhost:11434` | LLM API base URL |
| `RAGLIGHT_EMBEDDINGS_PROVIDER` | `HuggingFace` | Embeddings provider |
| `RAGLIGHT_EMBEDDINGS_MODEL` | `all-MiniLM-L6-v2` | Embeddings model |
| `RAGLIGHT_DB` | `Chroma` | Vector store backend (`Chroma` or `Qdrant`) |
| `RAGLIGHT_PERSIST_DIR` | `./raglight_db` | Local persistence directory |
| `RAGLIGHT_COLLECTION` | `default` | Collection name |
| `RAGLIGHT_K` | `5` | Number of documents retrieved per query |
| `RAGLIGHT_DATA_PATH` | (none) | Path to documents (skips the wizard prompt) |
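Because the variables are shared, exporting them once in your shell (or shell profile) configures chat, agentic-chat, and serve at the same time. The values below are illustrative:

```shell
# Shared configuration picked up by all three raglight commands.
export RAGLIGHT_DB=Chroma
export RAGLIGHT_COLLECTION=my-project
export RAGLIGHT_K=8
# Sanity check: show everything RAGLight will read from the environment.
env | grep '^RAGLIGHT_'
```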

Summary

  • raglight chat — terminal RAG chat with streaming markdown output
  • raglight agentic-chat — same but with tool-calling agent mode
  • raglight serve — REST API; add --ui for the web chat interface
  • All three share the same RAGLIGHT_* env vars — one .env file for everything