🚧 FastCMS is under active development — not ready for production use. APIs and features may change without notice.
AI RAG Plugin

Retrieval-Augmented Generation — upload documents, ask questions in natural language.

The AI RAG (Retrieval-Augmented Generation) plugin lets you upload documents, automatically chunk and embed them, then ask questions in natural language. The LLM answers are grounded in your actual data, which keeps responses factual and reduces hallucinations.

Requires: AI Core plugin + AI Vectors plugin

Installation

cp -r ai_rag/ plugins/ai_rag/
cp -r ai_vectors/ plugins/ai_vectors/
cp -r ai_core/ plugins/ai_core/

How It Works

Ingestion

  1. You send document text to /ingest, or upload a file to /upload
  2. For file uploads, text is automatically extracted (supports .txt, .md, .html, .csv, .json, .pdf)
  3. The text is split into overlapping chunks (sentence-aware)
  4. Each chunk is embedded via the AI provider
  5. Chunks and embeddings are stored in the vector database
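The ingestion steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not the plugin's internals: `chunk_text`, `embed`, and `ingest` are hypothetical helpers, the chunker here is a plain character window rather than sentence-aware, and `embed` is a stand-in for the real AI provider call.

```python
from typing import Dict, List


def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Split text into overlapping character windows (simplified: not sentence-aware)."""
    chunks = []
    step = chunk_size - chunk_overlap
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break
    return chunks


def embed(text: str) -> List[float]:
    """Stand-in embedding; the real code calls the configured AI provider."""
    return [float(len(text)), float(sum(map(ord, text)) % 1000)]


def ingest(text: str, source: str) -> List[Dict]:
    """Chunk, embed, and return records shaped for a vector store."""
    return [
        {"text": c, "embedding": embed(c), "metadata": {"source": source, "chunk_index": i}}
        for i, c in enumerate(chunk_text(text))
    ]


records = ingest("FastCMS is a Backend-as-a-Service built with FastAPI. " * 30, "readme.md")
```

Each record pairs a chunk of text with its embedding and enough metadata (source, chunk index) to attribute answers back to the original document.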

Querying

  1. You ask a question via the /ask endpoint
  2. The question is embedded
  3. The most relevant chunks are found via cosine similarity
  4. The chunks are sent to the LLM as context
  5. The LLM generates an answer grounded in your documents
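The retrieval step (steps 2–3 above) reduces to ranking stored chunks by cosine similarity against the embedded question. The sketch below uses toy 3-dimensional vectors purely for illustration; real embeddings come from the AI provider and have hundreds or thousands of dimensions.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity: dot product over the product of magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Toy stored chunks with pre-computed embeddings.
store = [
    {"text": "FastCMS provides JWT authentication...", "embedding": [0.9, 0.1, 0.0]},
    {"text": "File storage supports S3 backends...",   "embedding": [0.1, 0.8, 0.3]},
    {"text": "Plugins extend FastCMS at runtime...",   "embedding": [0.2, 0.2, 0.9]},
]

question_embedding = [0.85, 0.15, 0.05]  # pretend this came from embedding the question
ranked = sorted(store, key=lambda r: cosine(question_embedding, r["embedding"]), reverse=True)
top = ranked[0]  # the most relevant chunks are sent to the LLM as context
```

The `score` field in the `/ask` response below is exactly this similarity value for each returned chunk.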

API Endpoints

All endpoints are mounted at /api/v1/plugins/ai/rag/.

POST /ai/rag/ingest

Ingest a document — chunk it, embed it, store it.

curl -X POST http://localhost:8000/api/v1/plugins/ai/rag/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "collection": "knowledge-base",
    "text": "FastCMS is a Backend-as-a-Service built with FastAPI. It provides authentication, collections, file storage, real-time features, and more. The plugin system allows extending functionality...",
    "source": "readme.md",
    "chunk_size": 500,
    "chunk_overlap": 50
  }'

Response:

{
  "document_id": "doc-uuid",
  "chunks": 12,
  "collection": "knowledge-base",
  "source": "readme.md"
}

Parameters:

Parameter      Default   Description
collection     required  Collection name for organizing documents
text           required  The document text to ingest
source         ""        Source identifier (filename, URL, etc.)
chunk_size     500       Target characters per chunk (100–5000)
chunk_overlap  50        Characters of overlap between chunks (0–500)

POST /ai/rag/upload

Upload a file and ingest it into the RAG pipeline. The file content is automatically extracted based on format.

curl -X POST http://localhost:8000/api/v1/plugins/ai/rag/upload \
  -F "file=@readme.md" \
  -F "collection=knowledge-base" \
  -F "chunk_size=500" \
  -F "chunk_overlap=50"

Response (same as /ingest):

{
  "document_id": "doc-uuid",
  "chunks": 8,
  "collection": "knowledge-base",
  "source": "readme.md"
}

Parameters (multipart form):

Parameter      Default           Description
file           required          File to upload (max 10MB)
collection     "knowledge-base"  Collection name
chunk_size     500               Target characters per chunk (100–5000)
chunk_overlap  50                Characters of overlap (0–500)

Supported file formats:

Extension          Extraction Method
.txt, .text, .log  Plain text (as-is)
.md, .markdown     Markdown → plain text (strips headers, bold, links, code blocks)
.html, .htm        HTML → plain text (strips tags, ignores script/style)
.csv               Rows converted to "header: value" readable text
.json              Pretty-printed JSON
.pdf               Text extraction via PyPDF2 (requires pip install PyPDF2)
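Extension-based extraction like the table describes can be done with the stdlib alone. The sketch below is illustrative (the plugin's actual extractors may differ): it handles HTML, CSV, and JSON, and lets plain-text-like formats fall through unchanged; the Markdown-stripping and PDF paths are omitted.

```python
import csv
import io
import json
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping script/style contents."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def extract_text(filename: str, raw: bytes) -> str:
    """Dispatch on file extension and return plain text."""
    ext = filename.rsplit(".", 1)[-1].lower()
    text = raw.decode("utf-8", errors="replace")
    if ext in ("html", "htm"):
        parser = _TextExtractor()
        parser.feed(text)
        return " ".join(s.strip() for s in parser.parts if s.strip())
    if ext == "csv":
        rows = list(csv.reader(io.StringIO(text)))
        header, body = rows[0], rows[1:]
        return "\n".join(", ".join(f"{h}: {v}" for h, v in zip(header, row)) for row in body)
    if ext == "json":
        return json.dumps(json.loads(text), indent=2)
    return text  # .txt/.md and friends pass through as plain text in this sketch
```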

POST /ai/rag/ask

Ask a question against ingested documents.

curl -X POST http://localhost:8000/api/v1/plugins/ai/rag/ask \
  -H "Content-Type: application/json" \
  -d '{
    "collection": "knowledge-base",
    "question": "How does authentication work in FastCMS?",
    "limit": 5
  }'

Response:

{
  "answer": "FastCMS uses JWT-based authentication with bcrypt password hashing. It supports passwordless OTP login, email change flows, session management, and account lockout after failed attempts.",
  "sources": [
    {
      "text": "FastCMS provides JWT authentication with bcrypt...",
      "score": 0.9231,
      "record_id": "doc-uuid:3",
      "metadata": {"source": "readme.md", "chunk_index": 3}
    }
  ],
  "model": "gpt-4o-mini"
}

DELETE /ai/rag/collection/{name}

Delete all ingested documents for a collection.
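For example, to drop the collection used throughout this page (host and collection name are placeholders; adjust to your deployment):

```shell
curl -X DELETE http://localhost:8000/api/v1/plugins/ai/rag/collection/knowledge-base
```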

Chunking Strategy

The chunker splits text intelligently:

  • Sentence-aware: Breaks at sentence boundaries (. ! ?), not mid-word
  • Overlapping: Configurable overlap ensures context isn't lost at chunk boundaries
  • Long sentences: Sentences exceeding chunk_size are split at word boundaries
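A minimal chunker in the spirit of the bullets above can be sketched as follows. This is illustrative only, not the plugin's implementation: it packs whole sentences into chunks of roughly `chunk_size` characters, and the overlap carried forward is a plain character slice rather than whole sentences.

```python
import re
from typing import List


def sentence_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Pack whole sentences into ~chunk_size-character chunks,
    carrying chunk_overlap trailing characters into the next chunk."""
    # Split after sentence-ending punctuation (. ! ?) followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > chunk_size:
            chunks.append(current)
            current = current[-chunk_overlap:]  # character-based overlap (simplified)
        current = (current + " " + sentence).strip()
    if current:
        chunks.append(current)
    return chunks


parts = sentence_chunks("First sentence here. " * 40, chunk_size=200, chunk_overlap=20)
```

Because chunks close only at sentence boundaries, no sentence is cut mid-word, and the overlap ensures a question whose answer straddles two chunks can still match both.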

Tuning Chunk Size

Use Case          chunk_size  chunk_overlap  Why
Short FAQ docs    200         20             Small, precise answers
General docs      500         50             Good balance (default)
Technical docs    1000        100            More context per chunk
Legal/dense text  300         50             Precise retrieval needed

Example: Build a Knowledge Base

import httpx
from pathlib import Path

client = httpx.Client(base_url="http://localhost:8000")

# Option A: Upload files directly
for path in Path("docs/").glob("*.md"):
    with open(path, "rb") as f:
        client.post("/api/v1/plugins/ai/rag/upload", files={
            "file": (path.name, f, "text/markdown"),
        }, data={"collection": "docs"})

# Option B: Ingest text programmatically
client.post("/api/v1/plugins/ai/rag/ingest", json={
    "collection": "docs",
    "text": "FastCMS supports JWT auth, OAuth, and OTP login...",
    "source": "auth-notes.txt",
})

# Ask questions
response = client.post("/api/v1/plugins/ai/rag/ask", json={
    "collection": "docs",
    "question": "How do I configure webhooks?",
})
print(response.json()["answer"])

Admin UI

The AI Playground page (/admin/ai → RAG tab) provides a visual interface for ingestion and querying:

  • Paste Text mode — paste document text directly
  • Upload File mode — upload a file (.txt, .md, .html, .csv, .json, .pdf) via the file picker
  • Ask a Question — query your ingested documents
  • Vector Store — see collection stats

Dependencies

  • Required: AI Core plugin, AI Vectors plugin
  • Optional: pip install PyPDF2 — for PDF file upload support
  • No other pip packages needed — all text extractors use the Python stdlib
