API Reference

MariaDB AI RAG exposes a RESTful API for programmatic access to the system. Every endpoint except the login endpoint requires authentication.

Authentication Endpoints

Login

POST /token

Purpose: Authenticates a user and returns a JWT for use in subsequent API calls.

Request body:

{
  "username": "user@example.com",
  "password": "secure_password"
}

Response:

{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "token_type": "bearer"
}

Usage Example: Authenticate before making any other API call, then include the returned JWT in the Authorization header of every subsequent request:

Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
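
The login flow above can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: BASE_URL is a placeholder, the helper names are hypothetical, and the credentials are sent as JSON to match the request body shown above. Adjust the encoding if your deployment expects something else.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: replace with your deployment's address


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """Build the POST /token request with the JSON body shown above."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def auth_header(token_response: dict) -> dict:
    """Turn the login response into the Authorization header for later calls."""
    return {"Authorization": f"Bearer {token_response['access_token']}"}


def login(username: str, password: str) -> dict:
    """Send the login request and return the ready-to-use Authorization header."""
    with urllib.request.urlopen(build_login_request(username, password)) as resp:
        return auth_header(json.load(resp))
```

Pass the dictionary returned by login() as extra headers on every later request.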

Document Management Endpoints

Upload Documents

Purpose: Uploads and processes one or more documents for ingestion into the system. Documents are processed asynchronously in the background.

Request: multipart/form-data with one or more file attachments

Request Parameters:

  • files: One or more files to upload (required)

Response:

Status Values:

  • pending: Document is queued for processing

  • completed: Document has been successfully processed

  • failed: Document processing failed (check error_message)

Usage Example: Upload one or more documents for ingestion.

Note: The endpoint accepts both single and multiple files. Documents are processed asynchronously, so the initial status will be pending. Use the document ID to check processing status later.
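
As an illustration of the multipart/form-data request described above, the following sketch encodes several files under the repeated files field using only the standard library. The helper name is hypothetical, and the part content type (application/octet-stream) is an assumption. The upload path is not shown in this section, so the sketch stops at producing the body and the Content-Type header value.

```python
import uuid


def encode_multipart(files: dict[str, bytes], field_name: str = "files") -> tuple[bytes, str]:
    """Encode files as multipart/form-data, repeating the field once per file.

    `files` maps a filename to its raw bytes. Returns (body, content_type).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for filename, content in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    # The closing delimiter marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

Send the returned body with the returned value as the Content-Type header, together with the Authorization header obtained at login.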

List Documents

Purpose: Retrieves a paginated list of all documents uploaded by the authenticated user.

Parameters:

  • skip (optional): Number of records to skip for pagination (default: 0)

  • limit (optional): Maximum number of records to return (default: 100)

Response:

Usage Example: Use this endpoint to monitor all documents in the system, check their processing status, or select documents for further operations.
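
The skip and limit parameters can be combined into a simple pagination loop. The sketch below assumes each page is returned as a list of document objects (the response body is not shown in this section) and takes the HTTP call as a caller-supplied function, so fetch_page is a hypothetical name rather than part of the API.

```python
from typing import Callable, Iterator


def iter_documents(
    fetch_page: Callable[[int, int], list[dict]], limit: int = 100
) -> Iterator[dict]:
    """Yield every document by advancing `skip` until a short page is returned."""
    skip = 0
    while True:
        page = fetch_page(skip, limit)
        yield from page
        if len(page) < limit:
            # Fewer records than requested means this was the last page.
            break
        skip += limit
```

Here fetch_page(skip, limit) is expected to issue the list request with those two query parameters and return the decoded JSON.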

Retrieve Document

Purpose: Retrieves detailed information about a specific document.

Response:

Usage Example: Use this endpoint to check the status of a specific document or retrieve its metadata.
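
Because documents are processed asynchronously, a client typically polls this endpoint until the document reaches completed or failed. The following is a minimal sketch under stated assumptions: the document object carries a status field with the values listed under Upload Documents, document IDs are integers, and fetch_document is a hypothetical caller-supplied function that performs the HTTP request.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_document(
    fetch_document: Callable[[int], dict],
    document_id: int,
    interval: float = 2.0,
    timeout: float = 300.0,
) -> dict:
    """Poll until the document leaves the pending state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        doc = fetch_document(document_id)
        if doc.get("status") in TERMINAL_STATUSES:
            return doc
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document {document_id} still {doc.get('status')!r}")
        time.sleep(interval)
```

When the returned status is failed, inspect error_message for the cause.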

Delete Documents

Purpose: Deletes multiple documents and their associated chunks and vector embeddings.

Request body:

Response:

Usage Example: Use this endpoint to remove documents that are no longer needed, freeing up storage space and improving search performance.

Chunking Endpoints

Chunk Documents (Batch)

Purpose: Processes multiple documents into chunks and creates vector embeddings for semantic search. Documents are processed asynchronously in the background.

Request body:

Chunking Methods:

  • recursive: Recursive text splitting (default)

  • sentence: Sentence-based chunking

  • token: Token-based chunking

  • semantic: Semantic similarity-based chunking (requires threshold)

Response:

Usage Example: Use this endpoint after document ingestion to prepare documents for semantic search. The chunking process divides documents into semantically meaningful segments and creates vector embeddings.

Note: For semantic chunking, the threshold parameter sets how similar adjacent segments must be before they are merged into a single chunk.
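
The method list and the threshold requirement can be enforced on the client before a request is sent. This is a sketch only: the request body is not shown in this section, so the field names document_ids and chunking_method are assumptions, as is the integer type of the IDs. Only the four method names and the threshold parameter come from the text above.

```python
from typing import Optional

CHUNKING_METHODS = {"recursive", "sentence", "token", "semantic"}


def build_chunk_request(
    document_ids: list[int],
    method: str = "recursive",
    threshold: Optional[float] = None,
) -> dict:
    """Validate chunking options and return a request body (field names are assumed)."""
    if method not in CHUNKING_METHODS:
        raise ValueError(f"unknown chunking method: {method!r}")
    if method == "semantic" and threshold is None:
        raise ValueError("semantic chunking requires a threshold")
    body = {"document_ids": document_ids, "chunking_method": method}
    if threshold is not None:
        body["threshold"] = threshold
    return body
```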

Chunk All Documents

Purpose: Processes all documents in the system into chunks. Useful for batch processing or reprocessing all documents with new chunking parameters.

Request body:

Response:

Usage Example: Use this endpoint to reprocess all documents with new chunking settings.

Filter/Retrieve Chunks

Purpose: Retrieves chunks for specific documents. Use this to check if chunking has completed or to retrieve chunk data.

Request body:

Response: Array of chunk objects

Usage Example: Check if documents have been chunked and retrieve their chunks.

Retrieval and Search Endpoints

Semantic Retrieval

Purpose: Performs semantic search to retrieve relevant document chunks based on a query using vector similarity.

Request body:

Request Parameters:

  • query (required): The search query

  • top_k (optional): Number of results to return (default: 20)

  • document_ids (optional): Filter results to specific document IDs (default: all documents)

Response: Array of retrieval results

Response Fields:

  • id: Unique chunk identifier

  • document_id: ID of the source document

  • content: The chunk text content

  • metadata: Additional metadata about the chunk

  • distance: Vector distance (lower = more similar)

Usage Example: Use this endpoint to find semantically relevant information. The system converts your query into a vector embedding and finds the most similar chunks.
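
A small sketch of how a client might build the request body and post-process the results. The parameter and field names (query, top_k, document_ids, distance) are taken from the lists above; the helper names, the integer ID type, and the idea of applying a client-side distance cutoff are assumptions for illustration.

```python
from typing import Optional


def build_retrieval_request(
    query: str, top_k: int = 20, document_ids: Optional[list[int]] = None
) -> dict:
    """Build the request body; omit document_ids to search all documents."""
    body = {"query": query, "top_k": top_k}
    if document_ids:
        body["document_ids"] = document_ids
    return body


def closest(results: list[dict], max_distance: float) -> list[dict]:
    """Keep results within max_distance, nearest first (lower distance = more similar)."""
    kept = [r for r in results if r["distance"] <= max_distance]
    return sorted(kept, key=lambda r: r["distance"])
```

A suitable max_distance depends on the embedding model and distance metric in use, so treat any cutoff as something to tune against your own data.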

Full-Text Search

Purpose: Performs full-text search using MariaDB's FULLTEXT index to find relevant document chunks.

Request body:

Request Parameters:

  • query (required): The search query

  • top_k (optional): Number of results to return (default: 10)

  • document_ids (optional): Filter results to specific document IDs

Response: Array of search results

Response Fields:

  • id: Unique chunk identifier

  • document_id: ID of the source document

  • source: File path of the source document

  • content: The chunk text content

  • score: Relevance score (higher = more relevant)

Usage Example: Use this endpoint for keyword-based search when you need exact term matching.

Hybrid Search

Purpose: Combines semantic search (vector similarity) and full-text search, merging the two result lists into a single ranking with Reciprocal Rank Fusion (RRF).

Request body:

Request Parameters:

  • query (required): The search query

  • top_k (optional): Number of results to return (default: 20)

  • k (optional): RRF parameter for rank fusion (default: 60)

  • provider (optional): Embedding provider for semantic search

  • model (optional): Embedding model for semantic search

  • document_ids (optional): Filter results to specific document IDs

Response: Array of hybrid search results

Response Fields:

  • id: Unique chunk identifier

  • document_id: ID of the source document

  • source: File path of the source document

  • content: The chunk text content

  • metadata: Additional metadata about the chunk

  • distance: Vector distance from semantic search (lower = more similar)

  • score: Full-text relevance score (higher = more relevant)

Usage Example: Use this endpoint when you want both semantic understanding and exact keyword matching reflected in a single ranked result list.
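
To make the k parameter concrete, here is a sketch of the standard Reciprocal Rank Fusion formula, in which each item scores the sum of 1 / (k + rank) over the lists it appears in. It illustrates the general technique only and is not taken from the server's implementation; the chunk IDs are made up.

```python
def reciprocal_rank_fusion(
    rankings: list[list[int]], k: int = 60
) -> list[tuple[int, float]]:
    """Fuse ranked lists of chunk IDs: score(id) = sum over lists of 1 / (k + rank)."""
    scores: dict[int, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


semantic = [11, 12, 13]   # hypothetical chunk IDs, ordered by ascending vector distance
full_text = [12, 14, 11]  # hypothetical chunk IDs, ordered by descending full-text score
fused = reciprocal_rank_fusion([semantic, full_text])
```

A larger k flattens the difference between adjacent ranks, so items that appear in both lists gain relative to items ranked highly in only one.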

Generation Endpoints

Generate Text

Purpose: Generates a response to a query using a language model and the provided context chunks.

Request body:

Request Parameters:

  • query (required): The user's question or prompt

  • chunks (required): Array of context chunks to use for generation

  • llm_provider (optional): LLM provider - openai, anthropic, gemini, cohere, ollama, azure, bedrock

  • llm_model (optional): Specific model to use (e.g., gpt-4, claude-3-opus)

  • temperature (optional): Controls randomness (0.0-2.0, default: 0.7)

  • top_p (optional): Nucleus sampling parameter (0.0-1.0, default: 0.9)

  • max_tokens (optional): Maximum tokens to generate (1-8192, default: 1000)

Response:

Usage Example: Use this endpoint after retrieving relevant chunks to generate a coherent response based on the information in those chunks.
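
The documented ranges and defaults can be checked on the client before calling the endpoint. In this sketch the parameter names, ranges, and defaults come from the list above; the helper name is hypothetical, and the shape of each chunk object (assumed here to be a dictionary as returned by the retrieval endpoints) is an assumption.

```python
from typing import Optional


def build_generate_request(
    query: str,
    chunks: list[dict],
    llm_provider: Optional[str] = None,
    llm_model: Optional[str] = None,
    temperature: float = 0.7,
    top_p: float = 0.9,
    max_tokens: int = 1000,
) -> dict:
    """Check sampling parameters against the documented ranges and build the body."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0.0 and 1.0")
    if not 1 <= max_tokens <= 8192:
        raise ValueError("max_tokens must be between 1 and 8192")
    body = {
        "query": query,
        "chunks": chunks,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }
    # Optional fields are left out so the server applies its own defaults.
    if llm_provider is not None:
        body["llm_provider"] = llm_provider
    if llm_model is not None:
        body["llm_model"] = llm_model
    return body
```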

Asynchronous Generation

Purpose: Generates a response asynchronously, useful for long-running generation tasks.

Request body: Same as /generate

Response: Same as /generate

Usage Example: Use this endpoint for generation tasks that may take longer to complete.

Streaming Generation

Purpose: Generates a response with streaming output (Server-Sent Events), allowing for real-time display of results as tokens are generated.

Request body: Same as /generate

Response: Server-Sent Events (SSE) stream with the following event types:

Usage Example: Use this endpoint for a better user experience when generating longer responses, as it allows displaying partial results as they become available.
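
On the client side, a Server-Sent Events stream is read line by line and dispatched on blank lines. The following sketch parses the generic SSE wire format (event and data fields, comment lines starting with a colon). It does not depend on this API's specific event types, and the event names used in the usage note below are hypothetical.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event_type, data) pairs from decoded SSE lines.

    Fields accumulate until a blank line dispatches the event. Lines that
    start with ':' are comments. The event type defaults to "message".
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue
        else:
            field, _, value = line.partition(":")
            # A single leading space after the colon is not part of the value.
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

For example, feeding it the lines "event: token", "data: Hel", and an empty line yields the pair ("token", "Hel"), which a UI could append to the text displayed so far.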

This page is: Copyright © 2025 MariaDB. All rights reserved.
