API Reference
MariaDB AI RAG exposes a comprehensive RESTful API for programmatic interaction with the system. All API endpoints require authentication except for the login endpoint.
Authentication Endpoints
Login
POST /token
Purpose: Authenticates a user and provides a JWT token for subsequent API calls.
Request body:
{
  "username": "user@example.com",
  "password": "secure_password"
}
Response:
{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "token_type": "bearer"
}
Usage Example: Authentication should be performed before any other API calls. The returned JWT token must be included in the Authorization header of subsequent requests:
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
Document Management Endpoints
Upload Documents
Purpose: Uploads and processes one or more documents for ingestion into the system. Documents are processed asynchronously in the background.
Request: multipart/form-data with one or more file attachments
Request Parameters:
files: One or more files to upload (required)
Response:
Status Values:
pending: Document is queued for processing
completed: Document has been successfully processed
failed: Document processing failed (check error_message)
Usage Example: Upload one or more documents for ingestion.
Note: The endpoint accepts both single and multiple files. Documents are processed asynchronously, so the initial status will be pending. Use the document ID to check processing status later.
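Since the source does not show the upload endpoint's path or a request example, here is a minimal standard-library sketch of building the multipart/form-data body the endpoint expects; the form field name `files` follows the parameter table above, but the encoding details are otherwise an illustration, not the project's own client code.

```python
import io
import uuid


def encode_multipart(files):
    """Build a multipart/form-data body for a document upload.

    `files` maps filenames to raw bytes. Every part is sent under the
    `files` form field, matching the documented request parameter.
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    for name, data in files.items():
        body.write(f"--{boundary}\r\n".encode())
        body.write(
            f'Content-Disposition: form-data; name="files"; '
            f'filename="{name}"\r\n\r\n'.encode()
        )
        body.write(data)
        body.write(b"\r\n")
    # Closing boundary terminates the multipart body.
    body.write(f"--{boundary}--\r\n".encode())
    content_type = f"multipart/form-data; boundary={boundary}"
    return content_type, body.getvalue()


content_type, body = encode_multipart({"report.pdf": b"%PDF-1.4 ..."})
```

The resulting body can be sent with any HTTP client, together with the `Authorization: Bearer <token>` header from the login step and the `Content-Type` returned here.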
List Documents
Purpose: Retrieves a paginated list of all documents uploaded by the authenticated user.
Parameters:
skip (optional): Number of records to skip for pagination (default: 0)
limit (optional): Maximum number of records to return (default: 100)
Response:
Usage Example: Use this endpoint to monitor all documents in the system, check their processing status, or select documents for further operations.
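As a sketch of the pagination parameters above, the following builds the query string with the documented defaults; the `/documents` path and base URL are assumptions, since the source omits the endpoint path.

```python
from urllib.parse import urlencode


def documents_url(base, skip=0, limit=100):
    """Build the list-documents URL; skip/limit mirror the documented defaults."""
    return f"{base}/documents?" + urlencode({"skip": skip, "limit": limit})


# Third page of 50 when paging through a large document set:
url = documents_url("http://localhost:8000", skip=100, limit=50)
```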
Retrieve Document
Purpose: Retrieves detailed information about a specific document.
Response:
Usage Example: Use this endpoint to check the status of a specific document or retrieve its metadata.
Delete Documents
Purpose: Deletes multiple documents and their associated chunks and vector embeddings.
Request body:
Response:
Usage Example: Use this endpoint to remove documents that are no longer needed, freeing up storage space and improving search performance.
Chunking Endpoints
Chunk Documents (Batch)
Purpose: Processes multiple documents into chunks and creates vector embeddings for semantic search. Documents are processed asynchronously in the background.
Request body:
Chunking Methods:
recursive: Recursive text splitting (default)
sentence: Sentence-based chunking
token: Token-based chunking
semantic: Semantic similarity-based chunking (requires threshold)
Response:
Usage Example: Use this endpoint after document ingestion to prepare documents for semantic search. The chunking process divides documents into semantically meaningful segments and creates vector embeddings.
Note: For semantic chunking, the threshold parameter controls how similar adjacent chunks should be before they are merged.
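The constraints above (four methods, threshold required only for semantic chunking) can be sketched as a small request builder; the field names `document_ids`, `method`, and `threshold` are assumptions, since the source does not show the request-body schema.

```python
import json

# Methods listed in the documentation; only "semantic" needs a threshold.
CHUNKING_METHODS = {"recursive", "sentence", "token", "semantic"}


def chunk_request(document_ids, method="recursive", threshold=None):
    """Build the JSON body for the batch chunking endpoint (field names assumed)."""
    if method not in CHUNKING_METHODS:
        raise ValueError(f"unknown chunking method: {method}")
    if method == "semantic" and threshold is None:
        raise ValueError("semantic chunking requires a threshold")
    body = {"document_ids": document_ids, "method": method}
    if threshold is not None:
        body["threshold"] = threshold
    return json.dumps(body)


payload = chunk_request([1, 2, 3], method="semantic", threshold=0.8)
```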
Chunk All Documents
Purpose: Processes all documents in the system into chunks. Useful for batch processing or reprocessing all documents with new chunking parameters.
Request body:
Response:
Usage Example: Use this endpoint to reprocess all documents with new chunking settings.
Filter/Retrieve Chunks
Purpose: Retrieves chunks for specific documents. Use this to check if chunking has completed or to retrieve chunk data.
Request body:
Response: Array of chunk objects
Usage Example: Check if documents have been chunked and retrieve their chunks.
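Because chunking runs asynchronously, a client typically polls this endpoint until chunks appear. A minimal generic polling helper, assuming `fetch_chunks` wraps a call to the filter endpoint:

```python
import time


def wait_for_chunks(fetch_chunks, interval=2.0, timeout=60.0):
    """Poll until chunks appear or the timeout expires.

    `fetch_chunks` is any callable returning the list of chunks for the
    documents of interest (e.g. a wrapper around the filter endpoint).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        chunks = fetch_chunks()
        if chunks:  # empty list means chunking has not finished yet
            return chunks
        time.sleep(interval)
    raise TimeoutError("chunking did not finish in time")
```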
Retrieval and Search Endpoints
Semantic Retrieval
Purpose: Performs semantic search to retrieve relevant document chunks based on a query using vector similarity.
Request body:
Request Parameters:
query (required): The search query
top_k (optional): Number of results to return (default: 20)
document_ids (optional): Filter results to specific document IDs (default: all documents)
Response: Array of retrieval results
Response Fields:
id: Unique chunk identifier
document_id: ID of the source document
content: The chunk text content
metadata: Additional metadata about the chunk
distance: Vector distance (lower = more similar)
Usage Example: Use this endpoint to find semantically relevant information. The system converts your query into a vector embedding and finds the most similar chunks.
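A sketch of building the retrieval request and ordering the results by `distance` (lower is more similar, per the field list above); the request field names mirror the parameter table but are otherwise assumed.

```python
import json


def retrieval_request(query, top_k=20, document_ids=None):
    """Build the JSON body for semantic retrieval (top_k default per the docs)."""
    body = {"query": query, "top_k": top_k}
    if document_ids is not None:
        body["document_ids"] = document_ids
    return json.dumps(body)


def best_matches(results, n=3):
    # Lower distance means more similar, so sort ascending.
    return sorted(results, key=lambda r: r["distance"])[:n]
```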
Full-Text Search
Purpose: Performs full-text search using MariaDB's FULLTEXT index to find relevant document chunks.
Request body:
Request Parameters:
query (required): The search query
top_k (optional): Number of results to return (default: 10)
document_ids (optional): Filter results to specific document IDs
Response: Array of search results
Response Fields:
id: Unique chunk identifier
document_id: ID of the source document
source: File path of the source document
content: The chunk text content
score: Relevance score (higher = more relevant)
Usage Example: Use this endpoint for keyword-based search when you need exact term matching.
Hybrid Search
Purpose: Combines semantic search (vector similarity) and full-text search using Reciprocal Rank Fusion (RRF) for optimal results.
Request body:
Request Parameters:
query (required): The search query
top_k (optional): Number of results to return (default: 20)
k (optional): RRF parameter for rank fusion (default: 60)
provider (optional): Embedding provider for semantic search
model (optional): Embedding model for semantic search
document_ids (optional): Filter results to specific document IDs
Response: Array of hybrid search results
Response Fields:
id: Unique chunk identifier
document_id: ID of the source document
source: File path of the source document
content: The chunk text content
metadata: Additional metadata about the chunk
distance: Vector distance from semantic search (lower = more similar)
score: Full-text relevance score (higher = more relevant)
Usage Example: Use this endpoint when you want both semantic understanding and exact keyword matching in a single ranked result list.
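To illustrate the Reciprocal Rank Fusion step the endpoint performs, here is a standalone sketch of RRF over two ranked result lists. The scoring formula, score(d) = Σ 1/(k + rank), is the standard RRF definition; the server's exact implementation may differ in details such as tie-breaking.

```python
def rrf_fuse(semantic_ids, fulltext_ids, k=60):
    """Fuse two rankings with Reciprocal Rank Fusion.

    Ranks are 1-based; k=60 matches the endpoint's documented default.
    Returns document IDs ordered by fused score, best first.
    """
    scores = {}
    for ranking in (semantic_ids, fulltext_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# "b" ranks high in both lists, so it wins the fusion.
fused = rrf_fuse(["a", "b", "c"], ["b", "c", "a"])  # → ["b", "a", "c"]
```

Documents that appear near the top of both rankings accumulate the highest fused score, which is why hybrid search tends to surface results that are both semantically close and keyword-relevant.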
Generate Text
Purpose: Generates a response to a query using a language model and the provided context chunks.
Request body:
Request Parameters:
query (required): The user's question or prompt
chunks (required): Array of context chunks to use for generation
llm_provider (optional): LLM provider, one of openai, anthropic, gemini, cohere, ollama, azure, bedrock
llm_model (optional): Specific model to use (e.g., gpt-4, claude-3-opus)
temperature (optional): Controls randomness (0.0-2.0, default: 0.7)
top_p (optional): Nucleus sampling parameter (0.0-1.0, default: 0.9)
max_tokens (optional): Maximum tokens to generate (1-8192, default: 1000)
Response:
Usage Example: Use this endpoint after retrieving relevant chunks to generate a coherent response based on the information in those chunks.
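A sketch of building the generation request with the documented parameter ranges enforced client-side; the JSON field names follow the parameter table above, but the body layout is otherwise an assumption.

```python
import json


def generate_request(query, chunks, temperature=0.7, top_p=0.9,
                     max_tokens=1000, **opts):
    """Build the JSON body for text generation, enforcing documented ranges."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0.0, 2.0]")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be in [0.0, 1.0]")
    if not 1 <= max_tokens <= 8192:
        raise ValueError("max_tokens must be in [1, 8192]")
    body = {"query": query, "chunks": chunks, "temperature": temperature,
            "top_p": top_p, "max_tokens": max_tokens}
    body.update(opts)  # e.g. llm_provider, llm_model
    return json.dumps(body)
```

A typical flow retrieves chunks via one of the search endpoints and passes them straight through as the `chunks` argument here.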
Asynchronous Generation
Purpose: Generates a response asynchronously, useful for long-running generation tasks.
Request body: Same as /generate
Response: Same as /generate
Usage Example: Use this endpoint for generation tasks that may take longer to complete.
Streaming Generation
Purpose: Generates a response with streaming output (Server-Sent Events), allowing for real-time display of results as tokens are generated.
Request body: Same as /generate
Response: Server-Sent Events (SSE) stream with the following event types:
Usage Example: Use this endpoint for a better user experience when generating longer responses, as it allows displaying partial results as they become available.
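Since the event payload format is not specified here, the following is a generic SSE-parsing sketch: it collects the `data:` lines of each event, which is how a client would assemble partial output from the stream (a real client would read the response incrementally rather than from a string).

```python
def parse_sse(stream_text):
    """Parse a Server-Sent Events stream into a list of data payloads.

    Events are separated by blank lines; each `data:` line carries one
    chunk of generated output. Fields other than `data` are ignored in
    this simplified parser.
    """
    events, data_lines = [], []
    for line in stream_text.splitlines():
        if line.startswith("data:"):
            # The SSE spec strips a single leading space; lstrip is a
            # simplification that covers the common case.
            data_lines.append(line[5:].lstrip())
        elif line == "" and data_lines:
            events.append("\n".join(data_lines))
            data_lines = []
    if data_lines:  # stream ended without a trailing blank line
        events.append("\n".join(data_lines))
    return events


tokens = parse_sse("data: Hello\n\ndata: world\n\n")  # → ["Hello", "world"]
```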
Copyright © 2025 MariaDB. All rights reserved.