Technical Architecture
System Architecture
High-Level Architecture
┌──────────────────────────────────────────────────────────────────────┐
│ Windows Host System                                                  │
│                                                                      │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Docker Desktop (WSL 2 Backend)                                   │ │
│ │                                                                  │ │
│ │ ┌──────────────────────────────────────────────────────────────┐ │ │
│ │ │ Docker Network: ai-nexus-network                             │ │ │
│ │ │ (Bridge Driver)                                              │ │ │
│ │ │                                                              │ │ │
│ │ │ ┌──────────────────────────────────────────────────────────┐ │ │ │
│ │ │ │ ai-nexus Container (Ubuntu 24.04)                        │ │ │ │
│ │ │ │                                                          │ │ │ │
│ │ │ │ ┌──────────────────────────────────────────────────────┐ │ │ │ │
│ │ │ │ │ Process 1: RAG API (PID: dynamic)                    │ │ │ │ │
│ │ │ │ │ - Framework: FastAPI                                 │ │ │ │ │
│ │ │ │ │ - Server: Uvicorn (ASGI)                             │ │ │ │ │
│ │ │ │ │ - Bind: 0.0.0.0:8000                                 │ │ │ │ │
│ │ │ │ │ - Workers: 1                                         │ │ │ │ │
│ │ │ │ │ - Binary: /opt/rag-in-a-box/bin/rag-api              │ │ │ │ │
│ │ │ │ └──────────────────────────────────────────────────────┘ │ │ │ │
│ │ │ │                                                          │ │ │ │
│ │ │ │ ┌──────────────────────────────────────────────────────┐ │ │ │ │
│ │ │ │ │ Process 2: MCP Server (PID: dynamic)                 │ │ │ │ │
│ │ │ │ │ - Framework: FastAPI                                 │ │ │ │ │
│ │ │ │ │ - Server: Uvicorn (ASGI)                             │ │ │ │ │
│ │ │ │ │ - Bind: 0.0.0.0:8002                                 │ │ │ │ │
│ │ │ │ │ - Workers: 1                                         │ │ │ │ │
│ │ │ │ │ - Binary: /opt/rag-in-a-box/bin/mcp-server           │ │ │ │ │
│ │ │ │ └──────────────────────────────────────────────────────┘ │ │ │ │
│ │ │ │                                                          │ │ │ │
│ │ │ │ Startup: start-services.sh                               │ │ │ │
│ │ │ │ Health Check: 180s timeout, 10s interval                 │ │ │ │
│ │ │ └───────────────────┬──────────────────────────────────────┘ │ │ │
│ │ │                     │                                        │ │ │
│ │ │                     │ MySQL Protocol (Port 3306)             │ │ │
│ │ │                     │                                        │ │ │
│ │ │ ┌───────────────────▼──────────────────────────────────────┐ │ │ │
│ │ │ │ mysql-db Container (MariaDB 11)                          │ │ │ │
│ │ │ │                                                          │ │ │ │
│ │ │ │ ┌──────────────────────────────────────────────────────┐ │ │ │ │
│ │ │ │ │ MariaDB Server                                       │ │ │ │ │
│ │ │ │ │ - Version: 11.x                                      │ │ │ │ │
│ │ │ │ │ - Storage Engine: InnoDB                             │ │ │ │ │
│ │ │ │ │ - Character Set: utf8mb4                             │ │ │ │ │
│ │ │ │ │ - Collation: utf8mb4_unicode_ci                      │ │ │ │ │
│ │ │ │ │ - Page Size: 16KB                                    │ │ │ │ │
│ │ │ │ │ - Row Format: Dynamic                                │ │ │ │ │
│ │ │ │ └──────────────────────────────────────────────────────┘ │ │ │ │
│ │ │ │                                                          │ │ │ │
│ │ │ │ ┌──────────────────────────────────────────────────────┐ │ │ │ │
│ │ │ │ │ Persistent Volume: mysql_data                        │ │ │ │ │
│ │ │ │ │ - Database: kb_chunks                                │ │ │ │ │
│ │ │ │ │ - Tables: documents_*, vdb_tbl_*                     │ │ │ │ │
│ │ │ │ │ - Indexes: Vector indexes                            │ │ │ │ │
│ │ │ │ └──────────────────────────────────────────────────────┘ │ │ │ │
│ │ │ └──────────────────────────────────────────────────────────┘ │ │ │
│ │ └──────────────────────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│                                                                      │
│ Port Mappings (Host → Container):                                    │
│ - 8000:8000 (RAG API)                                                │
│ - 8002:8002 (MCP Server)                                             │
│ - 3306:3306 (MariaDB)                                                │
└──────────────────────────────────────────────────────────────────────┘
External Services (Internet):
┌─────────────────────────────────────────────────┐
│ Google Generative AI API                        │
│ - Endpoint: generativelanguage.googleapis.com   │
│ - Embedding: text-embedding-004                 │
│ - LLM: gemini-2.0-flash                         │
└─────────────────────────────────────────────────┘
Container Dependency Graph
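The architecture diagram lists a startup health check with a 180-second timeout and a 10-second interval. The loop below is a minimal sketch of that kind of readiness polling, assuming a plain TCP connect is an adequate probe. The check that start-services.sh or the container definition actually performs is not shown on this page, and the names `port_is_open` and `wait_until_ready` are illustrative only.

```python
import socket
import time
from typing import Callable


def port_is_open(host: str, port: int, connect_timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 180.0,   # timeout from the diagram
    interval_s: float = 10.0,   # interval from the diagram
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call probe() every interval_s seconds until it passes or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        # Give up if the next attempt would start after the deadline.
        if clock() + interval_s > deadline:
            return False
        sleep(interval_s)
```

A dependent service could, for example, call `wait_until_ready(lambda: port_is_open("localhost", 3306))` before opening database connections; ports 8000 and 8002 can be probed the same way.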
Component Details
1. RAG API Component
Binary Location: /opt/rag-in-a-box/bin/rag-api
Responsibilities:
Document ingestion and processing
Text chunking and embedding generation
Vector storage and retrieval
Semantic search
RAG query processing
Authentication and authorization
Technology Stack:
Framework: FastAPI (Python)
ASGI Server: Uvicorn
Database Driver: PyMySQL / aiomysql
Embedding Client: Google Generative AI SDK
Document Processing: LangChain / Custom parsers
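The responsibilities above include text chunking, but this page does not state the chunk size, the overlap, or the splitter in use. As a rough illustration only, fixed-size character chunking with overlap could look like the sketch below. The function name and the default values are assumptions, and the real pipeline may rely on a LangChain splitter or a custom parser instead.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing and embedding some text twice.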
Endpoints:
Configuration Variables:
2. MCP Server Component
Binary Location: /opt/rag-in-a-box/bin/mcp-server
Responsibilities:
Model Context Protocol implementation
Database tool exposure
Vector store tool exposure
RAG tool exposure
Authentication and rate limiting
Technology Stack:
Framework: FastAPI (Python)
ASGI Server: Uvicorn
Protocol: MCP (Model Context Protocol)
Database Client: PyMySQL
Available Tools:
Core Tools:
health_check - Server health verification
get_server_status - Detailed server status
Database Tools:
list_databases - List all databases
list_tables - List tables in database
get_table_schema - Get table structure
execute_sql - Execute SQL queries
create_database - Create new database
drop_database - Delete database
Vector Store Tools:
create_vector_store - Create vector store
delete_vector_store - Delete vector store
list_vector_stores - List all vector stores
insert_docs_vector_store - Add documents
search_vector_store - Semantic search
RAG Tools:
ingest_documents - Ingest documents via RAG API
generate_response - Generate RAG responses
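To make the tool list more concrete, the sketch below builds a tool invocation message, assuming the server accepts the standard MCP JSON-RPC 2.0 `tools/call` request shape. The tool names come from the list above; the argument names passed in the usage example are hypothetical, since the tool schemas are not shown on this page.

```python
import json
from typing import Any, Dict

# Tool names as listed in this section.
KNOWN_TOOLS = frozenset({
    "health_check", "get_server_status",
    "list_databases", "list_tables", "get_table_schema",
    "execute_sql", "create_database", "drop_database",
    "create_vector_store", "delete_vector_store", "list_vector_stores",
    "insert_docs_vector_store", "search_vector_store",
    "ingest_documents", "generate_response",
})


def build_tool_call(request_id: int, tool_name: str, arguments: Dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for one of the listed tools."""
    if tool_name not in KNOWN_TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(payload)
```

For example, `build_tool_call(1, "search_vector_store", {"query": "backup policy"})` produces a request body; the `query` argument is a placeholder and should be replaced with whatever the tool's published schema requires.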
Configuration Variables:
3. MariaDB Component
Image: mariadb:11
Configuration:
Database Schema:
Data Flow
Document Ingestion Flow
RAG Query Flow
Security Architecture
Authentication Flow
Security Keys
Critical Requirement: All three keys must be identical for unified authentication:
Key Generation (for production):
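The original key generation command is not reproduced on this page. One possible approach, sketched here under that caveat, is to generate a single random secret with Python's `secrets` module and assign the same value to every key that must match. The key names in the usage note are placeholders, not the product's actual variable names.

```python
import secrets
from typing import Dict, Iterable


def generate_shared_secret(names: Iterable[str], num_bytes: int = 32) -> Dict[str, str]:
    """Create one random hex secret and assign it to every given key name."""
    secret = secrets.token_hex(num_bytes)  # 2 hex characters per byte
    return {name: secret for name in names}
```

Calling `generate_shared_secret(["KEY_A", "KEY_B", "KEY_C"])` returns three entries that share one 64-character value, which satisfies the requirement that all three keys be identical.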
Security Features
JWT Authentication
Algorithm: HS256
Expiration: 30 minutes (configurable)
Unified token for RAG API and MCP Server
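The following standard-library sketch shows how an HS256 token with a 30-minute expiry can be issued and verified, and why both services need the same signing key to accept each other's tokens. It is not the product's implementation: the claim set (`sub`, `iat`, `exp`) and the function names are assumptions, and a production service would normally use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(subject: str, secret: str, ttl_seconds: int = 30 * 60,
                 now: Optional[float] = None) -> str:
    """Issue an HS256-signed token that expires ttl_seconds after `now`."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": subject, "iat": issued, "exp": issued + ttl_seconds}
    parts = [_b64url(json.dumps(p, separators=(",", ":")).encode()) for p in (header, claims)]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def verify_token(token: str, secret: str, now: Optional[float] = None) -> Dict[str, Any]:
    """Return the claims if the signature and expiry are valid, else raise ValueError."""
    pieces = token.split(".")
    if len(pieces) != 3:
        raise ValueError("malformed token")
    header_b64, claims_b64, sig_b64 = pieces
    expected = _sign(f"{header_b64}.{claims_b64}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(claims_b64))
    if (time.time() if now is None else now) >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

A token created by one service verifies in the other only when both call `verify_token` with the same secret; a different secret produces a "bad signature" error.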
Rate Limiting
100 requests per minute (default)
Configurable per endpoint
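The default of 100 requests per minute can be enforced in several ways; this page does not say which algorithm the services use. The sketch below assumes a per-client sliding window kept in memory, with illustrative class and parameter names.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within any `window_s` seconds."""

    def __init__(self, limit: int = 100, window_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window_s = window_s
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = {}

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits.setdefault(client_id, deque())
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self._window_s:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True
```

Per-endpoint limits could be modelled by keeping one limiter instance per endpoint, each with its own `limit`. An in-memory limiter like this one only counts requests seen by a single process.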
CORS Configuration
Allowed origins: Configurable
Credentials: Supported
Methods: GET, POST, PUT, DELETE, OPTIONS
File Upload Security
Max file size: 200MB
Allowed extensions: .pdf, .txt, .docx, .md, .html, .csv, .json, .xml
Malware scanning: Optional
Quarantine: Enabled for suspicious files
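The size and extension rules above translate into a short pre-upload check. This is a sketch only: it assumes "200MB" means 200 × 1024 × 1024 bytes, it does not cover malware scanning or quarantine, and `validate_upload` is a hypothetical name.

```python
from pathlib import PurePath

MAX_UPLOAD_BYTES = 200 * 1024 * 1024  # assumption: MB interpreted as MiB
ALLOWED_EXTENSIONS = frozenset(
    {".pdf", ".txt", ".docx", ".md", ".html", ".csv", ".json", ".xml"}
)


def validate_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file breaks the extension or size rule."""
    extension = PurePath(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension not allowed: {extension or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError(f"file too large: {size_bytes} bytes")
```

Checking only the final suffix means a name such as `report.pdf.exe` is rejected, while `REPORT.PDF` is accepted after lowercasing.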
Database Security
Parameterized queries (SQL injection prevention)
Connection pooling
Encrypted connections (optional)
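Parameterized queries keep user input out of the SQL text, which is what prevents injection. The example below uses Python's built-in `sqlite3` as a stand-in so it runs without a database server; it is not the services' data access code, and the table is invented for the demonstration. With PyMySQL the same pattern applies, using `%s` placeholders instead of `?`.

```python
import sqlite3
from typing import List, Tuple

# In-memory stand-in database with a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO documents (title) VALUES (?)", ("intro.md",))


def find_by_title(connection: sqlite3.Connection, title: str) -> List[Tuple[int, str]]:
    """Look up rows by title; the value is bound by the driver, never concatenated."""
    cursor = connection.execute(
        "SELECT id, title FROM documents WHERE title = ?", (title,)
    )
    return cursor.fetchall()
```

An input such as `x' OR '1'='1` is treated as a literal title and matches nothing, whereas string concatenation would have turned it into part of the query.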
Configuration Management
Configuration Modes
1. Standalone Mode
File: config.env.secure.local
Usage: Direct environment variables
Security: Secrets stored in file
Best for: Development, single developer
2. Vault Mode
File: config.env.vault.local
Usage: HashiCorp Vault integration
Security: Secrets stored in Vault
Best for: Team development, production-like
Vault Configuration:
3. 1Password Mode
File: config.env.1password.employee
Usage: 1Password CLI references
Security: Secrets in 1Password vault
Best for: Enterprise with 1Password
1Password References:
4. HCP Vault Mode
File: config.env.hcp.live
Usage: HashiCorp Cloud Platform
Security: Cloud-managed secrets
Best for: Production cloud deployments
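The four modes differ mainly in which configuration file is loaded. A small lookup plus a KEY=VALUE parser is enough to illustrate the idea. The file names come from this section, while the mode identifiers (`standalone`, `vault`, `1password`, `hcp`) and the simple file syntax are assumptions made for the sketch.

```python
from typing import Dict

# File names are from this section; the mode identifiers are hypothetical.
CONFIG_FILES = {
    "standalone": "config.env.secure.local",
    "vault": "config.env.vault.local",
    "1password": "config.env.1password.employee",
    "hcp": "config.env.hcp.live",
}


def config_file_for(mode: str) -> str:
    """Return the configuration file name for a mode, or raise ValueError."""
    try:
        return CONFIG_FILES[mode]
    except KeyError:
        raise ValueError(f"unknown configuration mode: {mode}") from None


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            values[key.strip()] = value.strip()
    return values
```

In the Vault, 1Password, and HCP modes the parsed values would be references that still need to be resolved by the respective secret manager; that resolution step is outside this sketch.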
API Specifications
RAG API Endpoints
POST /token
Description: Generate JWT authentication token
Request:
Response:
POST /ingest
Description: Upload and process documents
Headers:
Request:
Response:
POST /generate
Description: Generate RAG response
Headers:
Request:
Response:
Database Schema
Tables
documents_DEMO_gemini
vdb_tbl_DEMO_gemini
Vector Storage Format
Embedding Dimensions: 768 (float32)
Storage Size: 768 × 4 bytes = 3,072 bytes per vector
Format: Binary BLOB
Encoding: IEEE 754 single-precision floating-point
Performance Characteristics
Resource Requirements
Per Container:
Performance Metrics
Document Ingestion:
Document batch size: ~5 documents/batch
Chunking: ~100 chunks/second
Embedding generation: ~32 chunks/batch
Total time: ~30-60 seconds per document (depends on size)
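The batch figures above imply that chunks are grouped before being sent for embedding. A generic batching helper, sketched here with an assumed default of 32 items per batch and a hypothetical name, shows the grouping step on its own; the embedding call itself is omitted.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int = 32) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each holding at most batch_size entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])
```

For a document split into 70 chunks, this yields batches of 32, 32, and 6 chunks, so three embedding requests would be needed.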
Query Performance:
Embedding generation: ~100-200ms
Similarity search: ~50-100ms (depends on dataset size)
LLM generation: ~1-3 seconds
Total response time: ~2-4 seconds
Scalability
Current Limits:
Max file size: 200MB
Max request rate: 100 requests/minute (default rate limit)
Database connections: 10 (pool size)
Scaling Options:
Horizontal: Deploy multiple ai-nexus containers
Vertical: Increase container resources
Database: Use read replicas for queries
End of Technical Architecture Document
This page is: Copyright © 2025 MariaDB. All rights reserved.

