Technical Architecture

System Architecture

High-Level Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                         Windows Host System                         │
│                                                                     │
│  ┌─────────────────────────────────────────────────────────────┐    │
│  │              Docker Desktop (WSL 2 Backend)                 │    │
│  │                                                             │    │
│  │  ┌────────────────────────────────────────────────────────┐ │    │
│  │  │              Docker Network: ai-nexus-network          │ │    │
│  │  │                  (Bridge Driver)                       │ │    │
│  │  │                                                        │ │    │
│  │  │  ┌──────────────────────────────────────────────────┐  │ │    │
│  │  │  │      ai-nexus Container (Ubuntu 24.04)           │  │ │    │
│  │  │  │                                                  │  │ │    │
│  │  │  │  ┌────────────────────────────────────────────┐  │  │ │    │
│  │  │  │  │  Process 1: RAG API (PID: dynamic)         │  │  │ │    │
│  │  │  │  │  - Framework: FastAPI                      │  │  │ │    │
│  │  │  │  │  - Server: Uvicorn (ASGI)                  │  │  │ │    │
│  │  │  │  │  - Bind: 0.0.0.0:8000                      │  │  │ │    │
│  │  │  │  │  - Workers: 1                              │  │  │ │    │
│  │  │  │  │  - Binary: /opt/rag-in-a-box/bin/rag-api   │  │  │ │    │
│  │  │  │  └────────────────────────────────────────────┘  │  │ │    │
│  │  │  │                                                  │  │ │    │
│  │  │  │  ┌────────────────────────────────────────────┐  │  │ │    │
│  │  │  │  │  Process 2: MCP Server (PID: dynamic)      │  │  │ │    │
│  │  │  │  │  - Framework: FastAPI                      │  │  │ │    │
│  │  │  │  │  - Server: Uvicorn (ASGI)                  │  │  │ │    │
│  │  │  │  │  - Bind: 0.0.0.0:8002                      │  │  │ │    │
│  │  │  │  │  - Workers: 1                              │  │  │ │    │
│  │  │  │  │  - Binary: /opt/rag-in-a-box/bin/mcp-server│  │  │ │    │
│  │  │  │  └────────────────────────────────────────────┘  │  │ │    │
│  │  │  │                                                  │  │ │    │
│  │  │  │  Startup: start-services.sh                      │  │ │    │
│  │  │  │  Health Check: 180s timeout, 10s interval        │  │ │    │
│  │  │  └──────────────────┬───────────────────────────────┘  │ │    │
│  │  │                     │                                  │ │    │
│  │  │                     │ MySQL Protocol (Port 3306)       │ │    │
│  │  │                     │                                  │ │    │
│  │  │  ┌──────────────────▼────────────────────────────┐     │ │    │
│  │  │  │      mysql-db Container (MariaDB 11)          │     │ │    │
│  │  │  │                                               │     │ │    │
│  │  │  │  ┌──────────────────────────────────────────┐ │     │ │    │
│  │  │  │  │  MariaDB Server                          │ │     │ │    │
│  │  │  │  │  - Version: 11.x                         │ │     │ │    │
│  │  │  │  │  - Storage Engine: InnoDB                │ │     │ │    │
│  │  │  │  │  - Character Set: utf8mb4                │ │     │ │    │
│  │  │  │  │  - Collation: utf8mb4_unicode_ci         │ │     │ │    │
│  │  │  │  │  - Page Size: 16KB                       │ │     │ │    │
│  │  │  │  │  - Row Format: Dynamic                   │ │     │ │    │
│  │  │  │  └──────────────────────────────────────────┘ │     │ │    │
│  │  │  │                                               │     │ │    │
│  │  │  │  ┌──────────────────────────────────────────┐ │     │ │    │
│  │  │  │  │  Persistent Volume: mysql_data           │ │     │ │    │
│  │  │  │  │  - Database: kb_chunks                   │ │     │ │    │
│  │  │  │  │  - Tables: documents_*, vdb_tbl_*        │ │     │ │    │
│  │  │  │  │  - Indexes: Vector indexes               │ │     │ │    │
│  │  │  │  └──────────────────────────────────────────┘ │     │ │    │
│  │  │  └───────────────────────────────────────────────┘     │ │    │
│  │  └────────────────────────────────────────────────────────┘ │    │
│  └─────────────────────────────────────────────────────────────┘    │
│                                                                     │
│  Port Mappings (Host → Container):                                  │
│  - 8000:8000  (RAG API)                                             │
│  - 8002:8002  (MCP Server)                                          │
│  - 3306:3306  (MariaDB)                                             │
└─────────────────────────────────────────────────────────────────────┘

External Services (Internet):
┌─────────────────────────────────────────────────┐
│  Google Generative AI API                       │
│  - Endpoint: generativelanguage.googleapis.com  │
│  - Embedding: text-embedding-004                │
│  - LLM: gemini-2.0-flash                        │
└─────────────────────────────────────────────────┘
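
The startup timing in the diagram (180 s timeout, 10 s interval) can be mirrored by a host-side readiness poll. The sketch below is not the container's own health check, whose command is not shown here. It only assumes that a service counts as reachable once its published port accepts a TCP connection, and the helper names `port_is_open` and `wait_for_ports` are hypothetical.

```python
import socket
import time


def port_is_open(host, port, connect_timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        return False


def wait_for_ports(host, ports, timeout=180.0, interval=10.0):
    """Poll until every port accepts connections or the time budget runs out.

    Returns the set of ports that are still closed; an empty set means
    every port was reachable.
    """
    deadline = time.monotonic() + timeout
    pending = set(ports)
    while True:
        # Keep only the ports that still refuse connections.
        pending = {p for p in pending if not port_is_open(host, p)}
        if not pending or time.monotonic() >= deadline:
            return pending
        time.sleep(interval)
```

Calling `wait_for_ports("localhost", [8000, 8002, 3306])` from the host would return an empty set once all three mapped ports accept connections, or the set of ports still closed when the 180-second budget is spent.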

Container Dependency Graph


Component Details

1. RAG API Component

Binary Location: /opt/rag-in-a-box/bin/rag-api

Responsibilities:

  • Document ingestion and processing

  • Text chunking and embedding generation

  • Vector storage and retrieval

  • Semantic search

  • RAG query processing

  • Authentication and authorization
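
As an illustration of the chunking responsibility, the sketch below splits text into overlapping fixed-size character windows. The window size, the overlap, and the function name `chunk_text` are assumptions for illustration; the RAG API may chunk by tokens, sentences, or document structure instead.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

Overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing and embedding some text twice.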

Technology Stack:

  • Framework: FastAPI (Python)

  • ASGI Server: Uvicorn

  • Database Driver: PyMySQL / aiomysql

  • Embedding Client: Google Generative AI SDK

  • Document Processing: LangChain / Custom parsers

Endpoints:

Configuration Variables:

2. MCP Server Component

Binary Location: /opt/rag-in-a-box/bin/mcp-server

Responsibilities:

  • Model Context Protocol implementation

  • Database tool exposure

  • Vector store tool exposure

  • RAG tool exposure

  • Authentication and rate limiting

Technology Stack:

  • Framework: FastAPI (Python)

  • ASGI Server: Uvicorn

  • Protocol: MCP (Model Context Protocol)

  • Database Client: PyMySQL

Available Tools:

Core Tools:

  • health_check - Server health verification

  • get_server_status - Detailed server status

Database Tools:

  • list_databases - List all databases

  • list_tables - List tables in database

  • get_table_schema - Get table structure

  • execute_sql - Execute SQL queries

  • create_database - Create new database

  • drop_database - Delete database

Vector Store Tools:

  • create_vector_store - Create vector store

  • delete_vector_store - Delete vector store

  • list_vector_stores - List all vector stores

  • insert_docs_vector_store - Add documents

  • search_vector_store - Semantic search

RAG Tools:

  • ingest_documents - Ingest documents via RAG API

  • generate_response - Generate RAG responses
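
In the Model Context Protocol, a client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below only builds such a message for `search_vector_store`. The argument names, the store name, and the helper `build_tool_call` are hypothetical, and the transport this server expects on port 8002 is not covered.

```python
import itertools
import json

_request_ids = itertools.count(1)  # JSON-RPC requests carry a unique id


def build_tool_call(tool_name, arguments):
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Argument names and values are assumptions; the real tool schema may differ.
payload = build_tool_call(
    "search_vector_store",
    {"vector_store_name": "DEMO_gemini", "query": "What is InnoDB?", "k": 5},
)
```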

Configuration Variables:

3. MariaDB Component

Image: mariadb:11

Configuration:

Database Schema:


Data Flow

Document Ingestion Flow

RAG Query Flow
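
The query stages named under Performance Metrics (query embedding, similarity search, LLM generation) can be sketched as a single function. This is an in-memory illustration under stated assumptions: `embed` and `generate` are caller-supplied stand-ins for the Google Generative AI calls, the prompt wording is invented, and the real similarity search presumably runs against the vector indexes in MariaDB rather than in Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def answer_query(question, chunks, embed, generate, top_k=3):
    """Run the three query stages over `chunks`, a list of (text, vector) pairs."""
    # Stage 1: embed the question.
    query_vector = embed(question)
    # Stage 2: rank stored chunks by similarity and keep the best top_k.
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, chunk[1]),
        reverse=True,
    )
    context = "\n\n".join(text for text, _ in ranked[:top_k])
    # Stage 3: ask the LLM to answer from the retrieved context.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```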


Security Architecture

Authentication Flow

Security Keys

Critical Requirement: All three keys must be identical for unified authentication:

Key Generation (for production):
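
One possible way to produce a suitable key with Python's standard library is sketched below. It assumes a hex-encoded 256-bit secret is acceptable for HS256 signing, and the function name `generate_secret_key` is hypothetical. The same generated value would then be assigned to each of the three key settings.

```python
import secrets


def generate_secret_key(num_bytes=32):
    """Return a hex-encoded key drawn from the OS cryptographic random source."""
    return secrets.token_hex(num_bytes)


key = generate_secret_key()
print(len(key))  # 32 random bytes encode to 64 hex characters
```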

Security Features

  1. JWT Authentication

    • Algorithm: HS256

    • Expiration: 30 minutes (configurable)

    • Unified token for RAG API and MCP Server

  2. Rate Limiting

    • 100 requests per minute (default)

    • Configurable per endpoint

  3. CORS Configuration

    • Allowed origins: Configurable

    • Credentials: Supported

    • Methods: GET, POST, PUT, DELETE, OPTIONS

  4. File Upload Security

    • Max file size: 200MB

    • Allowed extensions: .pdf, .txt, .docx, .md, .html, .csv, .json, .xml

    • Malware scanning: Optional

    • Quarantine: Enabled for suspicious files

  5. Database Security

    • Parameterized queries (SQL injection prevention)

    • Connection pooling

    • Encrypted connections (optional)
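
To make the JWT settings above concrete, the sketch below signs and verifies an HS256 token with a 30-minute lifetime using only the standard library. It is an illustration, not the services' implementation: the claim set (`sub`, `iat`, `exp`) and the function names are assumptions, and a production service would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data):
    """Base64url-encode bytes without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input, secret):
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(subject, secret, ttl_seconds=30 * 60, now=None):
    """Build a compact HS256 JWT that expires ttl_seconds after issue."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": subject, "iat": issued, "exp": issued + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token, secret, now=None):
    """Return the claims if the signature and expiry are valid, else raise ValueError."""
    try:
        header_b64, claims_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    expected = _sign(f"{header_b64}.{claims_b64}", secret)
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(claims_b64))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

Because HS256 is symmetric, any service that verifies the token needs the same secret that signed it, which is why a single token can only be shared between the RAG API and the MCP Server when their keys match.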


Configuration Management

Configuration Modes

1. Standalone Mode

  • File: config.env.secure.local

  • Usage: Direct environment variables

  • Security: Secrets stored in file

  • Best for: Development, single developer
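
A file of direct environment variables is typically a list of `KEY=VALUE` lines. The parser below is a minimal sketch under that assumption; the exact syntax accepted for `config.env.secure.local` is not specified here, and the function name and the variable names in the usage note are hypothetical.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines, comments, and malformed lines."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values
```

For example, `parse_env_text('EXAMPLE_PORT=8000\nEXAMPLE_NAME="demo store"')` yields a dictionary with the two hypothetical keys mapped to `8000` and `demo store`, both as strings.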

2. Vault Mode

  • File: config.env.vault.local

  • Usage: HashiCorp Vault integration

  • Security: Secrets stored in Vault

  • Best for: Team development, production-like

Vault Configuration:

3. 1Password Mode

  • File: config.env.1password.employee

  • Usage: 1Password CLI references

  • Security: Secrets in 1Password vault

  • Best for: Enterprise with 1Password

1Password References:

4. HCP Vault Mode

  • File: config.env.hcp.live

  • Usage: HashiCorp Cloud Platform

  • Security: Cloud-managed secrets

  • Best for: Production cloud deployments


API Specifications

RAG API Endpoints

POST /token

Description: Generate JWT authentication token

Request:

Response:

POST /ingest

Description: Upload and process documents

Headers:

Request:

Response:

POST /generate

Description: Generate RAG response

Headers:

Request:

Response:


Database Schema

Tables

documents_DEMO_gemini

vdb_tbl_DEMO_gemini

Vector Storage Format

  • Embedding Dimensions: 768 (float32)

  • Storage Size: 768 × 4 bytes = 3,072 bytes per vector

  • Format: Binary BLOB

  • Encoding: IEEE 754 single-precision floating-point


Performance Characteristics

Resource Requirements

Per Container:

Performance Metrics

Document Ingestion:

  • Document batch size: ~5 documents per batch

  • Chunking: ~100 chunks/second

  • Embedding generation: ~32 chunks/batch

  • Total time: ~30-60 seconds per document (depends on size)
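
The per-batch figures above imply that work is grouped before it is processed. A minimal batching helper might look like the sketch below, assuming each batch corresponds to one embedding request; that mapping and the name `batched` are assumptions, not confirmed behavior.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items`, each at most batch_size long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])
```

With a batch size of 32, a document that produces 70 chunks would be embedded in three batches of 32, 32, and 6 chunks.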

Query Performance:

  • Embedding generation: ~100-200ms

  • Similarity search: ~50-100ms (depends on dataset size)

  • LLM generation: ~1-3 seconds

  • Total response time: ~2-4 seconds

Scalability

Current Limits:

  • Max file size: 200MB

  • Request rate limit: 100 requests/minute (default)

  • Database connections: 10 (pool size)
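
A limit of 100 requests per minute can be enforced with a sliding-window counter. The class below is a single-process sketch of that technique; the services' actual limiter, its storage, and its keying (per client, per endpoint) are not described here, and the class name is hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds-long interval."""

    def __init__(self, max_requests=100, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._timestamps = deque()  # times of accepted requests, oldest first

    def allow(self):
        """Record and accept the request if the window has room, else reject it."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False
        self._timestamps.append(now)
        return True
```

Running several ai-nexus containers, as suggested under Scaling Options, would require shared state for such a limiter, since each process here counts only its own requests.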

Scaling Options:

  • Horizontal: Deploy multiple ai-nexus containers

  • Vertical: Increase container resources

  • Database: Use read replicas for queries


End of Technical Architecture Document

This page is: Copyright © 2025 MariaDB. All rights reserved.
