VIDEX Storage Engine

The VIDEX storage engine is an aggregated, extensible engine suitable for what-if analyses in MariaDB. The name is derived from [VI]rtual in[DEX].

This document explains how to install and use VIDEX with MariaDB, including:

  • Installing/enabling the VIDEX plugin in MariaDB.

  • Running the VIDEX-Server (statistics service) as a container.

  • Running a one-shot videx-sync workflow to build a VIDEX database.

  • Comparing EXPLAIN between your original schema and the VIDEX schema.

The VIDEX server repository, with more examples and extension points, can be accessed herearrow-up-right. It contains an example dataset (TPC-H tiny)arrow-up-right.

Prerequisites

  • A running MariaDB server you can connect to (the target).

  • A MariaDB build that includes the VIDEX engine plugin (the MariaDB-VIDEX).

    • For a quick start, the target MariaDB and MariaDB-VIDEX can be the same instance.

  • Docker (to run VIDEX-Server and videx-sync).

What is VIDEX

VIDEX is a virtual/hypothetical index engine for what-if analysis.

  • Goal: Evaluate how potential indexes (and optimizer decisions such as join orders) would change query plans without creating real indexes on production data.

  • How it works: VIDEX replays optimizer / handler calls using statistics (cardinality, NDV, histograms, etc.) instead of reading table data. Complex statistics computation is offloaded to an external service (VIDEX-Server) via HTTP.

In practice, you keep your existing MariaDB schema and data as the target, and create a second schema whose tables use ENGINE=VIDEX. You then run EXPLAIN on both schemas and compare plans.

circle-info

Research note (AI4DB)

Estimating multi-column joint NDV (Number of Distinct Value) and cardinality is a challenging research problem. You can extend VIDEX by implementing your own estimation models in VIDEX-Server to improve accuracy for your workload.

Components and Roles

VIDEX typically involves the following roles:

  • Target MariaDB: your original database instance and schema (contains real data).

  • MariaDB with VIDEX plugin (MariaDB-VIDEX): a MariaDB instance that has the VIDEX storage engine plugin enabled.

    • It can be the same instance as the target MariaDB (common for a quick start).

    • Alternatively, it can be a separate MariaDB instance used only for what-if analysis.

  • VIDEX-Server: a standalone HTTP service that stores metadata/statistics and answers estimation requests from the VIDEX plugin.

This document focuses on a docker-based workflow for VIDEX-Server and videx-sync, so users can complete the VIDEX end-to-end flow with a few docker commands, while the MariaDB server itself is managed by users.

Prepare Sample Data

You can test VIDEX with the TPC-H tiny sample.

  1. Download tpch_tiny.sql.tar.gz from herearrow-up-right.

  2. Create a database, and import the data (replace <TARGET_HOST>, <TARGET_PORT>, etc. with your settings:

Install and Enable the VIDEX Plugin

Verify the Engine

Connect to your MariaDB instance and check whether the VIDEX engine is available:

If you see a row for VIDEX with SUPPORT as YES or DEFAULT, the engine is available.

Build-Time Enablement

If VIDEX is not present, build MariaDB with VIDEX enabled. The MariaDB server PR that introduces VIDEX is herearrow-up-right. Key dependencies and options are:

  • Dependencies:

    • libcurl (HTTP client)

    • zlib (compression)

  • CMake options:

    • -DPLUGIN_VIDEX=YES (enable plugin)

    • -DPLUGIN_VIDEX=STATIC (static)

    • -DPLUGIN_VIDEX=DYNAMIC (dynamic)

Example build configuration:

Install and Run VIDEX-Server in Docker

VIDEX-Server is a separate service providing statistics/estimation over HTTP, and MariaDB-VIDEX calls it to get statistics information for generating query plans.

Images

Public images are on Docker Hub: kangrongme/videx-server:0.2.0

Start the Server

Expose container port 5001:

When done, the service is reachable under http://<YOUR_HOST_IP>:5001 .

circle-info

Reachability note

Prefer using a routable IP address (your host/server IP) instead of localhost/127.0.0.1. This matters because both MariaDB-VIDEX (the plugin) and videx-sync need to reach VIDEX-Server. If any of them run inside a container, localhost/127.0.0.1 refers to that container itself (not your host), so the service isn't reachable via localhost.

Build the VIDEX Schema

The VIDEX-Server image supports two entry point modes:

  • server (default): start VIDEX-Server

  • sync: run a one-shot workflow to collect metadata from --target, then:

    • add metadata to VIDEX-Server

    • create virtual tables in --videx

Command Template

Notes:

  1. If --videx is not specified, a default database videx_{TARGET_DB} is created in --target.

  2. If you run a separate MariaDB-VIDEX instance, pass that instance as --videx.

  3. If your VIDEX-Server is not <TARGET_HOST>:5001, pass --videx_server explicitly.

Localhost Failures

Inside a container, localhost/127.0.0.1 refers to the container itself.

On Linux Docker Engine, you can reach the host via host.docker.internal using --add-host:

However, if MariaDB-VIDEX itself is also running in a container, reachability can become tricky. Using a routable IP is the most robust approach.

Configure the Plugin

On MariaDB, the VIDEX plugin exposes session system variables.

Example output:

  • videx_server_ip: critical. This is the address (host:port) that MariaDB-VIDEX uses to call VIDEX-Server.

Configure them for your current session before running EXPLAIN on ENGINE=VIDEX tables (replace <VIDEX_SERVER_HOST>:<VIDEX_SERVER_PORT> with your settings):

Quickstart

This section assumes:

  • Users already have one MariaDB instance running.

  • Users want to create the VIDEX schema in the same instance (which means “MariaDB-VIDEX == Target MariaDB”).

  • MariaDB is running and reachable.

  • The routable IP is something like 203.0.113.42 (example only).

  • MariaDB is reachable at, for instance, 203.0.113.42:15508 (example only).

  • VIDEX-Server is reachable at, for instance, 203.0.113.42:5001 (example only).

  • The default user/password credentials are: videx / password.

1

Start VIDEX-Server

2

Build VIDEX schema via videx-sync

3

Run EXPLAIN on the original schema

Run EXPLAIN on your original tables:

circle-info

Collecting histograms may change MariaDB’s cardinality estimates (for instance, histogram-based estimates rather than InnoDB index statistics). Since VIDEX aims to simulate InnoDB index engine behavior, users can set SET SESSION use_stat_tables = NEVER; to make EXPLAIN results more similar between InnoDB and the VIDEX engine.

The EXPLAIN output for the original schema is:

4

Run EXPLAIN on the VIDEX schema

Run EXPLAIN on the VIDEX schema (tables are ENGINE=VIDEX):

The EXPLAIN output for the VIDEX schema is:

Compare the output between Step 3 and Step 4.

Notes and Best Practices

  1. videx-sync can be time-consuming on large schemas, because it needs to collect metadata/statistics. The metadata collection method is extensible; the VIDEX source repository also discusses lighter-weight sampling approaches.

  2. Networking matters. Since VIDEX-Server is often in a container, localhost/127.0.0.1 may not refer to what you expect. The routable IP is recommended for reachability, as it ensures that both MariaDB-VIDEX and the container can reach it.

Last updated

Was this helpful?