MariaDB Xpand Columnar Indexes

Standard row-based tables can add a Columnar index that can be used to greatly speed up aggregation queries.

Overview

MariaDB Xpand supports adding a single Columnar index to each table, though the index can contain multiple columns.

Columnar indexes are available in Xpand 6.

This feature is compatible with Xpand Performance Topology.

Columnar indexes are best suited for queries that "scan" much or all of a table. This includes queries that perform aggregations on large amounts of data, queries that select a subset of columns from large tables, and queries that perform a full table scan of a fact table. Queries tend to match this pattern in analytical, online-analytical processing (OLAP), and data warehousing workloads.

With composite (multi-column) Columnar indexes, filtration is not dependent on column order. If a Columnar index is defined with COLUMNAR INDEX (a, b, c), and the query filters with WHERE b = 1 AND c = 2, the Columnar index can be used to filter the results.

Xpand compresses all Columnar indexes.

The cost_read_column system variable is used for costing columnar reads. The default value of cost_read_column in Xpand 5.3 was 0. For deployments upgrading to Xpand 6 from previous versions, the value of cost_read_column should be set to 1.

Columnar indexes do not support prefix lengths.

Columnar indexes cannot be created on temporary tables.

Columnar indexes cannot be created on generated columns.

Columnar indexes cannot be created on columns that use the JSON or BLOB data types.

Columnar indexes are a Technical Preview in Xpand 6. Software in technical preview should not be used for production workloads.

Columnar Indexes vs. Row-Based Secondary Indexes

MariaDB Xpand 6 supports both Columnar indexes and row-based secondary indexes.

MariaDB Xpand 6 supports the advantages of both row-based secondary indexes and Columnar indexes on the same table. Xpand tables can contain row-based secondary indexes and a single Columnar index, and both types of indexes can be used for queries concurrently: row-based secondary indexes can be used for transactional queries, and Columnar indexes can be used for analytical queries. All indexes are updated when the table's data changes, so both types of indexes work with the same data.

Types of Queries

Columnar indexes and row-based secondary indexes are best suited for different types of queries.

Row-based secondary indexes are best suited for queries that "seek" a specific subset of a table. This includes queries that search for a specific value and queries that filter on a small range of values. Queries tend to match this pattern in transactional or online-transactional processing (OLTP) workloads.

Columnar indexes are best suited for queries that "scan" much or all of a table. This includes queries that perform aggregations on large amounts of data, queries that select a subset of columns from large tables, and queries that perform a full table scan of a fact table. Queries tend to match this pattern in analytical, online-analytical processing (OLAP), and data warehousing workloads.

Filtration with Composite Indexes

For composite (multi-column) indexes, Columnar indexes and row-based secondary indexes support different filtration capabilities.

With composite (multi-column) row-based secondary indexes, filtration is highly dependent on column order. A composite (multi-column) row-based secondary index can only be used to filter on an indexed column if all preceding columns are also part of the filter condition. If a row-based secondary index is defined with INDEX (a, b, c), and the query filters with WHERE b = 1 AND c = 2, the index cannot be used to filter the results because the first indexed column is not part of the filter condition.

With composite (multi-column) Columnar indexes, filtration is not dependent on column order. If a Columnar index is defined with COLUMNAR INDEX (a, b, c), and the query filters with WHERE b = 1 AND c = 2, the Columnar index can be used to filter the results.

Delta Store

A Delta Store is used to decrease the overhead of updates to Columnar indexes. For each Columnar index, MariaDB Xpand 6 uses a Delta Store to store a subset of recently changes rows in a row-based format.

Since a Columnar index stores each column separately, updating a Columnar index can require more I/O than updating a row-based secondary index. Xpand optimizes updates to Columnar indexes by initially updating the Delta Store, and then periodically merging the Delta Store into the Columnar index as a batch operation. The batch operation allows Xpand to amortize the I/O over the whole batch, which uses system resources more efficiently than directly updating the Columnar index for every update.

Examples

Add a Columnar Index to a Table

To add a Columnar index to a table, use the ALTER TABLE statement and specify the new Columnar index with the ADD COLUMNAR INDEX clause:

ALTER TABLE hq_sales.invoices
   ADD COLUMNAR INDEX cols (invoice_date, invoice_total);

For best performance, MariaDB recommends loading data into the table prior to adding a Columnar index.

Create a Table with a Columnar Index

A table can be created with a Columnar index. However, for best performance, MariaDB recommends using the following procedure instead:

  1. Create the table without a Columnar index

  2. Load data into the table

  3. Add the columnar index

If you still want to create a table with a Columnar index, please keep in mind that the Columnar index can lead to performance issues for bulk data loads due to frequent Delta Store merges.

To create a table with a Columnar index, use the CREATE TABLE statement with the COLUMNAR INDEX clause:

CREATE TABLE hq_sales.invoices (
   invoice_id BIGINT UNSIGNED NOT NULL,
   branch_id INT NOT NULL,
   customer_id INT,
   invoice_date DATETIME(6),
   invoice_total DECIMAL(13, 2),
   payment_method ENUM('NONE', 'CASH', 'WIRE_TRANSFER', 'CREDIT_CARD', 'GIFT_CARD'),
   PRIMARY KEY (invoice_id),
   COLUMNAR INDEX cols (invoice_date, invoice_total)
);

Drop a Columnar Index from a Table

To drop a Columnar index from a table, use the ALTER TABLE statement and specify the DROP INDEX clause:

ALTER TABLE hq_sales.customers
   DROP INDEX customer_name;

Procedure: Add a Columnar Index

The following example creates an Xpand table without a Columnar index, and then adds a Columnar index to it:

  1. Connect to the server using MariaDB Client:

    $ mysql --user=root --password --no-auto-rehash
    
  2. Use the CREATE DATABASE statement to create the database:

    CREATE DATABASE hq_sales;
    
  3. Use the CREATE TABLE statement to create the table:

    CREATE TABLE hq_sales.invoices (
       invoice_id BIGINT UNSIGNED NOT NULL,
       branch_id INT NOT NULL,
       customer_id INT,
       invoice_date DATETIME(6),
       invoice_total DECIMAL(13, 2),
       payment_method ENUM('NONE', 'CASH', 'WIRE_TRANSFER', 'CREDIT_CARD', 'GIFT_CARD'),
       PRIMARY KEY (invoice_id)
    );
    
  4. Load data into the table

  5. Alter the table using the ALTER TABLE statement and specify the new Columnar index with the ADD COLUMNAR INDEX clause:

    ALTER TABLE hq_sales.invoices
       ADD COLUMNAR INDEX cols (invoice_date, invoice_total);