1 of 100

Analytics

MariaDB Enterprise offers powerful solutions to break down the barriers to insight. Whether you need to run ad hoc queries on massive datasets or power the most demanding AI workloads.

MariaDB ColumnStore

For fast, ad hoc analytics at scale, MariaDB ColumnStore is a powerful columnar database that can be deployed as a standalone analytics solution or integrated with MariaDB Enterprise Server to act as a powerful query accelerator. It stores data in a columnar format and can be distributed across a cluster of servers, allowing it to execute complex queries in parallel on petabytes of data.

This integration allows you to access your InnoDB data in near-real time, processing it directly in the ColumnStore engine to run fast, parallel OLAP queries straight from your transactional data. This eliminates the need to maintain a separate pipeline or use delayed batch inserts to analyze your live data.

MariaDB ColumnStore

MariaDB Exa

For the ultimate in analytical performance, the joint solution between MariaDB and Exasol connects your mission-critical transactional data to the world’s fastest analytics engine. Available on-premise or in the cloud on platforms like AWS and Microsoft Azure, this solution brings high-speed analytics to any environment.

MariaDB Exa erases the barrier between live operational data and high-speed analytics, leveraging Exasol’s massively parallel processing (MPP) and in-memory engine. It is the ideal solution for powering your most demanding analytics and AI/ML workloads with unmatched speed and efficiency.

MariaDB ColumnStore

Discover MariaDB ColumnStore, the powerful columnar storage engine for analytical workloads. Learn about its architecture, features, and how it enables high-performance data warehousing and analytics.

Quickstart Guides

MariaDB ColumnStore Quickstart Guides provide concise, Docker-friendly steps to quickly set up, configure, and explore the ColumnStore analytic engine.

MariaDB ColumnStore Guide MariaDB ColumnStore Hardware Guide

MariaDB ColumnStore Guide

Quickstart guide for MariaDB ColumnStore

Quickstart Guide: MariaDB ColumnStore

MariaDB ColumnStore is a specialized columnar storage engine designed for high-performance analytical processing and big data workloads. Unlike traditional row-based storage engines, ColumnStore organizes data by columns, which is highly efficient for analytical queries that often access only a subset of columns across vast datasets.

MariaDB ColumnStore Hardware Guide

This page details MariaDB ColumnStore hardware requirements (CPU, RAM, storage, and network).

Overview

MariaDB ColumnStore is designed for analytical workloads and scales linearly with hardware resources. While the performance generally improves with more CPU cores, memory, and servers, understanding the minimum hardware specifications is crucial for successful deployment, especially in development and production environments.

MariaDB ColumnStore's performance directly benefits from additional hardware resources:

More CPU cores enable greater parallel processing, improving query processing time.
More memory allows for more data caching (reducing I/O), and more servers enable a larger distributed architecture.
HDDs vs. SSDs: SSDs don't deliver as much benefit as you might assume because ColumnStore is optimized towards block streaming, which usually performs well enough on HDDs.
Bare metal vs. virtual servers: Bare metal servers are recommended — they provide additional performance because ColumnStore can fully consume CPU cores and memory.

Minimum Hardware Recommendations

The specifications differentiate between a basic development environment and a production-ready setup:

For Development Environments

CPU: A minimum of 8 CPU cores.
Memory (RAM): A minimum of 32 GB.
Storage: Local disk storage is acceptable for development purposes.

For Production Environments

CPU: A minimum of 64 CPU cores.
- This recommendation underscores the highly parallel nature of ColumnStore, which can effectively utilize a large number of cores for analytical processing.
Memory (RAM): A minimum of 128 GB.

Network Interconnectivity

Network interconnectivity plays a role for multi-server deployments.

Minimum Network: A minimum of a 1 Gigabit (1G) network is recommended.
- This facilitates efficient data transfer between nodes via TCP/IP for replication and query processing across the distributed architecture. For optimal performance in heavy-load scenarios, higher bandwidth (for instance, 10G or more) is highly beneficial.

Adhering to these minimum specifications will provide a baseline for ColumnStore functionality. For specific workload requirements, it's always advisable to conduct performance testing and scale hardware accordingly.

AWS Instance Sizes

For AWS, ColumnStore internal testing generally uses m4.4xlarge instance types as a cost-effective middle ground. The R4.8xlarge has also been tested, and performs about twice as fast for about twice the price.

ColumnStore Architecture

MariaDB ColumnStore uses a shared-nothing, distributed architecture with separate modules for SQL and storage, enabling scalable, high-performance analytics.

Topologies Overview

MariaDB offers varied deployment topologies by workload and technology, each named and diagrammed with benefits listed. Custom configurations are also supported.

MariaDB products can be deployed in many different topologies. The topologies described in this section are representative of the overall structure. MariaDB products can be deployed to form other topologies, leverage advanced product capabilities, or combine the capabilities of multiple topologies.

Topologies are the arrangements of nodes and links to achieve a purpose. This documentation describes a few of the many topologies that can be deployed using MariaDB database products.

We group topologies by workload (transactional, analytical, or hybrid) and technologies (Enterprise Spider). Single-node topologies are listed separately.

To help you select the correct topology:

ColumnStore Storage Engine

Overview

MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.

For deployment instructions and available documentation, see "MariaDB Enterprise ColumnStore."

The ColumnStore storage engine has the following features:

Feature

Detail

Resources

Examples

Creating a ColumnStore Table

To create a ColumnStore table, use the statement with the ENGINE=ColumnStore option:

Multi-Node Configuration

To deploy a multi-node Enterprise ColumnStore deployment, a configuration similar to below is required:

Configure the Mandatory Utility User Account

To configure the mandatory utility user account, use the mcsSetConfig command:

ColumnStore Read Replicas

The ColumnStore Read Replica topology is an Alpha release. Do not use it in production without testing in your development environment first.

Overview

The Read Replicas feature in MariaDB ColumnStore enables horizontal scaling of read performance by incorporating read-only nodes into a multi-node cluster. These replicas differ from standard ColumnStore nodes, in that they don't run the WriteEngineServer process. This means Read Replica nodes cannot handle write operations directly — instead, any write queries attempted on a replica are automatically forwarded to a read-write (RW) node.

Replicas utilize shared storage with other nodes in the cluster, ensuring data consistency without duplication. A key requirement is maintaining at least one RW node — a cluster consisting solely of read replicas is not operational and cannot process reads or writes.

Read-only nodes are incompatible with S3 as the storage backend.

Additionally, there is no automatic promotion of a read replica to RW mode if the only RW node fails, which could lead to temporary downtime until manual intervention.

Key Features

Horizontal Read Scaling: Adds compute power for handling more read-intensive queries without impacting write performance.
Write Forwarding: Ensures writes on replicas are redirected to RW nodes, maintaining data integrity.
Shared Storage: Replicas access the same DBRoots as RW nodes, promoting efficiency and reducing storage overhead.

Key Commands

These commands require .

Add Read Replica. To introduce a read-only node for scaling reads, run this command:

Remove Node. To safely remove any node (RW or replica) from the cluster, run this command:

This reassigns resources as needed without cluster disruption.

Verify Status. To monitor the cluster's health and node roles, issue:

Limitations

Node addition is restricted to private IPs only.
Incompatible with S3 storage, limiting use to shared file systems.
No automatic failover or promotion mechanism if the sole RW node goes down, requiring manual recovery.

How-To

Prerequisites

Ensure shared storage is mounted on all nodes (at /var/lib/columnstore/data1 for non-s3 configuration), to ensure data consistency across RW nodes and read replicas.

Refer to for exact mount points details.

Installation and Setup

Set Up MariaDB Repository

Run the following to add the MariaDB repository (adjust "11.4" to the latest stable version):

See for additional details about the ES repo setup.

Install Packages

ColumnStore System Databases

When using ColumnStore, MariaDB Server creates a series of system databases used for operational purposes.

Database

Description

ColumnStore Query Processing

Clients issue a query to the MariaDB Server, which has the ColumnStore storage engine installed. MariaDB Server parses the SQL, identifies the involved ColumnStore tables, and creates an initial logical query execution plan.

Using the ColumnStore storage engine interface (ha_columnstore), MariaDB Server converts involved table references into ColumnStore internal objects. These are then handed off to the ExeMgr, which is responsible for managing and orchestrating query execution across the cluster.

The ExeMgr analyzes the query plan and translates it into a distributed ColumnStore execution plan. It determines the necessary query steps and the execution order, including any required parallelization.

The ExeMgr then references the extent map to identify which PrimProc instances hold the relevant data segments. It applies extent elimination to exclude any PrimProc nodes whose extents do not match the query’s filter criteria.

MariaDB Enterprise Columnstore Locking

Overview

MariaDB Enterprise ColumnStore minimizes locking for analytical workloads, bulk data loads, and online schema changes.

Lockless Reads

Managing ColumnStore

Managing MariaDB ColumnStore involves setup, configuration, and tools like mcsadmin and cpimport for efficient analytics.

Deployment

Installing ColumnStore

This section provides instructions for installing and configuring MariaDB ColumnStore. It covers various deployment scenarios, including single- and multi-node setups with both local and S3 storage.

MariaDB ColumnStore Hardware Guide Single-Node Localstorage Multi-Node Localstorage Single-Node S3 Multi-Node S3

Step 5: Bulk Import of Data

Overview

This page details step 5 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Local storage.

This step bulk imports data to Enterprise ColumnStore.

The instructions were tested against ColumnStore 23.10.

Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

Import the Schema

Before data can be imported into the tables, create a matching schema.

On the primary server, create the schema:

For each database that you are importing, create the database with the statement:

For each table that you are importing, create the table with the statement:

Import the Data

Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.

cpimport

MariaDB Enterprise ColumnStore includes , which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

To import your data from a TSV (tab-separated values) file, on the primary server run :

LOAD DATA INFILE

When data is loaded with the statement, MariaDB Enterprise ColumnStore loads the data using , which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

To import your data from a TSV (tab-separated values) file, on the primary server use statement:

Import from Remote Database

MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using the statement, and then pipe the results into , which is a command-line utility that is designed to efficiently load data in bulk. Alternative methods are available.

To import your data from a remote MariaDB database:

Next Step

Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

This page was step 5 of 5.

This procedure is complete.

Step 6: Install MariaDB MaxScale

Overview

This page details step 6 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

This step installs MariaDB MaxScale 22.08. ColumnStore Object Storage requires 1 or more MaxScale nodes.

Step 5: Bulk Import of Data

Overview

This page details step 5 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.

This step bulk imports data to Enterprise ColumnStore.

Upgrading ColumnStore

Managing ColumnStore Database Environment

Managing MariaDB ColumnStore means deploying its architecture, scaling modules, and maintaining performance through monitoring, optimization, and backups.

Node Maintenance for MariaDB Enterprise Columnstore

Rejoining a Node

To rejoin a node with Enterprise ColumnStore, perform the following procedure.

Performing Rejoin in MaxScale

The node can be configured to rejoin in MaxScale using :

Use or another supported REST client.
Call a module command using the call command command.
As the first argument, provide the name for the module, which is .
As the second argument, provide the module command, which is rejoin .
As the third argument, provide the name of the monitor.
As the fourth argument, provide the name of the server.

For example:

Checking Replication Status with MaxScale

MaxScale is capable of checking the status of using :

List the servers using the list servers command, like this:

If the node properly rejoined, the State column of the node shows Slave, Running.

Switchover of the Primary Node

To switchover to a new primary node with Enterprise ColumnStore, perform the following procedure.

Performing Switchover in MaxScale

The primary node can be switched in MaxScale using :

Use or another supported REST client.
Call a module command using the call command command.
As the first argument, provide the name for the module, which is .
As the second argument, provide the module command, which is switchover .
As the third argument, provide the name of the monitor.

For example:

With the above syntax, MaxScale will choose the most up-to-date replica to be the new primary.

If you want to manually select a new primary, provide the server name of the new primary as the fourth argument:

Checking the Replication Status with MaxScale

MaxScale is capable of checking the status of using :

List the servers using the list servers command, like this:

If switchover was properly performed, the State column of the new primary shows Master, Running.

View and Clear Table Locks

MariaDB Enterprise ColumnStore acquires table locks for some operations, and it provides utilities to view and clear those locks.

MariaDB Enterprise ColumnStore acquires table locks for some operations, such as:

DDL statements
DML statements
Bulk data loads

If an operation fails, the table lock does not always get released. If you try to access the table, you can see errors like the following:

To solve this problem, MariaDB Enterprise ColumnStore provides two utilities to view and clear the table locks:

cleartablelock
viewtablelock

Viewing Table Locks

The viewtablelock utility shows table locks currently held by MariaDB Enterprise ColumnStore:

To view all table locks:

To view table locks for a specific table, specify the database and table:

Clearing Table Locks

The cleartablelock utility clears table locks currently held by MariaDB Enterprise ColumnStore.

To clear a table lock, specify the lock ID shown by the viewtablelock utility:

Backup & Restore

MariaDB ColumnStore backup and restore manage distributed data using snapshots or tools like mariadb-backup, with restoration ensuring cluster sync via cpimport or file system recovery.

Backup and Restore Overview

Overview

MariaDB Enterprise ColumnStore supports backup and restore.

System of Record

ColumnStore Table Size Limitations

MariaDB ColumnStore has a hard limit of 4096 columns per table.

However, it's likely that you run into other limitations before hitting that limit, including:

Row size limit of tables. This varies, depending on the storage engine you're using. For example, which indirectly limits the number of columns.
Size limit of .frm

Security

MariaDB ColumnStore uses MariaDB Server’s security—encryption, access control, auditing, and firewall—for secure analytics.

ColumnStore Security Vulnerabilities

Use Cases

MariaDB ColumnStore is ideal for real-time analytics and complex queries on large datasets across industries.

About MariaDB ColumnStore

MariaDB ColumnStore is a columnar storage engine that utilizes a massively parallel distributed data architecture. It's a columnar storage system built by porting InfiniDB 4.6.7 to MariaDB and released under the GPL license.

is available as a storage engine for MariaDB Server. Before then, it is available as a separate download.

Release notes and other documentation for ColumnStore is also available in the Enterprise docs section of the MariaDB website. For example:

High Availability

MariaDB ColumnStore ensures high availability with multi-node setups and shared storage, while MaxScale adds monitoring and failover for continuous analytics.

Query Tuning

MariaDB ColumnStore query tuning optimizes analytics using data types, joins, projection elimination, WHERE clauses, and EXPLAIN for performance insights.

Collecting Statistics with ANALYZE TABLE

Overview

In MariaDB Enterprise ColumnStore 6, the uses optimizer statistics in its query planning process.

ColumnStore uses the optimizer statistics to add support for queries that contain circular inner joins.

In Enterprise ColumnStore 5 and before, ColumnStore would raise the following error when a query containing a circular inner join was executed:

The optimizer statistics store each column's NDV (Number of Distinct Values), which can help the ExeMgr process choose the optimal join order for queries with circular joins. When Enterprise ColumnStore executes a query with a circular join, the query's execution can take longer if ColumnStore chooses a sub-optimal join order. When you collect optimizer statistics for your ColumnStore tables, the ExeMgr process is less likely to choose a sub-optimal join order.

Query Plans and Optimizer Trace

MariaDB ColumnStore's query plans and Optimizer Trace show how analytical queries run in parallel across its distributed, columnar architecture, aiding performance tuning.

Execution Plan (CSEP)

Overview

The ColumnStore storage engine uses a ColumnStore Execution Plan (CSEP) to represent a query plan internally.

When the select handler receives the SELECT_LEX object, it transforms it into a CSEP as part of the query planning and optimization process. For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation."

Viewing the CSEP

The CSEP for a given query can be viewed by performing the following:

Calling the calSetTrace(1) function:

Executing the query:

Calling the calGetTrace() function:

Clients & Tools

MariaDB ColumnStore supports standard MariaDB tools, BI connectors (e.g., Tableau, Power BI), data ingestion (cpimport, Kafka), and REST APIs for admin.

StorageManager

The ColumnStore StorageManager manages columnar data storage and retrieval, optimizing analytical queries.

Certified S3 Object Storage Providers

Hardware (On Premises)

Quantum ActiveScale
(Formerly known as CleverSafe)

Cloud (IaaS)

Software-Based

Due to the frequent code changes and deviation from the AWS standards, none are approved at this time.

Sample storagemanager.cnf

# Sample storagemanager.cnf

[ObjectStorage]
service = S3
object_size = 5M
metadata_path = /var/lib/columnstore/storagemanager/metadata
journal_path = /var/lib/columnstore/storagemanager/journal
max_concurrent_downloads = 21
max_concurrent_uploads = 21
common_prefix_depth = 3

[S3]
region = us-west-1
bucket = my_columnstore_bucket
endpoint = s3.amazonaws.com
aws_access_key_id = AKIAR6P77BUKULIDIL55
aws_secret_access_key = F38aR4eLrgNSWPAKFDJLDAcax0gZ3kYblU79

[LocalStorage]
path = /var/lib/columnstore/storagemanager/fake-cloud
fake_latency = n
max_latency = 50000

[Cache]
cache_size = 2g
path = /var/lib/columnstore/storagemanager/cache

Note: A region is required even when using an on-premises solution like ActiveScale due to header expectations within the API.

Using StorageManager With IAM Role

AWS IAM Role Configuration

From Columnstore 5.5.2, you can use AWS IAM roles in order to connect to S3 buckets without explicitly entering credentials into the storagemanager.cnf config file.

You need to modify the IAM role of your Amazon EC2 instance to allow for this. Please follow the AWS before beginning this process.

It is important to note that you must update the AWS S3 endpoint based on your chosen region; otherwise, you might face delays in propagation as discussed

Data Ingestion Methods & Tools

Learn about data ingestion for MariaDB ColumnStore. This section covers various methods and tools for efficiently loading large datasets into your columnar database for analytical workloads.

ColumnStore provides several mechanisms to ingest data:

provides the fastest performance for inserting data and ability to route data to particular PrimProc nodes. Normally, this should be the default choice for loading data .
provides another means of bulk inserting data.

Job Steps

Overview

When Enterprise ColumnStore executes a query, the ExeMgr process on the initiator/aggregator node translates the ColumnStore execution plan (CSEP) into a job list. A job list is a sequence of job steps.

Enterprise ColumnStore uses many different types of job steps that provide different scalability benefits:

Some types of job steps perform operations in a distributed manner, using multiple nodes to operate to different extents. Distributed operations provide horizontal scalability.
Some types of job steps perform operations in a multi-threaded manner using a thread pool. Performing multi-threaded operations provides vertical scalability.

As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute job steps.

For additional information, see ".".

Batch Primitive Step (BPS)

Enterprise ColumnStore defines a batch primitive step to handle many types of tasks, such as scanning/filtering columns, JOIN operations, aggregation, functional filtering, and projecting (putting values into a SELECT list).

In calGetTrace() output, a batch primitive step is abbreviated BPS.

Batch primitive steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the batch primitive step to one extent at a time. The PrimProc process uses a thread pool to operate on individual blocks within the extent in parallel.

Cross Engine Step (CES)

Enterprise ColumnStore defines a cross-engine step to perform cross-engine joins, in which a ColumnStore table is joined with a table that uses a different storage engine.

In calGetTrace() output, a cross-engine step is abbreviated CES.

Cross-engine steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

Enterprise ColumnStore can perform cross-engine joins when the mandatory utility user is properly configured.

For additional information, refer to the ""

Dictionary Structure Step (DSS)

Enterprise ColumnStore defines a dictionary structure step to scan the dictionary extents that ColumnStore uses to store variable-length string values.

In calGetTrace() output, a dictionary structure step is abbreviated DSS.

Dictionary structure steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the dictionary structure step to one extent at a time. It uses a thread pool to operate on individual blocks within the extent in parallel.

Dictionary structure steps can require a lot of I/O for a couple of reasons:

Dictionary structure steps do not support extent elimination, so all extents for the column must be scanned.
Dictionary structure steps must read the column extents to find each pointer and the dictionary extents to find each value, so it doubles the number of extents to scan.

It is generally recommended to avoid queries that will cause dictionary scans.

For additional information, see "Avoid Creating Long String Columns".

Hash Join Step (HJS)

Enterprise ColumnStore defines a hash join step to perform a hash join between two tables.

In calGetTrace() output, a hash join step is abbreviated HJS.

Hash join steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

Enterprise ColumnStore performs the hash join in memory by default. If you perform large joins, you may be able get better performance by changing some configuration defaults with mcsSetConfig:

Enterprise ColumnStore can be configured to use more memory for in-memory hash joins.
Enterprise ColumnStore can be configured to use disk-based joins.

For additional information, see "" and "".

Having Step (HVS)

Enterprise ColumnStore defines a having step to evaluate a HAVING clause on a result set.

In calGetTrace() output, a having step is abbreviated HVS.

Subquery Step (SQS)

Enterprise ColumnStore defines a subquery step to evaluate a subquery.

In calGetTrace() output, a subquery step is abbreviated SQS.

Tuple Aggregation Step (TAS)

Enterprise ColumnStore defines a tuple aggregation step to collect intermediate aggregation prior to the final aggregation and evaluation of the results.

In calGetTrace() output, a tuple aggregation step is abbreviated TAS.

Tuple aggregation steps are primarily evaluated by the ExeMgr process on the initiator/aggregator node. However, the PrimProc process on each node also plays a role, since the PrimProc process on each node provides the intermediate aggregation results to the ExeMgr process on the initiator/aggregator node.

Tuple Annexation Step (TNS)

Enterprise ColumnStore defines a tuple annexation step to perform the final aggregation and evaluation of the results.

In calGetTrace() output, a tuple annexation step is abbreviated TNS.

Tuple annexation steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

Enterprise ColumnStore 5 performs aggregation operations in memory. As a consequence, more complex aggregation operations require more memory in that version.

In Enterprise ColumnStore 6, disk-based aggregations can be enabled.

For additional information, see "".

Tuple Union Step (TUS)

Enterprise ColumnStore defines a tuple union step to perform a union of two subqueries.

In calGetTrace() output, a tuple union step is abbreviated TUS.

Tuple union steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

Tuple Constant Step (TCS)

Enterprise ColumnStore defines a tuple constant step to evaluate constant values.

In calGetTrace() output, a tuple constant step is abbreviated TCS.

Tuple constant steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

Window Function Step (WFS)

Enterprise ColumnStore defines a window function step to evaluate window functions.

In calGetTrace() output, a window function step is abbreviated WFS.

Window function steps are evaluated locally by the on the initiator/aggregator node.

Performance Concepts

Introduction

The high level components of the ColumnStore architecture are:

PrimProc: PrimProc (Primitives Processor) is responsible for parsing the SQL requests into an optimized set of primitive job steps executed by one or more servers. PrimProc is thus responsible for query optimization and orchestration of query execution by the servers. While every instance has their own PrimProc in a multi-server deployment, each query begins and ends on the same PrimProc it originated from. A database load balancer such as MariaDB MaxScale can be deployed to appropriately balance external requests against individual servers. PrimProc also executes granular job steps received from the server (mariadbd) in a multi-threaded manner. ColumnStore allows distribution of the work across many servers.
Extent Maps: ColumnStore maintains metadata about each column in a shared distributed object known as the Extent Map. The primary node references the Extent Map to help assist in generating the correct primitive job steps. The primary node server references the Extent Map to identify the correct disk blocks to read. Each column is made up of one or more files and each file can contain multiple extents. As much as possible the system attempts to allocate contiguous physical storage to improve read performance.
Storage: ColumnStore can use either local storage or shared storage (e.g. SAN or EBS) to store data. Using shared storage allows for data processing to fail over to another node automatically in case of a server failing.

Data Loading

The system supports full MVCC ACID transactional logic via Insert, Update, and Delete statements. The MVCC architecture allows for concurrent query and DML / batch load. Although DML is supported, the system is optimized more for batch inserts and so larger data loads should be achieved through a batch load. The most flexible and optimal way to load data is via the cpimport tool. This tool optimizes the load path and can be run centrally or in parallel on each server.

If the data contains a time or (time correlated ascending value) column then significant performance gains will be achieved if the data is sorted by this field and also typically queried with a where clause on that column. This is because the system records a minimum and maximum value for each extent providing for a system maintained range partitioning scheme. This allows the system to completely eliminate scanning an extent map if the query includes a where clause for that field limiting the results to a subset of extent maps.

Query Execution

MariaDB ColumnStore has its own query optimizer and execution engine distinct from the MariaDB server implementation. This allows for scaling out query execution to multiple servers, and to optimize for handling data stored as columns rather than rows. As such, the factors influencing query performance are very different:

A query is first parsed by the MariaDB server (mariadbd) process and passed through to the ColumnStore storage engine. This passes the request onto the PrimProc process which is responsible for optimizing and orchestrating execution of the query. The PrimProc module's optimizer creates a series of batch primitive steps that are executed on all nodes in the cluster. Since multiple servers can be deployed, this allows for scale-out execution of the queries. The optimizer attempts to process query execution in parallel. However, certain operations inherently must be executed centrally, for example final result ordering. Filtering, joins, aggregates, and GROUP BY clauses are general.y pushed down and executed in parallel in PrimProc on all servers. In PrimProc, batch primitive steps are performed at a granular level where individual threads operate on individual 1K-8K blocks within an extent. This enables a larger multi-core server to be fully consumed and scale out within a single server. The current batch primitive steps available in the system include:

Single Column Scan: Scan one or more Extents for a given column based on a single column predicate, including operators like =, <>, IN (list), BETWEEN, and ISNULL. See the first scan section of for additional details on tuning this.
Additional Single Column Filters: Project additional columns for any rows found by a previous scan and apply additional single column predicates as needed. Access of blocks is based on row identifier, going directly to the blocks. See the additional column read section of

ColumnStore Query Execution Paradigms

The following items should be considered when thinking about query execution in ColumnStore vs a row based store such as InnoDB.

Data Scanning and Filtering

ColumnStore is optimized for large scale aggregation / OLAP queries over large data sets. As such indexes typically used to optimize query access for row based systems do not make sense since selectivity is low for such queries. Instead ColumnStore gains performance by only scanning necessary columns, utilizing system maintained partitioning, and utilizing multiple threads and servers to scale query response time.

Since ColumnStore only reads the necessary columns to resolve a query, only include the necessary columns required. For example, SELECT * is significantly slower than SELECT col1, col2 FROM tbl.

Datatype size is important. If say you have a column that can only have values 0 through 100 then declare this as a tinyint as this will be represented with 1 byte rather than 4 bytes for int. This reduces the I/O cost by 4 times.

For string types, an important threshold is CHAR(9) and VARCHAR(8) or greater. Each column storage file uses a fixed number of bytes per value. This enables fast positional lookup of other columns to form the row. Currently the upper limit for columnar data storage is 8 bytes. So. for strings longer than this, the system maintains an additional 'dictionary' extent where the values are stored. The columnar extent file then stores a pointer into the dictionary. For example, it is more expensive to read and process a VARCHAR(8) column than a CHAR(8) column. Where possible, you get better performance if you can utilize shorter strings, especially if you avoid the dictionary lookup. All TEXT/BLOB data types in ColumnStore 1.1 onward utilize a dictionary and do a multiple-block 8KB lookup to retrieve that data if required. The longer the data, the more blocks are retrieved, and the greater is a potential performance impact.

In a row-based system, adding redundant columns adds to the overall query cost, but in a columnar system a cost is only occurred if the column is referenced. Therefore, additional columns should be created to support different access paths. For instance, store a leading portion of a field in one column to allow for faster lookups, but additionally store the long-form value as another column. Scans on a shorter code or a leading-portion column are faster.

ColumnStore distributes function application across all nodes for greater performance, but this requires a distributed implementation of the function in addition to the MariaDB server implementation. See for the full list.

Its important to note that ColumnStore does not have a cost based optimizer, so for optimal extent elimination and performance, your first WHERE clause predicate order should be based on the same column order that the data is imported by. Example: Most use cases with a date column benefit from a natural sort. (Today's data are being inserted after yesterday's data.) Having the first column to filter by date helps efficiently filter through records. WHERE DATE='x' outperforms a query based on a column with random values as the first predicate. Compare different query plans using calSetTrace and calGetTrace. Optimizing for the lowest PIO/LIO and highest PBE. See also .

Joins

Hash joins are utilized by ColumnStore to optimize for large scale joins and avoid the need for indexes and the overhead of nested loop processing. ColumnStore maintains table statistics so as to determine the optimal join order. This is implemented by first identifying the small table side (based on extent map data) and materializing the necessary rows from that table for the join. If the size of this is less than the configuration setting PmMaxMemorySmallSide, the join is pushed down into PrimProc for distributed in-memory processing. Otherwise, the larger side rows is not processed in a distributed manner for joining, and only the WHERE clause on that side is executed across all PrimProc modules in the cluster. If the join is too large for memory, disk-based join can be enabled to allow the query to complete.

Aggregations

Similarly to scalar functions ColumnStore distributes aggregate evaluation as much as possible. However some post processing is required to combine the final results. Enough memory must exist to handle queries with a very large number of values in the aggregate columns.

Aggregation performance is also influenced by the number of distinct aggregate column values. Generally, the same number of rows with 100 distinct values computes faster than 10000 distinct values. This is due to increased memory management as well as transfer overhead.

SELECT COUNT() is internally optimized to be SELECT COUNT(COL-N), where COL-N is the column that uses the least number of bytes for storage. For example it would be pick a CHAR(1) column over int column because CHAR(1) uses 1 byte for storage and int uses 4 bytes. The implementation still honors ANSI semantics in that SELECT COUNT() will include nulls in the total count as opposed to an explicit SELECT(COL-N) which excludes NULL

ORDER BY and LIMIT

ORDER BY and LIMIT are implemented at the very end by the mariadbd server process on the temporary result set table. This means that the unsorted results must be fully retrieved before either are applied. The performance overhead of this is minimal on small to medium results, but for larger results, it can be significant.

Complex Queries

Subqueries are executed in sequence thus the subquery intermediate results must be materialized and then the join logic applies with the outer query.

Window functions are executed as part of final aggregation in PrimProc due to the need for ordering of the window results. The ColumnStore window function engines uses a dedicated faster sort process.

Partitioning

Automated system partitioning of columns is provided by ColumnStore. As data is loaded into extent maps, the system will capture and maintain min/max values of column data in that extent map. New rows are appended to each extent map until full at which point a new extent map is created. For column values that are ordered or semi-ordered this allows for very effective data partitioning. By using the min and max values, entire extent maps can be eliminated and not read to filter data. This generally works particularly well for time dimension / series data or similar values that increase over time.

Mariadb Enterprise Columnstore Query Evaluation

Overview

MariaDB Enterprise ColumnStore is a smart storage engine designed to efficiently execute analytical queries using distributed query execution and massively parallel processing (MPP) techniques.

Scalability

MariaDB Enterprise ColumnStore is designed to achieve vertical and horizontal scalability for production analytics using distributed query execution and massively parallel processing (MPP) techniques.

Enterprise ColumnStore evaluates each query as a sequence of job steps using sophisticated techniques to get the best performance for complex analytical queries. Some types of job steps are designed to scale with the system's resources. As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute those types of job steps.

Enterprise ColumnStore stores each column on disk in extents. The storage format is designed to maintain scalability, even as the table grows. If an operation does not read parts of a large table, I/O costs are reduced. Enterprise ColumnStore uses a technique called extent elimination that compares the maximum and minimum values in the extent map to the query's conditions, and it avoids scanning extents that don't satisfy the conditions.

Enterprise ColumnStore provides exceptional scalability for analytical queries. Enterprise ColumnStore's design supports targeted scale-out to address increased workload requirements, whether it is a larger query load or increased storage and query processing capacity.

Horizontal Scalability

MariaDB Enterprise ColumnStore provides horizontal scalability by executing some types of job steps in a distributed manner using multiple nodes.

When Enterprise ColumnStore is evaluating a job step, the ExeMgr process or facility on the initiator/aggregator node requests the PrimProc process on each node to perform the job step on different extents in parallel. As more nodes are added, Enterprise ColumnStore can perform more work in parallel.

Enterprise ColumnStore also uses massively parallel processing (MPP) techniques to speed up some types of job steps. For some types of aggregation operations, each node can perform an initial local aggregation, and then the initiator/aggregator node only needs to combine the local results and perform a final aggregation. This technique can be very efficient for some types of aggregation operations, such as for queries that use the AVG(), COUNT(), or SUM() aggregate functions.

Vertical Scalability

MariaDB Enterprise ColumnStore provides vertical scalability by executing some types of job steps in a multi-threaded manner using a thread pool.

When the PrimProc process on a node receives work, it executes the job step on an extent in a multi-threaded manner using a thread pool. Each thread operates on a different block within the extent. As more CPUs are added, Enterprise ColumnStore can work on more blocks in parallel.

Extent Elimination

MariaDB Enterprise ColumnStore uses extent elimination to scale query evaluation as table size increases.

Most databases are row-based databases that use manually-created indexes to achieve high performance on large tables. This works well for transactional workloads. However, analytical queries tend to have very low selectivity, so traditional indexes are not typically effective for analytical queries.

Enterprise ColumnStore uses extent elimination to achieve high performance, without requiring manually created indexes. Enterprise ColumnStore automatically partitions all data into . Enterprise ColumnStore stores the minimum and maximum values for each extent in the . Enterprise ColumnStore uses the minimum and maximum values in the extent map to perform extent elimination.

When Enterprise ColumnStore performs extent elimination, it compares the query's join conditions and filter conditions (i.e., WHERE clause) to the minimum and maximum values for each extent in the extent map. If the extent's minimum and maximum values fall outside the bounds of the query's conditions, Enterprise ColumnStore skips that extent for the query.

Extent elimination is automatically performed for every query. It can significantly decrease I/O for columns with clustered values. For example, extent elimination works effectively for series, ordered, patterned, and time-based data.

Custom Select Handler

The ColumnStore storage engine plugin implements a custom select handler to fully take advantage of Enterprise ColumnStore's capabilities.

All storage engines interact with ES using an internal handler API, which is highly extensible. Storage engines can implement different features by implementing different methods within the handler API.

For select statements, the handler API transforms each query into a SELECT_LEX object, which is provided to the select handler.

The generic select handler is not optimal for Enterprise ColumnStore, because:

Enterprise ColumnStore selects data by column, but the generic selects handler selects data by row
Enterprise ColumnStore supports parallel query evaluation, but the generic select handler does not
Enterprise ColumnStore supports distributed aggregations, but the generic select handler does not

Smart Storage Engine

The ColumnStore storage engine plugin is known as a smart storage engine, because it implements a custom select handler. MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.

If a storage engine implements a custom select handler, it is known as a smart storage engine.

As a smart storage engine, the ColumnStore storage engine plugin tightly integrates Enterprise ColumnStore with ES, but it has enough independence to efficiently execute analytical queries using a completely unique approach.

Configure the Select Handler

The ColumnStore storage engine can use either the custom select handler or the generic select handler. The select handler can be configured using the columnstore_select_handler system variable:

Value

Description

Joins

MariaDB Enterprise ColumnStore performs join operations using hash joins.

By default, hash joins are performed in memory.

Configure In-Memory Joins

MariaDB Enterprise ColumnStore can be configured to allocate more memory for hash joins.

The relevant configuration options are:

Section

Option

Description

For example, to configure Enterprise ColumnStore to use more memory for hash joins using the mcsSetConfig utility:

Configure Disk-Based Joins

MariaDB Enterprise ColumnStore can be configured to perform disk-based joins.

The relevant configuration options are:

Section

Option

Description

For example, to configure Enterprise ColumnStore to perform disk-based joins using the mcsSetConfig utility:

Aggregations

MariaDB Enterprise ColumnStore performs aggregation operations on all nodes in a distributed manner, and then all nodes send their results to a single node, which combines the results and performs the final aggregation.

By default, aggregation operations are performed in memory.

Configure Disk-Based Aggregations

In Enterprise ColumnStore 5.6.1 and later, disk-based aggregations can be configured.

The relevant configuration options are:

Section

Option

Description

For example, to configure Enterprise ColumnStore to perform disk-based aggregations using the mcsSetConfig utility:

Query Planning

The ColumnStore storage engine plugin is a smart storage engine, so MariaDB Enterprise ColumnStore to plan its own queries using the .

MariaDB Enterprise ColumnStore's query planning is divided into two steps:

ES provides the query's SELECT_LEX object to the . The custom select handler builds a ColumnStore Execution Plan (CSEP).
The custom select handler provides the CSEP to the on the same node. ExeMgr performs and creates a job list.

ExeMgr Process/Facility

The ColumnStore storage engine provides the CSEP to the ExeMgr process or facility on the same node, which will act as the initiator/aggregator node for the query.

Starting with MariaDB Enterprise ColumnStore 22.08, the ExeMgr facility has been integrated into the PrimProc process, so it is no longer a separate process.

ExeMgr performs multiple tasks:

Performs extent elimination.
Views the optimizer statistics.
Transforms the CSEP to a job list, which consists of job steps.

Query Evaluation Process

When Enterprise ColumnStore executes a query, it goes through the following process:

The client or application sends the query to MariaDB MaxScale's listener port.
The query is processed by the Read/Write Split Router (readwritesplit) service associated with the listener.
The service routes the query to the ES TCP port on a ColumnStore node.

The handler interface builds a SELECT_LEX object to represent the query.
The handler interface provides the SELECT_LEX object to the ColumnStore storage engine's select handler.
The select handler transforms the SELECT_LEX

ExeMgr transforms the CSEP into a job list, which consists of job steps.
ExeMgr evaluates each job step sequentially.

If it is a non-distributed job step, ExeMgr evaluates the job step itself.
If it is a distributed job step, ExeMgr provides the job step to the PrimProc process on each node. The PrimProc process on each node evaluates the job step in a multi-threaded manner using a thread pool. After the PrimProc process on each node evaluates its job step, the results are returned to ExeMgr on the initiator/aggregator node as a Row Group.

After all job steps are evaluated, ExeMgr returns the results to ES.
ES returns the results to MaxScale.
MaxScale returns the results to the client or application.

Adding a Node

Adding a Node to MariaDB Enterprise ColumnStore

To add a new node to Enterprise ColumnStore, perform the following procedure.

Deploying Enterprise ColumnStore

Before you can add a node to Enterprise ColumnStore, confirm that the Enterprise ColumnStore software has been deployed on the node in the desired topology.

For additional information, see "".

Backing Up MariaDB Data Directory on the Primary Server

Before the new node can be added, its MariaDB data directory must be consistent with the Primary Server. To ensure that it is consistent, take a backup of the Primary Server:

The instructions below show how to perform a backup using .

On the Primary Server, take a full backup:
Confirm successful completion of the backup operation.
On the Primary Server, prepare the backup:
Confirm successful completion of the prepare operation.

Restoring the Backup on the New Node

To make the new node consistent with the Primary Server, restore the new backup on the new node:

On the Primary Server, copy the backup to the new node:
On the new node, restore the backup using .
On the new node, fix the file permissions of the restored backup:

Starting the Enterprise ColumnStore Services

The Enterprise Server. Enterprise ColumnStore, and CMAPI services can be started using the systemctl command. In case the services were started during the installation process, use the restart command.

Perform the following procedure on the new node:

Start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:
Start and disable the MariaDB Enterprise ColumnStore service, so that it does not start automatically upon reboot:
Note
The Enterprise ColumnStore service should not be enabled in a multi-node deployment. The Enterprise ColumnStore service will be started as-needed by the CMAPI service, so it does not require starting automatically upon reboot.

Configuring MariaDB Replication

MariaDB Enterprise ColumnStore requires MariaDB Replication, which must be configured.

Get the GTID position that corresponds to the restored backup.
If the backup was taken with , this position will be located in xtrabackup_binlog_info:
The GTID position from the above output is 0-1-2001,1-2-5139.
Connect to the Replica Server using using the root@localhost

Adding the Node to Enterprise ColumnStore

The new node must be added to Enterprise ColumnStore using :

Add the node using the endpoint path
Use a , such as curl
Format the JSON output using jq for enhanced readability

For example, if the primary node's host name is mcs1 and the new node's IP address is 192.0.2.3:

In ES 10.5.10-7 and later:
In ES 10.5.9-6 and earlier:

Example output:

Checking Enterprise ColumnStore Status

To confirm that the node was properly added, the status of Enterprise ColumnStore should be checked using :

Check the status using the endpoint path

For example, if the primary node's host name is mcs1:

Example output:

Adding a Server to MaxScale

A server object for the new node must also be added to MaxScale using :

Use or another supported REST client
Add the server object using the create server command
As the first argument, provide a name for the server

For example:

Verifying the Server in MaxScale

To confirm that the server object was properly added, the server objects should be checked using :

Show the server objects using the show servers command

For example:

Linking to Monitor in MaxScale

The server object for the new node must be linked to the monitor using :

Link a server object to the monitor using the link monitor command
As the first argument, provide the name of the monitor
As the second argument, provide the name of the server

Checking the Monitor in MaxScale

To confirm that the server object was properly linked to the monitor, the monitor should be checked using :

Show the monitors using the show monitors command

For example:

Linking to Service in MaxScale

The server object for the new node must be linked to the service using :

Link the server object to the service using the link service command
As the first argument, provide the name of the service
As the second argument, provide the name of the server

Checking the Service in MaxScale

To confirm that the server object was properly linked to the service, the service should be checked using :

Show the services using the show services command

For example:

Checking the Replication Status with MaxScale

MaxScale is capable of checking the status of using :

List the servers using the list servers command

For example:

If the new node is properly replicating, then the State column will show Slave, Running.

Upgrading MariaDB Enterprise ColumnStore (Alpha)

This page documents an Alpha version of the upgrade procedure using the mcs install_es command. Behavior may change. Validate in a non‑production environment first.

This guide explains how to upgrade MariaDB Enterprise Server (ES) and MariaDB Enterprise ColumnStore across all nodes in a cluster using the unified mcs command-line tool that you have to run only once.

The mcs command must be run as root. Either become root, or prefix the mcs commands on this page with sudo.

The mcs install_es command:

Validates your MariaDB Enterprise Repository access using an ES API token.
Stops ColumnStore and MariaDB services in a controlled sequence.
Installs/configures the ES repository for the target version.

Prerequisites

Administrative privileges on all cluster nodes (package installation and service management required).
A valid ES API token with access to the MariaDB Enterprise Repository.
Network access from the nodes to the MariaDB Enterprise Repository endpoints.

Related docs:

General backup and restore guidance:

Always back up your data before upgrading. While the tool performs a pre‑upgrade backup of DBRM and configs, it is not a substitute for a full database backup.

Command Overview

The command can target a specific ES version, or use the latest tested version (currently latest 10.6 version).

Install latest tested version (if you omit the --version option, mcs uses the latest version):

Install a specific version:

Proceed even if nodes report different installed package versions (use the majority version as baseline):

Options summary:

--token TEXT: ES API Token to use for the upgrade (required) — get it .
-v, --version TEXT: ES version to install; if omitted or set to latest, upgrades to the latest tested version.

Before you Begin

Stop or pause write workloads and heavy ingestion (e.g., cpimport, large INSERT/LOAD DATA jobs).
Drain or put traffic managers/proxies (for example, MaxScale) into maintenance/drain mode.
Ensure you have administrative/SSH and package manager access on all nodes.

What mcs install_es Does

Validate token and target version.

If --version=latest, the tool resolves the latest tested ES version.
If a specific version is requested, it is validated against the repository. Some versions could exists only for specific operating systems.

Post-Upgrade Checks

Run mcs cluster status to verify all services are up and the cluster is healthy. In case of a failure:
- Verify CMAPI readiness on all nodes (for example, via mcs or an external monitoring tool).
Run a quick smoke test:

Downgrades

Downgrades are supported up to MariaDB 10.6.9-5 and ColumnStore 22.08.4.
When downgrading, the tool doesn't automatically restart services. Complete these steps manually:
1. Start MariaDB on each node (for example, via your service manager).

Downgrades can cause data loss or cluster inconsistency if not planned and validated. Always test and ensure backups are restorable.

Verification and Logs

After a successful upgrade, or after downgrading and a manual restart:

Validate that CMAPI is ready on all nodes: mcs cmapi is-ready
Check ColumnStore and MariaDB services are running and the cluster is healthy: mcs cluster status

The mcs install_es command writes a detailed run log to:

/var/tmp/mcs_cli_install_es.log

If CMAPI readiness times out or services do not start cleanly, review:

CMAPI logs: /var/log/mariadb/columnstore/cmapi_server.log
Service logs on each node: /var/log/mariadb/columnstore/
The install_es log file (/var/tmp/mcs_cli_install_es.log

Known Issues and Limitations (Alpha/Beta)

Mixed package versions across nodes.
- If nodes report different installed versions of Server/ColumnStore/CMAPI, the command fails with a mismatch message.
- You can force continuation with --ignore-mismatch; the tool uses the majority version per package as the baseline, but this carries risk—align versions whenever possible.

Troubleshooting

Re‑run with -v/--verbose to enable console debug logging.
Inspect /var/tmp/mcs_cli_install_es.log for the complete sequence and API responses.
If package repository installation fails, verify token validity and outbound access from all nodes.

Environment and Network Requirements

Cluster state: ColumnStore cluster should be healthy before starting.
Node access: All nodes must be reachable (SSH/admin access) and responsive.
Disk space: Ensure sufficient free space for package downloads and pre-upgrade backups.

Additional Usage Example (Downgrade)

Downgrades can be destructive.

This prompts for confirmation. After downgrade, services are not restarted automatically; start MariaDB and the ColumnStore cluster manually and verify health.

Recovery Procedures

If the upgrade fails or CMAPI does not become ready on all nodes:

Review the detailed log at /var/tmp/mcs_cli_install_es.log for errors.
Check service status on each node:
- systemctl status mariadb

Best Practices

Prior to upgrading:
- Create a full backup and verify restore procedures.
- Test the process in staging with similar topology/data.

Support and Reporting Issues

Contact MariaDB Support if you encounter unexpected failures, data issues, or performance regressions. Provide:

The complete log file: /var/tmp/mcs_cli_install_es.log .
The mcs review logs: mcs review --logs .
The exact command used (with parameters, masking sensitive values).

ColumnStore System Variables

Variables

columnstore_cache_inserts

Scope: Global Dynamic: No (requires restart) Command Line: --columnstore-cache-inserts[={0|1}] Default: OFF Description: The feature can be enabled or disabled at the global level. When enabled, INSERT operations are directed to a memory‑optimized Aria cache table, which serves as a temporary buffer before the data is flushed into ColumnStore storage.

columnstore_cache_flush_threshold

Scope: Global / Session Dynamic: Yes Command Line: --columnstore-cache-flush-threshold=# Default: 500000 Description: Specifies the number of cached rows that trigger an automatic flush from the Aria cache table to the ColumnStore table. For tuning guidance, see .

columnstore_cache_use_import

Scope: Global Dynamic: Yes Command Line: --columnstore-cache-use-import[={0|1}] Default: OFF Description: When the insert cache is enabled, flush operations utilize the cpimport utility to achieve improved performance. When the feature is disabled, flushes are executed using ColumnStore’s internal batch processing mode. For details on performance trade-offs, see .

columnstore_diskjoin_force_run

Controls whether disk joins are forced to run even if they are not estimated to be the most efficient execution plan. This can be useful for debugging purposes or for situations where the optimizer's estimates are not accurate.
Scope: global, session
Data type:

columnstore_diskjoin_max_partition_tree_depth

Sets the maximum depth of the partition tree that can be used for disk joins. A higher value allows for more complex joins, but may also increase the memory usage and execution time.
Scope: global, session
Data type:

columnstore_max_allowed_in_values

Sets the maximum number of values that can be used in an IN predicate on a Columnstore table. This limit helps to prevent performance issues caused by queries with a large number of IN values.
Scope: global, session
Data type:

columnstore_max_pm_join_result_count

Sets the maximum number of rows that can be returned by a parallel merge join on a Columnstore table. This limit helps to prevent memory issues caused by joins that return a large number of rows.
Scope: global, session
Data type:

Command line: Yes
Scope: global, session
Data type:
Default value: 2

Command line: Yes
Scope: global, session
Data type:
Default value: 8

Command line: Yes
Scope: global, session
Data type:
Default value: 100

Command line: Yes
Scope: global, session
Data type:
Default value: 0

Command line: Yes
Scope: global, session
Data type:
Default value: 0

Command line: Yes
Scope: global, session
Data type:
Default value: OFF

Command line: Yes
Scope: global, session
Data type:
Default value: 7

Command line: Yes
Scope: global, session
Data type:
Default value: 17

Command line: Yes
Scope: global, session
Data type:
Default value: 0

infinidb_ordered_only

Command line: Yes
Scope: global, session
Data type:
Default value: OFF

infinidb_string_scan_threshold

Command line: Yes
Scope: global, session
Data type:
Default value: 10

infinidb_stringtable_threshold

Command line: Yes
Scope: global, session
Data type:
Default value: 20

Command line: Yes
Scope: global, session
Data type:
Default value: 0

Command line: Yes
Scope: global, session
Data type:
Default value: OFF

Command line: Yes
Scope: global, session
Data type:
Default value: ON

infinidb_varbin_always_hex

Command line: Yes
Scope: global, session
Data type:
Default value: ON

Command line: Yes
Scope: global, session
Data type:
Default value: 1

Compression Mode

MariaDB ColumnStore has the ability to compress data. This is controlled through a compression mode, which can be set as a default for the instance or set at the session level.

To set the compression mode at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance:

where n is:

compression is turned off. Any subsequent table create statements run will have compression turned off for that table unless any statement overrides have been performed. Any alter statements run to add a column will have compression turned off for that column unless any statement override has been performed.
compression is turned on. Any subsequent table create statements run will have compression turned on for that table unless any statement overrides have been performed. Any alter statements run to add a column will have compression turned on for that column unless any statement override has been performed. ColumnStore uses snappy compression in this mode.

ColumnStore Decimal-to-Double Math

MariaDB ColumnStore has the ability to change intermediate decimal mathematical results from decimal type to double. The decimal type has approximately 17-18 digits of precision, but a smaller maximum range. Whereas the double type has approximately 15-16 digits of precision, but a much larger maximum range.

In typical mathematical and scientific applications, the ability to avoid overflow in intermediate results with double math is likely more beneficial than the additional two digits of precisions. In banking applications, however, it may be more appropriate to leave in the default decimal setting to ensure accuracy to the least significant digit.

Enable/Disable Decimal-to-Double Math

The infinidb\_double\_for\_decimal\_math variable is used to control the data type for intermediate decimal results. This decimal for double math may be set as a default for the instance, set at the session level, or at the statement level by toggling this variable on and off.

To enable/disable the use of the decimal to double math at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance:

where n is:

off (disabled, default)
on (enabled)

ColumnStore Decimal Scale

ColumnStore has the ability to support varied internal precision on decimal calculations. infinidb_decimal_scale is used internally by the ColumnStore engine to control how many significant digits to the right of the decimal point are carried through in suboperations on calculated columns. If, while running a query, you receive the message ‘aggregate overflow’, try reducing infinidb_decimal_scale and running the query again.

Note that, as you decrease infinidb_decimal_scale, you may see reduced accuracy in the least significant digit(s) of a returned calculated column. infinidb_use_decimal_scale is used internally by the ColumnStore engine to turn the use of this internal precision on and off. These two system variables can be set as a default for the instance or at session level.

Enable/Disable Decimal Scale

To enable/disable the use of the decimal scale at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance:

where n is off (disabled) or on (enabled).

Set Decimal Scale Level

To set the decimal scale at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.

where n is the amount of precision desired for calculations.

Disk-Based Joins

Introduction

Joins are performed in memory. When a join operation exceeds the memory allocated for query joins, the query is aborted with an error code IDB-2001.

Disk-based joins enable such queries to use disk for intermediate join data in case when the memory needed for join exceeds the memory limit. Although slower in performance as compared to a fully in-memory join, and bound by the temporary space on disk, it does allow such queries to complete.

Disk-based joins does not include aggregation and DML joins.

The following variables in the HashJoin element in the Columnstore.xml configuration file relate to disk-based joins. Columnstore.xml resides in /usr/local/mariadb/columnstore/etc/.

AllowDiskBasedJoin – Option to use disk-based joins. Valid values are Y (enabled) or N (disabled). Default is disabled.
TempFileCompression – Option to use compression for disk join files. Valid values are Y (use compressed files) or N (use non-compressed files).
TempFilePath – The directory path used for the disk joins. By default, this path is the tmp directory for your installation (i.e., /usr/local/mariadb/columnstore/tmp). Files (named infinidb-join-data*) in this directory will be created and cleaned on an as needed basis. The entire directory is removed and recreated by ExeMgr at startup.)

When using disk-based joins, it is strongly recommended that the TempFilePath reside on its own partition as the partition may fill up as queries are executed.

Per user join memory limit

In addition to the system wide flags, at SQL global and session level, the following system variables exists for managing per user memory limit for joins.

infinidb_um_mem_limit - A value for memory limit in MB per user. When this limit is exceeded by a join, it will switch to a disk-based join. By default, the limit is not set (value of 0).

For modification at the global level: In my.cnf file (typically /usr/local/mariadb/columnstore/mysql):

where value is the value in MB for in memory limitation per user.

For modification at the session level, before issuing your join query from the SQL client, set the session variable as follows.

Batch Insert Mode for INSERT Statements

Introduction

MariaDB ColumnStore has the ability to utilize the cpimport fast data import tool for non-transactional and SQL statements. Using this method results in a significant increase in performance in loading data through these two SQL statements. This optimization is independent of the storage engine used for the tables in the select statement.

Enable/Disable Using cpimport for Batch Insert

The infinidb_use_import_for_batchinsert variable is used to control if cpimport is used for these statements. This variable may be set as a default for the instance, set at the session level, or at the statement level by toggling this variable on and off.

To enable/disable the use of the use cpimport for batch insert at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.

where n is:

0 (disabled)
1 (enabled)

Changing Default Delimiter for INSERT SELECT

The infinidb_import_for_batchinsert_delimiter variable is used internally by MariaDB ColumnStore on a non-transactional INSERT INTO SELECT FROM statement as the default delimiter passed to the cpimport tool. With a default value ascii 7, there should be no need to change this value unless your data contains ascii 7 values.

To change this variable value at the at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.

where ascii_value is an ASCII value representation of the delimiter desired.

Note that this setting may cause issues with multi byte character set data. It is recommended to utilize UTF8 files directly with cpimport.

Version Buffer File Management

If the following error is received, most likely with a transaction LOAD DATA INFILE or INSERT INTO SELECT, it is recommended to break up the load into multiple smaller chunks, increase the VersionBufferFileSize setting, consider a nontransactional LOAD DATA INFILE, or use cpimport.

The VersionBufferFileSize setting is updated in the ColumnStore.xml typically located under /usr/local/mariadb/columnstore/etc. This dictates the size of the version buffer file on disk which provides DML transactional consistency. The default value is '1GB' which reserves up to a 1 Gigabyte file size. Modify this on the primary node and restart the system if you require a larger value.

Local PrimProc Query Mode

MariaDB ColumnStore has the ability to query data from just a single node instead of the whole cluster. In order to accomplish this, the infinidb_local_query variable in the my.cnf configuration file is used and maybe set as a default at system wide or set at the session level.

Enable Local PrimProc Query During Installation

Local PrimProc query can be enabled system wide during the install process when running the install script postConfigure. Answer 'y' to this prompt during the install process:

Enable Local PrimProc Query System-Wide

To enable the use of the local PrimProc query at the instance level, specify infinidb_local_query =1 (enabled) in the my.cnf configuration file at /usr/local/mariadb/columnstore/mysql. The default is 0 (disabled).

Enable/Disable Local PrimProc Query at the Session Level

To enable/disable the use of the local PrimProc query at the session level, the following statement is used. Once the session has ended, any subsequent session will return to the default for the instance:

where n is:

0 (disabled)
1 (enabled)

At the session level, this variable applies only to executing a query on an individual . The PrimProc must be set up with the local query option during installation.

Local PrimProc Query Examples

Example 1 - SELECT from a single table on local PrimProc to import back on local PrimProc:

With the infinidb_local_query variable set to 1 (default with local PrimProc Query):

Example 2 - SELECT involving a join between a fact table on the PrimProc node and dimension table across all the nodes to import back on local PrimProc:

With the infinidb_local_query variable set to 0 (default with local PrimProc Query):

Create a script (i.e., extract_query_script.sql in our example) similar to the following:

The infinidb_local_query is set to 0 to allow query across all PrimProc nodes.

The query is structured so PrimProc gets the fact table data locally from the PrimProc node (as indicated by the use of the function), while the dimension table data is extracted from all the PrimProc nodes.

Then you can execute the script to pipe it directly into cpimport:

Operating Mode

ColumnStore has the ability to support full MariaDB query syntax through an operating mode. This operating mode may be set as a default for the instance or set at the session level. To set the operating mode at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.

where n is:

a generic, highly compatible row-by-row processing mode. Some WHERE clause components can be processed by ColumnStore, but joins are processed entirely by MySQL using a nested loop join mechanism.
(the default) query syntax is evaluated by ColumnStore for compatibility with distributed execution and incompatible queries are rejected. Queries executed in this mode take advantage of distributed execution and typically result in higher performance.
auto-switch mode: ColumnStore will attempt to process the query internally, if it cannot, it will automatically switch the query to run in row-by-row mode.

MCS backup and restore commands

This page documents how to create and restore MariaDB Enterprise ColumnStore backups using the mcs CLI.

The mcs backup and mcs restore commands support the same workflows as the mcs_backup_manager.sh script, including:

Full and incremental backups
Local/shared storage and S3 storage topologies
Optional compression and parallelism
Separate DBRM (metadata) backup/restore workflows

The examples in this page assume the mcs command is available on the host and you run the backup/restore operations as root.

Before You Start

Identify Your Storage Topology

On a ColumnStore node, determine which StorageManager service is configured:

Example output:

service = LocalStorage
service = S3

Use service = LocalStorage when ColumnStore data lives on local/shared storage, and service = S3 when ColumnStore data is stored in object storage.

Estimate Backup Size

LocalStorage:

S3:

Backups

LocalStorage Topology Backups

Instructions

Run mcs backup as root on each node, starting with the primary node.
Use the same backup location on each node.

List Your Backups

Example output:

Quick Examples

Full backup:

Parallel backup:

Compressed backup:

Incremental backup (auto-select most recent full backup):

Save the backup to a remote host (SCP):

Online Backup Example

When you run a backup, by default the tooling performs polling checks and attempts to obtain a consistent point-in-time backup by:

checking for active writes
checking for running cpimport jobs
issuing write locks

You can skip these safety mechanisms with:

--skip-polls
--skip-locks
--skip-save-brm

Skipping polls/locks/BRM saving can be useful for certain workflows, but it increases the risk of capturing a partially-written state that complicates restore.

Incremental Backup Example

Before you can run an incremental backup, you need a full backup taken.

Then taking an incremental backup you need to define the full backup name to increment via flag --incremental xxxxx.

Incremental backups add ColumnStore deltas to an existing full backup. You can either:

specify the full backup folder name explicitly, or
use auto_most_recent will select the most recent directory defined in --backup-location to apply the incremental backup to the most recent full backup

Apply to the most recent full backup:

Apply to a specific full backup folder:

Cron Backup Example

Create a cron job (run as root) that takes periodic backups and appends logs:

Every Night Full Backup retaining the last 14 days:

Full backup once a week (Saturday night) w/ incremental backups all the other nights (keep 21 days)

LocalStorage Backup Flags

The most commonly used options are:

Flag / Option

Description

Notes

S3 Topology Backups

Instructions

Ensure the node has access to your S3 endpoint and credentials.
Run mcs backup with --storage S3 and a backup bucket (--backup-bucket).
Run it as root

If you're using an on-premise S3-compatible solution, you may need --endpoint-url (and sometimes --no-verify-ssl).

Quick Examples

Full backup:

Compressed backup (and skip copying bucket data if you only want local artifacts):

Incremental backup:

On-premise S3 endpoint: Key Flags for on premise buckets are the following:

-url - the local/ip address of the S3 provider. For example, minio defaults to port 9000, 127.0.0.1 would be used if minio is installed on the same machine running columnstore
--no-verify-ssl - used when ssl certs are not used/defined for the S3 provider/endpoint

Cron Backup Example

As with LocalStorage, you can schedule mcs backup in cron. Consider including --name-backup to avoid collisions.

S3 Backup Flags

The most commonly used S3-specific options are:

Flag / Option

Description

Notes

Restore

LocalStorage Topology Restore

Instructions

If Backup made only on Primary node on Clusters that do NOT save the backup to an NFS share, copy the primary nodes backup mysql & configs directory to all nodes.

List backups to find the folder name you want.
Restore on each node, starting with the primary node.

When running a columnstore backup, a restore.job file is created with a command compatible to run on each node to restore the backup.

List Your Backups to Restore

Quick Examples

Standard restore:

Compressed backup restore:

LocalStorage Restore Flags

Common options:

Flag / Option

Description

Notes

S3 Topology Restore

Instructions

Use the same backup bucket that contains the backup.
Restore on each node, starting with the primary node.

When running a columnstore backup, a restoreS3.job file is created with a command compatible to run on each node to restore the backup.

Quick Examples

Standard Restore:

On-premise S3 Endpoint:

Restoring to a New Bucket:

Key Flags for restoring to a new bucket

-nb - the name of the new bucket to copy the backup into and configure columnstore to use post restore
-nr - the name of the region of the new bucket to configure columnstore to use post restore
-nk - the key of the new bucket to configure columnstore to use post restore

S3 Restore Flags

Common options:

Flag / Option

Description

Notes

DBRM Backups

Both S3 and LocalStorage use the same commands for dbrm backups

DBRM backups are intended for backing up internal ColumnStore metadata only.

Instructions

Run mcs dbrm_backup as root with the appropriate flags as you need ONLY on the primary node

List Your dbrm Backups

Quick Examples

Standard `dbrm_backup`:

dbrm_backup before upgrade:

dbrm_backup Flags

Common options:

Flag / Option

Description

Notes

DBRM Restore

Instructions

Both S3 and LocalStorage use the same commands for dbrm restore.

DBRM backups are intended for backing up internal ColumnStore metadata only.

List available DBRM backups.
Restore from the selected folder.

List Your dbrm Restore Options

Quick Examples

Standard `dbrm_restore`:

`dbrm_restore` Flags

Common options:

Flag / Option

Description

Notes

ColumnStore Bulk Data Loading

Overview

cpimport is a high-speed bulk load utility that imports data into ColumnStore tables in a fast and efficient manner. It accepts as input any flat file containing data that contains a delimiter between fields of data (i.e. columns in a table). The default delimiter is the pipe (‘|’) character, but other delimiters such as commas may be used as well. The data values must be in the same order as the create table statement, i.e. column 1 matches the first column in the table and so on. Date values must be specified in the format 'yyyy-mm-dd'.

cpimport – performs the following operations when importing data into a MariaDB ColumnStore database:

Data is read from specified flat files.
Data is transformed to fit ColumnStore’s column-oriented storage design.
Redundant data is tokenized and logically compressed.

It is important to note that:

The bulk loads are an append operation to a table, so they allow existing data to be read and remain unaffected during the process.
The bulk loads do not write their data operations to the transaction log; they are not transactional in nature but are considered an atomic operation at this time. Information markers, however, are placed in the transaction log so the DBA is aware that a bulk operation did occur.
Upon completion of the load operation, a high-water mark in each column file is moved in an atomic operation that allows for any subsequent queries to read the newly loaded data. It appends operation provides for consistent read but does not incur the overhead of logging the data.

There are two primary steps to using the cpimport utility:

Optionally create a job file that is used to load data from a flat file into multiple tables.
Run the cpimport utility to perform the data import.

Syntax

The simplest form of cpimport command is

The full syntax is like this:

cpimport modes

Mode 1: Bulk Load from a central location with single data source file

In this mode, you run the cpimport from your primary node (mcs1). The source file is located at this primary location and the data from cpimport is distributed across all the nodes. If no mode is specified, then this is the default.

Example:

Mode 2: Bulk load from central location with distributed data source files

In this mode, you run the cpimport from your primary node (mcs1). The source data is in already partitioned data files residing on the PMs. Each PM should have the source data file of the same name but containing the partitioned data for the PM

Example:

Mode 3: Parallel distributed bulk load

In this mode, you run cpimport from the individual nodes independently, which will import the source file that exists on that node. Concurrent imports can be executed on every node for the same table.

Example:

Note:

The bulk loads are an append operation to a table, so they allow existing data to be read and remain unaffected during the process.
The bulk loads do not write their data operations to the transaction log; they are not transactional in nature but are considered an atomic operation at this time. Information markers, however, are placed in the transaction log so the DBA is aware that a bulk operation did occur.
Upon completion of the load operation, a high-water mark in each column file is moved in an atomic operation that allows for any subsequent queries to read the newly loaded data. It appends operation provides for consistent read but does not incur the overhead of logging the data.

Bulk loading data from STDIN

Data can be loaded from STDIN into ColumnStore by simply not including the loadFile parameter

Example:

Bulk loading from AWS S3

Similarly the AWS cli utility can be utilized to read data from an s3 bucket and pipe the output into cpimport allowing direct loading from S3. This assumes the aws cli program has been installed and configured on the host:

Example:

For troubleshooting connectivity problems remove the --quiet option which suppresses client logging including permission errors.

Bulk loading output of SELECT FROM Table(s)

Standard in can also be used to directly pipe the output from an arbitrary SELECT statement into cpimport. The select statement may select from non-columnstore tables such as or . In the example below, the db2.source_table is selected from, using the -N flag to remove non-data formatting. The -q flag tells the mysql client to not cache results which will avoid possible timeouts causing the load to fail.

Example:

Bulk loading from JSON

Let's create a sample ColumnStore table:

Now let's create a sample products.json file like this:

We can then bulk load data from JSON into Columnstore by first piping the data to and then to using a one-line command.

Example:

In this example, the JSON data is coming from a static JSON file, but this same method will work for, and output streamed from any datasource using JSON such as an API or NoSQL database. For more information on 'jq', please view the manual here .

Bulk loading into multiple tables

There are two ways multiple tables can be loaded:

Run multiple cpimport jobs simultaneously. Tables per import should be unique or for each import should be unique if using mode 3.
Use colxml utility: colxml creates an XML job file for your database schema before you can import data. Multiple tables may be imported by either importing all tables within a schema or listing specific tables using the -t option in colxml. Then, using cpimport, that uses the job file generated by colxml. Here is an example of how to use colxml and cpimport to import data into all the tables in a database schema

colxml syntax

Example usage of colxml

The following tables comprise a database name ‘tpch2’:

First, put delimited input data file for each table in /usr/local/mariadb/columnstore/data/bulk/data/import. Each file should be named .tbl.
Run colxml for the load job for the ‘tpch2’ database as shown here:

Now actually run cpimport to use the job file generated by the colxml execution

Handling Differences in Column Order and Values

If there are some differences between the input file and table definition then the colxml utility can be utilized to handle these cases:

Different order of columns in the input file from table order
Input file column values to be skipped / ignored.
Target table columns to be defaulted.

In this case run the colxml utility (the -t argument can be useful for producing a job file for one table if preferred) to produce the job xml file and then use this a template for editing and then subsequently use that job file for running cpimport.

Consider the following simple table example:

This would produce a colxml file with the following table element:

If your input file had the data such that hire_date comes before salary then the following modification will allow correct loading of that data to the original table definition (note the last 2 Column elements are swapped):

The following example would ignore the last entry in the file and default salary to it's default value (in this case null):

IgnoreFields instructs cpimport to ignore and skip the particular value at that position in the file.
DefaultColumn instructs cpimport to default the current table column and not move the column pointer forward to the next delimiter.

Both instructions can be used indepedently and as many times as makes sense for your data and table definition.

Binary Source Import

It is possible to import using a binary file instead of a CSV file using fixed length rows in binary data. This can be done using the '-I' flag which has two modes:

-I1 - binary mode with NULLs accepted Numeric fields containing NULL will be treated as NULL unless the column has a default value
-I2 - binary mode with NULLs saturated NULLs in numeric fields will be saturated

The following table shows how to represent the data in the binary format:

Datatype

Description

For NULL values the following table should be used:

Datatype

Signed NULL

Unsigned NULL

Date Struct

The spare bits in the Date struct "must" be set to 0x3E.

DateTime Struct

Working Folders & Logging

As of version 1.4, cpimport uses the /var/lib/columnstore/bulk folder for all work being done. This folder contains:

Logs
Rollback info
Job info
A staging folder

The log folder typically contains:

A typical log might look like this:

Prior to version 1.4, this folder was located at /usr/local/mariadb/columnstore/bulk.

Step 8: Test MariaDB MaxScale

Overview

This page details step 8 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

This step tests MariaDB MaxScale 22.08.

The instructions were tested against ColumnStore 23.10.

Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

Check Global Configuration

Use command to view the global MaxScale configuration.

This action is performed on the MaxScale node:

Output should align to the global MaxScale configuration in the new configuration file you created.

Check Server Configuration

Use the and commands to view the configured server objects.

This action is performed on the MaxScale node:

Obtain the full list of servers objects:

For each server object, view the configuration:

Output should align to the Server Object configuration you performed.

Check Monitor Configuration

Use the and commands to view the configured monitors.

This action is performed on the MaxScale node:

Obtain the full list of monitors:

For each monitor, view the monitor configuration:

Output should align to the MariaDB Monitor (mariadbmon) configuration you performed.

Check Service Configuration

Use the and commands to view the configured routing services.

This action is performed on the MaxScale node:

Obtain the full list of routing services:

For each service, view the service configuration:

Output should align to the or configuration you performed.

Test Application User

Applications should use a dedicated user account. The user account must be created on the primary server.

When users connect to MaxScale, MaxScale authenticates the user connection before routing it to an Enterprise Server node. Enterprise Server authenticates the connection as originating from the IP address of the MaxScale node.

The application users must have one user account with the host IP address of the application server and a second user account with the host IP address of the MaxScale node.

The requirement of a duplicate user account can be avoided by enabling the proxy_protocol parameter for MaxScale and the proxy_protocol_networks for Enterprise Server.

Create a User to Connect from MaxScale

This action is performed on the primary Enterprise ColumnStore node:

Connect to the primary Enterprise ColumnStore node:

Create the database user account for your MaxScale node:

Replace 192.0.2.10 with the relevant IP address specification for your MaxScale node.

Passwords should meet your organization's password policies.

Grant the privileges required by your application to the database user account for your MaxScale node:

The privileges shown are designed to allow the tests in the subsequent sections to work. The user account for your production application may require different privileges.

Create a User to Connect from the Application Server

This action is performed on the primary Enterprise ColumnStore node:

Create the database user account for your application server:

Replace 192.0.2.11 with the relevant IP address specification for your application server.

Passwords should meet your organization's password policies.

Grant the privileges required by your application to the d database user account for your application server:

The privileges shown are designed to allow the tests in the subsequent sections to work. The user account for your production application may require different privileges.

Test Connection with Application User

To test the connection, use the MariaDB Client from your application server to connect to an Enterprise ColumnStore node through MaxScale.

This action is performed on a client connected to the MaxScale node:

Test Connection with Read Connection Router

If you configured the Read Connection Router, confirm that MaxScale routes connections to the replica servers.

On the MaxScale node, use the command to view the available listeners and ports:

Open multiple terminals connected to your application server, in each, use MariaDB Client to connect to the listener port for the Read Connection Router (in the example, 3308):

Use the application user credentials you created for the --user and --password options.

In each terminal, query the hostname and server_id system variable and option to identify to which you're connected:

Different terminals should return different values since MaxScale routes the connections to different nodes.

Since the router was configured with the slave router option, the Read Connection Router only routes connections to replica servers.

Test Write Queries with Read/Write Split Router

If you configured the Read/Write Split Router, confirm that MaxScale routes write queries on this router to the primary Enterprise ColumnStore node.

on the MaxScale node, use the command to view the available listeners and ports:

Open multiple terminals connected to your application server, in each, use MariaDB Client to connect to the listener port for the Read/Write Split Router (in the example, 3307):

Use the application user credentials you created for the --user and --password options.

In one terminal, create the test table:

In each terminal, issue an insert.md statement to add a row to the example table with the values of the hostname and server_id system variable and option:

In one terminal, issue a SELECT statement to query the results:

While MaxScale is handling multiple connections from different terminals, it routed all connections to the current primary Enterprise ColumnStore node, which in the example is mcs1#.

Test Read Queries with Read/Write Split Router

If you configured the , confirm that MaxScale routes read queries on this router to replica servers.

On the MaxScale node, use the command to view the available listeners and ports:

In a terminal connected to your application server, use MariaDB Client to connect to the listener port for the (in the example, 3307):

Use the application user credentials you created for the --user and --password options.

Query the hostname and server_id to identify which server MaxScale routed you to.

Resend the query:

Confirm that MaxScale routes the SELECT statements to different replica servers.

For more information on different routing criteria, see slave_selection_criteria

Next Step

"Deploy ColumnStore Shared Local Storage Topology".

This page was step 8 of 9.

Upgrade Multi-Node MariaDB Enterprise ColumnStore from 6 to 23.10

These instructions detail the upgrade from MariaDB Enterprise ColumnStore 6 to MariaDB Enterprise ColumnStore 23.10 in a Multi-Node topology on a range of supported Operating Systems.

Set Replicas to Maintenance Mode

This action is performed for each replica server on the MaxScale node.

Prior to upgrading, the replica servers must be set to maintenance mode in MaxScale. The replicas can be set to maintenance mode in MaxScale using . If you are using , the replicas can be set to maintenance mode using the set server command:

As the first argument, provide the name for the server
As the second argument, provide maintenance as the state

Confirm Maintenance Mode is Set for Replicas

This action is performed on the MaxScale node.

Confirm that the replicas are set to maintenance mode in MaxScale using . If you are using , the state of the replicas can be viewed using the command:

If the node is properly in maintenance mode, then the State column will show Maintenance as one of the states.

Disable GTID Strict Mode

This action is performed on each replica server.

The system variable must be disabled for this upgrade procedure. If the gtid_strict_mode system variable is enabled in any configuration files, disable it temporarily until the upgrade procedure is complete.

You can check if the gtid_strict_mode system variable is set in a configuration file by executing my_print_defaults command with the mysqld option:

If the gtid_strict_mode system variable is set, you can temporarily disable it by adding # in front of it in the configuration file, so that it will be treated as a comment and ignored:

Shutdown ColumnStore

Prior to upgrading, MariaDB Enterprise ColumnStore must be shutdown.

Stop Services

This action is performed on each ColumnStore node.

Prior to upgrading, several services must be stopped on each ColumnStore node:

Stop the service:
Stop the MariaDB Enterprise ColumnStore service:
Stop the MariaDB Enterprise Server service:

Upgrade to the New Version

MariaDB Corporation provides package repositories for YUM (RHEL, CentOS, Rocky Linux) and APT (Debian, Ubuntu).

Upgrade via YUM (RHEL, CentOS, Rocky Linux)

Retrieve your Customer Download Token at and substitute for CUSTOMER_DOWNLOAD_TOKEN in the following directions.
Configure the YUM package repository.
Enterprise ColumnStore 23.10 is included with MariaDB Enterprise Server 11.4. Pass the version to install using the --mariadb-server-version flag to .
To configure YUM package repositories:

Upgrade via APT (Debian, Ubuntu)

Retrieve your Customer Download Token at and substitute for CUSTOMER_DOWNLOAD_TOKEN in the following directions.
Configure the APT package repository.
Enterprise ColumnStore 23.10 is included with MariaDB Enterprise Server 11.4. Pass the version to install using the --mariadb-server-version flag to .
To configure APT package repositories:

Disable ColumnStore Service

This action is performed on each ColumnStore node.

After upgrading, the MariaDB Enterprise ColumnStore service should be stopped, since it will be controlled by :

CMAPI disables the Enterprise ColumnStore service in a multi-node deployment. The Enterprise ColumnStore service will be started as-needed by the CMAPI service, so it does not need to start automatically upon reboot.

Start Services

This action is performed on each ColumnStore node.

After upgrading, the service and the MariaDB Enterprise Server service must be started on each ColumnStore node:

Start the CMAPI service:
Start the MariaDB Enterprise Server service:

Write Binary Log

On the primary server, run to upgrade the data directory with binary logging enabled to update the system tables:

Start ColumnStore

After upgrading, MariaDB Enterprise ColumnStore must be started.

Enable GTID Strict Mode

This action is performed on each replica server.

If you temporarily disabled the system variable in the step, it can be re-enabled. If the gtid_strict_mode system variable was temporarily disabled in any configuration files, re-enable it.

Confirm ColumnStore Version

This action is performed on each ColumnStore node.

After upgrading, it is recommended to confirm the Enterprise ColumnStore version on each ColumnStore node. Connect to the node using and query the Columnstore_version status variable with :

Confirm ES Version

This action is performed on each ColumnStore node.

After upgrading, it is recommended to confirm the ES version on each ColumnStore node. Connect to the node using and query the system variable with :

Clear Maintenance Mode for Replicas

This action is performed for each replica server on the MaxScale node.

After the upgrade, maintenance mode for each replica has been cleared in MaxScale using . If you are using , maintenance mode can be cleared using the clear server command:

As the first argument, provide the name for the server
As the second argument, provide maintenance as the state

Confirm Maintenance Mode is Cleared for Replicas

This action is performed for each replica server on the MaxScale node.

Confirm that maintenance mode in MaxScale has been cleared for each replica using . If you are using , the state of the replicas can be viewed using the list servers command:

If the node is no longer in maintenance mode, then the State column will no longer show Maintenance as one of the states.

ColumnStore Storage Architecture

Overview

MariaDB Enterprise ColumnStore's storage architecture is designed to provide great performance for analytical queries.

Columnar Storage Engine

MariaDB Enterprise ColumnStore is a columnar storage engine for . MariaDB Enterprise ColumnStore enables ES to perform analytical workloads, including online analytical processing (OLAP), data warehousing, decision support systems (DSS), and hybrid transactional-analytical processing (HTAP) workloads.

Most traditional relational databases use row-based storage engines. In row-based storage engines, all columns for a table are stored contiguously. Row-based storage engines perform very well for transactional workloads but are less performant for analytical workloads.

Columnar storage engines store each column separately. Columnar storage engines perform very well for analytical workloads. Analytical workloads are characterized by ad hoc queries on very large data sets by relatively few users.

MariaDB Enterprise ColumnStore automatically partitions each column into extents, which helps improve query performance without using indexes.

OLAP Workloads

MariaDB Enterprise ColumnStore enables MariaDB Enterprise Server to perform analytical or online analytical processing (OLAP) workloads.

OLAP workloads are generally characterized by ad hoc queries on very large data sets. Some other typical characteristics are:

Each query typically reads a subset of columns in the table
Most activity typically consists of read-only queries that perform aggregations, window functions, and various calculations
Analytical applications typically require only a few concurrent queries

OLAP workloads are typically required for:

Business intelligence (BI)
Health informatics
Historical data mining

Row-based storage engines have a disadvantage for OLAP workloads. Indexes are not usually very useful for OLAP workloads, because the large size of the data set and the ad hoc nature of the queries preclude the use of indexes to optimize queries.

Columnar storage engines are much better suited for OLAP workloads. MariaDB Enterprise ColumnStore is a columnar storage engine that is designed for OLAP workloads:

When a query reads a subset of columns in the table, Enterprise ColumnStore can reduce I/O by reading those columns and ignoring all others, because each column is stored separately
When most activity consists of read-only queries that perform aggregations, window functions, and various calculations, Enterprise ColumnStore is able to efficiently execute those queries using extent elimination, distributed query execution, and massively parallel processing (MPP) techniques
When only a few concurrent queries are required, Enterprise ColumnStore is able to maximize the use of system resources by using multiple threads and multiple nodes to perform work for each query

OLTP Workloads

MariaDB Enterprise Server has had excellent performance for transactional or online transactional processing (OLTP) workloads since the beginning.

OLTP workloads are generally characterized by a fixed set of queries using a relatively small data set. Some other typical characteristics are:

Each query typically reads and/or writes many columns in the table.
Most activity typically consists of small transactions that only read and/or write a small number of rows.
Transactional applications typically require many concurrent transactions.

OLTP workloads are typically required for:

Financial transactions performed by financial institutions and e-commerce sites.
Store inventory changes performed by brick-and-mortar stores and e-commerce sites.
Account metadata changes performed by many sites that stores personal data.

Row-based storage engines have several advantages for OLTP workloads:

When a query reads and/or writes many columns in the table, row-based storage engines can find all columns on a single page, so the I/O costs of the operation are low.
When a transaction reads/writes a small number of rows, row-based storage engines can use an index to find the page for each row without a full table scan.
When many concurrent transactions are operating, row-based storage engines can implement transactional isolation by storing multiple versions of changed rows.

is ES's default storage engine, and it is a highly performant row-based storage engine.

Hybrid Workloads

MariaDB Enterprise ColumnStore enables MariaDB Enterprise Server to function as a single-stack solution for workloads.

Hybrid workloads are characterized by a mix of transactional and analytical queries. Hybrid workloads are also known as "Smart Transactions", "Augmented Transactions" "Translytical", or "Hybrid Operational-Analytical Processing (HOAP)".

Hybrid workloads are typically required for applications that require real-time analytics that lead to immediate action:

Financial institutions use transactional queries to handle financial transactions and analytical queries to analyze the transactions for business intelligence.
Insurance companies use transactional queries to accept/process claims and analytical queries to analyze those claims for business opportunities or risks.
Health providers use transactional queries to track electronic health records (EHR) and analytical queries to analyze the EHRs to discover health trends or prevent adverse drug interactions.

MariaDB Enterprise Server provides multiple components to perform hybrid workloads:

For analytical queries, the Enterprise ColumnStore storage engine can be used.
For transactional queries, row-based storage engines, such as InnoDB, can be used.
For queries that reference both analytical and transactional data, ES's cross-engine join functionality can be used to join Enterprise ColumnStore tables with InnoDB tables.

Storage Options

MariaDB Enterprise ColumnStore supports multiple storage types:

Storage Type

Description

Deployment with S3-Compatible Storage

Deployment with Shared Storage

S3-Compatible Object Storage

MariaDB Enterprise ColumnStore supports S3-compatible object storage.

S3-compatible object storage is optional, but highly recommended. If S3-compatible object storage is used, Enterprise ColumnStore requires the to use (such as NFS) for high availability.

S3-compatible object storage is:

Compatible: Many object storage services are compatible with the Amazon S3 API.
Economical: S3-compatible object storage is often very low cost.
Flexible: S3-compatible object storage is available for both cloud and on-premises deployments.

Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.

If you have any questions about using specific S3-compatible object storage with MariaDB Enterprise ColumnStore, contact us.

S3 API

MariaDB Enterprise ColumnStore can use any object store that is compatible with the Amazon S3 API.

Many object storage services are compatible with the Amazon S3 API, and compatible object storage services are available for cloud deployments and on-premises deployments, so vendor lock-in is not a concern.

Storage Manager

MariaDB Enterprise ColumnStore's Storage Manager enables remote S3-compatible object storage to be efficiently used. The Storage Manager uses a persistent local disk cache for read/write operations, so that network latency has minimal performance impact on Enterprise ColumnStore. In some cases, it will even perform better than local disk operations.

Enterprise ColumnStore only uses the Storage Manager when S3-compatible storage is used for data.

Storage Manager is configured using .

Storage Manager Directory

MariaDB Enterprise ColumnStore's Storage Manager directory is at the following path by default:

/var/lib/columnstore/storagemanager

To enable high availability when S3-compatible object storage is used, the Storage Manager directory should use and be mounted on every ColumnStore node.

Configure the S3 Storage Manager

When you want to use S3-compatible storage for Enterprise ColumnStore, you must configure Enterprise ColumnStore's S3 Storage Manager to use S3-compatible storage.

To configure Enterprise ColumnStore to use S3-compatible storage, edit /etc/columnstore/storagemanager.cnf:

The S3-compatible object storage options are configured under [S3]:

The bucket option must be set to the name of the bucket.
The endpoint option must be set to the endpoint for the S3-compatible object storage.
The aws_access_key_id and aws_secret_access_key options must be set to the access key ID and secret access key for the S3-compatible object storage.

The local cache options are configured under [Cache]:

The cache_size option is set to 2 GB by default.
The path option is set to /var/lib/columnstore/storagemanager/cache by default.

Ensure that the specified path has sufficient storage space for the specified cache size.

Shared Local Storage

MariaDB Enterprise ColumnStore can use shared local storage.

Shared local storage is required for high availability. The specific requirements depend on whether Enterprise ColumnStore is configured to use :

When S3-compatible object storage is used, Enterprise ColumnStore requires the to use shared local storage for high availability.

When S3-compatible object storage is not used, Enterprise ColumnStore requires the to use shared local storage for high availability.

The most common shared local storage options for on-premises and cloud deployments are:

NFS (Network File System)
GlusterFS

The most common shared local storage options for AWS (Amazon Web Services) deployments are:

EBS (Elastic Block Store) Multi-Attach
EFS (Elastic File System)

The most common shared local storage option for GCP (Google Cloud Platform) deployments is:

Filestore

Shared Local Storage Options

The most common options for shared local storage are:

Shared Local Storage

Description

Directories Requiring Shared Local Storage for HA

Multi-node MariaDB Enterprise ColumnStore requires some directories to use shared local storage for high availability. The specific requirements depend on if MariaDB Enterprise ColumnStore is configured to use :

Using S3-Compatible Object Storage?

Directories to use Shared Local Storage

Recommended Storage Options

For best results, MariaDB Corporation would recommend the following storage options:

Environment

Object Storage For Data

Shared Local Storage For Storage Manager

Storage Format

MariaDB Enterprise ColumnStore's storage format is optimized for analytical queries.

DB Root Directories

MariaDB Enterprise ColumnStore stores data in DB Root directories when S3-compatible object storage is not configured.

In a multi-node Enterprise ColumnStore, each node has its own DB Root directory.

The DB Root directories are at the following path by default:

/var/lib/columnstore/dataN

The N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment. For example, with a 3-node Enterprise ColumnStore deployment, this would refer to the following directories:

/var/lib/columnstore/data1
/var/lib/columnstore/data2
/var/lib/columnstore/data3

To enable high availability for the DB Root directories, each directory should be mounted on every ColumnStore node using .

Extents

Each column in a table is stored in units called extents.

By default, each extent contains the column values for 8 million rows. The physical size of each extent can range from 8 MB to 64 MB. When an extent reaches the maximum number of column values, Enterprise ColumnStore creates a new extent.

Each extent is stored in 8 KB blocks, and each block has a logical block identifier (LBID).

If a string column is longer than 8 characters, the value is stored in a separate dictionary file, and a pointer to the value is stored in the extent.

Segment Files

A segment file is used to store Enterprise ColumnStore data within a DB Root directory.

A segment file always contains two extents. When a segment file reaches its maximum size, Enterprise ColumnStore creates a new segment file.

The relevant configuration options are:

Option

Description

For example, to configure Enterprise ColumnStore to store more extents in each segment file using the mcsSetConfig utility:

Column Partitions

Enterprise ColumnStore automatically groups a column's segment files into column partitions.

On disk, each column partition is represented by a directory in the DB Root. The directory contains the segment files for the column partition.

By default, a column partition can contain four segment files, but you can configure Enterprise ColumnStore to store more segment files in each column partition. When a column partition reaches the maximum number of segment files, Enterprise ColumnStore creates a new column partition.

The relevant configuration options are:

Option

Description

For example, to configure Enterprise ColumnStore to store more segment files in each column partition using the mcsSetConfig utility:

Extent Map

Enterprise ColumnStore maintains an Extent Map to determine which values are located in each extent.

The Extent Map identifies each extent using its logical block identifier (LBID) values, and it maintains the minimum and maximum values within each extent.

The Extent Map is used to implement a performance optimization called .

The primary node has a master copy of the Extent Map. When Enterprise ColumnStore is started, the primary node copies the Extent Map to the replica nodes.

While Enterprise ColumnStore is running, each node maintains a copy of the Extent Map in its main memory, so that it can be accessed quickly without additional I/O.

If the Extent Map gets corrupted, the mcsRebuildEM utility can rebuild the Extent Map from the contents of the database file system. The mcsRebuildEM utility is available starting in MariaDB Enterprise ColumnStore 6.2.2.

Compression

Enterprise ColumnStore automatically compresses all data on disk using either Snappy or LZ4 compression. See the columnstore_compression_type system variable for how to select the desired compression type.

Since Enterprise ColumnStore stores a single column's data in each segment file, the data in each segment file tends to be very similar. Similar data usually allows for excellent compressibility. However, the specific data compression ratio will depend on a lot of factors, such as the randomness of the data and the number of distinct values.

Enterprise ColumnStore's compression strategy is tuned to optimize the performance of I/O-bound queries, because the decompression rate is optimized to maximize read performance.

Version Buffer

Enterprise ColumnStore uses the version buffer to store blocks that are being modified.

The version buffer is used for multiple tasks:

It is used to roll back a transaction.
It is used for multi-version concurrency control (MVCC). With MVCC, Enterprise ColumnStore can implement read snapshots, which allows a statement to have a consistent view of the database, even if some of the underlying rows have changed. The snapshot for a given statement is identified by the system change number (SCN).

The version buffer is split between data structures that are in-memory and on-disk.

The in-memory data structures are hash tables that keep track of in-flight transaction. The hash tables store the LBIDs for each block that is being modified by a transaction. The in-memory hash tables start at 4 MB, and they grow as-needed. The size of the hash tables increases as the number of modified blocks increases.

An on-disk version buffer file is stored in each DB Root. By default, the on-disk version buffer file is 1 GB, but you can configure Enterprise ColumnStore to use a different file size. The relevant configuration options are:

Option

Description

For example, to configure Enterprise ColumnStore to use a larger on-disk version buffer file using the mcsSetConfig utility:

Extent Elimination

Using the Extent Map, ColumnStore can perform logical range partitioning and only retrieve the blocks needed to satisfy the query. This is done through Extent Elimination, the process of eliminating Extents from the results that don't meet the given join and filter conditions of the query, which reduces the overall I/O operations.

In Extent Elimination, ColumnStore scans the columns in join and filter conditions. It then extracts the logical horizontal partitioning information of each extent along with the minimum and maximum values for the column to further eliminate Extents. To eliminate an Extent when a column scan involves a filter, that filter is compared to the minimum and maximum values stored in each extent for the column. If the filter value is outside the Extents minimum and maximum value range, ColumnStore eliminates the Extent.

This behavior is automatic and well suited for series, ordered, patterned and time-based data, where the data is loaded frequently and often referenced by time. Any column with clustered values is a good candidate for Extent Elimination.

Analytics

Analytics

hashtagMariaDB ColumnStore

hashtagMariaDB Exa

MariaDB ColumnStore

Quickstart Guides

MariaDB ColumnStore Guide

hashtagQuickstart Guide: MariaDB ColumnStore

MariaDB ColumnStore Hardware Guide

hashtagOverview

hashtagMinimum Hardware Recommendations

hashtagFor Development Environments

hashtagFor Production Environments

hashtagNetwork Interconnectivity

hashtagAWS Instance Sizes

hashtagSee Also

ColumnStore Architecture

Topologies Overview

ColumnStore Storage Engine

hashtagOverview

hashtagExamples

hashtagCreating a ColumnStore Table

hashtagMulti-Node Configuration

hashtagConfigure the Mandatory Utility User Account

ColumnStore Read Replicas

hashtagOverview

hashtagKey Features

hashtagKey Commands

hashtagLimitations

hashtagHow-To

hashtagPrerequisites

hashtagInstallation and Setup

ColumnStore System Databases

ColumnStore Query Processing

MariaDB Enterprise Columnstore Locking

hashtagOverview

hashtagLockless Reads

Managing ColumnStore

Deployment

Installing ColumnStore

Step 5: Bulk Import of Data

hashtagOverview

hashtagImport the Schema

hashtagImport the Data

hashtagcpimport

hashtagLOAD DATA INFILE

hashtagImport from Remote Database

hashtagNext Step

Step 6: Install MariaDB MaxScale

hashtagOverview

Step 5: Bulk Import of Data

hashtagOverview

Upgrading ColumnStore

Managing ColumnStore Database Environment

Node Maintenance for MariaDB Enterprise Columnstore

Rejoining a Node

hashtagPerforming Rejoin in MaxScale

hashtagChecking Replication Status with MaxScale

Switchover of the Primary Node

hashtagPerforming Switchover in MaxScale

hashtagChecking the Replication Status with MaxScale

View and Clear Table Locks

hashtagViewing Table Locks

hashtagClearing Table Locks

Backup & Restore

Backup and Restore Overview

hashtagOverview

hashtagSystem of Record

ColumnStore Table Size Limitations

Security

ColumnStore Security Vulnerabilities

Use Cases

About MariaDB ColumnStore

High Availability

Query Tuning

Collecting Statistics with ANALYZE TABLE

hashtagOverview

Query Plans and Optimizer Trace

Execution Plan (CSEP)

hashtagOverview

MariaDB ColumnStore

MariaDB Exa

Quickstart Guide: MariaDB ColumnStore

Overview

Minimum Hardware Recommendations

For Development Environments

For Production Environments

Network Interconnectivity

AWS Instance Sizes

See Also

Overview

Examples

Creating a ColumnStore Table

Multi-Node Configuration

Configure the Mandatory Utility User Account

Overview

Key Features

Key Commands

Limitations

How-To

Prerequisites

Installation and Setup

Overview

Lockless Reads

Overview

Import the Schema

Import the Data

cpimport

LOAD DATA INFILE

Import from Remote Database

Next Step

Overview

Overview

Performing Rejoin in MaxScale

Checking Replication Status with MaxScale

Performing Switchover in MaxScale

Checking the Replication Status with MaxScale

Viewing Table Locks

Clearing Table Locks

Overview

System of Record

Overview

Overview

Viewing the CSEP

Hardware (On Premises)

Cloud (IaaS)

Software-Based

AWS IAM Role Configuration

MariaDB ColumnStore

MariaDB Exa

Overview

Examples

Creating a ColumnStore Table

Multi-Node Configuration

Configure the Mandatory Utility User Account

Overview

Minimum Hardware Recommendations

For Development Environments

For Production Environments

Network Interconnectivity

AWS Instance Sizes

See Also

Quickstart Guide: MariaDB ColumnStore

Overview

Lockless Reads

Key Benefits

Architecture Concepts (Simplified)

Installation Overview

Basic Usage

Creating a ColumnStore Table

Inserting Data

Querying Data

See Also

Transactional (OLTP)

Primary/Replica Topology

Galera Cluster Topology

Analytical (OLAP, Data Warehousing, DSS)

ColumnStore Shared Local Storage Topology