MariaDB ColumnStore Quickstart Guides provide concise, Docker-friendly steps to quickly set up, configure, and explore the ColumnStore analytic engine.
Discover MariaDB ColumnStore, the powerful columnar storage engine for analytical workloads. Learn about its architecture, features, and how it enables high-performance data warehousing and analytics.
MariaDB Enterprise offers powerful solutions to break down the barriers to insight, whether you need to run ad hoc queries on massive datasets or power the most demanding AI workloads.
For fast, ad hoc analytics at scale, MariaDB ColumnStore is a powerful columnar database that can be deployed as a standalone analytics solution or integrated with MariaDB Enterprise Server to act as a powerful query accelerator. It stores data in a columnar format and can be distributed across a cluster of servers, allowing it to execute complex queries in parallel on petabytes of data.
This integration allows you to access your InnoDB data in near-real time, processing it directly in the ColumnStore engine to run fast, parallel OLAP queries straight from your transactional data. This eliminates the need to maintain a separate pipeline or use delayed batch inserts to analyze your live data.
For the ultimate in analytical performance, the joint solution between MariaDB and Exasol connects your mission-critical transactional data to the world’s fastest analytics engine. Available on-premise or in the cloud on platforms like AWS and Microsoft Azure, this solution brings high-speed analytics to any environment.
MariaDB Exa erases the barrier between live operational data and high-speed analytics, leveraging Exasol’s massively parallel processing (MPP) and in-memory engine. It is the ideal solution for powering your most demanding analytics and AI/ML workloads with unmatched speed and efficiency.
MariaDB ColumnStore uses a shared-nothing, distributed architecture with separate modules for SQL and storage, enabling scalable, high-performance analytics.
Managing MariaDB ColumnStore involves setup, configuration, and tools like mcsadmin and cpimport for efficient analytics.
This section provides instructions for installing and configuring MariaDB ColumnStore. It covers various deployment scenarios, including single- and multi-node setups with both local and S3 storage.
Managing MariaDB ColumnStore means deploying its architecture, scaling modules, and maintaining performance through monitoring, optimization, and backups.
MariaDB ColumnStore backup and restore manage distributed data using snapshots or tools like mariadb-backup, with restoration ensuring cluster sync via cpimport or file system recovery.
MariaDB ColumnStore uses MariaDB Server’s security—encryption, access control, auditing, and firewall—for secure analytics.
MariaDB ColumnStore ensures high availability with multi-node setups and shared storage, while MaxScale adds monitoring and failover for continuous analytics.
MariaDB ColumnStore supports standard MariaDB tools, BI connectors (e.g., Tableau, Power BI), data ingestion (cpimport, Kafka), and REST APIs for admin.
The ColumnStore StorageManager handles columnar data storage and retrieval, including persistence to S3-compatible object storage, in support of analytical queries.
MariaDB ColumnStore is ideal for real-time analytics and complex queries on large datasets across industries.
MariaDB ColumnStore's query plans and Optimizer Trace show how analytical queries run in parallel across its distributed, columnar architecture, aiding performance tuning.
MariaDB ColumnStore query tuning optimizes analytics using data types, joins, projection elimination, WHERE clauses, and EXPLAIN for performance insights.
Quickstart guide for MariaDB ColumnStore hardware requirements
MariaDB ColumnStore is designed for analytical workloads and scales linearly with hardware resources. While the performance generally improves with more CPU cores, memory, and servers, understanding the minimum hardware specifications is crucial for successful deployment, especially in development and production environments.
MariaDB ColumnStore's performance directly benefits from additional hardware resources. More CPU cores enable greater parallel processing, increased memory allows for more data caching (reducing I/O), and more servers enable a larger distributed architecture.
The specifications differentiate between a basic development environment and a production-ready setup:
1. For Development Environments:
CPU: A minimum of 8 CPU cores.
Memory (RAM): A minimum of 32 GB.
Storage: Local disk storage is acceptable for development purposes.
2. For Production Environments:
CPU: A minimum of 64 CPU cores.
Note: This recommendation underscores the highly parallel nature of ColumnStore, which can effectively utilize a large number of cores for analytical processing.
Memory (RAM): A minimum of 128 GB.
Note: Adequate memory is critical for caching data and intermediate results, directly impacting query performance.
Storage: StorageManager (S3) is recommended.
Note: This implies leveraging cloud-object storage (like AWS S3 or compatible services) for scalable and durable data persistence in production.
Minimum Network: For multi-server ColumnStore deployments, a minimum of a 1 Gigabit (1G) network is recommended.
Note: This facilitates efficient data transfer between nodes via TCP/IP for replication and query processing across the distributed architecture. For optimal performance in heavy-load scenarios, higher bandwidth (e.g., 10G or more) is highly beneficial.
Adhering to these minimum specifications will provide a baseline for ColumnStore functionality. For specific workload requirements, it's always advisable to conduct performance testing and scale hardware accordingly.
When using ColumnStore, MariaDB Server creates a series of system databases used for operational purposes.
calpontsys
Database that maintains table metadata about ColumnStore tables.
infinidb_querystats
Database that maintains information about query performance.
columnstore_info
Database containing stored procedures used to retrieve information about ColumnStore usage.
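For example, table metadata and overall usage can be inspected from these databases. A minimal sketch (the systable column names and the total_usage() procedure are assumptions based on common ColumnStore installations, not confirmed by this page):
$ mariadb -e 'SELECT `schema`, tablename FROM calpontsys.systable LIMIT 5;'
$ mariadb -e 'CALL columnstore_info.total_usage();'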
MariaDB Enterprise ColumnStore minimizes locking for analytical workloads, bulk data loads, and online schema changes.
MariaDB Enterprise ColumnStore supports lockless reads.
MariaDB Enterprise ColumnStore requires a table lock for write operations.
MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.
When a bulk data load is running:
Read queries will not be blocked.
Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.
The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.
For additional information, see "MariaDB Enterprise ColumnStore Data Loading".
MariaDB Enterprise ColumnStore supports online schema changes, so that supported DDL operations can be performed without blocking reads. The supported DDL operations only require a write metadata lock (MDL) on the target table.
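For example, one way to watch metadata locks during a bulk load is the METADATA_LOCK_INFO plugin. A minimal sketch, assuming the plugin is available in your MariaDB build:
$ mariadb -e "INSTALL SONAME 'metadata_lock_info';"   # once, if not already loaded
$ mariadb -e "SELECT * FROM information_schema.METADATA_LOCK_INFO;"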
To switchover to a new primary node with Enterprise ColumnStore, perform the following procedure.
The primary node can be switched in MaxScale using maxctrl:
Use maxctrl or another supported REST client.
Call a module command using the call command command.
As the first argument, provide the name of the module, which is mariadbmon.
As the second argument, provide the module command, which is switchover.
As the third argument, provide the name of the monitor.
For example:
maxctrl call command \
   mariadbmon \
   switchover \
   mcs_monitor
With the above syntax, MaxScale will choose the most up-to-date replica to be the new primary.
If you want to manually select a new primary, provide the server name of the new primary as the fourth argument:
maxctrl call command \
   mariadbmon \
   switchover \
   mcs_monitor \
   mcs2
MaxScale can check the status of the servers using maxctrl:
List the servers using the list servers command, like this:
maxctrl list servers
If switchover was properly performed, the State column of the new primary shows Master, Running.
MariaDB Enterprise ColumnStore acquires table locks for some operations, and it provides utilities to view and clear those locks.
MariaDB Enterprise ColumnStore acquires table locks for some operations, such as:
DDL statements
DML statements
Bulk data loads
If an operation fails, the table lock does not always get released. If you try to access the table, you can see errors like the following:
ERROR 1815 (HY000): Internal error: CAL0009: Drop table failed due to IDB-2009: Unable to perform the drop table operation because cpimport with PID 16301 is currently holding the table lock for session -1.
To solve this problem, MariaDB Enterprise ColumnStore provides two utilities to view and clear the table locks:
cleartablelock
viewtablelock
The viewtablelock utility shows table locks currently held by MariaDB Enterprise ColumnStore:
To view all table locks:
viewtablelock
There is 1 table lock
Table              LockID  Process   PID    Session   Txn  CreationTime               State    DBRoots
hq_sales.invoices  1       cpimport  16301  BulkLoad  n/a  Wed April 7 14:20:42 2021  LOADING  1
To view table locks for a specific table, specify the database and table:
viewtablelock hq_sales invoices
There is 1 table lock
Table              LockID  Process   PID    Session   Txn  CreationTime               State    DBRoots
hq_sales.invoices  1       cpimport  16301  BulkLoad  n/a  Wed April 7 14:20:42 2021  LOADING  1
The cleartablelock utility clears table locks currently held by MariaDB Enterprise ColumnStore.
To clear a table lock, specify the lock ID shown by the viewtablelock utility:
cleartablelock 1
This page is about security vulnerabilities that have been fixed for or still affect MariaDB ColumnStore. In addition, links are included to fixed security vulnerabilities in MariaDB Server since MariaDB ColumnStore is based on MariaDB Server.
Sensitive security issues can be sent directly to the persons responsible for MariaDB security: security [AT] mariadb (dot) org.
CVE® stands for "Common Vulnerabilities and Exposures". It is a publicly available and free-to-use database of known software vulnerabilities, maintained at cve.org.
The release notes document the CVEs fixed within a given release. Additional information can also be found at Security Vulnerabilities Fixed in MariaDB.
There are no known CVEs on ColumnStore-specific infrastructure outside of the MariaDB server at this time.
In MariaDB Enterprise ColumnStore 6, the ExeMgr process uses optimizer statistics in its query planning process.
ColumnStore uses the optimizer statistics to add support for queries that contain circular inner joins.
In Enterprise ColumnStore 5 and before, ColumnStore would raise the following error when a query containing a circular inner join was executed:
ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported.
The optimizer statistics store each column's NDV (Number of Distinct Values), which can help the ExeMgr process choose the optimal join order for queries with circular joins. When Enterprise ColumnStore executes a query with a circular join, the query's execution can take longer if ColumnStore chooses a sub-optimal join order. When you collect optimizer statistics for your ColumnStore tables, the ExeMgr process is less likely to choose a sub-optimal join order.
Enterprise ColumnStore's optimizer statistics can be collected for ColumnStore tables by executing ANALYZE TABLE:
ANALYZE TABLE columnstore_tab;
Enterprise ColumnStore's optimizer statistics are not updated automatically. To update the optimizer statistics for a ColumnStore table, ANALYZE TABLE must be re-executed.
Enterprise ColumnStore does not implement an interface to show optimizer statistics.
The ColumnStore storage engine uses a ColumnStore Execution Plan (CSEP) to represent a query plan internally.
When the select handler receives the SELECT_LEX object, it transforms it into a CSEP as part of the query planning and optimization process. For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation."
The CSEP for a given query can be viewed by performing the following:
Calling the calSetTrace(1) function:
SELECT calSetTrace(1);
Executing the query:
SELECT column1, column2
FROM columnstore_tab
WHERE column1 > '2020-04-01'
AND column1 < '2020-11-01';
Calling the calGetTrace() function:
SELECT calGetTrace();
# Sample storagemanager.cnf
[ObjectStorage]
service = S3
object_size = 5M
metadata_path = /var/lib/columnstore/storagemanager/metadata
journal_path = /var/lib/columnstore/storagemanager/journal
max_concurrent_downloads = 21
max_concurrent_uploads = 21
common_prefix_depth = 3
[S3]
region = us-west-1
bucket = my_columnstore_bucket
endpoint = s3.amazonaws.com
aws_access_key_id = AKIAR6P77BUKULIDIL55
aws_secret_access_key = F38aR4eLrgNSWPAKFDJLDAcax0gZ3kYblU79
[LocalStorage]
path = /var/lib/columnstore/storagemanager/fake-cloud
fake_latency = n
max_latency = 50000
[Cache]
cache_size = 2g
path = /var/lib/columnstore/storagemanager/cache
Learn about data ingestion for MariaDB ColumnStore. This section covers various methods and tools for efficiently loading large datasets into your columnar database for analytical workloads.
ColumnStore provides several mechanisms to ingest data:
cpimport provides the fastest performance for inserting data and the ability to route data to particular PrimProc nodes. Normally, it should be the default choice for loading data.
LOAD DATA INFILE provides another means of bulk inserting data.
By default, with autocommit on, it internally streams the data to an instance of the cpimport process.
In transactional mode, DML inserts are performed, which is significantly slower and also consumes both binlog transaction files and ColumnStore VersionBuffer files.
DML, i.e. INSERT, UPDATE, and DELETE, provide row-level changes. ColumnStore is optimized towards bulk modifications, so these operations are slower than they would be in, for instance, InnoDB.
Currently ColumnStore does not support operating as a replication replica target.
Bulk DML operations will in general perform better than multiple individual statements.
INSERT INTO SELECT with autocommit behaves similarly to LOAD DATA INFILE because, internally, it is mapped to cpimport for higher performance.
Bulk update operations based on a join with a small staging table can be relatively fast, especially if updating a single column (a sketch follows this list).
Using ColumnStore Bulk Write SDK or ColumnStore Streaming Data Adapters.
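As an illustration of the staging-table pattern from the list above, a hedged sketch (the price_stage staging table is hypothetical; the products table follows the bulk-import examples used elsewhere in this documentation):
$ mariadb -e "
CREATE TABLE inventory.price_stage (
  product_name VARCHAR(11),
  new_cost VARCHAR(128)
) ENGINE=Columnstore;
-- bulk-load the staging table (cpimport or LOAD DATA INFILE), then:
UPDATE inventory.products p
JOIN inventory.price_stage s ON p.product_name = s.product_name
SET p.unit_cost = s.new_cost;"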
IBM Cloud Object Storage (formerly known as CleverSafe)
Due to the frequent code changes and deviation from the AWS standards, none are approved at this time.
Clients issue a query to the MariaDB Server, which has the ColumnStore storage engine installed. MariaDB Server parses the SQL, identifies the involved ColumnStore tables, and creates an initial logical query execution plan.
Using the ColumnStore storage engine interface (ha_columnstore), MariaDB Server converts involved table references into ColumnStore internal objects. These are then handed off to the ExeMgr, which is responsible for managing and orchestrating query execution across the cluster.
The ExeMgr analyzes the query plan and translates it into a distributed ColumnStore execution plan. It determines the necessary query steps and the execution order, including any required parallelization.
The ExeMgr then references the extent map to identify which PrimProc instances hold the relevant data segments. It applies extent elimination to exclude any PrimProc nodes whose extents do not match the query’s filter criteria.
The ExeMgr dispatches commands to the selected PrimProc instances to perform data block I/O operations.
The PrimProc components perform operations such as:
Predicate filtering
Join processing
Initial aggregation
Data retrieval from local disk or external storage (e.g., S3 or cloud object storage)
They then return intermediate result sets to the ExeMgr.
The ExeMgr handles:
Final-stage aggregation
Window function evaluation
Result-set sorting and shaping
The completed result set is returned to the MariaDB Server, which performs any remaining SQL operations like ORDER BY, LIMIT, or computed expressions in the SELECT list.
Finally, the MariaDB Server returns the result set to the client.
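To observe these stages for a specific query, the calSetTrace()/calGetTrace() functions shown earlier in this documentation can be run in a single session. A minimal sketch, assuming a ColumnStore table named columnstore_tab:
$ mariadb -e "
SELECT calSetTrace(1);
SELECT COUNT(*) FROM columnstore_tab WHERE column1 > '2020-04-01';
SELECT calGetTrace()\G"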
The following table outlines the minimum recommended server specifications, which apply to both on-premises and cloud deployments:
Physical Server: 8 core CPU, 32 GB memory (development); 64 core CPU, 128 GB memory (production)
Storage: Local disk (development); StorageManager (S3) (production)
Network Interconnect: In a multi-server deployment, data will be passed around via TCP/IP networking. At least a 1G network is recommended.
These are minimum recommendations and in general the system will perform better with more hardware:
More CPU cores and servers will improve query processing response time.
More memory will allow the system to cache more data blocks in memory. We have users running systems with anywhere from 64 GB to 2 TB of RAM.
A faster network will allow data to flow faster between PrimProc nodes.
SSDs may be used; however, the system is optimized for block streaming, which may perform well enough with HDDs at lower cost.
Where it is an option, it is recommended to use bare-metal servers for additional performance, since ColumnStore will fully consume CPU cores and memory.
In general, it makes more sense to use a higher core count / higher memory server for single-server or two-server combined deployments.
For AWS, our own internal testing generally uses m4.4xlarge instance types as a cost-effective middle ground. The r4.8xlarge has also been tested and performs about twice as fast for about twice the price.
Step 5: Bulk Import of Data
This page details step 5 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Local storage.
This step bulk imports data to Enterprise ColumnStore.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
Before data can be imported into the tables, create a matching schema.
On the primary server, create the schema:
For each database that you are importing, create the database with the CREATE DATABASE statement:
CREATE DATABASE inventory;
For each table that you are importing, create the table with the CREATE TABLE statement:
CREATE TABLE inventory.products (
   product_name VARCHAR(11) NOT NULL DEFAULT '',
   supplier VARCHAR(128) NOT NULL DEFAULT '',
   quantity VARCHAR(128) NOT NULL DEFAULT '',
   unit_cost VARCHAR(128) NOT NULL DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=utf8;
Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.
MariaDB Enterprise ColumnStore includes cpimport, which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a TSV (tab-separated values) file, on the primary server run cpimport:
$ sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using cpimport, which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a TSV (tab-separated values) file, on the primary server use the LOAD DATA INFILE statement:
LOAD DATA INFILE '/tmp/inventory-products.tsv'
INTO TABLE inventory.products;
MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using a SELECT statement, and then pipe the results into cpimport, which is a command-line utility that is designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a remote MariaDB database:
$ mariadb --quick \
   --skip-column-names \
   --execute="SELECT * FROM inventory.products" \
   | cpimport -s '\t' inventory products
Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:
This page was step 5 of 5.
This procedure is complete.
Step 6: Install MariaDB MaxScale
This page details step 6 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".
This step installs MariaDB MaxScale 22.08. ColumnStore Shared Local Storage requires 1 or more MaxScale nodes.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.
Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.
To retrieve the token for your account:
Navigate to https://customers.mariadb.com/downloads/token/
Log in.
Copy the Customer Download Token.
Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.
On the MaxScale node, install the prerequisites for downloading the software from the Web.
Install on CentOS / RHEL (YUM):
$ sudo yum install curl
Install on Debian / Ubuntu (APT):
$ sudo apt install curl apt-transport-https
On the MaxScale node, configure package repositories and specify MariaDB MaxScale 22.08:
$ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
$ echo "${checksum} mariadb_es_repo_setup" \
   | sha256sum -c -
$ chmod +x mariadb_es_repo_setup
$ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
   --skip-server \
   --skip-tools \
   --mariadb-maxscale-version="22.08"
Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.
On the MaxScale node, install MariaDB MaxScale.
Install on CentOS / RHEL (YUM):
$ sudo yum install maxscale
Install on Debian / Ubuntu (APT):
$ sudo apt install maxscale
Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".
This page was step 6 of 9.
Step 6: Install MariaDB MaxScale
This page details step 6 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".
This step installs MariaDB MaxScale 22.08.
ColumnStore Object Storage requires 1 or more MaxScale nodes.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.
Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.
To retrieve the token for your account:
Navigate to https://customers.mariadb.com/downloads/token/
Log in.
Copy the Customer Download Token.
Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.
On the MaxScale node, install the prerequisites for downloading the software from the Web.
Install on CentOS / RHEL (YUM):
$ sudo yum install curl
Install on Debian / Ubuntu (APT):
$ sudo apt install curl apt-transport-https
On the MaxScale node, configure package repositories and specify MariaDB MaxScale 22.08:
$ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
$ echo "${checksum} mariadb_es_repo_setup" \
   | sha256sum -c -
$ chmod +x mariadb_es_repo_setup
$ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
   --skip-server \
   --skip-tools \
   --mariadb-maxscale-version="22.08"
Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.
On the MaxScale node, install MariaDB MaxScale.
Install on CentOS / RHEL (YUM):
$ sudo yum install maxscale
Install on Debian / Ubuntu (APT):
$ sudo apt install maxscale
Navigation in the procedure "Deploy ColumnStore Object Storage Topology":
This page was step 6 of 9.
MariaDB ColumnStore has a hard limit of 4096 columns per table.
However, you will likely run into other limitations before hitting that limit, including:
Row size limit of tables. This varies depending on the storage engine you're using; MariaDB's maximum row size of 65,535 bytes indirectly limits the number of columns.
Size limit of .frm files. Those files hold the column descriptions of tables. Column descriptions vary in length. Once all column descriptions combined reach a length of 64KB, the table's .frm file is full, limiting the number of columns you can have in a table.
Given that, the maximum number of columns a ColumnStore table can effectively have is around 2000 columns.
MariaDB ColumnStore is a columnar storage engine that utilizes a massively parallel distributed data architecture. It's a columnar storage system built by porting InfiniDB 4.6.7 to MariaDB and released under the GPL license.
From MariaDB 10.5.4, ColumnStore is available as a storage engine for MariaDB Server. Before then, it was available as a separate download.
It is designed for big data scaling to process petabytes of data, linear scalability, and exceptional performance with real-time response to analytical queries. It leverages the I/O benefits of columnar storage, compression, just-in-time projection, and horizontal and vertical partitioning to deliver tremendous performance when analyzing large data sets.
Links:
A Google Group exists for MariaDB ColumnStore that can be used to discuss ideas and issues and communicate with the community: send email to mariadb-columnstore@googlegroups.com or use the group's web interface.
Bugs can be reported in MariaDB Jira (jira.mariadb.org). Please file bugs under the MCOL project and include relevant diagnostic output if possible.
MariaDB ColumnStore is released under the GPL license.
To rejoin a node with Enterprise ColumnStore, perform the following procedure.
The node can be configured to rejoin in MaxScale using maxctrl:
Use maxctrl or another supported REST client.
Call a module command using the call command command.
As the first argument, provide the name of the module, which is mariadbmon.
As the second argument, provide the module command, which is rejoin.
As the third argument, provide the name of the monitor.
As the fourth argument, provide the name of the server.
For example:
maxctrl call command \
   mariadbmon \
   rejoin \
   mcs_monitor \
   mcs3
MaxScale can check the status of the servers using maxctrl:
List the servers using the list servers command, like this:
maxctrl list servers
If the node properly rejoined, the State column of the node shows Slave, Running.
From ColumnStore 5.5.2, you can use AWS IAM roles in order to connect to S3 buckets without explicitly entering credentials into the storagemanager.cnf config file.
You need to modify the IAM role of your Amazon EC2 instance to allow for this. Please follow the AWS documentation on IAM roles for EC2 before beginning this process.
It is important to note that you must update the AWS S3 endpoint based on your chosen region; otherwise, you might face delays in propagation.
For a complete list of AWS service endpoints, visit the AWS documentation.
Edit your Storage Manager configuration file located at /etc/columnstore/storagemanager.cnf to look similar to the example below (replacing the values in the [S3] section with your own):
[ObjectStorage]
service = S3
object_size = 5M
metadata_path = /var/lib/columnstore/storagemanager/metadata
journal_path = /var/lib/columnstore/storagemanager/journal
max_concurrent_downloads = 21
max_concurrent_uploads = 21
common_prefix_depth = 3
[S3]
ec2_iam_mode=enabled
bucket = my_mcs_bucket
region = us-west-2
endpoint = s3.us-west-2.amazonaws.com
[LocalStorage]
path = /var/lib/columnstore/storagemanager/fake-cloud
fake_latency = n
max_latency = 50000
[Cache]
cache_size = 2g
path = /var/lib/columnstore/storagemanager/cache
Step 2: Install Enterprise ColumnStore
This page details step 2 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Local storage.
This step installs MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.
Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.
To retrieve the token for your account:
Navigate to https://customers.mariadb.com/downloads/token/
Log in.
Copy the Customer Download Token.
Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.
On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web.
Install on CentOS / RHEL (YUM):
$ sudo yum install curl
Install on Debian / Ubuntu (APT):
$ sudo apt install curl apt-transport-https
On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:
$ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
$ echo "${checksum} mariadb_es_repo_setup" \
   | sha256sum -c -
$ chmod +x mariadb_es_repo_setup
$ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
   --skip-maxscale \
   --skip-tools \
   --mariadb-server-version="11.4"
Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.
Install additional dependencies:
Install on CentOS / RHEL (YUM):
$ sudo yum install epel-release
$ sudo yum install jemalloc
Install on Debian 10 and Ubuntu 20.04 (APT):
$ sudo apt install libjemalloc2
Install on Debian 9 and Ubuntu 18.04 (APT):
$ sudo apt install libjemalloc1
Install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:
Install on CentOS / RHEL (YUM):
$ sudo yum install MariaDB-server \
   MariaDB-backup \
   MariaDB-shared \
   MariaDB-client \
   MariaDB-columnstore-engine
Install on Debian / Ubuntu (APT):
$ sudo apt install mariadb-server \
   mariadb-backup \
   libmariadb3 \
   mariadb-client \
   mariadb-plugin-columnstore
Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:
This page was step 2 of 5.
Next: Step 3: Start and Configure MariaDB Enterprise ColumnStore.
Step 3: Install MariaDB Enterprise Server
This page details step 3 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".
This step installs MariaDB Enterprise Server, MariaDB Enterprise ColumnStore 23.10, CMAPI, and dependencies.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.
Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.
To retrieve the token for your account:
Navigate to https://customers.mariadb.com/downloads/token/
Log in.
Copy the Customer Download Token.
Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.
On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web.
Install on CentOS / RHEL (YUM):
$ sudo yum install curl
Install on Debian / Ubuntu (APT):
$ sudo apt install curl apt-transport-https
On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:
$ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
$ echo "${checksum} mariadb_es_repo_setup" \
   | sha256sum -c -
$ chmod +x mariadb_es_repo_setup
$ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
   --skip-maxscale \
   --skip-tools \
   --mariadb-server-version="11.4"
Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.
On each Enterprise ColumnStore node, install additional dependencies:
Install on CentOS and RHEL (YUM):
$ sudo yum install jemalloc jq curl
Install on Debian 9 and Ubuntu 18.04 (APT):
$ sudo apt install libjemalloc1 jq curl
Install on Debian 10 and Ubuntu 20.04 (APT):
$ sudo apt install libjemalloc2 jq curl
On each Enterprise ColumnStore node, install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:
Install on CentOS / RHEL (YUM):
$ sudo yum install MariaDB-server \
   MariaDB-backup \
   MariaDB-shared \
   MariaDB-client \
   MariaDB-columnstore-engine \
   MariaDB-columnstore-cmapi
Install on Debian / Ubuntu (APT):
$ sudo apt install mariadb-server \
   mariadb-backup \
   libmariadb3 \
   mariadb-client \
   mariadb-plugin-columnstore \
   mariadb-columnstore-cmapi
Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".
This page was step 3 of 9.
Next: Step 4: Start and Configure MariaDB Enterprise Server.
Step 3: Install MariaDB Enterprise Server
This page details step 3 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".
This step installs MariaDB Enterprise Server, MariaDB Enterprise ColumnStore 23.10, CMAPI, and dependencies.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.
Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.
To retrieve the token for your account:
Navigate to https://customers.mariadb.com/downloads/token/
Log in.
Copy the Customer Download Token.
Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.
On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web.
Install on CentOS / RHEL (YUM):
$ sudo yum install curl
Install on Debian / Ubuntu (APT):
$ sudo apt install curl apt-transport-https
On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:
$ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
$ echo "${checksum} mariadb_es_repo_setup" \
   | sha256sum -c -
$ chmod +x mariadb_es_repo_setup
$ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
   --skip-maxscale \
   --skip-tools \
   --mariadb-server-version="11.4"
Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.
On each Enterprise ColumnStore node, install additional dependencies:
Install on CentOS and RHEL (YUM):
$ sudo yum install jemalloc jq curl
Install on Debian 9 and Ubuntu 18.04 (APT):
$ sudo apt install libjemalloc1 jq curl
Install on Debian 10 and Ubuntu 20.04 (APT):
$ sudo apt install libjemalloc2 jq curl
On each Enterprise ColumnStore node, install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:
Install on CentOS / RHEL (YUM):
$ sudo yum install MariaDB-server \
   MariaDB-backup \
   MariaDB-shared \
   MariaDB-client \
   MariaDB-columnstore-engine \
   MariaDB-columnstore-cmapi
Install on Debian / Ubuntu (APT):
$ sudo apt install mariadb-server \
   mariadb-backup \
   libmariadb3 \
   mariadb-client \
   mariadb-plugin-columnstore \
   mariadb-columnstore-cmapi
Navigation in the procedure "Deploy ColumnStore Object Storage Topology".
This page was step 3 of 9.
Next: Step 4: Start and Configure MariaDB Enterprise Server.
Step 2: Install Enterprise ColumnStore
This page details step 2 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.
This step installs MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.
Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.
To retrieve the token for your account:
Navigate to https://customers.mariadb.com/downloads/token/
Log in.
Copy the Customer Download Token.
Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.
On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web.
Install on CentOS / RHEL (YUM):
$ sudo yum install curl
Install on Debian / Ubuntu (APT):
$ sudo apt install curl apt-transport-https
On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:
$ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
$ echo "${checksum} mariadb_es_repo_setup" \
   | sha256sum -c -
$ chmod +x mariadb_es_repo_setup
$ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
   --skip-maxscale \
   --skip-tools \
   --mariadb-server-version="11.4"
Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.
Install additional dependencies:
Install on CentOS / RHEL (YUM):
$ sudo yum install epel-release
$ sudo yum install jemalloc
Install on Debian 10 and Ubuntu 20.04 (APT):
$ sudo apt install libjemalloc2
Install on Debian 9 and Ubuntu 18.04 (APT):
$ sudo apt install libjemalloc1
Install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:
Install on CentOS / RHEL (YUM):
$ sudo yum install MariaDB-server \
   MariaDB-backup \
   MariaDB-shared \
   MariaDB-client \
   MariaDB-columnstore-engine
Install on Debian / Ubuntu (APT):
$ sudo apt install mariadb-server \
   mariadb-backup \
   libmariadb3 \
   mariadb-client \
   mariadb-plugin-columnstore
Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:
This page was step 2 of 5.
Next: Step 3: Start and Configure MariaDB Enterprise ColumnStore.
Step 5: Bulk Import of Data
This page details step 5 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.
This step bulk imports data to Enterprise ColumnStore.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
Before data can be imported into the tables, create a matching schema.
On the primary server, create the schema:
For each database that you are importing, create the database with the CREATE DATABASE statement:
CREATE DATABASE inventory;
For each table that you are importing, create the table with the CREATE TABLE statement:
CREATE TABLE inventory.products (
   product_name VARCHAR(11) NOT NULL DEFAULT '',
   supplier VARCHAR(128) NOT NULL DEFAULT '',
   quantity VARCHAR(128) NOT NULL DEFAULT '',
   unit_cost VARCHAR(128) NOT NULL DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=utf8;
Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.
MariaDB Enterprise ColumnStore includes cpimport, which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a TSV (tab-separated values) file, on the primary server run cpimport:
$ sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using cpimport, which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a TSV (tab-separated values) file, on the primary server use the LOAD DATA INFILE statement:
LOAD DATA INFILE '/tmp/inventory-products.tsv'
INTO TABLE inventory.products;
MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using a SELECT statement, and then pipe the results into cpimport, which is a command-line utility that is designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a remote MariaDB database:
$ mariadb --quick \
   --skip-column-names \
   --execute="SELECT * FROM inventory.products" \
   | cpimport -s '\t' inventory products
Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:
This page was step 5 of 5.
This procedure is complete.
To set a node to maintenance mode with Enterprise ColumnStore, perform the following procedure.
The server object for the node can be set to maintenance mode in MaxScale using maxctrl:
Use maxctrl or another supported REST client.
Set the server object to maintenance mode using the set server command.
As the first argument, provide the name of the server.
As the second argument, provide maintenance as the state.
For example:
maxctrl set server \
   mcs3 \
   maintenance
If the specified server is a primary server, then MaxScale will allow open transactions to complete before closing any connections.
If you would like MaxScale to immediately close all connections, the --force option can be provided as a third argument:
maxctrl set server \
   mcs3 \
   maintenance \
   --force
Confirm the state of the server object in MaxScale using maxctrl:
List the servers using the list servers command, like this:
maxctrl list servers
If the node is properly in maintenance mode, then the State column will show Maintenance as one of the states.
Now that the server is in maintenance mode in MaxScale, you can perform your maintenance.
While the server is in maintenance mode:
MaxScale doesn't route traffic to the node.
MaxScale doesn't select the node to be primary during failover.
The node can be rebooted.
The node's services can be restarted.
Maintenance mode for the server object for the node can be cleared in MaxScale using maxctrl:
Use maxctrl or another supported REST client.
Clear the server object's state using the clear server command.
As the first argument, provide the name of the server.
As the second argument, provide maintenance as the state.
For example:
maxctrl clear server \
   mcs3 \
   maintenance
Confirm the state of the server object in MaxScale using maxctrl:
List the servers using the list servers command, like this:
maxctrl list servers
If the node is no longer in maintenance mode, the State column no longer shows Maintenance as one of the states.
MariaDB ColumnStore utilizes an Extent Map to manage data distribution across extents—logical blocks within physical segment files ranging from 8 to 64 MB. Each extent holds a consistent number of rows, with the Extent Map cataloging these extents, their corresponding block identifiers (LBIDs), and the minimum and maximum values for each column's data within the extent.
The primary node maintains the master copy of the Extent Map. Upon system startup, this map is loaded into memory and propagated to other nodes for redundancy and quick access. Corruption of the master Extent Map can render the system unusable and lead to data loss.
ColumnStore's extent map is a smart structure that underpins its performance. By providing a logical partitioning scheme, it avoids the overhead associated with indexing and other common row-based database optimizations.
The primary node in a ColumnStore cluster holds the master copy of the extent map. Upon system startup, this master copy is read into memory and then replicated to all other participating nodes for high availability and disaster recovery. Nodes keep the extent map in memory for rapid access during query processing. As data within extents is modified, these updates are broadcast to all participating nodes to maintain consistency.
If the master copy of the extent map becomes corrupted, the entire system could become unusable, potentially leading to data loss. Having a recent backup of the extent map allows for a much faster recovery compared to reloading the entire database in such a scenario.
To safeguard against potential Extent Map corruption, regularly back up the master copy:
Lock Table:
mariadb -e "FLUSH TABLES WITH READ LOCK;"
Save BRM:
save_brm
Create Backup Directory:
mkdir -p /extent_map_backup
Copy Extent Map:
cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /extent_map_backup
Unlock Tables:
mariadb -e "UNLOCK TABLES;"
To restore the backup on a single node:
Stop ColumnStore:
systemctl stop mariadb-columnstore
Rename Corrupted Map:
mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
Clear Versioning Files:
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
Restore Backup:
cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
Set Ownership:
chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm/
Start ColumnStore:
systemctl start mariadb-columnstore
To restore the backup on a multi-node cluster:
Shutdown Cluster:
curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/shutdown \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:your_api_key' \
   --data '{"timeout":60}' -k
Rename Corrupted Map:
mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
Clear Versioning Files:
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
Restore Backup:
cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
Set Ownership:
chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm
Start Cluster:
curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/start \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:your_api_key' \
   --data '{"timeout":60}' -k
Incorporate the save_brm command into your data import scripts (e.g., those using cpimport) to automate Extent Map backups. This practice ensures regular backups without manual intervention.
Refer to the MariaDB ColumnStore Backup Script for an example implementation.
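For instance, a minimal wrapper script along these lines could snapshot the Extent Map after each bulk load (paths and table names follow the examples on this page; adjust for your environment):
#!/bin/bash
# Hypothetical import wrapper: bulk-load, then back up the extent map.
set -e

# Load the data (same cpimport invocation as the bulk-import examples).
cpimport -s '\t' inventory products /tmp/inventory-products.tsv

# Persist the current BRM state, then copy the extent map to the backup directory.
save_brm
mkdir -p /extent_map_backup
cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /extent_map_backup/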
Starting with MariaDB Enterprise ColumnStore 6.2.3, ColumnStore supports encryption for user passwords stored in Columnstore.xml:
Encryption keys are created with the cskeys utility
Passwords are encrypted using the cspasswd utility
MariaDB Enterprise ColumnStore 6
MariaDB Enterprise ColumnStore 22.08
MariaDB Enterprise ColumnStore 23.02
MariaDB Enterprise ColumnStore stores its password encryption keys in the plain-text file /var/lib/columnstore/.secrets.
The encryption keys are not created by default, but can be generated by executing the cskeys utility:
$ cskeys
In a multi-node Enterprise ColumnStore cluster, every ColumnStore node should have the same encryption keys. Therefore, it is recommended to execute cskeys on the primary server and then copy /var/lib/columnstore/.secrets to every other ColumnStore node and fix the file's permissions:
$ scp 192.0.2.1:/var/lib/columnstore/.secrets /var/lib/columnstore/.secrets
$ sudo chown mysql:mysql /var/lib/columnstore/.secrets
$ sudo chmod 0400 /var/lib/columnstore/.secrets
To encrypt a password:
Generate an encrypted password using the cspasswd utility:
$ cspasswd util_user_passwd
If the --interactive command-line option is specified, cspasswd prompts for the password.
Set the encrypted password in Columnstore.xml using the mcsSetConfig utility:
$ sudo mcsSetConfig CrossEngineSupport Password util_user_encrypted_passwd
To decrypt a password, execute the cspasswd utility and specify the --decrypt command-line option:
$ cspasswd --decrypt util_user_encrypted_passwd
MariaDB Enterprise ColumnStore supports backup and restore.
Before you determine a backup strategy for your Enterprise ColumnStore deployment, it is a good idea to determine the system of record for your Enterprise ColumnStore data.
A system of record is the authoritative data source for a given piece of information. Organizations often store duplicate information in several systems, but only a single system can be the authoritative data source.
Enterprise ColumnStore is designed to handle analytical processing for OLAP, data warehousing, DSS, and hybrid workloads on very large data sets. Analytical processing does not generally happen on the system of record. Instead, analytical processing generally occurs on a specialized database that is loaded with data from the separate system of record. Additionally, very large data sets can be difficult to back up. Therefore, it may be beneficial to only backup the system of record.
If Enterprise ColumnStore is not acting as the system of record for your data, you should determine how the system of record affects your backup plan:
If your system of record is another database server, you should ensure that the other database server is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the other database server.
If your system of record is a set of data files, you should ensure that the set of data files is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the set of data files.
MariaDB Enterprise ColumnStore supports full backup and restore for all storage types. A full backup includes:
Enterprise ColumnStore's data and metadata
With S3: an S3 snapshot of the S3-compatible object storage and a file system snapshot or copy of the Storage Manager directory.
Without S3: a file system snapshot or copy of the DB Root directories.
The MariaDB data directory from the primary node
To see the procedure to perform a full backup and restore, choose the storage type:
Quickstart guide for MariaDB ColumnStore
MariaDB ColumnStore is a specialized columnar storage engine designed for high-performance analytical processing and big data workloads. Unlike traditional row-based storage engines, ColumnStore organizes data by columns, which is highly efficient for analytical queries that often access only a subset of columns across vast datasets.
MariaDB ColumnStore is a columnar storage engine that integrates with MariaDB Server. It employs a massively parallel distributed data architecture, making it ideal for processing petabytes of data with linear scalability. It was originally ported from InfiniDB and is released under the GPL license.
Exceptional Analytical Performance: Delivers superior performance for complex analytical queries (OLAP) due to its columnar nature, which minimizes disk I/O by reading only necessary columns.
High Data Compression: Columnar storage allows for much higher compression ratios compared to row-based storage, reducing disk space usage and improving query speed.
Massive Scalability: Designed to scale horizontally across multiple nodes, processing petabytes of data with ease.
Just-in-Time Projection: Only the required columns are processed and returned, further optimizing query execution.
Real-time Analytics: Capable of handling real-time analytical queries efficiently.
MariaDB ColumnStore utilizes a distributed architecture with different components working together:
User Module (UM): Handles incoming SQL queries, optimizes them for columnar processing, and distributes tasks.
Performance Module (PM): Manages data storage, compression, and execution of query fragments on the data segments.
Data Files: Data is stored in column-segments across the nodes, highly compressed.
MariaDB ColumnStore is installed as a separate package that integrates with MariaDB Server. The exact installation steps vary depending on your operating system and desired deployment type (single server or distributed cluster).
General Steps (conceptual):
Install MariaDB Server: Ensure you have a compatible MariaDB Server version installed (e.g., MariaDB 10.5.4 or later).
Install ColumnStore Package: Download and install the specific MariaDB ColumnStore package for your OS. This package includes the ColumnStore storage engine and its associated tools.
Linux (e.g., Debian/Ubuntu): You would typically add the MariaDB repository configured for ColumnStore and then install mariadb-plugin-columnstore.
Single Server vs. Distributed: For a single-server setup, you install all ColumnStore components on one machine. For a distributed setup, you install and configure components across multiple machines.
Configure MariaDB: After installation, you might need to adjust your MariaDB server configuration (my.cnf or equivalent) to properly load and manage the ColumnStore engine.
Initialize ColumnStore: Run a specific columnstore-setup or post-install script to initialize the ColumnStore environment.
Once MariaDB ColumnStore is installed and configured, you can create and interact with ColumnStore tables using standard SQL.
Specify ENGINE=ColumnStore when creating your table. Note that ColumnStore tables do not support primary keys in the same way as InnoDB, as their primary focus is analytical processing.
You can insert data using standard INSERT statements. For large datasets, bulk loading utilities (for instance, LOAD DATA INFILE) are highly recommended for performance.
Perform analytical queries. ColumnStore will efficiently process these, often leveraging its columnar nature and parallelism.
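A minimal end-to-end sketch (the analytics schema and events table are hypothetical):
$ mariadb -e "
CREATE DATABASE IF NOT EXISTS analytics;
CREATE TABLE analytics.events (
  event_date DATE,
  user_id BIGINT,
  event_type VARCHAR(32)
) ENGINE=ColumnStore;
INSERT INTO analytics.events VALUES
  ('2024-01-01', 1, 'click'),
  ('2024-01-01', 2, 'view'),
  ('2024-01-02', 1, 'view');
SELECT event_type, COUNT(*) AS events
FROM analytics.events
GROUP BY event_type;"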
MariaDB offers varied deployment topologies by workload and technology, each named and diagrammed with benefits listed. Custom configurations are also supported.
MariaDB products can be deployed in many different topologies. The topologies described in this section are representative of the overall structure. MariaDB products can be deployed to form other topologies, leverage advanced product capabilities, or combine the capabilities of multiple topologies.
Topologies are the arrangements of nodes and links to achieve a purpose. This documentation describes a few of the many topologies that can be deployed using MariaDB database products.
We group topologies by workload (transactional, analytical, or hybrid) and technologies (Enterprise Spider). Single-node topologies are listed separately.
To help you select the correct topology:
Each topology is named, and this name is used consistently throughout the documentation.
A thumbnail diagram provides a small-scale summary of the topology's architecture.
Finally, we provide a list of the benefits of the topology.
Although multiple topologies are listed on this page, the listed topologies are not the only options. MariaDB products are flexible, configurable, and extensible, so it is possible to deploy different topologies that combine the capabilities of multiple topologies listed on this page. The topologies listed on this page are primarily intended to be representative of the most commonly requested use cases.
The Read Replicas feature in MariaDB ColumnStore enables horizontal scaling of read performance by incorporating read-only nodes into a multi-node cluster. These replicas differ from standard ColumnStore nodes, in that they don't run the WriteEngineServer process. This means Read Replica nodes cannot handle write operations directly — instead, any write queries attempted on a replica are automatically forwarded to a read-write (RW) node.
Replicas utilize shared storage with other nodes in the cluster, ensuring data consistency without duplication. A key requirement is maintaining at least one RW node — a cluster consisting solely of read replicas is not operational and cannot process reads or writes.
Read-only nodes are incompatible with S3 as the storage backend.
Additionally, there is no automatic promotion of a read replica to RW mode if the only RW node fails, which could lead to temporary downtime until manual intervention.
Horizontal Read Scaling: Adds compute power for handling more read-intensive queries without impacting write performance.
Write Forwarding: Ensures writes on replicas are redirected to RW nodes, maintaining data integrity.
Shared Storage: Replicas access the same DBRoots as RW nodes, promoting efficiency and reducing storage overhead.
Add Read Replica. To introduce a read-only node for scaling reads, add the node through the cluster management API (a hedged sketch follows this list).
Remove Node. To safely remove any node (RW or replica) from the cluster, remove it through the same API. This reassigns resources as needed without cluster disruption.
Verify Status. To monitor the cluster's health and node roles, query the cluster status endpoint.
Node addition is restricted to private IPs only.
Incompatible with S3 storage, limiting use to shared file systems.
No automatic failover or promotion mechanism if the sole RW node goes down, requiring manual recovery.
At least one RW node must always be present for the cluster to function properly, supporting both read and write operations.
Refer to the shared storage configuration documentation for exact mount point details.
Set Up MariaDB Repository
Run the following to add the MariaDB repository (adjust "11.4" to the latest stable version):
See the MariaDB Enterprise Server repository setup documentation for additional details about the ES repo setup.
Install Packages
For RPM-based systems, run this command:
Refer to the MariaDB Enterprise ColumnStore installation documentation for additional information.
For DEB-based systems, run these commands:
Start and Enable Services
Configure the Initial RW Node
On the primary RW node, set up the cluster API key (use a secure API key):
Add the Initial RW Node to the Cluster
Run this from the primary RW node:
Add Read Replica Nodes
From the primary RW node, add each read replica:
Verify the Cluster
Check the status to ensure nodes are added and the cluster is healthy:
Configure Replication Between Nodes
See the ColumnStore replication documentation for instructions on setting up replication, creating user accounts, and configuring replication for multi-node local storage.
Configure MaxScale
See the MaxScale configuration documentation for instructions.
Step 1: Prepare Systems for Enterprise ColumnStore Nodes
This page details step 1 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Local storage.
This step prepares the system to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Enterprise ColumnStore performs best with Linux kernel optimizations.
On each server to host an Enterprise ColumnStore node, optimize the kernel:
Set the relevant kernel parameters in a sysctl configuration file. To ensure proper change management, use an Enterprise ColumnStore-specific configuration file.
Create a /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf file:
Use the sysctl command to set the kernel parameters at runtime:
The Linux Security Modules (LSM) should be temporarily disabled on each Enterprise ColumnStore node during installation.
The LSM will be configured and re-enabled later in this deployment procedure.
The steps to disable the LSM depend on the specific LSM used by the operating system.
SELinux must be set to permissive mode before installing MariaDB Enterprise ColumnStore.
To set SELinux to permissive mode:
Set SELinux to permissive mode:
Set SELinux to permissive mode by setting SELINUX=permissive in /etc/selinux/config.
For example, the file will usually look like this after the change:
Confirm that SELinux is in permissive mode:
SELinux will be configured and re-enabled later in this deployment procedure. This configuration is not persistent. If you restart the server before configuring and re-enabling SELinux later in the deployment procedure, you must reset the enforcement to permissive mode.
AppArmor must be disabled before installing MariaDB Enterprise ColumnStore.
Disable AppArmor:
Reboot the system.
Confirm that no AppArmor profiles are loaded using aa-status:
AppArmor will be configured and re-enabled later in this deployment procedure.
When using MariaDB Enterprise ColumnStore, it is recommended to set the system's locale to UTF-8.
On RHEL 8, install additional dependencies:
Set the system's locale to en_US.UTF-8 by executing localedef:
Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:
This page was step 1 of 5.
Step 9: Import Data
This page details step 9 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".
This step bulk imports data to Enterprise ColumnStore.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
Before data can be imported into the tables, create a matching schema.
On the primary server, create the schema:
For each database that you are importing, create the database with the CREATE DATABASE statement:
For each table that you are importing, create the table with the CREATE TABLE statement:
Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.
MariaDB Enterprise ColumnStore includes cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a TSV (tab-separated values) file, on the primary server run cpimport:
When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a TSV (tab-separated values) file, on the primary server use the LOAD DATA INFILE statement:
MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using the SELECT statement and pipe the results into cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a remote MariaDB database:
Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".
This page was step 9 of 9.
This procedure is complete.
Step 9: Import Data
This page details step 9 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".
This step bulk imports data to Enterprise ColumnStore.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
Before data can be imported into the tables, create a matching schema.
On the primary server, create the schema:
For each database that you are importing, create the database with the CREATE DATABASE statement:
For each table that you are importing, create the table with the CREATE TABLE statement:
Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.
MariaDB Enterprise ColumnStore includes cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a TSV (tab-separated values) file, on the primary server run cpimport:
When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a TSV (tab-separated values) file, on the primary server use the LOAD DATA INFILE statement:
MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using the SELECT statement and pipe the results into cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.
To import your data from a remote MariaDB database:
Navigation in the procedure "Deploy ColumnStore Object Storage Topology":
This page was step 9 of 9.
This procedure is complete.
Step 1: Prepare Systems for Enterprise ColumnStore Nodes
This page details step 1 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.
This step prepares the system to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Enterprise ColumnStore performs best with Linux kernel optimizations.
On each server to host an Enterprise ColumnStore node, optimize the kernel:
Set the relevant kernel parameters in a sysctl configuration file. To ensure proper change management, use an Enterprise ColumnStore-specific configuration file.
Create a /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf file:
Use the sysctl command to set the kernel parameters at runtime:
The Linux Security Modules (LSM) should be temporarily disabled on each Enterprise ColumnStore node during installation.
The LSM will be configured and re-enabled later in this deployment procedure.
The steps to disable the LSM depend on the specific LSM used by the operating system.
SELinux must be set to permissive mode before installing MariaDB Enterprise ColumnStore.
To set SELinux to permissive mode:
Set SELinux to permissive mode:
Set SELinux to permissive mode by setting SELINUX=permissive in /etc/selinux/config.
For example, the file will usually look like this after the change:
Confirm that SELinux is in permissive mode:
SELinux will be configured and re-enabled later in this deployment procedure. This configuration is not persistent. If you restart the server before configuring and re-enabling SELinux later in the deployment procedure, you must reset the enforcement to permissive mode.
AppArmor must be disabled before installing MariaDB Enterprise ColumnStore.
Disable AppArmor:
Reboot the system.
Confirm that no AppArmor profiles are loaded using aa-status:
AppArmor will be configured and re-enabled later in this deployment procedure.
When using MariaDB Enterprise ColumnStore, it is recommended to set the system's locale to UTF-8.
On RHEL 8, install additional dependencies:
Set the system's locale to en_US.UTF-8 by executing localedef:
If you want to use S3-compatible storage, it is important to create the S3 bucket before you start ColumnStore. If you already have an S3 bucket, confirm that the bucket is empty.
S3 bucket configuration will be performed later in this procedure.
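For example, a hedged sketch using the AWS CLI; the bucket name and region are placeholders:
aws s3 mb s3://my-columnstore-bucket --region us-east-1
# an empty listing confirms the bucket exists and contains no objects
aws s3 ls s3://my-columnstore-bucket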
Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:
This page was step 1 of 5.
This page provides a major release upgrade procedure for MariaDB Enterprise ColumnStore. A major release upgrade is an upgrade from an older major release to a newer major release, such as an upgrade from MariaDB Enterprise ColumnStore 5 to MariaDB Enterprise ColumnStore 22.08.
Enterprise ColumnStore 5
Enterprise ColumnStore 6
Enterprise ColumnStore 22.08
This procedure assumes that the new Enterprise ColumnStore version will be installed onto new servers.
To reuse existing servers for the new Enterprise ColumnStore version, you must adapt the procedure detailed below. After step 1, confirm that all data has been backed up and verify the backups. The old version of Enterprise ColumnStore should then be uninstalled, and all Enterprise ColumnStore files should be deleted, before continuing with step 2.
On the old ColumnStore cluster, perform a full backup.
MariaDB recommends backing up the table schemas to a single SQL file and backing up the table data to table-specific CSV files.
For each table, obtain the table's schema by executing the SHOW CREATE TABLE statement:
Back up the table schemas by copying the output to an SQL file. This procedure assumes that the SQL file is named schema-backup.sql.
For each table, back up the table data to a CSV file using the SELECT ... INTO OUTFILE statement:
Copy the SQL file containing the table schemas and the CSV files containing the table data to the primary node of the new ColumnStore cluster.
On the new ColumnStore cluster, follow the deployment instructions of the desired topology for the new ColumnStore version.
For deployment instructions, see the deployment documentation for the desired topology.
On the new ColumnStore cluster, restore the table schemas and data.
Restore the schema backup using the mariadb client:
HOST and PORT should refer to the following:
If you are connecting with MaxScale as a proxy, they should refer to the host and port of the MaxScale listener
If you are connecting directly to a multi-node ColumnStore cluster, they should refer to the host and port of the primary ColumnStore node
If you are connecting directly to single-node ColumnStore, they should refer to the host and port of the ColumnStore node
When the command is executed, the mariadb client prompts for the user password
For each table, restore the data from the table's CSV file by executing cpimport on the primary ColumnStore node:
On the new ColumnStore cluster, verify that the table schemas and data have been restored.
For each table, verify the table's definition by executing the SHOW CREATE TABLE statement:
For each table, verify the number of rows in the table by executing SELECT COUNT(*):
For each table, verify the data in the table by executing a SELECT statement.
If the table is very large, you can limit the number of rows in the result set by adding a LIMIT clause:
A number of system configuration variables allow fine-tuning of the system to suit the physical hardware and query characteristics. In general, the default values work well for many cases.
The configuration parameters are maintained in the /etc/Columnstore.xml file. In a multiple-server deployment, edit this file only on the PM1 server; the system automatically replicates the changes to the other servers. A system restart is required for a configuration change to take effect.
Convenience utility programs getConfig and setConfig are available to safely update Columnstore.xml without editing XML directly. The -h argument displays usage information.
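For example, a minimal sketch of reading and then changing the block cache percentage; the DBBC section name is an assumption about the Columnstore.xml layout, so verify it against your installed file:
getConfig DBBC NumBlocksPct
setConfig DBBC NumBlocksPct 50
# restart ColumnStore afterwards so the new value takes effect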
The NumBlocksPct configuration parameter specifies the percentage of physical memory to utilize for disk block caching. Depending on the release and deployment type, the default value is 25 or 50; in both cases the default is chosen to leave enough physical memory for other processes.
The TotalUmMemory configuration parameter specifies the percentage of physical memory to utilize for joins, intermediate results, and set operations. It sets an upper limit for small-table results in joins rather than a pre-allocation of memory. Depending on the release and deployment type, the default value is 50 or 25.
In a single-server or combined deployment, the sum of NumBlocksPct and TotalUmMemory should typically not exceed 75% of physical memory. On servers with very large memory this can be raised, but the key point is to leave enough memory for other processes, including mariadbd.
ColumnStore handles concurrent query execution by managing the rate of concurrent batch primitive steps. This is configured using the MaxOutstandingRequests parameter, which has a default value of 20. Each batch primitive step is executed within the context of one column extent, according to this high-level process:
ColumnStore issues up to MaxOutstandingRequests number of batch primitive steps.
PrimProc processes the request, using many threads, and returns its response. These generally take from a fraction of a second up to a few seconds, depending on the amount of physical I/O and the performance of that storage.
ColumnStore issues new requests as prior requests complete, maintaining the maximum number of outstanding requests.
This scheme allows large queries to use all available resources when they are not otherwise being consumed, and smaller queries to execute with minimal delay. Lower values optimize for higher throughput of smaller queries, while larger values optimize for the response time of a single large query. The default value should work well under most circumstances; however, it should be increased as the number of nodes increases.
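As a hedged example (the JobList section name is an assumption; check your Columnstore.xml), raising the limit on a larger cluster might look like:
setConfig JobList MaxOutstandingRequests 40
# restart ColumnStore for the change to take effect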
The number of queries currently running and the number of queries in the queue can be checked with calGetSqlCount():
ColumnStore maintains statistics for tables and uses them to determine which of the two tables is larger. This is based both on the number of blocks in each table and on an estimate of the predicate cardinality. The first step is to apply any filters to the smaller table and return that data set to memory. The size of this data set is compared against the configuration parameter PmMaxMemorySmallSide, which has a default value of 64 (MB) and can be set as high as 4 GB. The default allows approximately 1M rows on the small-table side to be joined against billions (or trillions) on the large-table side. If the size of the small data set is less than PmMaxMemorySmallSide, the data set is sent to PrimProc for creation of a distributed hashmap. This setting is therefore important to join tuning and to whether the operation can be distributed. Set it to support your largest expected small-table join size, up to available memory:
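A minimal sketch, assuming the parameter lives in the HashJoin section and accepts the M/G suffixes used in the shipped Columnstore.xml:
setConfig HashJoin PmMaxMemorySmallSide 1G
# restart ColumnStore for the change to take effect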
Although this will increase the size of data between nodes to support the join, it means that the join and subsequent aggregates are pushed down, scaled out, and a smaller data set is returned back.
In a multiple server deployment, the sizing should be based from available physical memory on the servers, how much memory to reserve for block caching, and the number of simultaneous join operations that can be expected to run times the average small table join data size.
The logic above for a single-table join extrapolates to multi-table joins, where the small-table values are precalculated and applied as one single scan against the large table. This works well for the typical star schema case, joining multiple dimension tables with a large fact table. For some join scenarios it may be necessary to sequence joins to create intermediate datasets for joining; this happens, for instance, with a snowflake schema structure. In some extreme cases the optimizer may be unable to determine the most optimal join path. In this case a hint is available to force a join ordering: the INFINIDB_ORDERED hint forces the first table in the FROM clause to be considered the largest table, overriding any statistics-based decision. For example:
Note: INFINIDB_ORDERED is deprecated and no longer works in ColumnStore 1.2 and above.
In ColumnStore 1.2, use SET infinidb_ordered_only=ON;
In ColumnStore 1.4, use SET columnstore_ordered_only=ON;
When a join is very large and exceeds the PmMaxMemorySmallSide setting, it is still performed in memory. For very large joins this could exceed the available memory, in which case the condition is detected and a query error is reported. Several configuration parameters are available to enable and configure disk overflow should this occur:
AllowDiskBasedJoin – Controls the option to use disk Based joins or not. Valid values are Y (enabled) or N (disabled). By default, this option is disabled.
TempFileCompression – Controls whether the disk join files are compressed or noncompressed. Valid values are Y (use compressed files) or N (use non-compressed files).
TempFilePath – The directory path used for the disk joins. By default, this path is the tmp directory for your installation (i.e., /tmp/columnstore_tmp_files/). Files (named infinidb-join-data*) in this directory will be created and cleaned on an as-needed basis. The entire directory is removed and recreated by ExeMgr at startup. It is strongly recommended that this directory be stored on a dedicated partition.
A MariaDB global or session variable is available to specify a memory limit at which point the query is switched over to disk-based joins:
infinidb_um_mem_limit - Memory limit in MB per user (i.e., switch to disk-based join if this limit is exceeded). By default, this limit is not set (value of 0).
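Putting these together, a hedged sketch (the HashJoin section name is an assumption, so verify it against your Columnstore.xml) that enables compressed disk overflow and caps a session at 1 GB before spilling:
setConfig HashJoin AllowDiskBasedJoin Y
setConfig HashJoin TempFileCompression Y
# per-session limit in MB; applies only to the session that sets it
mariadb -e "SET SESSION infinidb_um_mem_limit = 1024;"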
When Enterprise ColumnStore executes a query, the ExeMgr process on the initiator/aggregator node translates the ColumnStore execution plan (CSEP) into a job list. A job list is a sequence of job steps.
Enterprise ColumnStore uses many different types of job steps that provide different scalability benefits:
Some types of job steps perform operations in a distributed manner, using multiple nodes that each operate on different extents. Distributed operations provide horizontal scalability.
Some types of job steps perform operations in a multi-threaded manner using a thread pool. Performing multi-threaded operations provides vertical scalability.
As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute job steps.
For additional information, see the related Enterprise ColumnStore documentation.
Enterprise ColumnStore defines a batch primitive step to handle many types of tasks, such as scanning/filtering columns, JOIN operations, aggregation, functional filtering, and projecting (putting values into a SELECT list).
In calGetTrace() output, a batch primitive step is abbreviated BPS.
Batch primitive steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the batch primitive step to one extent at a time. The PrimProc process uses a thread pool to operate on individual blocks within the extent in parallel.
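For example, a minimal sketch of inspecting the job steps for a query with the calSetTrace() and calGetTrace() functions (the database and table names are illustrative):
mariadb test <<'SQL'
-- enable tracing for this session
SELECT calSetTrace(1);
-- run the query to be profiled
SELECT COUNT(*) FROM contacts;
-- the trace lists each job step with its abbreviation (BPS, DSS, HJS, ...)
SELECT calGetTrace()\G
SQL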
Enterprise ColumnStore defines a cross-engine step to perform cross-engine joins, in which a ColumnStore table is joined with a table that uses a different storage engine.
In calGetTrace() output, a cross-engine step is abbreviated CES.
Cross-engine steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore can perform cross-engine joins when the mandatory utility user is properly configured.
For additional information, refer to the cross-engine join configuration documentation.
Enterprise ColumnStore defines a dictionary structure step to scan the dictionary extents that ColumnStore uses to store variable-length string values.
In calGetTrace() output, a dictionary structure step is abbreviated DSS.
Dictionary structure steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the dictionary structure step to one extent at a time. It uses a thread pool to operate on individual blocks within the extent in parallel.
Dictionary structure steps can require a lot of I/O for a couple of reasons:
Dictionary structure steps do not support extent elimination, so all extents for the column must be scanned.
Dictionary structure steps must read the column extents to find each pointer and the dictionary extents to find each value, so it doubles the number of extents to scan.
It is generally recommended to avoid queries that will cause dictionary scans.
For additional information, see "Avoid Creating Long String Columns".
Enterprise ColumnStore defines a hash join step to perform a hash join between two tables.
In calGetTrace() output, a hash join step is abbreviated HJS.
Hash join steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore performs the hash join in memory by default. If you perform large joins, you may be able to get better performance by changing some configuration defaults with mcsSetConfig:
Enterprise ColumnStore can be configured to use more memory for in-memory hash joins.
Enterprise ColumnStore can be configured to use disk-based joins.
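For example, a hedged sketch using mcsSetConfig; the HashJoin section name and percentage syntax are assumptions, so confirm them against your Columnstore.xml:
sudo mcsSetConfig HashJoin TotalUmMemory '40%'
sudo mcsSetConfig HashJoin AllowDiskBasedJoin Y
# restart ColumnStore for the changes to take effect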
For additional information, see "Configure In-Memory Joins" and "Configure Disk-Based Joins".
Enterprise ColumnStore defines a having step to evaluate a HAVING clause on a result set.
In calGetTrace() output, a having step is abbreviated HVS.
Enterprise ColumnStore defines a subquery step to evaluate a subquery.
In calGetTrace() output, a subquery step is abbreviated SQS.
Enterprise ColumnStore defines a tuple aggregation step to collect intermediate aggregation results prior to the final aggregation and evaluation of the results.
In calGetTrace() output, a tuple aggregation step is abbreviated TAS.
Tuple aggregation steps are primarily evaluated by the ExeMgr process on the initiator/aggregator node. However, the PrimProc process on each node also plays a role, since the PrimProc process on each node provides the intermediate aggregation results to the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore defines a tuple annexation step to perform the final aggregation and evaluation of the results.
In calGetTrace() output, a tuple annexation step is abbreviated TNS.
Tuple annexation steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore 5 performs aggregation operations in memory. As a consequence, more complex aggregation operations require more memory in that version.
In Enterprise ColumnStore 6, disk-based aggregations can be enabled.
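A hedged sketch of enabling them; the RowAggregation section and parameter name are assumptions based on ColumnStore 6 configuration, so verify before use:
sudo mcsSetConfig RowAggregation AllowDiskBasedAggregation Y
# restart ColumnStore for the change to take effect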
For additional information, see "Configure Disk-Based Aggregations".
Enterprise ColumnStore defines a tuple union step to perform a union of two subqueries.
In calGetTrace() output, a tuple union step is abbreviated TUS.
Tuple union steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore defines a tuple constant step to evaluate constant values.
In calGetTrace() output, a tuple constant step is abbreviated TCS.
Tuple constant steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
Enterprise ColumnStore defines a window function step to evaluate window functions.
In calGetTrace() output, a window function step is abbreviated WFS.
Window function steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.
CREATE TABLE sales_data (
sale_id INT,
product_name VARCHAR(255),
category VARCHAR(100),
sale_date DATE,
quantity INT,
price DECIMAL(10, 2)
) ENGINE=ColumnStore;
INSERT INTO sales_data (sale_id, product_name, category, sale_date, quantity, price) VALUES
(1, 'Laptop', 'Electronics', '2023-01-15', 1, 1200.00),
(2, 'Mouse', 'Electronics', '2023-01-15', 2, 25.00),
(3, 'Keyboard', 'Electronics', '2023-01-16', 1, 75.00);
-- Get total sales per category
SELECT category, SUM(quantity * price) AS total_sales
FROM sales_data
WHERE sale_date BETWEEN '2023-01-01' AND '2023-01-31'
GROUP BY category
ORDER BY total_sales DESC;
-- Count distinct products
SELECT COUNT(DISTINCT product_name) FROM sales_data;
MariaDB Replication
Highly available
Asynchronous or semi-synchronous replication
Automatic failover via MaxScale
Manual provisioning of new nodes from backup
Scales reads via MaxScale
Enterprise Server 10.3+, MaxScale 2.5+
Galera Cluster Topology: Multi-Primary Cluster Powered by Galera for Transactional/OLTP Workloads
InnoDB Storage Engine
Highly available
Virtually synchronous, certification-based replication
Automated provisioning of new nodes (IST/SST)
Scales reads via MaxScale
Enterprise Server 10.3+, MariaDB Enterprise Cluster (powered by Galera), MaxScale 2.5+
Columnar storage engine with shared local storage
Highly available
Automatic failover via MaxScale and CMAPI
Scales reads via MaxScale
Bulk data import
Enterprise Server, Enterprise ColumnStore, MaxScale
Optional Read Replica topology
Columnar storage engine with S3-compatible object storage
Highly available
Automatic failover via MaxScale and CMAPI
Scales reads via MaxScale
Bulk data import
Enterprise Server, Enterprise ColumnStore, MaxScale
Single-stack hybrid transactional/analytical workloads
ColumnStore for analytics with scalable S3-compatible object storage
InnoDB for transactions
Cross-engine JOINs
Enterprise Server, Enterprise ColumnStore, MaxScale
sudo mcs node add --read-replica --node <private-ip>
sudo mcs node remove --node <private-ip>
sudo mcs cluster status
wget https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
chmod +x mariadb_es_repo_setup
./mariadb_es_repo_setup --token="xxxxx" --apply --mariadb-server-version="11.4"
sudo dnf install -y \
MariaDB-server MariaDB-columnstore-engine MariaDB-columnstore-cmapi
sudo apt update
sudo apt install -y mariadb-server mariadb-plugin-columnstore mariadb-columnstore-cmapi
sudo systemctl start mariadb
sudo systemctl enable mariadb
sudo systemctl start mariadb-columnstore-cmapi
sudo systemctl enable mariadb-columnstore-cmapi
sudo mcs cluster set api-key --key <your-api-key-here>
sudo mcs node add --node <private-ip-of-rw-node>
sudo mcs node add --read-replica --node <private-ip-of-replica>
sudo mcs cluster status
# minimize swapping
vm.swappiness = 1
# Increase the TCP max buffer size
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Increase the TCP buffer limits
# min, default, and max number of bytes to use
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# don't cache ssthresh from previous connection
net.ipv4.tcp_no_metrics_save = 1
# for 1 GigE, increase this to 2500
# for 10 GigE, increase this to 30000
net.core.netdev_max_backlog = 2500
$ sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
$ sudo setenforce permissive
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=permissive
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
$ sudo getenforce
Permissive
$ sudo systemctl disable apparmor
$ sudo aa-status
apparmor module is loaded.
0 profiles are loaded.
0 profiles are in enforce mode.
0 profiles are in complain mode.
0 processes have profiles defined.
0 processes are in enforce mode.
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.
$ sudo yum install glibc-locale-source glibc-langpack-en
$ sudo localedef -i en_US -f UTF-8 en_US.UTF-8
CREATE DATABASE inventory;
CREATE TABLE inventory.products (
product_name VARCHAR(11) NOT NULL DEFAULT '',
supplier VARCHAR(128) NOT NULL DEFAULT '',
quantity VARCHAR(128) NOT NULL DEFAULT '',
unit_cost VARCHAR(128) NOT NULL DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=utf8;
Shell (cpimport): SQL access is not required.
SQL (LOAD DATA INFILE): Shell access is not required.
Remote Database: Use a normal database client; avoid dumping data to intermediate files.
$ sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
LOAD DATA INFILE '/tmp/inventory-products.tsv'
INTO TABLE inventory.products;
$ mariadb --quick \
--skip-column-names \
--execute="SELECT * FROM inventory.products" \
| cpimport -s '\t' inventory products
CREATE DATABASE inventory;
CREATE TABLE inventory.products (
product_name VARCHAR(11) NOT NULL DEFAULT '',
supplier VARCHAR(128) NOT NULL DEFAULT '',
quantity VARCHAR(128) NOT NULL DEFAULT '',
unit_cost VARCHAR(128) NOT NULL DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=utf8;
Shell (cpimport): SQL access is not required.
SQL (LOAD DATA INFILE): Shell access is not required.
Remote Database: Use a normal database client; avoid dumping data to intermediate files.
$ sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
LOAD DATA INFILE '/tmp/inventory-products.tsv'
INTO TABLE inventory.products;
$ mariadb --quick \
--skip-column-names \
--execute="SELECT * FROM inventory.products" \
| cpimport -s '\t' inventory products
# minimize swapping
vm.swappiness = 1
# Increase the TCP max buffer size
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Increase the TCP buffer limits
# min, default, and max number of bytes to use
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# don't cache ssthresh from previous connection
net.ipv4.tcp_no_metrics_save = 1
# for 1 GigE, increase this to 2500
# for 10 GigE, increase this to 30000
net.core.netdev_max_backlog = 2500
$ sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
$ sudo setenforce permissive
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=permissive
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
$ sudo getenforce
Permissive
$ sudo systemctl disable apparmor
$ sudo aa-status
apparmor module is loaded.
0 profiles are loaded.
0 profiles are in enforce mode.
0 profiles are in complain mode.
0 processes have profiles defined.
0 processes are in enforce mode.
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.
$ sudo yum install glibc-locale-source glibc-langpack-en
$ sudo localedef -i en_US -f UTF-8 en_US.UTF-8
SHOW CREATE TABLE DATABASE_NAME.TABLE_NAME\G
SELECT * INTO OUTFILE '/path/to/DATABASE_NAME-TABLE_NAME.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM DATABASE_NAME.TABLE_NAME;
mariadb --host HOST --port PORT --user USER --password < schema-backup.sql
sudo cpimport -s ',' \
DATABASE_NAME \
TABLE_NAME \
/path/to/DATABASE_NAME-TABLE_NAME.csv
SHOW CREATE TABLE DATABASE_NAME.TABLE_NAME\G
SELECT COUNT(*) FROM DATABASE_NAME.TABLE_NAME;
SELECT * FROM DATABASE_NAME.TABLE_NAME LIMIT 100;
SELECT calgetsqlcount();
SELECT /*! INFINIDB_ORDERED */ r_regionkey
FROM region r, customer c, nation n
WHERE r.r_regionkey = n.n_regionkey
AND n.n_nationkey = c.c_nationkey;

MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.
For deployment instructions and available documentation, see "MariaDB Enterprise ColumnStore."
The ColumnStore storage engine has the following features:
Storage Engine: ColumnStore
Availability: ES 10.5+, CS 10.5+, MariaDB Enterprise Server
Workload Optimization: OLAP and Hybrid
Table Orientation: Columnar
ACID-compliant: Yes
Indexes: Unnecessary
Compression: Yes
High Availability (HA): Yes
Main Memory Caching: Yes
Transaction Logging: Yes
Garbage Collection: Yes
Online Schema Changes: Yes
Non-locking Reads: Yes
To create a ColumnStore table, use the CREATE TABLE statement with the ENGINE=ColumnStore option:
CREATE DATABASE columnstore_db;
CREATE TABLE columnstore_db.analytics_test (
id INT,
str VARCHAR(50)
) ENGINE = ColumnStore;
To deploy a multi-node Enterprise ColumnStore deployment, a configuration similar to below is required:
[mariadb]
log_error = mariadbd.err
character_set_server = utf8
collation_server = utf8_general_ci
log_bin = mariadb-bin
log_bin_index = mariadb-bin.index
relay_log = mariadb-relay
relay_log_index = mariadb-relay.index
log_slave_updates = ON
gtid_strict_mode = ON
# This must be unique on each cluster node
server_id = 1
To configure the mandatory utility user account, use the mcsSetConfig command:
sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
sudo mcsSetConfig CrossEngineSupport Port 3306
sudo mcsSetConfig CrossEngineSupport User cross_engine
sudo mcsSetConfig CrossEngineSupport Password cross_engine_passwd
Step 3: Start and Configure Enterprise ColumnStore
This page details step 3 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Local storage.
This step starts and configures MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
Mandatory system variables and options for Single-Node Enterprise ColumnStore include:
character_set_server
Set this system variable to utf8.
collation_server
Set this system variable to utf8_general_ci.
columnstore_use_import_for_batchinsert
Set this system variable to ALWAYS to always use cpimport for LOAD DATA INFILE and INSERT INTO ... SELECT statements.
[mariadb]
log_error = mariadbd.err
character_set_server = utf8
collation_server = utf8_general_ci
Start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:
$ sudo systemctl start mariadb
$ sudo systemctl enable mariadb
Start and enable the MariaDB Enterprise ColumnStore service, so that it starts automatically upon reboot:
$ sudo systemctl start mariadb-columnstore
$ sudo systemctl enable mariadb-columnstore
Enterprise ColumnStore requires a mandatory utility user account. By default, it connects to the server using the root user with no password. MariaDB Enterprise Server 10.6 will reject this login attempt by default, so you will need to configure Enterprise ColumnStore to use a different user account and password and create this user account on Enterprise Server.
On the Enterprise ColumnStore node, create the user account with the CREATE USER statement:
CREATE USER 'util_user'@'127.0.0.1'
IDENTIFIED BY 'util_user_passwd';
On the Enterprise ColumnStore node, grant the user account SELECT and PROCESS privileges on all databases with the GRANT statement:
GRANT SELECT, PROCESS ON *.*
TO 'util_user'@'127.0.0.1';
Configure Enterprise ColumnStore to use the utility user:
$ sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
$ sudo mcsSetConfig CrossEngineSupport Port 3306
$ sudo mcsSetConfig CrossEngineSupport User util_user
Set the password:
$ sudo mcsSetConfig CrossEngineSupport Password util_user_passwd
For details about how to encrypt the password, see "Credentials Management for MariaDB Enterprise ColumnStore".
Passwords should meet your organization's password policies. If your MariaDB Enterprise Server instance has a password validation plugin installed, then the password should also meet the configured requirements.
The specific steps to configure the security module depend on the operating system.
Configure SELinux for Enterprise ColumnStore:
To configure SELinux, you have to install the packages required for audit2allow. On CentOS 7 and RHEL 7, install the following:
$ sudo yum install policycoreutils policycoreutils-python
On RHEL 8, install the following:
$ sudo yum install policycoreutils python3-policycoreutils policycoreutils-python-utils
Allow the system to run under load for a while to generate SELinux audit events.
After the system has taken some load, generate an SELinux policy from the audit events using audit2allow:
$ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
If no audit events were found, this will print the following:
$ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
Nothing to do
If audit events were found, the new SELinux policy can be loaded using semodule:
$ sudo semodule -i mariadb_local.pp
Set SELinux to enforcing mode by setting SELINUX=enforcing in /etc/selinux/config.
For example, the file will usually look like this after the change:
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=enforcing
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
Set SELinux to enforcing mode:
$ sudo setenforce enforcing
For information on how to create a profile, see How to create an AppArmor Profile on ubuntu.com.
Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:
This page was step 3 of 5.
Step 4: Test Enterprise ColumnStore
This page details step 4 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Local storage.
This step tests MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
Connect to the server using MariaDB Client with the root@localhost user account:
$ sudo mariadb
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 38
Server version: 11.4.5-3-MariaDB-Enterprise MariaDB Enterprise Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]>
Query and confirm that the ColumnStore storage engine plugin is ACTIVE:
SELECT PLUGIN_NAME, PLUGIN_STATUS
FROM information_schema.PLUGINS
WHERE PLUGIN_LIBRARY LIKE 'ha_columnstore%';
+---------------------+---------------+
| PLUGIN_NAME | PLUGIN_STATUS |
+---------------------+---------------+
| Columnstore | ACTIVE |
| COLUMNSTORE_COLUMNS | ACTIVE |
| COLUMNSTORE_TABLES | ACTIVE |
| COLUMNSTORE_FILES | ACTIVE |
| COLUMNSTORE_EXTENTS | ACTIVE |
+---------------------+---------------+
Create a test database, if it does not exist:
CREATE DATABASE IF NOT EXISTS test;
Create a ColumnStore table:
CREATE TABLE IF NOT EXISTS test.contacts (
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(100)
) ENGINE=ColumnStore;
Add sample data into the table:
INSERT INTO test.contacts (first_name, last_name, email)
VALUES
("Kai", "Devi", "kai.devi@example.com"),
("Lee", "Wang", "lee.wang@example.com");Read data from table:
SELECT * FROM test.contacts;+------------+-----------+----------------------+
| first_name | last_name | email |
+------------+-----------+----------------------+
| Kai | Devi | kai.devi@example.com |
| Lee | Wang | lee.wang@example.com |
+------------+-----------+----------------------+
Create an InnoDB table:
CREATE TABLE test.addresses (
email VARCHAR(100),
street_address VARCHAR(255),
city VARCHAR(100),
state_code VARCHAR(2)
) ENGINE = InnoDB;
Add data to the table:
INSERT INTO test.addresses (email, street_address, city, state_code)
VALUES
("kai.devi@example.com", "1660 Amphibious Blvd.", "Redwood City", "CA"),
("lee.wang@example.com", "32620 Little Blvd", "Redwood City", "CA");Perform a cross-engine join:
SELECT name AS "Name", addr AS "Address"
FROM (SELECT CONCAT(first_name, " ", last_name) AS name,
email FROM test.contacts) AS contacts
INNER JOIN (SELECT CONCAT(street_address, ", ", city, ", ", state_code) AS addr,
email FROM test.addresses) AS addr
WHERE contacts.email = addr.email;
+----------+-----------------------------------------+
| Name | Address |
+----------+-----------------------------------------+
| Kai Devi | 1660 Amphibious Blvd., Redwood City, CA |
| Lee Wang | 32620 Little Blvd, Redwood City, CA |
+----------+-----------------------------------------+
Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:
This page was step 4 of 5.
Step 4: Test Enterprise ColumnStore
This page details step 4 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.
This step tests MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Enterprise ColumnStore 23.10 includes a testS3Connection command to test the S3 configuration, permissions, and connectivity.
On each Enterprise ColumnStore node, test the S3 configuration:
$ sudo testS3Connection
StorageManager[26887]: Using the config file found at /etc/columnstore/storagemanager.cnf
StorageManager[26887]: S3Storage: S3 connectivity & permissions are OK
S3 Storage Manager Configuration OK
If the testS3Connection command does not return OK, investigate the S3 configuration.
Connect to the server using MariaDB Client with the root@localhost user account:
$ sudo mariadb
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 38
Server version: 11.4.5-3-MariaDB-Enterprise MariaDB Enterprise Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]>
Query and confirm that the ColumnStore storage engine plugin is ACTIVE:
SELECT PLUGIN_NAME, PLUGIN_STATUS
FROM information_schema.PLUGINS
WHERE PLUGIN_LIBRARY LIKE 'ha_columnstore%';
+---------------------+---------------+
| PLUGIN_NAME | PLUGIN_STATUS |
+---------------------+---------------+
| Columnstore | ACTIVE |
| COLUMNSTORE_COLUMNS | ACTIVE |
| COLUMNSTORE_TABLES | ACTIVE |
| COLUMNSTORE_FILES | ACTIVE |
| COLUMNSTORE_EXTENTS | ACTIVE |
+---------------------+---------------+
Create a test database, if it does not exist:
CREATE DATABASE IF NOT EXISTS test;
Create a ColumnStore table:
CREATE TABLE IF NOT EXISTS test.contacts (
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(100)
) ENGINE=ColumnStore;
Add sample data into the table:
INSERT INTO test.contacts (first_name, last_name, email)
VALUES
("Kai", "Devi", "kai.devi@example.com"),
("Lee", "Wang", "lee.wang@example.com");Read data from table:
SELECT * FROM test.contacts;
+------------+-----------+----------------------+
| first_name | last_name | email |
+------------+-----------+----------------------+
| Kai | Devi | kai.devi@example.com |
| Lee | Wang | lee.wang@example.com |
+------------+-----------+----------------------+
Create an InnoDB table:
CREATE TABLE test.addresses (
email VARCHAR(100),
street_address VARCHAR(255),
city VARCHAR(100),
state_code VARCHAR(2)
) ENGINE = InnoDB;
Add data to the table:
INSERT INTO test.addresses (email, street_address, city, state_code)
VALUES
("kai.devi@example.com", "1660 Amphibious Blvd.", "Redwood City", "CA"),
("lee.wang@example.com", "32620 Little Blvd", "Redwood City", "CA");Perform a cross-engine join:
SELECT name AS "Name", addr AS "Address"
FROM (SELECT CONCAT(first_name, " ", last_name) AS name,
email FROM test.contacts) AS contacts
INNER JOIN (SELECT CONCAT(street_address, ", ", city, ", ", state_code) AS addr,
email FROM test.addresses) AS addr
WHERE contacts.email = addr.email;
+----------+-----------------------------------------+
| Name | Address |
+----------+-----------------------------------------+
| Kai Devi | 1660 Amphibious Blvd., Redwood City, CA |
| Lee Wang | 32620 Little Blvd, Redwood City, CA |
+----------+-----------------------------------------+
Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:
This page was step 4 of 5.
This page provides information on optimizing Linux kernel parameters for improved performance with MariaDB ColumnStore.
MariaDB ColumnStore is a high-performance columnar database designed for analytical workloads. By optimizing the Linux kernel parameters, you can further enhance the performance of your MariaDB ColumnStore deployments.
The following table lists the recommended optimized Linux kernel parameters for MariaDB ColumnStore:
For more information, refer to the Linux kernel documentation for each parameter.
vm.overcommit_memory
1
Allows the kernel to always overcommit memory, preventing large allocations from failing spuriously so that processes such as MariaDB ColumnStore can allocate the memory they request.
vm.dirty_background_ratio
5
Sets the percentage of dirty memory that can be written back to disk in the background. A lower value reduces the amount of dirty memory, improving performance.
vm.dirty_ratio
10
Sets the percentage of dirty memory at which processes are forced to write dirty pages to disk synchronously. A lower value reduces the amount of dirty memory, improving performance.
vm.vfs_cache_pressure
50
Sets the pressure level for the kernel's VFS cache. A lower value reduces the amount of memory used by the VFS cache, improving performance.
net.core.netdev_max_backlog
2500
Sets the maximum number of packets that can be queued for a network device. A higher value allows for more packets to be queued, improving performance.
net.core.rmem_max
16777216
Sets the maximum receive buffer size for TCP sockets. A higher value allows for larger buffers, improving performance.
net.core.wmem_max
16777216
Sets the maximum send buffer size for TCP sockets. A higher value allows for larger buffers, improving performance.
net.ipv4.tcp_max_syn_backlog
8192
Sets the maximum number of queued SYN requests. A higher value allows for more queued requests, improving performance.
net.ipv4.tcp_timestamps
0
Disables TCP timestamps, reducing overhead and improving performance.
vm.max_map_count
4,262,144
Increases the maximum number of memory map areas a process may have. The default is 65,530, which can be too low for workloads like MariaDB ColumnStore. Raising this prevents mapping errors for processes that need large address spaces.
kernel.pid_max
4,194,304
Defines the maximum process ID value. Older Linux versions defaulted to 32,768; newer versions default to 4,194,304. Raising this ensures support for systems running a very large number of processes concurrently.
kernel.threads-max
2,000,000
Specifies the maximum number of threads allowed on the system. The default varies depending on available RAM. A value of 2 million is suitable for systems with 32–64GB RAM. Increase further if running with more RAM or requiring more threads.
To configure these parameters, you can add them to the /etc/sysctl.conf file. For example:
vm.overcommit_memory=1
vm.dirty_background_ratio=5
vm.dirty_ratio=10
vm.vfs_cache_pressure=50
net.core.netdev_max_backlog=2500
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_max_syn_backlog=8192
net.ipv4.tcp_timestamps=0
After making changes to the /etc/sysctl.conf file, you need to apply the changes by running the following command:
sudo sysctl -p
To verify the current values, read them back from /proc:
cat /proc/sys/kernel/threads-max
cat /proc/sys/kernel/pid_max
cat /proc/sys/vm/max_map_count
# RHEL: append to /etc/sysctl.conf. Note that `sudo echo ... >> file` does not
# work, because the redirection runs in the unprivileged shell; use tee instead.
echo "vm.max_map_count=4262144" | sudo tee -a /etc/sysctl.conf
echo "kernel.pid_max = 4194304" | sudo tee -a /etc/sysctl.conf
echo "kernel.threads-max = 2000000" | sudo tee -a /etc/sysctl.conf
# There may be a file called 50-pid-max.conf or something similar. If so, modify it:
echo "vm.max_map_count=4262144" | sudo tee /usr/lib/sysctl.d/50-max_map_count.conf
echo "kernel.pid_max = 4194304" | sudo tee /usr/lib/sysctl.d/50-pid-max.conf
sudo sysctl -p
These optimized parameters are recommended for all MariaDB ColumnStore deployments, regardless of the specific workload. They can improve performance for various use cases, including:
Large-scale data warehousing
Real-time analytics
Business intelligence
Machine learning
By optimizing the Linux kernel parameters, you can significantly improve the performance of your MariaDB ColumnStore deployments. These recommendations provide a starting point for optimizing your system, and you may need to adjust the values based on your specific hardware and workload.
When tuning queries for MariaDB Enterprise ColumnStore, there are some important details to consider.
Enterprise ColumnStore only reads the columns that are necessary to resolve a query.
For example, the following query selects every column in the table:
SELECT * FROM tab;
Whereas the following query only selects two columns in the table, so it requires less I/O:
SELECT col1, col2 FROM tab;
For best performance, only select the columns that are necessary to resolve a query.
When Enterprise ColumnStore performs ORDER BY and LIMIT operations, the operations are performed in a single-threaded manner after the rest of the query processing has been completed, and the full unsorted result-set has been retrieved. For large data sets, the performance overhead can be significant.
When Enterprise ColumnStore 5 performs aggregations (i.e., DISTINCT, GROUP BY, COUNT(*), etc.), all of the aggregation work happens in-memory by default. As a consequence, more complex aggregation operations require more memory in that version.
For example, the following query could require a lot of memory in Enterprise ColumnStore 5, since it has to calculate many distinct values in memory:
SELECT DISTINCT col1 FROM tab LIMIT 10000;
Whereas the following query could require much less memory in Enterprise ColumnStore 5, since it has to calculate fewer distinct values:
SELECT DISTINCT col1 FROM tab LIMIT 100;
In Enterprise ColumnStore 6, disk-based aggregations can be enabled.
For best performance, avoid excessive aggregations or enable disk-based aggregations.
For additional information, see "Configure Disk-Based Aggregations".
When Enterprise ColumnStore evaluates built-in functions and aggregate functions, it can often evaluate the function in a distributed manner. Distributed evaluation of functions can significantly improve performance.
Enterprise ColumnStore supports distributed evaluation for some built-in functions. For other built-in functions, the function must be evaluated serially on the final result set.
Enterprise ColumnStore also supports distributed evaluation for user-defined functions developed with ColumnStore's User-Defined Aggregate Function (UDAF) C++ API. For functions developed with Enterprise Server's standard User-Defined Function (UDF) API, the function must be evaluated serially on the final result set.
For best performance, avoid non-distributed functions.
By default, Enterprise ColumnStore performs all joins as in-memory hash joins.
If the joined tables are very large, the in-memory hash join can require too much memory for the default configuration. There are a couple options to work around this:
Enterprise ColumnStore can be configured to use more memory for in-memory hash joins.
Enterprise ColumnStore can be configured to use disk-based joins.
Enterprise ColumnStore can use optimizer statistics to better optimize the join order.
For additional information, see "Configure In-Memory Joins", "Configure Disk-Based Joins", and "Optimizer Statistics".
Enterprise ColumnStore uses extent elimination to optimize queries. Extent elimination uses the minimum and maximum values in the extent map to determine which extents can be skipped for a query.
When data is loaded into Enterprise ColumnStore, it appends the data to the latest extent. When an extent reaches the maximum number of column values, Enterprise ColumnStore creates a new extent. As a consequence, if ordered data is loaded in its proper order, then similar values will be clustered together in the same extent. This can improve query performance, because extent elimination performs best when similar values are clustered together.
For example, if you expect to query a table with a filter on a timestamp column, you should sort the data using the timestamp column before loading it into Enterprise ColumnStore. Later, when the table is queried with a filter on the timestamp column, Enterprise ColumnStore would be able to skip many extents using extent elimination.
For best performance, load ordered data in proper order.
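A minimal sketch, assuming a TSV file whose fourth column holds the timestamp (the file, database, and table names are illustrative):
# sort by the timestamp column so similar values cluster in the same extents
sort -t$'\t' -k4,4 /tmp/events.tsv > /tmp/events.sorted.tsv
sudo cpimport -s '\t' analytics events /tmp/events.sorted.tsv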
When Enterprise ColumnStore performs mathematical operations with very big values using the DECIMAL, NUMERIC, and FIXED data types, the operation can sometimes overflow ColumnStore's maximum precision or scale. The maximum precision and scale depend on the version of Enterprise ColumnStore:
In Enterprise ColumnStore 6, the maximum precision (M) is 38, and the maximum scale (D) is 38.
In Enterprise ColumnStore 5, the maximum precision (M) is 18, and the maximum scale (D) is 18.
In Enterprise ColumnStore 6, applications can configure Enterprise ColumnStore to check for decimal overflows by setting the columnstore_decimal_overflow_check system variable, but only when the column has a decimal precision that is 18 or more:
SET SESSION columnstore_decimal_overflow_check=ON;
SELECT (big_decimal1 * big_decimal2) AS product
FROM columnstore_tab;
When decimal overflow checks are enabled, math operations have extra overhead.
When the decimal overflow check fails, MariaDB Enterprise ColumnStore raises an error with the ER_INTERNAL_ERROR SQL error code and writes detailed information about the overflow check failure to the ColumnStore system logs.
MariaDB Enterprise ColumnStore supports Enterprise Server's standard User-Defined Function (UDF) API. However, UDFs developed using that API cannot be executed in a distributed manner.
To support distributed execution of custom SQL, MariaDB Enterprise ColumnStore supports a Distributed User Defined Aggregate Functions (UDAF) C++ API:
The Distributed User Defined Aggregate Functions (UDAF) C++ API allows anyone to create aggregate functions of arbitrary complexity for distributed execution in the ColumnStore storage engine.
These functions can also be used as Analytic (Window) functions just like any built-in aggregate function.
Step 2: Configure Shared Local Storage
This page details step 2 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".
This step configures shared local storage on systems hosting Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
In a ColumnStore Shared Local Storage topology, MariaDB Enterprise ColumnStore requires the Storage Manager directory and the DB Root directories to be located on shared local storage.
The Storage Manager directory is at the following path:
/var/lib/columnstore/storagemanager
The DB Root directories are at the path /var/lib/columnstore/dataN. The N in dataN represents an integer that starts at 1 and stops at the number of nodes in the deployment. For example, with a 3-node Enterprise ColumnStore deployment, this would refer to the following directories:
/var/lib/columnstore/data1
/var/lib/columnstore/data2
/var/lib/columnstore/data3
The DB Root directories must be mounted on every ColumnStore node.
Select a Shared Local Storage solution:
EBS (Elastic Block Store) Multi-Attach
EFS (Elastic File System)
Filestore
GlusterFS
NFS (Network File System)
For additional information, see "Shared Local Storage Options".
EBS is a high-performance block-storage service for AWS (Amazon Web Services). EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.
For Enterprise ColumnStore deployments in AWS:
EBS Multi-Attach is a recommended option for the Storage Manager directory.
Amazon S3 storage is the recommended option for data.
Consult the vendor documentation for details on how to configure EBS Multi-Attach.
EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).
For deployments in AWS:
EFS is a recommended option for the Storage Manager directory.
Amazon S3 storage is the recommended option for data.
Consult the vendor documentation for details on how to configure EFS.
Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).
For Enterprise ColumnStore deployments in GCP:
Filestore is the recommended option for the Storage Manager directory.
Google Object Storage (S3-compatible) is the recommended option for data.
Consult the vendor documentation for details on how to configure Filestore.
GlusterFS is a distributed file system.
GlusterFS is a shared local storage option, but it is not one of the recommended options.
For more information, see "Shared Local Storage Options".
On each Enterprise ColumnStore node, install GlusterFS.
Install on CentOS / RHEL 8 (YUM):
Install on CentOS / RHEL 7 (YUM):
Install on Debian (APT):
Install on Ubuntu (APT):
Start the GlusterFS daemon:
Before you can create a volume with GlusterFS, you must probe each node from a peer node.
On the primary node, probe all of the other cluster nodes:
On one of the replica nodes, probe the primary node to confirm that it is connected:
On the primary node, check the peer status:
Number of Peers: 2
Create the GlusterFS volumes for MariaDB Enterprise ColumnStore. Each volume must have the same number of replicas as the number of Enterprise ColumnStore nodes.
On each Enterprise ColumnStore node, create the directory for each brick in the /brick directory:
On the primary node, create the GlusterFS volumes:
On the primary node, start the volume:
On each Enterprise ColumnStore node, create mount points for the volumes:
On each Enterprise ColumnStore node, add the mount points to /etc/fstab:
On each Enterprise ColumnStore node, mount the volumes:
NFS is a distributed file system. NFS is available in most Linux distributions. If NFS is used for an Enterprise ColumnStore deployment, the storage must be mounted with the sync option to ensure that each node flushes its changes immediately.
For on-premises deployments:
NFS is the recommended option for the Storage Manager directory.
Any S3-compatible storage is the recommended option for data.
Consult the documentation for your NFS implementation for details on how to configure NFS.
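As a minimal sketch, assuming an NFS server named nfs1 that exports /exports/columnstore (both names are hypothetical), each node's /etc/fstab entry might look like this:

# NFS mount for the Storage Manager directory; the sync option is required
# so that each node flushes its changes immediately.
nfs1:/exports/columnstore/storagemanager /var/lib/columnstore/storagemanager nfs defaults,sync 0 0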
Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".
This page was step 2 of 9.
Step 2: Configure Shared Local Storage
This page details step 2 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".
This step configures shared local storage on systems hosting Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
In a ColumnStore Object Storage topology, MariaDB Enterprise ColumnStore requires the Storage Manager directory to be located on shared local storage.
The Storage Manager directory is at the following path:
/var/lib/columnstore/storagemanager
The Storage Manager directory must be mounted on every ColumnStore node.
Select a Shared Local Storage solution for the Storage Manager directory:
For additional information, see "Shared Local Storage Options".
EBS is a high-performance block-storage service for AWS (Amazon Web Services). EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.
For Enterprise ColumnStore deployments in AWS:
EBS Multi-Attach is a recommended option for the Storage Manager directory.
Amazon S3 storage is the recommended option for data.
Consult the vendor documentation for details on how to configure EBS Multi-Attach.
EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).
For deployments in AWS:
EFS is a recommended option for the Storage Manager directory.
Amazon S3 storage is the recommended option for data.
Consult the vendor documentation for details on how to configure EFS.
Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).
For Enterprise ColumnStore deployments in GCP:
Filestore is the recommended option for the Storage Manager directory.
Google Object Storage (S3-compatible) is the recommended option for data.
Consult the vendor documentation for details on how to configure Filestore.
GlusterFS is a distributed file system. GlusterFS is a shared local storage option, but it is not one of the recommended options.
For more information, see "Shared Local Storage Options".
On each Enterprise ColumnStore node, install GlusterFS.
Install on CentOS / RHEL 8 (YUM):
Install on CentOS / RHEL 7 (YUM):
Install on Debian (APT):
Install on Ubuntu (APT):
Start the GlusterFS daemon:
Before you can create a volume with GlusterFS, you must probe each node from a peer node.
On the primary node, probe all of the other cluster nodes:
On one of the replica nodes, probe the primary node to confirm that it is connected:
On the primary node, check the peer status:
Create the GlusterFS volumes for MariaDB Enterprise ColumnStore. Each volume must have the same number of replicas as the number of Enterprise ColumnStore nodes.
On each Enterprise ColumnStore node, create the directory for each brick in the /brick directory:
On the primary node, create the GlusterFS volumes:
On the primary node, start the volume:
On each Enterprise ColumnStore node, create mount points for the volumes:
On each Enterprise ColumnStore node, add the mount points to /etc/fstab:
On each Enterprise ColumnStore node, mount the volumes:
NFS is a distributed file system. NFS is available in most Linux distributions. If NFS is used for an Enterprise ColumnStore deployment, the storage must be mounted with the sync option to ensure that each node flushes its changes immediately.
For on-premises deployments:
NFS is the recommended option for the Storage Manager directory.
Any S3-compatible storage is the recommended option for data.
Consult the documentation for your NFS implementation for details on how to configure NFS.
Navigation in the procedure "Deploy ColumnStore Object Storage Topology":
This page was step 2 of 9.
Learn how to import data into MariaDB ColumnStore. This section covers various methods and tools for efficiently loading large datasets into your columnar database for analytical workloads.
MariaDB Enterprise ColumnStore supports very efficient bulk data loads.
MariaDB Enterprise ColumnStore performs bulk data loads very efficiently using a variety of mechanisms, including the cpimport tool, specialized handling of certain SQL statements, and minimal locking during data import.
MariaDB Enterprise ColumnStore includes a bulk data loading tool called cpimport, which provides several benefits:
Bypasses the SQL layer to decrease overhead
Does not block read queries
Requires a write metadata lock (MDL) on the table, which can be monitored with the METADATA_LOCK_INFO plugin
Appends the new data to the table. While the bulk load is in progress, the newly appended data is temporarily hidden from queries. After the bulk load is complete, the newly appended data is visible to queries.
Inserts each row in the order the rows are read from the source file. Users can optimize data loads for Enterprise ColumnStore's automatic partitioning by loading presorted data files. For additional information, see the discussion of extent elimination and ordered data loading above.
Supports parallel distributed bulk loads
Imports data from text files
Imports data from binary files
Imports data from standard input (stdin)
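A minimal sketch of a basic invocation, assuming a database named analytics, a table named orders, and a pipe-delimited file (the names and paths are illustrative):

# Load a pipe-delimited text file into analytics.orders
$ cpimport -s '|' analytics orders /tmp/orders.tbl
# Load the same data from standard input instead of a file
$ cat /tmp/orders.tbl | cpimport -s '|' analytics orders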
MariaDB Enterprise ColumnStore enables batch insert mode by default.
When batch insert mode is enabled, MariaDB Enterprise ColumnStore has special handling for the following statements:
LOAD DATA [ LOCAL ] INFILE
INSERT INTO ... SELECT
Enterprise ColumnStore uses the following rules:
If the statement is executed outside of a transaction, Enterprise ColumnStore loads the data using cpimport, which is a command-line utility that is designed to efficiently load data in bulk. It executes cpimport using a wrapper called cpimport.bin.
If the statement is executed inside of a transaction, Enterprise ColumnStore loads the data using the DML interface, which is slower.
Batch insert mode can be disabled by setting the columnstore_use_import_for_batchinsert system variable to OFF. When batch insert mode is disabled, Enterprise ColumnStore executes the statements using the DML interface, which is slower.
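For example, a session can switch to the slower DML path and back (the table and file names below are illustrative):

-- Disable cpimport-based handling for this session only
SET SESSION columnstore_use_import_for_batchinsert = OFF;
-- This load now goes through the DML interface
LOAD DATA INFILE '/tmp/orders.tbl' INTO TABLE analytics.orders;
-- Restore the default behavior
SET SESSION columnstore_use_import_for_batchinsert = ON;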
MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.
When a bulk data load is running:
Read queries will not be blocked.
Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.
The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.
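A sketch of monitoring the lock from SQL, assuming the METADATA_LOCK_INFO plugin is available on your server:

-- Install the plugin once (requires appropriate privileges)
INSTALL SONAME 'metadata_lock_info';
-- While a bulk load runs, list the current metadata locks
SELECT * FROM information_schema.METADATA_LOCK_INFO;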
The GlusterFS commands referenced in the "Step 2: Configure Shared Local Storage" pages are as follows.

Install on CentOS / RHEL 8 (YUM):
$ sudo yum install --enablerepo=PowerTools glusterfs-server

Install on CentOS / RHEL 7 (YUM):
$ sudo yum install centos-release-gluster
$ sudo yum install glusterfs-server

Install on Debian (APT):
$ wget -O - https://download.gluster.org/pub/gluster/glusterfs/LATEST/rsa.pub | apt-key add -
$ DEBID=$(grep 'VERSION_ID=' /etc/os-release | cut -d '=' -f 2 | tr -d '"')
$ DEBVER=$(grep 'VERSION=' /etc/os-release | grep -Eo '[a-z]+')
$ DEBARCH=$(dpkg --print-architecture)
$ echo deb https://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/${DEBID}/${DEBARCH}/apt ${DEBVER} main > /etc/apt/sources.list.d/gluster.list
$ sudo apt update
$ sudo apt install glusterfs-server

Install on Ubuntu (APT):
$ sudo apt update
$ sudo apt install glusterfs-server

Start and enable the GlusterFS daemon:
$ sudo systemctl start glusterd
$ sudo systemctl enable glusterd

On the primary node, probe all of the other cluster nodes:
$ sudo gluster peer probe mcs2
$ sudo gluster peer probe mcs3

On one of the replica nodes, probe the primary node to confirm that it is connected:
$ sudo gluster peer probe mcs1
peer probe: Host mcs1 port 24007 already in peer list

On the primary node, check the peer status:
$ sudo gluster peer status
Number of Peers: 2
Hostname: mcs2
Uuid: 3c8a5c79-22de-45df-9034-8ae624b7b23e
State: Peer in Cluster (Connected)
Hostname: mcs3
Uuid: 862af7b2-bb5e-4b1c-8311-630fa32ed451
State: Peer in Cluster (Connected)

On each Enterprise ColumnStore node, create the directory for the brick:
$ sudo mkdir -p /brick/storagemanager

On the primary node, create and start the GlusterFS volume:
$ sudo gluster volume create storagemanager \
   replica 3 \
   mcs1:/brick/storagemanager \
   mcs2:/brick/storagemanager \
   mcs3:/brick/storagemanager \
   force
$ sudo gluster volume start storagemanager

On each Enterprise ColumnStore node, create the mount point, add it to /etc/fstab, and mount the volume:
$ sudo mkdir -p /var/lib/columnstore/storagemanager

/etc/fstab entry:
127.0.0.1:storagemanager /var/lib/columnstore/storagemanager glusterfs defaults,_netdev 0 0

$ sudo mount -a

The bulk data loading methods described above compare as follows:

cpimport
Speed: Fastest. Interface: shell.
Input: text file, binary file, or standard input (stdin).
Data location: server file system.
Benefits: lowest latency; bypasses the SQL layer; non-blocking.

columnstore_info.load_from_s3
Speed: Fast. Interface: SQL.
Input: text file.
Data location: S3-compatible object storage.
Benefits: loads data from the cloud; translates the operation to a cpimport command; non-blocking.

LOAD DATA [ LOCAL ] INFILE
Speed: Fast. Interface: SQL.
Input: text file.
Data location: server file system or client file system.
Benefits: translates the operation to a cpimport command; non-blocking.

INSERT INTO ... SELECT
Speed: Slow. Interface: SQL.
Input: other table(s).
Data location: same MariaDB server.
Benefits: translates the operation to a cpimport command; non-blocking.
Step 1: Prepare ColumnStore Nodes
This page details step 1 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".
This step prepares systems to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Enterprise ColumnStore performs best with Linux kernel optimizations.
On each server to host an Enterprise ColumnStore node, optimize the kernel:
Set the relevant kernel parameters in a sysctl configuration file. To ensure proper change management, use an Enterprise ColumnStore-specific configuration file.
Create a /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf file:
# minimize swapping
vm.swappiness = 1
# Increase the TCP max buffer size
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Increase the TCP buffer limits
# min, default, and max number of bytes to use
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# don't cache ssthresh from previous connection
net.ipv4.tcp_no_metrics_save = 1
# for 1 GigE, increase this to 2500
# for 10 GigE, increase this to 30000
net.core.netdev_max_backlog = 2500

Use the sysctl command to set the kernel parameters at runtime:
$ sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf

The Linux Security Modules (LSM) should be temporarily disabled on each Enterprise ColumnStore node during installation.
The LSM will be configured and re-enabled later in this deployment procedure.
The steps to disable the LSM depend on the specific LSM used by the operating system.
SELinux must be set to permissive mode before installing MariaDB Enterprise ColumnStore.
To set SELinux to permissive mode:
Set SELinux to permissive mode:
$ sudo setenforce permissive

Set SELinux to permissive mode by setting SELINUX=permissive in /etc/selinux/config.
For example, the file will usually look like this after the change:
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=permissive
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted

Confirm that SELinux is in permissive mode:
$ sudo getenforce
Permissive

SELinux will be configured and re-enabled later in this deployment procedure. This configuration is not persistent. If you restart the server before configuring and re-enabling SELinux later in the deployment procedure, you must reset the enforcement to permissive mode.
AppArmor must be disabled before installing MariaDB Enterprise ColumnStore.
Disable AppArmor:
$ sudo systemctl disable apparmor

Reboot the system.
Confirm that no AppArmor profiles are loaded using aa-status:
$ sudo aa-status
apparmor module is loaded.
0 profiles are loaded.
0 profiles are in enforce mode.
0 profiles are in complain mode.
0 processes have profiles defined.
0 processes are in enforce mode.
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.

AppArmor will be configured and re-enabled later in this deployment procedure.
MariaDB Enterprise ColumnStore requires the following TCP ports:
3306: Port used for MariaDB Client traffic
8600-8630: Port range used for inter-node communication
8640: Port used by CMAPI
8700: Port used for inter-node communication
8800: Port used for inter-node communication
The firewall should be temporarily disabled on each Enterprise ColumnStore node during installation.
The firewall will be configured and re-enabled later in this deployment procedure.
The steps to disable the firewall depend on the specific firewall used by the operating system.
Check if the firewalld service is running:
$ sudo systemctl status firewalld

If the firewalld service is running, stop it:
$ sudo systemctl stop firewalld

Firewalld will be configured and re-enabled later in this deployment procedure.
Check if the UFW service is running:
$ sudo ufw status verbose

If the UFW service is running, stop it:
$ sudo ufw disable

UFW will be configured and re-enabled later in this deployment procedure.
To install Enterprise ColumnStore on Amazon Web Services (AWS), the security group must be modified prior to installation.
Enterprise ColumnStore requires all internal communications to be open between Enterprise ColumnStore nodes. Therefore, the security group should allow all protocols and all ports to be open between the Enterprise ColumnStore nodes and the MaxScale proxy.
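A hedged sketch using the AWS CLI (the security group ID below is a placeholder): a single rule can allow all traffic between members of the same security group:

# Allow all protocols and all ports between instances in this security group
$ aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol -1 \
    --source-group sg-0123456789abcdef0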
When using MariaDB Enterprise ColumnStore, it is recommended to set the system's locale to UTF-8.
On RHEL 8, install additional dependencies:
$ sudo yum install glibc-locale-source glibc-langpack-en

Set the system's locale to en_US.UTF-8 by executing localedef:
$ sudo localedef -i en_US -f UTF-8 en_US.UTF-8

MariaDB Enterprise ColumnStore requires all nodes to have host names that are resolvable on all other nodes. If your infrastructure does not configure DNS centrally, you may need to configure static DNS entries in the /etc/hosts file of each server.
On each Enterprise ColumnStore node, edit the /etc/hosts file to map host names to the IP address of each Enterprise ColumnStore node:
192.0.2.1 mcs1
192.0.2.2 mcs2
192.0.2.3 mcs3
192.0.2.100 mxs1

Replace the IP addresses with the addresses in your own environment.
Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".
This page was step 1 of 9.
Step 7: Start and Configure MariaDB MaxScale
This page details step 7 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".
This step starts and configures MariaDB MaxScale 22.08.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB MaxScale installations include a configuration file with some example objects. This configuration file should be replaced.
On the MaxScale node, replace the default /etc/maxscale.cnf with the following configuration:
[maxscale]
threads = auto
admin_host = 0.0.0.0
admin_secure_gui = false

For additional information, see "Global Parameters".
On the MaxScale node, restart the MaxScale service to ensure that MaxScale picks up the new configuration:
$ sudo systemctl restart maxscale

For additional information, see "Start and Stop Services".
On the MaxScale node, use maxctrl create to create a server object for each Enterprise ColumnStore node:
$ maxctrl create server mcs1 192.0.2.101
$ maxctrl create server mcs2 192.0.2.102
$ maxctrl create server mcs3 192.0.2.103

MaxScale uses monitors to retrieve additional information from the servers. This information is used by other services in filtering and routing connections based on the current state of the node. For MariaDB Enterprise ColumnStore, use the MariaDB Monitor (mariadbmon).
On the MaxScale node, use maxctrl create monitor to create a MariaDB Monitor:
$ maxctrl create monitor columnstore_monitor mariadbmon \
user=mxs \
password='MAXSCALE_USER_PASSWORD' \
replication_user=repl \
replication_password='REPLICATION_USER_PASSWORD' \
--servers mcs1 mcs2 mcs3

In this example:
columnstore_monitor is an arbitrary name that is used to identify the new monitor.
mariadbmon is the name of the module that implements the MariaDB Monitor.
user=mxs sets the user parameter to the database user account that MaxScale uses to monitor the ColumnStore nodes.
password='MAXSCALE_USER_PASSWORD' sets the password parameter to the password used by the database user account that MaxScale uses to monitor the ColumnStore nodes.
replication_user=repl sets the replication_user parameter to the database user account that MaxScale uses to set up replication.
replication_password='REPLICATION_USER_PASSWORD' sets the replication_password parameter to the password used by the database user account that MaxScale uses to set up replication.
--servers sets the servers parameter to the set of nodes that MaxScale should monitor. All non-option arguments after --servers are interpreted as server names.
Other Module Parameters supported by mariadbmon in MaxScale 22.08 can also be specified.
Routers control how MaxScale balances the load between Enterprise ColumnStore nodes. Each router uses a different approach to routing queries. Consider the specific use case of your application and database load and select the router that best suits your needs.
Read Connection Router (readconnroute): connection-based load balancing.
Routes connections to Enterprise ColumnStore nodes designated as replica servers for a read-only pool.
Routes connections to an Enterprise ColumnStore node designated as the primary server for a read-write pool.
Read/Write Split Router (readwritesplit): query-based load balancing.
Routes write queries to an Enterprise ColumnStore node designated as the primary server.
Routes read queries to Enterprise ColumnStore nodes designated as replica servers.
Automatically reconnects after node failures.
Automatically replays transactions after node failures.
Optionally enforces causal reads.
Use the MaxScale Read Connection Router (readconnroute) to route connections to replica servers for a read-only pool.
On the MaxScale node, use maxctrl create service to create a router:
$ maxctrl create service connection_router_service readconnroute \
user=mxs \
password='MAXSCALE_USER_PASSWORD' \
router_options=slave \
--servers mcs1 mcs2 mcs3

In this example:
connection_router_service is an arbitrary name that is used to identify the new service.
readconnroute is the name of the module that implements the Read Connection Router.
user=mxs sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.
password='MAXSCALE_USER_PASSWORD' sets the password parameter to the password used by the database user account that MaxScale uses to connect to the ColumnStore nodes.
router_options=slave sets the router_options parameter to slave, so that MaxScale only routes connections to the replica nodes.
--servers sets the servers parameter to the set of nodes to which MaxScale should route connections. All non-option arguments after --servers are interpreted as server names.
Other Module Parameters supported by readconnroute in MaxScale 22.08 can also be specified.
These instructions reference TCP port 3308. You can use a different TCP port. The TCP port used must not be bound by any other listener.
On the MaxScale node, use the maxctrl create listener command to configure MaxScale to use a listener for the Read Connection Router (readconnroute):
$ maxctrl create listener connection_router_service connection_router_listener 3308 \
protocol=MariaDBClient

In this example:
connection_router_service is the name of the readconnroute service that was previously created.
connection_router_listener is an arbitrary name that is used to identify the new listener.
3308 is the TCP port.
protocol=MariaDBClient sets the protocol parameter.
Other Module Parameters supported by listeners in MaxScale 22.08 can also be specified.
The MaxScale Read/Write Split Router (readwritesplit) performs query-based load balancing. The router routes write queries to the primary and read queries to the replicas.
On the MaxScale node, use the maxctrl create service command to configure MaxScale to use the Read/Write Split Router (readwritesplit):
$ maxctrl create service query_router_service readwritesplit \
user=mxs \
password='MAXSCALE_USER_PASSWORD' \
--servers mcs1 mcs2 mcs3

In this example:
query_router_service is an arbitrary name that is used to identify the new service.
readwritesplit is the name of the module that implements the Read/Write Split Router.
user=mxs sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.
password='MAXSCALE_USER_PASSWORD' sets the password parameter to the password used by the database user account that MaxScale uses to connect to the ColumnStore nodes.
--servers sets the servers parameter to the set of nodes to which MaxScale should route queries. All non-option arguments after --servers are interpreted as server names.
Other Module Parameters supported by readwritesplit in MaxScale 22.08 can also be specified.
These instructions reference TCP port 3307. You can use a different TCP port. The TCP port used must not be bound by any other listener.
On the MaxScale node, use the maxctrl create listener command to configure MaxScale to use a listener for the Read/Write Split Router (readwritesplit):
$ maxctrl create listener query_router_service query_router_listener 3307 \
protocol=MariaDBClient

In this example:
query_router_service is the name of the readwritesplit service that was previously created.
query_router_listener is an arbitrary name that is used to identify the new listener.
3307 is the TCP port.
protocol=MariaDBClient sets the protocol parameter.
Other Module Parameters supported by listeners in MaxScale 22.08 can also be specified.
To start the services and monitors, on the MaxScale node use maxctrl start services:
$ maxctrl start services

Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".
This page was step 7 of 9.
To remove a node from Enterprise ColumnStore, perform the following procedure.
The server object for the node must be unlinked from the service using maxctrl:
Unlink the server object from the service using the unlink service command.
As the first argument, provide the name of the service.
As the second argument, provide the name of the server.
maxctrl unlink service \
mcs_service \
mcs3

To confirm that the server object was properly unlinked from the service, the service should be checked using maxctrl:
Show the services using the show services command, like this:
maxctrl show services

The server object for the node must be unlinked from the monitor using maxctrl:
Unlink a server object from the monitor using the unlink monitor command.
As the first argument, provide the name of the monitor.
As the second argument, provide the name of the server.
maxctrl unlink monitor \
mcs_monitor \
mcs3

To confirm that the server object was properly unlinked from the monitor, the monitor should be checked using maxctrl:
Show the monitors using the show monitors command, like this:
maxctrl show monitors

The server object for the node must also be removed from MaxScale using maxctrl:
Use MaxCtrl or another supported REST client.
Remove the server object using the destroy server command.
As the first argument, provide the name for the server.
For example:
maxctrl destroy server \
mcs3

To confirm that the server object was properly removed, the server objects should be checked using maxctrl:
Show the server objects using the show servers command, like this:
maxctrl show servers

The Enterprise Server, Enterprise ColumnStore, and CMAPI services can be stopped using the systemctl command.
Perform the following procedure on the node:
Stop the MariaDB Enterprise Server service:
sudo systemctl stop mariadb

Stop the MariaDB Enterprise ColumnStore service:
sudo systemctl stop mariadb-columnstore

Stop the CMAPI service:
sudo systemctl stop mariadb-columnstore-cmapi

The node must be removed from Enterprise ColumnStore using CMAPI:
Remove the node using the remove-node endpoint path.
Use a supported REST client, such as curl.
Format the JSON output using jq for enhanced readability.
Authenticate using the configured API key.
Include the required headers.
For example, if the primary node's host name is mcs1 and the IP address for the node to remove is 192.0.2.3:
In ES 10.5.10-7 and later:
curl -k -s -X DELETE https://mcs1:8640/cmapi/0.4.0/cluster/node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":20, "node": "192.0.2.3"}' \
| jq .

In ES 10.5.9-6 and earlier:
curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/remove-node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":20, "node": "192.0.2.3"}' \
| jq .

Example output:
{
"timestamp": "2020-10-28 00:39:14.672142",
"node_id": "192.0.2.3"
}

To confirm that the node was properly removed, the status of Enterprise ColumnStore should be checked using CMAPI:
Check the status using the status endpoint path.
For example, if the primary node's host name is mcs1:
curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .

Example output:
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
},
"192.0.2.2": {
"timestamp": "2020-12-15 00:40:34.428554",
"uptime": 11437,
"dbrm_mode": "slave",
"cluster_mode": "readonly",
"dbroots": [
"2"
],
"module_id": 2,
"services": [
{
"name": "workernode",
"pid": 17789
},
{
"name": "PrimProc",
"pid": 17813
},
{
"name": "ExeMgr",
"pid": 17854
},
{
"name": "WriteEngine",
"pid": 17877
}
]
},
"num_nodes": 2
}

MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses S3-compatible object storage for data and shared local storage for the Storage Manager directory, the S3 bucket, the Storage Manager directory, and the MariaDB data directory must be backed up separately.
MariaDB Enterprise ColumnStore supports multiple storage options.
This page discusses how to backup and restore Enterprise ColumnStore when it uses S3-compatible object storage for data and shared local storage (such as NFS) for the Storage Manager directory.
Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata become corrupt, Enterprise ColumnStore could become unusable, resulting in data loss.
If Enterprise ColumnStore is your system of record, it should be backed up regularly.
If Enterprise ColumnStore uses S3-compatible object storage for data and shared local storage for the Storage Manager directory, the following items must be backed up:
The MariaDB data directory is backed up using mariadb-backup.
The S3 bucket must be backed up using the vendor's snapshot procedure.
The Storage Manager directory must be backed up.
See the instructions below for more details.
Use the following process to take a backup:
Determine which node is the primary server using curl to send the status command to the CMAPI Server:
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .

The output will show "dbrm_mode": "master" for the primary server:
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
}

Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:
$ mariadb --host=192.0.2.1 \
--user=root \
--password

Lock the database with the FLUSH TABLES WITH READ LOCK statement:
FLUSH TABLES WITH READ LOCK;

Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.
Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to make a copy of the directory with rsync:
$ sudo mkdir -p /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/storagemanager /backups/columnstore/202101291600/

Use mariadb-backup to back up the MariaDB data directory:
$ sudo mkdir -p /backups/mariadb/202101291600/
$ sudo mariadb-backup --backup \
--target-dir=/backups/mariadb/202101291600/ \
--user=mariadb-backup \
--password=mbu_passwd

Use mariadb-backup to prepare the backup:
$ sudo mariadb-backup --prepare \
--target-dir=/backups/mariadb/202101291600/Create a snapshot of the S3-compatible storage. Consult the storage vendor's manual for details on how to do this.
Ensure that all previous operations are complete.
In the original client connection to the primary server, unlock the database with the statement:
UNLOCK TABLES;

Use the following process to restore a backup:
Deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.
Ensure that all services are stopped on each node:
$ sudo systemctl stop mariadb-columnstore-cmapi
$ sudo systemctl stop mariadb-columnstore
$ sudo systemctl stop mariadb

Restore the backup of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to restore the backup with rsync:
$ sudo rsync -av /backups/columnstore/202101291600/storagemanager/ /var/lib/columnstore/storagemanager/
$ sudo chown -R mysql:mysql /var/lib/columnstore/storagemanager

Use mariadb-backup to restore the backup of the MariaDB data directory:
$ sudo mariadb-backup --copy-back \
--target-dir=/backups/mariadb/202101291600/
$ sudo chown -R mysql:mysql /var/lib/mysql

Restore the snapshot of your S3-compatible storage to the new S3 bucket that you plan to use. Consult the storage vendor's manual for details on how to do this.
Update storagemanager.cnf to configure Enterprise ColumnStore to use the S3 bucket. By default, it is located at /etc/columnstore/storagemanager.cnf.
For example:
[ObjectStorage]
…
service = S3
…
[S3]
bucket = your_columnstore_bucket_name
endpoint = your_s3_endpoint
aws_access_key_id = your_s3_access_key_id
aws_secret_access_key = your_s3_secret_key
# iam_role_name = your_iam_role
# sts_region = your_sts_region
# sts_endpoint = your_sts_endpoint
[Cache]
cache_size = your_local_cache_size
path = your_local_cache_path

The default local cache size is 2 GB.
The default local cache path is /var/lib/columnstore/storagemanager/cache.
Ensure that the local cache path has sufficient disk space to store the local cache.
The bucket option must be set to the name of the bucket that you created from your snapshot in the previous step.
To use an IAM role, you must also uncomment and set iam_role_name, sts_region, and sts_endpoint.
Start the services on each node:
$ sudo systemctl start mariadb
$ sudo systemctl start mariadb-columnstore-cmapi

MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses shared local storage for the DB Root directories, the DB Root directories and the MariaDB data directory must be backed up separately.
MariaDB Enterprise ColumnStore supports multiple storage options.
This page discusses how to backup and restore Enterprise ColumnStore when it uses shared local storage (such as NFS) for the DB Root directories.
Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata become corrupt, Enterprise ColumnStore could become unusable, resulting in data loss.
If Enterprise ColumnStore is your system of record, it should be backed up regularly.
If Enterprise ColumnStore uses shared local storage for the DB Root directories, the following items must be backed up:
The MariaDB data directory is backed up using mariadb-backup.
The Storage Manager directory must be backed up.
Each DB Root directory must be backed up.
See the instructions below for more details.
Use the following process to take a backup:
Determine which node is the primary server using curl to send the status command to the CMAPI Server:
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .

The output will show "dbrm_mode": "master" for the primary server:
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
}

Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:
$ mariadb --host=192.0.2.1 \
--user=root \
--password

Lock the database with the FLUSH TABLES WITH READ LOCK statement:
FLUSH TABLES WITH READ LOCK;

Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.
Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to make a copy of the directory with rsync:
$ sudo mkdir -p /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/storagemanager /backups/columnstore/202101291600/

Make a copy or snapshot of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where the N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment.
For example, to make a copy of the directories with rsync in a 3-node deployment:
$ sudo rsync -av /var/lib/columnstore/data1 /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/data2 /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/data3 /backups/columnstore/202101291600/

Use mariadb-backup to back up the MariaDB data directory:
$ sudo mkdir -p /backups/mariadb/202101291600/
$ sudo mariadb-backup --backup \
--target-dir=/backups/mariadb/202101291600/ \
--user=mariadb-backup \
--password=mbu_passwd

Use mariadb-backup to prepare the backup:
$ sudo mariadb-backup --prepare \
--target-dir=/backups/mariadb/202101291600/

Ensure that all previous operations are complete.
In the original client connection to the primary server, unlock the database with the statement:
UNLOCK TABLES;

Use the following process to restore a backup:
Deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.
Ensure that all services are stopped on each node:
$ sudo systemctl stop mariadb-columnstore-cmapi
$ sudo systemctl stop mariadb-columnstore
$ sudo systemctl stop mariadb

Restore the backup of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to restore the backup with rsync:
$ sudo rsync -av /backups/columnstore/202101291600/storagemanager/ /var/lib/columnstore/storagemanager/
$ sudo chown -R mysql:mysql /var/lib/columnstore/storagemanager

Restore the backup of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where the N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment.
For example, to restore the backup with rsync in a 3-node deployment:
$ sudo rsync -av /backups/columnstore/202101291600/data1/ /var/lib/columnstore/data1/
$ sudo rsync -av /backups/columnstore/202101291600/data2/ /var/lib/columnstore/data2/
$ sudo rsync -av /backups/columnstore/202101291600/data3/ /var/lib/columnstore/data3/
$ sudo chown -R mysql:mysql /var/lib/columnstore/data1
$ sudo chown -R mysql:mysql /var/lib/columnstore/data2
$ sudo chown -R mysql:mysql /var/lib/columnstore/data3

Use mariadb-backup to restore the backup of the MariaDB data directory:
$ sudo mariadb-backup --copy-back \
--target-dir=/backups/mariadb/202101291600/
$ sudo chown -R mysql:mysql /var/lib/mysql

Start the services on each node:
$ sudo systemctl start mariadb
$ sudo systemctl start mariadb-columnstore-cmapi

The ColumnStore engine does not fully support recursive Common Table Expressions (CTEs). Attempting to use recursive CTEs directly against ColumnStore tables typically results in an error.
The purpose of the following examples is to demonstrate three potential workarounds for this issue. The best fit for your organization will depend on your specific needs and ability to refactor queries and adjust your approach.
The example data simulates a simple organizational chart with employees and managers to illustrate the problem and the workarounds.
First, an InnoDB table for comparison:
CREATE TABLE employees_innodb (
id INT PRIMARY KEY,
name VARCHAR(100),
manager_id INT -- references employees.id (nullable for top-level)
);
INSERT INTO employees_innodb (id, name, manager_id) VALUES
(1, 'CEO', NULL),
(2, 'VP of Sales', 1),
(3, 'Sales Rep A', 2),
(4, 'VP of Eng', 1),
(5, 'Eng A', 4),
(6, 'Eng B', 4);
Next, the ColumnStore table, which is where the CTE issue arises:
CREATE TABLE employees (
id INT,
name VARCHAR(100),
manager_id INT -- references employees.id (nullable for top-level)
) engine=columnstore;
INSERT INTO employees (id, name, manager_id) VALUES
(1, 'CEO', NULL),
(2, 'VP of Sales', 1),
(3, 'Sales Rep A', 2),
(4, 'VP of Eng', 1),
(5, 'Eng A', 4),
(6, 'Eng B', 4);
Attempting to run a recursive CTE directly on the employees (ColumnStore) table:
WITH RECURSIVE org_chart AS (
-- Anchor: start with the top-level manager (CEO)
SELECT id, name, manager_id, 0 AS level
FROM employees
WHERE id = 1
UNION ALL
-- Recursive step: find employees who report to the previous level
SELECT e.id, e.name, e.manager_id, oc.level + 1
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT * FROM org_chart;
This will result in the aforementioned error:
ERROR 1178 (42000): The storage engine for the table doesn't support Recursive CTE

Here are three potential workarounds to address the recursive CTE limitation with MariaDB ColumnStore.
You can temporarily bypass ColumnStore's SELECT handler by disabling it at the session level before executing your recursive CTE and then re-enabling it afterwards.
SET SESSION columnstore_select_handler=OFF;
WITH RECURSIVE org_chart AS (
-- Anchor: start with the top-level manager (CEO)
SELECT id, name, manager_id, 0 AS level
FROM employees
WHERE id = 1
UNION ALL
-- Recursive step: find employees who report to the previous level
SELECT e.id, e.name, e.manager_id, oc.level + 1
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT * FROM org_chart;
SET SESSION columnstore_select_handler=ON;
Note: This workaround may not always be effective, as its success can depend on the specific MariaDB server version and table definitions.
If direct recursive CTEs fail or cause server crashes, you can simulate the recursive logic using a stored procedure and a temporary table. This approach iteratively populates the hierarchy.
First, create a temporary table to store the hierarchical data:
CREATE TABLE temp_org_chart (
id INT,
name VARCHAR(100),
manager_id INT,
level INT
);
-- Initialize the temporary table with the top-level employees
INSERT INTO temp_org_chart (id, name, manager_id, level)
SELECT id, name, manager_id, 0 AS level FROM employees WHERE manager_id IS NULL;

Next, create a stored procedure to iteratively populate the temp_org_chart table:
DELIMITER //
CREATE OR REPLACE PROCEDURE populate_org_chart()
BEGIN
DECLARE v_level INT DEFAULT 1;
DECLARE rows_inserted INT DEFAULT 1;
-- Loop until no more rows are inserted, indicating the hierarchy is fully traversed
WHILE rows_inserted > 0 DO
-- Insert employees who report to the previous level
INSERT INTO temp_org_chart (id, name, manager_id, level)
SELECT e.id, e.name, e.manager_id, v_level
FROM employees e
JOIN temp_org_chart t ON e.manager_id = t.id
WHERE t.level = v_level - 1
AND NOT EXISTS (
SELECT 1 FROM temp_org_chart x WHERE x.id = e.id
);
-- Get the number of rows inserted in the current iteration
SET rows_inserted = ROW_COUNT();
-- Increment the level for the next iteration
SET v_level = v_level + 1;
END WHILE;
END //
DELIMITER ;

Finally, call the stored procedure and then select from the populated temporary table:
CALL populate_org_chart();
SELECT * FROM temp_org_chart;

Another robust workaround is to clone the structure and data of the ColumnStore table into an InnoDB table. Once the data resides in an InnoDB table, you can execute the recursive CTE as usual, because InnoDB fully supports recursive CTEs.
This approach involves a few steps, often executed via shell commands interacting with the MariaDB client:
Extract and Modify CREATE TABLE Statement: Use SHOW CREATE TABLE to get the definition of your ColumnStore table, then modify it to change the engine to InnoDB and give the new table a different name (e.g., employees2).
mariadb test -qsNe "SHOW CREATE TABLE employees" \
| awk -F '\t' '{print $2}' \
| sed -e 's/ENGINE=Columnstore/ENGINE=InnoDB/' \
-e 's/CREATE TABLE `employees`/CREATE TABLE `employees2`/' \
> create_employees2.sql
Create New Table and Copy Data: Execute the modified CREATE TABLE script to create the new InnoDB table, then insert all data from the original ColumnStore table into it.
mariadb test < create_employees2.sql
mariadb test -e "INSERT INTO employees2 SELECT * FROM employees"Run Recursive CTE on the InnoDB Table: Now, with the data in employees2 (an InnoDB table), you can run your recursive CTE without issues.
WITH RECURSIVE org_chart AS (
-- Anchor: start with the top-level manager (CEO)
SELECT id, name, manager_id, 0 AS level
FROM employees2
WHERE id = 1
UNION ALL
-- Recursive step: find employees who report to the previous level
SELECT e.id, e.name, e.manager_id, oc.level + 1
FROM employees2 e
JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT * FROM org_chart;

This guide provides steps for deploying a single-node ColumnStore, setting up the environment, installing the software, and bulk importing data for online analytical processing (OLAP) workloads.
This procedure describes the deployment of the Single-Node Enterprise ColumnStore topology with Local storage.
MariaDB Enterprise ColumnStore 23.10 is a columnar storage engine for MariaDB Enterprise Server 10.6. Enterprise ColumnStore is best suited for Online Analytical Processing (OLAP) workloads.
This procedure has 5 steps, which are executed in sequence.
This page provides an overview of the topology, requirements, and deployment procedures.
Please read and understand this procedure before executing.
Customers can obtain support by contacting MariaDB Support.
The following components are deployed during this procedure:
The Single-Node Enterprise ColumnStore topology provides support for Online Analytical Processing (OLAP) workloads to MariaDB Enterprise Server.
The Enterprise ColumnStore node:
Receives queries from the application
Executes queries
Uses the local disk for storage.
Single-Node Enterprise ColumnStore does not provide high availability (HA) for Online Analytical Processing (OLAP). If you would like to deploy Enterprise ColumnStore with high availability, see the multi-node procedures "Deploy ColumnStore Object Storage Topology" and "Deploy ColumnStore Shared Local Storage Topology".
These requirements are for the Single-Node Enterprise ColumnStore, when deployed with MariaDB Enterprise Server 10.6 and MariaDB Enterprise ColumnStore 23.10.
Debian 11 (x86_64, ARM64)
Debian 12 (x86_64, ARM64)
Red Hat Enterprise Linux 8 (x86_64, ARM64)
Red Hat Enterprise Linux 9 (x86_64, ARM64)
Rocky Linux 8 (x86_64, ARM64)
Rocky Linux 9 (x86_64, ARM64)
Ubuntu 20.04 LTS (x86_64, ARM64)
Ubuntu 22.04 LTS (x86_64, ARM64)
Ubuntu 24.04 LTS (x86_64, ARM64)
MariaDB Enterprise ColumnStore's minimum hardware requirements are not intended for production environments, but they can be appropriate for development and test environments. For production environments, see the recommended hardware requirements instead.
The minimum hardware requirements are:
MariaDB Enterprise ColumnStore will refuse to start if the system has less than 3 GB of memory. If Enterprise ColumnStore is started on a system with less memory, an error message is written to the ColumnStore system log called crit.log, and an error is raised to the client.
MariaDB Enterprise ColumnStore's recommended hardware requirements are intended for production analytics.
The recommended hardware requirements are:
MariaDB Enterprise Server packages are configured to read configuration files from different paths, depending on the operating system. Making custom changes to Enterprise Server default configuration files is not recommended because custom changes may be overwritten by other default configuration files that are loaded later.
To ensure that your custom changes will be read last, create a custom configuration file with the z- prefix in one of the include directories.
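A minimal sketch of such a file (the path and settings are illustrative; on RHEL-family systems the include directory is typically /etc/my.cnf.d, and on Debian-family systems /etc/mysql/mariadb.conf.d):

# /etc/my.cnf.d/z-custom-mariadb.cnf
# The z- prefix ensures this file is read after the default configuration files.
[mariadb]
character_set_server = utf8mb4
log_error = mariadbd.err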
The systemctl command is used to start and stop the MariaDB Enterprise Server service.
Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:
Next: Step 1: Install MariaDB Enterprise ColumnStore 23.10.
Step 1: Prepare ColumnStore Nodes
This page details step 1 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".
This step prepares systems to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Enterprise ColumnStore performs best with Linux kernel optimizations.
On each server to host an Enterprise ColumnStore node, optimize the kernel:
Set the relevant kernel parameters in a sysctl configuration file. To ensure proper change management, use an Enterprise ColumnStore-specific configuration file.
Create a /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf file:
Use the sysctl command to set the kernel parameters at runtime.
The Linux Security Modules (LSM) should be temporarily disabled on each Enterprise ColumnStore node during installation.
The LSM will be configured and re-enabled later in this deployment procedure.
The steps to disable the LSM depend on the specific LSM used by the operating system.
SELinux must be set to permissive mode before installing MariaDB Enterprise ColumnStore.
To set SELinux to permissive mode:
Set SELinux to permissive mode:
Set SELinux to permissive mode by setting SELINUX=permissive in /etc/selinux/config.
For example, the file will usually look like this after the change:
Confirm that SELinux is in permissive mode:
SELinux will be configured and re-enabled later in this deployment procedure. This configuration is not persistent. If you restart the server before configuring and re-enabling SELinux later in the deployment procedure, you must reset the enforcement to permissive mode.
AppArmor must be disabled before installing MariaDB Enterprise ColumnStore.
Disable AppArmor:
Reboot the system.
Confirm that no AppArmor profiles are loaded using aa-status:
AppArmor will be configured and re-enabled later in this deployment procedure.
MariaDB Enterprise ColumnStore requires the following TCP ports:
The firewall should be temporarily disabled on each Enterprise ColumnStore node during installation.
The firewall will be configured and re-enabled later in this deployment procedure.
The steps to disable the firewall depend on the specific firewall used by the operating system.
Check if the firewalld service is running:
If the firewalld service is running, stop it:
Firewalld will be configured and re-enabled later in this deployment procedure.
Check if the UFW service is running:
If the UFW service is running, stop it:
UFW will be configured and re-enabled later in this deployment procedure.
To install Enterprise ColumnStore on Amazon Web Services (AWS), the security group must be modified prior to installation.
Enterprise ColumnStore requires all internal communications to be open between Enterprise ColumnStore nodes. Therefore, the security group should allow all protocols and all ports to be open between the Enterprise ColumnStore nodes and the MaxScale proxy.
When using MariaDB Enterprise ColumnStore, it is recommended to set the system's locale to UTF-8.
On RHEL 8, install additional dependencies:
Set the system's locale to en_US.UTF-8 by executing localedef:
MariaDB Enterprise ColumnStore requires all nodes to have host names that are resolvable on all other nodes. If your infrastructure does not configure DNS centrally, you may need to configure static DNS entries in the /etc/hosts file of each server.
On each Enterprise ColumnStore node, edit the /etc/hosts file to map host names to the IP address of each Enterprise ColumnStore node:
Replace the IP addresses with the addresses in your own environment.
With the ColumnStore Object Storage topology, it is important to create the S3 bucket before you start ColumnStore. All Enterprise ColumnStore nodes access data from the same bucket.
If you already have an S3 bucket, confirm that the bucket is empty.
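One way to confirm that an existing bucket is empty is to list its contents with the AWS CLI (the bucket name is a placeholder); no output means the bucket is empty:

$ aws s3 ls s3://your_columnstore_bucket_name --recursive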
S3 bucket configuration will be performed later in this procedure.
Navigation in the procedure "Deploy ColumnStore Object Storage Topology":
This page was step 1 of 9.
Step 7: Start and Configure MariaDB MaxScale
This page details step 7 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".
This step starts and configures MariaDB MaxScale 22.08.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB MaxScale installations include a configuration file with some example objects. This configuration file should be replaced.
On the MaxScale node, replace the default /etc/maxscale.cnf with the following configuration:
For additional information, see "Global Parameters".
On the MaxScale node, restart the MaxScale service to ensure that MaxScale picks up the new configuration:
For additional information, see "Start and Stop Services".
On the MaxScale node, use maxctrl create server to create a server object for each Enterprise ColumnStore node:
MaxScale uses monitors to retrieve additional information from the servers. This information is used by other services in filtering and routing connections based on the current state of the node. For MariaDB Enterprise ColumnStore, use the MariaDB Monitor (mariadbmon).
On the MaxScale node, use maxctrl create monitor to create a MariaDB Monitor:
In this example:
columnstore_monitor is an arbitrary name that is used to identify the new monitor.
mariadbmon is the name of the module that implements the MariaDB Monitor.
user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to monitor the ColumnStore nodes.
password='MAXSCALE_USER_PASSWORD' sets the password parameter to the password used by the database user account that MaxScale uses to monitor the ColumnStore nodes.
replication_user=REPLICATION_USER sets the replication_user parameter to the database user account that MaxScale uses to set up replication.
replication_password='REPLICATION_USER_PASSWORD' sets the replication_password parameter to the password used by the database user account that MaxScale uses to set up replication.
--servers sets the servers parameter to the set of nodes that MaxScale should monitor. All non-option arguments after --servers are interpreted as server names.
Other Module Parameters supported by mariadbmon in MaxScale 22.08 can also be specified.
Routers control how MaxScale balances the load between Enterprise ColumnStore nodes. Each router uses a different approach to routing queries. Consider the specific use case of your application and database load and select the router that best suits your needs.
Use the Read Connection Router (readconnroute) to route connections to replica servers for a read-only pool.
On the MaxScale node, use maxctrl create service to create a router:
In this example:
connection_router_service is an arbitrary name that is used to identify the new service.
readconnroute is the name of the module that implements the Read Connection Router.
user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.
password=MAXSCALE_USER_PASSWORD sets the password parameter to the password used by the database user account that MaxScale uses to connect to the ColumnStore nodes.
router_options=slave sets the router_options parameter to slave, so that MaxScale only routes connections to the replica nodes.
--servers sets the servers parameter to the set of nodes to which MaxScale should route connections. All non-option arguments after --servers are interpreted as server names.
Other Module Parameters supported by readconnroute in MaxScale 22.08 can also be specified.
These instructions reference TCP port 3308. You can use a different TCP port. The TCP port used must not be bound by any other listener.
On the MaxScale node, use the maxctrl create listener command to configure MaxScale to use a listener for the Read Connection Router service:
In this example:
connection_router_service is the name of the readconnroute service that was previously created.
connection_router_listener is an arbitrary name that is used to identify the new listener.
3308 is the TCP port.
protocol=MariaDBClient sets the protocol parameter.
Other Module Parameters supported by listeners in MaxScale 22.08 can also be specified.
MaxScale performs query-based load balancing. The router routes write queries to the primary and read queries to the replicas.
On the MaxScale node, use the maxctrl create service command to configure MaxScale to use the Read/Write Split Router (readwritesplit):
In this example:
query_router_service is an arbitrary name that is used to identify the new service.
readwritesplit is the name of the module that implements the Read/Write Split Router.
user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.
password=MAXSCALE_USER_PASSWORD sets the password parameter to the password used by the database user account that MaxScale uses to connect to the ColumnStore nodes.
--servers sets the servers parameter to the set of nodes to which MaxScale should route queries. All non-option arguments after --servers are interpreted as server names.
Other Module Parameters supported by readwritesplit in MaxScale 22.08 can also be specified.
These instructions reference TCP port 3307. You can use a different TCP port. The TCP port used must not be bound by any other listener.
On the MaxScale node, use the maxctrl create listener command to configure MaxScale to use a listener for the Read/Write Split Router service:
In this example:
query_router_service is the name of the readwritesplit service that was previously created.
query_router_listener is an arbitrary name that is used to identify the new listener.
3307 is the TCP port.
protocol=MariaDBClient sets the protocol parameter.
Other Module Parameters supported by listeners in MaxScale 22.08 can also be specified.
To start the services and monitors, on the MaxScale node use maxctrl start services:
Navigation in the procedure "Deploy ColumnStore Object Storage Topology":
This page was step 7 of 9.
MariaDB ColumnStore automatically creates logical horizontal partitions across every column. For ordered or semi-ordered data fields, such as an order date, this results in a highly effective partitioning scheme based on that column. This allows for increased performance of queries filtering on that column, since partition elimination can be performed. It also allows for data lifecycle management, as data can be disabled or dropped by partition cheaply. Use caution when disabling or dropping partitions; dropping in particular is destructive and cannot be reversed.
It is important to understand that a partition in ColumnStore terms is actually two extents (16 million rows), and that extents and partitions are created according to the following algorithm in 1.0.x:
Create 4 extents in 4 files
When these are filled up (after 32M rows), create 4 more extents in the 4 files created in step 1.
When these are filled up (after 64M rows), create a new partition.
Information about all partitions for a given column can be retrieved using the calShowPartitions stored procedure, which takes two or three parameters: [database_name], table_name, and column_name. If only two parameters are provided, the current database is assumed. For example:
The calEnablePartitions stored procedure enables one or more partitions. The procedure takes the same set of parameters as calDisablePartitions.
For example:
The result showing the first partition has been enabled:
The calDisablePartitions stored procedure disables one or more partitions. A disabled partition still exists on the file system (and can be enabled again at a later time) but will not participate in any query, DML, or import activity. The procedure takes two or three parameters: [database_name], table_name, and partition_numbers separated by commas. If only two parameters are provided, the current database is assumed.
For example:
The result showing the first partition has been disabled:
The calDropPartitions stored procedure drops one or more partitions: the underlying storage is deleted and the partition is completely removed. A partition can be dropped from either the enabled or the disabled state. The procedure takes the same set of parameters as calDisablePartitions. Use extra caution with this procedure, since it is destructive and cannot be reversed.
For example:
The result showing the first partition has been dropped:
Information about a range of partitions for a given column can be retrieved using the calShowPartitionsByValue stored procedure. This procedure takes four or five parameters: [database_name], table_name, column_name, start_value, and end_value. If only four parameters are provided, the current database is assumed. Only casual partition column types (integer, decimal, date, and datetime types, CHAR up to 8 bytes, and VARCHAR up to 7 bytes) are supported for this function.
The function returns a list of partitions whose minimum and maximum values for column_name fall completely within the range of start_value and end_value.
For example:
The calEnablePartitionsByValue stored procedure enables one or more partitions by value. The procedure takes the same set of arguments as calShowPartitionsByValue.
A good practice is to use calShowPartitionsByValue to identify the partitions to be enabled, and then use the same argument values to construct the calEnablePartitionsByValue call.
For example:
The result showing the first partition has been enabled:
The calDisablePartitionsByValue stored procedure disables one or more partitions by value. A disabled partition still exists on the file system (and can be enabled again at a later time) but will not participate in any query, DML, or import activity. The procedure takes the same set of arguments as calShowPartitionsByValue.
A good practice is to use calShowPartitionsByValue to identify the partitions to be disabled, and then use the same argument values to construct the calDisablePartitionsByValue call. For example:
The result showing the first partition has been disabled:
The calDropPartitionsByValue stored procedure drops one or more partitions by value: the underlying storage is deleted and the partition is completely removed. A partition can be dropped from either the enabled or the disabled state. The procedure takes the same set of arguments as calShowPartitionsByValue. A good practice is to use calShowPartitionsByValue to identify the partitions to be dropped, and then use the same argument values to construct the calDropPartitionsByValue call. Use extra caution with this procedure, since it is destructive and cannot be reversed.
For example:
The result showing the first partition has been dropped:
Since the partitioning scheme is system-maintained, the minimum and maximum values are not directly specified, but influenced by the order of data loading. If you want to drop a specific date range, additional deletes are required to achieve this. The following cases may occur:
For semi-ordered data, there may be overlap between minimum and maximum values between partitions.
As in the example above, the partition ranges from 1992-01-01 to 1998-08-02. It may be desirable to drop the remaining 1998 rows.
A bulk-delete statement can be used to delete the remaining rows that do not fall exactly within partition ranges. The partition drops will be fastest; however, the system optimizes bulk-delete statements to delete by block internally. This is still relatively fast.
MariaDB Query Accelerator is an Alpha release. Do not use it in production environments. Query Accelerator works only in ColumnStore 25.10.0 and with MariaDB Enterprise Server 11.8.3+.
Query Accelerator allows MariaDB to use ColumnStore to execute queries that are otherwise executed by InnoDB. Under the hood, ColumnStore:
receives a query;
searches for applicable engine-independent statistics for an InnoDB table index column;
applies a rule-based optimizer (RBO) rule to transform its InnoDB tables into a number of UNION queries over non-overlapping ranges of a suitable InnoDB table index (see the conceptual sketch after this list);
retrieves the data in parallel from MariaDB and runs the query using the ColumnStore runtime.
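A rough conceptual sketch of this transformation (illustrative only, not the literal internal rewrite; the table t, its indexed column id, and the ranges are hypothetical):

-- original query
SELECT SUM(b) FROM t;

-- conceptually rewritten over non-overlapping index ranges,
-- each branch fetched from InnoDB in parallel:
SELECT SUM(s) FROM (
    SELECT SUM(b) AS s FROM t WHERE id BETWEEN 1 AND 1000000
    UNION ALL
    SELECT SUM(b) AS s FROM t WHERE id BETWEEN 1000001 AND 2000000
    UNION ALL
    SELECT SUM(b) AS s FROM t WHERE id > 2000000
) ranges;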
Query Accelerator improves the performance of queries that use aggregation functions (for instance SUM, AVG, MIN, MAX) and GROUP BY, where the performance overhead of pulling the data out of InnoDB can be overcome by the performance gain of running in the ColumnStore engine.
This avoids the bottleneck of having to move data out of InnoDB and into ColumnStore through a separate pipeline. Query Accelerator strives to parallelize reads from InnoDB by using table statistics to assign multiple threads to distinct data ranges on disk. If the InnoDB table in question has a suitable index, Query Accelerator can retrieve the data much faster.
Example of a query benefitting from Query Accelerator (assuming column_a is indexed):
The effectiveness of Query Accelerator can vary depending on the type of queries you run and the specific characteristics of your database schema. Certain types of queries or configurations may not benefit from Query Accelerator, or could potentially experience decreased performance. It's essential to understand when Query Accelerator is most advantageous and when traditional InnoDB operations might be more efficient. Consider the following points to optimize query performance with Query Accelerator:
Make sure your query uses tables that are indexed, and that the first column of the index key is an integer column.
Also, run ANALYZE TABLE before running Query Accelerator.
Performance issues occur for queries like this:
InnoDB generally handles such column-to-column comparisons much better than ColumnStore, and under Query Accelerator they would be even slower.
Generally, if your query takes longer than a minute in InnoDB, try Query Accelerator.
Query Accelerator has the same limitations as ColumnStore in general, in that there is a limited set of syntax and data types it can handle. Therefore, be aware of:
syntax or functions that ColumnStore does not support;
data types ColumnStore does not support.
Edit the MariaDB configuration file (my.cnf or my.ini)
Locate (or create) the [mariadb] section, and add a line enabling Query Accelerator, like this:
Restart MariaDB Server for the change to take effect.
Run queries to turn on Query Accelerator
Set these parameters in a client session:
To use Query Accelerator just for one query, you have to run those SET statements per query, not per session. Setting them per session effectively disables the MariaDB Optimizer for subsequent queries that ColumnStore cannot execute.
There must be engine-independent statistics for an InnoDB table index column so that it can be used for Query Accelerator.
columnstore_unstable_optimizer
Enables the unstable optimizer that is required for the Query Accelerator RBO rule.
columnstore_select_handler
Enables or disables ColumnStore processing for InnoDB tables.
columnstore_query_accel_parallel_factor
Controls the number of parallel ranges used by Query Accelerator.
Watch out for max_connections. If you set columnstore_query_accel_parallel_factor to a high value, you may need to increase max_connections to avoid connection pool exhaustion.
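For instance, a quick way to check and raise the limit in a client session (the value 512 is only an illustration):

SHOW VARIABLES LIKE 'max_connections';
SET GLOBAL max_connections = 512;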
There are two ways to verify Query Accelerator is being used:
Use select mcs_get_plan('rules') to get a list of the rules that were applied to the query.
Look for patterns like derived table - $added_sub_#db_name_#table_name_X in the optimized plan using select mcs_get_plan('optimized').
This example shows a SUM(x) GROUP BY y query which runs in ~2.6s in InnoDB with indexes, and about 3x faster (~0.7s) via ColumnStore query acceleration, provided there is enough CPU and a high enough parallel factor.
In mariadb (MariaDB command-line client), run these statements:
Turn on Query Accelerator - On CLI:
In mariadb (MariaDB command-line client), run these statements:
Log out of mariadb (MariaDB command-line client), and log in again.
In mariadb (MariaDB command-line client), run these statements:
Turn off Query Accelerator - On CLI:
Tail the ColumnStore log debug.log, and confirm parallel access to InnoDB:
Increase or decrease parallelism with columnstore_ces_optimization_parallel_factor. Keep in mind you need enough max_connections in MariaDB server:
Check the execution plan via EXPLAIN FORMAT=JSON. It should say Pushed select:
Verify that mcs_get_plan shows parallel_ces, and that the detailed ColumnStore execution plan shows derived table:
The high level components of the ColumnStore architecture are:
PrimProc: PrimProc (Primitives Processor) is responsible for parsing SQL requests into an optimized set of primitive job steps executed by one or more servers. PrimProc is thus responsible for query optimization and orchestration of query execution by the servers. While every instance has its own PrimProc in a multi-server deployment, each query begins and ends on the PrimProc it originated from. A database load balancer, such as MariaDB MaxScale, can be deployed to appropriately balance external requests across individual servers. PrimProc also executes granular job steps received from the server (mariadbd) in a multi-threaded manner. ColumnStore allows distribution of the work across many servers.
Extent Maps: ColumnStore maintains metadata about each column in a shared distributed object known as the Extent Map. The primary node references the Extent Map to generate the correct primitive job steps and to identify the correct disk blocks to read. Each column is made up of one or more files, and each file can contain multiple extents. As much as possible, the system attempts to allocate contiguous physical storage to improve read performance.
Storage: ColumnStore can use either local storage or shared storage (e.g. SAN or EBS) to store data. Using shared storage allows for data processing to fail over to another node automatically in case of a server failing.
The system supports full MVCC ACID transactional logic via Insert, Update, and Delete statements. The MVCC architecture allows for concurrent query and DML / batch load. Although DML is supported, the system is optimized more for batch inserts and so larger data loads should be achieved through a batch load. The most flexible and optimal way to load data is via the cpimport tool. This tool optimizes the load path and can be run centrally or in parallel on each server.
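For example, a minimal cpimport invocation might look like this (database, table, and file names are placeholders; -s sets the field delimiter):

$ cpimport mydb mytable /tmp/mytable.csv -s ','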
If the data contains a time column (or another time-correlated, ascending value), significant performance gains are achieved if the data is sorted by this field and typically queried with a WHERE clause on that column. This is because the system records a minimum and maximum value for each extent, providing a system-maintained range partitioning scheme. This allows the system to skip scanning an extent entirely if the query includes a WHERE clause on that field that limits the results to a subset of extents.
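For instance, with the orders table used in the partitioning examples, a date-bounded query lets the system skip any extent whose recorded minimum/maximum range falls entirely outside the filter:

SELECT COUNT(*) FROM orders
WHERE orderdate BETWEEN '2004-01-01' AND '2004-12-31';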
MariaDB ColumnStore has its own query optimizer and execution engine distinct from the MariaDB server implementation. This allows for scaling out query execution to multiple servers, and to optimize for handling data stored as columns rather than rows. As such, the factors influencing query performance are very different:
A query is first parsed by the MariaDB server (mariadbd) process and passed through to the ColumnStore storage engine. This passes the request onto the PrimProc process, which is responsible for optimizing and orchestrating execution of the query. The PrimProc module's optimizer creates a series of batch primitive steps that are executed on all nodes in the cluster. Since multiple servers can be deployed, this allows for scale-out execution of the queries. The optimizer attempts to process query execution in parallel. However, certain operations inherently must be executed centrally, for example final result ordering. Filtering, joins, aggregates, and GROUP BY clauses are generally pushed down and executed in parallel in PrimProc on all servers. In PrimProc, batch primitive steps are performed at a granular level where individual threads operate on individual 1K-8K blocks within an extent. This enables a larger multi-core server to be fully consumed and to scale within a single server. The current batch primitive steps available in the system include:
Single Column Scan: Scan one or more extents for a given column based on a single column predicate, including operators like =, <>, IN (list), BETWEEN, and ISNULL. See the first scan section for additional details on tuning this.
Additional Single Column Filters: Project additional columns for any rows found by a previous scan and apply additional single column predicates as needed. Access of blocks is based on row identifier, going directly to the blocks. See the additional column read section for additional details on tuning this.
Table Level Filters: Project additional columns as required for any table level filters such as column1 < column2, or more advanced functions and expressions. Access of blocks is again based on row identifier, going directly to the blocks.
Project Join Columns for Joins: Project additional join columns as needed for any join operations. Access of blocks is again based on row identifier, going directly to the blocks. See the join tuning section for additional details on tuning this.
Execute Multi-Join: Apply one or more hash join operations against projected join columns, and use that value to probe a previously built hash map. Build out tuples as needed to satisfy inner or outer join requirements. See the multi-table join section for additional details on tuning this.
Cross-Table Level Filters: Project additional columns from the range of rows for the Primitive Step as needed for any cross-table level filters such as table1.column1 < table2.column2, or more advanced functions and expressions. Access of blocks is again based on row identifier, going directly to the blocks.
Aggregation/Distinct Operation Part 1: Apply any local group by, distinct, or aggregation operation against the set of joined rows assigned to a given Batch Primitive. Part 1 of this process is handled by PrimProc.
Aggregation/Distinct Operation Part 2: Apply any final group by, distinct, or aggregation operation against the set of joined rows assigned to a given Batch Primitive. This processing is handled by PrimProc. See the memory management section for additional details on tuning this.
The following items should be considered when thinking about query execution in ColumnStore vs a row based store such as InnoDB.
ColumnStore is optimized for large-scale aggregation / OLAP queries over large data sets. As such, indexes typically used to optimize query access in row-based systems do not make sense, since selectivity is low for such queries. Instead, ColumnStore gains performance by scanning only the necessary columns, utilizing system-maintained partitioning, and utilizing multiple threads and servers to scale query response time.
Since ColumnStore reads only the columns necessary to resolve a query, include only the columns you require. For example, SELECT * is significantly slower than SELECT col1, col2 FROM tbl.
Datatype size is important. If, say, you have a column that can only have values 0 through 100, declare it as a TINYINT, which is represented with 1 byte rather than the 4 bytes of an INT. This reduces the I/O cost by a factor of 4.
For string types, an important threshold is CHAR(9) and VARCHAR(8) or greater. Each column storage file uses a fixed number of bytes per value. This enables fast positional lookup of other columns to form the row. Currently, the upper limit for columnar data storage is 8 bytes. So, for strings longer than this, the system maintains an additional 'dictionary' extent where the values are stored. The columnar extent file then stores a pointer into the dictionary. For example, it is more expensive to read and process a VARCHAR(8) column than a CHAR(8) column. Where possible, you get better performance if you can utilize shorter strings, especially if you avoid the dictionary lookup. All TEXT/BLOB data types in ColumnStore 1.1 onward utilize a dictionary and perform a multiple-block 8KB lookup to retrieve that data if required. The longer the data, the more blocks are retrieved, and the greater the potential performance impact.
In a row-based system, adding redundant columns adds to the overall query cost, but in a columnar system a cost is only incurred if the column is referenced. Therefore, additional columns can be created to support different access paths. For instance, store a leading portion of a field in one column to allow for faster lookups, and additionally store the long-form value as another column. Scans on a shorter code or leading-portion column are faster.
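A minimal DDL sketch of these sizing guidelines (table and column names are illustrative):

CREATE TABLE events (
    pct        TINYINT,       -- values 0-100 fit in 1 byte instead of 4
    code       CHAR(8),       -- 8 bytes or less: stored inline, no dictionary
    descr      VARCHAR(500),  -- longer than 8 bytes: stored via a dictionary extent
    descr_lead CHAR(8)        -- redundant leading portion of descr for fast scans
) ENGINE=ColumnStore;

-- scans on the short column avoid the dictionary lookup:
SELECT COUNT(*) FROM events WHERE descr_lead = 'ERROR';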
ColumnStore distributes function application across all nodes for greater performance, but this requires a distributed implementation of the function in addition to the MariaDB server implementation. See the documentation for the full list of distributed functions.
Hash joins are utilized by ColumnStore to optimize for large-scale joins, avoiding the need for indexes and the overhead of nested-loop processing. ColumnStore maintains table statistics to determine the optimal join order. This is implemented by first identifying the smaller table side (based on Extent Map data) and materializing the necessary rows from that table for the join. If the size of this is less than the configuration setting PmMaxMemorySmallSide, the join is pushed down into PrimProc for distributed in-memory processing. Otherwise, the rows from the larger side are not processed in a distributed manner for joining, and only the WHERE clause on that side is executed across all PrimProc modules in the cluster. If the join is too large for memory, disk-based joins can be enabled to allow the query to complete.
As with scalar functions, ColumnStore distributes aggregate evaluation as much as possible. However, some post-processing is required to combine the final results. Enough memory must exist to handle queries with a very large number of values in the aggregate columns.
Aggregation performance is also influenced by the number of distinct values in the aggregated columns. Generally, the same number of rows with 100 distinct values computes faster than with 10,000 distinct values. This is due to increased memory management as well as transfer overhead.
ORDER BY and LIMIT are implemented at the very end by the mariadbd server process on the temporary result-set table. This means the unsorted results must be fully retrieved before either is applied. The performance overhead of this is minimal on small to medium results, but for larger results it can be significant.
Subqueries are executed in sequence; the subquery's intermediate results must be materialized before the join logic is applied to the outer query.
Window functions are executed as part of final aggregation in PrimProc, due to the need for ordering of the window results. The ColumnStore window function engine uses a dedicated, faster sort process.
Automated system partitioning of columns is provided by ColumnStore. As data is loaded into extents, the system captures and maintains the min/max values of the column data in each extent. New rows are appended until an extent is full, at which point a new extent is created. For column values that are ordered or semi-ordered, this allows for very effective data partitioning. By using the min and max values, entire extents can be eliminated and not read to filter data. This generally works particularly well for time-dimension / series data or similar values that increase over time.
MariaDB Enterprise Server
Modern SQL RDBMS with high availability, pluggable storage engines, hot online backups, and audit logging.
Columnar Storage Engine
Optimized for Online Analytical Processing (OLAP) workloads

Enterprise ColumnStore node (minimum): 4+ cores, 16+ GB memory
Enterprise ColumnStore node (recommended): 64+ cores, 128+ GB memory

If a node has insufficient memory, ColumnStore reports errors such as:

Apr 30 21:54:35 a1ebc96a2519 PrimProc[1004]: 35.668435 |0|0|0| C 28 CAL0000: Error total memory available is less than 3GB.
ERROR 1815 (HY000): Internal error: System is not ready yet. Please try again.
Configuration File
Configuration files (such as /etc/my.cnf) can be used to set system variables and options. The server must be restarted to apply changes made to configuration files.
Command-line
The server can be started with command-line options that set system variables and options.
SQL
Users can set system variables that support dynamic changes on-the-fly using the SET statement.
CentOS, Red Hat Enterprise Linux (RHEL): /etc/my.cnf.d/z-custom-mariadb.cnf
Debian, Ubuntu: /etc/mysql/mariadb.conf.d/z-custom-mariadb.cnf
Start: sudo systemctl start mariadb
Stop: sudo systemctl stop mariadb
Restart: sudo systemctl restart mariadb
Enable during startup: sudo systemctl enable mariadb
Disable during startup: sudo systemctl disable mariadb
Status: sudo systemctl status mariadb

# minimize swapping
vm.swappiness = 1
# Increase the TCP max buffer size
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Increase the TCP buffer limits
# min, default, and max number of bytes to use
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# don't cache ssthresh from previous connection
net.ipv4.tcp_no_metrics_save = 1
# for 1 GigE, increase this to 2500
# for 10 GigE, increase this to 30000
net.core.netdev_max_backlog = 2500

$ sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf

$ sudo setenforce permissive

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=permissive
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted

$ sudo getenforce
Permissive

$ sudo systemctl disable apparmor

$ sudo aa-status
apparmor module is loaded.
0 profiles are loaded.
0 profiles are in enforce mode.
0 profiles are in complain mode.
0 processes have profiles defined.
0 processes are in enforce mode.
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.

3306: Port used for MariaDB Client traffic
8600-8630: Port range used for inter-node communication
8640: Port used by CMAPI
8700: Port used for inter-node communication
8800: Port used for inter-node communication
$ sudo systemctl status firewalld

$ sudo systemctl stop firewalld

$ sudo ufw status verbose

$ sudo ufw disable

$ sudo yum install glibc-locale-source glibc-langpack-en

$ sudo localedef -i en_US -f UTF-8 en_US.UTF-8

192.0.2.1 mcs1
192.0.2.2 mcs2
192.0.2.3 mcs3
192.0.2.100 mxs1

[maxscale]
threads = auto
admin_host = 0.0.0.0
admin_secure_gui = false

$ sudo systemctl restart maxscale

$ maxctrl create server mcs1 192.0.2.101
$ maxctrl create server mcs2 192.0.2.102
$ maxctrl create server mcs3 192.0.2.103

$ maxctrl create monitor columnstore_monitor mariadbmon \
user=mxs \
password='MAXSCALE_USER_PASSWORD' \
replication_user=repl \
replication_password='REPLICATION_USER_PASSWORD' \
--servers mcs1 mcs2 mcs3

Connection-based load balancing
Routes connections to Enterprise ColumnStore nodes designated as replica servers for a read-only pool
Routes connections to an Enterprise ColumnStore node designated as the primary server for a read-write pool.
Query-based load balancing
Routes write queries to an Enterprise ColumnStore node designated as the primary server
Routes read queries to Enterprise ColumnStore nodes designated as replica servers
Automatically reconnects after node failures
Automatically replays transactions after node failures
Optionally enforces causal reads
$ maxctrl create service connection_router_service readconnroute \
user=mxs \
password='MAXSCALE_USER_PASSWORD' \
router_options=slave \
--servers mcs1 mcs2 mcs3

$ maxctrl create listener connection_router_service connection_router_listener 3308 \
protocol=MariaDBClient

$ maxctrl create service query_router_service readwritesplit \
user=mxs \
password='MAXSCALE_USER_PASSWORD' \
--servers mcs1 mcs2 mcs3

$ maxctrl create listener query_router_service query_router_listener 3307 \
protocol=MariaDBClient

$ maxctrl start services

select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)

select calEnablePartitions('orders', '0.0.1');
+----------------------------------------+
| calEnablePartitions('orders', '0.0.1') |
+----------------------------------------+
| Partitions are enabled successfully. |
+----------------------------------------+
1 row in set (0.28 sec)

select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)

select calDisablePartitions('orders','0.0.1');
+----------------------------------------+
| calDisablePartitions('orders','0.0.1') |
+----------------------------------------+
| Partitions are disabled successfully. |
+----------------------------------------+
1 row in set (0.28 sec)

select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Disabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)

select calDropPartitions('orders', '0.0.1');
+--------------------------------------+
| calDropPartitions('orders', '0.0.1') |
+--------------------------------------+
| Partitions are enabled successfully |
+--------------------------------------+
1 row in set (0.28 sec)

select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)

select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsbyvalue('orders','orderdate', '1992-01-02', '2010-07-24') |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)

select calEnablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+--------------------------------------------------------------------------------+
| calenablepartitionsbyvalue ('orders', 'o_orderdate','1992-01-01','1998-08-02') |
+--------------------------------------------------------------------------------+
| Partitions are enabled successfully |
+--------------------------------------------------------------------------------+
1 row in set (0.28 sec)

select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsbyvalue('orders','orderdate', '1992-01-02','2010-07-24' ) |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)

select calDisablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+---------------------------------------------------------------------------------+
| caldisablepartitionsbyvalue ('orders', 'o_orderdate','1992-01-01','1998-08-02') |
+---------------------------------------------------------------------------------+
| Partitions are disabled successfully |
+---------------------------------------------------------------------------------+
1 row in set (0.28 sec)

select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsbyvalue('orders','orderdate', '1992-01-02','2010-07-24' ) |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Disabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)

select calDropPartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+------------------------------------------------------------------------------+
| caldroppartitionsbyvalue ('orders', 'o_orderdate','1992-01-01','1998-08-02') |
+------------------------------------------------------------------------------+
| Partitions are enabled successfully. |
+------------------------------------------------------------------------------+
1 row in set (0.28 sec)

select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsbyvalue('orders','orderdate', '1992-01-02','2010-07-24' ) |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)

DELETE FROM orders WHERE orderdate <= '1998-12-31';

SELECT column_a, SUM(column_b) FROM innodb_table GROUP BY column_a;

SELECT column_a FROM tbl WHERE column_a = column_b;

[mariadb]
columnstore_innodb_queries_use_mcs = on

SET columnstore_unstable_optimizer=ON;
SET optimizer_switch="index_merge=off,index_merge_union=off,index_merge_sort_union=off,index_merge_intersection=off,index_merge_sort_intersection=off,index_condition_pushdown=off,derived_merge=off,derived_with_keys=off,firstmatch=off,loosescan=off,materialization=on,in_to_exists=off,semijoin=off,partial_match_rowid_merge=off,partial_match_table_scan=off,subquery_cache=off,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=off,join_cache_hashed=off,join_cache_bka=off,optimize_join_buffer_size=off,table_elimination=off,extended_keys=off,exists_to_in=off,orderby_uses_equalities=off,condition_pushdown_for_derived=on,split_materialized=off,condition_pushdown_for_subquery=off,rowid_filter=off,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=off,cset_narrowing=off,sargable_casefold=off";ANALYZE TABLE table_name PERSISTENT FOR COLUMNS (column_name) indexes();CREATE DATABASE IF NOT EXISTS test; USE test;
CREATE TABLE IF NOT EXISTS test.customer_indexed ( `c_d_id` int(2) NOT NULL, `c_w_id` int(6) NOT NULL, `c_first` varchar(16) , `c_middle` char(2) , `c_last` varchar(16) , `c_street_1` varchar(20) , `c_street_2` varchar(20) , `c_city` varchar(20) , `c_state` char(2) , `c_zip` int(5) , `c_phone` char(16) , `c_since` datetime DEFAULT NULL, `c_credit` char(2) , `c_credit_lim` decimal(12,2) DEFAULT NULL, `c_discount` decimal(4,4) DEFAULT NULL, `c_balance` decimal(12,2) DEFAULT NULL, `c_ytd_payment` decimal(12,2) DEFAULT NULL, `c_payment_cnt` int(8) DEFAULT NULL, `c_delivery_cnt` int(8) DEFAULT NULL, `c_data` varchar(500)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO test.customer_indexed SELECT ROUND(RAND() * 42000, 0), ROUND(RAND() * 42000, 0), substring(MD5(RAND()*1000000000),1,16), substring(MD5(RAND()),1,2), substring(MD5(RAND()*1000000000),1,16), substring(MD5(RAND()*1000000000),1,20), substring(MD5(RAND()*1000000000),1,20), substring(MD5(RAND()*1000000000),1,20), substring(MD5(RAND()),1,2), ROUND(RAND() * 42000, 0), substring(MD5(RAND()),1,16), CURRENT_TIMESTAMP - INTERVAL FLOOR(RAND() * 365 * 24 * 60 *60) SECOND, substring(MD5(RAND()),1,2), ROUND(RAND() * 9999999999, 2), ROUND(RAND() * 0, 4), ROUND(RAND() * 9999999999, 2), ROUND(RAND() * 9999999999, 2), ROUND(RAND() * 42000, 0), ROUND(RAND() * 42000, 0), substring(MD5(RAND()*1000000000),1,500) FROM seq_1_to_8000000; -- 3.5 min
ALTER TABLE test.customer_indexed ADD INDEX idx_fast (`c_zip`, `c_payment_cnt`); -- ~1.5 min
-- baseline
SELECT c_zip, sum(c_payment_cnt) FROM test.customer_indexed GROUP BY c_zip ORDER BY c_zip; -- 2.6s

sed -i 's/^#columnstore_innodb_queries_use_mcs = on/columnstore_innodb_queries_use_mcs = on/' /etc/my.cnf.d/columnstore.cnf
systemctl restart mariadb

# In mariadb (MariaDB command-line client)
USE test;
ANALYZE table test.customer_indexed PERSISTENT FOR COLUMNS (c_zip,c_payment_cnt) indexes(); --8s
SELECT table_name, column_name, hist_type FROM mysql.column_stats WHERE table_name="customer_indexed";
SHOW VARIABLES LIKE "%columnstore_innodb_queries_use_mcs%";

SET columnstore_unstable_optimizer=ON;
SET optimizer_switch='index_merge=off,index_merge_union=off,index_merge_sort_union=off,index_merge_intersection=off,index_merge_sort_intersection=off,index_condition_pushdown=off,derived_merge=off,derived_with_keys=off,firstmatch=off,loosescan=off,materialization=on,in_to_exists=off,semijoin=off,partial_match_rowid_merge=off,partial_match_table_scan=off,subquery_cache=off,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=off,join_cache_hashed=off,join_cache_bka=off,optimize_join_buffer_size=off,table_elimination=off,extended_keys=off,exists_to_in=off,orderby_uses_equalities=off,condition_pushdown_for_derived=on,split_materialized=off,condition_pushdown_for_subquery=off,rowid_filter=off,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=off,cset_narrowing=off,sargable_casefold=off';
SELECT c_zip, sum(c_payment_cnt) FROM test.customer_indexed GROUP BY c_zip ORDER BY c_zip; -- 0.7s

sed -i 's/^columnstore_innodb_queries_use_mcs = on/#columnstore_innodb_queries_use_mcs = on/' /etc/my.cnf.d/columnstore.cnf
systemctl restart mariadb

tail -f /var/log/mariadb/columnstore/debug.log

SET columnstore_ces_optimization_parallel_factor=100;

EXPLAIN FORMAT=JSON SELECT c_zip, SUM(c_payment_cnt) FROM test.customer_indexed GROUP BY c_zip ORDER BY c_zip;
...
| {
"query_block": {
"select_id": 1,
"table": {
"message": "Pushed select"
}
}
} |
...

SELECT mcs_get_plan('rules');
+-----------------------+
| mcs_get_plan('rules') |
+-----------------------+
| parallel_ces |
+-----------------------+
SELECT mcs_get_plan('optimized');
+-----------------------+
| mcs_get_plan('optimized') |
+-----------------------+
...
>>From Tables
derived table - $added_sub_test_customer_indexed_0
Step 5: Test MariaDB Enterprise Server
This page details step 5 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".
This step tests MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
Use Systemd to test whether the MariaDB Enterprise Server service is running. This action is performed on each Enterprise ColumnStore node.
Check if the MariaDB Enterprise Server service is running by executing the following:
$ systemctl status mariadb

If the service is not running on any node, start the service by executing the following on that node:
$ sudo systemctl start mariadb

Use MariaDB Client to test the local connection to the Enterprise Server node.
This action is performed on each Enterprise ColumnStore node:
$ sudo mariadb
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 38
Server version: 11.4.5-3-MariaDB-Enterprise MariaDB Enterprise Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]>

The sudo command is used here to connect to the Enterprise Server node using the root@localhost user account, which authenticates using the unix_socket authentication plugin. Other user accounts can be used by specifying the --user and --password command-line options.
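For example, to connect with a different account (the user name is a placeholder):

$ mariadb --user=db_user --password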
Query the information_schema.PLUGINS table to confirm that the ColumnStore storage engine is loaded.
This action is performed on each Enterprise ColumnStore node.
Execute the following query:
SELECT PLUGIN_NAME, PLUGIN_STATUS
FROM information_schema.PLUGINS
WHERE PLUGIN_LIBRARY LIKE 'ha_columnstore%';
+---------------------+---------------+
| PLUGIN_NAME | PLUGIN_STATUS |
+---------------------+---------------+
| Columnstore | ACTIVE |
| COLUMNSTORE_COLUMNS | ACTIVE |
| COLUMNSTORE_TABLES | ACTIVE |
| COLUMNSTORE_FILES | ACTIVE |
| COLUMNSTORE_EXTENTS | ACTIVE |
+---------------------+---------------+

The PLUGIN_STATUS column for each ColumnStore-related plugin should contain ACTIVE.
Use Systemd to test whether the CMAPI service is running. This action is performed on each Enterprise ColumnStore node.
Check if the CMAPI service is running by executing the following:
$ systemctl status mariadb-columnstore-cmapi

If the service is not running on any node, start the service by executing the following on that node:
$ sudo systemctl start mariadb-columnstore-cmapi

Use CMAPI to request the ColumnStore status. The API key needs to be provided as part of the X-API-key HTTP header.
This action is performed with the CMAPI service on the primary server.
Check the ColumnStore status using curl by executing the following:
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .

{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
},
"192.0.2.2": {
"timestamp": "2020-12-15 00:40:34.428554",
"uptime": 11437,
"dbrm_mode": "slave",
"cluster_mode": "readonly",
"dbroots": [
"2"
],
"module_id": 2,
"services": [
{
"name": "workernode",
"pid": 17789
},
{
"name": "PrimProc",
"pid": 17813
},
{
"name": "ExeMgr",
"pid": 17854
},
{
"name": "WriteEngine",
"pid": 17877
}
]
},
"192.0.2.3": {
"timestamp": "2020-12-15 00:40:34.428554",
"uptime": 11437,
"dbrm_mode": "slave",
"cluster_mode": "readonly",
"dbroots": [
"2"
],
"module_id": 2,
"services": [
{
"name": "workernode",
"pid": 17789
},
{
"name": "PrimProc",
"pid": 17813
},
{
"name": "ExeMgr",
"pid": 17854
},
{
"name": "WriteEngine",
"pid": 17877
}
]
},
"num_nodes": 3
}

Use MariaDB Client to test DDL.
On the primary server, use the MariaDB Client to connect to the node:
$ sudo mariadb

Create a test database and ColumnStore table:
CREATE DATABASE IF NOT EXISTS test;
CREATE TABLE IF NOT EXISTS test.contacts (
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(100)
) ENGINE = ColumnStore;

On each replica server, use the MariaDB Client to connect to the node:
$ sudo mariadb

Confirm that the database and table exist:
SHOW CREATE TABLE test.contacts\G

If the database or table do not exist on any node, then check the replication configuration.
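One quick check on a replica, assuming standard MariaDB replication between the nodes, is:

SHOW REPLICA STATUS\G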
Use MariaDB Client to test DML.
On the primary server, use the MariaDB Client to connect to the node:
$ sudo mariadb

Insert sample data into the table created in the DDL test:
INSERT INTO test.contacts (first_name, last_name, email)
VALUES
("Kai", "Devi", "kai.devi@example.com"),
("Lee", "Wang", "lee.wang@example.com");On each replica server, use the MariaDB Client to connect to the node:
$ sudo mariadb

Execute a query to retrieve the data:
SELECT * FROM test.contacts;
+------------+-----------+----------------------+
| first_name | last_name | email |
+------------+-----------+----------------------+
| Kai | Devi | kai.devi@example.com |
| Lee | Wang | lee.wang@example.com |
+------------+-----------+----------------------+

If the data is not returned on any node, check the ColumnStore status and the storage configuration.
Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".
This page was step 5 of 9.
Step 3: Start and Configure Enterprise ColumnStore
This page details step 3 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.
This step starts and configures MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
Mandatory system variables and options for Single-Node Enterprise ColumnStore include:
character_set_server
Set this system variable to utf8.
collation_server
Set this system variable to utf8_general_ci.
columnstore_use_import_for_batchinsert
Set this system variable to ALWAYS to always use cpimport for LOAD DATA INFILE and INSERT INTO ... SELECT FROM ... statements.
[mariadb]
log_error = mariadbd.err
character_set_server = utf8
collation_server = utf8_general_ci

Configure Enterprise ColumnStore S3 Storage Manager to use S3-compatible storage by editing the /etc/columnstore/storagemanager.cnf configuration file:
[ObjectStorage]
…
service = S3
…
[S3]
bucket = your_columnstore_bucket_name
endpoint = your_s3_endpoint
aws_access_key_id = your_s3_access_key_id
aws_secret_access_key = your_s3_secret_key
# iam_role_name = your_iam_role
# sts_region = your_sts_region
# sts_endpoint = your_sts_endpoint
# ec2_iam_mode = enabled
[Cache]
cache_size = your_local_cache_size
path = your_local_cache_path

The S3-compatible object storage options are configured under [S3]:
The bucket option must be set to the name of the bucket that you created in "Create an S3 Bucket".
The endpoint option must be set to the endpoint for the S3-compatible object storage.
The aws_access_key_id and aws_secret_access_key options must be set to the access key ID and secret access key for the S3-compatible object storage.
To use a specific IAM role, you must uncomment and set iam_role_name, sts_region, and sts_endpoint.
To use the IAM role assigned to an EC2 instance, you must uncomment ec2_iam_mode=enabled.
The local cache options are configured under [Cache]:
The cache_size option is set to 2 GB by default.
The path option is set to /var/lib/columnstore/storagemanager/cache by default.
Ensure that the specified path has sufficient storage space for the specified cache size.
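For example, to check the available space at the default cache path:

$ df -h /var/lib/columnstore/storagemanager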
Start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:
$ sudo systemctl start mariadb
$ sudo systemctl enable mariadb

Start and enable the MariaDB Enterprise ColumnStore service, so that it starts automatically upon reboot:
$ sudo systemctl start mariadb-columnstore
$ sudo systemctl enable mariadb-columnstore

Enterprise ColumnStore requires a mandatory utility user account to perform cross-engine joins and similar operations.
Create the user account with the CREATE USER statement:
CREATE USER 'util_user'@'127.0.0.1'
IDENTIFIED BY 'util_user_passwd';

Grant the user account SELECT privileges on all databases with the GRANT statement:
GRANT SELECT, PROCESS ON *.*
TO 'util_user'@'127.0.0.1';

Configure Enterprise ColumnStore to use the utility user:
$ sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
$ sudo mcsSetConfig CrossEngineSupport Port 3306
$ sudo mcsSetConfig CrossEngineSupport User util_user

Set the password:
$ sudo mcsSetConfig CrossEngineSupport Password util_user_passwd

For details about how to encrypt the password, see "Credentials Management for MariaDB Enterprise ColumnStore".
Passwords should meet your organization's password policies. If your MariaDB Enterprise Server instance has a password validation plugin installed, then the password should also meet the configured requirements.
The specific steps to configure the security module depend on the operating system.
Configure SELinux for Enterprise ColumnStore:
To configure SELinux, you have to install the packages required for audit2allow. On CentOS 7 and RHEL 7, install the following:
$ sudo yum install policycoreutils policycoreutils-python

On RHEL 8, install the following:
$ sudo yum install policycoreutils python3-policycoreutils policycoreutils-python-utilsAllow the system to run under load for a while to generate SELinux audit events.
After the system has taken some load, generate an SELinux policy from the audit events using audit2allow:
$ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local

If no audit events were found, this will print the following:
$ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
Nothing to do

If audit events were found, the new SELinux policy can be loaded using semodule:
$ sudo semodule -i mariadb_local.pp

Set SELinux to enforcing mode by setting SELINUX=enforcing in /etc/selinux/config.
For example, the file will usually look like this after the change:
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=enforcing
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted

Set SELinux to enforcing mode:
$ sudo setenforce enforcing

For information on how to create a profile, see How to create an AppArmor Profile on ubuntu.com.
Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:
This page was step 3 of 5.
The ColumnStore Bulk Data API enables the creation of higher-performance adapters for ETL integration and data ingestion. The Streaming Data Adapters are out-of-the-box adapters that use these APIs for specific data sources and use cases.
The MaxScale CDC Data Adapter integrates MaxScale CDC streams into MariaDB ColumnStore.
The Kafka Data Adapter integrates Kafka streams into MariaDB ColumnStore.
The MaxScale CDC Data Adapter has been deprecated.
The MaxScale CDC Data Adapter allows streaming change data events (binary log events) from a MariaDB master server hosting non-ColumnStore engines (InnoDB, MyRocks, MyISAM) to MariaDB ColumnStore. In other words, it replicates data from a MariaDB master server to MariaDB ColumnStore. It acts as a CDC client for MaxScale and uses the events received from MaxScale as input to the MariaDB ColumnStore Bulk Data API to push the data to MariaDB ColumnStore.
It registers with MariaDB MaxScale as a CDC client using the MaxScale CDC Connector API, receiving change data records from MariaDB MaxScale (converted from binlog events received from the master on MariaDB TX) in JSON format. Then, using the MariaDB ColumnStore Bulk Write SDK, it converts the JSON data into API calls and streams them to a MariaDB PM node. The adapter has options to insert all the events in the same schema as the source database table, or to insert each event with metadata as well as table data. The event metadata includes the event timestamp, the GTID, the event sequence, and the event type (insert, update, delete).
Download and install the MaxScale CDC Connector API.
Download and install the MariaDB ColumnStore Bulk Write SDK (see columnstore-bulk-write-sdk.md).
sudo yum -y install epel-release
sudo yum -y install <data adapter>.rpm

sudo apt-get update
sudo dpkg -i <data adapter>.deb
sudo apt-get -f install

sudo echo "deb http://httpredir.debian.org/debian jessie-backports main contrib non-free" >> /etc/apt/sources.list
sudo apt-get update
sudo dpkg -i <data adapter>.deb
sudo apt-get -f install

Usage: mxs_adapter [OPTION]... DATABASE TABLE
-f FILE TSV file with database and table names to stream (must be in `database TAB table NEWLINE` format)
-h HOST MaxScale host (default: 127.0.0.1)
-P PORT Port number where the CDC service listens (default: 4001)
-u USER Username for the MaxScale CDC service (default: admin)
-p PASSWORD Password of the user (default: mariadb)
-c CONFIG Path to the Columnstore.xml file (default: '/usr/local/mariadb/columnstore/etc/Columnstore.xml')
-a Automatically create tables on ColumnStore
-z Transform CDC data stream from historical data to current data (implies -n)
-s Directory used to store the state files (default: '/var/lib/mxs_adapter')
-r ROWS Number of events to group for one bulk load (default: 1)
-t TIME Connection timeout (default: 10)
-n Disable metadata generation (timestamp, GTID, event type)
-i TIME Flush data every TIME seconds (default: 5)
-l FILE Log output to FILE instead of stdout
-v Print version and exit
-d Enable verbose debug outputTo stream multiple tables, use the -f parameter to define a path to a TSV formatted file. The file must have one database and one table name per line. The database and table must be separated by a TAB character and the line must be terminated in a newline (\n).
Here is an example file with two tables, t1 and t2 both in the test database:
test t1
test t2

You can have the adapter automatically create the tables on the ColumnStore instance with the -an option. In this case, the user used for cross-engine queries will be used to create the table (the values in Columnstore.CrossEngineSupport). This user requires CREATE privileges on all streamed databases and tables.
The -z option enables the data transformation mode. In this mode, the data is converted from historical, append-only data to the current version of the data. In practice, this replicates changes from a MariaDB master server to ColumnStore via the MaxScale CDC.
Download and install both MaxScale and ColumnStore.
Copy the Columnstore.xml file from /usr/local/mariadb/columnstore/etc/Columnstore.xml from one of the ColumnStore PrimProc nodes to the server where the adapter is installed.
Configure MaxScale according to the MaxScale CDC documentation.
Create a CDC user by executing the following MaxAdmin command on the MaxScale server. Replace the <service> with the name of the avrorouter service and <user> and <password> with the credentials that are to be created.
maxadmin call command cdc add_user <service> <user> <password>

Then we can start the adapter by executing the following command.
mxs_adapter -u <user> -p <password> -h <host> -P <port> -c <path to Columnstore.xml> <database> <table>

The <database> and <table> arguments define the table that is streamed to ColumnStore. This table should exist on the master server where MaxScale is reading events from. If the table is not created on ColumnStore, the adapter will print instructions on how to define it in the correct way.
The <user> and <password> are the credentials created for the CDC user; <host> is the MaxScale address and <port> is the port where the CDC service listener is listening.
The -c flag is optional if you are running the adapter on the server where ColumnStore is located.
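For example, a hedged invocation that streams the test.t1 table might look like this (the host and credentials are illustrative placeholders):

# Stream test.t1 from the MaxScale CDC service into ColumnStore
mxs_adapter -u cdc_user -p cdc_password -h 192.0.2.10 -P 4001 \
    -c /usr/local/mariadb/columnstore/etc/Columnstore.xml test t1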
The Kafka data adapter streams all messages published to Apache Kafka topics in Avro format to MariaDB ColumnStore automatically and continuously, enabling data from many sources to be streamed and collected for analysis without complex code. The Kafka adapter is built using librdkafka and the MariaDB ColumnStore bulk write SDK.
A tutorial for the Kafka adapter for ingesting Avro formatted data can be found in the kafka-to-columnstore-data-adapter document.
Starting with MariaDB ColumnStore 1.1.4, a data adapter for Pentaho Data Integration (PDI) / Kettle is available to import data directly into ColumnStore’s WriteEngine. It is built on MariaDB’s rapid-paced Bulk Write SDK.
The plugin was designed for the following software composition:
Operating system: Windows 10 / Ubuntu 16.04 / RHEL/CentOS 7+
MariaDB ColumnStore >= 1.1.4
MariaDB Java Database client* >= 2.2.1
Java >= 8
Pentaho Data Integration >= 7 (the plugin is not officially supported by Pentaho)
*Only needed if you want to execute DDL.
The following steps are necessary to install the ColumnStore Data adapter (bulk loader plugin):
Extract the archive mariadb-columnstore-kettle-bulk-exporter-plugin-*.zip into your PDI installation directory $PDI-INSTALLATION/plugins.
Copy MariaDB's JDBC Client mariadb-java-client-2.2.x.jar into PDI's lib directory $PDI-INSTALLATION/lib.
Install the additional library dependencies.

Ubuntu/Debian:

sudo apt-get install libuv1 libxml2 libsnappy1v5

CentOS/RHEL:

sudo yum install epel-release
sudo yum install libuv libxml2 snappy

On Windows, installation of the Visual Studio 2015/2017 C++ Redistributable (x64) is required.
Each MariaDB ColumnStore Bulk Loader block needs to be configured. On the one hand, it needs to know how to connect to the underlying Bulk Write SDK to inject data into ColumnStore, and on the other hand, it needs to have a proper JDBC connection to execute DDL.
Both configurations can be set in each block’s settings tab.
The database connection configuration follows PDI’s default schema.
By default, the plugin tries to use ColumnStore's default configuration /usr/local/mariadb/columnstore/etc/Columnstore.xml to connect to the ColumnStore instance through the Bulk Write SDK. In addition, individual paths or variables can be used too.
Information on how to prepare the Columnstore.xml configuration file can be found here.
Once a block is configured and all inputs are connected in PDI, the inputs have to be mapped to ColumnStore’s table format.
One can either choose “Map all inputs”, which sets target columns of adequate type, or choose a custom mapping based on the structure of the existing table.
The SQL button can be used to generate DDL based on the defined mapping and to execute it.
This plugin is a beta release. It can't handle BLOB data types, and it supports multiple inputs to one block only if the input field names are identical for all input sources.
Step 5: Test MariaDB Enterprise Server
This page details step 5 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".
This step tests MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
MariaDB Enterprise ColumnStore 23.10 includes a testS3Connection command to test the S3 configuration, permissions, and connectivity.
This action is performed on each Enterprise ColumnStore node.
Test the S3 configuration by executing the following:
If the testS3Connection command does not return OK, investigate the S3 configuration.
Use Systemd to test whether the MariaDB Enterprise Server service is running.
This action is performed on each Enterprise ColumnStore node.
Check if the MariaDB Enterprise Server service is running by executing the following:
If the service is not running on any node, start the service by executing the following on that node:
Use MariaDB Client to test the local connection to the Enterprise Server node.
This action is performed on each Enterprise ColumnStore node:
The sudo command is used here to connect to the Enterprise Server node using the root@localhost user account, which authenticates using the unix_socket authentication plugin. Other user accounts can be used by specifying the --user and --password command-line options.
Query the information_schema.PLUGINS table to confirm that the ColumnStore storage engine is loaded.
This action is performed on each Enterprise ColumnStore node.
Execute the following query:
The PLUGIN_STATUS column for each ColumnStore-related plugin should contain ACTIVE.
Use Systemd to test whether the CMAPI service is running.
This action is performed on each Enterprise ColumnStore node.
Check if the CMAPI service is running by executing the following:
If the service is not running on any node, start the service by executing the following on that node:
Use CMAPI to request the ColumnStore status. The API key needs to be provided in the X-API-key HTTP header.
This action is performed with the CMAPI service on the primary server.
Check the ColumnStore status using curl by executing the following:
Use MariaDB Client to test DDL.
On the primary server, use the MariaDB Client to connect to the node:
Create a test database and ColumnStore table:
On each replica server, use the MariaDB Client to connect to the node:
Confirm that the database and table exist:
If the database or table do not exist on any node, then check the replication configuration.
Use MariaDB Client to test DML.
On the primary server, use the MariaDB Client to connect to the node:
Insert sample data into the table created in the DDL test:
On each replica server, use the MariaDB Client to connect to the node:
Execute a query to retrieve the data:
If the data is not returned on any node, check the ColumnStore status and the storage configuration.
Navigation in the procedure "Deploy ColumnStore Object Storage Topology".
This page was step 5 of 9.
This guide provides steps for deploying a multi-node S3 ColumnStore, setting up the environment, installing the software, and bulk importing data for online analytical processing (OLAP) workloads.
This procedure describes the deployment of the Single-Node Enterprise ColumnStore topology with Object storage.
MariaDB Enterprise ColumnStore 23.10 is a columnar storage engine for MariaDB Enterprise Server. Enterprise ColumnStore is best suited for Online Analytical Processing (OLAP) workloads.
This procedure has 5 steps, which are executed in sequence.
This page provides an overview of the topology, requirements, and deployment procedures.
Please read and understand this procedure before executing.
Customers can obtain support by contacting MariaDB Support.
The following components are deployed during this procedure:
The Single-Node Enterprise ColumnStore topology provides support for Online Analytical Processing (OLAP) workloads to MariaDB Enterprise Server.
The Enterprise ColumnStore node:
Receives queries from the application
Executes queries
Uses S3-compatible object storage for data
Single-Node Enterprise ColumnStore does not provide high availability (HA) for Online Analytical Processing (OLAP). If you would like to deploy Enterprise ColumnStore with high availability, see Enterprise ColumnStore with Object storage.
These requirements are for the Single-Node Enterprise ColumnStore, when deployed with MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.
Debian 11 (x86_64, ARM64)
Debian 12 (x86_64, ARM64)
Red Hat Enterprise Linux 8 (x86_64, ARM64)
Red Hat Enterprise Linux 9 (x86_64, PPC64LE, ARM64)
Red Hat UBI 8 (x86_64, ARM64)
Rocky Linux 8 (x86_64, ARM64)
Rocky Linux 9 (x86_64, ARM64)
Ubuntu 20.04 LTS (x86_64, ARM64)
Ubuntu 22.04 LTS (x86_64, ARM64)
Ubuntu 24.04 LTS (x86_64, ARM64)
MariaDB Enterprise ColumnStore's minimum hardware requirements are not intended for production environments, but they can be appropriate for development and test environments. For production environments, see the recommended hardware requirements instead.
The minimum hardware requirements are:
MariaDB Enterprise ColumnStore will refuse to start if the system has less than 3 GB of memory.
If Enterprise ColumnStore is started on a system with less memory, the following error message will be written to the ColumnStore system log called crit.log:
And the following error message will be raised to the client:
MariaDB Enterprise ColumnStore's recommended hardware requirements are intended for production analytics.
The recommended hardware requirements are:
Single-node Enterprise ColumnStore with Object Storage requires the following storage type:
Single-node Enterprise ColumnStore with Object Storage uses S3-compatible object storage to store data.
Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.
For the preferred S3-compatible object storage providers that provide cloud and hardware solutions, see the following sections:
The use of non-cloud and non-hardware providers is at your own risk.
If you have any questions about using specific S3-compatible object storage with MariaDB Enterprise ColumnStore, contact MariaDB Support.
Amazon Web Services (AWS) S3
Google Cloud Storage
Azure Storage
Alibaba Cloud Object Storage Service
Cloudian HyperStore
Dell EMC
Seagate Lyve Rack
Quantum ActiveScale
IBM Cloud Object Storage
MariaDB Enterprise Server Configuration Management
MariaDB Enterprise Server packages are configured to read configuration files from different paths, depending on the operating system. Making custom changes to Enterprise Server default configuration files is not recommended because custom changes may be overwritten by other default configuration files that are loaded later.
To ensure that your custom changes will be read last, create a custom configuration file with the z- prefix in one of the include directories.
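As a hedged illustration, such a custom file might look like the following (the settings shown are illustrative, not required):

[mariadb]
# Settings in a z- prefixed file are read last, so they win over defaults
log_error = mariadbd.err
max_connections = 500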
The systemctl command is used to start and stop the MariaDB Enterprise Server service.
Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:
$ sudo testS3Connection

StorageManager[26887]: Using the config file found at /etc/columnstore/storagemanager.cnf
StorageManager[26887]: S3Storage: S3 connectivity & permissions are OK
S3 Storage Manager Configuration OK

$ systemctl status mariadb

$ sudo systemctl start mariadb

$ sudo mariadb
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 38
Server version: 11.4.5-3-MariaDB-Enterprise MariaDB Enterprise Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]>

SELECT PLUGIN_NAME, PLUGIN_STATUS
FROM information_schema.PLUGINS
WHERE PLUGIN_LIBRARY LIKE 'ha_columnstore%';
+---------------------+---------------+
| PLUGIN_NAME | PLUGIN_STATUS |
+---------------------+---------------+
| Columnstore | ACTIVE |
| COLUMNSTORE_COLUMNS | ACTIVE |
| COLUMNSTORE_TABLES | ACTIVE |
| COLUMNSTORE_FILES | ACTIVE |
| COLUMNSTORE_EXTENTS | ACTIVE |
+---------------------+---------------+

$ systemctl status mariadb-columnstore-cmapi

$ sudo systemctl start mariadb-columnstore-cmapi

$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
},
"192.0.2.2": {
"timestamp": "2020-12-15 00:40:34.428554",
"uptime": 11437,
"dbrm_mode": "slave",
"cluster_mode": "readonly",
"dbroots": [
"2"
],
"module_id": 2,
"services": [
{
"name": "workernode",
"pid": 17789
},
{
"name": "PrimProc",
"pid": 17813
},
{
"name": "ExeMgr",
"pid": 17854
},
{
"name": "WriteEngine",
"pid": 17877
}
]
},
"192.0.2.3": {
"timestamp": "2020-12-15 00:40:34.428554",
"uptime": 11437,
"dbrm_mode": "slave",
"cluster_mode": "readonly",
"dbroots": [
"2"
],
"module_id": 2,
"services": [
{
"name": "workernode",
"pid": 17789
},
{
"name": "PrimProc",
"pid": 17813
},
{
"name": "ExeMgr",
"pid": 17854
},
{
"name": "WriteEngine",
"pid": 17877
}
]
},
"num_nodes": 3
}

$ sudo mariadb

CREATE DATABASE IF NOT EXISTS test;
CREATE TABLE IF NOT EXISTS test.contacts (
first_name VARCHAR(50),
last_name VARCHAR(50),
email VARCHAR(100)
) ENGINE = ColumnStore;

$ sudo mariadb

SHOW CREATE TABLE test.contacts\G

$ sudo mariadb

INSERT INTO test.contacts (first_name, last_name, email)
VALUES
("Kai", "Devi", "kai.devi@example.com"),
("Lee", "Wang", "lee.wang@example.com");

$ sudo mariadb

SELECT * FROM test.contacts;
+------------+-----------+----------------------+
| first_name | last_name | email |
+------------+-----------+----------------------+
| Kai | Devi | kai.devi@example.com |
| Lee | Wang | lee.wang@example.com |
+------------+-----------+----------------------+

Step 1
Step 2
Step 3
Step 4
Step 5
MariaDB Enterprise Server: a modern SQL RDBMS with high availability, pluggable storage engines, hot online backups, and audit logging.

MariaDB Enterprise ColumnStore: a columnar storage engine optimized for Online Analytical Processing (OLAP) workloads.

S3-compatible object storage.
Minimum: Enterprise ColumnStore node with 4+ cores and 16+ GB memory.

ColumnStore system log (crit.log) error on under-provisioned systems:

Apr 30 21:54:35 a1ebc96a2519 PrimProc[1004]: 35.668435 |0|0|0| C 28 CAL0000: Error total memory available is less than 3GB.

Client error on under-provisioned systems:

ERROR 1815 (HY000): Internal error: System is not ready yet. Please try again.

Recommended: Enterprise ColumnStore node with 64+ cores and 128+ GB memory.
Configuration File: configuration files (such as /etc/my.cnf) can be used to set system variables and options. The server must be restarted to apply changes made to configuration files.

Command-line: the server can be started with command-line options that set system variables and options.

SQL: users can set system variables that support dynamic changes on-the-fly using the SET statement.
CentOS, Red Hat Enterprise Linux (RHEL): /etc/my.cnf.d/z-custom-mariadb.cnf

Debian, Ubuntu: /etc/mysql/mariadb.conf.d/z-custom-mariadb.cnf
Start: sudo systemctl start mariadb

Stop: sudo systemctl stop mariadb

Restart: sudo systemctl restart mariadb

Enable during startup: sudo systemctl enable mariadb

Disable during startup: sudo systemctl disable mariadb

Status: sudo systemctl status mariadb
MariaDB Enterprise ColumnStore is a smart storage engine designed to efficiently execute analytical queries using distributed query execution and massively parallel processing (MPP) techniques.
MariaDB Enterprise ColumnStore is designed to achieve vertical and horizontal scalability for production analytics using distributed query execution and massively parallel processing (MPP) techniques.
Enterprise ColumnStore evaluates each query as a sequence of job steps using sophisticated techniques to get the best performance for complex analytical queries. Some types of job steps are designed to scale with the system's resources. As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute those types of job steps.
Enterprise ColumnStore stores each column on disk in extents. The storage format is designed to maintain scalability, even as the table grows. If an operation does not read parts of a large table, I/O costs are reduced. Enterprise ColumnStore uses a technique called extent elimination that compares the maximum and minimum values in the extent map to the query's conditions, and it avoids scanning extents that don't satisfy the conditions.
Enterprise ColumnStore provides exceptional scalability for analytical queries. Enterprise ColumnStore's design supports targeted scale-out to address increased workload requirements, whether it is a larger query load or increased storage and query processing capacity.
MariaDB Enterprise ColumnStore provides horizontal scalability by executing some types of job steps in a distributed manner using multiple nodes.
When Enterprise ColumnStore is evaluating a job step, the ExeMgr process or facility on the initiator/aggregator node requests the PrimProc process on each node to perform the job step on different extents in parallel. As more nodes are added, Enterprise ColumnStore can perform more work in parallel.
Enterprise ColumnStore also uses massively parallel processing (MPP) techniques to speed up some types of job steps. For some types of aggregation operations, each node can perform an initial local aggregation, and then the initiator/aggregator node only needs to combine the local results and perform a final aggregation. This technique can be very efficient for some types of aggregation operations, such as for queries that use the AVG(), COUNT(), or SUM() aggregate functions.
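For example, in a query such as the following (the schema is hypothetical), each node can aggregate its own extents first, so only small per-node partial results need to be combined by the initiator/aggregator node:

-- Each node pre-aggregates its local extents; the initiator/aggregator
-- node merges the partial results into the final aggregation.
SELECT region,
       COUNT(*)    AS orders,
       SUM(amount) AS revenue,
       AVG(amount) AS avg_order
FROM sales.orders
GROUP BY region;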
MariaDB Enterprise ColumnStore provides vertical scalability by executing some types of job steps in a multi-threaded manner using a thread pool.
When the PrimProc process on a node receives work, it executes the job step on an extent in a multi-threaded manner using a thread pool. Each thread operates on a different block within the extent. As more CPUs are added, Enterprise ColumnStore can work on more blocks in parallel.
MariaDB Enterprise ColumnStore uses extent elimination to scale query evaluation as table size increases.
Most databases are row-based databases that use manually-created indexes to achieve high performance on large tables. This works well for transactional workloads. However, analytical queries tend to have very low selectivity, so traditional indexes are not typically effective for analytical queries.
Enterprise ColumnStore uses extent elimination to achieve high performance, without requiring manually created indexes. Enterprise ColumnStore automatically partitions all data into extents. Enterprise ColumnStore stores the minimum and maximum values for each extent in the extent map. Enterprise ColumnStore uses the minimum and maximum values in the extent map to perform extent elimination.
When Enterprise ColumnStore performs extent elimination, it compares the query's join conditions and filter conditions (i.e., WHERE clause) to the minimum and maximum values for each extent in the extent map. If the extent's minimum and maximum values fall outside the bounds of the query's conditions, Enterprise ColumnStore skips that extent for the query.
Extent elimination is automatically performed for every query. It can significantly decrease I/O for columns with clustered values. For example, extent elimination works effectively for series, ordered, patterned, and time-based data.
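As a hedged illustration with a hypothetical time-ordered table, a range filter allows ColumnStore to skip every extent whose minimum and maximum values fall outside the condition:

-- If rows were loaded in event_time order, each extent covers a narrow
-- time range, so extents outside January 2024 are never scanned.
SELECT COUNT(*)
FROM metrics.events
WHERE event_time >= '2024-01-01'
  AND event_time <  '2024-02-01';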
The ColumnStore storage engine plugin implements a custom select handler to fully take advantage of Enterprise ColumnStore's capabilities.
All storage engines interact with ES using an internal handler API, which is highly extensible. Storage engines can implement different features by implementing different methods within the handler API.
For select statements, the handler API transforms each query into a SELECT_LEX object, which is provided to the select handler.
The generic select handler is not optimal for Enterprise ColumnStore, because:
Enterprise ColumnStore selects data by column, but the generic select handler selects data by row
Enterprise ColumnStore supports parallel query evaluation, but the generic select handler does not
Enterprise ColumnStore supports distributed aggregations, but the generic select handler does not
Enterprise ColumnStore supports distributed functions, but the generic select handler does not
Enterprise ColumnStore supports extent elimination, but the generic select handler does not
Enterprise ColumnStore has its own query planner, but the generic select handler cannot use it
The ColumnStore storage engine plugin is known as a smart storage engine, because it implements a custom select handler. MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.
If a storage engine implements a custom select handler, it is known as a smart storage engine.
As a smart storage engine, the ColumnStore storage engine plugin tightly integrates Enterprise ColumnStore with ES, but it has enough independence to efficiently execute analytical queries using a completely unique approach.
The ColumnStore storage engine can use either the custom select handler or the generic select handler. The select handler can be configured using the columnstore_select_handler system variable:
AUTO
When set to AUTO, Enterprise ColumnStore automatically chooses the best select handler for a given SELECT query.
AUTO was added in Enterprise ColumnStore 6.
OFF
When set to OFF, Enterprise ColumnStore uses the generic select handler for all SELECT queries.
It is not recommended to use this value unless advised by MariaDB Support.
ON
When set to ON, Enterprise ColumnStore uses the custom select handler for all SELECT queries.
ON is the default in Enterprise ColumnStore 5 and Enterprise ColumnStore 6.
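As a sketch, the select handler can be inspected and changed at runtime with standard system-variable statements (session scope shown; the behavior of each value is as described above):

-- Check the current value
SHOW VARIABLES LIKE 'columnstore_select_handler';

-- Use the generic select handler for the current session only
SET SESSION columnstore_select_handler = OFF;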
MariaDB Enterprise ColumnStore performs join operations using hash joins.
By default, hash joins are performed in memory.
MariaDB Enterprise ColumnStore can be configured to allocate more memory for hash joins.
The relevant configuration options are:
HashJoin
PmMaxMemorySmallSide
Configures the amount of memory available for a single join.
Valid values are from 0 to 4 GB.
Default value is 1 GB.
HashJoin
TotalUmMemory
Configures the amount of memory available for all joins.
Values can be specified as a percentage of total system memory or as a specific amount of memory.
Valid percentage values are from 0 to 100%
Default value is 25%
For example, to configure Enterprise ColumnStore to use more memory for hash joins using the mcsSetConfig utility:
$ mcsSetConfig HashJoin PmMaxMemorySmallSide 2G
$ mcsSetConfig HashJoin TotalUmMemory '40%'

MariaDB Enterprise ColumnStore can be configured to perform disk-based joins.
The relevant configuration options are:
HashJoin
AllowDiskBasedJoin
Enables disk-based joins
Valid values are Y and N
Default value is N
HashJoin
TempFileCompression
Enables compression for temporary files used by disk-based joins
Valid values are Y and N
Default value is N
SystemConfig
SystemTempFileDir
Configures the directory used for temporary files used by disk-based joins and aggregations
Default value is /tmp/columnstore_tmp_files
For example, to configure Enterprise ColumnStore to perform disk-based joins using the mcsSetConfig utility:
mcsSetConfig HashJoin AllowDiskBasedJoin Y
mcsSetConfig HashJoin TempFileCompression Y
mcsSetConfig SystemConfig SystemTempFileDir /mariadb/tmp

MariaDB Enterprise ColumnStore performs aggregation operations on all nodes in a distributed manner, and then all nodes send their results to a single node, which combines the results and performs the final aggregation.
By default, aggregation operations are performed in memory.
In Enterprise ColumnStore 5.6.1 and later, disk-based aggregations can be configured.
The relevant configuration options are:
RowAggregation
AllowDiskBasedAggregation
Enables disk-based aggregations
Valid values are Y and N
Default value is N
RowAggregation
Compression
Enables compression for temporary files used by disk-based aggregations
Valid values are Y and N
Default value is N
SystemConfig
SystemTempFileDir
Configures the directory used for temporary files used by disk-based joins and aggregations
Default value is /tmp/columnstore_tmp_files
For example, to configure Enterprise ColumnStore to perform disk-based aggregations using the mcsSetConfig utility:
$ mcsSetConfig RowAggregation AllowDiskBasedAggregation Y
$ mcsSetConfig RowAggregation Compression SNAPPY
$ mcsSetConfig SystemConfig SystemTempFileDir /mariadb/tmp

The ColumnStore storage engine plugin is a smart storage engine, so MariaDB Enterprise ColumnStore is able to plan its own queries using the custom select handler.
MariaDB Enterprise ColumnStore's query planning is divided into two steps:
ES provides the query's SELECT_LEX object to the custom select handler. The custom select handler builds a ColumnStore Execution Plan (CSEP).
The custom select handler provides the CSEP to the ExeMgr process or facility on the same node. ExeMgr performs extent elimination and creates a job list.
The ColumnStore storage engine provides the CSEP to the ExeMgr process or facility on the same node, which will act as the initiator/aggregator node for the query.
Starting with MariaDB Enterprise ColumnStore 22.08, the ExeMgr facility has been integrated into the PrimProc process, so it is no longer a separate process.
ExeMgr performs multiple tasks:
Performs extent elimination.
Views the optimizer statistics.
Transforms the CSEP to a job list, which consists of job steps.
Assigns distributed job steps to the PrimProc process on each node.
Evaluates non-distributed job steps itself.
Provides final query results to ES.
When Enterprise ColumnStore executes a query, it goes through the following process:
The client or application sends the query to MariaDB MaxScale's listener port.
The query is processed by the Read/Write Split Router (readwritesplit) service associated with the listener.
The service routes the query to the ES TCP port on a ColumnStore node.
MariaDB Enterprise Server (ES) evaluates the query using the handler interface.
The handler interface builds a SELECT_LEX object to represent the query.
The handler interface provides the SELECT_LEX object to the ColumnStore storage engine's select handler.
The select handler transforms the SELECT_LEX object into a ColumnStore Execution Plan (CSEP).
The select handler provides the CSEP to the ExeMgr facility on the same node, which will act as the initiator/aggregator node for the query.
ExeMgr transforms the CSEP into a job list, which consists of job steps.
ExeMgr evaluates each job step sequentially.
If it is a non-distributed job step, ExeMgr evaluates the job step itself.
If it is a distributed job step, ExeMgr provides the job step to the PrimProc process on each node. The PrimProc process on each node evaluates the job step in a multi-threaded manner using a thread pool. After the PrimProc process on each node evaluates its job step, the results are returned to ExeMgr on the initiator/aggregator node as a Row Group.
After all job steps are evaluated, ExeMgr returns the results to ES.
ES returns the results to MaxScale.
MaxScale returns the results to the client or application.
These instructions detail the upgrade from MariaDB Enterprise ColumnStore 6 to MariaDB Enterprise ColumnStore 23.10 in a Multi-Node topology on a range of supported Operating Systems.
This action is performed for each replica server on the MaxScale node.
Prior to upgrading, the replica servers must be set to maintenance mode in MaxScale. The replicas can be set to maintenance mode using MaxScale's REST API. If you are using MaxCtrl, the replicas can be set to maintenance mode using the set server command:
maxctrl set server \
mcs2 \
maintenance

As the first argument, provide the name of the server
As the second argument, provide maintenance as the state
This action is performed on the MaxScale node.
Confirm that the replicas are set to maintenance mode in MaxScale using MaxScale's REST API. If you are using MaxCtrl, the state of the replicas can be viewed using the list servers command:
maxctrl list servers

┌────────┬───────────────┬──────┬─────────────┬──────────────────────┬────────┐
│ Server │ Address │ Port │ Connections │ State │ GTID │
├────────┼───────────────┼──────┼─────────────┼──────────────────────┼────────┤
│ mcs3 │ 192.0.2.3 │ 3306 │ 0 │ Maintenance, Running │ 0-1-17 │
├────────┼───────────────┼──────┼─────────────┼──────────────────────┼────────┤
│ mcs2 │ 192.0.2.2 │ 3306 │ 0 │ Maintenance, Running │ 0-1-17 │
├────────┼───────────────┼──────┼─────────────┼──────────────────────┼────────┤
│ mcs1 │ 192.0.2.1 │ 3306 │ 0 │ Master, Running │ 0-1-17 │
└────────┴───────────────┴──────┴─────────────┴──────────────────────┴────────┘

If the node is properly in maintenance mode, then the State column will show Maintenance as one of the states.
This action is performed on each replica server.
The gtid_strict_mode system variable must be disabled for this upgrade procedure. If it is enabled in any configuration files, disable it temporarily until the upgrade procedure is complete.
You can check if the gtid_strict_mode system variable is set in a configuration file by executing my_print_defaults command with the mysqld option:
my_print_defaults --mysqld \
| grep "gtid[-_]strict[-_]mode"

--gtid_strict_mode=1

If the gtid_strict_mode system variable is set, you can temporarily disable it by adding # in front of it in the configuration file, so that it will be treated as a comment and ignored:
[mariadb]
...
# temporarily commented out for upgrade
# gtid_strict_mode=1

Prior to upgrading, MariaDB Enterprise ColumnStore must be shut down:
mcs cluster stop

This action is performed on each ColumnStore node.
Prior to upgrading, several services must be stopped on each ColumnStore node:
Stop the CMAPI service:
sudo systemctl stop mariadb-columnstore-cmapi

Stop the MariaDB Enterprise ColumnStore service:
sudo systemctl stop mariadb-columnstore

Stop the MariaDB Enterprise Server service:
sudo systemctl stop mariadb

MariaDB Corporation provides package repositories for YUM (RHEL, CentOS, Rocky Linux) and APT (Debian, Ubuntu).
Retrieve your Customer Download Token at https://customers.mariadb.com/downloads/token/ and substitute for CUSTOMER_DOWNLOAD_TOKEN in the following directions.
Configure the YUM package repository.
Enterprise ColumnStore 23.10 is included with MariaDB Enterprise Server 11.4. Pass the version to install using the --mariadb-server-version flag to mariadb_es_repo_setup.
To configure YUM package repositories:
sudo yum install curl
curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
echo "${checksum} mariadb_es_repo_setup" | sha256sum -c -
chmod +x mariadb_es_repo_setup
sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
    --mariadb-server-version="11.4"

Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.
Update MariaDB Enterprise Server and package dependencies:
sudo yum update "MariaDB-*" "MariaDB-columnstore-engine" "MariaDB-columnstore-cmapi"

Retrieve your Customer Download Token at https://customers.mariadb.com/downloads/token/ and substitute for CUSTOMER_DOWNLOAD_TOKEN in the following directions.
Configure the APT package repository.
Enterprise ColumnStore 23.10 is included with MariaDB Enterprise Server 11.4. Pass the version to install using the --mariadb-server-version flag to mariadb_es_repo_setup.
To configure APT package repositories:
sudo apt install curl
curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
echo "${checksum} mariadb_es_repo_setup" | sha256sum -c -
chmod +x mariadb_es_repo_setup
sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
    --mariadb-server-version="11.4"
sudo apt update

Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.
Update MariaDB Enterprise Server and package dependencies.
The update command depends on the installed APT version, which can be determined by executing the following command:
apt --version

apt 2.0.9 (amd64)

For versions prior to APT 2.0, execute the following command:
sudo apt install --only-upgrade "mariadb*"

For APT 2.0 and later, execute the following command:
sudo apt install --only-upgrade '?upgradable ?name(mariadb.*)'

This action is performed on each ColumnStore node.
After upgrading, the MariaDB Enterprise ColumnStore service should be stopped, since it will be controlled by CMAPI:
sudo systemctl stop mariadb-columnstore
sudo systemctl disable mariadb-columnstore

CMAPI disables the Enterprise ColumnStore service in a multi-node deployment. The Enterprise ColumnStore service will be started as-needed by the CMAPI service, so it does not need to start automatically upon reboot.
This action is performed on each ColumnStore node.
After upgrading, the CMAPI service and the MariaDB Enterprise Server service must be started on each ColumnStore node:
Start the CMAPI service:
sudo systemctl start mariadb-columnstore-cmapi

Start the MariaDB Enterprise Server service:
sudo systemctl start mariadb

On the primary server, run mariadb-upgrade with binary logging enabled to upgrade the data directory and update the system tables:
mariadb-upgrade --write-binlog

After upgrading, MariaDB Enterprise ColumnStore must be started:
mcs cluster start

This action is performed on each replica server.
If you temporarily disabled the gtid_strict_mode system variable in any configuration files during the Disable GTID Strict Mode step, re-enable it now.
This action is performed on each ColumnStore node.
After upgrading, it is recommended to confirm the Enterprise ColumnStore version on each ColumnStore node. Connect to the node using MariaDB Client and query the Columnstore_version status variable with SHOW GLOBAL STATUS:
SHOW GLOBAL STATUS LIKE 'Columnstore_version';

+---------------------+---------+
| Variable_name | Value |
+---------------------+---------+
| Columnstore_version | 23.10.0 |
+---------------------+---------+

This action is performed on each ColumnStore node.
After upgrading, it is recommended to confirm the ES version on each ColumnStore node. Connect to the node using MariaDB Client and query the version system variable with SHOW GLOBAL VARIABLES:
SHOW GLOBAL VARIABLES LIKE 'version';

+---------------+----------------------------------+
| Variable_name | Value |
+---------------+----------------------------------+
| version | 10.6.9-5-MariaDB-enterprise-log |
+---------------+----------------------------------+

This action is performed for each replica server on the MaxScale node.
After the upgrade, maintenance mode for each replica must be cleared in MaxScale using MaxScale's REST API. If you are using MaxCtrl, maintenance mode can be cleared using the clear server command:
maxctrl clear server \
mcs2 \
maintenance

As the first argument, provide the name of the server
As the second argument, provide maintenance as the state
This action is performed for each replica server on the MaxScale node.
Confirm that maintenance mode in MaxScale has been cleared for each replica using MaxScale's REST API. If you are using MaxCtrl, the state of the replicas can be viewed using the list servers command:
maxctrl list servers

┌────────┬───────────────┬──────┬─────────────┬─────────────────┬─────────┐
│ Server │ Address │ Port │ Connections │ State │ GTID │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼─────────┤
│ mcs3 │ 192.0.2.3 │ 3306 │ 0 │ Slave, Running │ 0-3-159 │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼─────────┤
│ mcs2 │ 192.0.2.2 │ 3306 │ 0 │ Slave, Running │ 0-1-88 │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼─────────┤
│ mcs1 │ 192.0.2.1 │ 3306 │ 0 │ Master, Running │ 0-1-88 │
└────────┴───────────────┴──────┴─────────────┴─────────────────┴─────────┘

If the node is no longer in maintenance mode, then the State column will no longer show Maintenance as one of the states.
R is a language and environment for statistical computing and graphics.
R provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, …), graphical techniques, machine learning packages and is highly extensible.
One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.
R is an integrated suite of software facilities for data manipulation, calculation, and graphical display.
It includes:
• an effective data handling and storage facility,
• a suite of operators for calculations on arrays, in particular matrices,
• a large, coherent, integrated collection of intermediate tools for data analysis,
• graphical facilities for data analysis and display either on-screen or on hardcopy, and
• a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
Some basic notions / tips on how to use R along with MariaDB are the following:
A. The recommended R distribution is “Base R”: CRAN
B. The recommended R GUIs are RStudio Desktop, or RStudio Server: RStudio
Alternative GUIs would be:
RCode (PGM Solutions): RCode.
“R” and “MariaDB Server” can be installed either in the same server, or in different servers, as an ODBC communication protocol will be used for the exchange of data between the two environments.
For the transfer of data between MariaDB Server and the R environment, R's "odbc" package is recommended: CRAN odbc
“odbc" is a new R package available on CRAN (Since 2017-02-05), and maintained by RStudio, which is designed to comply with the DBI specification.
Tutorials on how to use R's "odbc" package can be found here:
Setting up ODBC Drivers: DB RStudio Drivers
"odbc" R Package: DB RStudio odbc Usage
The "odbc" package requires the MariaDB or MySQL ODBC connector to be installed first.
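Once the "odbc" package and an ODBC connector are installed (installation steps follow below), a minimal connection might look like the following sketch; the driver name, host, and credentials are illustrative assumptions and must match your odbcinst.ini configuration:

library(DBI)

# Connect through the MariaDB ODBC connector (driver name is an assumption)
con <- DBI::dbConnect(
  odbc::odbc(),
  Driver   = "MariaDB ODBC 3.1 Driver",
  Server   = "127.0.0.1",
  Port     = 3306,
  Database = "test",
  UID      = "db_user",
  PWD      = "db_password"
)

# Run a simple query and fetch the result as a data.frame
df <- DBI::dbGetQuery(con, "SELECT 1 AS ok")

DBI::dbDisconnect(con)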
For installing the "odbc" package from CRAN, execute in R:
install.packages("odbc")

The "RMariaDB" R library is a modern MariaDB client based on Rcpp.
For installing RMariaDB package through CRAN, execute the following R statement:
install.packages("RMariaDB")And for connecting to MariaDB:
library(RMariaDB)
con <- dbConnect(
drv = RMariaDB::MariaDB(),
username = NULL,
password = NULL,
host = NULL,
port = 3306
)

There are other alternatives for data transfer between R and MariaDB:
“readr” R package, for writing / reading CSV files. To be used in MariaDB along with “LOAD DATA INFILE”.
"RODBC" R package: Robust and well-tested (Since 2000-05-24) package which enables data transfer between R and MariaDB by means of an ODBC connector: CRAN RODBC
It is slightly slower than RStudio's new "odbc" package (See benchmarks): RStudio odbc
For bug report to the RODBC package maintainer, use the following R statement: bug.report(package = "RODBC")
A vignette on how to use the RODBC package can be found here: RODBC CRAN Vignette
Recommended resources for learning how to program in R are the following:
A recommended book for understanding the underlying statistics in the R packages is:
Rstudio Cheatsheets are a recommended and valuable resource: RStudio Cheatsheets: Webpage
Along with the following Base R reference card: R Reference Card v2
Information on new R packages is regularly published in the following websites:
H2O.AI
The R programming language has support for the H2O.ai library (h2o), which enables creating in-memory, multi-cluster, GPU-powered machine learning models.
For installing H2O.ai through CRAN, execute:
install.packages("h2o")

The following R statements can be used for importing a MariaDB table to H2O.ai using the R front end:
import_sql_table: "This function imports a SQL table to H2OFrame in memory".
import_sql_select: "This function imports the SQL table that is the result of the specified SQL query to H2OFrame in memory".
connection_url <- "jdbc:mariadb://172.16.2.178:3306/ingestSQL?&useSSL=false"
username <- "root"
password <- "abc123"
# Whole Table:
table <- "citibike20k"
my_citibike_data <- h2o.import_sql_table(connection_url, table, username, password)
# SELECT Query:
select_query <- "SELECT bikeid FROM citibike20k"
my_citibike_data <- h2o.import_sql_select(connection_url, select_query, username, password)

NOTE: Be sure to start the h2o.jar in the terminal with your downloaded JDBC driver in the classpath:
java -cp <path_to_h2o_jar>:<path_to_jdbc_driver_jar> water.H2OApp

KERAS
R package keras offers an interface to Python's 'Keras', a high-level neural networks 'API'.
'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.
R LIBRARIES: CARET
A book which introduces core Machine Learning concepts:
Documentation on how to perform Text Mining in R can be found in the book "Text Mining With R":
SHINY WEB APPS
Shiny R Package makes it incredibly easy to build interactive web applications with R.
Automatic "reactive" binding between inputs and outputs and extensive prebuilt widgets make it possible to build beautiful, responsive, and powerful applications with minimal effort.
To deploy Shiny web applications using open-source alternatives, you can either use:
RMARKDOWN DOCUMENTS
Some of the most advanced R resources for fully understanding the internals and nuances of the R Programming Language are the following:
This guide explains how to upgrade MariaDB Enterprise Server (ES) and MariaDB Enterprise ColumnStore across all nodes in a cluster using the unified mcs command-line tool, which you only need to run once.
The mcs install_es command:
Validates your MariaDB Enterprise Repository access using an ES API token.
Stops ColumnStore and MariaDB services in a controlled sequence.
Installs/configures the ES repository for the target version.
Creates a pre‑upgrade backup of ColumnStore DBRM and config files on each node.
Upgrades MariaDB Enterprise Server, ColumnStore, and CMAPI.
Waits for CMAPI to come back online on each node and, for upgrades, automatically restarts services.
Administrative privileges on all cluster nodes (package installation and service management required).
A valid ES API token with access to the MariaDB Enterprise Repository.
Network access from the nodes to the MariaDB Enterprise Repository endpoints.
A maintenance window: the upgrade will stop ColumnStore and MariaDB services.
Recent backups:
At a minimum, ensure Extent Map and configuration backups exist.
Recommended: take a full backup with the mcs backup command.
Related docs:
General backup and restore guidance:
Always back up your data before upgrading. While the tool performs a pre‑upgrade backup of DBRM and configs, it is not a substitute for a full database backup.
The command can target a specific ES version, or use the latest tested version (currently the latest 10.6 version).
Install latest tested version (if you omit the --version option, mcs uses the latest version):
Install a specific version:
Proceed even if nodes report different installed package versions (use the majority version as baseline):
Options summary:
--token TEXT: ES API Token to use for the upgrade (required).
-v, --version TEXT: ES version to install; if omitted or set to latest, upgrades to the latest tested version.
For a different version, specify something like --version 10.6.23-19 or --version 11.4.8-5 .
--ignore-mismatch: Continue even if cluster nodes report different package versions; uses majority versions as the baseline.
Stop or pause write workloads and heavy ingestion (e.g., cpimport, large INSERT/LOAD DATA jobs).
Drain or put traffic managers/proxies (for example, MaxScale) into maintenance/drain mode.
Ensure you have administrative/SSH and package manager access on all nodes.
Verify time synchronization across all nodes (NTP/Chrony) to avoid coordination issues.
Confirm recent backups are complete and restorable.
Validate token and target version.
If --version=latest, the tool resolves the latest tested ES version.
If a specific version is requested, it is validated against the repository. Some versions may exist only for specific operating systems.
Stop services.
Gracefully stops the ColumnStore cluster.
Stops the MariaDB server.
Configure repository.
Installs/configures the MariaDB Enterprise Server repository for the chosen version on each node automatically.
Validates the installed repository on each node separately.
Pre‑upgrade backups (per node).
Creates a backup of DBRM and key configuration files, named preupgrade_dbrm_backup, in the default backup directory.
Upgrade packages (per node).
Upgrades MariaDB Enterprise Server and ColumnStore packages.
Upgrades CMAPI and waits for it to become ready again on each node (up to 5 minutes).
Service handling after upgrade.
On upgrades: automatically restarts MariaDB and the ColumnStore cluster.
On downgrades: automatic restarts are intentionally skipped; manual steps are required.
Run mcs cluster status to verify all services are up and the cluster is healthy. In case of a failure:
Verify CMAPI readiness on all nodes (for example, via mcs or an external monitoring tool).
Run a quick smoke test:
Create a small ColumnStore table, insert a few rows, and run a SELECT query (see the sketch after this list).
Check for errors in server/ColumnStore logs.
Review /var/tmp/mcs_cli_install_es.log for the full sequence, and ensure no errors were reported.
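A minimal smoke test might look like the following sketch (the database and table names are illustrative):

CREATE DATABASE IF NOT EXISTS smoke;
CREATE TABLE smoke.t1 (id INT, note VARCHAR(32)) ENGINE=ColumnStore;
INSERT INTO smoke.t1 VALUES (1, 'alpha'), (2, 'beta');
SELECT * FROM smoke.t1;
DROP DATABASE smoke;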
Downgrades are supported up to MariaDB 10.6.9-5 and ColumnStore 22.08.4.
When downgrading, the tool doesn't automatically restart services. Complete these steps manually:
Start MariaDB on each node (for example, via your service manager).
Start the ColumnStore cluster (for example, using the mcs cluster start command).
Verify cluster health before resuming traffic.
Downgrades can cause data loss or cluster inconsistency if not planned and validated. Always test and ensure backups are restorable.
After a successful upgrade, or after downgrading and a manual restart:
Validate that CMAPI is ready on all nodes:
mcs cmapi is-ready
Check ColumnStore and MariaDB services are running and the cluster is healthy:
mcs cluster status
The mcs install_es command writes a detailed run log to:
/var/tmp/mcs_cli_install_es.log
If CMAPI readiness times out or services do not start cleanly, review:
CMAPI logs: /var/log/mariadb/columnstore/cmapi_server.log
Service logs on each node: /var/log/mariadb/columnstore/
The install_es log file (/var/tmp/mcs_cli_install_es.log) for the full sequence and any errors
Mixed package versions across nodes.
If nodes report different installed versions of Server/ColumnStore/CMAPI, the command fails with a mismatch message.
You can force continuation with --ignore-mismatch; the tool uses the majority version per package as the baseline, but this carries risk—align versions whenever possible.
CMAPI readiness timeout
After upgrading CMAPI, the command waits up to 300 seconds per node for readiness.
On slow nodes or constrained environments, this timeout may be insufficient, and the command exits with a failure; verify services manually and adjust operational expectations.
Downgrade restarts are skipped by design.
After a downgrade, automatic restarts are not performed; you must start MariaDB and the ColumnStore cluster manually and validate health.
ColumnStore skips automatic restarts because it cannot guarantee that all the expected API endpoints exist or are backward-compatible.
MaxScale maintenance handling not automated.
Transitioning MaxScale to maintenance/normal mode during upgrades is not automated at this time; manage traffic routing and maintenance state manually if applicable.
Repository access and version validation.
Invalid tokens, network restrictions, or unsupported version strings can result in validation errors (for example, HTTP 422). Ensure the token has the correct entitlements and the requested version exists for your platform.
Single‑node detection.
If no active nodes are detected, the tool falls back to localhost only; ensure this matches your topology.
Downgrading to 22.08.4 (10.6.9-5) technically works, but finishes with known issues:

An ERROR may be reported while waiting for CMAPI to become ready. In fact, CMAPI starts and works fine (check mcs status and systemctl status mariadb-columnstore-cmapi on each node).

If you try to run a mariadb command, you may get an error due to an unknown configuration flag. The tool preserves the current configuration files while installing packages, and an older MariaDB version naturally does not support a newer flag. To fix it, remove the flag from the configuration file, or restore the configuration from the last installed package.

The tool currently supports a limited set of packages.

Only removal and installation of the MariaDB-server (and dependencies), MariaDB-columnstore-engine (MariaDB-plugin-columnstore), and MariaDB-columnstore-cmapi packages are supported. Packages like MariaDB-backup are currently not supported and should be upgraded or downgraded manually.
Re‑run with -v/--verbose to enable console debug logging.
Inspect /var/tmp/mcs_cli_install_es.log for the complete sequence and API responses.
If package repository installation fails, verify token validity and outbound access from all nodes.
If CMAPI does not become ready, check service logs on each node.
For mismatched node versions, align package versions before re‑running, or proceed with --ignore-mismatch , but only after assessing the risk.
Cluster state: ColumnStore cluster should be healthy before starting.
Node access: All nodes must be reachable (SSH/admin access) and responsive.
Disk space: Ensure sufficient free space for package downloads and pre-upgrade backups.
Internet access: Nodes must reach MariaDB Enterprise repositories (per your operating system).
CMAPI communication: Port 8640 (default) must be reachable between nodes.
Time sync: Keep NTP/Chrony synchronized across nodes.
Downgrades can be destructive.
This prompts for confirmation. After downgrade, services are not restarted automatically; start MariaDB and the ColumnStore cluster manually and verify health.
If the upgrade fails or CMAPI does not become ready on all nodes:
Review the detailed log at /var/tmp/mcs_cli_install_es.log for errors.
Check service status on each node:
systemctl status mariadb
systemctl status mariadb-columnstore-cmapi
Verify network/ports (CMAPI 8640) and repository reachability.
Manually start services if safe to do so:
systemctl start mariadb
mcs start (or mcs cluster start)
If corruption is suspected, follow your backup recovery plan (for example, restore from a recent backup and/or extent map backup).
Prior to upgrading:
Create a full backup and verify restore procedures.
Test the process in staging with similar topology/data.
Document current package versions and configs.
Schedule a maintenance window and inform stakeholders.
During upgrading:
Monitor the console output and /var/tmp/mcs_cli_install_es.log .
Avoid interrupting the process; ensure network stability.
After upgrading:
Validate services and cluster health (mcs cluster status).
Run basic data integrity and application smoke tests.
Monitor performance and logs for regressions.
Contact MariaDB Support if you encounter unexpected failures, data issues, or performance regressions. Provide:
The complete log file: /var/tmp/mcs_cli_install_es.log .
The mcs review logs: mcs review --logs .
The exact command used (with parameters, masking sensitive values).
Cluster topology (nodes, versions, operating system, network).
Source and target versions (Server, ColumnStore, CMAPI).
Exact error messages and timestamps.
Command reference: mcs install_es in the command-line tool help and tool README.
Backups: mcs backup and Extent Map backup guidance.
Cluster management: mcs cluster start|stop|status .
MariaDB Enterprise ColumnStore includes a bulk data loading tool called cpimport, which bypasses the SQL layer to decrease the overhead of bulk data loading.
Refer to the cpimport reference documentation for additional information.
The cpimport tool:
Bypasses the SQL layer to decrease overhead;
Does not block read queries;
Requires a write metadata lock on the table, which can be monitored with the METADATA_LOCK_INFO plugin;
Appends the new data to the table. While the bulk load is in progress, the newly appended data is temporarily hidden from queries. After the bulk load is complete, the newly appended data is visible to queries;
Inserts each row in the order the rows are read from the source file. Users can optimize data loads for Enterprise ColumnStore's automatic partitioning by loading presorted data files;
Supports parallel distributed bulk loads;
Imports data from text files;
Imports data from binary files;
Imports data from standard input (stdin).
You can load data using the cpimport tool in the following cases:
You are loading data into a ColumnStore table from a text file stored on the primary node's file system.
You are loading data into a ColumnStore table from a binary file stored on the primary node's file system.
You are loading data into a ColumnStore table from the output of a command running on the primary node.
MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.
When a bulk data load is running:
Read queries will not be blocked.
Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.
The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.
Before data can be imported into the tables, the schema must be created.
Connect to the primary server using MariaDB Client:
After the command is executed, it prompts for a password.
For each imported database, create the database with the CREATE DATABASE statement:
For each imported table, create the table with the CREATE TABLE statement:
When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read. Appending data reduces the I/O requirements of bulk data loads, so that larger data sets can be loaded very efficiently.
While the bulk load is in progress, the newly appended data is temporarily hidden from queries.
After the bulk load is complete, the newly appended data is visible to queries.
When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read.
The order of data can have a significant effect on performance with Enterprise ColumnStore, so it can be helpful to sort the data in the input file prior to importing it.
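For example, assuming a pipe-delimited input file whose first field is the column you want the data clustered on, the file could be presorted with the sort utility before importing (the file name and sort key are illustrative):

# Presort on the first pipe-delimited field, then import the sorted file
$ sort --field-separator='|' --key=1,1 inventory-products.txt > inventory-products.sorted.txt
$ sudo cpimport inventory products inventory-products.sorted.txt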
Before importing a file into MariaDB Enterprise ColumnStore, confirm that the field delimiter is not present in the data.
The default field delimiter for the cpimport tool is a pipe (|).
To use a different delimiter, you can set the field delimiter.
The cpimport tool can import data from a text file if a file is provided as an argument after the database and table name.
For example, to import the file inventory-products.txt into the products table in the inventory database:
The cpimport tool can import data from a binary file if the -I1 or -I2 option is provided and a file is provided as an argument after the database and table name.
For example, to import the file inventory-products.bin into the products table in the inventory database:
The -I1 and -I2 options allow two different binary import modes to be selected:
The binary file should use the following format for data:
In binary input files, the cpimport tool expects columns to be in the following format:
The cpimport tool can import data from standard input (stdin) if no file is provided as an argument.
Importing from standard input is useful in many scenarios.
One scenario is when you want to import data from a remote database. You can use MariaDB Client to query the table with a SELECT statement, and then pipe the results into the standard input of the cpimport tool:
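A hedged sketch of such a pipeline (the remote host, credentials, and table are illustrative; the client's tab-separated output is matched with the -s option):

# Stream a remote table into ColumnStore without an intermediate file
$ mariadb --quick --skip-column-names \
    --host 192.0.2.50 --user db_user --password \
    --execute "SELECT * FROM inventory.products" \
    | cpimport -s '\t' inventory products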
The cpimport tool can import data from a file stored in a remote S3 bucket.
You can use the AWS CLI to copy the file from S3, and then pipe the contents into the standard input of the cpimport tool:
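A hedged sketch (the bucket and file names are illustrative):

# Stream the object from S3 to stdout ("-") and pipe it into cpimport
$ aws s3 cp --quiet s3://my-bucket/inventory-products.csv - \
    | cpimport -s ',' inventory products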
Alternatively, the columnstore_info.load_from_s3 stored procedure can import data from S3-compatible cloud object storage.
The default field delimiter for the cpimport tool is a pipe sign (|).
If your data file uses a different field delimiter, you can specify the field delimiter with the -s option.
For a TSV (tab-separated values) file:
For a CSV (comma-separated values) file:
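The following commands are illustrative sketches (the file names are assumptions):

# TSV (tab-separated values):
$ sudo cpimport -s '\t' inventory products inventory-products.tsv

# CSV (comma-separated values):
$ sudo cpimport -s ',' inventory products inventory-products.csv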
By default, the cpimport tool does not expect fields to be quoted.
If your data file uses quotes around fields, you can specify the quote character with the -E option.
To load a TSV (tab-separated values) file that uses double quotes:
To load a CSV (comma-separated values) file that uses optional single quotes:
The cpimport tool writes logs to different directories, depending on the Enterprise ColumnStore version:
In Enterprise ColumnStore 5.5.2 and later, logs are written to /var/log/mariadb/columnstore/bulk/
In Enterprise ColumnStore 5 releases before 5.5.2, logs are written to /var/lib/columnstore/data/bulk/
In Enterprise ColumnStore 1.4, logs are written to /usr/local/mariadb/columnstore/bulk/
The cpimport tool requires column values to be in the same order in the input file as the columns in the table definition.
The cpimport tool requires DATE values to be specified in the format YYYY-MM-DD.
The cpimport tool does not write bulk data loads to the transaction log, so they are not transactional.
The cpimport tool does not write bulk data loads to the binary log, so they cannot be replicated using MariaDB replication.
When Enterprise ColumnStore uses object storage and the Storage Manager directory uses EFS in the default Bursting Throughput mode, the cpimport tool can have performance problems if multiple data load operations are executed consecutively. The performance problems can occur because the Bursting Throughput mode scales the rate relative to the size of the file system, so the burst credits for a small Storage Manager volume can be fully consumed very quickly.
When this problem occurs, some solutions are:
Avoid using burst credits by using Provisioned Throughput mode instead of Bursting Throughput mode
Monitor burst credit balances in AWS and run data load operations when burst credits are available
Increase the burst credit balance by increasing the file system size (for example, by creating a dummy file)
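As a sketch of the last workaround, a dummy file can be pre-allocated on the EFS-backed volume to grow the file system and, with it, the baseline throughput (the path and 100 GB size are illustrative):

# illustrative path and size; writes a 100 GB dummy file to grow the file system
$ sudo dd if=/dev/zero of=/var/lib/columnstore/storagemanager/dummy-file bs=1M count=102400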
Additional information is available in the Amazon EFS documentation.
Connect to the primary server:

$ mariadb --host 192.168.0.100 --port 5001 \
   --user db_user --password \
   --default-character-set=utf8

Create the database and table:

CREATE DATABASE inventory;

CREATE TABLE inventory.products (
   product_name VARCHAR(11) NOT NULL DEFAULT '',
   supplier VARCHAR(128) NOT NULL DEFAULT '',
   quantity VARCHAR(128) NOT NULL DEFAULT '',
   unit_cost VARCHAR(128) NOT NULL DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=utf8;

Import a text file:

$ sudo cpimport \
   inventory products \
   inventory-products.txt

Import a binary file:

$ sudo cpimport -I1 \
   inventory products \
   inventory-products.bin

Binary import modes:

-I1 - Numeric fields containing NULL will be treated as NULL unless the column has a default value
-I2 - Numeric fields containing NULL will be saturated

Column formats in binary input files:

BIGINT - Little-endian integer format. Signed NULL: 0x8000000000000000ULL. Unsigned NULL: 0xFFFFFFFFFFFFFFFEULL
CHAR - String padded with '0' to match the length of the field. NULL: '0' for the full length of the field
DATE - Use the format represented by the struct Date. NULL: 0xFFFFFFFE
DATETIME - Use the format represented by the struct DateTime. NULL: 0xFFFFFFFFFFFFFFFEULL
DECIMAL - Use an integer representation of the value without the decimal point. Sizing depends on precision: 1-2 digits use 2 bytes; 3-4 digits use 3 bytes; 4-9 digits use 4 bytes; 10+ digits use 8 bytes. Signed and unsigned NULL: see the equivalent-sized integer
DOUBLE - Native IEEE floating point format. NULL: 0xFFFAAAAAAAAAAAAAULL
FLOAT - Native IEEE floating point format. NULL: 0xFFAAAAAA
INT - Little-endian integer format. Signed NULL: 0x80000000. Unsigned NULL: 0xFFFFFFFE
SMALLINT - Little-endian integer format. Signed NULL: 0x8000. Unsigned NULL: 0xFFFE
TINYINT - Little-endian integer format. Signed NULL: 0x80. Unsigned NULL: 0xFE
VARCHAR - String padded with '0' to match the length of the field. NULL: '0' for the full length of the field

struct Date
{
    unsigned spare : 6;
    unsigned day : 6;
    unsigned month : 4;
    unsigned year : 16;
};

struct DateTime
{
    unsigned msecond : 20;
    unsigned second : 6;
    unsigned minute : 6;
    unsigned hour : 6;
    unsigned day : 6;
    unsigned month : 4;
    unsigned year : 16;
};

Import from a remote database through standard input:

$ mariadb --quick \
   --skip-column-names \
   --execute="SELECT * FROM inventory.products" \
   | cpimport -s '\t' inventory products

Import from S3 through standard input:

$ aws s3 cp --quiet s3://columnstore-test/inventory-products.csv - \
   | cpimport -s ',' inventory products

Import a TSV file:

$ sudo cpimport -s '\t' \
   inventory products \
   inventory-products.tsv

Import a CSV file:

$ sudo cpimport -s ',' \
   inventory products \
   inventory-products.csv

Import a TSV file with double-quoted fields:

$ sudo cpimport -s '\t' -E '"' \
   inventory products \
   inventory-products.tsv

Import a CSV file with single-quoted fields:

$ sudo cpimport -s ',' -E "'" \
   inventory products \
   inventory-products.csv

Adding a Node to MariaDB Enterprise ColumnStore
To add a new node to Enterprise ColumnStore, perform the following procedure.
Before you can add a node to Enterprise ColumnStore, confirm that the Enterprise ColumnStore software has been deployed on the node in the desired topology.
Before the new node can be added, its MariaDB data directory must be consistent with the Primary Server. To ensure that it is consistent, take a backup of the Primary Server:
The instructions below show how to perform a backup using MariaDB Backup:
On the Primary Server, take a full backup:
sudo mariadb-backup --backup \
--user=mariabackup_user \
--password=mariabackup_passwd \
--target-dir=/data/backup/replica_backup

Confirm successful completion of the backup operation.
On the Primary Server, prepare the backup:
sudo mariadb-backup --prepare \
--target-dir=/data/backup/replica_backup

Confirm successful completion of the prepare operation.
To make the new node consistent with the Primary Server, restore the new backup on the new node:
On the Primary Server, copy the backup to the new node:
sudo rsync -av /data/backup/replica_backup 192.0.2.3:/data/backup/

On the new node, restore the backup using mariadb-backup:
sudo mariadb-backup --copy-back \
--target-dir=/data/backup/replica_backup

On the new node, fix the file permissions of the restored backup:
sudo chown -R mysql:mysql /var/lib/mysql

The Enterprise Server, Enterprise ColumnStore, and CMAPI services can be started using the systemctl command. If the services were started during the installation process, use the restart command.
Perform the following procedure on the new node:
Start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:
sudo systemctl restart mariadb
sudo systemctl enable mariadb

Start and disable the MariaDB Enterprise ColumnStore service, so that it does not start automatically upon reboot:
sudo systemctl restart mariadb-columnstore
sudo systemctl disable mariadb-columnstore

Note
The Enterprise ColumnStore service should not be enabled in a multi-node deployment. The Enterprise ColumnStore service will be started as-needed by the CMAPI service, so it does not require starting automatically upon reboot.
Start and enable the CMAPI service, so that it starts automatically upon reboot:
sudo systemctl restart mariadb-columnstore-cmapi
sudo systemctl enable mariadb-columnstore-cmapi

MariaDB Enterprise ColumnStore requires MariaDB Replication, which must be configured.
Get the GTID position that corresponds to the restored backup.
If the backup was taken with MariaDB Backup, this position will be located in xtrabackup_binlog_info:
cat xtrabackup_binlog_info
mariadb-bin.000096 568 0-1-2001,1-2-5139

The GTID position from the above output is 0-1-2001,1-2-5139.
Connect to the Replica Server using the root@localhost user account:
sudo mariadb

Set the gtid_slave_pos system variable to the GTID position:
SET GLOBAL gtid_slave_pos='0-1-2001,1-2-5139';

Execute the CHANGE MASTER TO statement to configure the new node to connect to the Primary Server at this position:
CHANGE MASTER TO
MASTER_USER = "repl",
MASTER_HOST = "192.0.2.1",
MASTER_PASSWORD = "repl_passwd",
MASTER_USE_GTID=slave_pos;

The above statement configures the Replica Server to connect to a Primary Server located at 192.0.2.1 using the repl user account.
Start replication using the START SLAVE statement:
START SLAVE;

The above statement configures the new node to connect to the Primary Server to retrieve new binary log events and replicate them into the local database.
The new node must be added to Enterprise ColumnStore using CMAPI:
Add the node using the add-node endpoint path
Use a supported REST client, such as curl
Format the JSON output using jq for enhanced readability
Authenticate using the configured API key
Include the required headers
For example, if the primary node's host name is mcs1 and the new node's IP address is 192.0.2.3:
In ES 10.5.10-7 and later:
curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":20, "node": "192.0.2.3"}' \
| jq .

In ES 10.5.9-6 and earlier:
curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/add-node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":20, "node": "192.0.2.3"}' \
| jq .

Example output:
{
"timestamp": "2020-10-28 00:39:14.672142",
"node_id": "192.0.2.3"
}

To confirm that the node was properly added, the status of Enterprise ColumnStore should be checked using CMAPI:
Check the status using the status endpoint path
For example, if the primary node's host name is mcs1:
curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .

Example output:
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
},
"192.0.2.2": {
"timestamp": "2020-12-15 00:40:34.428554",
"uptime": 11437,
"dbrm_mode": "slave",
"cluster_mode": "readonly",
"dbroots": [
"2"
],
"module_id": 2,
"services": [
{
"name": "workernode",
"pid": 17789
},
{
"name": "PrimProc",
"pid": 17813
},
{
"name": "ExeMgr",
"pid": 17854
},
{
"name": "WriteEngine",
"pid": 17877
}
]
},
"192.0.2.3": {
"timestamp": "2020-12-15 00:40:34.428554",
"uptime": 11437,
"dbrm_mode": "slave",
"cluster_mode": "readonly",
"dbroots": [
"2"
],
"module_id": 2,
"services": [
{
"name": "workernode",
"pid": 17789
},
{
"name": "PrimProc",
"pid": 17813
},
{
"name": "ExeMgr",
"pid": 17854
},
{
"name": "WriteEngine",
"pid": 17877
}
]
},
"num_nodes": 3
}

A server object for the new node must also be added to MaxScale using MaxCtrl:
Use MaxCtrl or another supported client
Add the server object using the create server command
As the first argument, provide a name for the server
As the second argument, provide the IP address for the node
For example:
maxctrl create server \
mcs3 \
192.0.2.3

To confirm that the server object was properly added, the server objects should be checked using MaxCtrl:
Show the server objects using the show servers command
For example:
maxctrl show servers

The server object for the new node must be linked to the monitor using MaxCtrl:
Link a server object to the monitor using the link monitor command
As the first argument, provide the name of the monitor
As the second argument, provide the name of the server
maxctrl link monitor \
mcs_monitor \
mcs3

To confirm that the server object was properly linked to the monitor, the monitor should be checked using MaxCtrl:
Show the monitors using the show monitors command
For example:
maxctrl show monitors

The server object for the new node must be linked to the service using MaxCtrl:
Link the server object to the service using the link service command
As the first argument, provide the name of the service
As the second argument, provide the name of the server
maxctrl link service \
mcs_service \
mcs3

To confirm that the server object was properly linked to the service, the service should be checked using MaxCtrl:
Show the services using the show services command
For example:
maxctrl show services

MaxScale is capable of checking the status of replication using MaxCtrl:
List the servers using the list servers command
For example:
maxctrl list servers

If the new node is properly replicating, then the State column will show Slave, Running.
The MariaDB SHOW PROCESSLIST statement is used to see a list of active queries on a given User Module (UM):
MariaDB [test]> SHOW PROCESSLIST;
+----+------+-----------+-------+---------+------+-------+--------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+------+-----------+-------+---------+------+-------+--------------+
| 73 | root | localhost | ssb10 | Query | 0 | NULL | show processlist
+----+------+-----------+-------+---------+------+-------+--------------+
1 row in set (0.01 sec)

getActiveSQLStatements is an mcsadmin command that shows which SQL statements are currently being executed on the database:
mcsadmin> getActiveSQLStatements
getactivesqlstatements Wed Oct 7 08:38:32 2015
Get List of Active SQL Statements
=================================
Start Time Time (hh:mm:ss) Session ID SQL Statement
---------------- ---------------- -------------------- ------------------------------------------------------------
Oct 7 08:38:30 00:00:03 73 select c_name,sum(lo_revenue) from customer, lineorder where lo_custkey = c_custkey and c_custkey = 6 group by c_name

The calGetStats function provides statistics about the node and network resources used by the last query run. Example:
MariaDB [test]> SELECT count(*) FROM wide2;
+----------+
| count(*) |
+----------+
| 5000000 |
+----------+
1 row in set (0.22 sec)
MariaDB [test]> SELECT calGetStats();
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| calGetStats() |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Query Stats: MaxMemPct-0; NumTempFiles-0; TempFileSpace-0B; ApproxPhyI/O-1931; CacheI/O-2446; BlocksTouched-2443; PartitionBlocksEliminated-0; MsgBytesIn-73KB; MsgBytesOut-1KB; Mode-Distributed |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

The output contains information on:
MaxMemPct - Peak memory utilization on the User Module, likely in support of a large (User Module) based hash join operation.
NumTempFiles - Report on any temporary files created in support of query operations larger than available memory, typically for unusual join operations where the smaller table join cardinality exceeds some configurable threshold.
TempFileSpace - Report on space used by temporary files created in support of query operations larger than available memory, typically for unusual join operations where the smaller table join cardinality exceeds some configurable threshold.
PhyI/O - Number of 8k blocks read from disk, SSD, or other persistent storage.
CacheI/O - Approximate number of 8k blocks processed in memory, adjusted down by the number of discrete PhyI/O calls required.
BlocksTouched - Approximate number of 8k blocks processed in memory.
PartitionBlocksEliminated - The number of block touches eliminated via the Extent Map elimination behavior.
MsgBytesIn, MsgBytesOut - Message size in MB sent between nodes in support of the query.
The output is useful to determine how much physical I/O was required, how much data was cached, and how many partition blocks were eliminated through use of Extent Map elimination. The system maintains min/max values for each extent and uses these to implement WHERE clause filters that completely bypass extents whose values fall outside the min/max range. When a column (such as a time column) is ordered or semi-ordered during load, this offers very large performance gains, as the system can avoid scanning many extents for the column.
While MariaDB Server's EXPLAIN utility can be used to look at the query plan, it is somewhat less helpful for ColumnStore tables, as ColumnStore does not use indexes or MariaDB's I/O functionality. The execution plan for a query on a ColumnStore table is made up of multiple steps. Each step in the query plan performs a set of operations that are issued from the User Module to the set of Performance Modules in support of a given step in a query.
Full Column Scan - an operation that scans each entry in a column using all available threads on the Performance Modules. Speed of operation is generally related to the size of the data type and the total number of rows in the column. The closest analogy for a traditional system is an index scan operation.
Partitioned Column Scan - an operation that uses the Extent Map to identify that certain portions of the column do not contain any matching values for a given set of filters. The closest analogy for a traditional row-based DBMS is a partitioned index scan, or partitioned table scan operation.
Column lookup by row offset - once the set of matching filters has been applied and the minimal set of rows has been identified, additional blocks are requested using a calculation that determines exactly which block is required. The closest analogy for a traditional system is a lookup by rowid.
These operations are automatically executed together in order to execute appropriate filters and column lookup by row offset.
In MariaDB ColumnStore there is a set of SQL tracing stored functions provided to see the distributed query execution plan between the nodes.
The basic steps to using these SQL tracing stored functions are:
Start the trace for the particular session.
Execute the SQL statement in question.
Review the trace collected for the statement. As an example, the following session starts a trace, issues a query against a 6 million row fact table and 300,000 row dimension table, and then reviews the output from the trace:
MariaDB [test]> SELECT calSetTrace(1);
+----------------+
| calSetTrace(1) |
+----------------+
| 0 |
+----------------+
1 row in set (0.00 sec)
MariaDB [test]> SELECT c_name, sum(o_totalprice)
-> FROM customer, orders
-> WHERE o_custkey = c_custkey
-> AND c_custkey = 5
-> GROUP BY c_name;
+--------------------+-------------------+
| c_name | sum(o_totalprice) |
+--------------------+-------------------+
| Customer#000000005 | 684965.28 |
+--------------------+-------------------+
1 row in set, 1 warning (0.34 sec)
MariaDB [test]> SELECT calGetTrace();
+---------------+
| calGetTrace() |
+---------------+
|
Desc Mode Table TableOID ReferencedColumns PIO LIO PBE Elapsed Rows
BPS PM customer 3024 (c_custkey,c_name) 0 43 36 0.006 1
BPS PM orders 3038 (o_custkey,o_totalprice) 0 766 0 0.032 3
HJS PM orders-customer 3038 - - - - ----- -
TAS UM - - - - - - 0.021 1
|
+---------------+
1 row in set (0.00 sec)

The column headings in the output are as follows:
Desc – Operation being executed. Possible values:
BPS - Batch Primitive Step: scanning or projecting the column blocks.
CES - Cross Engine Step: performing a cross-engine join.
DSS - Dictionary Structure Step: a dictionary scan for a particular variable-length string value.
HJS - Hash Join Step: performing a hash join between two tables.
HVS - Having Step: performing the HAVING clause on the result set.
SQS - Sub Query Step: performing a subquery.
TAS - Tuple Aggregation Step: the process of receiving intermediate aggregation results from other nodes.
TNS - Tuple Annexation Step: query result finishing, e.g., filling in constant columns, LIMIT, ORDER BY, and final DISTINCT cases.
TUS - Tuple Union Step: performing a SQL UNION of two subqueries.
TCS - Tuple Constant Step: processing constant value columns.
WFS - Window Function Step: performing a window function.
Mode – Where the operation was performed: the UM or a PM.
Table – Table for which columns may be scanned/projected.
TableOID – ObjectID for the table being scanned.
ReferencedColumns – The columns required by the query.
PIO – Physical I/O (reads from storage) executed for the query.
LIO – Logical I/O executed for the query, also known as Blocks Touched.
PBE – Partition Blocks Eliminated identifies blocks eliminated by Extent Map min/max.
Elapsed – Elapsed time for a given step.
Rows – Intermediate rows returned.
Sometimes it can be useful to clear caches to allow understanding of un-cached and cached query access. The calFlushCache() function will clear caches on all servers. This is only really useful for testing query performance:
MariaDB [test]> SELECT calFlushCache();

It can be useful to view details about the extent map for a given column. This can be achieved using the editem utility on any ColumnStore server. Available arguments can be listed with the -h flag. The most common use is to provide the column object id with the -o argument, which outputs details for the column; here, the -t argument is also provided to show min/max values as dates:
editem -o 3032 -t
Col OID = 3032, NumExtents = 10, width = 4
428032 - 432127 (4096) min: 1992-01-01, max: 1993-06-21, seqNum: 1, state: valid, fbo: 0, DBRoot: 1, part#: 0, seg#: 0, HWM: 0; status: avail
502784 - 506879 (4096) min: 1992-01-01, max: 1993-06-22, seqNum: 1, state: valid, fbo: 0, DBRoot: 2, part#: 0, seg#: 1, HWM: 0; status: unavail
708608 - 712703 (4096) min: 1993-06-21, max: 1994-12-11, seqNum: 1, state: valid, fbo: 0, DBRoot: 1, part#: 0, seg#: 2, HWM: 0; status: unavail
766976 - 771071 (4096) min: 1993-06-22, max: 1994-12-12, seqNum: 1, state: valid, fbo: 0, DBRoot: 2, part#: 0, seg#: 3, HWM: 0; status: unavail
989184 - 993279 (4096) min: 1994-12-11, max: 1996-06-01, seqNum: 1, state: valid, fbo: 4096, DBRoot: 1, part#: 0, seg#: 0, HWM: 8191; status: avail
1039360 - 1043455 (4096) min: 1994-12-12, max: 1996-06-02, seqNum: 1, state: valid, fbo: 4096, DBRoot: 2, part#: 0, seg#: 1, HWM: 8191; status: avail
1220608 - 1224703 (4096) min: 1996-06-01, max: 1997-11-22, seqNum: 1, state: valid, fbo: 4096, DBRoot: 1, part#: 0, seg#: 2, HWM: 8191; status: avail
1270784 - 1274879 (4096) min: 1996-06-02, max: 1997-11-22, seqNum: 1, state: valid, fbo: 4096, DBRoot: 2, part#: 0, seg#: 3, HWM: 8191; status: avail
1452032 - 1456127 (4096) min: 1997-11-22, max: 1998-08-02, seqNum: 1, state: valid, fbo: 0, DBRoot: 1, part#: 1, seg#: 0, HWM: 1930; status: avail
1510400 - 1514495 (4096) min: 1997-11-22, max: 1998-08-02, seqNum: 1, state: valid, fbo: 0, DBRoot: 2, part#: 1, seg#: 1, HWM: 1930; status: avail

Here it can be seen that the extent maps for the o_orderdate (object id 3032) column are well partitioned, since the orders table source data was sorted by o_orderdate. This example shows 2 separate DBRoot values, as the environment was a 2-node combined deployment.
Column object ids may be found by querying the calpontsys.syscolumn metadata table (deprecated) or the information_schema.columnstore_columns table (version 1.0.6 and later).
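For example, on 1.0.6 and later, a column's object id can be looked up with a query along these lines (the schema, table, and column names are illustrative):

-- illustrative table and column names
SELECT table_schema, table_name, column_name, object_id
FROM information_schema.columnstore_columns
WHERE table_name = 'orders' AND column_name = 'o_orderdate';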
MariaDB ColumnStore query statistics history can be retrieved for analysis. By default, query stats collection is disabled. To enable the collection of query stats, the <Enabled> element under <QueryStats> in the Columnstore.xml configuration file should be set to Y (the default is N):
<QueryStats>
<Enabled>Y</Enabled>
</QueryStats>

Cross Engine Support must also be enabled before enabling Query Statistics. See the Cross Engine Configuration section.
For query statistics, the cross-engine user needs the INSERT privilege on the querystats table.
Example:
grant INSERT on infinidb_querystats.querystats to 'cross_engine'@'127.0.0.1';
grant INSERT on infinidb_querystats.querystats to 'cross_engine'@'localhost';

When enabled, the history of query statistics across all sessions, along with execution times and the statistics provided by calGetStats(), is stored in a table in the infinidb_querystats schema. Only the following statement types are available for statistics monitoring:
SELECT
INSERT
UPDATE
DELETE
INSERT ... SELECT
LOAD DATA INFILE
When QueryStats is enabled, the query statistics history is collected in the querystats table in the infinidb_querystats schema.
The columns of this table are:
queryID - A unique identifier assigned to the query
Session ID (sessionID) - The session number that executed the statement.
queryType - The type of the query: insert, update, delete, select, insert select, or load data infile
query - The text of the query
Host (host) - The host that executed the statement.
User ID (user) - The user that executed the statement.
Priority (priority) - The priority the user has for this statement.
Query Execution Times (startTime, endTime) - Calculated as end time minus start time:
start time - the time that the query gets to ExeMgr, DDLProc, or DMLProc
end time - the time that the last result packet exits ExeMgr, DDLProc or DMLProc
Rows returned or affected (rows) - The number of rows returned for SELECT queries, or the number of rows affected by DML queries. Not valid for DDL and other query types.
Error Number (errNo) - The IDB error number if this query failed, 0 if it succeeded.
Physical I/O (phyIO) - The number of blocks that the query accessed from the disk, including the pre-fetch blocks. This statistic is only valid for the queries that are processed by ExeMgr, i.e. SELECT, DML with WHERE clause, and INSERT SELECT.
Cache I/O (cacheIO) - The number of blocks that the query accessed from the cache. This statistic is only valid for queries that are processed by ExeMgr, i.e. SELECT, DML with WHERE clause, and INSERT SELECT.
Blocks Touched (blocksTouched) - The total number of blocks that the query accessed physically and from the cache. This should be equal or less than the sum of physical I/O and cache I/O. This statistic is only valid for queries that are processed by ExeMgr, i.e. SELECT, DML with WHERE clause, and INSERT SELECT.
Partition Blocks Eliminated (CPBlocksSkipped) - The number of blocks being eliminated by the extent map casual partition. This statistic is only valid for queries that are processed by ExeMgr, i.e. SELECT, DML with WHERE clause, and INSERT SELECT.
Messages to other nodes (msgOutUM) - The number of messages in bytes that ExeMgr sends to the PrimProc. If a message needs to be distributed to all the PMs, the sum of all the distributed messages will be counted. Only valid for queries that are processed by ExeMgr, i.e. SELECT, DML with WHERE clause, and INSERT SELECT.
Messages from other nodes (msgInUM) - The number of messages in bytes that PrimProc sends to the ExeMgr. Only valid for queries that are processed by ExeMgr, i.e. SELECT, DML with where clause, and INSERT SELECT.
Memory Utilization (maxMemPct) - This field shows memory utilization in support of any join, group by, aggregation, distinct, or other operation.
Blocks Changed (blocksChanged) - Total number of blocks that queries physically changed on disk. This is only for delete/update statements.
Temp Files (numTempFiles) - This field shows any temporary file utilization in support of any join, group by, aggregation, distinct, or other operation.
Temp File Space (tempFileSpace) - This shows the size of any temporary file utilization in support of any join, group by, aggregation, distinct, or other operation.
Users can view the query statistics by selecting rows from the querystats table in the infinidb_querystats schema. Examples are listed below:
Example 1: List execution time, rows returned for all the select queries within the past 12 hours:
MariaDB [infinidb_querystats]> select queryid, query, endtime-starttime, rows from querystats
where starttime >= now() - interval 12 hour and querytype = 'SELECT';

Example 2: List the three slowest running select queries of session 2 within the past 12 hours:
MariaDB [infinidb_querystats]> select a.* from (select endtime-starttime execTime, query from queryStats
where sessionid = 2 and querytype = 'SELECT' and starttime >= now()-interval 12 hour
order by 1 limit 3) a;

Example 3: List the average, min, and max running time of all the INSERT SELECT queries within the past 12 hours:
MariaDB [infinidb_querystats]> select min(endtime-starttime), max(endtime-starttime), avg(endtime-starttime) from querystats
where querytype='INSERT SELECT' and starttime >= now() - interval 12 hour;

Controls whether disk joins are forced to run even if they are not estimated to be the most efficient execution plan. This can be useful for debugging purposes or for situations where the optimizer's estimates are not accurate.
Scope: global, session
Data type:
Default value: OFF
Range: ON, OFF
Introduced in: MariaDB Enterprise Server 10.6
Sets the maximum depth of the partition tree that can be used for disk joins. A higher value allows for more complex joins, but may also increase the memory usage and execution time.
Scope: global, session
Data type:
Default value: 10
Introduced in: MariaDB Enterprise Server 10.6
Sets the maximum number of values that can be used in an IN predicate on a Columnstore table. This limit helps to prevent performance issues caused by queries with a large number of IN values.
Scope: global, session
Data type:
Default value: 10000
Introduced in: MariaDB Enterprise Server 10.6
Sets the maximum number of rows that can be returned by a parallel merge join on a Columnstore table. This limit helps to prevent memory issues caused by joins that return a large number of rows.
Scope: global, session
Data type:
Default value: 1000000
Introduced in: MariaDB Enterprise Server 10.6
Command line: Yes
Scope: global, session
Data type:
Default value: 2
Range: 0,2
Command line: Yes
Scope: global, session
Data type:
Default value: 8
Command line: Yes
Scope: global, session
Data type:
Default value: 100
Command line: Yes
Scope: global, session
Data type:
Default value: 0
Command line: Yes
Scope: global, session
Data type:
Default value: 0
Command line: Yes
Scope: global, session
Data type:
Default value: OFF
Range: OFF, ON
Command line: Yes
Scope: global, session
Data type:
Default value: 7
Command line: Yes
Scope: global, session
Data type:
Default value: 17
Command line: Yes
Scope: global, session
Data type:
Default value: 0
Range: 0,1
Command line: Yes
Scope: global, session
Data type:
Default value: OFF
Range: OFF, ON
Command line: Yes
Scope: global, session
Data type:
Default value: 10
Command line: Yes
Scope: global, session
Data type:
Default value: 20
Command line: Yes
Scope: global, session
Data type:
Default value: 0
Command line: Yes
Scope: global, session
Data type:
Default value: OFF
Range: OFF, ON
Command line: Yes
Scope: global, session
Data type:
Default value: ON
Range: OFF, ON
Command line: Yes
Scope: global, session
Data type:
Default value: ON
Range: OFF, ON
Command line: Yes
Scope: global, session
Data type:
Default value: 1
Range: 0,1,2
MariaDB ColumnStore has the ability to compress data. This is controlled through a compression mode, which can be set as a default for the instance or set at the session level.
To set the compression mode at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance:
where n is:
0 - compression is turned off. Any subsequent table create statements run will have compression turned off for that table unless a statement override has been performed. Any alter statements run to add a column will have compression turned off for that column unless a statement override has been performed.
2 - compression is turned on. Any subsequent table create statements run will have compression turned on for that table unless a statement override has been performed. Any alter statements run to add a column will have compression turned on for that column unless a statement override has been performed. ColumnStore uses snappy compression in this mode.
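As a sketch of a per-table statement override (assuming the COMMENT='compression=n' table option; the table name is illustrative), compression can be disabled for one table regardless of the session setting:

-- illustrative table; the COMMENT override option is assumed
SET infinidb_compression_type = 2;
CREATE TABLE test.uncompressed_example (
    id INT
) ENGINE=Columnstore COMMENT='compression=0';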
MariaDB ColumnStore has the ability to change intermediate decimal mathematical results from decimal type to double. The decimal type has approximately 17-18 digits of precision, but a smaller maximum range. Whereas the double type has approximately 15-16 digits of precision, but a much larger maximum range.
In typical mathematical and scientific applications, the ability to avoid overflow in intermediate results with double math is likely more beneficial than the additional two digits of precision. In banking applications, however, it may be more appropriate to keep the default decimal setting to ensure accuracy to the least significant digit.
The infinidb_double_for_decimal_math variable is used to control the data type for intermediate decimal results. This decimal-to-double math may be set as a default for the instance, at the session level, or at the statement level by toggling this variable on and off.
To enable/disable the use of the decimal to double math at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance:
where n is:
off (disabled, default)
on (enabled)
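For example, a sketch of toggling the variable around a single statement (the table and columns are illustrative):

-- illustrative table and columns
SET infinidb_double_for_decimal_math = on;
SELECT SUM(unit_cost * quantity) FROM inventory.products;
SET infinidb_double_for_decimal_math = off;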
ColumnStore has the ability to support varied internal precision on decimal calculations. infinidb_decimal_scale is used internally by the ColumnStore engine to control how many significant digits to the right of the decimal point are carried through in suboperations on calculated columns. If, while running a query, you receive the message ‘aggregate overflow’, try reducing infinidb_decimal_scale and running the query again.
Note that, as you decrease infinidb_decimal_scale, you may see reduced accuracy in the least significant digit(s) of a returned calculated column. infinidb_use_decimal_scale is used internally by the ColumnStore engine to turn the use of this internal precision on and off. These two system variables can be set as a default for the instance or at session level.
To enable/disable the use of the decimal scale at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance:
where n is off (disabled) or on (enabled).
To set the decimal scale at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.
where n is the amount of precision desired for calculations.
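For example, a sketch of reducing the internal scale after an 'aggregate overflow' error (the scale value and query are illustrative):

-- illustrative scale and query
SET infinidb_use_decimal_scale = on;
SET infinidb_decimal_scale = 4;
SELECT AVG(unit_cost * quantity) FROM inventory.products;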
Joins are performed in memory. When a join operation exceeds the memory allocated for query joins, the query is aborted with error code IDB-2001.
Disk-based joins enable such queries to use disk for intermediate join data when the memory needed for the join exceeds the memory limit. Although slower than a fully in-memory join, and bound by the temporary space on disk, disk-based joins do allow such queries to complete.
The following variables in the HashJoin element in the Columnstore.xml configuration file relate to disk-based joins. Columnstore.xml resides in /usr/local/mariadb/columnstore/etc/.
AllowDiskBasedJoin – Option to use disk-based joins. Valid values are Y (enabled) or N (disabled). Default is disabled.
TempFileCompression – Option to use compression for disk join files. Valid values are Y (use compressed files) or N (use non-compressed files).
TempFilePath – The directory path used for the disk joins. By default, this path is the tmp directory for your installation (i.e., /usr/local/mariadb/columnstore/tmp). Files (named infinidb-join-data*) in this directory will be created and cleaned on an as-needed basis. The entire directory is removed and recreated by ExeMgr at startup.
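As a sketch, these values can also be changed with the mcsSetConfig utility instead of editing the file by hand (assuming the HashJoin section and variable names listed above; ColumnStore must be restarted for the change to take effect):

# assumes the HashJoin section names above; restart ColumnStore afterwards
$ mcsSetConfig HashJoin AllowDiskBasedJoin Y
$ mcsSetConfig HashJoin TempFileCompression Y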
In addition to the system-wide flags, the following system variable exists at the SQL global and session levels for managing the per-user memory limit for joins.
infinidb_um_mem_limit - A value for memory limit in MB per user. When this limit is exceeded by a join, it will switch to a disk-based join. By default, the limit is not set (value of 0).
For modification at the global level:
In the my.cnf file (typically /usr/local/mariadb/columnstore/mysql/my.cnf):
where value is the value in MB for the in-memory limit per user.
For modification at the session level, before issuing your join query from the SQL client, set the session variable as follows.
MariaDB ColumnStore has the ability to utilize the cpimport fast data import tool for non-transactional LOAD DATA INFILE and INSERT INTO ... SELECT FROM SQL statements. Using this method results in a significant increase in performance when loading data through these two SQL statements. This optimization is independent of the storage engine used by the tables in the select statement.
The infinidb_use_import_for_batchinsert variable is used to control if cpimport is used for these statements. This variable may be set as a default for the instance, set at the session level, or at the statement level by toggling this variable on and off.
To enable/disable the use of cpimport for batch inserts at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.
where n is:
0 (disabled)
1 (enabled)
The infinidb_import_for_batchinsert_delimiter variable is used internally by MariaDB ColumnStore on a non-transactional INSERT INTO SELECT FROM statement as the default delimiter passed to the cpimport tool. With a default value of ASCII 7, there should be no need to change this value unless your data contains ASCII 7 values.
To change this variable's value at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.
where ascii_value is an ASCII value representation of the delimiter desired.
Note that this setting may cause issues with multi-byte character set data. It is recommended to use UTF-8 files directly with cpimport.
If the following error is received, most likely with a transactional LOAD DATA INFILE or INSERT INTO SELECT, it is recommended to break the load into multiple smaller chunks, increase the VersionBufferFileSize setting, consider a non-transactional LOAD DATA INFILE, or use cpimport.
The VersionBufferFileSize setting is updated in Columnstore.xml, typically located under /usr/local/mariadb/columnstore/etc. It dictates the size of the version buffer file on disk, which provides DML transactional consistency. The default value is '1GB', which reserves up to a 1 gigabyte file. Modify this on the primary node and restart the system if you require a larger value.
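As an illustrative sketch, assuming the setting resides in the VersionBuffer section of Columnstore.xml, the value could be raised with mcsSetConfig (the 4GB value is arbitrary), followed by a system restart:

# assumed section name; restart the system afterwards
$ mcsSetConfig VersionBuffer VersionBufferFileSize 4GB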
MariaDB ColumnStore has the ability to query data from just a single node instead of the whole cluster. To accomplish this, the infinidb_local_query variable in the my.cnf configuration file is used, and it may be set as a default system-wide or at the session level.
Local PrimProc query can be enabled system wide during the install process when running the install script postConfigure. Answer 'y' to this prompt during the install process:
To enable the use of the local PrimProc query at the instance level, specify infinidb_local_query =1 (enabled) in the my.cnf configuration file at /usr/local/mariadb/columnstore/mysql. The default is 0 (disabled).
To enable/disable the use of the local PrimProc query at the session level, the following statement is used. Once the session has ended, any subsequent session will return to the default for the instance:
where n is:
0 (disabled)
1 (enabled)
At the session level, this variable applies only to executing a query on an individual node. The PrimProc must be set up with the local query option during installation.
With the infinidb_local_query variable set to 1 (default with local PrimProc Query):
With the infinidb_local_query variable set to 0 (the default without local PrimProc query):
Create a script (i.e., extract_query_script.sql in our example) similar to the following:
The infinidb_local_query is set to 0 to allow query across all PrimProc nodes.
The query is structured so PrimProc gets the fact table data locally from the PrimProc node (as indicated by the use of the idbLocalPm() function), while the dimension table data is extracted from all the PrimProc nodes.
Then you can execute the script to pipe it directly into cpimport:
ColumnStore has the ability to support full MariaDB query syntax through an operating mode. This operating mode may be set as a default for the instance or set at the session level. To set the operating mode at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.
where n is:
0 - a generic, highly compatible row-by-row processing mode. Some WHERE clause components can be processed by ColumnStore, but joins are processed entirely by MySQL using a nested loop join mechanism.
1 - (the default) query syntax is evaluated by ColumnStore for compatibility with distributed execution, and incompatible queries are rejected. Queries executed in this mode take advantage of distributed execution and typically result in higher performance.
2 - auto-switch mode: ColumnStore will attempt to process the query internally; if it cannot, it will automatically switch the query to run in row-by-row mode.
SET infinidb_compression_type = n

SET infinidb_double_for_decimal_math = on

SET infinidb_use_decimal_scale = on

SET infinidb_decimal_scale = n

[mysqld]
...
infinidb_um_mem_limit = value

SET infinidb_um_mem_limit = value

SET infinidb_use_import_for_batchinsert = n

SET infinidb_import_for_batchinsert_delimiter = ascii_value

ERROR 1815 (HY000) at line 1 in file: 'ldi.sql': Internal error: CAL0006: IDB-2008: The version buffer overflowed. Increase VersionBufferFileSize or limit the rows to be processed.

NOTE: Local Query Feature allows the ability to query data from a single Performance
Module. Check MariaDB ColumnStore Admin Guide for additional information.

Enable Local Query feature? [y,n] (n) >

SET infinidb_local_query = n

mcsmysql -e 'select * from source_schema.source_table;' -N | /usr/local/Calpont/bin/cpimport target_schema target_table -s '\t' -n1

SET infinidb_local_query=0;
SELECT fact.column1, dim.column2
FROM fact JOIN dim USING (KEY)
WHERE idbPm(fact.KEY) = idbLocalPm();

mcsmysql source_schema -N < extract_query_script.sql | /usr/local/mariadb/columnstore/bin/cpimport target_schema target_table -s '\t' -n1

SET infinidb_vtable_mode = n

MariaDB Enterprise ColumnStore's storage architecture is designed to provide great performance for analytical queries.
MariaDB Enterprise ColumnStore is a columnar storage engine for MariaDB Enterprise Server. MariaDB Enterprise ColumnStore enables ES to perform analytical workloads, including online analytical processing (OLAP), data warehousing, decision support systems (DSS), and hybrid transactional-analytical processing (HTAP) workloads.
Most traditional relational databases use row-based storage engines. In row-based storage engines, all columns for a table are stored contiguously. Row-based storage engines perform very well for transactional workloads but are less performant for analytical workloads.
Columnar storage engines store each column separately. Columnar storage engines perform very well for analytical workloads. Analytical workloads are characterized by ad hoc queries on very large data sets by relatively few users.
MariaDB Enterprise ColumnStore automatically partitions each column into extents, which helps improve query performance without using indexes.
MariaDB Enterprise ColumnStore enables MariaDB Enterprise Server to perform analytical or online analytical processing (OLAP) workloads.
OLAP workloads are generally characterized by ad hoc queries on very large data sets. Some other typical characteristics are:
Each query typically reads a subset of columns in the table
Most activity typically consists of read-only queries that perform aggregations, window functions, and various calculations
Analytical applications typically require only a few concurrent queries
Analytical applications typically require the scalability of large, complex queries
Analytical applications typically require efficient bulk loads of new data
OLAP workloads are typically required for:
Business intelligence (BI)
Health informatics
Historical data mining
Row-based storage engines have a disadvantage for OLAP workloads. Indexes are not usually very useful for OLAP workloads, because the large size of the data set and the ad hoc nature of the queries preclude the use of indexes to optimize queries.
Columnar storage engines are much better suited for OLAP workloads. MariaDB Enterprise ColumnStore is a columnar storage engine that is designed for OLAP workloads:
When a query reads a subset of columns in the table, Enterprise ColumnStore can reduce I/O by reading those columns and ignoring all others, because each column is stored separately
When most activity consists of read-only queries that perform aggregations, window functions, and various calculations, Enterprise ColumnStore is able to efficiently execute those queries using extent elimination, distributed query execution, and massively parallel processing (MPP) techniques
When only a few concurrent queries are required, Enterprise ColumnStore is able to maximize the use of system resources by using multiple threads and multiple nodes to perform work for each query
When scalability of large, complex queries is required, Enterprise ColumnStore is able to achieve horizontal and vertical scalability using distributed query execution and massively parallel processing (MPP) techniques
When efficient bulk loads of new data are required, Enterprise ColumnStore is able to bulk load new data without affecting existing data using automatic partitioning with the extent map
MariaDB Enterprise Server has had excellent performance for transactional or online transactional processing (OLTP) workloads since the beginning.
OLTP workloads are generally characterized by a fixed set of queries using a relatively small data set. Some other typical characteristics are:
Each query typically reads and/or writes many columns in the table.
Most activity typically consists of small transactions that only read and/or write a small number of rows.
Transactional applications typically require many concurrent transactions.
Transactional applications typically require a fast response time and low latency.
Transactional applications typically require ACID properties to protect data.
OLTP workloads are typically required for:
Financial transactions performed by financial institutions and e-commerce sites.
Store inventory changes performed by brick-and-mortar stores and e-commerce sites.
Account metadata changes performed by many sites that stores personal data.
Row-based storage engines have several advantages for OLTP workloads:
When a query reads and/or writes many columns in the table, row-based storage engines can find all columns on a single page, so the I/O costs of the operation are low.
When a transaction reads/writes a small number of rows, row-based storage engines can use an index to find the page for each row without a full table scan.
When many concurrent transactions are operating, row-based storage engines can implement transactional isolation by storing multiple versions of changed rows.
When a fast response time and low latency are required, row-based storage engines can use indexes to optimize the most common queries.
When ACID properties are required, row-based storage engines can implement consistency and durability with fewer performance trade-offs, since each row's columns are stored contiguously.
InnoDB is ES's default storage engine, and it is a highly performant row-based storage engine.
MariaDB Enterprise ColumnStore enables MariaDB Enterprise Server to function as a single-stack solution for hybrid workloads.
Hybrid workloads are characterized by a mix of transactional and analytical queries. Hybrid workloads are also known as "Smart Transactions", "Augmented Transactions", "Translytical", or "Hybrid Operational-Analytical Processing (HOAP)".
Hybrid workloads are typically required for applications that require real-time analytics that lead to immediate action:
Financial institutions use transactional queries to handle financial transactions and analytical queries to analyze the transactions for business intelligence.
Insurance companies use transactional queries to accept/process claims and analytical queries to analyze those claims for business opportunities or risks.
Health providers use transactional queries to track electronic health records (EHR) and analytical queries to analyze the EHRs to discover health trends or prevent adverse drug interactions.
MariaDB Enterprise Server provides multiple components to perform hybrid workloads:
For analytical queries, the Enterprise ColumnStore storage engine can be used.
For transactional queries, row-based storage engines, such as InnoDB, can be used.
For queries that reference both analytical and transactional data, ES's cross-engine join functionality can be used to join Enterprise ColumnStore tables with InnoDB tables.
MariaDB MaxScale is a high-performance database proxy that can dynamically route analytical queries to Enterprise ColumnStore and transactional queries to the transactional storage engine.
MariaDB Enterprise ColumnStore supports multiple storage types:

S3-Compatible Object Storage
• S3-compatible object storage is optional but recommended
• Enterprise ColumnStore can use S3-compatible object storage to store data
• With multi-node Enterprise ColumnStore, the Storage Manager directory should use shared local storage for high availability

Shared Local Storage
• Required for multi-node Enterprise ColumnStore with high availability
• Enterprise ColumnStore can use shared local storage to store data and metadata
• If S3-compatible storage is used for data, the shared local storage will only be used for the Storage Manager directory

Non-Shared Local Storage
• Appropriate for single-node Enterprise ColumnStore
• Enterprise ColumnStore can use non-shared local storage to store data and metadata
MariaDB Enterprise ColumnStore supports S3-compatible object storage.
S3-compatible object storage is optional, but highly recommended. If S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use Shared Local Storage (such as NFS) for high availability.
S3-compatible object storage is:
Compatible: Many object storage services are compatible with the Amazon S3 API.
Economical: S3-compatible object storage is often very low cost.
Flexible: S3-compatible object storage is available for both cloud and on-premises deployments.
Limitless: S3-compatible object storage is often virtually limitless.
Resilient: S3-compatible object storage is often low maintenance and highly available, since many services use resilient cloud infrastructure.
Scalable: S3-compatible object storage is often highly optimized for read and write scaling.
Secure: S3-compatible object storage is often encrypted-at-rest.
Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.
If you have any questions about using specific S3-compatible object storage with MariaDB Enterprise ColumnStore, contact us.
MariaDB Enterprise ColumnStore can use any object store that is compatible with the Amazon S3 API.
Many object storage services are compatible with the Amazon S3 API, and compatible object storage services are available for cloud deployments and on-premises deployments, so vendor lock-in is not a concern.
MariaDB Enterprise ColumnStore's Storage Manager enables remote S3-compatible object storage to be efficiently used. The Storage Manager uses a persistent local disk cache for read/write operations, so that network latency has minimal performance impact on Enterprise ColumnStore. In some cases, it will even perform better than local disk operations.
Enterprise ColumnStore only uses the Storage Manager when S3-compatible storage is used for data.
Storage Manager is configured using storagemanager.cnf.
MariaDB Enterprise ColumnStore's Storage Manager directory is at the following path by default:
/var/lib/columnstore/storagemanager
To enable high availability when S3-compatible object storage is used, the Storage Manager directory should use Shared Local Storage and be mounted on every ColumnStore node.
When you want to use S3-compatible storage for Enterprise ColumnStore, you must configure Enterprise ColumnStore's S3 Storage Manager to use S3-compatible storage.
To configure Enterprise ColumnStore to use S3-compatible storage, edit /etc/columnstore/storagemanager.cnf:
[ObjectStorage]
…
service = S3
…
[S3]
region = your_columnstore_bucket_region
bucket = your_columnstore_bucket_name
endpoint = your_s3_endpoint
aws_access_key_id = your_s3_access_key_id
aws_secret_access_key = your_s3_secret_key
# iam_role_name = your_iam_role
# sts_region = your_sts_region
# sts_endpoint = your_sts_endpoint
# ec2_iam_mode=enabled
# port_number = your_port_number
[Cache]
cache_size = your_local_cache_size
path = your_local_cache_path

The S3-compatible object storage options are configured under [S3]:
The bucket option must be set to the name of the bucket.
The endpoint option must be set to the endpoint for the S3-compatible object storage.
The aws_access_key_id and aws_secret_access_key options must be set to the access key ID and secret access key for the S3-compatible object storage.
To use a specific IAM role, you must uncomment and set iam_role_name, sts_region, and sts_endpoint.
To use the IAM role assigned to an EC2 instance, you must uncomment ec2_iam_mode=enabled.
To use a non-default port number, you must set port_number to the desired port.
The local cache options are configured under [Cache]:
The cache_size option is set to 2 GB by default.
The path option is set to /var/lib/columnstore/storagemanager/cache by default.
Ensure that the specified path has sufficient storage space for the specified cache size.
MariaDB Enterprise ColumnStore can use shared local storage.
Shared local storage is required for high availability. The specific Shared Local Storage requirements depend on whether Enterprise ColumnStore is configured to use S3-compatible object storage:
When S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use shared local storage for high availability.
When S3-compatible object storage is not used, Enterprise ColumnStore requires the DB Root directories to use shared local storage for high availability.
The most common shared local storage options for on-premises and cloud deployments are:
NFS (Network File System)
GlusterFS
The most common shared local storage options for AWS (Amazon Web Services) deployments are:
EBS (Elastic Block Store) Multi-Attach
EFS (Elastic File System)
The most common shared local storage option for GCP (Google Cloud Platform) deployments is:
Filestore
The most common options for shared local storage are:
EBS (Elastic Block Store) Multi-Attach
EBS is a high-performance block storage service for AWS (Amazon Web Services).
EBS multi-attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.
For deployments in AWS, EBS Multi-Attach is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.
EFS (Elastic File System)
EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).
For deployments in AWS, EFS is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.
Filestore
Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).
For deployments in GCP, Filestore is the recommended option for the Storage Manager directory, and Google Object Storage (S3-compatible) is the recommended option for data.
NFS (Network File System)
NFS is a distributed file system.
If NFS is used, the storage should be mounted with the sync option to ensure that each node flushes its changes immediately.
For on-premises deployments, NFS is the recommended option for the Storage Manager directory, and any S3-compatible object storage is the recommended option for data.
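For example, a minimal sketch of mounting the Storage Manager directory over NFS with the sync option, assuming a hypothetical NFS server named nfs1 that exports /exports/columnstore:
$ sudo mount -t nfs -o sync nfs1:/exports/columnstore /var/lib/columnstore/storagemanager
The equivalent /etc/fstab entry would be:
nfs1:/exports/columnstore /var/lib/columnstore/storagemanager nfs sync 0 0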
GlusterFS
GlusterFS is a distributed file system.
GlusterFS supports replication and failover.
Multi-node MariaDB Enterprise ColumnStore requires some directories to use shared local storage for high availability. The specific requirements depend on whether MariaDB Enterprise ColumnStore is configured to use S3-compatible object storage:

| S3-Compatible Object Storage Used? | Directories Requiring Shared Local Storage |
|---|---|
| Yes | Storage Manager directory |
| No | DB Root directories |
For best results, MariaDB Corporation recommends the following storage options:

| Environment | Object Storage for Data | Shared Local Storage |
|---|---|---|
| AWS | Amazon S3 storage | EBS Multi-Attach or EFS |
| GCP | Google Object Storage (S3-compatible) | Filestore |
| On-premises | Any S3-compatible object storage | NFS |
MariaDB Enterprise ColumnStore's storage format is optimized for analytical queries.
MariaDB Enterprise ColumnStore stores data in DB Root directories when S3-compatible object storage is not configured.
In a multi-node Enterprise ColumnStore, each node has its own DB Root directory.
The DB Root directories are at the following path by default:
/var/lib/columnstore/dataN
The N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment. For example, with a 3-node Enterprise ColumnStore deployment, this would refer to the following directories:
/var/lib/columnstore/data1
/var/lib/columnstore/data2
/var/lib/columnstore/data3
To enable high availability for the DB Root directories, each directory should be mounted on every ColumnStore node using Shared Local Storage.
Each column in a table is stored in units called extents.
By default, each extent contains the column values for 8 million rows. The physical size of each extent can range from 8 MB to 64 MB. When an extent reaches the maximum number of column values, Enterprise ColumnStore creates a new extent.
Each extent is stored in 8 KB blocks, and each block has a logical block identifier (LBID).
If a string value is longer than 8 characters, the value is stored in a separate dictionary file, and a pointer to the value is stored in the extent.
A segment file is used to store Enterprise ColumnStore data within a DB Root directory.
By default, a segment file contains two extents. When a segment file reaches its maximum size, Enterprise ColumnStore creates a new segment file.
The relevant configuration options are:
ExtentsPerSegmentFile
Configures the maximum number of extents that can be stored in each segment file.
Default value is 2.
For example, to configure Enterprise ColumnStore to store more extents in each segment file using the mcsSetConfig utility:
$ mcsSetConfig ExtentMap ExtentsPerSegmentFile 4
Enterprise ColumnStore automatically groups a column's segment files into column partitions.
On disk, each column partition is represented by a directory in the DB Root. The directory contains the segment files for the column partition.
By default, a column partition can contain four segment files, but you can configure Enterprise ColumnStore to store more segment files in each column partition. When a column partition reaches the maximum number of segment files, Enterprise ColumnStore creates a new column partition.
The relevant configuration options are:
FilesPerColumnPartition
Configures the maximum number of segment files that can be stored in each column partition.
Default value is 4.
For example, to configure Enterprise ColumnStore to store more segment files in each column partition using the mcsSetConfig utility:
$ mcsSetConfig ExtentMap FilesPerColumnPartition 8
Enterprise ColumnStore maintains an Extent Map to determine which values are located in each extent.
The Extent Map identifies each extent using its logical block identifier (LBID) values, and it maintains the minimum and maximum values within each extent.
The Extent Map is used to implement a performance optimization called Extent Elimination.
The primary node has a master copy of the Extent Map. When Enterprise ColumnStore is started, the primary node copies the Extent Map to the replica nodes.
While Enterprise ColumnStore is running, each node maintains a copy of the Extent Map in its main memory, so that it can be accessed quickly without additional I/O.
If the Extent Map gets corrupted, the mcsRebuildEM utility can rebuild the Extent Map from the contents of the database file system. The mcsRebuildEM utility is available starting in MariaDB Enterprise ColumnStore 6.2.2.
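As a rough sketch only, a rebuild on a single-node deployment might look like the following; the exact options and shutdown procedure vary by version and topology, so consult the utility's help output and your release documentation first:
$ sudo systemctl stop mariadb-columnstore
$ sudo mcsRebuildEM
$ sudo systemctl start mariadb-columnstore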
Enterprise ColumnStore automatically compresses all data on disk using either Snappy or LZ4 compression. See the columnstore_compression_type system variable for how to select the desired compression type.
Since Enterprise ColumnStore stores a single column's data in each segment file, the data in each segment file tends to be very similar. Similar data usually allows for excellent compressibility, though the actual compression ratio depends on factors such as the randomness of the data and the number of distinct values.
Enterprise ColumnStore's compression strategy is tuned to optimize the performance of I/O-bound queries, because the decompression rate is optimized to maximize read performance.
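For example, a minimal sketch of checking and changing the compression type used for tables created in the current session; this assumes your Enterprise ColumnStore version accepts the string values SNAPPY and LZ4 for this variable:
SHOW VARIABLES LIKE 'columnstore_compression_type';
SET SESSION columnstore_compression_type = 'LZ4';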
Enterprise ColumnStore uses the version buffer to store blocks that are being modified.
The version buffer is used for multiple tasks:
It is used to roll back a transaction.
It is used for multi-version concurrency control (MVCC). With MVCC, Enterprise ColumnStore can implement read snapshots, which allows a statement to have a consistent view of the database, even if some of the underlying rows have changed. The snapshot for a given statement is identified by the system change number (SCN).
The version buffer is split between data structures that are in-memory and on-disk.
The in-memory data structures are hash tables that keep track of in-flight transactions. The hash tables store the LBIDs for each block that is being modified by a transaction. The in-memory hash tables start at 4 MB and grow as needed, increasing in size as the number of modified blocks increases.
An on-disk version buffer file is stored in each DB Root. By default, the on-disk version buffer file is 1 GB, but you can configure Enterprise ColumnStore to use a different file size. The relevant configuration options are:
VersionBufferFileSize
Configures the size of the on-disk version buffer file in each DB Root.
Default value is 1 GB.
For example, to configure Enterprise ColumnStore to use a larger on-disk version buffer file using the mcsSetConfig utility:
$ mcsSetConfig VersionBuffer VersionBufferFileSize 2GB
Using the Extent Map, ColumnStore can perform logical range partitioning and retrieve only the blocks needed to satisfy a query. This is done through Extent Elimination: the process of eliminating extents that cannot satisfy the query's join and filter conditions, which reduces overall I/O.
In Extent Elimination, ColumnStore scans the columns referenced in join and filter conditions. It then uses each extent's logical horizontal partitioning information, along with the minimum and maximum values stored for the column, to eliminate extents. When a column scan involves a filter, that filter is compared to the minimum and maximum values stored in each extent for the column. If the filter value falls outside the extent's minimum and maximum value range, ColumnStore eliminates the extent.
This behavior is automatic and well suited to series, ordered, patterned, and time-based data that is loaded frequently and often queried by time. Any column with clustered values is a good candidate for Extent Elimination.
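To illustrate, consider the following sketch with an invented time-series table. Because each extent records the minimum and maximum sale_date values it contains, ColumnStore can skip every extent whose range falls entirely outside the filtered month and read only the matching extents:
CREATE TABLE fact_sales (
    sale_date DATE,
    store_id INT,
    amount DECIMAL(10,2)
) ENGINE=ColumnStore;

SELECT store_id, SUM(amount)
FROM fact_sales
WHERE sale_date BETWEEN '2020-01-01' AND '2020-01-31'
GROUP BY store_id;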
Step 4: Start and Configure MariaDB Enterprise Server
This page details step 4 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".
This step starts and configures MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
The installation process might have started some of the ColumnStore services. The services should be stopped prior to making configuration changes.
On each Enterprise ColumnStore node, stop the MariaDB Enterprise Server service:
$ sudo systemctl stop mariadb
On each Enterprise ColumnStore node, stop the MariaDB Enterprise ColumnStore service:
$ sudo systemctl stop mariadb-columnstore
On each Enterprise ColumnStore node, stop the CMAPI service:
$ sudo systemctl stop mariadb-columnstore-cmapi
On each Enterprise ColumnStore node, configure Enterprise Server.
Mandatory system variables and options for ColumnStore Shared Local Storage include:

| System Variable/Option | Description |
|---|---|
| character_set_server | Set this system variable to utf8. |
| collation_server | Set this system variable to utf8_general_ci. |
| columnstore_use_import_for_batchinsert | Set this system variable to ALWAYS to always use cpimport for LOAD DATA INFILE and INSERT...SELECT statements. |
| gtid_strict_mode | Set this system variable to ON. |
| log_bin | Set this option to the file you want to use for the binary log. Setting this option enables binary logging. |
| log_bin_index | Set this option to the file you want to use to track binlog filenames. |
| log_slave_updates | Set this system variable to ON. |
| relay_log | Set this option to the file you want to use for the relay logs. Setting this option enables relay logging. |
| relay_log_index | Set this option to the file you want to use to index relay log filenames. |
| server_id | Sets the numeric server ID for this MariaDB Enterprise Server. The value must be unique to each node. |
Example Configuration
[mariadb]
bind_address = 0.0.0.0
log_error = mariadbd.err
character_set_server = utf8
collation_server = utf8_general_ci
log_bin = mariadb-bin
log_bin_index = mariadb-bin.index
relay_log = mariadb-relay
relay_log_index = mariadb-relay.index
log_slave_updates = ON
gtid_strict_mode = ON
# This must be unique on each Enterprise ColumnStore node
server_id = 1
On each Enterprise ColumnStore node, start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:
$ sudo systemctl start mariadb
$ sudo systemctl enable mariadb
On each Enterprise ColumnStore node, stop the MariaDB Enterprise ColumnStore service:
$ sudo systemctl stop mariadb-columnstore
After the CMAPI service is installed in the next step, CMAPI will start the Enterprise ColumnStore service as needed on each node. CMAPI disables the Enterprise ColumnStore service to prevent systemd from automatically starting Enterprise ColumnStore upon reboot.
On each Enterprise ColumnStore node, start and enable the CMAPI service, so that it starts automatically upon reboot:
$ sudo systemctl start mariadb-columnstore-cmapi
$ sudo systemctl enable mariadb-columnstore-cmapi
For additional information, see "Start and Stop Services".
The ColumnStore Shared Local Storage topology requires several user accounts. Each user account should be created on the primary server, so that it is replicated to the replica servers.
Enterprise ColumnStore requires a mandatory utility user account to perform cross-engine joins and similar operations.
On the primary server, create the user account with the CREATE USER statement:
CREATE USER 'util_user'@'127.0.0.1'
IDENTIFIED BY 'util_user_passwd';
On the primary server, grant the user account the required privileges with the GRANT statement:
GRANT SELECT, PROCESS ON *.*
TO 'util_user'@'127.0.0.1';
On each Enterprise ColumnStore node, configure the ColumnStore utility user:
$ sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
$ sudo mcsSetConfig CrossEngineSupport Port 3306
$ sudo mcsSetConfig CrossEngineSupport User util_user
On each Enterprise ColumnStore node, set the password:
$ sudo mcsSetConfig CrossEngineSupport Password util_user_passwd
For details about how to encrypt the password, see "Credentials Management for MariaDB Enterprise ColumnStore".
Passwords should meet your organization's password policies. If your MariaDB Enterprise Server instance has a password validation plugin installed, then the password should also meet the configured requirements.
ColumnStore Shared Local Storage uses MariaDB Replication to replicate writes between the primary and replica servers. Because MaxScale can promote a replica server to become the new primary in the event of node failure, all nodes must have a replication user.
The action is performed on the primary server.
Create the replication user and grant it the required privileges:
Use the CREATE USER statement to create the replication user:
CREATE USER 'repl'@'192.0.2.%' IDENTIFIED BY 'repl_passwd';
Replace the referenced IP address with the relevant address for your environment.
Ensure that the user account can connect to the primary server from each replica.
Grant the user account the required privileges with the GRANT statement.
GRANT REPLICA MONITOR,
REPLICATION REPLICA,
REPLICATION REPLICA ADMIN,
REPLICATION MASTER ADMIN
ON *.* TO 'repl'@'192.0.2.%';
ColumnStore Shared Local Storage 23.10 uses MariaDB MaxScale 22.08 to load balance between the nodes.
This action is performed on the primary server.
Use the CREATE USER statement to create the MaxScale user:
CREATE USER 'mxs'@'192.0.2.%'
IDENTIFIED BY 'mxs_passwd';
Replace the referenced IP address with the relevant address for your environment.
Ensure that the user account can connect from the IP address of the MaxScale instance.
Use the GRANT statement to grant the privileges required by the router:
GRANT SHOW DATABASES ON *.* TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.columns_priv TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.db TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.procs_priv TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.proxies_priv TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.roles_mapping TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.tables_priv TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.user TO 'mxs'@'192.0.2.%';
Use the GRANT statement to grant the privileges required by the MariaDB Monitor:
GRANT BINLOG ADMIN,
READ_ONLY ADMIN,
RELOAD,
REPLICA MONITOR,
REPLICATION MASTER ADMIN,
REPLICATION REPLICA ADMIN,
REPLICATION REPLICA,
SHOW DATABASES,
SELECT
ON *.* TO 'mxs'@'192.0.2.%';
On each replica server, configure MariaDB Replication:
Use the CHANGE MASTER TO statement to configure the connection to the primary server:
CHANGE MASTER TO
MASTER_HOST='192.0.2.1',
MASTER_USER='repl',
MASTER_PASSWORD='repl_passwd',
MASTER_USE_GTID=slave_pos;
Start replication using the START REPLICA statement:
START REPLICA;
Confirm that replication is working using the SHOW REPLICA STATUS statement:
SHOW REPLICA STATUS;
Ensure that the replica server cannot accept local writes by setting the read_only system variable to ON using the SET GLOBAL statement:
SET GLOBAL read_only=ON;
Initiate the primary server using CMAPI.
Create an API key for the cluster. This API key should be stored securely and kept confidential, because it can be used to add cluster nodes to the multi-node Enterprise ColumnStore deployment.
For example, to create a random 256-bit API key using openssl rand:
$ openssl rand -hex 32
93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd
This document uses the preceding API key in further examples, but users should create their own.
Use CMAPI to add the primary server to the cluster and set the API key. The new API key needs to be provided in the X-API-Key HTTP header.
For example, if the primary server's host name is mcs1 and its IP address is 192.0.2.1, use the following node command:
$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":120, "node": "192.0.2.1"}' \
| jq .
{
"timestamp": "2020-10-28 00:39:14.672142",
"node_id": "192.0.2.1"
}
Use CMAPI to check the status of the cluster node:
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
}
}
Add the replica servers with CMAPI:
For each replica server, use CMAPI to add the replica server to the cluster. The previously set API key needs to be provided in the X-API-Key HTTP header.
For example, if the primary server's host name is mcs1 and the replica server's IP address is 192.0.2.2, use the following node command:
$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":120, "node": "192.0.2.2"}' \
| jq .
{
"timestamp": "2020-10-28 00:42:42.796050",
"node_id": "192.0.2.2"
}
After all replica servers have been added, use CMAPI to confirm that all cluster nodes have been successfully added:
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
},
"192.0.2.2": {
"timestamp": "2020-12-15 00:40:34.428554",
"uptime": 11437,
"dbrm_mode": "slave",
"cluster_mode": "readonly",
"dbroots": [
"2"
],
"module_id": 2,
"services": [
{
"name": "workernode",
"pid": 17789
},
{
"name": "PrimProc",
"pid": 17813
},
{
"name": "ExeMgr",
"pid": 17854
},
{
"name": "WriteEngine",
"pid": 17877
}
]
},
"192.0.2.3": {
"timestamp": "2020-12-15 00:40:34.428554",
"uptime": 11437,
"dbrm_mode": "slave",
"cluster_mode": "readonly",
"dbroots": [
"2"
],
"module_id": 2,
"services": [
{
"name": "workernode",
"pid": 17789
},
{
"name": "PrimProc",
"pid": 17813
},
{
"name": "ExeMgr",
"pid": 17854
},
{
"name": "WriteEngine",
"pid": 17877
}
]
},
"num_nodes": 3
}
The specific steps to configure the security module depend on the operating system.
Configure SELinux for Enterprise ColumnStore:
To configure SELinux, you have to install the packages required for audit2allow. On CentOS 7 and RHEL 7, install the following:
$ sudo yum install policycoreutils policycoreutils-python
On RHEL 8, install the following:
$ sudo yum install policycoreutils python3-policycoreutils policycoreutils-python-utils
Allow the system to run under load for a while to generate SELinux audit events.
After the system has taken some load, generate an SELinux policy from the audit events using audit2allow:
$ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
If no audit events were found, this will print the following:
$ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
Nothing to do
If audit events were found, the new SELinux policy can be loaded using semodule:
$ sudo semodule -i mariadb_local.pp
Set SELinux to enforcing mode:
$ sudo setenforce enforcing
To make the change persistent, set SELINUX=enforcing in /etc/selinux/config.
For example, the file will usually look like this after the change:
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=enforcing
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
Confirm that SELinux is in enforcing mode:
$ sudo getenforce
Enforcing
For information on how to create an AppArmor profile, see How to create an AppArmor Profile on Ubuntu.com.
The specific steps to configure the firewall service depend on the platform.
Configure firewalld for Enterprise Cluster on CentOS and RHEL:
Check if the firewalld service is running:
$ sudo systemctl status firewalld
If the firewalld service was stopped to perform the installation, start it now:
$ sudo systemctl start firewalld
Open up the relevant ports using firewall-cmd. For example, if your cluster nodes are in the 192.0.2.0/24 subnet:
$ sudo firewall-cmd --permanent --add-rich-rule='
rule family="ipv4"
source address="192.0.2.0/24"
destination address="192.0.2.0/24"
port port="3306" protocol="tcp"
accept'
$ sudo firewall-cmd --permanent --add-rich-rule='
rule family="ipv4"
source address="192.0.2.0/24"
destination address="192.0.2.0/24"
port port="8600-8630" protocol="tcp"
accept'
$ sudo firewall-cmd --permanent --add-rich-rule='
rule family="ipv4"
source address="192.0.2.0/24"
destination address="192.0.2.0/24"
port port="8640" protocol="tcp"
accept'
$ sudo firewall-cmd --permanent --add-rich-rule='
rule family="ipv4"
source address="192.0.2.0/24"
destination address="192.0.2.0/24"
port port="8700" protocol="tcp"
accept'
$ sudo firewall-cmd --permanent --add-rich-rule='
rule family="ipv4"
source address="192.0.2.0/24"
destination address="192.0.2.0/24"
port port="8800" protocol="tcp"
accept'
Reload the runtime configuration:
$ sudo firewall-cmd --reload
Configure UFW for Enterprise ColumnStore on Ubuntu:
Check if the UFW service is running:
$ sudo ufw status verbose
If the UFW service was stopped to perform the installation, start it now:
$ sudo ufw enable
Open up the relevant ports using ufw.
For example, if your cluster nodes are in the 192.0.2.0/24 subnet in the range 192.0.2.1 - 192.0.2.3:
$ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 3306 proto tcp
$ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8600:8630 proto tcp
$ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8640 proto tcp
$ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8700 proto tcp
$ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8800 proto tcp
Reload the runtime configuration:
$ sudo ufw reload
Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology":
This page was step 4 of 9.
cpimport is a high-speed bulk load utility that imports data into ColumnStore tables in a fast and efficient manner. It accepts as input any flat file containing delimited fields of data (i.e., columns in a table). The default delimiter is the pipe ('|') character, but other delimiters, such as commas, may be used as well. The data values must be in the same order as the CREATE TABLE statement, i.e., column 1 matches the first column in the table, and so on. Date values must be specified in the format 'yyyy-mm-dd'.
cpimport performs the following operations when importing data into a MariaDB ColumnStore database:
Data is read from specified flat files.
Data is transformed to fit ColumnStore’s column-oriented storage design.
Redundant data is tokenized and logically compressed.
Data is written to disk.
It is important to note that:
The bulk loads are an append operation to a table, so they allow existing data to be read and remain unaffected during the process.
The bulk loads do not write their data operations to the transaction log; they are not transactional in nature but are considered an atomic operation at this time. Information markers, however, are placed in the transaction log so the DBA is aware that a bulk operation did occur.
Upon completion of the load operation, a high-water mark in each column file is moved in an atomic operation that allows any subsequent queries to read the newly loaded data. This append operation provides consistent reads without incurring the overhead of logging the data.
There are two primary steps to using the cpimport utility:
Optionally create a job file that is used to load data from a flat file into multiple tables.
Run the cpimport utility to perform the data import.
The simplest form of cpimport command is
cpimport dbName tblName [loadFile]
The full syntax is:
cpimport dbName tblName [loadFile]
[-h] [-m mode] [-f filepath] [-d DebugLevel]
[-c readBufferSize] [-b numBuffers] [-r numReaders]
[-e maxErrors] [-B libBufferSize] [-s colDelimiter] [-E EnclosedByChar]
[-C escChar] [-j jobID] [-p jobFilePath] [-w numParsers]
[-n nullOption] [-P pmList] [-i] [-S] [-q batchQty]
positional parameters:
dbName Name of the database to load
tblName Name of table to load
loadFile Optional input file name in current directory,
unless a fully qualified name is given.
If not given, input read from STDIN.
Options:
-b Number of read buffers
-c Application read buffer size(in bytes)
-d Print different level(1-3) debug message
-e Max number of allowable error per table per PM
-f Data file directory path.
Default is current working directory.
In Mode 1, -f represents the local input file path.
In Mode 2, -f represents the PM based input file path.
In Mode 3, -f represents the local input file path.
-l Name of import file to be loaded, relative to -f path. (Cannot be used with -p)
-h Print this message.
-q Batch Quantity, Number of rows distributed per batch in Mode 1
-i Print extended info to console in Mode 3.
-j Job ID. In simple usage, default is the table OID,
unless a fully qualified input file name is given.
-n NullOption (0-treat the string NULL as data (default);
1-treat the string NULL as a NULL value)
-p Path for XML job description file.
-r Number of readers.
-s The delimiter between column values.
-B I/O library read buffer size (in bytes)
-w Number of parsers.
-E Enclosed by character if field values are enclosed.
-C Escape character used in conjunction with 'enclosed by'
character, or as part of NULL escape sequence ('\N');
default is '\'
-I Import binary data; how to treat NULL values:
1 - import NULL values
2 - saturate NULL values
-P List of PMs ex: -P 1,2,3. Default is all PMs.
-S Treat string truncations as errors.
-m mode
1 - rows will be loaded in a distributed manner across PMs.
2 - PM based input files loaded onto their respective PM.
3 - input files will be loaded on the local PM.
Mode 1: In this mode, you run cpimport from your primary node (mcs1). The source file is located on this primary node, and the data from cpimport is distributed across all the nodes. If no mode is specified, then this is the default.
Example:
cpimport -m1 mytest mytable mytable.tbl
Mode 2: In this mode, you run cpimport from your primary node (mcs1). The source data is in already-partitioned data files residing on the PMs. Each PM should have a source data file of the same name, containing that PM's partition of the data.
Example:
cpimport -m2 mytest mytable -l /home/mydata/mytable.tbl
Mode 3: In this mode, you run cpimport from the individual nodes independently, importing the source file that exists on that node. Concurrent imports can be executed on every node for the same table.
Example:
cpimport -m3 mytest mytable /home/mydata/mytable.tbl
Data can be loaded from STDIN into ColumnStore by simply omitting the loadFile parameter.
Example:
cpimport db1 table1
Similarly, the AWS CLI utility can be used to read data from an S3 bucket and pipe the output into cpimport, allowing direct loading from S3. This assumes the aws CLI program has been installed and configured on the host:
Example:
aws s3 cp --quiet s3://dthompson-test/trades_bulk.csv - | cpimport test trades -s ","
When troubleshooting connectivity problems, remove the --quiet option, which suppresses client logging (including permission errors).
Standard input can also be used to pipe the output from an arbitrary SELECT statement directly into cpimport. The SELECT statement may select from non-ColumnStore tables. In the example below, db2.source_table is selected from, using the -N flag to remove non-data formatting. The -q flag tells the mysql client not to cache results, which avoids possible timeouts that could cause the load to fail.
Example:
mariadb -q -e 'select * from source_table;' -N <source-db> | cpimport -s '\t' <target-db> <target-table>
Let's create a sample ColumnStore table:
CREATE DATABASE `json_columnstore`;
USE `json_columnstore`;
CREATE TABLE `products` (
`product_name` VARCHAR(11) NOT NULL DEFAULT '',
`supplier` VARCHAR(128) NOT NULL DEFAULT '',
`quantity` VARCHAR(128) NOT NULL DEFAULT '',
`unit_cost` VARCHAR(128) NOT NULL DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=utf8;
Now let's create a sample products.json file like this:
[{
"_id": {
"$oid": "5968dd23fc13ae04d9000001"
},
"product_name": "Sildenafil Citrate",
"supplier": "Wisozk Inc",
"quantity": 261,
"unit_cost": "$10.47"
}, {
"_id": {
"$oid": "5968dd23fc13ae04d9000002"
},
"product_name": "Mountain Juniperus Ashei",
"supplier": "Keebler-Hilpert",
"quantity": 292,
"unit_cost": "$8.74"
}, {
"_id": {
"$oid": "5968dd23fc13ae04d9000003"
},
"product_name": "Dextromethorphan HBR",
"supplier": "Schmitt-Weissnat",
"quantity": 211,
"unit_cost": "$20.53"
}]
We can then bulk load data from JSON into ColumnStore by first piping the data to jq and then to cpimport using a one-line command.
Example:
cat products.json | jq -r '.[] | [.product_name,.supplier,.quantity,.unit_cost] | @csv' | cpimport json_columnstore products -s ',' -E '"'
In this example, the JSON data comes from a static JSON file, but the same method works for output streamed from any data source that produces JSON, such as an API or NoSQL database. For more information on jq, see its manual.
There are two ways multiple tables can be loaded:
Run multiple cpimport jobs simultaneously. Tables per import should be unique, or the PMs for each import should be unique if using mode 3 (see the sketch following the colxml example below).
Use the colxml utility: colxml creates an XML job file for your database schema before you import data. Multiple tables may be imported by either importing all tables within a schema or listing specific tables using the -t option in colxml. Then run cpimport with the job file generated by colxml. Here is an example of how to use colxml and cpimport to import data into all the tables in a database schema:
colxml mytest -j299
cpimport -m1 -j299
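As referenced above, multiple cpimport jobs can also run simultaneously, as long as each job targets a distinct table (or distinct PMs in mode 3). A minimal sketch, assuming two hypothetical tables tab1 and tab2 in the mytest schema with matching input files:
$ cpimport -m1 mytest tab1 tab1.tbl &
$ cpimport -m1 mytest tab2 tab2.tbl &
$ wait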
Usage: colxml [options] dbName
Options:
-d Delimiter (default '|')
-e Maximum allowable errors (per table)
-h Print this message
-j Job id (numeric)
-l Load file name
-n "name in quotes"
-p Path for XML job description file that is generated
-s "Description in quotes"
-t Table name
-u User
-r Number of read buffers
-c Application read buffer size (in bytes)
-w I/O library buffer size (in bytes), used to read files
-x Extension of file name (default ".tbl")
-E EnclosedByChar (if data has enclosed values)
-C EscapeChar
-b Debug level (1-3)
The following tables comprise a database named 'tpch2':
MariaDB [tpch2]> show tables;
+-----------------+
| Tables_in_tpch2 |
+-----------------+
| customer        |
| lineitem        |
| nation          |
| orders          |
| part            |
| partsupp        |
| region          |
| supplier        |
+-----------------+
8 rows in set (0.00 sec)
First, place the delimited input data file for each table in /usr/local/mariadb/columnstore/data/bulk/data/import. Each file should be named after its table, with the .tbl extension.
Run colxml for the load job for the ‘tpch2’ database as shown here:
/usr/local/mariadb/columnstore/bin/colxml tpch2 -j500
Running colxml with the following parameters:
2015-10-07 15:14:20 (9481) INFO :
Schema: tpch2
Tables:
Load Files:
-b 0
-c 1048576
-d |
-e 10
-j 500
-n
-p /usr/local/mariadb/columnstore/data/bulk/job/
-r 5
-s
-u
-w 10485760
-x tbl
File completed for tables:
tpch2.customer
tpch2.lineitem
tpch2.nation
tpch2.orders
tpch2.part
tpch2.partsupp
tpch2.region
tpch2.supplier
Normal exit.
Now run cpimport to use the job file generated by the colxml execution:
/usr/local/mariadb/columnstore/bin/cpimport -j 500
Bulkload root directory : /usr/local/mariadb/columnstore/data/bulk
job description file : Job_500.xml
2015-10-07 15:14:59 (9952) INFO : successfully load job file /usr/local/mariadb/columnstore/data/bulk/job/Job_500.xml
2015-10-07 15:14:59 (9952) INFO : PreProcessing check starts
2015-10-07 15:15:04 (9952) INFO : PreProcessing check completed
2015-10-07 15:15:04 (9952) INFO : preProcess completed, total run time : 5 seconds
2015-10-07 15:15:04 (9952) INFO : No of Read Threads Spawned = 1
2015-10-07 15:15:04 (9952) INFO : No of Parse Threads Spawned = 3
2015-10-07 15:15:06 (9952) INFO : For table tpch2.customer: 150000 rows processed and 150000 rows inserted.
2015-10-07 15:16:12 (9952) INFO : For table tpch2.nation: 25 rows processed and 25 rows inserted.
2015-10-07 15:16:12 (9952) INFO : For table tpch2.lineitem: 6001215 rows processed and 6001215 rows inserted.
2015-10-07 15:16:31 (9952) INFO : For table tpch2.orders: 1500000 rows processed and 1500000 rows inserted.
2015-10-07 15:16:33 (9952) INFO : For table tpch2.part: 200000 rows processed and 200000 rows inserted.
2015-10-07 15:16:44 (9952) INFO : For table tpch2.partsupp: 800000 rows processed and 800000 rows inserted.
2015-10-07 15:16:44 (9952) INFO : For table tpch2.region: 5 rows processed and 5 rows inserted.
2015-10-07 15:16:45 (9952) INFO : For table tpch2.supplier: 10000 rows processed and 10000 rows inserted.
If there are differences between the input file and the table definition, the colxml utility can handle the following cases:
Different order of columns in the input file from table order
Input file column values to be skipped / ignored.
Target table columns to be defaulted.
In these cases, run the colxml utility (the -t argument can be useful for producing a job file for a single table) to produce the job XML file, use it as a template for editing, and then run cpimport with the edited job file.
Consider the following simple table example:
CREATE TABLE emp (
emp_id INT,
dept_id INT,
name VARCHAR(30),
salary INT,
hire_date DATE) ENGINE=columnstore;
This would produce a colxml file with the following table element:
<Table tblName="test.emp"
loadName="emp.tbl" maxErrRow="10">
<Column colName="emp_id"/>
<Column colName="dept_id"/>
<Column colName="name"/>
<Column colName="salary"/>
<Column colName="hire_date"/>
</Table>
If your input file has the data such that hire_date comes before salary, the following modification allows correct loading of that data into the original table definition (note that the last two Column elements are swapped):
<Table tblName="test.emp"
loadName="emp.tbl" maxErrRow="10">
<Column colName="emp_id"/>
<Column colName="dept_id"/>
<Column colName="name"/>
<Column colName="hire_date"/>
<Column colName="salary"/>
</Table>
The following example would ignore the last entry in the file and set salary to its default value (in this case, NULL):
<Table tblName="test.emp"
loadName="emp.tbl" maxErrRow="10">
<Column colName="emp_id"/>
<Column colName="dept_id"/>
<Column colName="name"/>
<Column colName="hire_date"/>
<IgnoreField/>
<DefaultColumn colName="salary"/>
</Table>IgnoreFields instructs cpimport to ignore and skip the particular value at that position in the file.
DefaultColumn instructs cpimport to default the current table column and not move the column pointer forward to the next delimiter.
Both instructions can be used independently and as many times as makes sense for your data and table definition.
It is possible to import from a binary file instead of a CSV file, using fixed-length rows of binary data. This is done with the '-I' flag, which has two modes:
-I1 - binary mode with NULLs accepted. Numeric fields containing the NULL marker will be treated as NULL unless the column has a default value.
-I2 - binary mode with NULLs saturated. NULL markers in numeric fields will be saturated.
Example
cpimport -I1 mytest mytable /home/mydata/mytable.bin
The following table shows how to represent the data in the binary format:
| Datatype | Binary Representation |
|---|---|
| INT/TINYINT/SMALLINT/BIGINT | Little-endian format for the numeric data |
| FLOAT/DOUBLE | IEEE format native to the computer |
| CHAR/VARCHAR | Data padded with '\0' for the length of the field. An entry that is all '\0' is treated as NULL. |
| DATE | Using the Date struct below |
| DATETIME | Using the DateTime struct below |
| DECIMAL | Stored using an integer representation of the DECIMAL without the decimal point. With precision/width of 2 or less, 2 bytes should be used; 3-4 should use 3 bytes; 4-9 should use 4 bytes; and 10+ should use 8 bytes. |
For NULL values, the following table should be used:

| Datatype | Signed NULL Value | Unsigned NULL Value |
|---|---|---|
| BIGINT | 0x8000000000000000ULL | 0xFFFFFFFFFFFFFFFEULL |
| INT | 0x80000000 | 0xFFFFFFFE |
| SMALLINT | 0x8000 | 0xFFFE |
| TINYINT | 0x80 | 0xFE |
| DECIMAL | As equiv. INT | As equiv. INT |
| FLOAT | 0xFFAAAAAA | N/A |
| DOUBLE | 0xFFFAAAAAAAAAAAAAULL | N/A |
| DATE | 0xFFFFFFFE | N/A |
| DATETIME | 0xFFFFFFFFFFFFFFFEULL | N/A |
| CHAR/VARCHAR | Fill with '\0' | N/A |
struct Date
{
unsigned spare : 6;
unsigned day : 6;
unsigned month : 4;
unsigned year : 16;
};
The spare bits in the Date struct must be set to 0x3E.
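As an illustration of producing such a file, the following C sketch encodes a single DATE value and writes its 4-byte representation to standard output. It assumes a compiler whose bit-field layout matches what cpimport expects on your platform (bit-field ordering is implementation-defined in C), so treat it as a sketch rather than a portable implementation:
#include <stdio.h>

struct Date
{
    unsigned spare : 6;
    unsigned day : 6;
    unsigned month : 4;
    unsigned year : 16;
};

int main(void)
{
    struct Date d;
    d.spare = 0x3E;  /* the spare bits must be set to 0x3E */
    d.day = 29;
    d.month = 12;
    d.year = 2020;   /* encodes 2020-12-29 */

    /* Each DATE column value occupies 4 bytes in the fixed-length row. */
    fwrite(&d, sizeof(d), 1, stdout);
    return 0;
}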
struct DateTime
{
unsigned msecond : 20;
unsigned second : 6;
unsigned minute : 6;
unsigned hour : 6;
unsigned day : 6;
unsigned month : 4;
unsigned year : 16;
};
As of version 1.4, cpimport uses the /var/lib/columnstore/bulk folder for all work being done. This folder contains:
Logs
Rollback info
Job info
A staging folder
The log folder typically contains:
-rw-r--r--. 1 root root 0 Dec 29 06:41 cpimport_1229064143_21779.err
-rw-r--r--. 1 root root 1146 Dec 29 06:42 cpimport_1229064143_21779.log
A typical log might look like this:
2020-12-29 06:41:44 (21779) INFO : Running distributed import (mode 1) on all PMs...
2020-12-29 06:41:44 (21779) INFO2 : /usr/bin/cpimport.bin -s , -E " -R /tmp/columnstore_tmp_files/BrmRpt112906414421779.rpt -m 1 -P pm1-21779 -T SYSTEM -u388952c1-4ab8-46d6-9857-c44827b1c3b9 bts flights
2020-12-29 06:41:58 (21779) INFO2 : Received a BRM-Report from 1
2020-12-29 06:41:58 (21779) INFO2 : Received a Cpimport Pass from PM1
2020-12-29 06:42:03 (21779) INFO2 : Received a BRM-Report from 2
2020-12-29 06:42:03 (21779) INFO2 : Received a Cpimport Pass from PM2
2020-12-29 06:42:03 (21779) INFO2 : Received a BRM-Report from 3
2020-12-29 06:42:03 (21779) INFO2 : BRM updated successfully
2020-12-29 06:42:03 (21779) INFO2 : Received a Cpimport Pass from PM3
2020-12-29 06:42:04 (21779) INFO2 : Released Table Lock
2020-12-29 06:42:04 (21779) INFO2 : Cleanup succeed on all PMs
2020-12-29 06:42:04 (21779) INFO : For table bts.flights: 374573 rows processed and 374573 rows inserted.
2020-12-29 06:42:04 (21779) INFO : Bulk load completed, total run time : 20.3052 seconds
2020-12-29 06:42:04 (21779) INFO2 : Shutdown of all child threads Finished!!
Prior to version 1.4, this folder was located at /usr/local/mariadb/columnstore/bulk.
Step 8: Test MariaDB MaxScale
This page details step 8 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".
This step tests MariaDB MaxScale 22.08.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
Use the maxctrl show maxscale command to view the global MaxScale configuration.
This action is performed on the MaxScale node:
Output should align to the global MaxScale configuration in the new configuration file you created.
Use the maxctrl list servers and maxctrl show server commands to view the configured server objects.
This action is performed on the MaxScale node:
Obtain the full list of server objects:
For each server object, view the configuration:
Output should align to the Server Object configuration you performed.
Use the maxctrl list monitors and maxctrl show monitor commands to view the configured monitors.
This action is performed on the MaxScale node:
Obtain the full list of monitors:
For each monitor, view the monitor configuration:
Output should align to the MariaDB Monitor (mariadbmon) configuration you performed.
Use the maxctrl list services and maxctrl show service commands to view the configured routing services.
This action is performed on the MaxScale node:
Obtain the full list of routing services:
For each service, view the service configuration:
Output should align to the Read Connection Router or Read/Write Split Router configuration you performed.
Applications should use a dedicated user account. The user account must be created on the primary server.
When users connect to MaxScale, MaxScale authenticates the user connection before routing it to an Enterprise Server node. Enterprise Server authenticates the connection as originating from the IP address of the MaxScale node.
The application users must have one user account with the host IP address of the application server and a second user account with the host IP address of the MaxScale node.
The requirement for a duplicate user account can be avoided by enabling the proxy_protocol parameter for MaxScale and the proxy_protocol_networks system variable for Enterprise Server.
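As a rough sketch of that configuration, assuming a MaxScale node at the hypothetical address 192.0.2.10 and a server object named mcs1:
# maxscale.cnf, on the MaxScale node: enable the PROXY protocol per server
[mcs1]
type=server
address=192.0.2.1
port=3306
proxy_protocol=true

# MariaDB option file, on each Enterprise ColumnStore node
[mariadb]
proxy_protocol_networks=192.0.2.10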
This action is performed on the primary Enterprise ColumnStore node:
Connect to the primary Enterprise ColumnStore node:
Create the database user account for your MaxScale node:
Replace 192.0.2.10 with the relevant IP address specification for your MaxScale node.
Passwords should meet your organization's password policies.
Grant the privileges required by your application to the database user account for your MaxScale node:
The privileges shown are designed to allow the tests in the subsequent sections to work. The user account for your production application may require different privileges.
This action is performed on the primary Enterprise ColumnStore node:
Create the database user account for your application server:
Replace 192.0.2.11 with the relevant IP address specification for your application server.
Passwords should meet your organization's password policies.
Grant the privileges required by your application to the database user account for your application server:
The privileges shown are designed to allow the tests in the subsequent sections to work. The user account for your production application may require different privileges.
To test the connection, use the MariaDB Client from your application server to connect to an Enterprise ColumnStore node through MaxScale.
This action is performed on a client connected to the MaxScale node:
If you configured the Read Connection Router, confirm that MaxScale routes connections to the replica servers.
On the MaxScale node, use the maxctrl list listeners command to view the available listeners and ports:
Open multiple terminals connected to your application server, in each, use MariaDB Client to connect to the listener port for the Read Connection Router (in the example, 3308):
Use the application user credentials you created for the --user and --password options.
In each terminal, query the hostname and server_id system variables to identify which server you're connected to:
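The query itself is not shown on this page; a minimal form of the check is:
SELECT @@hostname, @@server_id;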
Different terminals should return different values since MaxScale routes the connections to different nodes.
Since the router was configured with the slave router option, the Read Connection Router only routes connections to replica servers.
If you configured the Read/Write Split Router, confirm that MaxScale routes write queries on this router to the primary Enterprise ColumnStore node.
On the MaxScale node, use the maxctrl list listeners command to view the available listeners and ports:
Open multiple terminals connected to your application server, in each, use MariaDB Client to connect to the listener port for the Read/Write Split Router (in the example, 3307):
Use the application user credentials you created for the --user and --password options.
In one terminal, create the test table:
In each terminal, issue an INSERT statement to add a row to the example table with the values of the hostname and server_id system variables:
In one terminal, issue a SELECT statement to query the results:
While MaxScale is handling multiple connections from different terminals, it routes all connections to the current primary Enterprise ColumnStore node, which in this example is mcs1.
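The exact statements for this test are not shown on this page; a minimal sketch, with an invented table name and assuming a test database exists, might look like this:
CREATE TABLE test.rwsplit_check (
    hostname VARCHAR(64),
    server_id INT
);
INSERT INTO test.rwsplit_check VALUES (@@hostname, @@server_id);
SELECT * FROM test.rwsplit_check;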
If you configured the Read/Write Split Router, confirm that MaxScale routes read queries on this router to replica servers.
On the MaxScale node, use the maxctrl list listeners command to view the available listeners and ports:
In a terminal connected to your application server, use MariaDB Client to connect to the listener port for the Read/Write Split Router (in the example, 3307):
Use the application user credentials you created for the --user and --password options.
Query the hostname and server_id to identify which server MaxScale routed you to.
Resend the query:
Confirm that MaxScale routes the SELECT statements to different replica servers.
For more information on different routing criteria, see slave_selection_criteria.
"Deploy ColumnStore Shared Local Storage Topology".
This page was step 8 of 9.
Step 4: Start and Configure MariaDB Enterprise Server
This page details step 4 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".
This step starts and configures MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
The installation process might have started some of the ColumnStore services. The services should be stopped prior to making configuration changes.
On each Enterprise ColumnStore node, stop the MariaDB Enterprise Server service:
On each Enterprise ColumnStore node, stop the MariaDB Enterprise ColumnStore service:
On each Enterprise ColumnStore node, stop the CMAPI service:
On each Enterprise ColumnStore node, configure Enterprise Server.
Mandatory system variables and options for ColumnStore Object Storage include:
Example Configuration
On each Enterprise ColumnStore node, configure S3 Storage Manager to use S3-compatible storage by editing the /etc/columnstore/storagemanager.cnf configuration file:
The S3-compatible object storage options are configured under [S3]:
The bucket option must be set to the name of the bucket that you created in "Create an S3 Bucket".
The endpoint option must be set to the endpoint for the S3-compatible object storage.
The aws_access_key_id and aws_secret_access_key options must be set to the access key ID and secret access key for the S3-compatible object storage.
To use a specific IAM role, you must uncomment and set iam_role_name, sts_region, and sts_endpoint.
To use the IAM role assigned to an EC2 instance, you must uncomment ec2_iam_mode=enabled.
The local cache options are configured under [Cache]:
The cache_size option is set to 2 GB by default.
The path option is set to /var/lib/columnstore/storagemanager/cache by default.
Ensure that the specified path has sufficient storage space for the specified cache size.
On each Enterprise ColumnStore node, start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:
On each Enterprise ColumnStore node, stop the MariaDB Enterprise ColumnStore service:
After the CMAPI service is installed in the next step, CMAPI will start the Enterprise ColumnStore service as needed on each node. CMAPI disables the Enterprise ColumnStore service to prevent systemd from automatically starting Enterprise ColumnStore upon reboot.
On each Enterprise ColumnStore node, start and enable the CMAPI service, so that it starts automatically upon reboot:
For additional information, see "Start and Stop Services".
The ColumnStore Object Storage topology requires several user accounts. Each user account should be created on the primary server, so that it is replicated to the replica servers.
Enterprise ColumnStore requires a mandatory utility user account to perform cross-engine joins and similar operations.
On the primary server, create the user account with the CREATE USER statement:
On the primary server, grant the user account SELECT privileges on all databases with the GRANT statement:
On each Enterprise ColumnStore node, configure the ColumnStore utility user:
On each Enterprise ColumnStore node, set the password:
For details about how to encrypt the password, see "Credentials Management for MariaDB Enterprise ColumnStore".
Passwords should meet your organization's password policies. If your MariaDB Enterprise Server instance has a password validation plugin installed, then the password should also meet the configured requirements.
ColumnStore Object Storage uses MariaDB Replication to replicate writes between the primary and replica servers. As MaxScale can promote a replica server to become a new primary in the event of node failure, all nodes must have a replication user.
The action is performed on the primary server.
Create the replication user and grant it the required privileges:
Use the CREATE USER statement to create the replication user.
Replace the referenced IP address with the relevant address for your environment.
Ensure that the user account can connect to the primary server from each replica.
Grant the user account the required privileges with the GRANT statement.
ColumnStore Object Storage 23.10 uses MariaDB MaxScale 22.08 to load balance between the nodes.
This action is performed on the primary server.
Use the CREATE USER statement to create the MaxScale user:
Replace the referenced IP address with the relevant address for your environment.
Ensure that the user account can connect from the IP address of the MaxScale instance.
Use the GRANT statement to grant the privileges required by the router:
Use the GRANT statement to grant the privileges required by the MariaDB Monitor.
On each replica server, configure MariaDB Replication:
Use the CHANGE MASTER TO statement to configure the connection to the primary server:
Start replication using the START REPLICA statement:
Confirm that replication is working using the SHOW REPLICA STATUS statement:
Ensure that the replica server cannot accept local writes by setting the read_only system variable to ON using the SET GLOBAL statement:
Initiate the primary server using CMAPI.
Create an API key for the cluster. This API key should be stored securely and kept confidential, because it can be used to add cluster nodes to the multi-node Enterprise ColumnStore deployment.
For example, to create a random 256-bit API key using openssl rand:
This document will use the following API key in further examples, but users should create their own:
Use CMAPI to add the primary server to the cluster and set the API key. The new API key needs to be provided in the X-API-Key HTTP header.
For example, if the primary server's host name is mcs1 and its IP address is 192.0.2.1, use the following node command:
Use CMAPI to check the status of the cluster node:
Add the replica servers with CMAPI:
For each replica server, use CMAPI to add the replica server to the cluster. The previously set API key needs to be provided in the X-API-Key HTTP header.
For example, if the primary server's host name is mcs1 and the replica server's IP address is 192.0.2.2, use the following node command:
After all replica servers have been added, use CMAPI to confirm that all cluster nodes have been successfully added:
The specific steps to configure the security module depend on the operating system.
Configure SELinux for Enterprise ColumnStore:
To configure SELinux, you have to install the packages required for audit2allow. On CentOS 7 and RHEL 7, install the following:
On RHEL 8, install the following:
Allow the system to run under load for a while to generate SELinux audit events.
After the system has taken some load, generate an SELinux policy from the audit events using audit2allow:
If no audit events were found, this will print the following:
If audit events were found, the new SELinux policy can be loaded using semodule:
Set SELinux to enforcing mode:
Set SELinux to enforcing mode by setting SELINUX=enforcing in /etc/selinux/config.
For example, the file will usually look like this after the change:
Confirm that SELinux is in enforcing mode:
For information on how to create an AppArmor profile, see How to create an AppArmor Profile on Ubuntu.com.
The specific steps to configure the firewall service depend on the platform.
Configure firewalld for Enterprise Cluster on CentOS and RHEL:
Check if the firewalld service is running:
If the firewalld service was stopped to perform the installation, start it now:
For example, if your cluster nodes are in the 192.0.2.0/24 subnet:
Open up the relevant ports using firewall-cmd:
Reload the runtime configuration:
Configure UFW for Enterprise ColumnStore on Ubuntu:
Check if the UFW service is running:
If the UFW service was stopped to perform the installation, start it now:
Open up the relevant ports using ufw.
For example, if your cluster nodes are in the 192.0.2.0/24 subnet in the range 192.0.2.1 - 192.0.2.3:
Reload the runtime configuration:
Navigation in the procedure "Deploy ColumnStore Object Storage Topology":
This page was step 4 of 9.
Step 8: Test MariaDB MaxScale
This page details step 8 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".
This step tests MariaDB MaxScale 22.08.
Interactive commands are detailed. Alternatively, the described operations can be performed using automation.
Use the maxctrl show maxscale command to view the global MaxScale configuration.
This action is performed on the MaxScale node:
Output should align to the global MaxScale configuration in the new configuration file you created.
Check Server Configuration
Use the maxctrl list servers and maxctrl show server commands to view the configured server objects.
This action is performed on the MaxScale node:
Obtain the full list of server objects:
For each server object, view the configuration:
Output should align to the Server Object configuration you performed.
Use the maxctrl list monitors and maxctrl show monitor commands to view the configured monitors.
This action is performed on the MaxScale node:
Obtain the full list of monitors:
For each monitor, view the monitor configuration:
Output should align to the MariaDB Monitor (mariadbmon) configuration you performed.
Use the maxctrl list services and maxctrl show service commands to view the configured routing services.
This action is performed on the MaxScale node:
Obtain the full list of routing services:
For each service, view the service configuration:
Output should align to the Read Connection Router or Read/Write Split Router configuration you performed.
Applications should use a dedicated user account. The user account must be created on the primary server.
When users connect to MaxScale, MaxScale authenticates the user connection before routing it to an Enterprise Server node. Enterprise Server authenticates the connection as originating from the IP address of the MaxScale node.
The application users must have one user account with the host IP address of the application server and a second user account with the host IP address of the MaxScale node.
The requirement for a duplicate user account can be avoided by enabling the proxy_protocol parameter for MaxScale and the proxy_protocol_networks system variable for Enterprise Server.
This action is performed on the primary Enterprise ColumnStore node:
Connect to the primary Enterprise ColumnStore node:
Create the database user account for your MaxScale node:
Replace 192.0.2.10 with the relevant IP address specification for your MaxScale node.
Passwords should meet your organization's password policies.
Grant the privileges required by your application to the database user account for your MaxScale node:
The privileges shown are designed to allow the tests in the subsequent sections to work. The user account for your production application may require different privileges.
This action is performed on the primary Enterprise ColumnStore node:
Create the database user account for your application server:
Replace 192.0.2.11 with the relevant IP address specification for your application server.
Passwords should meet your organization's password policies.
Grant the privileges required by your application to the database user account for your application server:
The privileges shown are designed to allow the tests in the subsequent sections to work. The user account for your production application may require different privileges.
To test the connection, use the MariaDB Client from your application server to connect to an Enterprise ColumnStore node through MaxScale.
This action is performed on a client connected to the MaxScale node:
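$ mariadb --host 192.0.2.10 --port 3307 \
--user app_user --password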
If you configured the Read Connection Router, confirm that MaxScale routes connections to the replica servers.
On the MaxScale node, use the maxctrl list listeners command to view the available listeners and ports:
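$ maxctrl list listeners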
Open multiple terminals connected to your application server. In each, use the MariaDB Client to connect to the listener port for the Read Connection Router (in the example, 3308):
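$ mariadb --host 192.0.2.10 --port 3308 \
--user app_user --password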
Use the application user credentials you created for the --user and --password options.
In each terminal, query the hostname and server_id system variables to identify which server you're connected to:
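SELECT @@global.hostname, @@global.server_id;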
Different terminals should return different values since MaxScale routes the connections to different nodes.
Since the router was configured with the slave router option, the Read Connection Router only routes connections to replica servers.
If you configured the Read/Write Split Router, confirm that MaxScale routes write queries on this router to the primary Enterprise ColumnStore node.
On the MaxScale node, use the maxctrl list listeners command to view the available listeners and ports:
Open multiple terminals connected to your application server. In each, use the MariaDB Client to connect to the listener port for the Read/Write Split Router (in the example, 3307):
Use the application user credentials you created for the --user and --password options.
In one terminal, create the test table:
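CREATE TABLE test.load_balancing_test (
   id INT PRIMARY KEY AUTO_INCREMENT,
   hostname VARCHAR(256),
   server_id INT
);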
In each terminal, issue an INSERT statement to add a row to the example table with the values of the hostname and server_id system variables:
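INSERT INTO test.load_balancing_test (hostname, server_id)
   VALUES (@@global.hostname, @@global.server_id);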
In one terminal, issue a SELECT statement to query the results:
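SELECT * FROM test.load_balancing_test;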
While MaxScale is handling multiple connections from different terminals, it routes all write queries to the current primary Enterprise ColumnStore node, which in the example is mcs1.
If you configured the Read/Write Split Router, confirm that MaxScale routes read queries on this router to replica servers.
On the MaxScale node, use the command to view the available listeners and ports:
In a terminal connected to your application server, use the MariaDB Client to connect to the listener port for the Read/Write Split Router (in the example, 3307):
Use the application user credentials you created for the --user and --password options.
Query the hostname and server_id to identify which server MaxScale routed you to.
Resend the query:
Confirm that MaxScale routes the SELECT statements to different replica servers.
For more information on different routing criteria, see slave_selection_criteria.
Navigation in the procedure "Deploy ColumnStore Object Storage Topology":
This page was step 8 of 9.
$ maxctrl show maxscale
┌──────────────┬───────────────────────────────────────────────────────┐
│ Version │ 22.08.15 │
├──────────────┼───────────────────────────────────────────────────────┤
│ Commit │ 3761fa7a52046bc58faad8b5a139116f9e33364c │
├──────────────┼───────────────────────────────────────────────────────┤
│ Started At │ Thu, 05 Aug 2021 20:21:20 GMT │
├──────────────┼───────────────────────────────────────────────────────┤
│ Activated At │ Thu, 05 Aug 2021 20:21:20 GMT │
├──────────────┼───────────────────────────────────────────────────────┤
│ Uptime │ 868 │
├──────────────┼───────────────────────────────────────────────────────┤
│ Config Sync │ null │
├──────────────┼───────────────────────────────────────────────────────┤
│ Parameters │ { │
│ │ "admin_auth": true, │
│ │ "admin_enabled": true, │
│ │ "admin_gui": true, │
│ │ "admin_host": "0.0.0.0", │
│ │ "admin_log_auth_failures": true, │
│ │ "admin_pam_readonly_service": null, │
│ │ "admin_pam_readwrite_service": null, │
│ │ "admin_port": 8989, │
│ │ "admin_secure_gui": false, │
│ │ "admin_ssl_ca_cert": null, │
│ │ "admin_ssl_cert": null, │
│ │ "admin_ssl_key": null, │
│ │ "admin_ssl_version": "MAX", │
│ │ "auth_connect_timeout": "10000ms", │
│ │ "auth_read_timeout": "10000ms", │
│ │ "auth_write_timeout": "10000ms", │
│ │ "cachedir": "/var/cache/maxscale", │
│ │ "config_sync_cluster": null, │
│ │ "config_sync_interval": "5000ms", │
│ │ "config_sync_password": "*****", │
│ │ "config_sync_timeout": "10000ms", │
│ │ "config_sync_user": null, │
│ │ "connector_plugindir": "/usr/lib64/mysql/plugin", │
│ │ "datadir": "/var/lib/maxscale", │
│ │ "debug": null, │
│ │ "dump_last_statements": "never", │
│ │ "execdir": "/usr/bin", │
│ │ "language": "/var/lib/maxscale", │
│ │ "libdir": "/usr/lib64/maxscale", │
│ │ "load_persisted_configs": true, │
│ │ "local_address": null, │
│ │ "log_debug": false, │
│ │ "log_info": false, │
│ │ "log_notice": true, │
│ │ "log_throttling": { │
│ │ "count": 10, │
│ │ "suppress": 10000, │
│ │ "window": 1000 │
│ │ }, │
│ │ "log_warn_super_user": false, │
│ │ "log_warning": true, │
│ │ "logdir": "/var/log/maxscale", │
│ │ "max_auth_errors_until_block": 10, │
│ │ "maxlog": true, │
│ │ "module_configdir": "/etc/maxscale.modules.d", │
│ │ "ms_timestamp": false, │
│ │ "passive": false, │
│ │ "persistdir": "/var/lib/maxscale/maxscale.cnf.d", │
│ │ "piddir": "/var/run/maxscale", │
│ │ "query_classifier": "qc_sqlite", │
│ │ "query_classifier_args": null, │
│ │ "query_classifier_cache_size": 289073971, │
│ │ "query_retries": 1, │
│ │ "query_retry_timeout": "5000ms", │
│ │ "rebalance_period": "0ms", │
│ │ "rebalance_threshold": 20, │
│ │ "rebalance_window": 10, │
│ │ "retain_last_statements": 0, │
│ │ "session_trace": 0, │
│ │ "skip_permission_checks": false, │
│ │ "sql_mode": "default", │
│ │ "syslog": true, │
│ │ "threads": 1, │
│ │ "users_refresh_interval": "0ms", │
│ │ "users_refresh_time": "30000ms", │
│ │ "writeq_high_water": 16777216, │
│ │ "writeq_low_water": 8192 │
│ │ } │
└──────────────┴───────────────────────────────────────────────────────┘
$ maxctrl list servers
┌────────┬────────────────┬──────┬─────────────┬─────────────────┬────────┐
│ Server │ Address │ Port │ Connections │ State │ GTID │
├────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┤
│ mcs1 │ 192.0.2.1 │ 3306 │ 1 │ Master, Running │ 0-1-25 │
├────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┤
│ mcs2 │ 192.0.2.2 │ 3306 │ 1 │ Slave, Running │ 0-1-25 │
├────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┤
│ mcs3 │ 192.0.2.3 │ 3306 │ 1 │ Slave, Running │ 0-1-25 │
└────────┴────────────────┴──────┴─────────────┴─────────────────┴────────┘
$ maxctrl show server mcs1
┌─────────────────────┬───────────────────────────────────────────┐
│ Server │ mcs1 │
├─────────────────────┼───────────────────────────────────────────┤
│ Address │ 192.0.2.1 │
├─────────────────────┼───────────────────────────────────────────┤
│ Port │ 3306 │
├─────────────────────┼───────────────────────────────────────────┤
│ State │ Master, Running │
├─────────────────────┼───────────────────────────────────────────┤
│ Version │ 11.4.5-3-MariaDB-enterprise-log │
├─────────────────────┼───────────────────────────────────────────┤
│ Last Event │ master_up │
├─────────────────────┼───────────────────────────────────────────┤
│ Triggered At │ Thu, 05 Aug 2021 20:22:26 GMT │
├─────────────────────┼───────────────────────────────────────────┤
│ Services │ connection_router_service │
│ │ query_router_service │
├─────────────────────┼───────────────────────────────────────────┤
│ Monitors │ columnstore_monitor │
├─────────────────────┼───────────────────────────────────────────┤
│ Master ID │ -1 │
├─────────────────────┼───────────────────────────────────────────┤
│ Node ID │ 1 │
├─────────────────────┼───────────────────────────────────────────┤
│ Slave Server IDs │ │
├─────────────────────┼───────────────────────────────────────────┤
│ Current Connections │ 1 │
├─────────────────────┼───────────────────────────────────────────┤
│ Total Connections │ 1 │
├─────────────────────┼───────────────────────────────────────────┤
│ Max Connections │ 1 │
├─────────────────────┼───────────────────────────────────────────┤
│ Statistics │ { │
│ │ "active_operations": 0, │
│ │ "adaptive_avg_select_time": "0ns", │
│ │ "connection_pool_empty": 0, │
│ │ "connections": 1, │
│ │ "max_connections": 1, │
│ │ "max_pool_size": 0, │
│ │ "persistent_connections": 0, │
│ │ "reused_connections": 0, │
│ │ "routed_packets": 0, │
│ │ "total_connections": 1 │
│ │ } │
├─────────────────────┼───────────────────────────────────────────┤
│ Parameters │ { │
│ │ "address": "192.0.2.1", │
│ │ "disk_space_threshold": null, │
│ │ "extra_port": 0, │
│ │ "monitorpw": null, │
│ │ "monitoruser": null, │
│ │ "persistmaxtime": "0ms", │
│ │ "persistpoolmax": 0, │
│ │ "port": 3306, │
│ │ "priority": 0, │
│ │ "proxy_protocol": false, │
│ │ "rank": "primary", │
│ │ "socket": null, │
│ │ "ssl": false, │
│ │ "ssl_ca_cert": null, │
│ │ "ssl_cert": null, │
│ │ "ssl_cert_verify_depth": 9, │
│ │ "ssl_cipher": null, │
│ │ "ssl_key": null, │
│ │ "ssl_verify_peer_certificate": false, │
│ │ "ssl_verify_peer_host": false, │
│ │ "ssl_version": "MAX" │
│ │ } │
└─────────────────────┴───────────────────────────────────────────┘
$ maxctrl list monitors
┌─────────────────────┬─────────┬──────────────────┐
│ Monitor │ State │ Servers │
├─────────────────────┼─────────┼──────────────────┤
│ columnstore_monitor │ Running │ mcs1, mcs2, mcs3 │
└─────────────────────┴─────────┴──────────────────┘
$ maxctrl show monitor columnstore_monitor
┌─────────────────────┬─────────────────────────────────────┐
│ Monitor │ columnstore_monitor │
├─────────────────────┼─────────────────────────────────────┤
│ Module │ mariadbmon │
├─────────────────────┼─────────────────────────────────────┤
│ State │ Running │
├─────────────────────┼─────────────────────────────────────┤
│ Servers │ mcs1 │
│ │ mcs2 │
│ │ mcs3 │
├─────────────────────┼─────────────────────────────────────┤
│ Parameters │ { │
│ │ "backend_connect_attempts": 1, │
│ │ "backend_connect_timeout": 3, │
│ │ "backend_read_timeout": 3, │
│ │ "backend_write_timeout": 3, │
│ │ "disk_space_check_interval": 0, │
│ │ "disk_space_threshold": null, │
│ │ "events": "all", │
│ │ "journal_max_age": 28800, │
│ │ "module": "mariadbmon", │
│ │ "monitor_interval": 2000, │
│ │ "password": "*****", │
│ │ "script": null, │
│ │ "script_timeout": 90, │
│ │ "user": "mxs" │
│ │ } │
├─────────────────────┼─────────────────────────────────────┤
│ Monitor Diagnostics │ {} │
└─────────────────────┴─────────────────────────────────────┘
$ maxctrl list services
┌───────────────────────────┬────────────────┬─────────────┬───────────────────┬──────────────────┐
│ Service │ Router │ Connections │ Total Connections │ Servers │
├───────────────────────────┼────────────────┼─────────────┼───────────────────┼──────────────────┤
│ connection_router_service │ readconnroute  │ 0           │ 0                 │ mcs1, mcs2, mcs3 │
├───────────────────────────┼────────────────┼─────────────┼───────────────────┼──────────────────┤
│ query_router_service │ readwritesplit │ 0 │ 0 │ mcs1, mcs2, mcs3 │
└───────────────────────────┴────────────────┴─────────────┴───────────────────┴──────────────────┘
$ maxctrl show service query_router_service
┌─────────────────────┬─────────────────────────────────────────────────────────────┐
│ Service │ query_router_service │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ Router │ readwritesplit │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ State │ Started │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ Started At │ Sat Aug 28 21:41:16 2021 │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ Current Connections │ 0 │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ Total Connections │ 0 │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ Max Connections │ 0 │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ Cluster │ │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ Servers │ mcs1 │
│ │ mcs2 │
│ │ mcs3 │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ Services │ │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ Filters │ │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ Parameters │ { │
│ │ "auth_all_servers": false, │
│ │ "causal_reads": "false", │
│ │ "causal_reads_timeout": "10000ms", │
│ │ "connection_keepalive": "300000ms", │
│ │ "connection_timeout": "0ms", │
│ │ "delayed_retry": false, │
│ │ "delayed_retry_timeout": "10000ms", │
│ │ "disable_sescmd_history": false, │
│ │ "enable_root_user": false, │
│ │ "idle_session_pool_time": "-1000ms", │
│ │ "lazy_connect": false, │
│ │ "localhost_match_wildcard_host": true, │
│ │ "log_auth_warnings": true, │
│ │ "master_accept_reads": false, │
│ │ "master_failure_mode": "fail_instantly", │
│ │ "master_reconnection": false, │
│ │ "max_connections": 0, │
│ │ "max_sescmd_history": 50, │
│ │ "max_slave_connections": 255, │
│ │ "max_slave_replication_lag": "0ms", │
│ │ "net_write_timeout": "0ms", │
│ │ "optimistic_trx": false, │
│ │ "password": "*****", │
│ │ "prune_sescmd_history": true, │
│ │ "rank": "primary", │
│ │ "retain_last_statements": -1, │
│ │ "retry_failed_reads": true, │
│ │ "reuse_prepared_statements": false, │
│ │ "router": "readwritesplit", │
│ │ "session_trace": false, │
│ │ "session_track_trx_state": false, │
│ │ "slave_connections": 255, │
│ │ "slave_selection_criteria": "LEAST_CURRENT_OPERATIONS", │
│ │ "strict_multi_stmt": false, │
│ │ "strict_sp_calls": false, │
│ │ "strip_db_esc": true, │
│ │ "transaction_replay": false, │
│ │ "transaction_replay_attempts": 5, │
│ │ "transaction_replay_max_size": 1073741824, │
│ │ "transaction_replay_retry_on_deadlock": false, │
│ │ "type": "service", │
│ │ "use_sql_variables_in": "all", │
│ │ "user": "mxs", │
│ │ "version_string": null │
│ │ } │
├─────────────────────┼─────────────────────────────────────────────────────────────┤
│ Router Diagnostics │ { │
│ │ "avg_sescmd_history_length": 0, │
│ │ "max_sescmd_history_length": 0, │
│ │ "queries": 0, │
│ │ "replayed_transactions": 0, │
│ │ "ro_transactions": 0, │
│ │ "route_all": 0, │
│ │ "route_master": 0, │
│ │ "route_slave": 0, │
│ │ "rw_transactions": 0, │
│ │ "server_query_statistics": [] │
│ │ } │
└─────────────────────┴─────────────────────────────────────────────────────────────┘
$ sudo mariadb
CREATE USER 'app_user'@'192.0.2.10' IDENTIFIED BY 'app_user_passwd';
GRANT ALL ON test.* TO 'app_user'@'192.0.2.10';
CREATE USER 'app_user'@'192.0.2.11' IDENTIFIED BY 'app_user_passwd';
GRANT ALL ON test.* TO 'app_user'@'192.0.2.11';
$ mariadb --host 192.0.2.10 --port 3307 \
--user app_user --password
$ maxctrl list listeners
┌────────────────────────────┬──────┬──────┬─────────┬───────────────────────────┐
│ Name │ Port │ Host │ State │ Service │
├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
│ connection_router_listener │ 3308 │ :: │ Running │ connection_router_service │
├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
│ query_router_listener │ 3307 │ :: │ Running │ query_router_service │
└────────────────────────────┴──────┴──────┴─────────┴───────────────────────────┘
$ mariadb --host 192.0.2.10 --port 3308 \
--user app_user --password
SELECT @@global.hostname, @@global.server_id;
+-------------------+--------------------+
| @@global.hostname | @@global.server_id |
+-------------------+--------------------+
| mcs2 | 2 |
+-------------------+--------------------+
$ maxctrl list listeners
┌────────────────────────────┬──────┬──────┬─────────┬───────────────────────────┐
│ Name │ Port │ Host │ State │ Service │
├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
│ connection_router_listener │ 3308 │ :: │ Running │ connection_router_service │
├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
│ query_router_listener │ 3307 │ :: │ Running │ query_router_service │
└────────────────────────────┴──────┴──────┴─────────┴───────────────────────────┘
$ mariadb --host 192.0.2.10 --port 3307 \
--user app_user --password
CREATE TABLE test.load_balancing_test (
id INT PRIMARY KEY AUTO_INCREMENT,
hostname VARCHAR(256),
server_id INT
);
INSERT INTO test.load_balancing_test (hostname, server_id)
VALUES (@@global.hostname, @@global.server_id);
SELECT * FROM test.load_balancing_test;
+----+----------+-----------+
| id | hostname | server_id |
+----+----------+-----------+
| 1 | mcs1 | 1 |
| 2 | mcs1 | 1 |
| 3 | mcs1 | 1 |
+----+----------+-----------+
$ maxctrl list listeners
┌────────────────────────────┬──────┬──────┬─────────┬───────────────────────────┐
│ Name │ Port │ Host │ State │ Service │
├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
│ connection_router_listener │ 3308 │ :: │ Running │ connection_router_service │
├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
│ query_router_listener │ 3307 │ :: │ Running │ query_router_service │
└────────────────────────────┴──────┴──────┴─────────┴───────────────────────────┘
$ mariadb --host 192.0.2.10 --port 3307 \
--user app_user --password
SELECT @@global.hostname, @@global.server_id;
+-------------------+--------------------+
| @@global.hostname | @@global.server_id |
+-------------------+--------------------+
| mcs2 | 2 |
+-------------------+--------------------+
SELECT @@global.hostname, @@global.server_id;
+-------------------+--------------------+
| @@global.hostname | @@global.server_id |
+-------------------+--------------------+
| mcs3 | 3 |
+-------------------+--------------------+
$ sudo systemctl stop mariadb
$ sudo systemctl stop mariadb-columnstore
$ sudo systemctl stop mariadb-columnstore-cmapi
character_set_server: Set this system variable to utf8.
collation_server: Set this system variable to utf8_general_ci.
columnstore_use_import_for_batchinsert: Set this system variable to ALWAYS to always use cpimport for LOAD DATA INFILE and INSERT...SELECT statements.
gtid_strict_mode: Set this system variable to ON.
log_bin: Set this option to the file you want to use for the binary log. Setting this option enables binary logging.
log_bin_index: Set this option to the file you want to use to track binlog filenames.
log_slave_updates: Set this system variable to ON.
relay_log: Set this option to the file you want to use for the relay logs. Setting this option enables relay logging.
relay_log_index: Set this option to the file you want to use to index relay log filenames.
server_id: Sets the numeric server ID for this MariaDB Enterprise Server node. The value set on this option must be unique to each node.
[mariadb]
bind_address = 0.0.0.0
log_error = mariadbd.err
character_set_server = utf8
collation_server = utf8_general_ci
log_bin = mariadb-bin
log_bin_index = mariadb-bin.index
relay_log = mariadb-relay
relay_log_index = mariadb-relay.index
log_slave_updates = ON
gtid_strict_mode = ON
# This must be unique on each Enterprise ColumnStore node
server_id = 1
[ObjectStorage]
…
service = S3
…
[S3]
bucket = your_columnstore_bucket_name
endpoint = your_s3_endpoint
aws_access_key_id = your_s3_access_key_id
aws_secret_access_key = your_s3_secret_key
# iam_role_name = your_iam_role
# sts_region = your_sts_region
# sts_endpoint = your_sts_endpoint
# ec2_iam_mode = enabled
[Cache]
cache_size = your_local_cache_size
path = your_local_cache_path
$ sudo systemctl start mariadb
$ sudo systemctl enable mariadb
$ sudo systemctl stop mariadb-columnstore
$ sudo systemctl start mariadb-columnstore-cmapi
$ sudo systemctl enable mariadb-columnstore-cmapi
CREATE USER 'util_user'@'127.0.0.1'
IDENTIFIED BY 'util_user_passwd';
GRANT SELECT, PROCESS ON *.*
TO 'util_user'@'127.0.0.1';
$ sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
$ sudo mcsSetConfig CrossEngineSupport Port 3306
$ sudo mcsSetConfig CrossEngineSupport User util_user
$ sudo mcsSetConfig CrossEngineSupport Password util_user_passwd
CREATE USER 'repl'@'192.0.2.%' IDENTIFIED BY 'repl_passwd';
GRANT REPLICA MONITOR,
REPLICATION REPLICA,
REPLICATION REPLICA ADMIN,
REPLICATION MASTER ADMIN
ON *.* TO 'repl'@'192.0.2.%';
CREATE USER 'mxs'@'192.0.2.%'
IDENTIFIED BY 'mxs_passwd';
GRANT SHOW DATABASES ON *.* TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.columns_priv TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.db TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.procs_priv TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.proxies_priv TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.roles_mapping TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.tables_priv TO 'mxs'@'192.0.2.%';
GRANT SELECT ON mysql.user TO 'mxs'@'192.0.2.%';
GRANT BINLOG ADMIN,
READ_ONLY ADMIN,
RELOAD,
REPLICA MONITOR,
REPLICATION MASTER ADMIN,
REPLICATION REPLICA ADMIN,
REPLICATION REPLICA,
SHOW DATABASES,
SELECT
ON *.* TO 'mxs'@'192.0.2.%';
CHANGE MASTER TO
MASTER_HOST='192.0.2.1',
MASTER_USER='repl',
MASTER_PASSWORD='repl_passwd',
MASTER_USE_GTID=slave_pos;
START REPLICA;
SHOW REPLICA STATUS;
SET GLOBAL read_only=ON;
$ openssl rand -hex 32
93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd
$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":120, "node": "192.0.2.1"}' \
| jq .
{
"timestamp": "2020-10-28 00:39:14.672142",
"node_id": "192.0.2.1"
}
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
}
$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":120, "node": "192.0.2.2"}' \
| jq .
{
"timestamp": "2020-10-28 00:42:42.796050",
"node_id": "192.0.2.2"
}
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
},
"192.0.2.2": {
"timestamp": "2020-12-15 00:40:34.428554",
"uptime": 11437,
"dbrm_mode": "slave",
"cluster_mode": "readonly",
"dbroots": [
"2"
],
"module_id": 2,
"services": [
{
"name": "workernode",
"pid": 17789
},
{
"name": "PrimProc",
"pid": 17813
},
{
"name": "ExeMgr",
"pid": 17854
},
{
"name": "WriteEngine",
"pid": 17877
}
]
},
"192.0.2.3": {
"timestamp": "2020-12-15 00:40:34.428554",
"uptime": 11437,
"dbrm_mode": "slave",
"cluster_mode": "readonly",
"dbroots": [
"2"
],
"module_id": 2,
"services": [
{
"name": "workernode",
"pid": 17789
},
{
"name": "PrimProc",
"pid": 17813
},
{
"name": "ExeMgr",
"pid": 17854
},
{
"name": "WriteEngine",
"pid": 17877
}
]
},
"num_nodes": 3
}
$ sudo yum install policycoreutils policycoreutils-python
$ sudo yum install policycoreutils python3-policycoreutils policycoreutils-python-utils
$ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
$ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
Nothing to do
$ sudo semodule -i mariadb_local.pp
$ sudo setenforce enforcing
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=enforcing
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
$ sudo getenforce
Enforcing
$ sudo systemctl status firewalld
$ sudo systemctl start firewalld
$ sudo firewall-cmd --permanent --add-rich-rule='
rule family="ipv4"
source address="192.0.2.0/24"
destination address="192.0.2.0/24"
port port="3306" protocol="tcp"
accept'
$ sudo firewall-cmd --permanent --add-rich-rule='
rule family="ipv4"
source address="192.0.2.0/24"
destination address="192.0.2.0/24"
port port="8600-8630" protocol="tcp"
accept'
$ sudo firewall-cmd --permanent --add-rich-rule='
rule family="ipv4"
source address="192.0.2.0/24"
destination address="192.0.2.0/24"
port port="8640" protocol="tcp"
accept'
$ sudo firewall-cmd --permanent --add-rich-rule='
rule family="ipv4"
source address="192.0.2.0/24"
destination address="192.0.2.0/24"
port port="8700" protocol="tcp"
accept'
$ sudo firewall-cmd --permanent --add-rich-rule='
rule family="ipv4"
source address="192.0.2.0/24"
destination address="192.0.2.0/24"
port port="8800" protocol="tcp"
accept'
$ sudo firewall-cmd --reload
$ sudo ufw status verbose
$ sudo ufw enable
$ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 3306 proto tcp
$ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8600:8630 proto tcp
$ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8640 proto tcp
$ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8700 proto tcp
$ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8800 proto tcp
$ sudo ufw reload









MariaDB ColumnStore enhances MariaDB Enterprise Server with a columnar engine for OLAP and HTAP workloads, using MPP for scalability. It supports cross-engine JOINs, integrates with S3 storage, and provides high-speed bulk loading with multi-node management via REST API.
MariaDB ColumnStore is a columnar storage engine designed for distributed, massively parallel processing (MPP), such as for big data analysis. Deployments can be composed of several MariaDB servers, or just one, each running several subprocesses that work together to provide linear scalability and exceptional performance with real-time response to analytical queries.
It provides a highly available, fault tolerant, and performant columnar storage engine for MariaDB Enterprise Server. MariaDB Enterprise ColumnStore is designed for data warehousing, decision support systems (DSS), online analytical processing (OLAP), and hybrid transactional-analytical processing (HTAP).
Columnar storage engine that enables MariaDB Enterprise Server to perform new workloads
Optimized for online analytical processing (OLAP) workloads, including data warehousing, decision support systems, and business intelligence
Single-stack solution for hybrid transactional-analytical workloads to eliminate barriers and prevent data silos
Implements cross-engine JOINs to join Enterprise ColumnStore tables with tables using row-based storage engines, such as InnoDB
Smart storage engine that plans and optimizes its own queries using a custom select handler
Scalable query execution using massively parallel processing (MPP) strategies, parallel query execution, and distributed function evaluation
S3-compatible object storage can be used for highly available, low-cost, multi-regional, resilient, scalable, secure, and virtually limitless data storage
High availability and automatic failover by leveraging MariaDB MaxScale
REST API for multi-node administration with the Cluster Management API (CMAPI) server
Connectors for popular BI platforms such as Microsoft Power BI and Tableau
High-speed bulk data loading that bypasses the SQL layer and does not block concurrent read-only queries
MariaDB Enterprise ColumnStore supports multiple topologies. Several options are described below. MariaDB Enterprise ColumnStore can be deployed in other topologies. The topologies on this page are representative of basic product capabilities.
MariaDB products can be deployed to form other topologies that leverage advanced product capabilities and combine the capabilities of multiple topologies.
The MariaDB Enterprise ColumnStore topology with Object Storage delivers production analytics with high availability, fault tolerance, and limitless data storage by leveraging S3-compatible storage.
The topology consists of:
One or more MaxScale nodes
An odd number of ColumnStore nodes (minimum of 3) running ES, Enterprise ColumnStore, and CMAPI
The MaxScale nodes:
Monitor the health and availability of each ColumnStore node using the MariaDB Monitor (mariadbmon)
Accept client and application connections
Route queries to ColumnStore nodes using the Read/Write Split Router (readwritesplit)
The ColumnStore nodes:
Receive queries from MaxScale
Execute queries
Use S3-compatible object storage for data
Use Shared Local Storage for the Storage Manager directory.
The MariaDB Enterprise ColumnStore topology with Shared Local Storage delivers production analytics with high availability and fault tolerance by leveraging shared local storage, such as NFS.
The topology consists of:
One or more MaxScale nodes
An odd number of ColumnStore nodes (minimum of 3) running ES, Enterprise ColumnStore, and CMAPI
The MaxScale nodes:
Monitor the health and availability of each ColumnStore node using the MariaDB Monitor (mariadbmon)
Accept client and application connections
Route queries to ColumnStore nodes using the Read/Write Split Router (readwritesplit)
The ColumnStore nodes:
Receive queries from MaxScale
Execute queries
Use Shared Local Storage for the DB Root directories.
These topologies are built from the following components:
MariaDB Enterprise ColumnStore: columnar storage engine; performs query execution and data storage.
MariaDB Enterprise Server: enterprise-grade database server.
ColumnStore storage engine plugin: integrates MariaDB Enterprise ColumnStore into MariaDB Enterprise Server.
Cluster Management API (CMAPI): REST API used for administrative tasks.
MariaDB MaxScale: database proxy that accepts connections, routes queries, and performs auto-failover.
MariaDB Enterprise ColumnStore is the columnar storage engine that handles data storage and query optimization/execution.
MariaDB Enterprise ColumnStore is a columnar storage engine that is optimized for analytical or online analytical processing (OLAP) workloads, data warehousing, and DSS. MariaDB Enterprise ColumnStore can be used for hybrid transactional-analytical processing (HTAP) workloads when paired with a row-based storage engine, like InnoDB.
MariaDB Enterprise ColumnStore is built on top of MariaDB Enterprise Server. MariaDB Enterprise ColumnStore 5 is included with the standard MariaDB Enterprise Server 10.5 releases, while MariaDB Enterprise ColumnStore 6 is included with the standard MariaDB Enterprise Server 10.6 releases.
Enterprise ColumnStore interfaces with the Enterprise Server SQL engine through the ColumnStore storage engine plugin.
MariaDB has been continually improving the integration of MariaDB Enterprise ColumnStore with MariaDB Enterprise Server:
Early MariaDB ColumnStore releases required special custom-built releases of MariaDB Server.
MariaDB Enterprise ColumnStore 5 was first included with MariaDB Enterprise Server in ES 10.5.5-3. It was the first release to replace the Operations/Administration/Maintenance (OAM) API with the more modern Cluster Management API (CMAPI), which is still in use.
Starting with ES 10.5.6-4, MariaDB Enterprise ColumnStore is included with the standard MariaDB Enterprise Server 10.5 releases.
MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.
The ColumnStore storage engine plugin is a smart storage engine that implements a custom select handler to fully take advantage of Enterprise ColumnStore's capabilities, such as:
Using a custom query planner
Selecting data by column instead of by row
Parallel query evaluation
Distributed aggregations
Distributed functions
Extent elimination
As a smart storage engine, the ColumnStore storage engine plugin tightly integrates Enterprise ColumnStore with ES, but it has enough independence to efficiently execute analytical queries using a completely unique approach.
For additional information, see "ColumnStore Storage Engine".
The Cluster Management API (CMAPI) server provides a REST API that can be used to configure and manage Enterprise ColumnStore.
CMAPI must run on every ColumnStore node in a multi-node deployment but is not required in a single-node deployment.
The REST API can be used to perform multiple operations (an example follows this list):
Add ColumnStore nodes
Remove ColumnStore nodes
Start Enterprise ColumnStore
Shut down Enterprise ColumnStore
Check the status of Enterprise ColumnStore
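For example, the status check can be performed with curl against the CMAPI port (8640), reusing the API key created when the cluster was configured, as shown earlier in this document:
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .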
MariaDB Enterprise ColumnStore leverages MariaDB MaxScale as an advanced database proxy and query router.
Multi-node Enterprise ColumnStore deployments must have one or more MaxScale nodes. MaxScale performs many different roles:
Routing write queries to the primary server
Load balancing read queries on replica servers
Monitoring node health
Performing automatic failover if a node fails
MariaDB Enterprise ColumnStore's storage architecture provides a columnar storage engine with high availability, fault tolerance, compression, and automatic partitioning for production analytics and data warehousing.
For additional information, see "MariaDB Enterprise ColumnStore and ColumnStore Storage Architecture".
MariaDB Enterprise ColumnStore is a columnar storage engine for MariaDB Enterprise Server. MariaDB Enterprise ColumnStore enables ES to perform analytical workloads, including online analytical processing (OLAP), data warehousing, decision support systems (DSS), and hybrid transactional-analytical processing (HTAP) workloads.
Most traditional relational databases use row-based storage engines. In row-based storage engines, all columns for a table are stored contiguously. Row-based storage engines perform very well for transactional workloads but are less performant for analytical workloads.
Columnar storage engines store each column separately. Columnar storage engines perform very well for analytical workloads. Analytical workloads are characterized by ad hoc queries on very large data sets by relatively few users.
MariaDB Enterprise ColumnStore automatically partitions each column into extents, which helps improve query performance without using indexes.
MariaDB Enterprise ColumnStore supports S3-compatible object storage.
S3-compatible object storage is optional, but highly recommended. If S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use Shared Local Storage (such as NFS) for high availability.
S3-compatible object storage is:
Compatible: Many object storage services are compatible with the Amazon S3 API.
Economical: S3-compatible object storage is often very low cost.
Flexible: S3-compatible object storage is available for both cloud and on-premises deployments.
Limitless: S3-compatible object storage is often virtually limitless.
Resilient: S3-compatible object storage is often low maintenance and highly available, since many services use resilient cloud infrastructure.
Scalable: S3-compatible object storage is often highly optimized for read and write scaling.
Secure: S3-compatible object storage is often encrypted-at-rest.
Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.
If you have any questions about using specific S3-compatible object storage with MariaDB Enterprise ColumnStore, contact us.
MariaDB Enterprise ColumnStore can use shared local storage.
Shared local storage is required for high availability. The specific Shared Local Storage requirements depend on whether Enterprise ColumnStore is configured to use S3-compatible object storage:
When S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use shared local storage for high availability.
When S3-compatible object storage is not used, Enterprise ColumnStore requires the DB Root directories to use shared local storage for high availability.
The most common shared local storage options for on-premises and cloud deployments are:
NFS (Network File System)
GlusterFS
The most common shared local storage options for AWS (Amazon Web Services) deployments are:
EBS (Elastic Block Store) Multi-Attach
EFS (Elastic File System)
The most common shared local storage option for GCP (Google Cloud Platform) deployments is:
Filestore
MariaDB Enterprise ColumnStore uses distributed query execution and massively parallel processing (MPP) techniques to achieve vertical and horizontal scalability for production analytics and data warehousing.
For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation".
MariaDB Enterprise ColumnStore uses extent elimination to scale query evaluation as the table size increases.
Most databases are row-based, utilizing manually created indexes to achieve high performance on large tables. This works well for transactional workloads. However, analytical queries tend to have very low selectivity, so traditional indexes are not typically effective for analytical queries.
Enterprise ColumnStore uses extent elimination to achieve high performance, without requiring manually created indexes. Enterprise ColumnStore automatically partitions all data into extents. Enterprise ColumnStore stores the minimum and maximum values for each extent in the extent map. Enterprise ColumnStore uses the minimum and maximum values in the extent map to perform extent elimination.
When Enterprise ColumnStore performs extent elimination, it compares the query's join conditions and filter conditions (i.e., WHERE clause) to the minimum and maximum values for each extent in the extent map. If the extent's minimum and maximum values fall outside the bounds of the query's conditions, Enterprise ColumnStore skips that extent for the query.
Extent elimination is automatically performed for every query. It can significantly decrease I/O for columns with clustered values. For example, extent elimination works effectively for series, ordered, patterned, and time-based data.
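As a simple illustration (the table, column, and date values here are hypothetical), if an extent of the order_date column has a stored minimum of 2023-01-01 and maximum of 2023-01-31, the following query never reads that extent, because the extent's entire range falls outside the filter condition:
SELECT COUNT(*) FROM orders WHERE order_date >= '2023-02-01';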
The ColumnStore storage engine plugin implements a custom select handler to fully take advantage of Enterprise ColumnStore's capabilities.
All storage engines interact with ES using an internal handler API, which is highly extensible. Storage engines can implement different features by implementing different methods within the handler API.
For select statements, the handler API transforms each query into a SELECT_LEX object, which is provided to the select handler.
The generic select handler is not optimal for Enterprise ColumnStore, because:
Enterprise ColumnStore selects data by column, but the generic select handler selects data by row.
Enterprise ColumnStore supports parallel query evaluation, but the generic select handler does not.
Enterprise ColumnStore supports distributed aggregations, but the generic select handler does not.
Enterprise ColumnStore supports distributed functions, but the generic select handler does not.
Enterprise ColumnStore supports extent elimination, but the generic select handler does not.
Enterprise ColumnStore has its own query planner, but the generic select handler cannot use it.
The ColumnStore storage engine plugin is known as a smart storage engine because it implements a custom select handler.
As a smart storage engine, the ColumnStore storage engine plugin tightly integrates Enterprise ColumnStore with ES, but it has enough independence to efficiently execute analytical queries using a completely unique approach.
Because the ColumnStore storage engine plugin is a smart storage engine, MariaDB Enterprise ColumnStore plans its own queries using the custom select handler.
MariaDB Enterprise ColumnStore's query planning is divided into two steps:
ES provides the query's SELECT_LEX object to the custom select handler. The custom select handler builds a ColumnStore Execution Plan (CSEP).
The custom select handler provides the CSEP to the ExeMgr process on the same node. The ExeMgr process performs extent elimination and creates a job list.
When Enterprise ColumnStore executes a query, the ExeMgr process on the initiator/aggregator node translates the ColumnStore execution plan (CSEP) into a job list. A job list is a sequence of job steps.
Enterprise ColumnStore uses many different types of job steps that provide different scalability benefits:
Some types of job steps perform operations in a distributed manner using multiple nodes to operate on different extents. Distributed operations provide horizontal scalability.
Some types of job steps perform operations in a multi-threaded manner using a thread pool. Performing multi-threaded operations provides vertical scalability.
As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute job steps.
MariaDB Enterprise ColumnStore leverages common technologies to provide highly available production analytics with automatic failover:
S3-compatible object storage (optional): HA for data.
Shared local storage: with S3, HA for the Storage Manager directory; without S3, HA for the DB Root directories.
MariaDB Replication: schema replication (ColumnStore tables), schema and data replication (non-ColumnStore tables), and database object replication.
MariaDB MaxScale: monitoring, automatic failover, and load balancing.
Cluster Management API (CMAPI): REST API for administration, including adding and removing nodes.
MariaDB Enterprise ColumnStore can use shared local storage.
Shared local storage is required for high availability. The specific shared local storage requirements depend on whether Enterprise ColumnStore is configured to use S3-compatible object storage:
When S3-compatible object storage is used, Enterprise ColumnStore requires the Storage Manager directory to use shared local storage for high availability.
When S3-compatible object storage is not used, Enterprise ColumnStore requires the DB Root directories to use shared local storage for high availability.
The most common shared local storage options for on-premises and cloud deployments are:
NFS (Network File System)
GlusterFS
The most common shared local storage options for AWS (Amazon Web Services) deployments are:
EBS (Elastic Block Store) Multi-Attach
EFS (Elastic File System)
The most common shared local storage option for GCP (Google Cloud Platform) deployments is:
Filestore
MariaDB Enterprise ColumnStore requires MariaDB Replication to synchronize various database objects on multiple nodes for high availability.
MariaDB replication synchronizes:
The schemas for all ColumnStore tables on all nodes
The schemas and data for all non-ColumnStore tables on all nodes
All other database objects (stored procedures, stored functions, user accounts, and other objects) on all nodes
MariaDB Enterprise ColumnStore requires MariaDB MaxScale to achieve high availability, automatic failover, and load balancing.
MariaDB Monitor (mariadbmon) in MaxScale monitors the health of each Enterprise ColumnStore node.
MaxScale provides load balancing by routing queries and/or connections to healthy nodes by:
Providing query-based routing using Read/Write Split Router (readwritesplit).
Providing connection-based routing using Read Connection Router (readconnroute).
When MaxScale's MariaDB Monitor detects that the primary node has failed, MariaDB Monitor performs automatic failover (see the example after this list) by:
Promoting a replica node to become the new primary node.
Re-configuring all replica nodes to replicate from the new primary node.
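One way to observe these state changes is MaxScale's maxctrl utility (a sketch; it assumes maxctrl is available on the MaxScale node):

$ maxctrl list servers

The output lists each server with its current state (for example, Master, Slave, or Down), reflecting the roles that MariaDB Monitor assigns during failover.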
MariaDB Enterprise ColumnStore requires the Cluster Management API (CMAPI) Server for high availability.
The CMAPI server provides a REST API that can be used to manage and configure Enterprise ColumnStore.
The CMAPI server has a role in automatic failover. After MaxScale performs automatic failover, the CMAPI server detects the topology change and automatically re-configures the roles of each Enterprise ColumnStore node.
MariaDB Enterprise ColumnStore performs bulk data loads very efficiently using a variety of mechanisms including the cpimport tool, specialized handling of certain SQL statements, and minimal locking during data import.
For additional information, see "MariaDB Enterprise ColumnStore Data Loading".
MariaDB Enterprise ColumnStore includes a bulk data loading tool called cpimport, which provides several benefits (see the example following this list):
Bypasses the SQL layer to decrease overhead
Does not block read queries
Requires a write metadata lock on the table, which can be monitored with the METADATA_LOCK_INFO plugin.
Appends the new data to the table. While the bulk load is in progress, the newly appended data is temporarily hidden from queries. After the bulk load is complete, the newly appended data is visible to queries.
Inserts each row in the order the rows are read from the source file. Users can optimize data loads for Enterprise ColumnStore's automatic partitioning by loading presorted data files. For additional information, see "Load Ordered Data in Proper Order".
Supports parallel distributed bulk loads
Imports data from text files
Imports data from binary files
Imports data from standard input (stdin)
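As a minimal sketch (the database, table, and file names are hypothetical), a pipe-delimited text file can be loaded into an existing ColumnStore table with:

$ cpimport -s '|' analytics orders /data/orders.tbl

The -s option sets the field delimiter; cpimport appends the rows to the table in the order they are read from the file.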
MariaDB Enterprise ColumnStore enables batch insert mode by default.
When batch insert mode is enabled, MariaDB Enterprise ColumnStore has special handling for the following statements:
LOAD DATA [LOCAL] INFILE
INSERT INTO ... SELECT
Enterprise ColumnStore uses the following rules:
If the statement is executed outside of a transaction, Enterprise ColumnStore loads the data using cpimport, which is a command-line utility that is designed to efficiently load data in bulk. Enterprise ColumnStore executes cpimport using a wrapper called cpimport.bin.
If the statement is executed inside of a transaction, Enterprise ColumnStore loads the data using the DML interface, which is slower.
Batch insert mode can be disabled by setting the columnstore_use_import_for_batchinsert system variable. When batch insert mode is disabled, Enterprise ColumnStore executes the statements using the DML interface, which is slower.
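For example, batch insert mode can be disabled for the current session as follows (the permitted values for this variable vary between Enterprise ColumnStore releases, so treat this as a sketch):

$ mariadb -e "SET SESSION columnstore_use_import_for_batchinsert = OFF; SHOW SESSION VARIABLES LIKE 'columnstore_use_import_for_batchinsert';"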
MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.
When a bulk data load is running:
Read queries will not be blocked.
Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.
The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.
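For example, with the METADATA_LOCK_INFO plugin installed, the metadata locks held during a bulk load can be inspected from another session:

$ mariadb -e "INSTALL SONAME 'metadata_lock_info';"
$ mariadb -e "SELECT * FROM information_schema.METADATA_LOCK_INFO;"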
MariaDB Enterprise ColumnStore supports backup and restore using well-known tools and methods.
S3 snapshot
File system snapshot
File copy
MariaDB Enterprise Backup
For additional information, see "MariaDB Enterprise ColumnStore Backup and Restore".
MariaDB Enterprise ColumnStore can leverage S3 snapshots to back up S3-compatible object storage when it is used for Enterprise ColumnStore's data.
The S3-compatible object storage can be backed up by performing the following steps, as sketched below:
Locking the database on the primary node
Performing an S3 snapshot using the vendor's standard snapshot functionality
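As an illustrative sketch for AWS (the bucket names are hypothetical, and other vendors provide their own snapshot tooling), the bucket contents could be copied with the AWS CLI while the database is locked:

$ aws s3 sync s3://columnstore-data s3://columnstore-data-backup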
MariaDB Enterprise ColumnStore can leverage file system snapshots or file copy tools (such as rsync) to back up shared local storage when it is used for the Storage Manager directory or the DB Root directories.
The shared local storage can be backed up by performing the following steps, as sketched below:
Locking the database on the primary node
Performing a file system snapshot or using a file copy tool (such as rsync) to copy the contents of the Storage Manager directory or the DB Root directories.
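A minimal sketch, assuming the Storage Manager directory is at its default path and that the lock is held in a separate session for the duration of the copy (the backup host and destination path are hypothetical):

# Session 1: lock the database on the primary node and keep the session open.
$ mariadb -e "FLUSH TABLES WITH READ LOCK; DO SLEEP(600);"

# Session 2: copy the Storage Manager directory while the lock is held.
$ rsync -a /var/lib/columnstore/storagemanager/ backup-host:/backups/storagemanager/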
MariaDB Enterprise ColumnStore can leverage MariaDB Enterprise Backup to back up the Enterprise Server data directory.
The backup contains:
All ColumnStore schemas
All non-ColumnStore schemas and data
All other database objects
It does not contain:
ColumnStore data
This guide provides steps for deploying a multi-node ColumnStore, setting up the environment, installing the software, and bulk importing data for online analytical processing (OLAP) workloads.
This procedure describes the deployment of the ColumnStore Shared Local Storage topology with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5.
MariaDB Enterprise ColumnStore 5 is a columnar storage engine for MariaDB Enterprise Server 10.5. Enterprise ColumnStore is suitable for Online Analytical Processing (OLAP) workloads.
This procedure has 9 steps, which are executed in sequence.
This procedure represents basic product capability and deploys 3 Enterprise ColumnStore nodes and 1 MaxScale node.
This page provides an overview of the topology, requirements, and deployment procedures.
Please read and understand this procedure before executing.
Customers can obtain support by submitting a support case.
The following components are deployed during this procedure:
The MariaDB Enterprise ColumnStore topology with Object Storage delivers production analytics with high availability, fault tolerance, and limitless data storage by leveraging S3-compatible storage.
The topology consists of:
One or more MaxScale nodes
An odd number of ColumnStore nodes (minimum of 3) running ES, Enterprise ColumnStore, and CMAPI
The MaxScale nodes:
Monitor the health and availability of each ColumnStore node using the MariaDB Monitor (mariadbmon)
Accept client and application connections
Route queries to ColumnStore nodes using the Read/Write Split Router (readwritesplit)
The ColumnStore nodes:
Receive queries from MaxScale
Execute queries
Use S3-compatible object storage for data
Use shared local storage for the Storage Manager directory
These requirements are for the ColumnStore Object Storage topology when deployed with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5.
Node Count
Operating System
Minimum Hardware Requirements
Recommended Hardware Requirements
Storage Requirements
S3-Compatible Object Storage Requirements
Preferred Object Storage Providers: Cloud
Preferred Object Storage Providers: Hardware
Shared Local Storage Directories
Shared Local Storage Options
Recommended Storage Options
MaxScale nodes: 1 or more are required.
Enterprise ColumnStore nodes: 3 or more are required for high availability. Always use an odd number of nodes in a multi-node ColumnStore deployment to avoid split-brain scenarios.
In alignment with the enterprise lifecycle, the ColumnStore Object Storage topology with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5 is provided for:
CentOS Linux 7 (x86_64)
Debian 10 (x86_64)
Red Hat Enterprise Linux 7 (x86_64)
Red Hat Enterprise Linux 8 (x86_64)
Ubuntu 18.04 LTS (x86_64)
Ubuntu 20.04 LTS (x86_64)
MariaDB Enterprise ColumnStore's minimum hardware requirements are not intended for production environments, but the minimum hardware requirements can be appropriate for development and test environments. For production environments, see the recommended hardware requirements instead.
The minimum hardware requirements are:
MariaDB Enterprise ColumnStore will refuse to start if the system has less than 3 GB of memory.
If Enterprise ColumnStore is started on a system with less memory, the following error message will be written to the ColumnStore system log called crit.log:
And the following error message will be raised to the client:
MariaDB Enterprise ColumnStore's recommended hardware requirements are intended for production analytics.
The recommended hardware requirements are:
The ColumnStore Object Storage topology requires the following storage types:
The ColumnStore Object Storage topology uses shared local storage for the Storage Manager directory to store metadata.
The Storage Manager directory is located at the following path by default:
/var/lib/columnstore/storagemanager
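For example, a hypothetical /etc/fstab entry that mounts an NFS export at the default Storage Manager path (the server name and export path are assumptions; the sync option is discussed in the NFS storage notes later on this page):

nfs-server:/exports/storagemanager  /var/lib/columnstore/storagemanager  nfs  sync,hard  0 0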
The most common shared local storage options for the ColumnStore Object Storage topology are described in the Shared Local Storage Options table later on this page.
Enterprise ColumnStore's CMAPI (Cluster Management API) is a REST API that can be used to manage a multi-node Enterprise ColumnStore cluster.
Many tools are capable of interacting with REST APIs. For example, the curl utility could be used to make REST API calls from the command-line.
Many programming languages also have libraries for interacting with REST APIs.
The examples below show how to use the CMAPI with curl.
For example:
'x-api-key': '93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd'
'Content-Type': 'application/json'
x-api-key can be set to any value of your choice during the first call to the server. Subsequent connections will require this same key.
MariaDB Enterprise Server packages are configured to read configuration files from different paths, depending on the operating system. Making custom changes to Enterprise Server default configuration files is not recommended because custom changes may be overwritten by other default configuration files that are loaded later.
To ensure that your custom changes will be read last, create a custom configuration file with the z- prefix in one of the include directories.
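For example, a custom configuration file could be created as follows (the file name and the variable set inside it are illustrative; the path shown is for CentOS/RHEL, and the distribution-specific paths are listed later on this page):

$ sudo tee /etc/my.cnf.d/z-custom-mariadb.cnf <<'EOF'
[mariadb]
log_error = mariadbd.err
EOF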
The systemctl command is used to start and stop the MariaDB Enterprise Server service.
For additional information, see "Start and Stop Services".
MariaDB Enterprise Server produces log data that can be helpful in problem diagnosis.
Log filenames and locations may be overridden in the server configuration. The default location of logs is the data directory. The data directory is specified by the datadir system variable.
The systemctl command is used to start and stop the ColumnStore service.
In the ColumnStore Object Storage topology, the mariadb-columnstore service should not be enabled. The CMAPI service restarts Enterprise ColumnStore as needed, so it does not need to start automatically upon reboot.
The systemctl command is used to start and stop the CMAPI service.
For additional information on endpoints, see "CMAPI".
MaxScale can be configured using several methods. These methods make use of MaxScale's REST API.
The procedure on these pages configures MaxScale using MaxCtrl.
The systemctl command is used to start and stop the MaxScale service.
For additional information, see "Start and Stop Services".
Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology":
Enterprise Server 10.5
Enterprise Server 10.6
Enterprise Server 11.4
Columnar storage engine with S3-compatible object storage
Highly available
Automatic failover via MaxScale and CMAPI
Scales read via MaxScale
Bulk data import
Enterprise Server 10.5, Enterprise ColumnStore 5, MaxScale 2.5
Enterprise Server 10.6, Enterprise ColumnStore 23.02, MaxScale 22.08
Prepare ColumnStore Nodes
Configure Shared Local Storage
Install MariaDB Enterprise Server
Start and Configure MariaDB Enterprise Server
Test MariaDB Enterprise Server
Install MariaDB MaxScale
Start and Configure MariaDB MaxScale
Test MariaDB MaxScale
Import Data
Modern SQL RDBMS with high availability, pluggable storage engines, hot online backups, and audit logging.
Database proxy that extends the availability, scalability, and security of MariaDB Enterprise Servers
Columnar storage engine
Highly available
Optimized for Online Analytical Processing (OLAP) workloads
Scalable query execution
Cluster Management API (CMAPI) provides a REST API for multi-node administration
Listener
Listens for client connections to MaxScale then passes them to the router service
MariaDB Monitor
Tracks changes in the state of MariaDB Enterprise Servers.
Read Connection Router
Routes connections from the listener to any available Enterprise ColumnStore node
Read/Write Split Router
Routes read operations from the listener to any available Enterprise ColumnStore node, and routes write operations from the listener to a specific server that MaxScale uses as the primary server
Server Module
Connection configuration in MaxScale to an Enterprise ColumnStore node
Minimum hardware requirements:
MaxScale node: 4+ cores, 4+ GB memory
Enterprise ColumnStore node: 4+ cores, 4+ GB memory

Error message written to crit.log:
Apr 30 21:54:35 a1ebc96a2519 PrimProc[1004]: 35.668435 |0|0|0| C 28 CAL0000: Error total memory available is less than 3GB.

Error message raised to the client:
ERROR 1815 (HY000): Internal error: System is not ready yet. Please try again.

Recommended hardware requirements:
MaxScale node: 8+ cores, 16+ GB memory
Enterprise ColumnStore node: 64+ cores, 128+ GB memory
The ColumnStore Object Storage topology uses shared local storage for the Storage Manager directory to store metadata.
EBS (Elastic Block Store) Multi-Attach
AWS
EBS is a high-performance block-storage service for AWS (Amazon Web Services).
EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.
For deployments in AWS, EBS Multi-Attach is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.
EFS (Elastic File System)
AWS
EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).
For deployments in AWS, EFS is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.
Filestore
GCP
Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).
For deployments in GCP, Filestore is the recommended option for the Storage Manager directory, and Google Object Storage (S3-compatible) is the recommended option for data.
GlusterFS
On-premises
GlusterFS is a distributed file system.
GlusterFS supports replication and failover.
NFS (Network File System)
On-premises
NFS is a distributed file system.
If NFS is used, the storage should be mounted with the sync option to ensure that each node flushes its changes immediately.
For on-premises deployments, NFS is the recommended option for the Storage Manager directory, and any S3-compatible storage is the recommended option for data.
https://{server}:{port}/cmapi/{version}/{route}/{command}

$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
   | jq .

$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/start \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
   --data '{"timeout":20}' \
   | jq .

$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/shutdown \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
   --data '{"timeout":20}' \
   | jq .

$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
   --data '{"timeout":20, "node": "192.0.2.2"}' \
   | jq .

$ curl -k -s -X DELETE https://mcs1:8640/cmapi/0.4.0/cluster/node \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
   --data '{"timeout":20, "node": "192.0.2.2"}' \
   | jq .

Configuration File
Configuration files (such as /etc/my.cnf) can be used to set system-variables and options. The server must be restarted to apply changes made to configuration files.
Command-line
The server can be started with command-line options that set system-variables and options.
SQL
Users can set system-variables that support dynamic changes on-the-fly using the SET statement.
CentOS
Red Hat Enterprise Linux (RHEL)
/etc/my.cnf.d/z-custom-mariadb.cnf
Debian
Ubuntu
/etc/mysql/mariadb.conf.d/z-custom-mariadb.cnf
Start
sudo systemctl start mariadb
Stop
sudo systemctl stop mariadb
Restart
sudo systemctl restart mariadb
Enable during startup
sudo systemctl enable mariadb
Disable during startup
sudo systemctl disable mariadb
Status
sudo systemctl status mariadb
Error log: <hostname>.err
Audit log: server_audit.log
Slow query log: <hostname>-slow.log
General query log: <hostname>.log
Binary log: <hostname>-bin
Start
sudo systemctl start mariadb-columnstore
Stop
sudo systemctl stop mariadb-columnstore
Restart
sudo systemctl restart mariadb-columnstore
Enable during startup
sudo systemctl enable mariadb-columnstore
Disable during startup
sudo systemctl disable mariadb-columnstore
Status
sudo systemctl status mariadb-columnstore
Start
sudo systemctl start mariadb-columnstore-cmapi
Stop
sudo systemctl stop mariadb-columnstore-cmapi
Restart
sudo systemctl restart mariadb-columnstore-cmapi
Enable during startup
sudo systemctl enable mariadb-columnstore-cmapi
Disable during startup
sudo systemctl disable mariadb-columnstore-cmapi
Status
sudo systemctl status mariadb-columnstore-cmapi
MaxCtrl is a command-line utility that performs administrative tasks through the REST API. See MaxCtrl Commands.
MaxGUI is a graphical utility that can perform administrative tasks through the REST API.
The REST API can be used directly. For example, the curl utility could be used to make REST API calls from the command-line. Many programming languages also have libraries to interact with REST APIs.
Start
sudo systemctl start maxscale
Stop
sudo systemctl stop maxscale
Restart
sudo systemctl restart maxscale
Enable during startup
sudo systemctl enable maxscale
Disable during startup
sudo systemctl disable maxscale
Status
sudo systemctl status maxscale
This guide provides steps for deploying a single-node S3 ColumnStore, setting up the environment, installing the software, and bulk importing data for online analytical processing (OLAP) workloads.
Enterprise Server 10.5
Enterprise Server 10.6
Enterprise Server 11.4
Columnar storage engine with S3-compatible object storage
Highly available
Automatic failover via MaxScale and CMAPI
Scales read via MaxScale
Bulk data import
Enterprise Server 10.5, Enterprise ColumnStore 5, MaxScale 2.5
Enterprise Server 10.6, Enterprise ColumnStore 23.02, MaxScale 22.08
This procedure describes the deployment of the ColumnStore Object Storage topology with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5.
MariaDB Enterprise ColumnStore 5 is a columnar storage engine for MariaDB Enterprise Server 10.5. Enterprise ColumnStore is suitable for Online Analytical Processing (OLAP) workloads.
This procedure has 9 steps, which are executed in sequence.
This procedure represents basic product capability and deploys 3 Enterprise ColumnStore nodes and 1 MaxScale node.
This page provides an overview of the topology, requirements, and deployment procedures.
Please read and understand this procedure before executing.
Prepare ColumnStore Nodes
Configure Shared Local Storage
Install MariaDB Enterprise Server
Start and Configure MariaDB Enterprise Server
Test MariaDB Enterprise Server
Install MariaDB MaxScale
Start and Configure MariaDB MaxScale
Test MariaDB MaxScale
Import Data
Customers can obtain support by submitting a support case.
The following components are deployed during this procedure:
Modern SQL RDBMS with high availability, pluggable storage engines, hot online backups, and audit logging.
Database proxy that extends the availability, scalability, and security of MariaDB Enterprise Servers
Columnar storage engine
Highly available
Optimized for Online Analytical Processing (OLAP) workloads
Scalable query execution
provides a REST API for multi-node administration
Listener
Listens for client connections to MaxScale then passes them to the router service
MariaDB Monitor
Tracks changes in the state of MariaDB Enterprise Servers.
Read Connection Router
Routes connections from the listener to any available Enterprise ColumnStore node
Read/Write Split Router
Routes read operations from the listener to any available Enterprise ColumnStore node, and routes write operations from the listener to a specific server that MaxScale uses as the primary server
Server Module
Connection configuration in MaxScale to an Enterprise ColumnStore node
The MariaDB Enterprise ColumnStore topology with Object Storage delivers production analytics with high availability, fault tolerance, and limitless data storage by leveraging S3-compatible storage.
The topology consists of:
One or more MaxScale nodes
An odd number of ColumnStore nodes (minimum of 3) running ES, Enterprise ColumnStore, and CMAPI
The MaxScale nodes:
Monitor the health and availability of each ColumnStore node using the MariaDB Monitor (mariadbmon)
Accept client and application connections
Route queries to ColumnStore nodes using the Read/Write Split Router (readwritesplit)
The ColumnStore nodes:
Receive queries from MaxScale
Execute queries
Use S3-compatible object storage for data
Use shared local storage for the Storage Manager directory
These requirements are for the ColumnStore Object Storage topology when deployed with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5.
Node Count
Operating System
Minimum Hardware Requirements
Recommended Hardware Requirements
Storage Requirements
S3-Compatible Object Storage Requirements
Preferred Object Storage Providers: Cloud
Preferred Object Storage Providers: Hardware
Shared Local Storage Directories
Shared Local Storage Options
MaxScale nodes: 1 or more are required.
Enterprise ColumnStore nodes: 3 or more are required for high availability. Always use an odd number of nodes in a multi-node ColumnStore deployment to avoid split-brain scenarios.
In alignment with the enterprise lifecycle, the ColumnStore Object Storage topology with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5 is provided for:
CentOS Linux 7 (x86_64)
Debian 10 (x86_64)
Red Hat Enterprise Linux 7 (x86_64)
Red Hat Enterprise Linux 8 (x86_64)
Ubuntu 18.04 LTS (x86_64)
Ubuntu 20.04 LTS (x86_64)
MariaDB Enterprise ColumnStore's minimum hardware requirements are not intended for production environments, but the minimum hardware requirements can be appropriate for development and test environments. For production environments, see the recommended hardware requirements instead.
The minimum hardware requirements are:
MaxScale node: 4+ cores, 4+ GB memory
Enterprise ColumnStore node: 4+ cores, 4+ GB memory
MariaDB Enterprise ColumnStore will refuse to start if the system has less than 3 GB of memory.
If Enterprise ColumnStore is started on a system with less memory, the following error message will be written to the ColumnStore system log called crit.log:
Apr 30 21:54:35 a1ebc96a2519 PrimProc[1004]: 35.668435 |0|0|0| C 28 CAL0000: Error total memory available is less than 3GB.

And the following error message will be raised to the client:

ERROR 1815 (HY000): Internal error: System is not ready yet. Please try again.

MariaDB Enterprise ColumnStore's recommended hardware requirements are intended for production analytics.
The recommended hardware requirements are:
MaxScale node: 8+ cores, 16+ GB memory
Enterprise ColumnStore node: 64+ cores, 128+ GB memory
The ColumnStore Object Storage topology requires the following storage types:
The ColumnStore Object Storage topology uses S3-compatible object storage to store data.
The ColumnStore Object Storage topology uses shared local storage for the Storage Manager directory to store metadata.
The ColumnStore Object Storage topology uses S3-compatible object storage to store data.
Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.
For the preferred S3-compatible object storage providers that provide cloud and hardware solutions, see the following sections:
The use of non-cloud and non-hardware providers is at your own risk.
If you have any questions about using specific S3-compatible object storage with MariaDB Enterprise ColumnStore, contact us.
Amazon Web Services (AWS) S3
Google Cloud Storage
Azure Storage
Alibaba Cloud Object Storage Service
Cloudian HyperStore
Cohesity S3
Dell EMC
IBM Cloud Object Storage
Seagate Lyve Rack
Quantum ActiveScale
The ColumnStore Object Storage topology uses shared local storage for the Storage Manager directory to store metadata.
The Storage Manager directory is located at the following path by default:
/var/lib/columnstore/storagemanager
The most common shared local storage options for the ColumnStore Object Storage topology are:
EBS (Elastic Block Store) Multi-Attach
AWS
EBS is a high-performance block-storage service for AWS (Amazon Web Services).
EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.
For deployments in AWS, EBS Multi-Attach is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.
EFS (Elastic File System)
AWS
EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).
For deployments in AWS, EFS is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.
Filestore
GCP
Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).
For deployments in GCP, Filestore is the recommended option for the Storage Manager directory, and Google Object Storage (S3-compatible) is the recommended option for data.
GlusterFS
On-premises
GlusterFS is a distributed file system.
GlusterFS supports replication and failover.
NFS (Network File System)
On-premises
NFS is a distributed file system.
If NFS is used, the storage should be mounted with the sync option to ensure that each node flushes its changes immediately.
For on-premises deployments, NFS is the recommended option for the Storage Manager directory, and any S3-compatible storage is the recommended option for data.
For best results, MariaDB Corporation recommends the following storage options:

AWS: Amazon S3 storage for data, with EBS Multi-Attach or EFS for the Storage Manager directory
GCP: Google Object Storage (S3-compatible) for data, with Filestore for the Storage Manager directory
On-premises: any S3-compatible object storage for data, with NFS for the Storage Manager directory
Enterprise ColumnStore's CMAPI (Cluster Management API) is a REST API that can be used to manage a multi-node Enterprise ColumnStore cluster.
Many tools are capable of interacting with REST APIs. For example, the curl utility could be used to make REST API calls from the command-line.
Many programming languages also have libraries for interacting with REST APIs.
The examples below show how to use the CMAPI with curl.
https://{server}:{port}/cmapi/{version}/{route}/{command}

For example:
https://mcs1:8640/cmapi/0.4.0/cluster/shutdown
https://mcs1:8640/cmapi/0.4.0/cluster/start
https://mcs1:8640/cmapi/0.4.0/cluster/status
With CMAPI 1.4 and later:
https://mcs1:8640/cmapi/0.4.0/cluster/node
With CMAPI 1.3 and earlier:
https://mcs1:8640/cmapi/0.4.0/cluster/add-node
https://mcs1:8640/cmapi/0.4.0/cluster/remove-node
'x-api-key': '93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd'
'Content-Type': 'application/json'
x-api-key can be set to any value of your choice during the first call to the server. Subsequent connections will require this same key.
The curl examples remain valid but are now considered legacy; equivalent mcs commands are shown alongside them below.
$ mcs cluster status
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .

$ mcs cluster start --timeout 20
$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/start \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":20}' \
| jq .

$ mcs cluster shutdown --timeout 20
$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/shutdown \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":20}' \
| jq .

With CMAPI 1.4 and later:
$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":20, "node": "192.0.2.2"}' \
| jq .

With CMAPI 1.3 and earlier:
$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/add-node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":20, "node": "192.0.2.2"}' \
| jq .

With CMAPI 1.4 and later:
$ curl -k -s -X DELETE https://mcs1:8640/cmapi/0.4.0/cluster/node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":20, "node": "192.0.2.2"}' \
| jq .

With CMAPI 1.3 and earlier:
$ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/remove-node \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
--data '{"timeout":20, "node": "192.0.2.2"}' \
| jq .

Configuration File
Configuration files (such as /etc/my.cnf) can be used to set system-variables and options. The server must be restarted to apply changes made to configuration files.
Command-line
The server can be started with command-line options that set system-variables and options.
SQL
Users can set system-variables that support dynamic changes on-the-fly using the SET statement.
MariaDB Enterprise Server packages are configured to read configuration files from different paths, depending on the operating system. Making custom changes to Enterprise Server default configuration files is not recommended because custom changes may be overwritten by other default configuration files that are loaded later.
To ensure that your custom changes will be read last, create a custom configuration file with the z- prefix in one of the include directories.
Distribution
Example Configuration File Path
CentOS
Red Hat Enterprise Linux (RHEL)
/etc/my.cnf.d/z-custom-mariadb.cnf
Debian
Ubuntu
/etc/mysql/mariadb.conf.d/z-custom-mariadb.cnf
The systemctl command is used to start and stop the MariaDB Enterprise Server service.
Start
sudo systemctl start mariadb
Stop
sudo systemctl stop mariadb
Restart
sudo systemctl restart mariadb
Enable during startup
sudo systemctl enable mariadb
Disable during startup
sudo systemctl disable mariadb
Status
sudo systemctl status mariadb
For additional information, see "Start and Stop Services".
MariaDB Enterprise Server produces log data that can be helpful in problem diagnosis.
Log filenames and locations may be overridden in the server configuration. The default location of logs is the data directory. The data directory is specified by the datadir system variable.
Error log: <hostname>.err
Audit log: server_audit.log
Slow query log: <hostname>-slow.log
General query log: <hostname>.log
Binary log: <hostname>-bin
The systemctl command is used to start and stop the ColumnStore service.
Start
sudo systemctl start mariadb-columnstore
Stop
sudo systemctl stop mariadb-columnstore
Restart
sudo systemctl restart mariadb-columnstore
Enable during startup
sudo systemctl enable mariadb-columnstore
Disable during startup
sudo systemctl disable mariadb-columnstore
Status
sudo systemctl status mariadb-columnstore
In the ColumnStore Object Storage topology, the mariadb-columnstore service should not be enabled. The CMAPI service restarts Enterprise ColumnStore as needed, so it does not need to start automatically upon reboot.
The systemctl command is used to start and stop the CMAPI service.
Start
sudo systemctl start mariadb-columnstore-cmapi
Stop
sudo systemctl stop mariadb-columnstore-cmapi
Restart
sudo systemctl restart mariadb-columnstore-cmapi
Enable during startup
sudo systemctl enable mariadb-columnstore-cmapi
Disable during startup
sudo systemctl disable mariadb-columnstore-cmapi
Status
sudo systemctl status mariadb-columnstore-cmapi
For additional information on endpoints, see "CMAPI".
MaxScale can be configured using several methods. These methods make use of MaxScale's REST API.
MaxCtrl is a command-line utility that performs administrative tasks through the REST API. See MaxCtrl Commands.
MaxGUI is a graphical utility that can perform administrative tasks through the REST API.
The REST API can be used directly. For example, the curl utility could be used to make REST API calls from the command-line. Many programming languages also have libraries to interact with REST APIs.
The procedure on these pages configures MaxScale using MaxCtrl.
The systemctl command is used to start and stop the MaxScale service.
Start
sudo systemctl start maxscale
Stop
sudo systemctl stop maxscale
Restart
sudo systemctl restart maxscale
Enable during startup
sudo systemctl enable maxscale
Disable during startup
sudo systemctl disable maxscale
Status
sudo systemctl status maxscale
For additional information, see "Start and Stop Services".
Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

