Quickstart Guides

MariaDB ColumnStore Quickstart Guides provide concise, Docker-friendly steps to quickly set up, configure, and explore the ColumnStore analytic engine.

MariaDB ColumnStore Guide
MariaDB ColumnStore Hardware Guide

MariaDB ColumnStore

Discover MariaDB ColumnStore, the powerful columnar storage engine for analytical workloads. Learn about its architecture, features, and how it enables high-performance data warehousing and analytics.

Quickstart Guides
ColumnStore Architecture
Managing ColumnStore
Security
Use Cases
High Availability
Clients & Tools
Tutorials
Reference

Installing ColumnStore

This section provides instructions for installing and configuring MariaDB ColumnStore. It covers various deployment scenarios, including single- and multi-node setups with both local and S3 storage.

MariaDB ColumnStore Hardware Guide
Single-Node Localstorage
Single-Node S3
Multi-Node S3

Managing ColumnStore

Managing MariaDB ColumnStore involves setup, configuration, and tools like mcsadmin and cpimport for efficient analytics.

Backup & Restore

MariaDB ColumnStore backup and restore manage distributed data using snapshots or tools such as mariadb-backup; restoration keeps the cluster in sync via cpimport or file-system recovery.

Managing ColumnStore Database Environment

Managing MariaDB ColumnStore means deploying its architecture, scaling modules, and maintaining performance through monitoring, optimization, and backups.

Node Maintenance for MariaDB Enterprise ColumnStore

High Availability

MariaDB ColumnStore ensures high availability with multi-node setups and shared storage, while MaxScale adds monitoring and failover for continuous analytics.

Security

MariaDB ColumnStore uses MariaDB Server’s security—encryption, access control, auditing, and firewall—for secure analytics.

Clients & Tools

MariaDB ColumnStore supports standard MariaDB tools, BI connectors (e.g., Tableau, Power BI), data ingestion (cpimport, Kafka), and REST APIs for admin.

StorageManager

The ColumnStore StorageManager handles data persistence for ColumnStore, including support for S3-compatible object storage backends.

ColumnStore System Databases

When using ColumnStore, MariaDB Server creates a series of system databases used for operational purposes.

Database             Description

calpontsys           Maintains table metadata about ColumnStore tables.

infinidb_querystats  Maintains information about query performance.

columnstore_info     Stored procedures used to retrieve information about ColumnStore usage.

ColumnStore Architecture

MariaDB ColumnStore uses a shared-nothing, distributed architecture with separate modules for SQL and storage, enabling scalable, high-performance analytics.

Topologies Overview
ColumnStore Architectural Overview
ColumnStore Storage Architecture
ColumnStore Storage Engine
ColumnStore Read Replicas
ColumnStore System Databases
ColumnStore Query Processing
MariaDB Enterprise ColumnStore Query Evaluation
MariaDB Enterprise ColumnStore Locking

Deployment

MariaDB ColumnStore Hardware Guide
Installing ColumnStore
Upgrading ColumnStore

Upgrading ColumnStore

Upgrade Multi-Node MariaDB Enterprise ColumnStore from 6 to 23.10
Major Release Upgrades for MariaDB Enterprise ColumnStore

Use Cases

MariaDB ColumnStore is ideal for real-time analytics and complex queries on large datasets across industries.

Query Tuning

MariaDB ColumnStore query tuning optimizes analytics using data types, joins, projection elimination, WHERE clauses, and EXPLAIN for performance insights.

Query Plans and Optimizer Trace

MariaDB ColumnStore's query plans and Optimizer Trace show how analytical queries run in parallel across its distributed, columnar architecture, aiding performance tuning.

Query Analysis
ColumnStore Information Schema

MariaDB ColumnStore Hardware Guide

This page details MariaDB ColumnStore hardware requirements (CPU, RAM, storage, and network).

Overview

MariaDB ColumnStore is designed for analytical workloads and scales linearly with hardware resources. While performance generally improves with more CPU cores, memory, and servers, understanding the minimum hardware specifications is crucial for successful deployment in both development and production environments.

MariaDB ColumnStore's performance directly benefits from additional hardware resources:

  • More CPU cores enable greater parallel processing, improving query processing time.

  • More memory allows for more data caching (reducing I/O), and more servers enable a larger distributed architecture.

  • HDDs vs. SSDs: SSDs don't deliver as much benefit as you might assume because ColumnStore is optimized towards block streaming, which usually performs well enough on HDDs.

  • Bare metal vs. virtual servers: Bare metal servers are recommended — they provide additional performance because ColumnStore can fully consume CPU cores and memory.

Minimum Hardware Recommendations

The specifications differentiate between a basic development environment and a production-ready setup:

For Development Environments

  • CPU: A minimum of 8 CPU cores.

  • Memory (RAM): A minimum of 32 GB.

  • Storage: Local disk storage is acceptable for development purposes.

For Production Environments

  • CPU: A minimum of 64 CPU cores.

    • This recommendation underscores the highly parallel nature of ColumnStore, which can effectively utilize a large number of cores for analytical processing.

  • Memory (RAM): A minimum of 128 GB.

    • Adequate memory is critical for caching data and intermediate results, directly impacting query performance.

  • Storage: StorageManager (S3) is recommended.

    • This implies leveraging cloud object storage (like AWS S3 or compatible services) for scalable and durable data persistence in production.

Network Interconnectivity

Network interconnectivity plays a role for multi-server deployments.

  • Minimum Network: A minimum of a 1 Gigabit (1G) network is recommended.

    • This facilitates efficient data transfer between nodes via TCP/IP for replication and query processing across the distributed architecture. For optimal performance in heavy-load scenarios, higher bandwidth (for instance, 10G or more) is highly beneficial.

Adhering to these minimum specifications will provide a baseline for ColumnStore functionality. For specific workload requirements, it's always advisable to conduct performance testing and scale hardware accordingly.

AWS Instance Sizes

For AWS, ColumnStore internal testing generally uses m4.4xlarge instance types as a cost-effective middle ground. The r4.8xlarge instance type has also been tested; it performs about twice as fast for about twice the price.

See Also

  • MariaDB ColumnStore Minimum Hardware Specification Documentation

  • MariaDB ColumnStore Overview

  • MariaDB documentation: MariaDB ColumnStore

ColumnStore Query Processing

Clients issue a query to the MariaDB Server, which has the ColumnStore storage engine installed. MariaDB Server parses the SQL, identifies the involved ColumnStore tables, and creates an initial logical query execution plan.

Using the ColumnStore storage engine interface (ha_columnstore), MariaDB Server converts involved table references into ColumnStore internal objects. These are then handed off to the ExeMgr, which is responsible for managing and orchestrating query execution across the cluster.

The ExeMgr analyzes the query plan and translates it into a distributed ColumnStore execution plan. It determines the necessary query steps and the execution order, including any required parallelization.

The ExeMgr then references the extent map to identify which PrimProc instances hold the relevant data segments. It applies extent elimination to exclude any PrimProc nodes whose extents do not match the query’s filter criteria.

The ExeMgr dispatches commands to the selected PrimProc instances to perform data block I/O operations.

The PrimProc components perform operations such as:

  • Predicate filtering

  • Join processing

  • Initial aggregation

  • Data retrieval from local disk or external storage (e.g., S3 or cloud object storage)

They then return intermediate result sets to the ExeMgr.

The ExeMgr handles:

  • Final-stage aggregation

  • Window function evaluation

  • Result-set sorting and shaping

The completed result set is returned to the MariaDB Server, which performs any remaining SQL operations like ORDER BY, LIMIT, or computed expressions in the SELECT list.

Finally, the MariaDB Server returns the result set to the client.

Switchover of the Primary Node

To switch over to a new primary node with Enterprise ColumnStore, perform the following procedure.

Performing Switchover in MaxScale

The primary node can be switched in MaxScale using :

  • Use or another supported REST client.

  • Call a module command using the call command command.

  • As the first argument, provide the name for the module, which is .

  • As the second argument, provide the module command, which is switchover.

  • As the third argument, provide the name of the monitor.

For example:

With the above syntax, MaxScale will choose the most up-to-date replica to be the new primary.

If you want to manually select a new primary, provide the server name of the new primary as the fourth argument:
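Put together — assuming the MariaDB Monitor module name (mariadbmon), a monitor named mcs_monitor, and a server named mcs2, all of which are hypothetical placeholders here — the MaxCtrl calls might look like:

    # Let MaxScale choose the most up-to-date replica as the new primary
    maxctrl call command mariadbmon switchover mcs_monitor

    # Or name the new primary explicitly as the fourth argument
    maxctrl call command mariadbmon switchover mcs_monitor mcs2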

Checking the Replication Status with MaxScale

MaxScale can check the replication status of the cluster:

  • List the servers using the list servers command, like this:

If switchover was properly performed, the State column of the new primary shows Master, Running.
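The status check itself is a single MaxCtrl command:

    maxctrl list servers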

ColumnStore Security Vulnerabilities

  1. About CVEs

  2. CVEs fixed in ColumnStore


This page is about security vulnerabilities that have been fixed for or still affect MariaDB ColumnStore. In addition, links are included to fixed security vulnerabilities in MariaDB Server since MariaDB ColumnStore is based on MariaDB Server.

Sensitive security issues can be sent directly to the persons responsible for MariaDB security: security [AT] mariadb (dot) org.

About CVEs

CVE® stands for "Common Vulnerabilities and Exposures". It is a publicly available and free-to-use database of known software vulnerabilities maintained at

CVEs fixed in ColumnStore

The release notes listed document the CVEs fixed within a given release. Additional information can also be found at Security Vulnerabilities Fixed in MariaDB.

There are no known CVEs on ColumnStore-specific infrastructure outside of the MariaDB server at this time.

ColumnStore Table Size Limitations

MariaDB ColumnStore has a hard limit of 4096 columns per table.

However, it's likely that you run into other limitations before hitting that limit, including:

  • Row size limit of tables. This varies depending on the storage engine you're using and indirectly limits the number of columns.

  • Size limit of .frm files. Those files hold the column description of tables. Column descriptions vary in length. Once all column descriptions combined reach a length of 64KB, the table's .frm file is full, limiting the number of columns you can have in a table.

Given that, the maximum number of columns a ColumnStore table can effectively have is around 2000 columns.

Execution Plan (CSEP)

Overview

The ColumnStore storage engine uses a ColumnStore Execution Plan (CSEP) to represent a query plan internally.

When the select handler receives the SELECT_LEX object, it transforms it into a CSEP as part of the query planning and optimization process. For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation."

Viewing the CSEP

The CSEP for a given query can be viewed by performing the following:

  1. Calling the calSetTrace(1) function:

  2. Executing the query:

  3. Calling the calGetTrace() function:
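For example, using a placeholder table columnstore_tab:

    SELECT calSetTrace(1);

    SELECT column1, column2
    FROM columnstore_tab
    WHERE column1 > '2020-04-01'
    AND column1 < '2020-11-01';

    SELECT calGetTrace();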

About MariaDB ColumnStore

MariaDB ColumnStore is a columnar storage engine that utilizes a massively parallel distributed data architecture. It's a columnar storage system built by porting InfiniDB 4.6.7 to MariaDB and released under the GPL license.

ColumnStore is available as a storage engine for MariaDB Server; earlier, it was available as a separate download.


Release notes and other documentation for ColumnStore are also available in the Enterprise docs section of the MariaDB website.

It is designed for big data, scaling to process petabytes of data with linear scalability and exceptional performance, delivering real-time response to analytical queries. It leverages the I/O benefits of columnar storage, compression, just-in-time projection, and horizontal and vertical partitioning to deliver tremendous performance when analyzing large data sets.

Links:

  • A Google Group exists for MariaDB ColumnStore that can be used to discuss ideas and issues and to communicate with the community: send email to mariadb-columnstore@googlegroups.com.

  • Bugs can be reported in MariaDB Jira. Please file bugs under the MCOL project and include relevant diagnostic output if possible.

MariaDB ColumnStore is released under the GPL license.

Collecting Statistics with ANALYZE TABLE

Overview

In MariaDB Enterprise ColumnStore 6, the ExeMgr process uses optimizer statistics in its query planning process.

ColumnStore uses the optimizer statistics to add support for queries that contain circular inner joins.

In Enterprise ColumnStore 5 and before, ColumnStore would raise the following error when a query containing a circular inner join was executed:

ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported.

The optimizer statistics store each column's NDV (Number of Distinct Values), which can help the ExeMgr process choose the optimal join order for queries with circular joins. When Enterprise ColumnStore executes a query with a circular join, the query's execution can take longer if ColumnStore chooses a sub-optimal join order. When you collect optimizer statistics for your ColumnStore tables, the ExeMgr process is less likely to choose a sub-optimal join order.

Enterprise ColumnStore's optimizer statistics can be collected for ColumnStore tables by executing ANALYZE TABLE:
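For example, using a placeholder table name:

    ANALYZE TABLE columnstore_tab;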

Enterprise ColumnStore's optimizer statistics are not updated automatically. To update the optimizer statistics for a ColumnStore table, ANALYZE TABLE must be re-executed.

Enterprise ColumnStore does not implement an interface to show optimizer statistics.

Sample storagemanager.cnf

# Sample storagemanager.cnf

[ObjectStorage]
service = S3
object_size = 5M
metadata_path = /var/lib/columnstore/storagemanager/metadata
journal_path = /var/lib/columnstore/storagemanager/journal
max_concurrent_downloads = 21
max_concurrent_uploads = 21
common_prefix_depth = 3

[S3]
region = us-west-1
bucket = my_columnstore_bucket
endpoint = s3.amazonaws.com
aws_access_key_id = AKIAR6P77BUKULIDIL55
aws_secret_access_key = F38aR4eLrgNSWPAKFDJLDAcax0gZ3kYblU79

[LocalStorage]
path = /var/lib/columnstore/storagemanager/fake-cloud
fake_latency = n
max_latency = 50000

[Cache]
cache_size = 2g
path = /var/lib/columnstore/storagemanager/cache

Note: A region is required even when using an on-premises solution like ActiveScale due to header expectations within the API.

Step 5: Bulk Import of Data

Overview

This page details step 5 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with local storage.

This step bulk imports data to Enterprise ColumnStore.

Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

Step 6: Install MariaDB MaxScale

Overview

This page details step 6 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

This step installs MariaDB MaxScale 22.08. ColumnStore Object Storage requires 1 or more MaxScale nodes.

Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

Step 3: Install MariaDB Enterprise Server

Overview

This page details step 3 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

This step installs MariaDB Enterprise Server, MariaDB Enterprise ColumnStore, CMAPI, and dependencies.

Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

View and Clear Table Locks

MariaDB Enterprise ColumnStore acquires table locks for some operations, and it provides utilities to view and clear those locks.

MariaDB Enterprise ColumnStore acquires table locks for some operations, such as:

  • DDL statements

  • DML statements

  • Bulk data loads

If an operation fails, the table lock does not always get released. If you try to access the table, you can see errors like the following:

Credentials Management

Overview

Starting with MariaDB Enterprise ColumnStore 6.2.3, ColumnStore supports encryption for user passwords stored in Columnstore.xml:

  • Encryption keys are created with the cskeys utility

Rejoining a Node

To rejoin a node with Enterprise ColumnStore, perform the following procedure.

Performing Rejoin in MaxScale

The node can be configured to rejoin in MaxScale using :

  • Use or another supported REST client.

MariaDB Enterprise ColumnStore Locking

Overview

MariaDB Enterprise ColumnStore minimizes locking for analytical workloads, bulk data loads, and online schema changes.

Lockless Reads

MariaDB Enterprise ColumnStore supports lockless reads.

Backup and Restore Overview

Overview

MariaDB Enterprise ColumnStore supports backup and restore.

System of Record

Before you determine a backup strategy for your Enterprise ColumnStore deployment, it is a good idea to determine the system of record for your Enterprise ColumnStore data.

Using StorageManager With IAM Role

AWS IAM Role Configuration

From ColumnStore 5.5.2, you can use AWS IAM roles to connect to S3 buckets without explicitly entering credentials into the storagemanager.cnf config file.

You need to modify the IAM role of your Amazon EC2 instance to allow for this. Please follow the AWS documentation before beginning this process.

It is important to note that you must update the AWS S3 endpoint based on your chosen region; otherwise, you might face delays in propagation.

For a complete list of AWS service endpoints, visit the AWS reference guide.

Data Ingestion Methods & Tools

Learn about data ingestion for MariaDB ColumnStore. This section covers various methods and tools for efficiently loading large datasets into your columnar database for analytical workloads.

ColumnStore provides several mechanisms to ingest data:

  • cpimport provides the fastest performance for inserting data and the ability to route data to particular PrimProc nodes. Normally, this should be the default choice for loading data.

  • LOAD DATA INFILE provides another means of bulk inserting data.

Certified S3 Object Storage Providers

Hardware (On Premises)

  • IBM Cloud Object Storage (formerly known as Cleversafe)

  • DELL EMC

  • Quantum ActiveScale

Cloud (IaaS)

  • AWS S3

  • Google GCS

Software-Based

Due to the frequent code changes and deviation from the AWS standards, none are approved at this time.

Import the Schema

    Before data can be imported into the tables, create a matching schema.

    On the primary server, create the schema:

    1. For each database that you are importing, create the database with the CREATE DATABASE statement:

    2. For each table that you are importing, create the table with the CREATE TABLE statement:
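As an illustrative sketch (database, table, and column names are placeholders), the schema might be created as:

    CREATE DATABASE inventory;

    CREATE TABLE inventory.products (
        product_name VARCHAR(64) NOT NULL,
        supplier VARCHAR(128) NOT NULL,
        quantity INT
    ) ENGINE=ColumnStore;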

Import the Data

    Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.

cpimport

    MariaDB Enterprise ColumnStore includes cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, run cpimport on the primary server:
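A hypothetical invocation (database, table, and file names are placeholders; -s sets the field delimiter):

    cpimport -s '\t' inventory products /tmp/products.tsv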

LOAD DATA INFILE

    When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, use the LOAD DATA INFILE statement on the primary server:
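For example (path and table names are placeholders; tab is the default field terminator):

    LOAD DATA INFILE '/tmp/products.tsv'
    INTO TABLE inventory.products;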

Import from Remote Database

    MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using the SELECT statement, and then pipe the results into cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a remote MariaDB database:
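One possible sketch (host, user, and object names are placeholders): query the remote table in batch mode, which emits tab-separated rows, and pipe them into cpimport:

    mariadb --quick --batch --skip-column-names \
      --host=remote-db.example.com --user=import_user -p \
      --execute="SELECT * FROM inventory.products" \
      | cpimport -s '\t' inventory products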

Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    This page was step 5 of 5.

    This procedure is complete.

Retrieve Customer Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to https://customers.mariadb.com/downloads/token/

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

Set Up Repository

    1. On the MaxScale node, install the prerequisites for downloading the software from the Web.

       Install on CentOS / RHEL (YUM):

       Install on Debian / Ubuntu (APT):

    2. On the MaxScale node, configure package repositories and specify MariaDB MaxScale 22.08:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

Install MaxScale

    On the MaxScale node, install MariaDB MaxScale.

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 6 of 9.

    Next: Step 7: Start and Configure MariaDB MaxScale.

Retrieve Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to https://customers.mariadb.com/downloads/token/

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

Set Up Repository

    1. On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web.

       Install on CentOS / RHEL (YUM):

       Install on Debian / Ubuntu (APT):

    2. On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

Install Enterprise Server and Enterprise ColumnStore

    1. On each Enterprise ColumnStore node, install additional dependencies:

       Install on CentOS and RHEL (YUM):

       Install on Debian 9 and Ubuntu 18.04 (APT):

       Install on Debian 10 and Ubuntu 20.04 (APT):

    2. On each Enterprise ColumnStore node, install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:

       Install on CentOS / RHEL (YUM):

       Install on Debian / Ubuntu (APT):

Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology".

    This page was step 3 of 9.

    Next: Step 4: Start and Configure MariaDB Enterprise Server.


    To solve this problem, MariaDB Enterprise ColumnStore provides two utilities to view and clear the table locks:

    • cleartablelock

    • viewtablelock

Viewing Table Locks

    The viewtablelock utility shows table locks currently held by MariaDB Enterprise ColumnStore:

    To view all table locks:

    To view table locks for a specific table, specify the database and table:

Clearing Table Locks

    The cleartablelock utility clears table locks currently held by MariaDB Enterprise ColumnStore.

    To clear a table lock, specify the lock ID shown by the viewtablelock utility:
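A hypothetical session (database, table, and lock ID are placeholders):

    # Show all table locks
    viewtablelock

    # Show locks for a specific table
    viewtablelock inventory products

    # Clear the lock with ID 42
    cleartablelock 42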

    ERROR 1815 (HY000): Internal error: CAL0009: Drop table failed due to IDB-2009: Unable to perform the drop table operation because cpimport with PID 16301 is currently holding the table lock for session -1.
  • Passwords are encrypted using the cspasswd utility

Compatibility

    • MariaDB Enterprise ColumnStore 6

    • MariaDB Enterprise ColumnStore 22.08

    • MariaDB Enterprise ColumnStore 23.02

Encryption Keys

    MariaDB Enterprise ColumnStore stores its password encryption keys in the plain-text file /var/lib/columnstore/.secrets.

    The encryption keys are not created by default, but can be generated by executing the cskeys utility:

    In a multi-node Enterprise ColumnStore cluster, every ColumnStore node should have the same encryption keys. Therefore, it is recommended to execute cskeys on the primary server and then copy /var/lib/columnstore/.secrets to every other ColumnStore node and fix the file's permissions:
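A hedged sketch of that distribution (the node name and the mysql user/group ownership are assumptions that may differ per install):

    # On the primary node, generate the encryption keys
    cskeys

    # Copy the key file to each other node, then fix ownership and permissions
    scp /var/lib/columnstore/.secrets mcs2:/var/lib/columnstore/.secrets
    ssh mcs2 'chown mysql:mysql /var/lib/columnstore/.secrets && chmod 0400 /var/lib/columnstore/.secrets'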

Encrypt a Password

    To encrypt a password:

    Generate an encrypted password using the cspasswd utility:

    • If the --interactive command-line option is specified, cspasswd prompts for the password.

    Set the encrypted password in Columnstore.xml using the mcsSetConfig utility:
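An illustrative sketch (the password is a placeholder, and CrossEngineSupport/Password is just one example of a parameter that stores an encrypted password):

    # Generate the encrypted form of a password
    cspasswd MyPassword123

    # Or prompt for the password instead of passing it on the command line
    cspasswd --interactive

    # Store the encrypted value in Columnstore.xml
    mcsSetConfig CrossEngineSupport Password <encrypted_password>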

Decrypt a Password

    To decrypt a password, execute the cspasswd utility and specify the --decrypt command-line option:
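For example (the encrypted string is a placeholder):

    cspasswd --decrypt <encrypted_password>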


  • Call a module command using the call command command.

  • As the first argument, provide the name for the module, which is .

  • As the second argument, provide the module command, which is rejoin.

  • As the third argument, provide the name of the monitor.

  • As the fourth argument, provide the name of the server.

  • For example:
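Assuming the module name mariadbmon, a monitor named mcs_monitor, and a server named mcs2 (all hypothetical placeholders), the call might look like:

    maxctrl call command mariadbmon rejoin mcs_monitor mcs2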

Checking Replication Status with MaxScale

    MaxScale can check the replication status of the cluster:

    • List the servers using the list servers command, like this:

    If the node properly rejoined, the State column of the node shows Slave, Running.

    maxctrl list servers

Locking for Writes

    MariaDB Enterprise ColumnStore requires a table lock for write operations.

Locking for Data Loading

    MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.

    When a bulk data load is running:

    • Read queries will not be blocked.

    • Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.

    • The write metadata lock (MDL) can be monitored with the .

    For additional information, see "MariaDB Enterprise ColumnStore Data Loading".

Online Schema Changes

    MariaDB Enterprise ColumnStore supports online schema changes, so that supported DDL operations can be performed without blocking reads. The supported DDL operations only require a write metadata lock (MDL) on the target table.


    A system of record is the authoritative data source for a given piece of information. Organizations often store duplicate information in several systems, but only a single system can be the authoritative data source.

    Enterprise ColumnStore is designed to handle analytical processing for OLAP, data warehousing, DSS, and hybrid workloads on very large data sets. Analytical processing does not generally happen on the system of record. Instead, analytical processing generally occurs on a specialized database that is loaded with data from the separate system of record. Additionally, very large data sets can be difficult to back up. Therefore, it may be beneficial to only backup the system of record.

    If Enterprise ColumnStore is not acting as the system of record for your data, you should determine how the system of record affects your backup plan:

    • If your system of record is another database server, you should ensure that the other database server is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the other database server.

    • If your system of record is a set of data files, you should ensure that the set of data files is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the set of data files.

Full Backup and Restore

    MariaDB Enterprise ColumnStore supports full backup and restore for all storage types. A full backup includes:

    • Enterprise ColumnStore's data and metadata

      • With S3: an S3 snapshot of the S3-compatible object storage, plus a file system snapshot or copy of the Storage Manager directory.

      • Without S3: a file system snapshot or copy of the DB Root directories.

    • The MariaDB data directory from the primary node

    To see the procedure to perform a full backup and restore, choose the storage type:

Sample Configuration

    Edit your Storage Manager configuration file, located at /etc/columnstore/storagemanager.cnf, so that it looks similar to the example below (replacing the values in the [S3] section with your own):

    Note: This is an AWS-only feature. For other deployment methods, see the corresponding documentation and reference guide.
    By default, with autocommit on, it internally streams the data to an instance of the cpimport process.

    • In transactional mode, DML inserts are performed instead, which are significantly slower and also consume both binlog transaction files and ColumnStore VersionBuffer files.

    • DML statements (INSERT, UPDATE, and DELETE) provide row-level changes. ColumnStore is optimized for bulk modifications, so these operations are slower than they would be in, for instance, InnoDB.

    • Currently, ColumnStore does not support operating as a replication replica target.

    • Bulk DML operations will in general perform better than multiple individual statements.

      • INSERT ... SELECT with autocommit behaves similarly to LOAD DATA INFILE because, internally, it is mapped to cpimport for higher performance.

      • Bulk update operations based on a join with a small staging table can be relatively fast, especially if updating a single column.

    • Using the ColumnStore Bulk Write SDK or the ColumnStore Streaming Data Adapters.

    • cpimport

    • LOAD DATA INFILE
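    To illustrate the staging-table pattern mentioned above, the following sketch applies a small set of changes to a ColumnStore fact table with a single join-based UPDATE. Table and column names here are hypothetical:

```sql
-- Hypothetical staging table: load the day's price corrections into a
-- small InnoDB table first.
CREATE TABLE staging_prices (
    product_id INT NOT NULL,
    new_price  DECIMAL(10,2) NOT NULL
) ENGINE=InnoDB;

-- One join-based UPDATE against the ColumnStore fact table is far
-- cheaper than issuing row-by-row updates:
UPDATE fact_sales f
JOIN staging_prices s ON f.product_id = s.product_id
SET f.unit_price = s.new_price;
```

    Updating a single column this way keeps the operation close to ColumnStore's bulk-oriented sweet spot.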

    • DELL EMC

    Cloud (IaaS)

    • AWS S3

    • Google GCS

    Software-Based

    Due to frequent code changes and deviations from the AWS standards, none are approved at this time.

    • Quantum ActiveScale

    • IBM Cloud Object Storage
    maxctrl list servers
    SELECT calSetTrace(1);
    SELECT column1, column2
    FROM columnstore_tab
    WHERE column1 > '2020-04-01'
    AND column1 < '2020-11-01';
    ANALYZE TABLE columnstore_tab;
    Security Vulnerabilities Fixed in MariaDB
    Deployment Instructions
    MariaDB ColumnStore Blogs
    forum interface
    jira.mariadb.org
    support utility

    Step 3: Install MariaDB Enterprise Server


    Overview

    This page details step 3 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step installs MariaDB Enterprise Server, MariaDB Enterprise ColumnStore, CMAPI, and dependencies.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Retrieve Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to https://customers.mariadb.com/downloads/token/

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

    Set Up Repository

    1. On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web. Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    2. On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.
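    The verification flow can be rehearsed with a stand-in file; in practice, substitute the real mariadb_es_repo_setup script and its published checksum:

```shell
# Create a stand-in for the downloaded script (illustrative only).
cd /tmp
printf 'example script contents\n' > mariadb_es_repo_setup

# Compute its SHA-256 and verify it the same way you would verify the
# real script against the published checksum (note the two spaces).
checksum=$(sha256sum mariadb_es_repo_setup | awk '{print $1}')
echo "${checksum}  mariadb_es_repo_setup" | sha256sum -c -
# prints: mariadb_es_repo_setup: OK
```

    If the checksum does not match, sha256sum reports FAILED and exits non-zero, and the script should not be run.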

    Install Enterprise Server and Enterprise ColumnStore

    1. On each Enterprise ColumnStore node, install additional dependencies:

    Install on CentOS and RHEL (YUM):

    Install on Debian 9 and Ubuntu 18.04 (APT)

    Install on Debian 10 and Ubuntu 20.04 (APT):

    2. On each Enterprise ColumnStore node, install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 3 of 9.

    Step 9: Import Data


    Overview

    This page details step 9 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step bulk imports data to Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Import the Schema

    Before data can be imported into the tables, create a matching schema.

    On the primary server, create the schema:

    1. For each database that you are importing, create the database with the CREATE DATABASE statement:

    2. For each table that you are importing, create the table with the CREATE TABLE statement:

    Import the Data

    Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.


    cpimport

    MariaDB Enterprise ColumnStore includes cpimport, which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, on the primary server run cpimport:
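    As a concrete sketch, a tab-separated input file for the inventory.products table used in this procedure can be prepared and sanity-checked like this (the sample rows are hypothetical):

```shell
# Two hypothetical rows matching inventory.products
# (product_name, supplier, quantity, unit_cost), tab-separated.
printf 'Widget\tAcme Corp\t100\t1.99\n'  > /tmp/inventory-products.tsv
printf 'Gadget\tGlobex\t50\t2.49\n'     >> /tmp/inventory-products.tsv

# Sanity check: each row should report 4 fields.
awk -F'\t' '{print NF}' /tmp/inventory-products.tsv
# prints: 4 for each row
```

    The file is then loaded as shown later in this procedure, for example: sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv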

    LOAD DATA INFILE

    When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using cpimport, which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, on the primary server use the LOAD DATA INFILE statement:

    Import from Remote Database

    MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using the SELECT statement, and then pipe the results into cpimport, which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a remote MariaDB database:

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 9 of 9.

    This procedure is complete.

    Step 6: Install MariaDB MaxScale


    Overview

    This page details step 6 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step installs MariaDB MaxScale 22.08.

    ColumnStore Object Storage requires 1 or more MaxScale nodes.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Retrieve Customer Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to https://customers.mariadb.com/downloads/token/

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

    Set Up Repository

    1. On the MaxScale node, install the prerequisites for downloading the software from the Web. Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    2. On the MaxScale node, configure package repositories and specify MariaDB MaxScale 22.08:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

    Install MaxScale

    On the MaxScale node, install MariaDB MaxScale.

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 6 of 9.



    Topologies Overview

    MariaDB offers varied deployment topologies by workload and technology, each named and diagrammed with benefits listed. Custom configurations are also supported.

    MariaDB products can be deployed in many different topologies. The topologies described in this section are representative of the overall structure. MariaDB products can be deployed to form other topologies, leverage advanced product capabilities, or combine the capabilities of multiple topologies.

    Topologies are the arrangements of nodes and links to achieve a purpose. This documentation describes a few of the many topologies that can be deployed using MariaDB database products.

    We group topologies by workload (transactional, analytical, or hybrid) and by technology (such as Enterprise Spider). Single-node topologies are listed separately.

    To help you select the correct topology:

    • Each topology is named, and this name is used consistently throughout the documentation.

    ColumnStore Storage Engine

    Overview

    MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.

    For deployment instructions and available documentation, see "MariaDB Enterprise ColumnStore."

    The ColumnStore storage engine has the following features:

    Feature
    Detail
    Resources

    Step 1: Prepare Systems for Enterprise ColumnStore Nodes


    Overview

    This page details step 1 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.

    This step prepares the system to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 4: Test Enterprise ColumnStore


    Overview

    This page details step 4 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.

    This step tests MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 4: Test Enterprise ColumnStore

    Overview

    This page details step 4 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Local storage.

    This step tests MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    MariaDB ColumnStore Guide

    Quickstart guide for MariaDB ColumnStore

    Quickstart Guide: MariaDB ColumnStore

    MariaDB ColumnStore is a specialized columnar storage engine designed for high-performance analytical processing and big data workloads. Unlike traditional row-based storage engines, ColumnStore organizes data by columns, which is highly efficient for analytical queries that often access only a subset of columns across vast datasets.


    Step 2: Install Enterprise ColumnStore


    Overview

    This page details step 2 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.

    This step installs MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 5: Bulk Import of Data


    Overview

    This page details step 5 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.

    This step bulk imports data to Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Extent Map Backup & Recovery

    Overview

    MariaDB ColumnStore utilizes an Extent Map to manage data distribution across extents—logical blocks within physical segment files ranging from 8 to 64 MB. Each extent holds a consistent number of rows, with the Extent Map cataloging these extents, their corresponding block identifiers (LBIDs), and the minimum and maximum values for each column's data within the extent.​

    The primary node maintains the master copy of the Extent Map. Upon system startup, this map is loaded into memory and propagated to other nodes for redundancy and quick access. Corruption of the master Extent Map can render the system unusable and lead to data loss.​
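    Because a corrupted master Extent Map can be unrecoverable, it is worth backing up the BRM files that contain it. A minimal sketch, assuming the default non-S3 location (verify the path on your installation and quiesce writes before copying):

```
# Copy the BRM files that hold the Extent Map to a backup location.
sudo cp -r /var/lib/columnstore/data1/systemFiles/dbrm /backup/dbrm-$(date +%F)
```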

    Step 2: Install Enterprise ColumnStore


    Overview

    This page details step 2 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Local storage.

    This step installs MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Performance Related Configuration Settings

    MariaDB ColumnStore

    Introduction

    A number of system configuration variables exist to allow fine-tuning of the system to suit the physical hardware and query characteristics. In general, the default values work well for many cases.

    Setting a Node to Maintenance Mode

    To set a node to maintenance mode with Enterprise ColumnStore, perform the following procedure.

    Setting the Server State in MaxScale

    The server object for the node can be set to maintenance mode in MaxScale using maxctrl:

    • Use maxctrl or another supported REST client.
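    For example, with maxctrl the server object can be placed into maintenance mode and later returned to service (the server name mcs2 is illustrative):

```
maxctrl set server mcs2 maintenance
maxctrl clear server mcs2 maintenance
```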

    Major Release Upgrades for MariaDB Enterprise ColumnStore

    This page provides a major release upgrade procedure for MariaDB Enterprise ColumnStore. A major release upgrade is an upgrade from an older major release to a newer major release, such as an upgrade from MariaDB Enterprise ColumnStore 5 to MariaDB Enterprise ColumnStore 22.08.

    Compatibility

    • Enterprise ColumnStore 5

    Job Steps

    Overview

    When Enterprise ColumnStore executes a query, the initiator/aggregator node translates the ColumnStore execution plan (CSEP) into a job list. A job list is a sequence of job steps.

    Enterprise ColumnStore uses many different types of job steps that provide different scalability benefits:

    • Some types of job steps perform operations in a distributed manner, using multiple nodes to operate on different extents. Distributed operations provide horizontal scalability.

    Query Tuning Recommendations

    When tuning queries for MariaDB Enterprise ColumnStore, there are some important details to consider.

    Avoid Selecting Unnecessary Columns

    Enterprise ColumnStore only reads the columns that are necessary to resolve a query.

    For example, a query that selects every column in the table requires more I/O than a query that selects only the two columns it actually needs.

    For best performance, only select the columns that are necessary to resolve a query.
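    The contrast can be sketched against the inventory.products table used elsewhere in this guide:

```sql
-- Reads every column in the table:
SELECT * FROM inventory.products;

-- Reads only two columns, so it requires less I/O:
SELECT product_name, quantity FROM inventory.products;
```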

    CREATE DATABASE inventory;
    CREATE TABLE inventory.products (
       product_name VARCHAR(11) NOT NULL DEFAULT '',
       supplier VARCHAR(128) NOT NULL DEFAULT '',
       quantity VARCHAR(128) NOT NULL DEFAULT '',
       unit_cost VARCHAR(128) NOT NULL DEFAULT ''
    ) ENGINE=Columnstore DEFAULT CHARSET=utf8;
    $ sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
    LOAD DATA INFILE '/tmp/inventory-products.tsv'
    INTO TABLE inventory.products;
    $ mariadb --quick \
       --skip-column-names \
       --execute="SELECT * FROM inventory.products" \
       | cpimport -s '\t' inventory products
    $ sudo yum install curl
    $ sudo apt install curl apt-transport-https
    $ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    
    $ echo "${checksum}  mariadb_es_repo_setup" \
           | sha256sum -c -
    
    $ chmod +x mariadb_es_repo_setup
    
    $ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
          --skip-server \
          --skip-tools \
          --mariadb-maxscale-version="22.08"
    $ sudo yum install maxscale
    $ sudo apt install maxscale
    $ sudo yum install curl
    $ sudo apt install curl apt-transport-https
    $ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    $ echo "${checksum}  mariadb_es_repo_setup" \
           | sha256sum -c -
    $ chmod +x mariadb_es_repo_setup
    $ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
          --skip-maxscale \
          --skip-tools \
          --mariadb-server-version="11.4"
    $ sudo yum install jemalloc jq curl
    $ sudo apt install libjemalloc1 jq curl
    $ sudo apt install libjemalloc2 jq curl
    $ sudo yum install MariaDB-server \
       MariaDB-backup \
       MariaDB-shared \
       MariaDB-client \
       MariaDB-columnstore-engine \
       MariaDB-columnstore-cmapi
    $ sudo apt install mariadb-server \
       mariadb-backup \
       libmariadb3 \
       mariadb-client \
       mariadb-plugin-columnstore \
       mariadb-columnstore-cmapi
    viewtablelock
     There is 1 table lock
    
      Table                     LockID  Process   PID    Session   Txn  CreationTime               State    DBRoots
      hq_sales.invoices         1       cpimport  16301  BulkLoad  n/a  Wed April 7 14:20:42 2021  LOADING  1
    viewtablelock hq_sales invoices
     There is 1 table lock
    
      Table                     LockID  Process   PID    Session   Txn  CreationTime               State    DBRoots
      hq_sales.invoices         1       cpimport  16301  BulkLoad  n/a  Wed April 7 14:20:42 2021  LOADING  1
    cleartablelock 1
    $ cskeys
    $ scp 192.0.2.1:/var/lib/columnstore/.secrets /var/lib/columnstore/.secrets
    $ sudo chown mysql:mysql /var/lib/columnstore/.secrets
    $ sudo chmod 0400 /var/lib/columnstore/.secrets
    $ cspasswd util_user_passwd
    $ sudo mcsSetConfig CrossEngineSupport Password util_user_encrypted_passwd
    $ cspasswd --decrypt util_user_encrypted_passwd
    maxctrl call command \
       mariadbmon \
       rejoin \
       mcs_monitor \
       mcs3
    [ObjectStorage]
    service = S3
    object_size = 5M
    metadata_path = /var/lib/columnstore/storagemanager/metadata
    journal_path = /var/lib/columnstore/storagemanager/journal
    max_concurrent_downloads = 21
    max_concurrent_uploads = 21
    common_prefix_depth = 3
    
    [S3]
    ec2_iam_mode=enabled
    bucket = my_mcs_bucket
    region = us-west-2
    endpoint = s3.us-west-2.amazonaws.com
    
    [LocalStorage]
    path = /var/lib/columnstore/storagemanager/fake-cloud
    fake_latency = n
    max_latency = 50000
    
    [Cache]
    cache_size = 2g
    path = /var/lib/columnstore/storagemanager/cache
    maxctrl call command \
       mariadbmon \
       switchover \
       mcs_monitor
    maxctrl call command \
       mariadbmon \
       switchover \
       mcs_monitor \
       mcs2
    SELECT calGetTrace();

    • A thumbnail diagram provides a small-scale summary of the topology's architecture.

  • Finally, we provide a list of the benefits of the topology.

  • Although multiple topologies are listed on this page, the listed topologies are not the only options. MariaDB products are flexible, configurable, and extensible, so it is possible to deploy different topologies that combine the capabilities of multiple topologies listed on this page. The topologies listed on this page are primarily intended to be representative of the most commonly requested use cases.

    Transactional (OLTP)

    Primary/Replica Topology

    Features

    MariaDB Replication

    • Highly available

    • Asynchronous or semi-synchronous replication

    • Automatic failover via MaxScale

    • Manual provisioning of new nodes from backup

    Galera Cluster Topology

    Features

    Multi-primary cluster powered by Galera for transactional/OLTP workloads

    • InnoDB Storage Engine

    • Highly available

    • Virtually synchronous, certification-based replication

    • Automated provisioning of new nodes (IST/SST)

    Analytical (OLAP, Data Warehousing, DSS)

    ColumnStore Shared Local Storage Topology

    Features

    Columnar storage engine with shared local storage

    • Highly available

    • Automatic failover via MaxScale and CMAPI

    • Scales reads via MaxScale

    • Bulk data import

    ColumnStore Object Storage Topology

    Features

    Columnar storage engine with S3-compatible object storage

    • Highly available

    • Automatic failover via MaxScale and CMAPI

    • Scales reads via MaxScale

    • Bulk data import

    Hybrid Workloads

    HTAP Topology

    Features

    • Single-stack hybrid transactional/analytical workloads

    • ColumnStore for analytics with scalable S3-compatible object storage

    • InnoDB for transactions

    • Cross-engine JOINs

    • Enterprise Server, Enterprise ColumnStore, MaxScale


    • Storage Engine: ColumnStore

    • Availability: ES 10.5+, CS 10.5+ (MariaDB Enterprise Server)

    • Workload Optimization: OLAP and Hybrid

    • Table Orientation: Columnar

    • ACID-compliant: Yes

    • Indexes

    Examples

    Creating a ColumnStore Table

    To create a ColumnStore table, use the CREATE TABLE statement with the ENGINE=ColumnStore option:

    Multi-Node Configuration

    To deploy a multi-node Enterprise ColumnStore deployment, a configuration similar to the one below is required:

    Configure the Mandatory Utility User Account

    To configure the mandatory utility user account, use the mcsSetConfig command:

    Optimize Linux Kernel Parameters

    MariaDB Enterprise ColumnStore performs best with Linux kernel optimizations.

    On each server to host an Enterprise ColumnStore node, optimize the kernel:

    1. Set the relevant kernel parameters in a sysctl configuration file. To ensure proper change management, use an Enterprise ColumnStore-specific configuration file.

    Create a /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf file:

    2. Use the sysctl command to set the kernel parameters at runtime:
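    As an illustration, such a file commonly sets values like the following; treat these as placeholders and take the exact parameter list from the official deployment guide:

```
# /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
# Illustrative values only; confirm against the deployment documentation.

# minimize swapping
vm.swappiness = 1

# increase TCP buffer limits
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
```

    The settings can then be applied at runtime with: sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf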

    Temporarily Configure Linux Security Modules (LSM)

    The Linux Security Modules (LSM) should be temporarily disabled on each Enterprise ColumnStore node during installation.

    The LSM will be configured and re-enabled later in this deployment procedure.

    The steps to disable the LSM depend on the specific LSM used by the operating system.

    CentOS / RHEL Stop SELinux

    SELinux must be set to permissive mode before installing MariaDB Enterprise ColumnStore.

    To set SELinux to permissive mode:

    1. Set SELinux to permissive mode at runtime:

    2. Set SELinux to permissive mode by setting SELINUX=permissive in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    3. Confirm that SELinux is in permissive mode:
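    These steps can be sketched as commands; the sed invocation is one illustrative way to edit /etc/selinux/config:

```
sudo setenforce permissive
sudo sed -i 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config
getenforce
```

    getenforce should print Permissive.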

    SELinux will be configured and re-enabled later in this deployment procedure. This configuration is not persistent. If you restart the server before configuring and re-enabling SELinux later in the deployment procedure, you must reset the enforcement to permissive mode.

    Debian / Ubuntu AppArmor

    AppArmor must be disabled before installing MariaDB Enterprise ColumnStore.

    1. Disable AppArmor:

    2. Reboot the system.

    3. Confirm that no AppArmor profiles are loaded using aa-status:

    AppArmor will be configured and re-enabled later in this deployment procedure.
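    On systemd-based systems, the sequence typically looks like this:

```
sudo systemctl disable apparmor
sudo reboot
# after the reboot:
sudo aa-status
```

    aa-status should report that no profiles are loaded.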

    Configure Character Encoding

    When using MariaDB Enterprise ColumnStore, it is recommended to set the system's locale to UTF-8.

    1. On RHEL 8, install additional dependencies:

    2. Set the system's locale to en_US.UTF-8 by executing localedef:

    Create an S3 Bucket

    If you want to use S3-compatible storage, it is important to create the S3 bucket before you start ColumnStore. If you already have an S3 bucket, confirm that the bucket is empty.

    S3 bucket configuration will be performed later in this procedure.

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

    This page was step 1 of 5.

    Next: Step 2: Install MariaDB Enterprise ColumnStore.

    Test S3 Connection

    MariaDB Enterprise ColumnStore 23.10 includes a testS3Connection command to test the S3 configuration, permissions, and connectivity.

    On each Enterprise ColumnStore node, test the S3 configuration:

    If the testS3Connection command does not return OK, investigate the S3 configuration.
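    For example, on each node (assuming the testS3Connection binary is in the path):

```
sudo testS3Connection
```

    If the command reports an error, re-check the [S3] settings in /etc/columnstore/storagemanager.cnf.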

    Test Local Connection

    Connect to the server using MariaDB Client with the root@localhost user account:

    Test ColumnStore Plugin Status

    Query and confirm that the ColumnStore storage engine plugin is ACTIVE:
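    One way to check is via information_schema; the plugin should report ACTIVE:

```sql
SELECT PLUGIN_NAME, PLUGIN_STATUS
FROM information_schema.PLUGINS
WHERE PLUGIN_NAME LIKE 'Columnstore%';
```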

    Test ColumnStore Table Creation

    1. Create a test database, if it does not exist:

    2. Create a ColumnStore table:

    3. Add sample data into the table:

    4. Read data from the table:
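    A minimal test sequence, with illustrative table and sample data:

```sql
CREATE DATABASE IF NOT EXISTS test;

CREATE TABLE test.contacts (
    first_name VARCHAR(50),
    last_name  VARCHAR(50)
) ENGINE=ColumnStore;

INSERT INTO test.contacts VALUES
    ('Ada', 'Lovelace'),
    ('Grace', 'Hopper');

SELECT * FROM test.contacts;
```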

    Test Cross Engine Join

    1. Create an InnoDB table:

    2. Add data to the table:

    3. Perform a cross-engine join:
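    For example, joining a new InnoDB table to a hypothetical test.contacts ColumnStore table:

```sql
CREATE TABLE test.addresses (
    last_name VARCHAR(50),
    city      VARCHAR(50)
) ENGINE=InnoDB;

INSERT INTO test.addresses VALUES ('Lovelace', 'London');

-- Cross-engine join between ColumnStore and InnoDB tables:
SELECT c.first_name, a.city
FROM test.contacts c
JOIN test.addresses a ON c.last_name = a.last_name;
```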

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

    This page was step 4 of 5.

    Next: Step 5: Bulk Import of Data.

    Test Local Connection

    Connect to the server using MariaDB Client with the root@localhost user account:

    Test ColumnStore Plugin Status

    Query and confirm that the ColumnStore storage engine plugin is ACTIVE:

    Test ColumnStore Table Creation

    1. Create a test database, if it does not exist:

    2. Create a ColumnStore table:

    3. Add sample data into the table:

    4. Read data from the table:

    Test Cross Engine Join

    1. Create an InnoDB table:

    2. Add data to the table:

    3. Perform a cross-engine join:

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    This page was step 4 of 5.

    Next: Step 5: Bulk Import of Data.

    Single-Node Enterprise ColumnStore with Local storage
    What is MariaDB ColumnStore?

    MariaDB ColumnStore is a columnar storage engine that integrates with MariaDB Server. It employs a massively parallel distributed data architecture, making it ideal for processing petabytes of data with linear scalability. It was originally ported from InfiniDB and is released under the GPL license.

    Key Benefits

    • Exceptional Analytical Performance: Delivers superior performance for complex analytical queries (OLAP) due to its columnar nature, which minimizes disk I/O by reading only necessary columns.

    • High Data Compression: Columnar storage allows for much higher compression ratios compared to row-based storage, reducing disk space usage and improving query speed.

    • Massive Scalability: Designed to scale horizontally across multiple nodes, processing petabytes of data with ease.

    • Just-in-Time Projection: Only the required columns are processed and returned, further optimizing query execution.

    • Real-time Analytics: Capable of handling real-time analytical queries efficiently.

    Architecture Concepts (Simplified)

    MariaDB ColumnStore utilizes a distributed architecture with different components working together:

    • User Module (UM): Handles incoming SQL queries, optimizes them for columnar processing, and distributes tasks.

    • Performance Module (PM): Manages data storage, compression, and execution of query fragments on the data segments.

    • Data Files: Data is stored in column-segments across the nodes, highly compressed.

    Installation Overview

    MariaDB ColumnStore is installed as a separate package that integrates with MariaDB Server. The exact installation steps vary depending on your operating system and desired deployment type (single server or distributed cluster).

    General Steps (conceptual):

    1. Install MariaDB Server: Ensure you have a compatible MariaDB Server version installed (e.g., MariaDB 10.5.4 or later).

    2. Install ColumnStore Package: Download and install the specific MariaDB ColumnStore package for your OS. This package includes the ColumnStore storage engine and its associated tools.

      • Linux (e.g., Debian/Ubuntu): You would typically add the MariaDB repository configured for ColumnStore and then install mariadb-plugin-columnstore.

      • Single Server vs. Distributed: For a single-server setup, you install all ColumnStore components on one machine. For a distributed setup, you install and configure components across multiple machines.

    3. Configure MariaDB: After installation, you might need to adjust your MariaDB server configuration (my.cnf or equivalent) to properly load and manage the ColumnStore engine.

    4. Initialize ColumnStore: Run a specific columnstore-setup or post-install script to initialize the ColumnStore environment.

    Basic Usage

    Once MariaDB ColumnStore is installed and configured, you can create and interact with ColumnStore tables using standard SQL.

    Creating a ColumnStore Table

    Specify ENGINE=ColumnStore when creating your table. Note that ColumnStore tables do not support primary keys in the same way as InnoDB, as their primary focus is analytical processing.
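    For example (database, table, and column names are illustrative):

```sql
CREATE DATABASE IF NOT EXISTS analytics;

-- No PRIMARY KEY: ColumnStore targets analytical scans, not key lookups.
CREATE TABLE analytics.page_views (
    view_date  DATE,
    url        VARCHAR(255),
    visitor_id BIGINT
) ENGINE=ColumnStore;
```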

    Inserting Data

    You can insert data using standard INSERT statements. For large datasets, bulk loading utilities (for instance, LOAD DATA INFILE) are highly recommended for performance.
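    Continuing the hypothetical analytics.page_views table:

```sql
-- Small inserts work, but are slower per row than bulk loading:
INSERT INTO analytics.page_views VALUES ('2024-01-01', '/home', 1001);

-- Bulk load a tab-separated file for much better throughput:
LOAD DATA INFILE '/tmp/page_views.tsv'
INTO TABLE analytics.page_views;
```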

    Querying Data

    Perform analytical queries. ColumnStore will efficiently process these, often leveraging its columnar nature and parallelism.
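    For example, an aggregation over the hypothetical analytics.page_views table touches only the referenced columns:

```sql
SELECT url, COUNT(*) AS views
FROM analytics.page_views
WHERE view_date >= '2024-01-01'
GROUP BY url
ORDER BY views DESC;
```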

    See Also

    • MariaDB ColumnStore Overview

    • DigitalOcean: How to Install MariaDB ColumnStore on Ubuntu 20.04

    Retrieve Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to https://customers.mariadb.com/downloads/token/

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

    Set Up Repository

    1. On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web. Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    2. On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

    Install Enterprise ColumnStore

    Install additional dependencies:

    Install on CentOS / RHEL (YUM):

    Install on Debian 10 and Ubuntu 20.04 (APT):

    Install on Debian 9 and Ubuntu 18.04 (APT):

    Install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):
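    A sketch of the install commands; the package names below follow common Enterprise releases but may vary, so confirm them against the current documentation:

```shell
# CentOS / RHEL (YUM):
sudo yum install MariaDB-server MariaDB-backup MariaDB-shared \
    MariaDB-client MariaDB-columnstore-engine

# Debian / Ubuntu (APT):
sudo apt update
sudo apt install mariadb-server mariadb-backup mariadb-plugin-columnstore
```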

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

    This page was step 2 of 5.

    Next: Step 3: Start and Configure MariaDB Enterprise ColumnStore.

    Import the Schema

    Before data can be imported into the tables, create a matching schema.

    On the primary server, create the schema:

    1. For each database that you are importing, create the database with the CREATE DATABASE statement:

    2. For each table that you are importing, create the table with the CREATE TABLE statement:
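    A sketch of both steps (hypothetical database and table names):

```sql
CREATE DATABASE inventory;

-- The table definition must match the data being imported.
CREATE TABLE inventory.products (
    product_id INT,
    name       VARCHAR(40),
    price      DECIMAL(10,2)
) ENGINE=ColumnStore;
```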

    Import the Data

    Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.

    cpimport

    MariaDB Enterprise ColumnStore includes cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, run cpimport on the primary server:
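    For example (hypothetical database, table, and file names):

```shell
# -s sets the field separator; cpimport defaults to '|'.
sudo cpimport -s '\t' inventory products /tmp/products.tsv
```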

    LOAD DATA INFILE

    When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, use the LOAD DATA INFILE statement on the primary server:
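    For example (hypothetical names); LOAD DATA INFILE uses tab as its default field terminator, so a TSV file needs no extra options:

```sql
LOAD DATA INFILE '/tmp/products.tsv'
INTO TABLE inventory.products;
```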

    Import from Remote Database

    MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table with a SELECT statement, and then pipe the results into cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a remote MariaDB database:
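    A sketch with hypothetical host and table names: the remote table is queried with the mariadb client and the tab-separated result is piped straight into cpimport:

```shell
mariadb --host=remote-db.example.com --user=app --password \
        --quick --skip-column-names \
        --execute="SELECT * FROM inventory.products" \
    | cpimport -s '\t' inventory products
```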

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

    This page was step 5 of 5.

    This procedure is complete.

    Purpose

    ColumnStore's extent map is a smart structure that underpins its performance. By providing a logical partitioning scheme, it avoids the overhead associated with indexing and other common row-based database optimizations.

    The primary node in a ColumnStore cluster holds the master copy of the extent map. Upon system startup, this master copy is read into memory and then replicated to all other participating nodes for high availability and disaster recovery. Nodes keep the extent map in memory for rapid access during query processing. As data within extents is modified, these updates are broadcast to all participating nodes to maintain consistency.

    If the master copy of the extent map becomes corrupted, the entire system could become unusable, potentially leading to data loss. Having a recent backup of the extent map allows for a much faster recovery compared to reloading the entire database in such a scenario.

    Backup Procedure


    Note: MariaDB recommends implementing regular backups to ensure data integrity and recovery. A common default is to back up every 3 hours and retain backups for at least 10 days.

    To safeguard against potential Extent Map corruption, regularly back up the master copy:

    1. Lock Table:

    2. Save BRM:

    3. Create Backup Directory:

    4. Copy Extent Map:

    5. Unlock Tables:
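    The five steps can be sketched as follows, assuming a default local-storage installation path; adjust paths and the backup location for your environment:

```shell
# 1. In an open client session, block writes; the lock is held only
#    while that session remains open:
#        FLUSH TABLES WITH READ LOCK;

# 2. Flush the in-memory BRM / extent map state to disk:
save_brm

# 3. Create a backup directory:
sudo mkdir -p /extent_map_backup

# 4. Copy the master copy of the extent map:
sudo cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /extent_map_backup

# 5. In the same client session, release the lock:
#        UNLOCK TABLES;
```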

    Recovery Procedures

    Single-Node System

    1. Stop ColumnStore:

    2. Rename Corrupted Map:

    3. Clear Versioning Files:

    4. Restore Backup:

    5. Set Ownership:

    6. Start ColumnStore:
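    A sketch of the recovery, assuming default paths, service names, and file ownership; verify all three for your installation before running anything:

```shell
# 1. Stop ColumnStore.
sudo systemctl stop mariadb-columnstore

# 2.-3. Move the corrupted extent map aside and clear the versioning files.
cd /var/lib/columnstore/data1/systemFiles/dbrm
sudo mv BRM_saves_em BRM_saves_em.bad
sudo rm -f BRM_saves_vbbm BRM_saves_vss

# 4.-5. Restore the backup and fix ownership.
sudo cp /extent_map_backup/BRM_saves_em .
sudo chown mysql:mysql BRM_saves_em

# 6. Start ColumnStore.
sudo systemctl start mariadb-columnstore
```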

    Clustered System

    1. Shutdown Cluster:

    2. Rename Corrupted Map:

    3. Clear Versioning Files:

    4. Restore Backup:

    5. Set Ownership:

    6. Start Cluster:

    Automation Recommendation

    Incorporate the save_brm command into your data import scripts (e.g., those using cpimport) to automate Extent Map backups. This practice ensures regular backups without manual intervention.

    Refer to the MariaDB ColumnStore Backup Script for an example implementation.

    Retrieve Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to https://customers.mariadb.com/downloads/token/

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

    Set Up Repository

    1. On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web.

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    2. On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

    Install Enterprise ColumnStore

    Install additional dependencies:

    Install on CentOS / RHEL (YUM):

    Install on Debian 10 and Ubuntu 20.04 (APT):

    Install on Debian 9 and Ubuntu 18.04 (APT):

    Install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    This page was step 2 of 5.

    Next: Step 3: Start and Configure MariaDB Enterprise ColumnStore.

    The configuration parameters are maintained in the /etc/Columnstore.xml file. In a multiple-server deployment, these should only be edited on the PM1 server, as changes are automatically replicated to the other servers by the system. A system restart is required for a configuration change to take effect.

    Convenience utility programs getConfig and setConfig are available to safely update the Columnstore.xml without needing to be comfortable with editing XML files. The -h argument will display usage information.
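    For example, to inspect and change a parameter (assuming the utilities are on the PATH; the section and parameter names follow the Columnstore.xml layout):

```shell
# Read the current value: getConfig SECTION PARAMETER
getConfig HashJoin TotalUmMemory

# Change it; a restart is required for the change to take effect.
setConfig HashJoin TotalUmMemory 25
```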

    Memory Management

    NumBlocksPct

    The NumBlocksPct configuration parameter specifies the percentage of physical memory to utilize for disk block caching. Depending on the release and deployment type, the default value is 25 or 50, to ensure enough physical memory remains for other processes.

    TotalUmMemory

    The TotalUmMemory configuration parameter specifies the percentage of physical memory to utilize for joins, intermediate results, and set operations. This specifies an upper limit for small table results in joins rather than a pre-allocation of memory. Depending on the release, the default value is 50 or 25.

    Memory Requirements

    In a single server or combined deployment, the sum of NumBlocksPct and TotalUmMemory should typically not exceed 75% of physical memory. With very large memory servers this could be raised but the key point is to leave enough memory for other processes including mariadbd.


    From version 1.2.2, these can be set to static numeric limits instead of percentages by entering a number with 'M' or 'G' at the end to signify MiB or GiB.

    Query Concurrency - MaxOutstandingRequests

    ColumnStore handles concurrent query execution by managing the rate of concurrent batch primitive steps. This is configured using the MaxOutstandingRequests parameter, which has a default value of 20. Each batch primitive step is executed within the context of one extent of a column, according to this high-level process:

    • ColumnStore issues up to MaxOutstandingRequests number of batch primitive steps.

    • PrimProc processes the request, using many threads and returns its response. These generally take a fraction of a second up to a low number of seconds depending on the amount of Physical I/O and the performance of that storage.

    • ColumnStore issues new requests as prior requests complete maintaining the maximum number of outstanding requests.

    This scheme allows large queries to use all available resources when they are not otherwise being consumed, and smaller queries to execute with minimal delay. Lower values optimize for higher throughput of smaller queries, while larger values optimize for the response time of large queries. The default value should work well under most circumstances; however, the value should be increased as the number of nodes is increased.

    The number of queries currently running and the number of queries queued can be checked with:

    Join Tuning - PmMaxMemorySmallSide

    ColumnStore maintains statistics for tables and uses them to determine which of the two tables is larger, based both on the number of blocks in each table and on an estimate of predicate cardinality. The first step is to apply any filters to the smaller table and return that data set to memory. The size of this data set is compared against the configuration parameter PmMaxMemorySmallSide, which has a default value of 64 (MB) and can be set as high as 4 GB. The default allows approximately 1M rows on the small table side to be joined against billions (or trillions) on the large table side. If the small data set is smaller than PmMaxMemorySmallSide, it is sent to PrimProc for creation of a distributed hashmap. This setting is therefore important to join tuning and determines whether the operation can be distributed or not. It should be set to support your largest expected small-table join size, up to available memory:

    • Although this will increase the size of data between nodes to support the join, it means that the join and subsequent aggregates are pushed down, scaled out, and a smaller data set is returned back.

    • In a multiple server deployment, the sizing should be based from available physical memory on the servers, how much memory to reserve for block caching, and the number of simultaneous join operations that can be expected to run times the average small table join data size.
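    For example, to raise the limit to 1 GiB (the HashJoin section name reflects the usual Columnstore.xml layout):

```shell
setConfig HashJoin PmMaxMemorySmallSide 1G
```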

    Multi-Table Join Tuning

    The above logic for a single table join extrapolates out to multi table joins where the small table values are precalculated and performed as one single scan against the large table. This works well for the typical star schema case joining multiple dimension tables with a large fact table. For some join scenarios it may be necessary to sequence joins to create the intermediate datasets for joining, this would happen for instance with a snowflake schema structure. In some extreme cases it may be hard for the optimizer to be able to determine the most optimal join path. In this case a hint is available to force a join ordering. The INFINIDB_ORDERED hint will force the first table in the from clause to be considered the largest table and override any statistics based decision, for example:


    Note: INFINIDB_ORDERED is deprecated and no longer works in ColumnStore 1.2 and above. For 1.2, use set infinidb_ordered_only=ON; and for 1.4, use set columnstore_ordered_only=ON;

    Disk-Based Joins - AllowDiskBasedJoin

    When the small side of a join exceeds the PmMaxMemorySmallSide setting, the join is performed in memory rather than being distributed. For very large joins, this could exceed the available memory, in which case the condition is detected and a query error reported. Several configuration parameters are available to enable and configure usage of disk overflow should this occur:

    • AllowDiskBasedJoin – Controls the option to use disk Based joins or not. Valid values are Y (enabled) or N (disabled). By default, this option is disabled.

    • TempFileCompression – Controls whether the disk join files are compressed or noncompressed. Valid values are Y (use compressed files) or N (use non-compressed files).

    • TempFilePath – The directory path used for the disk joins. By default, this path is the tmp directory for your installation (i.e., /tmp/columnstore_tmp_files/). Files (named infinidb-join-data*) in this directory will be created and cleaned on an as-needed basis. The entire directory is removed and recreated by ExeMgr at startup. It is strongly recommended that this directory be stored on a dedicated partition.

    A MariaDB global or session variable is available to specify a memory limit at which point the query is switched over to disk-based joins:

    • infinidb_um_mem_limit - Memory limit in MB per user (i.e., switch to disk-based join if this limit is exceeded). By default, this limit is not set (value of 0).
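    A sketch of enabling disk overflow (the HashJoin section name is assumed from the usual Columnstore.xml layout):

```shell
setConfig HashJoin AllowDiskBasedJoin Y
setConfig HashJoin TempFileCompression Y
setConfig HashJoin TempFilePath /tmp/columnstore_tmp_files
```

    Per session, the memory limit can then be set with, for example, set infinidb_um_mem_limit=2048; to switch to a disk-based join beyond 2 GB.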


    Set the server object to maintenance mode using maxctrl and the set server command.

  • As the first argument, provide the name for the server.

  • As the second argument, provide maintenance as the state.

  • For example:

    If the specified server is a primary server, then MaxScale will allow open transactions to complete before closing any connections.

    If you would like MaxScale to immediately close all connections, the --force option can be provided as a third argument:
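    For example, with a hypothetical server object named mcs-1:

```shell
# Drain connections gracefully:
maxctrl set server mcs-1 maintenance

# Or close all connections immediately:
maxctrl set server mcs-1 maintenance --force
```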

    Confirming Maintenance Mode is Set with MaxScale

    Confirm the state of the server object in MaxScale using maxctrl:

    • List the servers using the list servers command, like this:

    If the node is properly in maintenance mode, then the State column will show Maintenance as one of the states.

    Performing Maintenance

    Now that the server is in maintenance mode in MaxScale, you can perform your maintenance.

    While the server is in maintenance mode:

    • MaxScale doesn't route traffic to the node.

    • MaxScale doesn't select the node to be primary during failover.

    • The node can be rebooted.

    • The node's services can be restarted.

    Clear the Server State in MaxScale

    Maintenance mode for the server object for the node can be cleared in MaxScale using maxctrl:

    • Use MaxCtrl or another supported REST client.

    • Clear the server object's state using the clear server command.

    • As the first argument, provide the name for the server.

    • As the second argument, provide maintenance as the state.

    For example:
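    With a hypothetical server object named mcs-1:

```shell
maxctrl clear server mcs-1 maintenance
```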

    Confirming Maintenance Mode is Cleared with MaxScale

    Confirm the state of the server object in MaxScale using maxctrl:

    • List the servers using the list servers command, like this:

    If the node is no longer in maintenance mode, the State column no longer shows Maintenance as one of the states.

    This procedure applies to Enterprise ColumnStore 6 and Enterprise ColumnStore 22.08.

    Prerequisites

    This procedure assumes that the new Enterprise ColumnStore version will be installed onto new servers.

    To reuse existing servers for the new Enterprise ColumnStore version, you must adapt the procedure detailed below. After step 1, confirm all data has been backed up and verify the backups. The old version of Enterprise ColumnStore should then be uninstalled, and all Enterprise ColumnStore files should be deleted, before continuing with step 2.

    Step 1: Backup/Export Schemas and Data

    On the old ColumnStore cluster, perform a full backup.

    MariaDB recommends backing up the table schemas to a single SQL file and backing up the table data to table-specific CSV files.

    1. For each table, obtain the table's schema by executing the SHOW CREATE TABLE statement:

      Back up the table schemas by copying the output to an SQL file. This procedure assumes that the SQL file is named schema-backup.sql.

    2. For each table, back up the table data to a CSV file using the SELECT ... INTO OUTFILE statement:

    3. Copy the SQL file containing the table schemas and the CSV files containing the table data to the primary node of the new ColumnStore cluster.
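    Steps 1 and 2 can be sketched as follows (hypothetical database and table names):

```sql
-- 1. Capture the table definition for schema-backup.sql:
SHOW CREATE TABLE inventory.products\G

-- 2. Export the table data to a CSV file on the server:
SELECT * FROM inventory.products
INTO OUTFILE '/tmp/products.csv'
FIELDS TERMINATED BY ',';
```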

    Step 2: Install New Major Release

    On the new ColumnStore cluster, follow the deployment instructions of the desired topology for the new ColumnStore version.

    For deployment instructions, see the deployment documentation for the desired topology.

    Step 3: Restore/Import Data

    On the new ColumnStore cluster, restore the table schemas and data.

    1. Restore the schema backup using the mariadb client:

      • HOST and PORT should refer to the following:

        • If you are connecting with MaxScale as a proxy, they should refer to the host and port of the MaxScale listener

        • If you are connecting directly to a multi-node ColumnStore cluster, they should refer to the host and port of the primary ColumnStore node

        • If you are connecting directly to single-node ColumnStore, they should refer to the host and port of the ColumnStore node

      • When the command is executed, mariadb client prompts for the user password

    2. For each table, restore the data from the table's CSV file on the primary ColumnStore node:
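    Both restore steps can be sketched as follows (hypothetical host, credentials, and table names):

```shell
# 1. Restore the schemas; the client prompts for the password.
mariadb --host=192.0.2.10 --port=3306 --user=admin --password < schema-backup.sql

# 2. For each table, load its CSV file on the primary node.
mariadb --host=192.0.2.10 --port=3306 --user=admin --password --local-infile \
    --execute="LOAD DATA LOCAL INFILE '/tmp/products.csv'
               INTO TABLE inventory.products FIELDS TERMINATED BY ','"
```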

    Step 4: Test

    On the new ColumnStore cluster, verify that the table schemas and data have been restored.

    1. For each table, verify the table's definition by executing the SHOW CREATE TABLE statement:

    2. For each table, verify the number of rows in the table by executing SELECT COUNT(*):

    3. For each table, verify the data in the table by executing a SELECT statement.

      If the table is very large, you can limit the number of rows in the result set by adding a LIMIT clause:
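    The three checks, with hypothetical names:

```sql
SHOW CREATE TABLE inventory.products\G
SELECT COUNT(*) FROM inventory.products;
SELECT * FROM inventory.products LIMIT 10;
```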


    Some types of job steps perform operations in a multi-threaded manner using a thread pool. Performing multi-threaded operations provides vertical scalability.

    As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute job steps.

    For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation".

    Batch Primitive Step (BPS)

    Enterprise ColumnStore defines a batch primitive step to handle many types of tasks, such as scanning/filtering columns, JOIN operations, aggregation, functional filtering, and projecting (putting values into a SELECT list).

    In calGetTrace() output, a batch primitive step is abbreviated BPS.

    Batch primitive steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the batch primitive step to one extent at a time. The PrimProc process uses a thread pool to operate on individual blocks within the extent in parallel.
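    For example, a trace showing these step abbreviations can be captured in a session like this (hypothetical query):

```sql
SELECT calSetTrace(1);
SELECT region, SUM(amount) FROM sales_facts GROUP BY region;
SELECT calGetTrace()\G
```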

    Cross Engine Step (CES)

    Enterprise ColumnStore defines a cross-engine step to perform cross-engine joins, in which a ColumnStore table is joined with a table that uses a different storage engine.

    In calGetTrace() output, a cross-engine step is abbreviated CES.

    Cross-engine steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    Enterprise ColumnStore can perform cross-engine joins when the mandatory utility user is properly configured.

    For additional information, refer to "Mandatory Utility User Account".

    Dictionary Structure Step (DSS)

    Enterprise ColumnStore defines a dictionary structure step to scan the dictionary extents that ColumnStore uses to store variable-length string values.

    In calGetTrace() output, a dictionary structure step is abbreviated DSS.

    Dictionary structure steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the dictionary structure step to one extent at a time. It uses a thread pool to operate on individual blocks within the extent in parallel.

    Dictionary structure steps can require a lot of I/O for a couple of reasons:

    • Dictionary structure steps do not support extent elimination, so all extents for the column must be scanned.

    • Dictionary structure steps must read the column extents to find each pointer and the dictionary extents to find each value, so it doubles the number of extents to scan.

    It is generally recommended to avoid queries that will cause dictionary scans.

    For additional information, see "Avoid Creating Long String Columns".

    Hash Join Step (HJS)

    Enterprise ColumnStore defines a hash join step to perform a hash join between two tables.

    In calGetTrace() output, a hash join step is abbreviated HJS.

    Hash join steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    Enterprise ColumnStore performs the hash join in memory by default. If you perform large joins, you may be able to get better performance by changing some configuration defaults with mcsSetConfig:

    • Enterprise ColumnStore can be configured to use more memory for in-memory hash joins.

    • Enterprise ColumnStore can be configured to use disk-based joins.

    For additional information, see "Configure in-memory joins" and "Configure Disk-Based Joins".

    Having Step (HVS)

    Enterprise ColumnStore defines a having step to evaluate a HAVING clause on a result set.

    In calGetTrace() output, a having step is abbreviated HVS.

    Subquery Step (SQS)

    Enterprise ColumnStore defines a subquery step to evaluate a subquery.

    In calGetTrace() output, a subquery step is abbreviated SQS.

    Tuple Aggregation Step (TAS)

    Enterprise ColumnStore defines a tuple aggregation step to collect intermediate aggregation results prior to the final aggregation and evaluation of the results.

    In calGetTrace() output, a tuple aggregation step is abbreviated TAS.

    Tuple aggregation steps are primarily evaluated by the ExeMgr process on the initiator/aggregator node. However, the PrimProc process on each node also plays a role, since the PrimProc process on each node provides the intermediate aggregation results to the ExeMgr process on the initiator/aggregator node.

    Tuple Annexation Step (TNS)

    Enterprise ColumnStore defines a tuple annexation step to perform the final aggregation and evaluation of the results.

    In calGetTrace() output, a tuple annexation step is abbreviated TNS.

    Tuple annexation steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    Enterprise ColumnStore 5 performs aggregation operations in memory. As a consequence, more complex aggregation operations require more memory in that version.

    In Enterprise ColumnStore 6, disk-based aggregations can be enabled.

    For additional information, see "Configure Disk-Based Aggregations".

    Tuple Union Step (TUS)

    Enterprise ColumnStore defines a tuple union step to perform a union of two subqueries.

    In calGetTrace() output, a tuple union step is abbreviated TUS.

    Tuple union steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    Tuple Constant Step (TCS)

    Enterprise ColumnStore defines a tuple constant step to evaluate constant values.

    In calGetTrace() output, a tuple constant step is abbreviated TCS.

    Tuple constant steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    Window Function Step (WFS)

    Enterprise ColumnStore defines a window function step to evaluate window functions.

    In calGetTrace() output, a window function step is abbreviated WFS.

    Window function steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    Avoid Large Sorts

    When Enterprise ColumnStore performs ORDER BY and LIMIT operations, the operations are performed in a single-threaded manner after the rest of the query processing has been completed, and the full unsorted result-set has been retrieved. For large data sets, the performance overhead can be significant.

    Avoid Excessive Aggregations

    When Enterprise ColumnStore 5 performs aggregations (i.e., DISTINCT, GROUP BY, COUNT(*), etc.), all of the aggregation work happens in-memory by default. As a consequence, more complex aggregation operations require more memory in that version.

    For example, the following query could require a lot of memory in Enterprise ColumnStore 5, since it has to calculate many distinct values in memory:
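    The stripped example was presumably a DISTINCT aggregation over a high-cardinality column, along these lines (hypothetical table and column names):

```sql
SELECT COUNT(DISTINCT session_id) FROM web_events;
```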

    Whereas the following query could require much less memory in Enterprise ColumnStore 5, since it has to calculate fewer distinct values:
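    For instance, along these lines (hypothetical names):

```sql
-- country_code has far fewer distinct values than a session identifier.
SELECT COUNT(DISTINCT country_code) FROM web_events;
```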

    In Enterprise ColumnStore 6, disk-based aggregations can be enabled.

    For best performance, avoid excessive aggregations or enable disk-based aggregations.

    For additional information, see "Configure Disk-Based Aggregations".

    Avoid Non-Distributed Functions

    When Enterprise ColumnStore evaluates built-in functions and aggregate functions, it can often evaluate the function in a distributed manner. Distributed evaluation of functions can significantly improve performance.

    Enterprise ColumnStore supports distributed evaluation for some built-in functions. For other built-in functions, the function must be evaluated serially on the final result set.

    Enterprise ColumnStore also supports distributed evaluation for user-defined functions developed with ColumnStore's User-Defined Aggregate Function (UDAF) C++ API. For functions developed with Enterprise Server's standard User-Defined Function (UDF) API, the function must be evaluated serially on the final result set.

    For best performance, avoid non-distributed functions.

    Optimize Large Joins

    By default, Enterprise ColumnStore performs all joins as in-memory hash joins.

    If the joined tables are very large, the in-memory hash join can require too much memory for the default configuration. There are a couple options to work around this:

    • Enterprise ColumnStore can be configured to use more memory for in-memory hash joins.

    • Enterprise ColumnStore can be configured to use disk-based joins.

    • Enterprise ColumnStore can use optimizer statistics to better optimize the join order.

    For additional information, see "Configure In-Memory Joins", "Configure Disk-Based Joins", and "Optimizer Statistics".

    Load Ordered Data in Proper Order

    Enterprise ColumnStore uses extent elimination to optimize queries. Extent elimination uses the minimum and maximum values in the extent map to determine which extents can be skipped for a query.

    When data is loaded into Enterprise ColumnStore, it appends the data to the latest extent. When an extent reaches the maximum number of column values, Enterprise ColumnStore creates a new extent. As a consequence, if ordered data is loaded in its proper order, then similar values will be clustered together in the same extent. This can improve query performance, because extent elimination performs best when similar values are clustered together.

    For example, if you expect to query a table with a filter on a timestamp column, you should sort the data using the timestamp column before loading it into Enterprise ColumnStore. Later, when the table is queried with a filter on the timestamp column, Enterprise ColumnStore would be able to skip many extents using extent elimination.

    For best performance, load ordered data in proper order.

    Enable Decimal Overflow Checks

    When Enterprise ColumnStore performs mathematical operations with very big values using fixed-point decimal data types, the operation can sometimes overflow ColumnStore's maximum precision or scale. The maximum precision and scale depend on the version of Enterprise ColumnStore:

    • In Enterprise ColumnStore 6, the maximum precision (M) is 38, and the maximum scale (D) is 38.

    • In Enterprise ColumnStore 5, the maximum precision (M) is 18, and the maximum scale (D) is 18.

    In Enterprise ColumnStore 6, applications can configure Enterprise ColumnStore to check for decimal overflows by setting the columnstore_decimal_overflow_check system variable, but only when the column has a decimal precision that is 18 or more:
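    For example, enabling the check for the current session:

```sql
SET SESSION columnstore_decimal_overflow_check = ON;
```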

    When decimal overflow checks are enabled, math operations have extra overhead.

    When the decimal overflow check fails, MariaDB Enterprise ColumnStore raises an error with the ER_INTERNAL_ERROR SQL error code, and it writes detailed information about the overflow check failure to the ColumnStore system logs.

    User-Defined Aggregate Function (UDAF) C++ API

    MariaDB Enterprise ColumnStore supports Enterprise Server's standard User-Defined Function (UDF) API. However, UDFs developed using that API cannot be executed in a distributed manner.

    To support distributed execution of custom SQL, MariaDB Enterprise ColumnStore supports a Distributed User Defined Aggregate Functions (UDAF) C++ API:

    • The Distributed User Defined Aggregate Functions (UDAF) C++ API allows anyone to create aggregate functions of arbitrary complexity for distributed execution in the ColumnStore storage engine.

    • These functions can also be used as Analytic (Window) functions just like any built-in aggregate function.


    Enterprise ColumnStore supports several import methods:

    • cpimport (Shell): SQL access is not required.

    • LOAD DATA INFILE (SQL): Shell access is not required.

    • Remote Database Import (Remote Database): uses a normal database client and avoids dumping data to intermediate files.

    Step 3: Start and Configure Enterprise ColumnStore

    Overview

    This page details step 3 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Local storage.

    This step starts and configures MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Configure Enterprise ColumnStore

    Mandatory system variables and options for Single-Node Enterprise ColumnStore include:


    Example Configuration

    Start the Enterprise ColumnStore Services

    Start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:

    Start and enable the MariaDB Enterprise ColumnStore service, so that it starts automatically upon reboot:
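Assuming the standard service names installed by the Enterprise ColumnStore packages, each service can be started and enabled in a single step:

```shell
# Start MariaDB Enterprise Server now and enable it at boot:
sudo systemctl enable --now mariadb

# Start MariaDB Enterprise ColumnStore now and enable it at boot:
sudo systemctl enable --now mariadb-columnstore
```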

    Create the Utility User

Enterprise ColumnStore requires a utility user account. By default, it connects to the server as the root user with no password. MariaDB Enterprise Server 10.6 rejects this login attempt by default, so you must configure Enterprise ColumnStore to use a different user account and password, and create this user account on Enterprise Server.

    1. On the Enterprise ColumnStore node, create the user account with the statement:

2. On the Enterprise ColumnStore node, grant the user account SELECT privileges on all databases with the GRANT statement:

3. Configure Enterprise ColumnStore to use the utility user:

4. Set the password:

For details about how to encrypt the password, see the MariaDB Enterprise ColumnStore documentation.

    Passwords should meet your organization's password policies. If your MariaDB Enterprise Server instance has a password validation plugin installed, then the password should also meet the configured requirements.
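The four steps above can be sketched as follows; the user name util_user and its password are illustrative and should be replaced with values that meet your password policies:

```shell
# 1. Create the utility user account (run on the Enterprise ColumnStore node):
sudo mariadb -e "CREATE USER 'util_user'@'127.0.0.1' IDENTIFIED BY 'util_user_passwd';"

# 2. Grant the account read access to all databases:
sudo mariadb -e "GRANT SELECT ON *.* TO 'util_user'@'127.0.0.1';"

# 3. Configure Enterprise ColumnStore to use the utility user:
sudo mcsSetConfig CrossEngineSupport User util_user

# 4. Set the password:
sudo mcsSetConfig CrossEngineSupport Password util_user_passwd
```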

    Configure Linux Security Modules (LSM)

    The specific steps to configure the security module depend on the operating system.

    Configure SELinux (CentOS, RHEL)

    Configure SELinux for Enterprise ColumnStore:

    1. To configure SELinux, you have to install the packages required for audit2allow. On CentOS 7 and RHEL 7, install the following:

    On RHEL 8, install the following:

2. Allow the system to run under load for a while to generate SELinux audit events.

3. After the system has taken some load, generate an SELinux policy from the audit events using audit2allow:

    If no audit events were found, this will print the following:

4. If audit events were found, the new SELinux policy can be loaded using semodule:

5. Set SELinux to enforcing mode by setting SELINUX=enforcing in /etc/selinux/config.

    For example, the file will usually look like this after the change:

6. Set SELinux to enforcing mode:
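Under the stated assumptions (audit2allow provided by the policycoreutils packages, and audit events recorded in /var/log/audit/audit.log), the procedure can be sketched as:

```shell
# Install the packages that provide audit2allow:
sudo yum install policycoreutils-python        # CentOS 7 / RHEL 7
sudo yum install policycoreutils-python-utils  # RHEL 8

# Generate a policy module from the recorded audit events
# (the module name "mariadb_local" is illustrative):
sudo grep -E 'mysqld|columnstore' /var/log/audit/audit.log | audit2allow -M mariadb_local

# Load the generated policy module:
sudo semodule -i mariadb_local.pp

# Switch to enforcing mode at runtime, then persist SELINUX=enforcing
# in /etc/selinux/config:
sudo setenforce 1
```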

    Configure AppArmor (Ubuntu)

For information on how to create a profile, see the AppArmor documentation on ubuntu.com.

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    This page was step 3 of 5.

    Step 9: Import Data


    Overview

    This page details step 9 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step bulk imports data to Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Import the Schema

    Before data can be imported into the tables, create a matching schema.

    On the primary server, create the schema:

    1. For each database that you are importing, create the database with the CREATE DATABASE statement:

2. For each table that you are importing, create the table with the CREATE TABLE statement:
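For example, a minimal schema for the inventory data used in this guide's import examples:

```sql
CREATE DATABASE inventory;

CREATE TABLE inventory.products (
   product_name VARCHAR(11) NOT NULL DEFAULT '',
   supplier VARCHAR(128) NOT NULL DEFAULT '',
   quantity VARCHAR(128) NOT NULL DEFAULT '',
   unit_cost VARCHAR(128) NOT NULL DEFAULT ''
) ENGINE=ColumnStore DEFAULT CHARSET=utf8;
```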

    Import the Data

Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.

    Interface          Method                   Benefits
    Shell              cpimport                 SQL access is not required
    SQL                LOAD DATA INFILE         Shell access is not required
    Remote Database    Remote Database Import   Uses a normal database client; avoids dumping data to intermediate files

    cpimport

MariaDB Enterprise ColumnStore includes cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

To import your data from a TSV (tab-separated values) file, run cpimport on the primary server:
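For example, to load a tab-separated file into the inventory.products table (the file path is illustrative):

```shell
sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
```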

    LOAD DATA INFILE

When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

To import your data from a TSV (tab-separated values) file, use the LOAD DATA INFILE statement on the primary server:
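For example, to load a tab-separated file into the inventory.products table:

```sql
LOAD DATA INFILE '/tmp/inventory-products.tsv'
INTO TABLE inventory.products;
```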

    Import from Remote Database

MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using the SELECT statement and pipe the results into cpimport, a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a remote MariaDB database:
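For example, to stream a table from a remote MariaDB server straight into cpimport (connection options omitted for brevity):

```shell
mariadb --quick \
   --skip-column-names \
   --execute="SELECT * FROM inventory.products" \
   | cpimport -s '\t' inventory products
```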

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 9 of 9.

    This procedure is complete.

    ColumnStore Read Replicas


    The ColumnStore Read Replica topology is an Alpha release. Do not use it in production without testing in your development environment first.

    Overview

The Read Replicas feature in MariaDB ColumnStore enables horizontal scaling of read performance by adding read-only nodes to a multi-node cluster. These replicas differ from standard ColumnStore nodes in that they don't run the WriteEngineServer process. As a result, Read Replica nodes cannot handle write operations directly; any write queries attempted on a replica are automatically forwarded to a read-write (RW) node.

    Replicas utilize shared storage with other nodes in the cluster, ensuring data consistency without duplication. A key requirement is maintaining at least one RW node — a cluster consisting solely of read replicas is not operational and cannot process reads or writes.


    Read-only nodes are incompatible with S3 as the storage backend.

    Additionally, there is no automatic promotion of a read replica to RW mode if the only RW node fails, which could lead to temporary downtime until manual intervention.

    Key Features

    • Horizontal Read Scaling: Adds compute power for handling more read-intensive queries without impacting write performance.

    • Write Forwarding: Ensures writes on replicas are redirected to RW nodes, maintaining data integrity.

    • Shared Storage: Replicas access the same DBRoots as RW nodes, promoting efficiency and reducing storage overhead.

    Key Commands


These commands require CMAPI (the ColumnStore Management API).

    • Add Read Replica. To introduce a read-only node for scaling reads, run this command:

    • Remove Node. To safely remove any node (RW or replica) from the cluster, run this command:

    This reassigns resources as needed without cluster disruption.

    • Verify Status. To monitor the cluster's health and node roles, issue:
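The three operations above can be sketched with the same CMAPI endpoint style used for the cluster start and shutdown calls elsewhere in this guide; the node IP address and API key are illustrative, so verify the exact endpoints against your CMAPI version:

```shell
# Add a node (PUT) or remove a node (DELETE):
curl -k -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/node \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:your_api_key' \
   --data '{"timeout":120, "node": "192.0.2.10"}' | jq .

curl -k -s -X DELETE https://127.0.0.1:8640/cmapi/0.4.0/cluster/node \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:your_api_key' \
   --data '{"timeout":120, "node": "192.0.2.10"}' | jq .

# Check the cluster's health and node roles:
curl -k -s https://127.0.0.1:8640/cmapi/0.4.0/cluster/status \
   --header 'x-api-key:your_api_key' | jq .
```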

    Limitations

    • Node addition is restricted to private IPs only.

    • Incompatible with S3 storage, limiting use to shared file systems.

    • No automatic failover or promotion mechanism if the sole RW node goes down, requiring manual recovery.

    • At least one RW node must always be present for the cluster to function properly, supporting both read and write operations.

    How-To

    Prerequisites


Shared storage must be mounted on all nodes (at /var/lib/columnstore/data1 for non-S3 configurations) so that data stays consistent across RW nodes and read replicas.

Refer to the shared storage setup documentation for exact mount point details.

    Installation and Setup

1. Set Up MariaDB Repository

    Run the following to add the MariaDB repository (adjust "11.4" to the latest stable version):

See the MariaDB Enterprise Server documentation for additional details about the ES repo setup.

2. Install Packages


    Step 2: Configure Shared Local Storage


    Overview

    This page details step 2 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step configures shared local storage on systems hosting Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 2: Configure Shared Local Storage


    Overview

    This page details step 2 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step configures shared local storage on systems hosting Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 1: Prepare Systems for Enterprise ColumnStore Nodes


    Overview

This page details step 1 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with local storage.

    This step prepares the system to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Optimizing Linux Kernel Parameters for MariaDB ColumnStore

    This page provides information on optimizing Linux kernel parameters for improved performance with MariaDB ColumnStore.

    Introduction

    MariaDB ColumnStore is a high-performance columnar database designed for analytical workloads. By optimizing the Linux kernel parameters, you can further enhance the performance of your MariaDB ColumnStore deployments.
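The tuning typically targets swap behavior and TCP buffer sizes. A representative /etc/sysctl.d fragment, matching the settings listed later in this guide:

```ini
# minimize swapping
vm.swappiness = 1

# Increase the TCP max buffer size
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216

# Increase the TCP buffer limits
# min, default, and max number of bytes to use
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
```

Load the settings with `sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf`.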


    Performance Concepts

    Introduction

The high-level components of the ColumnStore architecture are:

    • PrimProc: PrimProc (Primitives Processor) parses SQL requests into an optimized set of primitive job steps executed by one or more servers; it is thus responsible for query optimization and orchestration of query execution. In a multi-server deployment, every instance has its own PrimProc, and each query begins and ends on the PrimProc it originated from. A database load balancer such as MariaDB MaxScale can be deployed to balance external requests across individual servers. PrimProc also executes the granular job steps it receives from the server (mariadbd) in a multi-threaded manner, and ColumnStore can distribute this work across many servers.

    CREATE DATABASE columnstore_db;
    
    CREATE TABLE columnstore_db.analytics_test (
       id INT,
       str VARCHAR(50)
    ) ENGINE = ColumnStore;
    [mariadb]
    log_error                              = mariadbd.err
    character_set_server                   = utf8
    collation_server                       = utf8_general_ci
    log_bin                                = mariadb-bin
    log_bin_index                          = mariadb-bin.index
    relay_log                              = mariadb-relay
    relay_log_index                        = mariadb-relay.index
    log_slave_updates                      = ON
    gtid_strict_mode                       = ON
    
    # This must be unique on each cluster node
    server_id                              = 1
    sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
    sudo mcsSetConfig CrossEngineSupport Port 3306
    sudo mcsSetConfig CrossEngineSupport User cross_engine
    sudo mcsSetConfig CrossEngineSupport Password cross_engine_passwd
    # minimize swapping
    vm.swappiness = 1
    
    # Increase the TCP max buffer size
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    
    # Increase the TCP buffer limits
    # min, default, and max number of bytes to use
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    
    # don't cache ssthresh from previous connection
    net.ipv4.tcp_no_metrics_save = 1
    
    # for 1 GigE, increase this to 2500
    # for 10 GigE, increase this to 30000
    net.core.netdev_max_backlog = 2500
    $ sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
    $ sudo setenforce permissive
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=permissive
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    $ sudo getenforce
    Permissive
    $ sudo systemctl disable apparmor
    $ sudo aa-status
    apparmor module is loaded.
    0 profiles are loaded.
    0 profiles are in enforce mode.
    0 profiles are in complain mode.
    0 processes have profiles defined.
    0 processes are in enforce mode.
    0 processes are in complain mode.
    0 processes are unconfined but have a profile defined.
    $ sudo yum install glibc-locale-source glibc-langpack-en
    $ sudo localedef -i en_US -f UTF-8 en_US.UTF-8
    $ sudo testS3Connection
    StorageManager[26887]: Using the config file found at /etc/columnstore/storagemanager.cnf
    StorageManager[26887]: S3Storage: S3 connectivity & permissions are OK
    S3 Storage Manager Configuration OK
    $ sudo mariadb
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 38
    Server version: 11.4.5-3-MariaDB-Enterprise MariaDB Enterprise Server
    
    Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    MariaDB [(none)]>
    SELECT PLUGIN_NAME, PLUGIN_STATUS
    FROM information_schema.PLUGINS
    WHERE PLUGIN_LIBRARY LIKE 'ha_columnstore%';
    +---------------------+---------------+
    | PLUGIN_NAME         | PLUGIN_STATUS |
    +---------------------+---------------+
    | Columnstore         | ACTIVE        |
    | COLUMNSTORE_COLUMNS | ACTIVE        |
    | COLUMNSTORE_TABLES  | ACTIVE        |
    | COLUMNSTORE_FILES   | ACTIVE        |
    | COLUMNSTORE_EXTENTS | ACTIVE        |
    +---------------------+---------------+
    CREATE DATABASE IF NOT EXISTS test;
    CREATE TABLE IF NOT EXISTS test.contacts (
       first_name VARCHAR(50),
       last_name VARCHAR(50),
       email VARCHAR(100)
    ) ENGINE=ColumnStore;
    INSERT INTO test.contacts (first_name, last_name, email)
       VALUES
       ("Kai", "Devi", "kai.devi@example.com"),
       ("Lee", "Wang", "lee.wang@example.com");
    SELECT * FROM test.contacts;
    
    +------------+-----------+----------------------+
    | first_name | last_name | email                |
    +------------+-----------+----------------------+
    | Kai        | Devi      | kai.devi@example.com |
    | Lee        | Wang      | lee.wang@example.com |
    +------------+-----------+----------------------+
    CREATE TABLE test.addresses (
       email VARCHAR(100),
       street_address VARCHAR(255),
       city VARCHAR(100),
       state_code VARCHAR(2)
    ) ENGINE = InnoDB;
    INSERT INTO test.addresses (email, street_address, city, state_code)
       VALUES
       ("kai.devi@example.com", "1660 Amphibious Blvd.", "Redwood City", "CA"),
       ("lee.wang@example.com", "32620 Little Blvd", "Redwood City", "CA");
    SELECT name AS "Name", addr AS "Address"
    FROM (SELECT CONCAT(first_name, " ", last_name) AS name,
       email FROM test.contacts) AS contacts
    INNER JOIN (SELECT CONCAT(street_address, ", ", city, ", ", state_code) AS addr,
       email FROM test.addresses) AS addr
    WHERE  contacts.email = addr.email;
    +----------+-----------------------------------------+
    | Name     | Address                                 |
    +----------+-----------------------------------------+
    | Kai Devi | 1660 Amphibious Blvd., Redwood City, CA |
    | Lee Wang | 32620 Little Blvd, Redwood City, CA     |
    +----------+-----------------------------------------+
    
    CREATE TABLE sales_data (
        sale_id INT,
        product_name VARCHAR(255),
        category VARCHAR(100),
        sale_date DATE,
        quantity INT,
        price DECIMAL(10, 2)
    ) ENGINE=ColumnStore;
    INSERT INTO sales_data (sale_id, product_name, category, sale_date, quantity, price) VALUES
    (1, 'Laptop', 'Electronics', '2023-01-15', 1, 1200.00),
    (2, 'Mouse', 'Electronics', '2023-01-15', 2, 25.00),
    (3, 'Keyboard', 'Electronics', '2023-01-16', 1, 75.00);
    -- Get total sales per category
    SELECT category, SUM(quantity * price) AS total_sales
    FROM sales_data
    WHERE sale_date BETWEEN '2023-01-01' AND '2023-01-31'
    GROUP BY category
    ORDER BY total_sales DESC;
    
    -- Count distinct products
    SELECT COUNT(DISTINCT product_name) FROM sales_data;
    $ sudo yum install curl
    $ sudo apt install curl apt-transport-https
    $ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    $ echo "${checksum}  mariadb_es_repo_setup" \
           | sha256sum -c -
    $ chmod +x mariadb_es_repo_setup
    $ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
          --skip-maxscale \
          --skip-tools \
          --mariadb-server-version="11.4"
    $ sudo yum install epel-release
    
    $ sudo yum install jemalloc
    $ sudo apt install libjemalloc2
    $ sudo apt install libjemalloc1
    $ sudo yum install MariaDB-server \
       MariaDB-backup \
       MariaDB-shared \
       MariaDB-client \
       MariaDB-columnstore-engine
    $ sudo apt install mariadb-server \
       mariadb-backup \
       libmariadb3 \
       mariadb-client \
       mariadb-plugin-columnstore
    CREATE DATABASE inventory;
    CREATE TABLE inventory.products (
       product_name VARCHAR(11) NOT NULL DEFAULT '',
       supplier VARCHAR(128) NOT NULL DEFAULT '',
       quantity VARCHAR(128) NOT NULL DEFAULT '',
       unit_cost VARCHAR(128) NOT NULL DEFAULT ''
    ) ENGINE=Columnstore DEFAULT CHARSET=utf8;
    $ sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
    LOAD DATA INFILE '/tmp/inventory-products.tsv'
    INTO TABLE inventory.products;
    $ mariadb --quick \
       --skip-column-names \
       --execute="SELECT * FROM inventory.products" \
       | cpimport -s '\t' inventory products
    mariadb -e "FLUSH TABLES WITH READ LOCK;"
    save_brm
    mkdir -p /extent_map_backup
    cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /extent_map_backup
    mariadb -e "UNLOCK TABLES;"
    systemctl stop mariadb-columnstore
    mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
    > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm
    > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
    cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
    chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm/
    systemctl start mariadb-columnstore
    curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/shutdown \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:your_api_key' \
       --data '{"timeout":60}' -k
    mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
    > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm
    > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
    cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
    chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm
    curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/start \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:your_api_key' \
       --data '{"timeout":60}' -k
    SELECT calgetsqlcount();
    SELECT /*! INFINIDB_ORDERED */ r_regionkey     
    FROM region r, customer c, nation n    
    WHERE r.r_regionkey = n.n_regionkey      
    AND n.n_nationkey = c.c_nationkey
    maxctrl set server \
       mcs3 \
       maintenance
    maxctrl set server \
       mcs3 \
       maintenance \
       --force
    maxctrl list servers
    maxctrl clear server \
       mcs3 \
       maintenance
    maxctrl list servers
    SHOW CREATE TABLE DATABASE_NAME.TABLE_NAME\G
    SELECT * INTO OUTFILE '/path/to/DATABASE_NAME-TABLE_NAME.csv'
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    FROM DATABASE_NAME.TABLE_NAME;
    mariadb --host HOST --port PORT --user USER --password < schema-backup.sql
    SHOW CREATE TABLE DATABASE_NAME.TABLE_NAME\G
    SELECT COUNT(*) FROM DATABASE_NAME.TABLE_NAME;
    SELECT * FROM DATABASE_NAME.TABLE_NAME LIMIT 100;
    SELECT * FROM tab;
    SELECT col1, col2 FROM tab;
    SELECT DISTINCT col1 FROM tab LIMIT 10000;
    SELECT DISTINCT col1 FROM tab LIMIT 100;
    SET SESSION columnstore_decimal_overflow_check=ON;
    
    SELECT (big_decimal1 * big_decimal2) AS product
    FROM columnstore_tab;
    $ sudo yum install curl
    $ sudo apt install curl apt-transport-https
    $ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    $ echo "${checksum}  mariadb_es_repo_setup" \
           | sha256sum -c -
    $ chmod +x mariadb_es_repo_setup
    $ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
          --skip-maxscale \
          --skip-tools \
          --mariadb-server-version="11.4"
    $ sudo yum install jemalloc jq curl
    $ sudo apt install libjemalloc1 jq curl
    $ sudo apt install libjemalloc2 jq curl
    $ sudo yum install MariaDB-server \
       MariaDB-backup \
       MariaDB-shared \
       MariaDB-client \
       MariaDB-columnstore-engine \
       MariaDB-columnstore-cmapi
    $ sudo apt install mariadb-server \
       mariadb-backup \
       libmariadb3 \
       mariadb-client \
       mariadb-plugin-columnstore \
       mariadb-columnstore-cmapi
    $ sudo yum install curl
    $ sudo apt install curl apt-transport-https
    $ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    
    $ echo "${checksum}  mariadb_es_repo_setup" \
           | sha256sum -c -
    
    $ chmod +x mariadb_es_repo_setup
    
    $ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
          --skip-server \
          --skip-tools \
          --mariadb-maxscale-version="22.08"
    $ sudo yum install maxscale
    $ sudo apt install maxscale
Run the following commands on all nodes.

    For RPM-based systems, run this command:

    For DEB-based systems, run these commands:

    Refer to this blog post for additional information.
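The package set matches the multi-node install shown elsewhere in this guide:

```shell
# RPM-based systems:
sudo yum install MariaDB-server MariaDB-backup MariaDB-shared \
   MariaDB-client MariaDB-columnstore-engine MariaDB-columnstore-cmapi

# DEB-based systems:
sudo apt install mariadb-server mariadb-backup libmariadb3 \
   mariadb-client mariadb-plugin-columnstore mariadb-columnstore-cmapi
```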

3. Start and Enable Services
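Assuming the standard service names, and noting that in a CMAPI-managed cluster the ColumnStore processes themselves are started through CMAPI rather than enabled directly:

```shell
sudo systemctl enable --now mariadb
sudo systemctl enable --now mariadb-columnstore-cmapi
```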

4. Configure the Initial RW Node

    On the primary RW node, set up the cluster API key (use a secure API key):

5. Add the Initial RW Node to the Cluster

    Run this from the primary RW node:
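A sketch using the same CMAPI endpoint style as the cluster start/shutdown calls in this guide (node IP address and API key are illustrative):

```shell
curl -k -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/node \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:your_api_key' \
   --data '{"timeout":120, "node": "192.0.2.1"}' | jq .
```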

6. Add Read Replica Nodes

    From the primary RW node, add each read replica:

7. Verify the Cluster

    Check the status to ensure nodes are added and the cluster is healthy:
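A status check can use the CMAPI status endpoint (API key illustrative):

```shell
curl -k -s https://127.0.0.1:8640/cmapi/0.4.0/cluster/status \
   --header 'x-api-key:your_api_key' | jq .
```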

8. Configure Replication Between Nodes

See the replication documentation for instructions on how to set up replication, and this page for instructions on how to create user accounts and configure replication for multi-node local storage.

9. Configure MaxScale

    See this page for instructions.

  • Scales reads via MaxScale; requires Enterprise Server 10.3+ and MaxScale 2.5+

  • Scales reads via MaxScale; requires Enterprise Server 10.3+, MariaDB Enterprise Cluster (powered by Galera), and MaxScale 2.5+

  • Enterprise Server, Enterprise ColumnStore, and MaxScale; optional Read Replica topology

    Directories for Shared Local Storage

    In a ColumnStore Object Storage topology, MariaDB Enterprise ColumnStore requires the Storage Manager directory to be located on shared local storage.

    The Storage Manager directory is at the following path:

    • /var/lib/columnstore/storagemanager

    The Storage Manager directory must be mounted on every ColumnStore node.

    Choose a Shared Local Storage Solution

    Select a Shared Local Storage solution for the Storage Manager directory:

    • EBS (Elastic Block Store) Multi-Attach

    • EFS (Elastic File System)

    • Filestore

    • GlusterFS

    For additional information, see "Shared Local Storage Options".

    Configure EBS Multi-Attach

    EBS is a high-performance block-storage service for AWS (Amazon Web Services). EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.

    For Enterprise ColumnStore deployments in AWS:

    • EBS Multi-Attach is a recommended option for the Storage Manager directory.

    • Amazon S3 storage is the recommended option for data.

    • Consult the vendor documentation for details on how to configure EBS Multi-Attach.

    Configure Elastic File System (EFS)

EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).

    For deployments in AWS:

    • EFS is a recommended option for the Storage Manager directory.

    • Amazon S3 storage is the recommended option for data.

    • Consult the vendor documentation for details on how to configure EFS.

    Configure Filestore

    Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).

    For Enterprise ColumnStore deployments in GCP:

    • Filestore is the recommended option for the Storage Manager directory.

    • Google Object Storage (S3-compatible) is the recommended option for data.

    • Consult the vendor documentation for details on how to configure Filestore.

    Configure GlusterFS

    GlusterFS is a distributed file system. GlusterFS is a shared local storage option, but it is not one of the recommended options.

    For more information, see "Recommended Storage Options".

    Install GlusterFS

    On each Enterprise ColumnStore node, install GlusterFS.

    Install on CentOS / RHEL 8 (YUM):

    Install on CentOS / RHEL 7 (YUM):

    Install on Debian (APT):

    Install on Ubuntu (APT):
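Assuming the distribution's glusterfs-server package provides both the daemon and the CLI (on CentOS/RHEL an extra repository may be required), the installation is one command per platform:

```shell
# CentOS / RHEL:
sudo yum install glusterfs-server

# Debian / Ubuntu:
sudo apt install glusterfs-server
```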

    Start the GlusterFS Daemon

    Start the GlusterFS daemon:
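Assuming the systemd unit name glusterd from the glusterfs-server package:

```shell
sudo systemctl enable --now glusterd
```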

    Probe the GlusterFS Peers

    Before you can create a volume with GlusterFS, you must probe each node from a peer node.

    1. On the primary node, probe all of the other cluster nodes:

2. On one of the replica nodes, probe the primary node to confirm that it is connected:

3. On the primary node, check the peer status:
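With hypothetical host names mcs1 (primary), mcs2, and mcs3, the probe sequence looks like:

```shell
# On the primary node (mcs1), probe the other cluster nodes:
sudo gluster peer probe mcs2
sudo gluster peer probe mcs3

# On one of the replica nodes, probe the primary node:
sudo gluster peer probe mcs1

# Back on the primary node, check the peer status:
sudo gluster peer status
```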

    Configure and Mount GlusterFS Volumes

    Create the GlusterFS volumes for MariaDB Enterprise ColumnStore. Each volume must have the same number of replicas as the number of Enterprise ColumnStore nodes.

    1. On each Enterprise ColumnStore node, create the directory for each brick in the /brick directory:

2. On the primary node, create the GlusterFS volumes:

3. On the primary node, start the volume:

4. On each Enterprise ColumnStore node, create mount points for the volumes:

    1. On each Enterprise ColumnStore node, add the mount points to /etc/fstab:

    1. On each Enterprise ColumnStore node, mount the volumes:
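The command blocks for these steps were dropped from this export. A sketch for a 3-node deployment; the hostnames (mcs1-mcs3) and the use of one replicated volume per ColumnStore directory are assumptions:

```shell
# 1. On each node, create a brick directory per volume:
sudo mkdir -p /brick/storagemanager /brick/data1 /brick/data2 /brick/data3

# 2. On the primary node, create one replica-3 volume per directory
#    (repeat for data1, data2, data3):
sudo gluster volume create storagemanager replica 3 \
    mcs1:/brick/storagemanager mcs2:/brick/storagemanager mcs3:/brick/storagemanager

# 3. On the primary node, start each volume:
sudo gluster volume start storagemanager

# 4. On each node, create the mount points:
sudo mkdir -p /var/lib/columnstore/storagemanager

# 5. On each node, persist the mounts in /etc/fstab:
echo "mcs1:/storagemanager /var/lib/columnstore/storagemanager glusterfs defaults,_netdev 0 0" \
    | sudo tee -a /etc/fstab

# 6. On each node, mount the volumes:
sudo mount /var/lib/columnstore/storagemanager
```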

    Configure Network File System (NFS)

    NFS is a distributed file system. NFS is available in most Linux distributions. If NFS is used for an Enterprise ColumnStore deployment, the storage must be mounted with the sync option to ensure that each node flushes its changes immediately.

    For on-premises deployments:

    • NFS is the recommended option for the Storage Manager directory.

    • Any S3-compatible storage is the recommended option for data.

    Consult the documentation for your NFS implementation for details on how to configure NFS.

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 2 of 9.

    Next: Step 3: Install MariaDB Enterprise Server.

    Note: The instructions were tested against ColumnStore 23.10.

    Directories for Shared Local Storage

    In a ColumnStore Shared Local Storage topology, MariaDB Enterprise ColumnStore requires the Storage Manager directory and the DB Root directories to be located on shared local storage.

    The Storage Manager directory is at the following path:

    • /var/lib/columnstore/storagemanager

    The DB Root directories are at the following paths:

    • /var/lib/columnstore/dataN

    The N in dataN represents an integer that ranges from 1 to the number of nodes in the deployment. For example, with a 3-node Enterprise ColumnStore deployment, this refers to the following directories:

    • /var/lib/columnstore/data1

    • /var/lib/columnstore/data2

    • /var/lib/columnstore/data3

    The DB Root directories must be mounted on every ColumnStore node.

    Choose a Shared Local Storage Solution

    Select a Shared Local Storage solution for the Storage Manager directory:

    • EBS (Elastic Block Store) Multi-Attach

    • EFS (Elastic File System)

    • Filestore

    • GlusterFS

    • NFS (Network File System)

    For additional information, see "Shared Local Storage Options".

    Configure EBS Multi-Attach

    EBS is a high-performance block-storage service for AWS (Amazon Web Services). EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.

    For Enterprise ColumnStore deployments in AWS:

    • EBS Multi-Attach is a recommended option for the Storage Manager directory.

    • Amazon S3 storage is the recommended option for data.

    • Consult the vendor documentation for details on how to configure EBS Multi-Attach.

    Configure Elastic File System (EFS)

    EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).

    For deployments in AWS:

    • EFS is a recommended option for the Storage Manager directory.

    • Amazon S3 storage is the recommended option for data.

    • Consult the vendor documentation for details on how to configure EFS.

    Configure Filestore

    Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).

    For Enterprise ColumnStore deployments in GCP:

    • Filestore is the recommended option for the Storage Manager directory.

    • Google Object Storage (S3-compatible) is the recommended option for data.

    • Consult the vendor documentation for details on how to configure Filestore.

    Configure GlusterFS

    GlusterFS is a distributed file system.

    GlusterFS is a shared local storage option, but it is not one of the recommended options.

    For more information, see "Recommended Storage Options".

    Install GlusterFS

    On each Enterprise ColumnStore node, install GlusterFS.

    Install on CentOS / RHEL 8 (YUM):

    Install on CentOS / RHEL 7 (YUM):

    Install on Debian (APT):

    Install on Ubuntu (APT):

    Start the GlusterFS Daemon

    Start the GlusterFS daemon:

    Probe the GlusterFS Peers

    Before you can create a volume with GlusterFS, you must probe each node from a peer node.

    1. On the primary node, probe all of the other cluster nodes:

    2. On one of the replica nodes, probe the primary node to confirm that it is connected:

    3. On the primary node, check the peer status:

    Number of Peers: 2

    Configure and Mount GlusterFS Volumes

    Create the GlusterFS volumes for MariaDB Enterprise ColumnStore. Each volume must have the same number of replicas as the number of Enterprise ColumnStore nodes.

    1. On each Enterprise ColumnStore node, create the directory for each brick in the /brick directory:

    2. On the primary node, create the GlusterFS volumes:

    3. On the primary node, start the volume:

    4. On each Enterprise ColumnStore node, create mount points for the volumes:

    5. On each Enterprise ColumnStore node, add the mount points to /etc/fstab:

    6. On each Enterprise ColumnStore node, mount the volumes:

    Configure Network File System (NFS)

    NFS is a distributed file system. NFS is available in most Linux distributions. If NFS is used for an Enterprise ColumnStore deployment, the storage must be mounted with the sync option to ensure that each node flushes its changes immediately.

    For on-premises deployments:

    • NFS is the recommended option for the Storage Manager directory.

    • Any S3-compatible storage is the recommended option for data.

    Consult the documentation for your NFS implementation for details on how to configure NFS.

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 2 of 9.

    Next: Step 3: Install MariaDB Enterprise Server.

    Note: The instructions were tested against ColumnStore 23.10.

    Optimize Linux Kernel Parameters

    MariaDB Enterprise ColumnStore performs best with Linux kernel optimizations.

    On each server to host an Enterprise ColumnStore node, optimize the kernel:

    1. Set the relevant kernel parameters in a sysctl configuration file. To ensure proper change management, use an Enterprise ColumnStore-specific configuration file.

    Create a /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf file:

    2. Use the sysctl command to set the kernel parameters at runtime.
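The file contents and the follow-up command were dropped from this export. A sketch using the parameter values recommended elsewhere in this guide:

```shell
# 1. Create the ColumnStore-specific sysctl configuration file:
sudo tee /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf <<'EOF'
vm.overcommit_memory = 1
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.vfs_cache_pressure = 50
EOF

# 2. Apply the settings at runtime without a reboot:
sudo sysctl -p /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
```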

    Temporarily Configure Linux Security Modules (LSM)

    The Linux Security Modules (LSM) should be temporarily disabled on each Enterprise ColumnStore node during installation.

    The LSM will be configured and re-enabled later in this deployment procedure.

    The steps to disable the LSM depend on the specific LSM used by the operating system.

    CentOS / RHEL Stop SELinux

    SELinux must be set to permissive mode before installing MariaDB Enterprise ColumnStore.

    To set SELinux to permissive mode:

    1. Set SELinux to permissive mode:

    2. Set SELinux to permissive mode by setting SELINUX=permissive in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    3. Confirm that SELinux is in permissive mode:

    SELinux will be configured and re-enabled later in this deployment procedure. This configuration is not persistent. If you restart the server before configuring and re-enabling SELinux later in the deployment procedure, you must reset the enforcement to permissive mode.
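The command blocks for the steps above were dropped from this export; a sketch:

```shell
# 1. Switch to permissive mode for the running system:
sudo setenforce Permissive

# 2. Persist the setting in the SELinux configuration file:
sudo sed -i 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config

# 3. Confirm the current mode:
getenforce    # expected output: Permissive
```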

    Debian / Ubuntu AppArmor

    AppArmor must be disabled before installing MariaDB Enterprise ColumnStore.

    1. Disable AppArmor:

    2. Reboot the system.

    3. Confirm that no AppArmor profiles are loaded using aa-status:

    AppArmor will be configured and re-enabled later in this deployment procedure.
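The command blocks for the steps above were dropped from this export; a sketch:

```shell
# 1. Disable the AppArmor service:
sudo systemctl disable apparmor

# 2. Reboot so that no profiles remain loaded:
sudo reboot

# 3. After the reboot, confirm that no profiles are loaded:
sudo aa-status
```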

    Configure Character Encoding

    When using MariaDB Enterprise ColumnStore, it is recommended to set the system's locale to UTF-8.

    1. On RHEL 8, install additional dependencies:

    2. Set the system's locale to en_US.UTF-8 by executing localedef:
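The commands were dropped from this export. A sketch; the RHEL 8 package names are assumptions:

```shell
# 1. RHEL 8 only -- the locale source files may need to be installed:
sudo yum install -y glibc-locale-source glibc-langpack-en

# 2. Generate and set the UTF-8 locale:
sudo localedef -i en_US -f UTF-8 en_US.UTF-8
```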

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    This page was step 1 of 5.

    Next: Step 2: Install MariaDB Enterprise ColumnStore.

    Note: The instructions were tested against ColumnStore 23.10.

    Recommended Parameters

    The following are the recommended optimized Linux kernel parameters for MariaDB ColumnStore:

    For more information, see the related links at the end of this page.

    • vm.overcommit_memory = 1: Tells the kernel to always allow memory overcommit, so that ColumnStore's large allocations are not refused unnecessarily.

    • vm.dirty_background_ratio = 5: Sets the percentage of dirty memory at which background writeback to disk begins. A lower value reduces the amount of dirty memory, improving performance.

    • vm.dirty_ratio = 10: Sets the percentage of dirty memory at which processes are forced to write dirty pages to disk synchronously. A lower value reduces the amount of dirty memory, improving performance.

    • vm.vfs_cache_pressure = 50: Controls how aggressively the kernel reclaims memory used for caching directory and inode objects. A lower value keeps more filesystem metadata cached.

    Configuration Example

    To configure these parameters, you can add them to the /etc/sysctl.conf file. For example:

    After making changes to the /etc/sysctl.conf file, you need to apply the changes by running the following command:
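The example file contents and the apply command were dropped from this export; a sketch using the parameters recommended above:

```shell
# Append the recommended values to /etc/sysctl.conf:
cat <<'EOF' | sudo tee -a /etc/sysctl.conf
vm.overcommit_memory = 1
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.vfs_cache_pressure = 50
EOF

# Apply the changes:
sudo sysctl -p
```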

    Increase the Limit for Memory-Mapped Areas
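The body of this section was lost in the export. ColumnStore's many memory-mapped files can exhaust the kernel's default vm.max_map_count limit; the value below is an illustrative assumption, not an official recommendation:

```shell
# Raise the limit on memory-mapped areas per process (example value):
echo 'vm.max_map_count = 1048576' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
```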

    Common Use Cases

    These optimized parameters are recommended for all MariaDB ColumnStore deployments, regardless of the specific workload. They can improve performance for various use cases, including:

    • Large-scale data warehousing

    • Real-time analytics

    • Business intelligence

    • Machine learning

    Related Links

    • MariaDB ColumnStore Documentation

    • Linux Kernel Documentation

    • MCOL-5165: Add optimized Linux kernel parameters for MariaDB ColumnStore

    Conclusion

    By optimizing the Linux kernel parameters, you can significantly improve the performance of your MariaDB ColumnStore deployments. These recommendations provide a starting point for optimizing your system, and you may need to adjust the values based on your specific hardware and workload.

  • Extent Maps: ColumnStore maintains metadata about each column in a shared distributed object known as the Extent Map. The primary node references the Extent Map to generate the correct primitive job steps and to identify the correct disk blocks to read. Each column is made up of one or more files, and each file can contain multiple extents. As much as possible, the system attempts to allocate contiguous physical storage to improve read performance.

  • Storage: ColumnStore can use either local storage or shared storage (e.g. SAN or EBS) to store data. Using shared storage allows for data processing to fail over to another node automatically in case of a server failing.

    Data Loading

    The system supports full MVCC ACID transactional logic via INSERT, UPDATE, and DELETE statements. The MVCC architecture allows for concurrent query and DML / batch load. Although DML is supported, the system is optimized for batch inserts, so larger data loads should be performed as batch loads. The most flexible and optimal way to load data is via the cpimport tool. This tool optimizes the load path and can be run centrally or in parallel on each server.

    If the data contains a time column (or a time-correlated, ascending value), significant performance gains can be achieved by sorting the data on this field, since such a column is also typically used in WHERE clauses. The system records a minimum and maximum value for each extent, providing a system-maintained range partitioning scheme. This allows the system to skip scanning entire extents when a query includes a WHERE clause on that field that limits the results to a subset of extents.

    Query Execution

    MariaDB ColumnStore has its own query optimizer and execution engine distinct from the MariaDB server implementation. This allows for scaling out query execution to multiple servers, and to optimize for handling data stored as columns rather than rows. As such, the factors influencing query performance are very different:

    A query is first parsed by the MariaDB server (mariadbd) process and passed to the ColumnStore storage engine, which forwards the request to the PrimProc process responsible for optimizing and orchestrating execution of the query. The PrimProc optimizer creates a series of batch primitive steps that are executed on all nodes in the cluster. Since multiple servers can be deployed, this allows for scale-out execution of queries. The optimizer attempts to process query execution in parallel; however, certain operations, such as final result ordering, inherently must be executed centrally. Filtering, joins, aggregates, and GROUP BY clauses are generally pushed down and executed in parallel in PrimProc on all servers. Within PrimProc, batch primitive steps are performed at a granular level, where individual threads operate on individual 1K-8K blocks within an extent. This enables a large multi-core server to be fully utilized and to scale within a single server. The current batch primitive steps available in the system include:

    • Single Column Scan: Scan one or more extents for a given column based on a single-column predicate, including operators like =, <>, IN (list), BETWEEN, and IS NULL. See the first scan section of performance configuration for additional details on tuning this.

    • Additional Single Column Filters: Project additional columns for any rows found by a previous scan and apply additional single column predicates as needed. Access of blocks is based on row identifier, going directly to the blocks. See the additional column read section of performance configuration for additional details on tuning this.

    • Table Level Filters: Project additional columns as required for any table level filters such as column1 < column2, or more advanced functions and expressions. Access of blocks is again based on row identifier, going directly to the blocks.

    • Project Join Columns for Joins: Project additional join columns as needed for any join operations. Access of blocks is again based on row identifier, going directly to the blocks. See the join tuning section of performance configuration for additional details on tuning this.

    • Execute Multi-Join: Apply one or more hash join operations against projected join columns, and use that value to probe a previously built hash map. Build out tuples as needed to satisfy inner or outer join requirements. See the multi-table join section of performance configuration for additional details on tuning this.

    • Cross-Table Level Filters: Project additional columns from the range of rows for the Primitive Step as needed for any cross-table level filters such as table1.column1 < table2.column2, or more advanced functions and expressions. Access of blocks is again based on row identifier, going directly to the blocks.

    • Aggregation/Distinct Operation Part 1: Apply any local group by, distinct, or aggregation operation against the set of joined rows assigned to a given Batch Primitive. Part 1 of this process is handled by PrimProc.

    • Aggregation/Distinct Operation Part 2: Apply any final group by, distinct, or aggregation operation against the set of joined rows assigned to a given Batch Primitive. This processing is handled by PrimProc. See the memory management section of performance configuration for additional details on tuning this.

    ColumnStore Query Execution Paradigms

    The following items should be considered when thinking about query execution in ColumnStore vs. a row-based store such as InnoDB.

    Data Scanning and Filtering

    ColumnStore is optimized for large-scale aggregation / OLAP queries over large data sets. As such, indexes typically used to optimize query access in row-based systems do not make sense, since selectivity is low for such queries. Instead, ColumnStore gains performance by scanning only the necessary columns, utilizing system-maintained partitioning, and utilizing multiple threads and servers to scale query response time.

    Since ColumnStore only reads the columns necessary to resolve a query, include only the columns the query actually needs. For example, SELECT col1, col2 FROM tbl is significantly faster than SELECT * FROM tbl.

    Datatype size is important. If, for example, a column can only hold values 0 through 100, declare it as TINYINT, which is stored in 1 byte rather than the 4 bytes of INT. This reduces the I/O cost by a factor of four.

    For string types, an important threshold is CHAR(9) and VARCHAR(8) or greater. Each column storage file uses a fixed number of bytes per value, which enables fast positional lookup of other columns to form the row. Currently, the upper limit for columnar data storage is 8 bytes. So, for strings longer than this, the system maintains an additional 'dictionary' extent where the values are stored, and the columnar extent file stores a pointer into the dictionary. For example, it is more expensive to read and process a VARCHAR(8) column than a CHAR(8) column. Where possible, you get better performance with shorter strings, especially if you avoid the dictionary lookup. All TEXT/BLOB data types in ColumnStore 1.1 onward utilize a dictionary and perform a multi-block 8KB lookup to retrieve that data if required. The longer the data, the more blocks are retrieved, and the greater the potential performance impact.

    In a row-based system, adding redundant columns adds to the overall query cost, but in a columnar system a cost is only incurred if the column is referenced. Therefore, additional columns can be created to support different access paths. For instance, store a leading portion of a field in one column to allow for faster lookups, but additionally store the long-form value as another column. Scans on a shorter code or leading-portion column are faster.

    ColumnStore distributes function application across all nodes for greater performance, but this requires a distributed implementation of the function in addition to the MariaDB server implementation. See Distributed Functions for the full list.

    Note: ColumnStore does not have a cost-based optimizer, so for optimal extent elimination and performance, the first predicate in your WHERE clause should reference the column the data was ordered by at import time. Example: most use cases with a date column benefit from a natural sort (today's data is inserted after yesterday's data), so filtering by date first efficiently narrows the records; WHERE DATE='x' outperforms a query whose first predicate is on a column with random values. Compare different query plans using calSetTrace and calGetTrace, optimizing for the lowest PIO/LIO and highest PBE. See also CSEP.

    Joins

    ColumnStore uses hash joins to optimize large-scale joins, avoiding the need for indexes and the overhead of nested-loop processing. ColumnStore maintains table statistics to determine the optimal join order. It first identifies the smaller table side (based on Extent Map data) and materializes the necessary rows from that table for the join. If their size is less than the configuration setting PmMaxMemorySmallSide, the join is pushed down into PrimProc for distributed in-memory processing. Otherwise, rows from the larger side are not joined in a distributed manner; only the WHERE clause on that side is executed across all PrimProc modules in the cluster. If the join is too large for memory, a disk-based join can be enabled to allow the query to complete.

    Aggregations

    Similarly to scalar functions, ColumnStore distributes aggregate evaluation as much as possible. However, some post-processing is required to combine the final results. Enough memory must exist to handle queries with a very large number of distinct values in the aggregate columns.

    Aggregation performance is also influenced by the number of distinct aggregate column values. Generally, the same number of rows computes faster with 100 distinct values than with 10,000 distinct values, due to increased memory management and transfer overhead.

    Note: SELECT COUNT(*) is internally optimized to SELECT COUNT(COL-N), where COL-N is the column that uses the fewest bytes of storage. For example, it would pick a CHAR(1) column over an INT column, because CHAR(1) uses 1 byte of storage while INT uses 4 bytes. The implementation still honors ANSI semantics in that SELECT COUNT(*) includes NULLs in the total count, as opposed to an explicit SELECT COUNT(COL-N), which excludes NULL values from the count.

    ORDER BY and LIMIT

    ORDER BY and LIMIT are applied at the very end by the mariadbd server process on the temporary result set table. This means that the unsorted results must be fully retrieved before either is applied. The performance overhead is minimal for small to medium result sets, but it can be significant for larger ones.

    Complex Queries

    Subqueries are executed in sequence: the intermediate results of the subquery must be materialized, and then the join logic is applied with the outer query.

    Window functions are executed as part of final aggregation in PrimProc due to the need for ordering of the window results. The ColumnStore window function engine uses a dedicated, faster sort process.

    Partitioning

    Automated system partitioning of columns is provided by ColumnStore. As data is loaded, new rows are appended to the current extent until it is full, at which point a new extent is created, and the system captures and maintains the min/max values of the column data for each extent. For column values that are ordered or semi-ordered, this allows for very effective data partitioning: using the min and max values, entire extents can be eliminated and never read when filtering data. This generally works particularly well for time-dimension or time-series data and similar values that increase over time.

    Feature summary:

    • Indexes: Unnecessary (see Extent Elimination)

    • Compression: Yes

    • High Availability (HA): Yes (see High Availability and Failover)

    • Main Memory Caching: Yes

    • Transaction Logging: Yes (see Version Buffer)

    • Garbage Collection: Yes (see Version Buffer)

    • Online Schema Changes: Yes (see Online Schema Changes)

    • Non-locking Reads: Yes (see Lockless Reads)

    Related system variables:

    • character_set_server: Set this system variable to utf8.

    • collation_server: Set this system variable to utf8_general_ci.

    • columnstore_use_import_for_batchinsert: Set this system variable to ALWAYS to always use cpimport for LOAD DATA INFILE and INSERT ... SELECT statements.

    Note: The instructions were tested against ColumnStore 23.10.

    Shell: cpimport

    • SQL access is not required

    SQL: LOAD DATA INFILE

    • Shell access is not required

    Remote Database: Remote Database Import

    • Use a normal database client

    • Avoid dumping data to intermediate files


    Single-Node Localstorage

    This guide provides steps for deploying a single-node ColumnStore, setting up the environment, installing the software, and bulk importing data for online analytical processing (OLAP) workloads.

    This procedure describes the deployment of the Single-Node Enterprise ColumnStore topology with Local storage.

    MariaDB Enterprise ColumnStore is a columnar storage engine for MariaDB Enterprise Server 10.6. Enterprise ColumnStore is best suited for Online Analytical Processing (OLAP) workloads.

    This procedure has 5 steps, which are executed in sequence.

    Note: The instructions were tested against ColumnStore 23.10.

    This page provides an overview of the topology, requirements, and deployment procedures.

    Please read and understand this procedure before executing.

    Procedure Steps

    Step
    Description

    Support

    Customers can obtain support through MariaDB Support.

    Components

    The following components are deployed during this procedure:

    Component
    Function

    MariaDB Enterprise Server Components

    Component
    Description

    Topology

    The Single-Node Enterprise ColumnStore topology provides support for Online Analytical Processing (OLAP) workloads to MariaDB Enterprise Server.

    The Enterprise ColumnStore node:

    • Receives queries from the application

    • Executes queries

    • Uses the local disk for storage.

    High Availability

    Single-Node Enterprise ColumnStore does not provide high availability (HA) for Online Analytical Processing (OLAP). If you would like to deploy Enterprise ColumnStore with high availability, see the multi-node ColumnStore deployment topologies.

    Requirements

    These requirements are for the Single-Node Enterprise ColumnStore, when deployed with MariaDB Enterprise Server 10.6 and MariaDB Enterprise ColumnStore 23.10.

    Operating System

    • Debian 11 (x86_64, ARM64)

    • Debian 12 (x86_64, ARM64)

    • Red Hat Enterprise Linux 8 (x86_64, ARM64)

    • Red Hat Enterprise Linux 9 (x86_64, ARM64)

    Minimum Hardware Requirements

    MariaDB Enterprise ColumnStore's minimum hardware requirements are not intended for production environments, but the minimum hardware requirements can be appropriate for development and test environments. For production environments, see the Recommended Hardware Requirements instead.

    The minimum hardware requirements are:

    Component
    CPU
    Memory

    MariaDB Enterprise ColumnStore will refuse to start if the system has less than 3 GB of memory.

    If Enterprise ColumnStore is started on a system with less memory, the following error message will be written to the ColumnStore system log called crit.log:

    And the following error message will be raised to the client:

    Recommended Hardware Requirements

    MariaDB Enterprise ColumnStore's recommended hardware requirements are intended for production analytics.

    The recommended hardware requirements are:

    Component
    CPU
    Memory

    Quick Reference

    MariaDB Enterprise Server Configuration Management

    Method
    Description

    MariaDB Enterprise Server packages are configured to read configuration files from different paths, depending on the operating system. Making custom changes to Enterprise Server default configuration files is not recommended because custom changes may be overwritten by other default configuration files that are loaded later.

    To ensure that your custom changes will be read last, create a custom configuration file with the z- prefix in one of the include directories.

    Distribution
    Example Configuration File Path

    MariaDB Enterprise Server Service Management

    The systemctl command is used to start and stop the MariaDB Enterprise Server service.

    Operation
    Command

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    • Next: Step 1: Install MariaDB Enterprise ColumnStore 23.10.

    Step 7: Start and Configure MariaDB MaxScale

    Overview

    This page details step 7 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step starts and configures MariaDB MaxScale 22.08.

    Note: The instructions were tested against ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Replace the Default Configuration File

    MariaDB MaxScale installations include a configuration file with some example objects. This configuration file should be replaced.

    On the MaxScale node, replace the default /etc/maxscale.cnf with the following configuration:
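The configuration file contents were dropped from this export. A minimal sketch of a replacement /etc/maxscale.cnf containing only a global section; the parameter choices are assumptions:

```shell
# Write a minimal global configuration (object definitions are created
# later at runtime with maxctrl):
sudo tee /etc/maxscale.cnf <<'EOF'
[maxscale]
threads = auto
EOF
```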

    For additional information, see "Global Parameters".

    Restart MaxScale

    On the MaxScale node, restart the MaxScale service to ensure that MaxScale picks up the new configuration:
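On systemd-based distributions, the restart would typically be:

```shell
sudo systemctl restart maxscale
sudo systemctl status maxscale   # confirm the service is active
```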

    For additional information, see "Start and Stop Services".

    Configure Server Objects

    On the MaxScale node, use maxctrl to create a server object for each Enterprise ColumnStore node:
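The command block was dropped from this export. A sketch for three nodes; the server names and addresses are placeholders:

```shell
maxctrl create server mcs1 192.0.2.101 3306
maxctrl create server mcs2 192.0.2.102 3306
maxctrl create server mcs3 192.0.2.103 3306
```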

    Configure MariaDB Monitor

    MaxScale uses monitors to retrieve additional information from the servers. This information is used by other services in filtering and routing connections based on the current state of the node. For MariaDB Enterprise ColumnStore, use the MariaDB Monitor (mariadbmon).

    On the MaxScale node, use maxctrl to create a MariaDB Monitor:
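The command block was dropped from this export. A sketch matching the parameter names described below; the credentials and server names are placeholders:

```shell
maxctrl create monitor columnstore_monitor mariadbmon \
    user=MAXSCALE_USER password=MAXSCALE_USER_PASSWORD \
    servers=mcs1,mcs2,mcs3
```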

    In this example:

    • columnstore_monitor is an arbitrary name that is used to identify the new monitor.

    • mariadbmon is the name of the module that implements the MariaDB Monitor.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to monitor the ColumnStore nodes.

    Choose a MaxScale Router

    Routers control how MaxScale balances the load between Enterprise ColumnStore nodes. Each router uses a different approach to routing queries. Consider the specific use case of your application and database load and select the router that best suits your needs.

    Router
    Configuration Procedure
    Description

    Configure Read Connection Router

    Use the Read Connection Router (readconnroute) to route connections to replica servers for a read-only pool.

    On the MaxScale node, use maxctrl to create a router:
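The command block was dropped from this export. A sketch; credentials and server names are placeholders, and router_options=slave (to restrict the pool to replicas) is an assumption:

```shell
maxctrl create service connection_router_service readconnroute \
    user=MAXSCALE_USER password=MAXSCALE_USER_PASSWORD \
    router_options=slave \
    servers=mcs1,mcs2,mcs3
```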

    In this example:

    • connection_router_service is an arbitrary name that is used to identify the new service.

    • readconnroute is the name of the module that implements the Read Connection Router.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.

    Configure Listener for the Read Connection Router

    These instructions reference TCP port 3308. You can use a different TCP port. The TCP port used must not be bound by any other listener.

    On the MaxScale node, use the maxctrl create listener command to configure a listener for the Read Connection Router:
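The command block was dropped from this export; a sketch using the names described below:

```shell
maxctrl create listener connection_router_service connection_router_listener 3308 \
    protocol=MariaDBClient
```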

    In this example:

    • connection_router_service is the name of the readconnroute service that was previously created.

    • connection_router_listener is an arbitrary name that is used to identify the new listener.

    • 3308 is the TCP port.

    Configure Read/Write Split Router for Queries

    MaxScale performs query-based load balancing. The router routes write queries to the primary and read queries to the replicas.

    On the MaxScale node, use the maxctrl create service command to configure MaxScale to use the Read/Write Split Router:
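The command block was dropped from this export. A sketch; credentials and server names are placeholders:

```shell
maxctrl create service query_router_service readwritesplit \
    user=MAXSCALE_USER password=MAXSCALE_USER_PASSWORD \
    servers=mcs1,mcs2,mcs3
```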

    In this example:

    • query_router_service is an arbitrary name that is used to identify the new service.

    • readwritesplit is the name of the module that implements the Read/Write Split Router.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.
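    A sketch of the command; the password and server names are placeholder assumptions:

    ```shell
    # create a query-based read/write split service
    maxctrl create service query_router_service readwritesplit \
       user=MAXSCALE_USER password=MAXSCALE_PASS \
       --servers mcs1 mcs2 mcs3
    ```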

    Configure a Listener for the Read/Write Split Router

    These instructions reference TCP port 3307. You can use a different TCP port. The TCP port used must not be bound by any other listener.

    On the MaxScale node, use the maxctrl create listener command to configure MaxScale to use a listener for the Read/Write Split Router:

    In this example:

    • query_router_service is the name of the readwritesplit service that was previously created.

    • query_router_listener is an arbitrary name that is used to identify the new listener.

    • 3307 is the TCP port.

    • protocol=MariaDBClient sets the protocol used by the listener.
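    A sketch of the command, assuming the readwritesplit service created in the previous step:

    ```shell
    # attach a listener on TCP port 3307 to the read/write split service
    maxctrl create listener query_router_service query_router_listener 3307 \
       protocol=MariaDBClient
    ```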

    Start Services

    To start the services and monitors, on the MaxScale node use maxctrl start services:
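    For example:

    ```shell
    # start all configured MaxScale services
    maxctrl start services
    ```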

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology":

    This page was step 7 of 9.

    Step 1: Prepare ColumnStore Nodes


    Overview

    This page details step 1 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step prepares systems to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.


    The instructions were tested against ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Optimize Linux Kernel Parameters

    MariaDB Enterprise ColumnStore performs best with Linux kernel optimizations.

    On each server to host an Enterprise ColumnStore node, optimize the kernel:

    1. Set the relevant kernel parameters in a sysctl configuration file. To ensure proper change management, use an Enterprise ColumnStore-specific configuration file.

    Create a /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf file:

    2. Use the sysctl command to set the kernel parameters at runtime:
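    The two steps can be sketched as follows; the parameter values shown are illustrative examples, not a vendor-certified tuning list:

    ```shell
    # 1. write an Enterprise ColumnStore-specific sysctl file (values are illustrative)
    cat <<'EOF' | sudo tee /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
    # increase network buffers for inter-node traffic
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    # minimize swapping
    vm.swappiness = 1
    EOF

    # 2. apply the parameters at runtime without a reboot
    sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
    ```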

    Temporarily Configure Linux Security Modules (LSM)

    The Linux Security Modules (LSM) should be temporarily disabled on each Enterprise ColumnStore node during installation.

    The LSM will be configured and re-enabled later in this deployment procedure.

    The steps to disable the LSM depend on the specific LSM used by the operating system.

    CentOS / RHEL Stop SELinux

    SELinux must be set to permissive mode before installing MariaDB Enterprise ColumnStore.

    To set SELinux to permissive mode:

    1. Set SELinux to permissive mode at runtime:

    2. Set SELinux to permissive mode in the configuration by setting SELINUX=permissive in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    3. Confirm that SELinux is in permissive mode:

    SELinux will be configured and re-enabled later in this deployment procedure. This configuration is not persistent. If you restart the server before configuring and re-enabling SELinux later in the deployment procedure, you must reset the enforcement to permissive mode.
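    The three steps can be sketched as follows (the sed edit of /etc/selinux/config is one way to make the change; editing the file by hand is equivalent):

    ```shell
    # 1. switch SELinux to permissive mode at runtime
    sudo setenforce permissive

    # 2. record the mode in the configuration file
    sudo sed -i 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config

    # 3. confirm the current mode
    getenforce
    ```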

    Debian / Ubuntu AppArmor

    AppArmor must be disabled before installing MariaDB Enterprise ColumnStore.

    1. Disable AppArmor:

    2. Reboot the system.

    3. Confirm that no AppArmor profiles are loaded using aa-status:

    AppArmor will be configured and re-enabled later in this deployment procedure.
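    The steps above can be sketched as follows (disabling the systemd unit is one common way to prevent profiles from loading):

    ```shell
    # prevent AppArmor profiles from loading at boot
    sudo systemctl disable apparmor

    sudo reboot

    # after the reboot, confirm that no profiles are loaded
    sudo aa-status
    ```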

    Temporarily Configure Firewall for Installation

    MariaDB Enterprise ColumnStore requires the following TCP ports:

    TCP Ports
    Description

    The firewall should be temporarily disabled on each Enterprise ColumnStore node during installation.

    The firewall will be configured and re-enabled later in this deployment procedure.

    The steps to disable the firewall depend on the specific firewall used by the operating system.

    CentOS / RHEL Stop firewalld

    1. Check if the firewalld service is running:

    2. If the firewalld service is running, stop it:

    Firewalld will be configured and re-enabled later in this deployment procedure.
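    The two steps can be sketched as:

    ```shell
    # check whether firewalld is active
    sudo systemctl status firewalld

    # stop it for the duration of the installation
    sudo systemctl stop firewalld
    ```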

    Ubuntu Stop UFW

    1. Check if the UFW service is running:

    2. If the UFW service is running, stop it:

    UFW will be configured and re-enabled later in this deployment procedure.
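    The two steps can be sketched as:

    ```shell
    # check whether the UFW service is active
    sudo systemctl status ufw

    # stop it for the duration of the installation
    sudo systemctl stop ufw
    ```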

    Configure the AWS Security Group

    To install Enterprise ColumnStore on Amazon Web Services (AWS), the security group must be modified prior to installation.

    Enterprise ColumnStore requires all internal communications to be open between Enterprise ColumnStore nodes. Therefore, the security group should allow all protocols and all ports to be open between the Enterprise ColumnStore nodes and the MaxScale proxy.

    Configure Character Encoding

    When using MariaDB Enterprise ColumnStore, it is recommended to set the system's locale to UTF-8.

    1. On RHEL 8, install additional dependencies:

    2. Set the system's locale to en_US.UTF-8 by executing localedef:
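    A sketch of both steps; the RHEL 8 package names are assumptions based on the packages that commonly provide locale sources and English locale data:

    ```shell
    # RHEL 8: install locale sources and English language pack (assumed package names)
    sudo dnf install -y glibc-locale-source glibc-langpack-en

    # compile and install the en_US.UTF-8 locale
    sudo localedef -i en_US -f UTF-8 en_US.UTF-8
    ```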

    Configure DNS

    MariaDB Enterprise ColumnStore requires all nodes to have host names that are resolvable on all other nodes. If your infrastructure does not configure DNS centrally, you may need to configure static DNS entries in the /etc/hosts file of each server.

    On each Enterprise ColumnStore node, edit the /etc/hosts file to map host names to the IP address of each Enterprise ColumnStore node:

    Replace the IP addresses with the addresses in your own environment.
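    An illustrative /etc/hosts fragment for a three-node deployment plus a MaxScale node; the host names and documentation-range addresses are placeholders:

    ```shell
    # /etc/hosts — example static entries; replace with your own addresses and names
    192.0.2.1   mcs1
    192.0.2.2   mcs2
    192.0.2.3   mcs3
    192.0.2.100 mxs1
    ```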

    Create an S3 Bucket

    With the ColumnStore Object Storage topology, it is important to create the S3 bucket before you start ColumnStore. All Enterprise ColumnStore nodes access data from the same bucket.

    If you already have an S3 bucket, confirm that the bucket is empty.

    S3 bucket configuration will be performed later in this procedure.
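    For AWS S3, the bucket can be created and inspected with the AWS CLI; the bucket name and region below are placeholders:

    ```shell
    # create an empty bucket (bucket names are globally unique; choose your own)
    aws s3api create-bucket --bucket my-columnstore-bucket --region us-east-1

    # confirm that the bucket is empty
    aws s3 ls s3://my-columnstore-bucket
    ```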

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 1 of 9.


    Backup and Restore with Object Storage

    Overview

    MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses S3-compatible object storage for data and shared local storage for the Storage Manager directory, the S3 bucket, the Storage Manager directory, and the MariaDB data directory must be backed up separately.

    Recovery Planning

    MariaDB Enterprise ColumnStore supports multiple storage topologies.

    This page discusses how to back up and restore Enterprise ColumnStore when it uses S3-compatible object storage for data and shared local storage (such as NFS) for the Storage Manager directory.

    Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata become corrupt, Enterprise ColumnStore could become unusable, resulting in data loss.

    If Enterprise ColumnStore is your system of record, it should be backed up regularly.

    If Enterprise ColumnStore uses S3-compatible object storage for data and shared local storage for the Storage Manager directory, the following items must be backed up:

    • The MariaDB data directory must be backed up using MariaDB Enterprise Backup.

    • The S3 bucket must be backed up using the storage vendor's snapshot procedure.

    • The Storage Manager directory must be backed up.

    See the instructions below for more details.

    Backup

    Use the following process to take a backup:

    1. Determine which node is the primary server using curl to send the status command to the CMAPI Server:

    The output will show "dbrm_mode": "master" for the primary server:
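    A sketch of this check; the host name mcs1, port 8640, and the API key are placeholder assumptions for your CMAPI deployment:

    ```shell
    # query the CMAPI server for cluster status
    curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:API_KEY' | jq .
    # look for "dbrm_mode": "master" in the per-node output
    ```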

    2. Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:

    3. Lock the database with the FLUSH TABLES WITH READ LOCK statement:

    Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.

    4. Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.

    For example, to make a copy of the directory with rsync:
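    A sketch with rsync; the backup destination path is a placeholder:

    ```shell
    # copy the Storage Manager directory to a backup location
    sudo rsync -a /var/lib/columnstore/storagemanager/ /backups/columnstore/storagemanager/
    ```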

    5. Use MariaDB Enterprise Backup to back up the MariaDB data directory:

    6. Use MariaDB Enterprise Backup to prepare the backup:
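    The backup and prepare operations can be sketched as follows; the target directory and credentials are placeholders:

    ```shell
    # take a physical backup of the MariaDB data directory
    sudo mariadb-backup --backup --target-dir=/backups/mariadb \
       --user=backup_user --password='backup_passwd'

    # prepare the backup so that it is consistent and restorable
    sudo mariadb-backup --prepare --target-dir=/backups/mariadb
    ```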

    7. Create a snapshot of the S3-compatible storage. Consult the storage vendor's manual for details on how to do this.

    8. Ensure that all previous operations are complete.

    9. In the original client connection to the primary server, unlock the database with the UNLOCK TABLES statement:

    Restore

    Use the following process to restore a backup:

    1. Re-deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.

    2. Ensure that all services are stopped on each node:

    3. Restore the backup of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.

    For example, to restore the backup with rsync:
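    A sketch with rsync; the backup source path is a placeholder, and the ownership fix assumes the default mysql service account:

    ```shell
    # restore the Storage Manager directory from the backup copy
    sudo rsync -a /backups/columnstore/storagemanager/ /var/lib/columnstore/storagemanager/
    sudo chown -R mysql:mysql /var/lib/columnstore/storagemanager
    ```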

    4. Use MariaDB Enterprise Backup to restore the backup of the MariaDB data directory:
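    A sketch of the restore; the target directory is a placeholder, and the data directory location and mysql ownership are the defaults:

    ```shell
    # copy the prepared backup into the empty data directory
    sudo mariadb-backup --copy-back --target-dir=/backups/mariadb
    sudo chown -R mysql:mysql /var/lib/mysql
    ```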

    5. Restore the snapshot of your S3-compatible storage to the new S3 bucket that you plan to use. Consult the storage vendor's manual for details on how to do this.

    6. Update storagemanager.cnf to configure Enterprise ColumnStore to use the S3 bucket. By default, it is located at /etc/columnstore/storagemanager.cnf.

    For example:
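    An illustrative storagemanager.cnf fragment; the bucket name, endpoint, and credentials are placeholders:

    ```ini
    [ObjectStorage]
    service = S3

    [S3]
    bucket = my-columnstore-bucket
    endpoint = s3.amazonaws.com
    aws_access_key_id = PLACEHOLDER_KEY_ID
    aws_secret_access_key = PLACEHOLDER_SECRET

    [Cache]
    cache_size = 2g
    path = /var/lib/columnstore/storagemanager/cache
    ```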

    • The default local cache size is 2 GB.

    • The default local cache path is /var/lib/columnstore/storagemanager/cache.

    • Ensure that the local cache path has sufficient storage space to store the local cache.

    • The bucket option must be set to the name of the bucket that you created from your snapshot in the previous step.

    7. Start the services on each node:

    Step 7: Start and Configure MariaDB MaxScale


    Overview

    This page details step 7 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step starts and configures MariaDB MaxScale 22.08.


    The instructions were tested against ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Replace the Default Configuration File

    MariaDB MaxScale installations include a configuration file with some example objects. This configuration file should be replaced.

    On the MaxScale node, replace the default /etc/maxscale.cnf with the following configuration:
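    A minimal sketch of the global section; the parameter values shown are illustrative, not a complete recommended configuration:

    ```ini
    [maxscale]
    threads = auto
    admin_host = 0.0.0.0
    admin_secure_gui = false
    ```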

    For additional information, see "Global Parameters".

    Restart MaxScale

    On the MaxScale node, restart the MaxScale service to ensure that MaxScale picks up the new configuration:
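    For example:

    ```shell
    # restart MaxScale so it reads the new configuration
    sudo systemctl restart maxscale
    ```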

    For additional information, see "Start and Stop Services".

    Configure Server Objects

    On the MaxScale node, use the maxctrl create server command to create a server object for each Enterprise ColumnStore node:
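    A sketch of the commands; the server names, addresses, and port are placeholder assumptions:

    ```shell
    # one server object per ColumnStore node
    maxctrl create server mcs1 192.0.2.1 3306
    maxctrl create server mcs2 192.0.2.2 3306
    maxctrl create server mcs3 192.0.2.3 3306
    ```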

    Configure MariaDB Monitor

    MaxScale uses monitors to retrieve additional information from the servers. This information is used by other services in filtering and routing connections based on the current state of the node. For MariaDB Enterprise ColumnStore, use the MariaDB Monitor (mariadbmon).

    On the MaxScale node, use the maxctrl create monitor command to create a MariaDB Monitor:

    In this example:

    • columnstore_monitor is an arbitrary name that is used to identify the new monitor.

    • mariadbmon is the name of the module that implements the MariaDB Monitor.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to monitor the ColumnStore nodes.
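    A sketch of the command; the password and server names are placeholder assumptions:

    ```shell
    # register the MariaDB Monitor for the ColumnStore nodes
    maxctrl create monitor columnstore_monitor mariadbmon \
       user=MAXSCALE_USER password=MAXSCALE_PASS \
       --servers mcs1 mcs2 mcs3
    ```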

    Choose a MaxScale Router

    Routers control how MaxScale balances the load between Enterprise ColumnStore nodes. Each router uses a different approach to routing queries. Consider the specific use case of your application and database load and select the router that best suits your needs.

    Router
    Configuration Procedure
    Description

    Configure Read Connection Router

    Use the Read Connection Router (readconnroute) to route connections to replica servers for a read-only pool.

    On the MaxScale node, use the maxctrl create service command to create a router:

    In this example:

    • connection_router_service is an arbitrary name that is used to identify the new service.

    • readconnroute is the name of the module that implements the Read Connection Router.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.
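    A sketch of the command; the password and server names are placeholder assumptions:

    ```shell
    # create a read-only connection routing service
    maxctrl create service connection_router_service readconnroute \
       user=MAXSCALE_USER password=MAXSCALE_PASS router_options=slave \
       --servers mcs1 mcs2 mcs3
    ```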

    Configure Listener for the Read Connection Router

    These instructions reference TCP port 3308. You can use a different TCP port. The TCP port used must not be bound by any other listener.

    On the MaxScale node, use the maxctrl create listener command to configure MaxScale to use a listener for the Read Connection Router:

    In this example:

    • connection_router_service is the name of the readconnroute service that was previously created.

    • connection_router_listener is an arbitrary name that is used to identify the new listener.

    • 3308 is the TCP port.
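    A sketch of the command, assuming the service created in the previous step:

    ```shell
    # attach a listener on TCP port 3308 to the read connection service
    maxctrl create listener connection_router_service connection_router_listener 3308 \
       protocol=MariaDBClient
    ```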

    Configure Read/Write Split Router for Queries

    MaxScale performs query-based load balancing. The router routes write queries to the primary and read queries to the replicas.

    On the MaxScale node, use the maxctrl create service command to configure MaxScale to use the Read/Write Split Router:

    In this example:

    • query_router_service is an arbitrary name that is used to identify the new service.

    • readwritesplit is the name of the module that implements the Read/Write Split Router.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.
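    A sketch of the command; the password and server names are placeholder assumptions:

    ```shell
    # create a query-based read/write split service
    maxctrl create service query_router_service readwritesplit \
       user=MAXSCALE_USER password=MAXSCALE_PASS \
       --servers mcs1 mcs2 mcs3
    ```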

    Configure a Listener for the Read/Write Split Router

    These instructions reference TCP port 3307. You can use a different TCP port. The TCP port used must not be bound by any other listener.

    On the MaxScale node, use the maxctrl create listener command to configure MaxScale to use a listener for the Read/Write Split Router:

    In this example:

    • query_router_service is the name of the readwritesplit service that was previously created.

    • query_router_listener is an arbitrary name that is used to identify the new listener.

    • 3307 is the TCP port.

    • protocol=MariaDBClient sets the protocol used by the listener.
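    A sketch of the command, assuming the readwritesplit service created in the previous step:

    ```shell
    # attach a listener on TCP port 3307 to the read/write split service
    maxctrl create listener query_router_service query_router_listener 3307 \
       protocol=MariaDBClient
    ```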

    Start Services

    To start the services and monitors, on the MaxScale node use maxctrl start services:
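    For example:

    ```shell
    # start all configured MaxScale services
    maxctrl start services
    ```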

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 7 of 9.

    Backup and Restore with Shared Local Storage

    Overview

    MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses shared local storage for the DB Root directories, the DB Root directories and the MariaDB data directory must be backed up separately.

    Recovery Planning

    MariaDB Enterprise ColumnStore supports multiple storage topologies.

    This page discusses how to back up and restore Enterprise ColumnStore when it uses shared local storage (such as NFS) for the DB Root directories.

    Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata become corrupt, Enterprise ColumnStore could become unusable, resulting in data loss.

    If Enterprise ColumnStore is your system of record, it should be backed up regularly.

    If Enterprise ColumnStore uses shared local storage for the DB Root directories, the following items must be backed up:

    • The MariaDB data directory must be backed up using MariaDB Enterprise Backup.

    • The Storage Manager directory must be backed up.

    • Each DB Root directory must be backed up.

    See the instructions below for more details.

    Backup

    Use the following process to take a backup:

    1. Determine which node is the primary server using curl to send the status command to the CMAPI Server:

    The output will show dbrm_mode: master for the primary server:

    2. Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:

    3. Lock the database with the FLUSH TABLES WITH READ LOCK statement:

    Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.

    4. Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.

    For example, to make a copy of the directory with rsync:

    5. Make a copy or snapshot of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where the N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment.

    For example, to make a copy of the directories with rsync in a 3-node deployment:
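    A sketch for a 3-node deployment; the backup destination path is a placeholder:

    ```shell
    # copy each DB Root directory to a backup location
    for n in 1 2 3; do
       sudo rsync -a /var/lib/columnstore/data${n}/ /backups/columnstore/data${n}/
    done
    ```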

    6. Use MariaDB Enterprise Backup to back up the MariaDB data directory:

    7. Use MariaDB Enterprise Backup to prepare the backup:

    8. Ensure that all previous operations are complete.

    9. In the original client connection to the primary server, unlock the database with the UNLOCK TABLES statement:

    Restore

    Use the following process to restore a backup:

    1. Re-deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.

    2. Ensure that all services are stopped on each node:

    3. Restore the backup of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.

    For example, to restore the backup with rsync:

    4. Restore the backup of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where the N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment.

    For example, to restore the backup with rsync in a 3-node deployment:
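    A sketch for a 3-node deployment; the backup source path is a placeholder, and the ownership fix assumes the default mysql service account:

    ```shell
    # restore each DB Root directory from the backup copies
    for n in 1 2 3; do
       sudo rsync -a /backups/columnstore/data${n}/ /var/lib/columnstore/data${n}/
       sudo chown -R mysql:mysql /var/lib/columnstore/data${n}
    done
    ```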

    5. Use MariaDB Enterprise Backup to restore the backup of the MariaDB data directory:

    6. Start the services on each node:

    Step 3: Start and Configure Enterprise ColumnStore


    Overview

    This page details step 3 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object Storage.

    This step starts and configures MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.


    The instructions were tested against ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Configure Enterprise ColumnStore

    Mandatory system variables and options for Single-Node Enterprise ColumnStore include:


    Example Configuration

    Configure the S3 Storage Manager

    Configure Enterprise ColumnStore S3 Storage Manager to use S3-compatible storage by editing the /etc/columnstore/storagemanager.cnf configuration file:

    The S3-compatible object storage options are configured under [S3]:

    • The bucket option must be set to the name of the bucket that you created in "Create an S3 Bucket".

    • The endpoint option must be set to the endpoint for the S3-compatible object storage.

    • The aws_access_key_id and aws_secret_access_key options must be set to the access key ID and secret access key for the S3-compatible object storage.

    The local cache options are configured under [Cache]:

    • The cache_size option is set to 2 GB by default.

    • The path option is set to /var/lib/columnstore/storagemanager/cache by default.

    Ensure that the specified path has sufficient storage space for the specified cache size.

    Start the Enterprise ColumnStore Services

    Start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:

    Start and enable the MariaDB Enterprise ColumnStore service, so that it starts automatically upon reboot:
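    The two steps can be sketched as follows, assuming the default systemd unit names:

    ```shell
    # start the Enterprise Server service and enable it at boot
    sudo systemctl enable --now mariadb

    # start the Enterprise ColumnStore service and enable it at boot
    sudo systemctl enable --now mariadb-columnstore
    ```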

    Create the Utility User

    Enterprise ColumnStore requires a mandatory utility user account to perform cross-engine joins and similar operations.

    1. Create the user account with the CREATE USER statement:

    2. Grant the user account SELECT privileges on all databases with the GRANT statement:
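    A sketch of both statements; the account name, host, and password are placeholders:

    ```sql
    -- create the utility user and grant read access to all databases
    CREATE USER 'util_user'@'127.0.0.1' IDENTIFIED BY 'util_user_passwd';
    GRANT SELECT ON *.* TO 'util_user'@'127.0.0.1';
    ```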

    3. Configure Enterprise ColumnStore to use the utility user:

    4. Set the password:
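    Both settings can be sketched with mcsSetConfig; the user name and password are placeholders matching the account created above:

    ```shell
    # point ColumnStore's cross-engine support at the utility user
    sudo mcsSetConfig CrossEngineSupport User util_user
    sudo mcsSetConfig CrossEngineSupport Password util_user_passwd
    ```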

    For details about how to encrypt the password, see "".

    Passwords should meet your organization's password policies. If your MariaDB Enterprise Server instance has a password validation plugin installed, then the password should also meet the configured requirements.

    Configure Linux Security Modules (LSM)

    The specific steps to configure the security module depend on the operating system.

    Configure SELinux (CentOS, RHEL)

    Configure SELinux for Enterprise ColumnStore:

    1. To configure SELinux, you have to install the packages required for audit2allow. On CentOS 7 and RHEL 7, install the following:

    On RHEL 8, install the following:

    2. Allow the system to run under load for a while to generate SELinux audit events.

    3. After the system has taken some load, generate an SELinux policy from the audit events using audit2allow:

    If no audit events were found, this will print the following:

    4. If audit events were found, the new SELinux policy can be loaded using semodule:
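    A sketch of the generate-and-load steps; the module name mariadb_local is arbitrary, and the audit log path is the RHEL default:

    ```shell
    # build a policy module from mysqld-related audit events
    sudo grep mysqld /var/log/audit/audit.log | sudo audit2allow -M mariadb_local

    # load the generated module
    sudo semodule -i mariadb_local.pp
    ```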

    5. Set SELinux to enforcing mode by setting SELINUX=enforcing in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    6. Set SELinux to enforcing mode at runtime:

    Configure AppArmor (Ubuntu)

    For information on how to create a profile, see the AppArmor documentation on ubuntu.com.

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

    This page was step 3 of 5.

    Removing a Node

    To remove a node from Enterprise ColumnStore, perform the following procedure.

    Unlinking from Service in MaxScale

    The server object for the node must be unlinked from the service using MaxCtrl:

    • Unlink the server object from the service using the unlink service command.

    • As the first argument, provide the name of the service.

    • As the second argument, provide the name of the server.
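    A sketch of the command; the service and server names are examples matching a typical deployment:

    ```shell
    # detach the departing node's server object from the service
    maxctrl unlink service query_router_service mcs3
    ```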

    Checking the Service in MaxScale

    To confirm that the server object was properly unlinked from the service, the service should be checked using MaxCtrl:

    • Show the services using the show services command, like this:

    Unlinking from Monitor in MaxScale

    The server object for the node must be unlinked from the monitor using MaxCtrl:

    • Unlink a server object from the monitor using the unlink monitor command.

    • As the first argument, provide the name of the monitor.

    • As the second argument, provide the name of the server.
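    A sketch of the command; the monitor and server names are examples:

    ```shell
    # detach the departing node's server object from the monitor
    maxctrl unlink monitor columnstore_monitor mcs3
    ```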

    Checking the Monitor in MaxScale

    To confirm that the server object was properly unlinked from the monitor, the monitor should be checked using MaxCtrl:

    • Show the monitors using the show monitors command, like this:

    Removing the Server from MaxScale

    The server object for the node must also be removed from MaxScale using MaxCtrl:

    • Use MaxCtrl or another supported REST client.

    • Remove the server object using the destroy server command.

    • As the first argument, provide the name for the server.

    For example:
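    A sketch; the server name is an example:

    ```shell
    # remove the departing node's server object from MaxScale
    maxctrl destroy server mcs3
    ```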

    Checking the Server in MaxScale

    To confirm that the server object was properly removed, the server objects should be checked using MaxCtrl:

    • Show the server objects using the show servers command, like this:

    Stopping the Enterprise ColumnStore Services

    The Enterprise Server, Enterprise ColumnStore, and CMAPI services can be stopped using the systemctl command.

    Perform the following procedure on the node:

    1. Stop the MariaDB Enterprise Server service:

    2. Stop the MariaDB Enterprise ColumnStore service:

    3. Stop the CMAPI service:
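    The three steps can be sketched as follows, assuming the default systemd unit names:

    ```shell
    # stop the Enterprise Server, ColumnStore, and CMAPI services on the node
    sudo systemctl stop mariadb
    sudo systemctl stop mariadb-columnstore
    sudo systemctl stop mariadb-columnstore-cmapi
    ```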

    Removing the Node from Enterprise ColumnStore

    The node must be removed from Enterprise ColumnStore using the CMAPI Server:

    • Remove the node using the CMAPI node removal endpoint path.

    • Use a REST client, such as curl.

    • Format the JSON output using jq for enhanced readability.

    For example, if the primary node's host name is mcs1 and the IP address for the node to remove is 192.0.2.3:

    • In ES 10.5.10-7 and later:

    • In ES 10.5.9-6 and earlier:

    Example output:

    Checking the Enterprise ColumnStore Status

    To confirm that the node was properly removed, the status of Enterprise ColumnStore should be checked using the CMAPI Server:

    • Check the status using the status endpoint path.

    For example, if the primary node's host name is mcs1:
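    A sketch of the status check; port 8640 and the API key are placeholder assumptions for your CMAPI deployment:

    ```shell
    # query cluster status and format the JSON output with jq
    curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:API_KEY' | jq .
    ```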

    Example output:

    Data Import

    Learn how to import data into MariaDB ColumnStore. This section covers various methods and tools for efficiently loading large datasets into your columnar database for analytical workloads.

    Overview

    MariaDB Enterprise ColumnStore performs bulk data loads very efficiently using a variety of mechanisms, including the cpimport tool, specialized handling of certain SQL statements, and minimal locking during data import.

    cpimport

    MariaDB Enterprise ColumnStore includes a bulk data loading tool called cpimport, which provides several benefits:

    • Bypasses the SQL layer to decrease overhead

    • Does not block read queries

    • Requires a write metadata lock on the table, which can be monitored with the METADATA_LOCK_INFO plugin

    • Appends the new data to the table. While the bulk load is in progress, the newly appended data is temporarily hidden from queries. After the bulk load is complete, the newly appended data is visible to queries.
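    A sketch of a cpimport invocation; the database, table, and file names are placeholders:

    ```shell
    # bulk-load a comma-delimited file into an existing ColumnStore table
    # (-s sets the field delimiter used in the input file)
    cpimport -s ',' mydb mytable /tmp/mytable.csv
    ```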

    Batch Insert Mode

    MariaDB Enterprise ColumnStore enables batch insert mode by default.

    When batch insert mode is enabled, MariaDB Enterprise ColumnStore has special handling for the following statements:

    • LOAD DATA [ LOCAL ] INFILE

    • INSERT INTO .. SELECT FROM ..

    Enterprise ColumnStore uses the following rules:

    • If the statement is executed outside of a transaction, Enterprise ColumnStore loads the data using cpimport, which is a command-line utility that is designed to efficiently load data in bulk. It executes cpimport using a wrapper called cpimport.bin.

    • If the statement is executed inside of a transaction, Enterprise ColumnStore loads the data using the DML interface, which is slower.

    Batch insert mode can be disabled by setting the columnstore_use_import_for_batchinsert system variable to OFF. When batch insert mode is disabled, Enterprise ColumnStore executes the statements using the DML interface, which is slower.
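    For example, to disable batch insert mode for the current session:

    ```sql
    -- route these statements through the slower DML interface instead of cpimport
    SET SESSION columnstore_use_import_for_batchinsert = OFF;
    ```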

    Locking

    MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.

    When a bulk data load is running:

    • Read queries will not be blocked.

    • Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.

    • The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.

    Choose a Data Load Method

    Performance
    Method
    Interface
    Format(s)
    Location(s)
    Benefits

    Data Loading with INSERT .. SELECT

    Overview

    MariaDB Enterprise ColumnStore automatically translates INSERT INTO .. SELECT FROM .. statements into bulk data loads. By default, it translates the statement into a bulk data load that uses cpimport.bin, which is an internal wrapper around the cpimport tool.

    Intended Use Cases

    You can load data using INSERT INTO .. SELECT FROM .. in the following cases:

    • You are loading data into a ColumnStore table by querying one or more local tables.

    Batch Insert Mode

    MariaDB Enterprise ColumnStore enables batch insert mode by default.

    When batch insert mode is enabled, MariaDB Enterprise ColumnStore has special handling for INSERT INTO .. SELECT FROM .. statements.

    Enterprise ColumnStore uses the following rules:

    • If the statement is executed outside of a transaction, Enterprise ColumnStore loads the data using cpimport, which is a command-line utility that is designed to efficiently load data in bulk. Enterprise ColumnStore executes cpimport using a wrapper called cpimport.bin.

    • If the statement is executed inside of a transaction, Enterprise ColumnStore loads the data using the DML interface, which is slower.

    Batch insert mode can be disabled by setting the columnstore_use_import_for_batchinsert system variable to OFF. When batch insert mode is disabled, Enterprise ColumnStore executes the statements using the DML interface, which is slower.

    Locking

    MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.

    When a bulk data load is running:

    • Read queries will not be blocked.

    • Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.

    • The write metadata lock (MDL) can be monitored with the METADATA_LOCK_INFO plugin.

    Importing the Schema

    Before data can be imported into the tables, the schema must be created.

    1. Connect to the primary server using MariaDB Client:

    After the command is executed, it will prompt you for a password.

    2. For each database that you are importing, create the database with the CREATE DATABASE statement:

    3. For each table that you are importing, create the table with the CREATE TABLE statement:
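    An illustrative schema; the database, table, and column names are placeholders:

    ```sql
    CREATE DATABASE IF NOT EXISTS analytics;

    CREATE TABLE analytics.orders_cs (
        order_id   BIGINT,
        order_date DATE,
        amount     DECIMAL(10,2)
    ) ENGINE=ColumnStore;
    ```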


    To get the best performance from Enterprise ColumnStore, make sure to follow Enterprise ColumnStore's best practices for schema design.

    Appending Data

    When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read. Appending data reduces the I/O requirements of bulk data loads, so that larger data sets can be loaded very efficiently.

    While the bulk load is in progress, the newly appended data is temporarily hidden from queries.

    After the bulk load is complete, the newly appended data is visible to queries.

    Sorting the Query Results

    When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read.

    The order of data can have a significant effect on performance with Enterprise ColumnStore. If your data is not already sorted, it can be helpful to sort the query results using an ORDER BY clause.

    For example:
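    A sketch of a sorted load; the table and column names are placeholders, with order_date standing in for the column most queries filter on:

    ```sql
    INSERT INTO analytics.orders_cs
    SELECT * FROM oltp.orders
    ORDER BY order_date;
    ```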

    For additional information, see "".

    Confirming the Field Delimiter

    Before importing a table's data into MariaDB Enterprise ColumnStore, confirm that the field delimiter is not present in the data.

    The field delimiter is determined by the columnstore_import_for_batchinsert_delimiter system variable. By default, Enterprise ColumnStore sets the field delimiter to the ASCII value 7, which corresponds to the BEL character.

    To use a different delimiter, you can set the field delimiter.

    Setting the Field Delimiter

    When the data is passed to cpimport, each value is separated by a field delimiter. The field delimiter is determined by the columnstore_import_for_batchinsert_delimiter system variable.

    By default, Enterprise ColumnStore sets the field delimiter to the ASCII value 7, which corresponds to the BEL character. In general, setting the field delimiter is only required if your data contains this value.

    Set the field delimiter by setting the columnstore_import_for_batchinsert_delimiter system variable to the ASCII value for the desired delimiter character.

    For example, if you want to use a comma (,) as the field delimiter, you would set columnstore_import_for_batchinsert_delimiter to 44:
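    For example, in the current session:

    ```sql
    -- 44 is the ASCII code for the comma character
    SET SESSION columnstore_import_for_batchinsert_delimiter = 44;
    ```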

    Setting the Quoting Style

    When the data is passed to cpimport, each value is enclosed by a quote character. The quote character is determined by the columnstore_import_for_batchinsert_enclosed_by system variable.

    By default, Enterprise ColumnStore sets the quote character to the ASCII value 17, which corresponds to the DC1 character. In general, setting the quote character is only required if your data contains this value.

    Set the quote character by setting the columnstore_import_for_batchinsert_enclosed_by system variable to the ASCII value for the desired quote character.

    For example, if you want to use a double quote (") as the quote character, you would set columnstore_import_for_batchinsert_enclosed_by to 34:
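For instance, in the session that performs the batch insert (a minimal sketch):

```sql
-- ASCII 34 is the double-quote character
SET SESSION columnstore_import_for_batchinsert_enclosed_by = 34;
```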

    The instructions were tested against ColumnStore 23.10.


    Step 1: Prepare ColumnStore Nodes

    hashtag
    Overview

    This page details step 1 of the 9-step deployment procedure.

    This step prepares systems to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    ColumnStore Partition Management

    Introduction

    MariaDB ColumnStore automatically creates logical horizontal partitions across every column. For ordered or semi-ordered fields, such as an order date, this results in a highly effective partitioning scheme on that column: queries that filter on the column gain performance because partition elimination can be performed. Partitioning also supports data lifecycle management, since data can be disabled or dropped by partition cheaply. Use caution when disabling or dropping partitions, as these commands are destructive.

    It is important to understand that a partition in ColumnStore terms is actually 2 extents (16 million rows), and that extents and partitions are created according to the following algorithm in 1.0.x:
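To see how rows map onto partitions for a particular column, the calShowPartitions information function can be queried (shown here with hypothetical schema, table, and column names):

```sql
-- List the partitions holding data for the order_date column
SELECT calShowPartitions('mydb', 'orders', 'order_date');
```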

    ColumnStore and Recursive CTE Limitations

    The ColumnStore engine does not fully support recursive Common Table Expressions (CTEs). Attempting to use recursive CTEs directly against ColumnStore tables typically results in an error.

    The purpose of the following examples is to demonstrate three potential workarounds for this issue. The best fit for your organization will depend on your specific needs and ability to refactor queries and adjust your approach.

    Setup: Simulating an Org Chart

    This setup simulates a simple organizational chart with employees and managers to illustrate the problem and the workarounds.

    First, an InnoDB table for comparison:
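A minimal sketch of such a table (schema and sample rows are illustrative, not the original example):

```sql
-- Each employee references a manager; the root of the chart has no manager
CREATE TABLE employees_innodb (
    id         INT PRIMARY KEY,
    name       VARCHAR(50) NOT NULL,
    manager_id INT NULL
) ENGINE=InnoDB;

INSERT INTO employees_innodb VALUES
    (1, 'CEO',      NULL),
    (2, 'VP Sales', 1),
    (3, 'Rep',      2);
```

A recursive CTE would walk the manager_id chain from the CEO downward; against this InnoDB table it runs normally, while the same query against a ColumnStore copy of the table typically fails with an error.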

    Query Accelerator

    MariaDB Query Accelerator is an Alpha release; do not use it in production environments. Query Accelerator requires ColumnStore 25.10.0 and MariaDB Enterprise Server 11.8.3 or later.

    What is Query Accelerator

    Query Accelerator allows MariaDB to use ColumnStore to execute queries that would otherwise be executed by InnoDB. Under the hood, ColumnStore:

    sudo systemctl start mariadb
    sudo systemctl enable mariadb
    sudo systemctl start mariadb-columnstore-cmapi
    sudo systemctl enable mariadb-columnstore-cmapi
    sudo mcs cluster set api-key --key <your-api-key-here>
    sudo mcs node add --node <private-ip-of-rw-node>
    sudo mcs node add --read-replica --node <private-ip-of-replica>
    sudo mcs cluster status
    sudo mcs node add --read-replica --node <private-ip>
    sudo mcs node remove --node <private-ip>
    sudo mcs cluster status
    wget https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    chmod +x mariadb_es_repo_setup
    sudo ./mariadb_es_repo_setup --token="xxxxx" --apply --mariadb-server-version="11.4"
    sudo dnf install -y \
    MariaDB-server MariaDB-columnstore-engine MariaDB-columnstore-cmapi
    sudo apt update
    sudo apt install -y mariadb-server mariadb-plugin-columnstore mariadb-columnstore-cmapi
    $ sudo yum install --enablerepo=PowerTools glusterfs-server
    $ sudo yum install centos-release-gluster
    $ sudo yum install glusterfs-server
    $ wget -O - https://download.gluster.org/pub/gluster/glusterfs/LATEST/rsa.pub | sudo apt-key add -
    
    $ DEBID=$(grep 'VERSION_ID=' /etc/os-release | cut -d '=' -f 2 | tr -d '"')
    $ DEBVER=$(grep 'VERSION=' /etc/os-release | grep -Eo '[a-z]+')
    $ DEBARCH=$(dpkg --print-architecture)
    $ echo deb https://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/${DEBID}/${DEBARCH}/apt ${DEBVER} main | sudo tee /etc/apt/sources.list.d/gluster.list
    $ sudo apt update
    $ sudo apt install glusterfs-server
    $ sudo apt update
    $ sudo apt install glusterfs-server
    $ sudo systemctl start glusterd
    $ sudo systemctl enable glusterd
    $ sudo gluster peer probe mcs2
    $ sudo gluster peer probe mcs3
    $ sudo gluster peer probe mcs1
    peer probe: Host mcs1 port 24007 already in peer list
    $ sudo gluster peer status
    Number of Peers: 2
    
    Hostname: mcs2
    Uuid: 3c8a5c79-22de-45df-9034-8ae624b7b23e
    State: Peer in Cluster (Connected)
    
    Hostname: mcs3
    Uuid: 862af7b2-bb5e-4b1c-8311-630fa32ed451
    State: Peer in Cluster (Connected)
    $ sudo mkdir -p /brick/storagemanager
    $ sudo gluster volume create storagemanager \
          replica 3 \
          mcs1:/brick/storagemanager \
          mcs2:/brick/storagemanager \
          mcs3:/brick/storagemanager \
          force
    $ sudo gluster volume start storagemanager
    $ sudo mkdir -p /var/lib/columnstore/storagemanager
    127.0.0.1:storagemanager /var/lib/columnstore/storagemanager glusterfs defaults,_netdev 0 0
    $ sudo mount -a
    # minimize swapping
    vm.swappiness = 1
    
    # Increase the TCP max buffer size
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    
    # Increase the TCP buffer limits
    # min, default, and max number of bytes to use
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    
    # don't cache ssthresh from previous connection
    net.ipv4.tcp_no_metrics_save = 1
    
    # for 1 GigE, increase this to 2500
    # for 10 GigE, increase this to 30000
    net.core.netdev_max_backlog = 2500
    $ sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
    $ sudo setenforce permissive
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=permissive
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    sudo getenforce
    Permissive
    $ sudo systemctl disable apparmor
    $ sudo aa-status
    apparmor module is loaded.
    0 profiles are loaded.
    0 profiles are in enforce mode.
    0 profiles are in complain mode.
    0 processes have profiles defined.
    0 processes are in enforce mode.
    0 processes are in complain mode.
    0 processes are unconfined but have a profile defined.
    $ sudo yum install glibc-locale-source glibc-langpack-en
    $ sudo localedef -i en_US -f UTF-8 en_US.UTF-8
    vm.overcommit_memory=1 
    vm.dirty_background_ratio=5 
    vm.dirty_ratio=10 
    vm.vfs_cache_pressure=50 
    net.core.netdev_max_backlog=2500 
    net.core.rmem_max=16777216 
    net.core.wmem_max=16777216 
    net.ipv4.tcp_max_syn_backlog=8192 
    net.ipv4.tcp_timestamps=0
    sudo sysctl -p
    cat /proc/sys/kernel/threads-max
    cat /proc/sys/kernel/pid_max
    cat /proc/sys/vm/max_map_count
    
    
    # RHEL: append to /etc/sysctl.conf
    # Pipe through "sudo tee" so the file write runs as root;
    # "sudo echo ... >> file" fails because the shell, not sudo, performs the redirect.
    echo "vm.max_map_count=4262144" | sudo tee -a /etc/sysctl.conf
    echo "kernel.pid_max = 4194304" | sudo tee -a /etc/sysctl.conf
    echo "kernel.threads-max = 2000000" | sudo tee -a /etc/sysctl.conf
    
    # There may be a file called 50-pid-max.conf or something similar. If so, modify it.
    echo "vm.max_map_count=4262144" | sudo tee /usr/lib/sysctl.d/50-max_map_count.conf
    echo "kernel.pid_max = 4194304" | sudo tee /usr/lib/sysctl.d/50-pid-max.conf
    sudo sysctl -p