
Managing ColumnStore

Managing MariaDB ColumnStore involves setup, configuration, and tools like mcsadmin and cpimport for efficient analytics.

ColumnStore Minimum Hardware Specification

The following table outlines the minimum recommended production server specifications, which can be followed for both on-premises and cloud deployments:

Per Server

Item              Development Environment     Production Environment
---------------   -------------------------   ---------------------------
Physical Server   8 Core CPU, 32 GB Memory    64 Core CPU, 128 GB Memory
Storage           Local disk                  StorageManager (S3)
Network           Network Interconnect        Network Interconnect

In a multi-server deployment, data is passed between nodes via TCP/IP networking. At least a 1G network is recommended.

Details

These are minimum recommendations, and in general the system will perform better with more hardware:

  • More CPU cores and servers will improve query processing response time.

  • More memory allows the system to cache more data blocks in memory. Users run systems with anywhere from 64 GB to 512 GB of RAM for UM nodes and 32 to 64 GB of RAM for PM nodes.

  • A faster network allows data to flow faster between UM and PM nodes.

  • SSDs may be used; however, the system is optimized for block streaming, so HDDs may perform well enough at lower cost.

  • Where it is an option, bare metal servers are recommended for additional performance, since ColumnStore will fully consume CPU cores and memory.

  • In general, a higher core count / higher memory server makes more sense for single-server or two-server combined deployments.

  • In a deployment with multiple UM nodes, the system round-robins requests from the mysqld handling the query to any ExeMgr in the cluster for load balancing. A higher bandwidth network such as 10G or 40G will benefit large result set queries.

AWS instance sizes

For AWS, our own internal testing generally uses m4.4xlarge instance types as a cost-effective middle ground. The r4.8xlarge has also been tested and performs about twice as fast for about twice the price.

This page is licensed: CC BY-SA / Gnu FDL

Managing ColumnStore System

Managing ColumnStore Database Environment

Managing MariaDB ColumnStore means deploying its architecture, scaling modules, and maintaining performance through monitoring, optimization, and backups.

Release notes and other documentation for ColumnStore are also available in the Enterprise docs section of the MariaDB website. For example:

  • ColumnStore 23.10 Release Notes

  • ColumnStore 23.02 Release Notes

  • ColumnStore 22.08 Release Notes

  • ColumnStore 6 Release Notes

  • ColumnStore 5.6 Release Notes

  • Deployment Instructions

Backup & Restore

MariaDB ColumnStore backup and restore manage distributed data using snapshots or tools like mariadb-backup, with restoration ensuring cluster sync via cpimport or file system recovery.

ColumnStore Audit Plugin

  1. Introduction

  2. Installation

  3. Enabling the audit plugin

Introduction

MariaDB Server includes an optional Audit Plugin that enables logging and tracking of all user access and statements. This plugin is included with ColumnStore and can be enabled for it.

Installation

To enable the audit plugin for the currently running instance (but not across restarts), run the following in mcsmysql as the default root account:

INSTALL PLUGIN server_audit 
SONAME 'server_audit.so';

To have this persist across restarts, edit the ColumnStore my.cnf file (example shown for root install):

$ vi /usr/local/mariadb/columnstore/mysql/my.cnf
[mysqld]
... 
plugin_load=server_audit=server_audit.so

For more details see the audit plugin installation guide.

Enabling the audit plugin

To enable audit logging the following global variable must be set to ON:

SET GLOBAL server_audit_logging=ON;

To ensure this persists across restarts, edit the ColumnStore my.cnf file (example shown for root install):

$ vi /usr/local/mariadb/columnstore/mysql/my.cnf
[server]
... 
server_audit_logging=ON

This will enable logging to the file /usr/local/mariadb/columnstore/mysql/db/server_audit.log. For example:

20170914 17:31:24,centos,root,localhost,11,114,QUERY,loans,'SELECT grade, AVG(loan_amnt) avg FROM loanstats GROUP BY grade ORDER BY grade',0

To have the log entries written to syslog, the global variable server_audit_output_type should be changed from 'file' to 'syslog'. In this case, the 'syslog_info' entry contains the ColumnStore server instance name, for example:

Sep 14 17:46:51 centos mysql-server_auditing: columnstore-1 centos,root,localhost,11,117,QUERY,loans,'SELECT grade, AVG(loan_amnt) avg FROM loanstats GROUP BY grade ORDER BY grade',0
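
The output type can also be switched at runtime rather than by editing my.cnf; a minimal sketch using the standard server_audit plugin variables:

SET GLOBAL server_audit_output_type='syslog';  -- switch from 'file' to 'syslog' for the running instance
SHOW GLOBAL VARIABLES LIKE 'server_audit_output_type';  -- verify the change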

For additional configuration and customization options, see the Audit Plugin documentation.

This page is licensed: CC BY-SA / Gnu FDL

ColumnStore Configuration File Update and Distribution

In the case where an entry in MariaDB ColumnStore's configuration file needs to be updated and distributed, this can be done from the command line on Performance Module #1. All changes to MariaDB ColumnStore's configuration file must be applied on PM1.

NOTE: 'mcsadmin distributeconfigfile' only needs to be run if the system is Active. If the system is down, then just make the change and when the system is started, the update will get distributed.

Here is an example:

/usr/local/mariadb/columnstore/bin/configxml.sh setconfig SystemConfig SystemName mcs-1
mcsadmin distributeconfigfile

In cases where MariaDB ColumnStore's configuration file gets out of sync with the PM1 copy, run the following to redistribute MariaDB ColumnStore's configuration file to the other nodes.

To do this run the following on PM1:

mcsadmin distributeconfigfile

This page is licensed: CC BY-SA / Gnu FDL

Extent Map Backup & Recovery

MariaDB ColumnStore utilizes an Extent Map to manage data distribution across extents: logical blocks within physical segment files, ranging from 8 to 64 MB. Each extent holds a consistent number of rows, with the Extent Map cataloging these extents, their corresponding block identifiers (LBIDs), and the minimum and maximum values of each column's data within the extent.

The primary node maintains the master copy of the Extent Map. Upon system startup, this map is loaded into memory and propagated to other nodes for redundancy and quick access. Corruption of the master Extent Map can render the system unusable and lead to data loss.

Purpose

ColumnStore's extent map is a smart structure that underpins its performance. By providing a logical partitioning scheme, it avoids the overhead associated with indexing and other common row-based database optimizations.

The primary node in a ColumnStore cluster holds the master copy of the extent map. Upon system startup, this master copy is read into memory and then replicated to all other participating nodes for high availability and disaster recovery. Nodes keep the extent map in memory for rapid access during query processing. As data within extents is modified, these updates are broadcast to all participating nodes to maintain consistency.

If the master copy of the extent map becomes corrupted, the entire system could become unusable, potentially leading to data loss. Having a recent backup of the extent map allows for a much faster recovery compared to reloading the entire database in such a scenario.

Backup Procedure

Note: MariaDB recommends implementing regular backups to ensure data integrity and recovery. A common default is to back up every 3 hours and retain backups for at least 10 days.

To safeguard against potential Extent Map corruption, regularly back up the master copy:

  1. Lock Tables:

mariadb -e "FLUSH TABLES WITH READ LOCK;"

  2. Save BRM:

save_brm

  3. Create Backup Directory:

mkdir -p /extent_map_backup

  4. Copy Extent Map:

cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /extent_map_backup

  5. Unlock Tables:

mariadb -e "UNLOCK TABLES;"

Recovery Procedures

Single-Node System

  1. Stop ColumnStore:

systemctl stop mariadb-columnstore

  2. Rename Corrupted Map:

mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad

  3. Clear Versioning Files:

> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss

  4. Restore Backup:

cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/

  5. Set Ownership:

chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm/

  6. Start ColumnStore:

systemctl start mariadb-columnstore

Clustered System

  1. Shutdown Cluster:

curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/shutdown \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:your_api_key' \
   --data '{"timeout":60}' -k

  2. Rename Corrupted Map:

mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad

  3. Clear Versioning Files:

> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss

  4. Restore Backup:

cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/

  5. Set Ownership:

chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm

  6. Start Cluster:

curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/start \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:your_api_key' \
   --data '{"timeout":60}' -k

Automation Recommendation

Incorporate the save_brm command into your data import scripts (e.g., those using cpimport) to automate Extent Map backups. This practice ensures regular backups without manual intervention. Refer to the MariaDB ColumnStore Backup Script for an example implementation.
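
As an illustration, a minimal import wrapper might look like the following sketch; the database, table, and input file names are placeholders, not part of any shipped script:

#!/bin/bash
# Hypothetical import wrapper: bulk load, then snapshot the Extent Map.
set -e
cpimport mydb mytable /data/mytable.csv    # bulk load (placeholder names)
save_brm                                   # flush BRM/Extent Map state to disk
mkdir -p /extent_map_backup
cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /extent_map_backup/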

This page is licensed: CC BY-SA / Gnu FDL

mariadb -e "FLUSH TABLES WITH READ LOCK;"
save_brm
mkdir -p /extent_map_backup
cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /extent_map_backup
mariadb -e "UNLOCK TABLES;"
systemctl stop mariadb-columnstore
mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm/
systemctl start mariadb-columnstore
curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/shutdown \ --header 'Content-Type:application/json' \ --header 'x-api-key:your_api_key' \ --data '{"timeout":60}' -k
mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
mv cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm
curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/start \ --header 'Content-Type:application/json' \ --header 'x-api-key:your_api_key' \ --data '{"timeout":60}' -k

Backup and Restore Overview

Overview

MariaDB Enterprise ColumnStore supports backup and restore.

System of Record

Before you determine a backup strategy for your Enterprise ColumnStore deployment, it is a good idea to determine the system of record for your Enterprise ColumnStore data.

A system of record is the authoritative data source for a given piece of information. Organizations often store duplicate information in several systems, but only a single system can be the authoritative data source.

Enterprise ColumnStore is designed to handle analytical processing for OLAP, data warehousing, DSS, and hybrid workloads on very large data sets. Analytical processing does not generally happen on the system of record. Instead, analytical processing generally occurs on a specialized database that is loaded with data from the separate system of record. Additionally, very large data sets can be difficult to back up. Therefore, it may be beneficial to only backup the system of record.

If Enterprise ColumnStore is not acting as the system of record for your data, you should determine how the system of record affects your backup plan:

  • If your system of record is another database server, you should ensure that the other database server is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the other database server.

  • If your system of record is a set of data files, you should ensure that the set of data files is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the set of data files.

Full Backup and Restore

MariaDB Enterprise ColumnStore supports full backup and restore for all storage types. A full backup includes:

  • Enterprise ColumnStore's data and metadata

    • With S3: an S3 snapshot of the S3-compatible object storage and a file system snapshot or copy of the Storage Manager directory.

    • Without S3: a file system snapshot or copy of the DB Root directories.

  • The MariaDB data directory from the primary node

To see the procedure to perform a full backup and restore, choose the storage type:

  • Enterprise ColumnStore with Object Storage

  • Enterprise ColumnStore with Shared Local Storage

This page is: Copyright © 2025 MariaDB. All rights reserved.

Backup and Restore with Shared Local Storage

Overview

MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses shared local storage for the DB Root directories, the DB Root directories and the MariaDB data directory must be backed up separately.

Recovery Planning

MariaDB Enterprise ColumnStore supports multiple storage options.

This page discusses how to backup and restore Enterprise ColumnStore when it uses shared local storage (such as NFS) for the DB Root directories.

Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata were to become corrupt, Enterprise ColumnStore could become unusable, and data loss could occur.

If Enterprise ColumnStore is your system of record, it should be backed up regularly.

If Enterprise ColumnStore uses shared local storage for the DB Root directories, the following items must be backed up:

  • The MariaDB Data directory is backed up using MariaDB Backup

  • The Storage Manager directory must be backed up

  • Each DB Root directories must be backed up

See the instructions below for more details.

Backup

Use the following process to take a backup:

  1. Determine which node is the primary server using curl to send the status command to the CMAPI Server:

$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
   | jq .

The output will show "dbrm_mode": "master" for the primary server:

{
  "timestamp": "2020-12-15 00:40:34.353574",
  "192.0.2.1": {
    "timestamp": "2020-12-15 00:40:34.362374",
    "uptime": 11467,
    "dbrm_mode": "master",
    "cluster_mode": "readwrite",
    "dbroots": [
      "1"
    ],
    "module_id": 1,
    "services": [
      {
        "name": "workernode",
        "pid": 19202
      },
      {
        "name": "controllernode",
        "pid": 19232
      },
      {
        "name": "PrimProc",
        "pid": 19254
      },
      {
        "name": "ExeMgr",
        "pid": 19292
      },
      {
        "name": "WriteEngine",
        "pid": 19316
      },
      {
        "name": "DMLProc",
        "pid": 19332
      },
      {
        "name": "DDLProc",
        "pid": 19366
      }
    ]
  }
}
  2. Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:

$ mariadb --host=192.0.2.1 \
   --user=root \
   --password
  3. Lock the database with the FLUSH TABLES WITH READ LOCK statement:

FLUSH TABLES WITH READ LOCK;

Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.

  4. Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.

For example, to make a copy of the directory with rsync:

$ sudo mkdir -p /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/storagemanager /backups/columnstore/202101291600/
  5. Make a copy or snapshot of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where the N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment.

For example, to make a copy of the directories with rsync in a 3-node deployment:

$ sudo rsync -av /var/lib/columnstore/data1 /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/data2 /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/data3 /backups/columnstore/202101291600/
  6. Use MariaDB Backup to back up the MariaDB data directory:

$ sudo mkdir -p /backups/mariadb/202101291600/
$ sudo mariadb-backup --backup \
   --target-dir=/backups/mariadb/202101291600/ \
   --user=mariadb-backup \
   --password=mbu_passwd
  7. Use MariaDB Backup to prepare the backup:

$ sudo mariadb-backup --prepare \
   --target-dir=/backups/mariadb/202101291600/
  8. Ensure that all previous operations are complete.

  9. In the original client connection to the primary server, unlock the database with the UNLOCK TABLES statement:

UNLOCK TABLES;

Restore

Use the following process to restore a backup:

  1. Deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.

  2. Ensure that all services are stopped on each node:

$ sudo systemctl stop mariadb-columnstore-cmapi
$ sudo systemctl stop mariadb-columnstore
$ sudo systemctl stop mariadb
  3. Restore the backup of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.

For example, to restore the backup with rsync:

$ sudo rsync -av /backups/columnstore/202101291600/storagemanager/ /var/lib/columnstore/storagemanager/
$ sudo chown -R mysql:mysql /var/lib/columnstore/storagemanager
  4. Restore the backup of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where the N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment.

For example, to restore the backup with rsync in a 3-node deployment:

$ sudo rsync -av /backups/columnstore/202101291600/data1/ /var/lib/columnstore/data1/
$ sudo rsync -av /backups/columnstore/202101291600/data2/ /var/lib/columnstore/data2/
$ sudo rsync -av /backups/columnstore/202101291600/data3/ /var/lib/columnstore/data3/
$ sudo chown -R mysql:mysql /var/lib/columnstore/data1
$ sudo chown -R mysql:mysql /var/lib/columnstore/data2
$ sudo chown -R mysql:mysql /var/lib/columnstore/data3
  5. Use MariaDB Backup to restore the backup of the MariaDB data directory:

$ sudo mariadb-backup --copy-back \
   --target-dir=/backups/mariadb/202101291600/
$ sudo chown -R mysql:mysql /var/lib/mysql
  6. Start the services on each node:

$ sudo systemctl start mariadb
$ sudo systemctl start mariadb-columnstore-cmapi

This page is: Copyright © 2025 MariaDB. All rights reserved.

ColumnStore System Monitoring Configuration

Introduction

ColumnStore is designed to be somewhat self-managing and self-healing. The following two processes help achieve this:

  • ProcMon runs on each node and is responsible for ensuring that the other required ColumnStore processes are started and automatically restarted as appropriate on that server. This in turn is started and monitored by the run.sh shell script which ensures it is restarted should it be killed. The run.sh script is invoked and automatically started by the columnstore systemd service at bootup time. This can also be utilized to restart the service on an individual node though generally it is preferred to use the mcsadmin stop, shutdown, and start commands from the PM1 node.

  • ProcMgr runs on each PM node with only one taking an active role at a time, the others remaining in warm standby mode. This process manager is responsible for overall system health, resource monitoring, and PM node failover management.

To provide additional monitoring guarantees, an external monitoring tool should monitor the health of these processes (run.sh, ProcMon, and ProcMgr) and potentially all ColumnStore processes. If the run.sh process fails, the system is at risk of not being able to self-heal.
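
For example, an external monitor could simply verify that those processes are present; a minimal sketch, using the process names described above:

$ ps -ef | grep -E 'run.sh|ProcMon|ProcMgr' | grep -v grep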

System monitoring configuration

A number of system configuration variables exist to allow fine tuning of the system monitoring capabilities. In general the default values will work relatively well for many cases.

The configuration parameters are maintained in the /usr/local/mariadb/columnstore/etc/Columnstore.xml file. In a multiple server deployment, these should only be edited on the PM1 server, as changes will be automatically replicated to the other servers by the system. A system restart will be required for the configuration change to take effect.

Convenience utility programs getConfig and setConfig are available to safely update the Columnstore.xml without needing to be comfortable with editing XML files. The -h argument will display usage information. The section value will be SystemConfig for all settings in this document. For example:

# ./setConfig SystemConfig ModuleHeartbeatPeriod 5
# ./getConfig SystemConfig ModuleHeartbeatPeriod
5

Module heartbeats

Heartbeat monitoring occurs between modules (both UM and PM) to determine that the module is up and functioning. The module heartbeat settings are the same for all modules.

  1. ModuleHeartbeatPeriod refers to how often the heartbeat test is performed. For example, if you set the period to 5, then the heartbeat test is performed every 5 seconds. The initial default value is 1. To disable heartbeat monitoring set the value to -1.

  2. ModuleHeartbeatCount refers to how many failures in a row must take place before a fault is processed. The initial default value is 3.

Disk threshold

Thresholds can be set to trigger a local alert when file system usage crosses a specified percentage of a file system on a server. Critical, Major, or Minor thresholds can be set for the disk usage of each server. However, it is recommended to use an external system monitoring tool configured to monitor free disk space for proactive external alerting or paging. Actual ColumnStore data is stored within the data directories of the installation, and MariaDB db files are stored under the mysql/db directory.

  1. ExternalMinorThreshold - Percentage threshold at which a minor local alarm is triggered. Default value is 70.

  2. ExternalMajorThreshold - Percentage threshold at which a major local alarm is triggered. Default value is 80.

  3. ExternalCriticalThreshold - Percentage threshold at which a critical local alarm is triggered. Default value is 90.

The value is a numeric percentage between 0 and 100. To disable a particular threshold alarm, set its value to 0.
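
For example, to raise the critical disk threshold to 95% with the setConfig utility described above (a sketch; a system restart is still required for the change to take effect):

# ./setConfig SystemConfig ExternalCriticalThreshold 95
# ./getConfig SystemConfig ExternalCriticalThreshold
95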

Memory utilization

A couple of mcsadmin commands provide convenience functions for monitoring memory utilization across nodes. getSystemMemory returns server level memory statistics and getSystemMemoryUsers shows the top 5 processes by server. The following examples are for a 2 server combined setup:

mcsadmin> getSystemMemory
getsystemmemory   Tue Nov 29 11:14:21 2016

System Memory Usage per Module (in K bytes)

Module  Mem Total  Mem Used  Cache    Mem Usage %  Swap Total  Swap Used  Swap Usage % 
------  ---------  --------  -------  -----------  ----------  ---------  ------------ 
pm1     7979488    1014772   6438432      12       3145724     0               0
pm2     3850724    632712    1134324      16       3145724     0               0

mcsadmin> getSystemMemoryUsers
getsystemmemoryusers   Tue Nov 29 11:41:10 2016

System Process Top Memory Users per Module

Module 'pm1' Top Memory Users (in bytes)

Process             Memory Used  Memory Usage %
-----------------   -----------  --------------
mysqld              19621              3
PrimProc            18990              3
gnome-shell         10192              2
systemd-journald    4236               1
DDLProc             3004               1

Module 'pm2' Top Memory Users (in bytes)

Process             Memory Used  Memory Usage %
-----------------   -----------  --------------
mysqld              19046              5
PrimProc            18891              5
ProcMon             2343               1
workernode          1806               1
WriteEngineServ     1507               1

Viewing storage configuration

To view the storage configuration, use the getStorageConfig command in mcsadmin, or simply use getStorageConfig from the operating system prompt. This will provide information on DBRoots and which PM they are assigned to, if any.

Example:

# mcsadmin getstorageconfig
getstorageconfig   Wed Mar 28 10:40:34 2016

System Storage Configuration

Storage Type = internal
System DBRoot count = 6
DBRoot IDs assigned to 'pm1' = 1
DBRoot IDs assigned to 'pm2' = 2
DBRoot IDs assigned to 'pm3' = 3
DBRoot IDs assigned to 'pm4' = 4
DBRoot IDs assigned to 'pm5' = 5
DBRoot IDs assigned to 'pm6' = 6

Module monitoring configuration

An internal alarm system is used to keep track of internal notable events as a convenience or reference point. It is recommended to use a dedicated system monitoring tool for more proactive alerting of critical CPU, memory, or disk utilization issues for each of the servers.

Alarms are logged to the /var/log/mariadb/columnstore/alarm.log file and a summary is displayed in mcsadmin. The getActiveAlarms command in mcsadmin can be used to retrieve current alarm conditions.

For each module (PM and UM), the following resource monitoring parameters can be configured:

  • CPU thresholds: setModuleTypeConfig (module name) ModuleCPU(Clear/Minor/Major/Critical)Threshold n (where n = percentage of CPU usage)

  • Disk file system use threshold: setModuleTypeConfig (module name) ModuleDisk(Minor/Major/Critical)Threshold n (where n = percentage of disk file system used)

  • Module swap thresholds: setModuleTypeConfig (module name) ModuleSwap(Minor/Major/Critical)Threshold n (where n = percentage of swap space used)

Alarm trigger count threshold

For an alarm, a threshold can be set for how many times the alarm can be triggered in 30 minutes. The default threshold is 100.

setAlarmConfig (alarmID#) Threshold n (where n = maximum number of times an alarm can be triggered in 30 minutes)

Example to change Alarm ID 22's threshold to 50:

# mcsadmin setAlarmConfig 22 Threshold 50

Clearing alarms

The resetAlarm command is used to clear an alarm and acknowledge that the issue is resolved. The resetAlarm command can be invoked with the argument ALL to clear all outstanding local alarms.
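
For example, from the mcsadmin console (a sketch):

mcsadmin> resetAlarm ALL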

Automated restart based on excessive swapping

ColumnStore by default has behavior that will restart a server should swap space utilization exceed the configured module swap major threshold (default is 80%). At this point the system will likely be near unusable, so this is an attempt to recover from very large queries or data loads. This behavior is configured by the SystemConfig section configuration variable SwapAction, which contains the oam command to be run if the threshold is exceeded. The default value is 'restartSystem', but it can be set to 'none' to disable this behavior. The fact that this has happened can be determined by the following log entry:

Nov 01 11:23:13 [ServerMonitor] 13.306324 |0|0|0| C 09 CAL0000: Swap Space usage over Major threashold, perform OAM command restartSystem
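
To disable the automated restart, SwapAction can be set with the same setConfig utility described above (a sketch; a system restart is required for the change to take effect):

# ./setConfig SystemConfig SwapAction none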

Logging level management

There are five levels of logging in MariaDB ColumnStore.

  • Critical

  • Error

  • Warning

  • Info

  • Debug

Application log files are written to /var/log/mariadb/columnstore on each server and log rotation / archiving is configured to manage these automatically.

To get details about current logging configuration:

# mcsadmin getlogconfig
getlogconfig   Wed Oct 19 06:58:47 2016

MariaDB Columnstore System Log Configuration Data

System Logging Configuration File being used: /etc/rsyslog.d/49-columnstore.conf

Module    Configured Log Levels
------    ---------------------------------------
pm1       Critical Error Warning Info

The system logging configuration file referenced is a standard syslog configuration file and may be edited to enable or disable specific levels, for example to disable debug logging and to log only the specific level in each file:

# cat /etc/rsyslog.d/49-columnstore.conf
# MariaDb Columnstore Database Platform Logging
local1.=crit -/var/log/mariadb/columnstore/crit.log
local1.=err -/var/log/mariadb/columnstore/err.log
local1.=warning -/var/log/mariadb/columnstore/warning.log
local1.=info -/var/log/mariadb/columnstore/info.log

After making changes to this file, restart the syslog process, e.g.:

# systemctl restart rsyslog

Log rotation and archiving are also configured by the installer, and the settings for this may be found and managed similarly in the file /etc/logrotate.d/columnstore. If the current log files are manually deleted, restart the syslog process to resume logging.

This page is licensed: CC BY-SA / Gnu FDL

Backup and Restore with Object Storage

Overview

MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses S3-compatible object storage for data and shared local storage for the Storage Manager directory, the S3 bucket, the Storage Manager directory, and the MariaDB data directory must be backed up separately.

Recovery Planning

MariaDB Enterprise ColumnStore supports multiple storage options.

This page discusses how to backup and restore Enterprise ColumnStore when it uses S3-compatible object storage for data and shared local storage (such as NFS) for the Storage Manager directory.

Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata were to become corrupt, Enterprise ColumnStore could become unusable, and data loss could occur.

If Enterprise ColumnStore is your system of record, it should be backed up regularly.

If Enterprise ColumnStore uses S3-compatible object storage for data and shared local storage for the Storage Manager directory, the following items must be backed up:

  • The MariaDB data directory must be backed up using MariaDB Backup.

  • The S3 bucket must be backed up using the vendor's snapshot procedure.

  • The Storage Manager directory must be backed up.

See the instructions below for more details.

Backup

Use the following process to take a backup:

  1. Determine which node is the primary server using curl to send the status command to the CMAPI Server:

$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
   --header 'Content-Type:application/json' \
   --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
   | jq .

The output will show "dbrm_mode": "master" for the primary server:

{
  "timestamp": "2020-12-15 00:40:34.353574",
  "192.0.2.1": {
    "timestamp": "2020-12-15 00:40:34.362374",
    "uptime": 11467,
    "dbrm_mode": "master",
    "cluster_mode": "readwrite",
    "dbroots": [
      "1"
    ],
    "module_id": 1,
    "services": [
      {
        "name": "workernode",
        "pid": 19202
      },
      {
        "name": "controllernode",
        "pid": 19232
      },
      {
        "name": "PrimProc",
        "pid": 19254
      },
      {
        "name": "ExeMgr",
        "pid": 19292
      },
      {
        "name": "WriteEngine",
        "pid": 19316
      },
      {
        "name": "DMLProc",
        "pid": 19332
      },
      {
        "name": "DDLProc",
        "pid": 19366
      }
    ]
  }
}

  2. Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:

$ mariadb --host=192.0.2.1 \
   --user=root \
   --password

  3. Lock the database with the FLUSH TABLES WITH READ LOCK statement:

FLUSH TABLES WITH READ LOCK;

Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.

  4. Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.

For example, to make a copy of the directory with rsync:

$ sudo mkdir -p /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/storagemanager /backups/columnstore/202101291600/

  5. Use MariaDB Backup to back up the MariaDB data directory:

$ sudo mkdir -p /backups/mariadb/202101291600/
$ sudo mariadb-backup --backup \
   --target-dir=/backups/mariadb/202101291600/ \
   --user=mariadb-backup \
   --password=mbu_passwd

  6. Use MariaDB Backup to prepare the backup:

$ sudo mariadb-backup --prepare \
   --target-dir=/backups/mariadb/202101291600/

  7. Create a snapshot of the S3-compatible storage. Consult the storage vendor's manual for details on how to do this.

  8. Ensure that all previous operations are complete.

  9. In the original client connection to the primary server, unlock the database with the UNLOCK TABLES statement:

UNLOCK TABLES;

Restore

Use the following process to restore a backup:

  1. Deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.

  2. Ensure that all services are stopped on each node:

$ sudo systemctl stop mariadb-columnstore-cmapi
$ sudo systemctl stop mariadb-columnstore
$ sudo systemctl stop mariadb

  3. Restore the backup of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.

For example, to restore the backup with rsync:

$ sudo rsync -av /backups/columnstore/202101291600/storagemanager/ /var/lib/columnstore/storagemanager/
$ sudo chown -R mysql:mysql /var/lib/columnstore/storagemanager

  4. Use MariaDB Backup to restore the backup of the MariaDB data directory:

$ sudo mariadb-backup --copy-back \
   --target-dir=/backups/mariadb/202101291600/
$ sudo chown -R mysql:mysql /var/lib/mysql

  5. Restore the snapshot of your S3-compatible storage to the new S3 bucket that you plan to use. Consult the storage vendor's manual for details on how to do this.

  6. Update storagemanager.cnf to configure Enterprise ColumnStore to use the S3 bucket. By default, it is located at /etc/columnstore/storagemanager.cnf.

For example:

[ObjectStorage]
…
service = S3
…
[S3]
bucket = your_columnstore_bucket_name
endpoint = your_s3_endpoint
aws_access_key_id = your_s3_access_key_id
aws_secret_access_key = your_s3_secret_key
# iam_role_name = your_iam_role
# sts_region = your_sts_region
# sts_endpoint = your_sts_endpoint

[Cache]
cache_size = your_local_cache_size
path = your_local_cache_path

  • The default local cache size is 2 GB.

  • The default local cache path is /var/lib/columnstore/storagemanager/cache.

  • Ensure that the local cache path has sufficient storage space to store the local cache.

  • The bucket option must be set to the name of the bucket that you created from your snapshot in the previous step.

  • To use an IAM role, you must also uncomment and set iam_role_name, sts_region, and sts_endpoint.

  7. Start the services on each node:

$ sudo systemctl start mariadb
$ sudo systemctl start mariadb-columnstore-cmapi

This page is: Copyright © 2025 MariaDB. All rights reserved.


ColumnStore Partition Management

  1. Introduction

  2. Managing partitions by partition number

  3. Displaying partition information

  4. Disabling partitions

  5. Enabling partitions

  6. Dropping partitions

  7. Managing partitions by column value

  8. Displaying partition information

  9. Disabling partitions

  10. Enabling partitions

  11. Dropping partitions

  12. Dropping data not wholly within one partition

Introduction

MariaDB ColumnStore automatically creates logical horizontal partitions across every column. For ordered or semi-ordered data fields such as an order date this will result in a highly effective partitioning scheme based on that column. This allows for increased performance of queries filtering on that column since partition elimination can be performed. It also allows for data lifecycle management as data can be disabled or dropped by partition cheaply. Caution should be used when disabling or dropping partitions as these commands are destructive.

It is important to understand that a Partition in ColumnStore terms is actually 2 extents (16 million rows) and that extents & partitions are created according to the following algorithm in 1.0.x:

  1. Create 4 extents in 4 files

  2. When these are filled up (after 32M rows), create 4 more extents in the 4 files created in step 1.

  3. When these are filled up (after 64M rows), create a new partition.
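
The resulting extent layout for a given column can be inspected through the INFORMATION_SCHEMA tables that ship with ColumnStore; a sketch, assuming the orders table used in the examples below:

-- One row per extent; the min/max values are what drive partition elimination.
select c.table_name, c.column_name, e.min_value, e.max_value
from information_schema.columnstore_extents e
join information_schema.columnstore_columns c
  on c.object_id = e.object_id
where c.table_name = 'orders'
  and c.column_name = 'orderdate';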

Managing partitions by partition number

Displaying partition information

Information about all partitions for a given column can be retrieved using the calShowPartitions stored procedure, which takes either two or three parameters: [database_name,] table_name, and column_name. If two parameters are provided, the current database is assumed. For example:

select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min        Max        Status
  0.0.1 1992-01-01 1998-08-02 Enabled
  0.1.2 1998-08-03 2004-05-15 Enabled
  0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+

1 row in set (0.05 sec)

Disabling partitions

The calDisablePartitions stored procedure allows for disabling of one or more partitions. A disabled partition still exists on the file system (and can be enabled again at a later time) but will not participate in any query, DML, or import activity. The procedure takes either two or three parameters: [database_name,] table_name, and partition_numbers separated by commas. If two parameters are provided, the current database is assumed. For example:

select calDisablePartitions('orders','0.0.1');
+----------------------------------------+
| calDisablePartitions('orders','0.0.1') |
+----------------------------------------+
| Partitions are disabled successfully.  |
+----------------------------------------+
1 row in set (0.28 sec)

The result showing the first partition has been disabled:

select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min        Max        Status
  0.0.1 1992-01-01 1998-08-02 Disabled
  0.1.2 1998-08-03 2004-05-15 Enabled
  0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)

Enabling partitions

The calEnablePartitions stored procedure allows for enabling of one or more partitions. The procedure takes the same set of parameters as calDisablePartitions. For example:

select calEnablePartitions('orders', '0.0.1');
+----------------------------------------+
| calEnablePartitions('orders', '0.0.1') |
+----------------------------------------+
| Partitions are enabled successfully.   |
+----------------------------------------+
1 row in set (0.28 sec)

The result showing the first partition has been enabled:

select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min        Max        Status
  0.0.1 1992-01-01 1998-08-02 Enabled
  0.1.2 1998-08-03 2004-05-15 Enabled
  0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)

Dropping partitions

The calDropPartitions stored procedure allows for dropping of one or more partitions. Dropping means that the underlying storage is deleted and the partition is completely removed. A partition can be dropped from either enabled or disabled state. The procedure takes the same set of parameters as calDisablePartitions. Extra caution should be used with this procedure since it is destructive and cannot be reversed. For example:

select calDropPartitions('orders', '0.0.1');
+--------------------------------------+
| calDropPartitions('orders', '0.0.1') |
+--------------------------------------+
| Partitions are dropped successfully  |
+--------------------------------------+
1 row in set (0.28 sec)

The result showing the first partition has been dropped:

select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min        Max        Status
  0.1.2 1998-08-03 2004-05-15 Enabled
  0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)

Managing partitions by column value

Displaying partition information

Information about a range of partitions for a given column can be retrieved using the calShowPartitionsByValue stored procedure. This procedure takes either four or five parameters: [database_name,] table_name, column_name, start_value, and end_value. If four parameters are provided, the current database is assumed. Only casual partition column types (INTEGER, DECIMAL, DATE, DATETIME, CHAR up to 8 bytes, and VARCHAR up to 7 bytes) are supported for this function.

The function returns a list of partitions whose minimum and maximum values for the column 'col_name' fall completely within the range of 'start_value' and 'end_value'. For example:

select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24') |
+----------------------------------------------------------------------------+
| Part# Min        Max        Status
  0.0.1 1992-01-01 1998-08-02 Enabled
  0.1.2 1998-08-03 2004-05-15 Enabled
  0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)

Disabling partitions

The calDisablePartitionsByValue stored procedure allows for disabling of one or more partitions by value. A disabled partition still exists on the file system (and can be enabled again at a later time) but will not participate in any query, DML, or import activity. The procedure takes the same set of arguments as calShowPartitionsByValue. A good practice is to use calShowPartitionsByValue to identify the partitions to be disabled and then use the same argument values to construct the calDisablePartitionsByValue call. For example:

select calDisablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+---------------------------------------------------------------------------------+
| calDisablePartitionsByValue('orders', 'orderdate', '1992-01-01', '1998-08-02')  |
+---------------------------------------------------------------------------------+
| Partitions are disabled successfully                                            |
+---------------------------------------------------------------------------------+
1 row in set (0.28 sec)

The result showing the first partition has been disabled:

select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24') |
+----------------------------------------------------------------------------+
| Part# Min        Max        Status
  0.0.1 1992-01-01 1998-08-02 Disabled
  0.1.2 1998-08-03 2004-05-15 Enabled
  0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)

Enabling partitions

The calEnablePartitionsByValue stored procedure allows for enabling of one or more partitions by value. The procedure takes the same set of arguments as calShowPartitionsByValue. A good practice is to use calShowPartitionsByValue to identify the partitions to be enabled and then use the same argument values to construct the calEnablePartitionsByValue call. For example:

select calEnablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+--------------------------------------------------------------------------------+
| calEnablePartitionsByValue('orders', 'orderdate', '1992-01-01', '1998-08-02')  |
+--------------------------------------------------------------------------------+
| Partitions are enabled successfully                                            |
+--------------------------------------------------------------------------------+
1 row in set (0.28 sec)

The result showing the first partition has been enabled:

select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24') |
+----------------------------------------------------------------------------+
| Part# Min        Max        Status
  0.0.1 1992-01-01 1998-08-02 Enabled
  0.1.2 1998-08-03 2004-05-15 Enabled
  0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)

Dropping partitions

The calDropPartitionsByValue stored procedure allows for dropping of one or more partitions by value. Dropping means that the underlying storage is deleted and the partition is completely removed. A partition can be dropped from either enabled or disabled state. The procedure takes the same set of arguments as calShowPartitionsByValue. A good practice is to use calShowPartitionsByValue to identify the partitions to be dropped and then use the same argument values to construct the calDropPartitionsByValue call. Extra caution should be used with this procedure since it is destructive and cannot be reversed. For example:

select calDropPartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+------------------------------------------------------------------------------+
| calDropPartitionsByValue('orders', 'orderdate', '1992-01-01', '1998-08-02')  |
+------------------------------------------------------------------------------+
| Partitions are dropped successfully.                                         |
+------------------------------------------------------------------------------+
1 row in set (0.28 sec)

The result showing the first partition has been dropped:

select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24') |
+----------------------------------------------------------------------------+
| Part# Min        Max        Status
  0.1.2 1998-08-03 2004-05-15 Enabled
  0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)

Dropping data not wholly within one partition

Since the partitioning scheme is system maintained, the min and max values are not directly specified but are influenced by the order of data loading. If the goal is to drop a specific date range, then additional deletes are required to achieve this. The following cases may occur:

  • For semi-ordered data, there may be overlap between min and max values between partitions.

  • As in the example above, the partition ranged from 1992-01-01 to 1998-08-02. Potentially it may be desirable to drop the remaining 1998 rows.

A bulk delete statement can be used to delete the remaining rows that do not fall exactly within partition ranges. The partition drops will be fastest; however, the system optimizes bulk delete statements to delete by block internally, so they are still relatively fast.

delete from orders where orderdate <= '1998-12-31';

This page is licensed: CC BY-SA / Gnu FDL

ColumnStore Redistribute Data

Introduction

When new PM nodes are added to a running instance it may be desirable to redistribute the data in current nodes across all of the nodes. This is not strictly required as ongoing data ingestion will prioritize the new empty nodes for data loading to rebalance the system.

An important point is that the operation works at Partition granularity, so a minimal data set is 64M rows in a table for this to run.

Usage

A command redistributeData is available in the admin console to initiate a data distribution:

mcsadmin> redistributeData start
redistributedata   Tue Dec 13 04:42:31 2016
redistributeData START
Source dbroots: 1 2
Destination dbroots: 1 2

WriteEngineServer returned status 1: Cleared.
WriteEngineServer returned status 2: Redistribute is started.

The command has 3 possible options:

  • Start : start a new redistribution to redistribute data equally amongst the current set of DBRoots in the system.

  • Stop : abort a redistribution leaving the system in a usable state.

  • Status : return status information on an active redistribution.

The start command can take an option of Remove. The Start Remove option is used to remove all data from the enumerated list of dbroots and redistribute the data to the remaining dbroots. This should be done before taking a dbroot out of service in order to preserve the data on that dbroot. The dbroot list is a space delimited list of integers representing the dbroots to be emptied.
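
For example, to empty DBRoot 3 before taking it out of service (a sketch; the DBRoot number is illustrative):

mcsadmin> redistributeData start remove 3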

As mentioned above, the operation works at partition granularity, which means that a minimal move is 64 million rows. Any table smaller than that will not be redistributed and there may be as much as one full Partition difference in the resulting balance. The redistribute logic does not currently consolidate individually deleted records.

Redistribute can take a long time. During this time, it is required that all data manipulation, including bulk inserts, is suspended. SuspendDatabaseWrites must be called before redistributeData, and ResumeDatabaseWrites should be called when the redistribution is complete.
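
A typical sequence therefore looks like the following sketch (assuming the trailing y answers each confirmation prompt):

mcsadmin> suspendDatabaseWrites y
mcsadmin> redistributeData start
mcsadmin> redistributeData status -r
mcsadmin> resumeDatabaseWrites y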

If "redistributeData stop" is called, all processing stops where it's at, but in a usable state. "redistributeData status" can be used to see how much has been done. A further "redistributeData start" will start over using the new state of the system. This may lead to a less optimal distribution, so stop-start sequences aren't recommended.

While the system is working, "redistributeData status" can be called to see what is happening. A -r option can be used on the status command line to repeat the call and act as a monitor.

To see how much data resides on any given DBRoot for a table, you can use a query like:

select count(*) from <table> where idbdbroot(<any column>)=<dbrootnum>;

This page is licensed: CC BY-SA / Gnu FDL
