Managing MariaDB ColumnStore involves setup, configuration, and tools like mcsadmin and cpimport for efficient analytics.
The following outlines the minimum and recommended production server specifications, which can be followed for both on-premise and cloud deployments:

Physical Server: 8 Core CPU, 32 GB Memory (minimum); 64 Core CPU, 128 GB Memory (recommended)
Storage: Local disk, or StorageManager (S3) for cloud deployments
Network Interconnect: In a multi-server deployment, data is passed between nodes via TCP/IP networking. At least a 1G network is recommended.
These are minimum recommendations and in general the system will perform better with more hardware:
More CPU cores and servers will improve query processing response time.
More memory will allow the system to cache more data blocks in memory. We have users running systems with anywhere from 64 GB to 512 GB of RAM for UMs and 32 to 64 GB of RAM for PMs.
A faster network will allow data to flow faster between UM and PM nodes.
SSDs may be used; however, the system is optimized towards block streaming, which may perform well enough with HDDs at lower cost.
Where it is an option, it is recommended to use bare metal servers for additional performance since ColumnStore will fully consume CPU cores and memory.
In general it makes more sense to use a higher core count / higher memory server for single server or 2 server combined deployments.
In a deployment with multiple UM nodes the system will round robin requests from the mysqld handling the query to any ExeMgr in the cluster for load balancing. A higher bandwidth network such as 10G or 40G will be of benefit for large result set queries.
For AWS, our own internal testing generally uses m4.4xlarge instance types as a cost-effective middle ground. The r4.8xlarge has also been tested and performs about twice as fast for about twice the price.
This page is licensed: CC BY-SA / Gnu FDL
Managing MariaDB ColumnStore means deploying its architecture, scaling modules, and maintaining performance through monitoring, optimization, and backups.
MariaDB ColumnStore backup and restore manage distributed data using snapshots or tools like mariadb-backup, with restoration ensuring cluster sync via cpimport or file system recovery.
MariaDB Server includes an optional Audit Plugin that enables logging and tracking of all user access and statements. It is included with ColumnStore and can be enabled as follows.
To enable the audit plugin for the currently running instance (but not across restarts), run the following as mcsmysql with the default root account:
INSTALL PLUGIN server_audit
SONAME 'server_audit.so';
To have this persist across restarts edit the ColumnStore my.cnf file (example shown for root install):
$ vi /usr/local/mariadb/columnstore/mysql/my.cnf
[mysqld]
...
plugin_load=server_audit=server_audit.so
For more details see the Audit Plugin documentation.
To enable audit logging the following global variable must be set to ON:
SET GLOBAL server_audit_logging=ON;
To ensure this persists across restarts edit the ColumnStore my.cnf file (example shown for root install):
$ vi /usr/local/mariadb/columnstore/mysql/my.cnf
[server]
...
server_audit_logging=ON
This will enable logging to the file /usr/local/mariadb/columnstore/mysql/db/server_audit.log. For example:
20170914 17:31:24,centos,root,localhost,11,114,QUERY,loans,'SELECT grade, AVG(loan_amnt) avg FROM loanstats GROUP BY grade ORDER BY grade',0
To have the log entries written to syslog the global variable server_audit_output_type should be changed from 'file' to 'syslog'. In this case the 'syslog_info' entry contains the ColumnStore server instance name, for example:
Sep 14 17:46:51 centos mysql-server_auditing: columnstore-1 centos,root,localhost,11,117,QUERY,loans,'SELECT grade, AVG(loan_amnt) avg FROM loanstats GROUP BY grade ORDER BY grade',0
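The output type can also be changed at runtime. A minimal sketch, assuming the mariadb command-line client and sufficient privileges:

mariadb -u root -e "SET GLOBAL server_audit_output_type='syslog';"
mariadb -u root -e "SHOW GLOBAL VARIABLES LIKE 'server_audit_output_type';"

As with server_audit_logging, add the setting to the my.cnf file to persist it across restarts.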
For additional configuration and customization options see the Audit Plugin documentation.
This page is licensed: CC BY-SA / Gnu FDL
In the case where an entry in MariaDB ColumnStore's configuration needs to be updated and distributed, this can be done from the command line on Performance Module #1. All changes made to MariaDB ColumnStore's configuration file need to be applied on PM1.
NOTE: 'mcsadmin distributeconfigfile' only needs to be run if the system is Active. If the system is down, then just make the change and when the system is started, the update will get distributed.
Here is an example:
/usr/local/mariadb/columnstore/bin/configxml.sh setconfig SystemConfig SystemName mcs-1
mcsadmin distributeconfigfile
In cases where MariaDB ColumnStore's configuration file gets out of sync with the PM1 copy, run the following to redistribute MariaDB ColumnStore's configuration file to the other nodes.
To do this run the following on PM1:
mcsadmin distributeconfigfile
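To verify that a value is in place after distribution, the companion getConfig utility can read it back on any node. A minimal sketch, assuming a root install layout:

/usr/local/mariadb/columnstore/bin/getConfig SystemConfig SystemName
mcs-1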
This page is licensed: CC BY-SA / Gnu FDL
MariaDB ColumnStore utilizes an Extent Map to manage data distribution across extents—logical blocks within physical segment files ranging from 8 to 64 MB. Each extent holds a consistent number of rows, with the Extent Map cataloging these extents, their corresponding block identifiers (LBIDs), and the minimum and maximum values for each column's data within the extent.
ColumnStore's extent map is a smart structure that underpins its performance. By providing a logical partitioning scheme, it avoids the overhead associated with indexing and other common row-based database optimizations.
The primary node in a ColumnStore cluster holds the master copy of the extent map. Upon system startup, this master copy is read into memory and then replicated to all other participating nodes for high availability and disaster recovery. Nodes keep the extent map in memory for rapid access during query processing. As data within extents is modified, these updates are broadcast to all participating nodes to maintain consistency.
If the master copy of the extent map becomes corrupted, the entire system could become unusable, potentially leading to data loss. Having a recent backup of the extent map allows for a much faster recovery compared to reloading the entire database in such a scenario.
To safeguard against potential Extent Map corruption, regularly back up the master copy:
Lock Table:

mariadb -e "FLUSH TABLES WITH READ LOCK;"

Save BRM:

save_brm

Create Backup Directory:

mkdir -p /extent_map_backup

Copy Extent Map:

cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /extent_map_backup

Unlock Tables:

mariadb -e "UNLOCK TABLES;"

If the master Extent Map becomes corrupted, restore it from the backup. On a single-node deployment:

Stop ColumnStore:

systemctl stop mariadb-columnstore

Rename Corrupted Map:

mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad

Clear Versioning Files:

> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss

Restore Backup:

cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/

Set Ownership:

chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm/

Start ColumnStore:

systemctl start mariadb-columnstore

On a multi-node cluster, stop and start the cluster through the CMAPI instead:

Shutdown Cluster:

curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/shutdown \
  --header 'Content-Type:application/json' \
  --header 'x-api-key:your_api_key' \
  --data '{"timeout":60}' -k

Rename Corrupted Map:

mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad

Clear Versioning Files:

> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss

Restore Backup:

cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/

Set Ownership:

chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm

Start Cluster:

curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/start \
  --header 'Content-Type:application/json' \
  --header 'x-api-key:your_api_key' \
  --data '{"timeout":60}' -k
Incorporate the save_brm command into your data import scripts (e.g., those using cpimport) to automate Extent Map backups. This practice ensures regular backups without manual intervention. Refer to the MariaDB ColumnStore Backup Script for an example implementation.
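For example, a load script might call save_brm and snapshot the Extent Map right after each cpimport run. The following is a minimal sketch, assuming a root-owned installation under /var/lib/columnstore and illustrative database, table, and path names:

#!/bin/bash
set -e

DB=loans                         # illustrative database name
TABLE=loanstats                  # illustrative table name
DATA=/data/loads/loanstats.csv   # illustrative load file
BACKUP_DIR=/extent_map_backup

# Bulk load the new data.
cpimport "$DB" "$TABLE" "$DATA"

# Persist the BRM structures, then snapshot the Extent Map with a timestamp.
save_brm
mkdir -p "$BACKUP_DIR"
cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em \
      "$BACKUP_DIR/BRM_saves_em.$(date +%Y%m%d%H%M)"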
This page is licensed: CC BY-SA / Gnu FDL
mariadb -e "FLUSH TABLES WITH READ LOCK;"
save_brm
mkdir -p /extent_map_backup
cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /extent_map_backup
mariadb -e "UNLOCK TABLES;"
systemctl stop mariadb-columnstore
mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm/
systemctl start mariadb-columnstore
curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/shutdown \ --header 'Content-Type:application/json' \ --header 'x-api-key:your_api_key' \ --data '{"timeout":60}' -k
mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
> /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
mv cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm
curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/start \ --header 'Content-Type:application/json' \ --header 'x-api-key:your_api_key' \ --data '{"timeout":60}' -k
MariaDB Enterprise ColumnStore supports backup and restore.
Before you determine a backup strategy for your Enterprise ColumnStore deployment, it is a good idea to determine the system of record for your Enterprise ColumnStore data.
A system of record is the authoritative data source for a given piece of information. Organizations often store duplicate information in several systems, but only a single system can be the authoritative data source.
Enterprise ColumnStore is designed to handle analytical processing for OLAP, data warehousing, DSS, and hybrid workloads on very large data sets. Analytical processing does not generally happen on the system of record. Instead, analytical processing generally occurs on a specialized database that is loaded with data from the separate system of record. Additionally, very large data sets can be difficult to back up. Therefore, it may be beneficial to only backup the system of record.
If Enterprise ColumnStore is not acting as the system of record for your data, you should determine how the system of record affects your backup plan:
If your system of record is another database server, you should ensure that the other database server is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the other database server.
If your system of record is a set of data files, you should ensure that the set of data files is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the set of data files.
MariaDB Enterprise ColumnStore supports full backup and restore for all storage types. A full backup includes:
Enterprise ColumnStore's data and metadata
With S3: an S3 snapshot of the S3-compatible object storage, plus a file system snapshot or copy of the Storage Manager directory.
Without S3: a file system snapshot or copy of the DB Root directories.
The MariaDB data directory from the primary node
To see the procedure to perform a full backup and restore, choose the storage type:
This page is: Copyright © 2025 MariaDB. All rights reserved.
MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses shared local storage for the DB Root directories, the DB Root directories and the MariaDB data directory must be backed up separately.
MariaDB Enterprise ColumnStore supports multiple storage options.
This page discusses how to backup and restore Enterprise ColumnStore when it uses shared local storage (such as NFS) for the DB Root directories.
Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata were to become corrupt, Enterprise ColumnStore could become unusable, and data loss could occur.
If Enterprise ColumnStore is your system of record, it should be backed up regularly.
If Enterprise ColumnStore uses shared local storage for the DB Root directories, the following items must be backed up:
The MariaDB Data directory is backed up using MariaDB Backup
The Storage Manager directory must be backed up
Each DB Root directory must be backed up
See the instructions below for more details.
Use the following process to take a backup:
Determine which node is the primary server using curl to send the status command to the CMAPI Server:
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .
The output will show "dbrm_mode": "master" for the primary server:
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
  }
}
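When scripting, the primary node can be picked straight out of this JSON. A sketch using jq (the filter and placeholder API key are illustrative):

curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
  --header 'Content-Type:application/json' \
  --header 'x-api-key:your_api_key' \
  | jq -r 'to_entries[] | select(.value.dbrm_mode? == "master") | .key'

This prints the address of the node reporting "dbrm_mode": "master" (192.0.2.1 in the example above).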
Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:
$ mariadb --host=192.0.2.1 \
--user=root \
--password
Lock the database with the statement:
FLUSH TABLES WITH READ LOCK;
Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.
Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to make a copy of the directory with rsync:
$ sudo mkdir -p /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/storagemanager /backups/columnstore/202101291600/
Make a copy or snapshot of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where N is an integer that starts at 1 and stops at the number of nodes in the deployment.
For example, to make a copy of the directories with rsync in a 3-node deployment:
$ sudo rsync -av /var/lib/columnstore/data1 /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/data2 /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/data3 /backups/columnstore/202101291600/
Use MariaDB Backup to back up the MariaDB data directory:
$ sudo mkdir -p /backups/mariadb/202101291600/
$ sudo mariadb-backup --backup \
--target-dir=/backups/mariadb/202101291600/ \
--user=mariadb-backup \
--password=mbu_passwd
Use MariaDB Backup to prepare the backup:
$ sudo mariadb-backup --prepare \
--target-dir=/backups/mariadb/202101291600/
Ensure that all previous operations are complete.
In the original client connection to the primary server, unlock the database with the statement:
UNLOCK TABLES;
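The whole sequence can be scripted. The sketch below holds the read lock open in a background client session via a FIFO while the copies run; the host, credentials, timestamp, and paths are the illustrative values from the steps above, and the rsync list should be extended for additional DB Roots:

#!/bin/bash
set -e
STAMP=202101291600   # illustrative timestamp label

# Open a client session that holds FLUSH TABLES WITH READ LOCK until we close it.
mkfifo /tmp/cs_lock_pipe
mariadb --host=192.0.2.1 --user=root --password=secret < /tmp/cs_lock_pipe &
exec 3>/tmp/cs_lock_pipe
echo "FLUSH TABLES WITH READ LOCK;" >&3

# Copy the Storage Manager and DB Root directories, then take the MariaDB backup.
sudo mkdir -p /backups/columnstore/$STAMP /backups/mariadb/$STAMP
sudo rsync -av /var/lib/columnstore/storagemanager /backups/columnstore/$STAMP/
sudo rsync -av /var/lib/columnstore/data1 /backups/columnstore/$STAMP/
sudo mariadb-backup --backup --target-dir=/backups/mariadb/$STAMP/ \
     --user=mariadb-backup --password=mbu_passwd
sudo mariadb-backup --prepare --target-dir=/backups/mariadb/$STAMP/

# Release the lock and let the client session exit.
echo "UNLOCK TABLES;" >&3
exec 3>&-
wait
rm -f /tmp/cs_lock_pipe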
Use the following process to restore a backup:
Deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.
Ensure that all services are stopped on each node:
$ sudo systemctl stop mariadb-columnstore-cmapi
$ sudo systemctl stop mariadb-columnstore
$ sudo systemctl stop mariadb
Restore the backup of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to restore the backup with rsync:
$ sudo rsync -av /backups/columnstore/202101291600/storagemanager/ /var/lib/columnstore/storagemanager/
$ sudo chown -R mysql:mysql /var/lib/columnstore/storagemanager
Restore the backup of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where N is an integer that starts at 1 and stops at the number of nodes in the deployment.
For example, to restore the backup with rsync in a 3-node deployment:
$ sudo rsync -av /backups/columnstore/202101291600/data1/ /var/lib/columnstore/data1/
$ sudo rsync -av /backups/columnstore/202101291600/data2/ /var/lib/columnstore/data2/
$ sudo rsync -av /backups/columnstore/202101291600/data3/ /var/lib/columnstore/data3/
$ sudo chown -R mysql:mysql /var/lib/columnstore/data1
$ sudo chown -R mysql:mysql /var/lib/columnstore/data2
$ sudo chown -R mysql:mysql /var/lib/columnstore/data3
Use MariaDB Backup to restore the backup of the MariaDB data directory:
$ sudo mariadb-backup --copy-back \
--target-dir=/backups/mariadb/202101291600/
$ sudo chown -R mysql:mysql /var/lib/mysql
Start the services on each node:
$ sudo systemctl start mariadb
$ sudo systemctl start mariadb-columnstore-cmapi
This page is: Copyright © 2025 MariaDB. All rights reserved.
ColumnStore is designed to be somewhat self-managing and self-healing. The following two processes help achieve this:
ProcMon runs on each node and is responsible for ensuring that the other required ColumnStore processes are started and automatically restarted as appropriate on that server. This in turn is started and monitored by the run.sh shell script which ensures it is restarted should it be killed. The run.sh script is invoked and automatically started by the columnstore systemd service at bootup time. This can also be utilized to restart the service on an individual node though generally it is preferred to use the mcsadmin stop, shutdown, and start commands from the PM1 node.
ProcMgr runs on each PM node with only one taking an active role at a time, the others remaining in warm standby mode. This process manager is responsible for overall system health, resource monitoring, and PM node failover management.
To provide additional monitoring guarantees, an external monitoring tool should monitor the health of these three processes (run.sh, ProcMon, and ProcMgr) and potentially all ColumnStore processes. If the run.sh process fails, the system is at risk of not being able to self-heal.
A number of system configuration variables exist to allow fine tuning of the system monitoring capabilities. In general the default values will work relatively well for many cases.
The configuration parameters are maintained in the /usr/local/mariadb/columnstore/etc/Columnstore.xml file. In a multiple-server deployment these should only be edited on the PM1 server, as changes will be automatically replicated to the other servers by the system. A system restart is required for a configuration change to take effect.
Convenience utility programs getConfig and setConfig are available to safely update Columnstore.xml without needing to be comfortable with editing XML files. The -h argument will display usage information. The section value will be SystemConfig for all settings in this document. For an example, see the setConfig/getConfig listing later on this page.
Heartbeat monitoring occurs between modules (both UM and PM) to determine that the module is up and functioning. The module heartbeat settings are the same for all modules.
ModuleHeartbeatPeriod refers to how often the heartbeat test is performed. For example, if you set the period to 5, then the heartbeat test is performed every 5 seconds. The initial default value is 1. To disable heartbeat monitoring set the value to -1.
ModuleHeartbeatCount refers to how many failures in a row must take place before a fault is processed. The initial default value is 3.
Thresholds can be set to trigger a local alert when file system usage crosses a specified percentage of a file system on a server. Critical, Major, or Minor thresholds can be set for the disk usage of each server. However, it is recommended to use an external system monitoring tool configured to monitor for free disk space to perform proactive external alerting or paging. Actual ColumnStore data is stored within the data directories of the installation, and MariaDB db files are stored under the mysql/db directory.
ExternalMinorThreshold - Percentage threshold for when a minor local alarm is triggered. Default value is 70.
ExternalMajorThreshold - Percentage threshold for when a major local alarm is triggered. Default value is 80.
ExternalCriticalThreshold - Percentage threshold for when a critical local alarm is triggered. Default value is 90.
The value is a numeric percentage between 0 and 100. To disable a particular threshold alarm, set its value to 0.
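For example, to raise the minor threshold to 75% with the setConfig utility described earlier (value illustrative; run on PM1, and restart the system for the change to take effect):

# ./setConfig SystemConfig ExternalMinorThreshold 75
# ./getConfig SystemConfig ExternalMinorThreshold
75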
A couple of mcsadmin commands provide convenience functions for monitoring memory utilization across nodes. getSystemMemory returns server level memory statistics and getSystemMemoryUsers shows the top 5 processes by server. The following examples are for a 2 server combined setup:
To view the storage configuration, use the getStorageConfig command in mcsadmin, or simply run mcsadmin getStorageConfig from the operating system prompt. This will provide information on DBRoots and which PM they are assigned to, if any.
Example:
An internal alarm system is used to keep track of internal notable events as a convenience or reference point. It is recommended to use a dedicated system monitoring tool for more proactive alerting of critical CPU, memory, or disk utilization issues for each of the servers.
Alarms are logged to the /var/log/mariadb/columnstore/alarm.log file and a summary is displayed in mcsadmin. The getActiveAlarms command in mcsadmin can be used to retrieve current alarm conditions.
For each module (PM and UM), the following resource monitoring parameters can be configured (see the setModuleTypeConfig listings later on this page):
For each alarm, a threshold can be set for how many times the alarm can be triggered in 30 minutes, using setAlarmConfig (alarmID#) Threshold n (where n = maximum number of times an alarm can be triggered in 30 minutes). The default threshold is 100. An example of changing Alarm ID 22's threshold to 50 appears in the listings later on this page.
The resetAlarm command is used to clear and acknowledge the issue is resolved. The resetAlarm command can be invoked with the argument ALL to clear all outstanding local alarms.
ColumnStore by default has behavior that will restart a server should swap space utilization exceed the configured module swap major threshold (default 80%). At this point the system will likely be nearly unusable, so this is an attempt to recover from very large queries or data loads. This behavior is configured by the SystemConfig section variable SwapAction, which contains the oam command to be run if the threshold is exceeded. The default value is 'restartSystem', but it can be set to 'none' to disable this behavior. The fact that this has happened can be determined by the ServerMonitor log entry shown in the listings later on this page.
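For example, a sketch of disabling the restart behavior with the same setConfig utility (a system restart is required for the change to take effect):

# ./setConfig SystemConfig SwapAction none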
There are five levels of logging in MariaDB ColumnStore.
Critical
Error
Warning
Info
Debug
Application log files are written to /var/log/mariadb/columnstore on each server and log rotation / archiving is configured to manage these automatically.
To get details about current logging configuration:
The system logging configuration file referenced is a standard syslog configuration file and may be edited to enable and/or disable specific levels, for example to disable debug logging and to log only the specific level in each file:
After making changes to this file, restart the syslog process, e.g.:
Log rotation and archiving are also configured by the installer and the settings for this may be found and managed similarly in the file /etc/logrotate.d/columnstore. If the current log files are manually deleted restart the syslog process to resume logging.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses S3-compatible object storage for data and shared local storage for the Storage Manager directory, the S3 bucket, the Storage Manager directory, and the MariaDB data directory must be backed up separately.
MariaDB Enterprise ColumnStore supports multiple storage options.
This page discusses how to backup and restore Enterprise ColumnStore when it uses S3-compatible object storage for data and shared local storage (such as NFS) for the Storage Manager directory.
Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata were to become corrupt, Enterprise ColumnStore could become unusable, and data loss could occur.
If Enterprise ColumnStore is your system of record, it should be backed up regularly.
If Enterprise ColumnStore uses S3-compatible object storage for data and shared local storage for the Storage Manager directory, the following items must be backed up:
The MariaDB Data directory is backed up using MariaDB Backup.
The S3 bucket must be backed up using the vendor's snapshot procedure.
The Storage Manager directory must be backed up.
See the instructions below for more details.
Use the following process to take a backup:
Determine which node is the primary server using curl to send the status command to the CMAPI Server:
The output will show "dbrm_mode": "master" for the primary server:
Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:
Lock the database with the statement:
Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.
Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to make a copy of the directory with rsync:
Use MariaDB Backup to back up the MariaDB data directory:
Use MariaDB Backup to prepare the backup:
Create a snapshot of the S3-compatible storage. Consult the storage vendor's manual for details on how to do this.
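The snapshot mechanism is vendor-specific. As one illustration only, an AWS-hosted bucket could be copied with the AWS CLI (bucket names are placeholders; prefer the vendor's native snapshot feature where one exists):

aws s3 sync s3://your_columnstore_bucket_name s3://your_backup_bucket_name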
Ensure that all previous operations are complete.
In the original client connection to the primary server, unlock the database with the statement:
Use the following process to restore a backup:
Deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.
Ensure that all services are stopped on each node:
Restore the backup of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.
For example, to restore the backup with rsync:
Use MariaDB Backup to restore the backup of the MariaDB data directory:
Restore the snapshot of your S3-compatible storage to the new S3 bucket that you plan to use. Consult the storage vendor's manual for details on how to do this.
Update storagemanager.cnf to configure Enterprise ColumnStore to use the S3 bucket. By default, it is located at /etc/columnstore/storagemanager.cnf.
For example:
The default local cache size is 2 GB.
The default local cache path is /var/lib/columnstore/storagemanager/cache.
Ensure that the local cache path has sufficient storage space to store the local cache.
The bucket option must be set to the name of the bucket that you created from your snapshot in the previous step.
To use an IAM role, you must also uncomment and set iam_role_name, sts_region, and sts_endpoint.
Start the services on each node:
This page is: Copyright © 2025 MariaDB. All rights reserved.
# ./setConfig SystemConfig ModuleHeartbeatPeriod 5
# ./getConfig SystemConfig ModuleHeartbeatPeriod
5
mcsadmin> getSystemMemory
getsystemmemory Tue Nov 29 11:14:21 2016
System Memory Usage per Module (in K bytes)
Module Mem Total Mem Used Cache Mem Usage % Swap Total Swap Used Swap Usage %
------ --------- -------- ------- ----------- ---------- --------- ------------
pm1 7979488 1014772 6438432 12 3145724 0 0
pm2 3850724 632712 1134324 16 3145724 0 0
mcsadmin> getSystemMemoryUsers
getsystemmemoryusers Tue Nov 29 11:41:10 2016
System Process Top Memory Users per Module
Module 'pm1' Top Memory Users (in bytes)
Process Memory Used Memory Usage %
----------------- ----------- --------------
mysqld 19621 3
PrimProc 18990 3
gnome-shell 10192 2
systemd-journald 4236 1
DDLProc 3004 1
Module 'pm2' Top Memory Users (in bytes)
Process Memory Used Memory Usage %
----------------- ----------- --------------
mysqld 19046 5
PrimProc 18891 5
ProcMon 2343 1
workernode 1806 1
WriteEngineServ 1507 1
# mcsadmin getstorageconfig
getstorageconfig Wed Mar 28 10:40:34 2016
System Storage Configuration
Storage Type = internal
System DBRoot count = 6
DBRoot IDs assigned to 'pm1' = 1
DBRoot IDs assigned to 'pm2' = 2
DBRoot IDs assigned to 'pm3' = 3
DBRoot IDs assigned to 'pm4' = 4
DBRoot IDs assigned to 'pm5' = 5
DBRoot IDs assigned to 'pm6' = 6
CPU thresholds
setModuleTypeConfig (module name) ModuleCPU(Clear/Minor/Major/Critical)Threshold n (where n = percentage of CPU usage)
Disk file system use threshold
setModuleTypeConfig (module name) ModuleDisk(Minor/Major/Critical)Threshold n (where n = percentage of disk file system used)
Module swap thresholds
setModuleTypeConfig (module name) ModuleSwap(Minor/Major/Critical)Threshold n (where n = percentage of swap space used)
setAlarmConfig (alarmID#) Threshold n
# mcsadmin setAlarmConfig 22 Threshold 50
Nov 01 11:23:13 [ServerMonitor] 13.306324 |0|0|0| C 09 CAL0000: Swap Space usage over Major threashold, perform OAM command restartSystem
# mcsadmin getlogconfig
getlogconfig Wed Oct 19 06:58:47 2016
MariaDB Columnstore System Log Configuration Data
System Logging Configuration File being used: /etc/rsyslog.d/49-columnstore.conf
Module Configured Log Levels
------ ---------------------------------------
pm1 Critical Error Warning Info
# cat /etc/rsyslog.d/49-columnstore.conf
# MariaDb Columnstore Database Platform Logging
local1.=crit -/var/log/mariadb/columnstore/crit.log
local1.=err -/var/log/mariadb/columnstore/err.log
local1.=warning -/var/log/mariadb/columnstore/warning.log
local1.=info -/var/log/mariadb/columnstore/info.log
# systemctl restart rsyslog
$ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
--header 'Content-Type:application/json' \
--header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
| jq .
{
"timestamp": "2020-12-15 00:40:34.353574",
"192.0.2.1": {
"timestamp": "2020-12-15 00:40:34.362374",
"uptime": 11467,
"dbrm_mode": "master",
"cluster_mode": "readwrite",
"dbroots": [
"1"
],
"module_id": 1,
"services": [
{
"name": "workernode",
"pid": 19202
},
{
"name": "controllernode",
"pid": 19232
},
{
"name": "PrimProc",
"pid": 19254
},
{
"name": "ExeMgr",
"pid": 19292
},
{
"name": "WriteEngine",
"pid": 19316
},
{
"name": "DMLProc",
"pid": 19332
},
{
"name": "DDLProc",
"pid": 19366
}
]
  }
}
$ mariadb --host=192.0.2.1 \
--user=root \
--password
FLUSH TABLES WITH READ LOCK;
$ sudo mkdir -p /backups/columnstore/202101291600/
$ sudo rsync -av /var/lib/columnstore/storagemanager /backups/columnstore/202101291600/
$ sudo mkdir -p /backups/mariadb/202101291600/
$ sudo mariadb-backup --backup \
--target-dir=/backups/mariadb/202101291600/ \
--user=mariadb-backup \
--password=mbu_passwd
$ sudo mariadb-backup --prepare \
--target-dir=/backups/mariadb/202101291600/
UNLOCK TABLES;
$ sudo systemctl stop mariadb-columnstore-cmapi
$ sudo systemctl stop mariadb-columnstore
$ sudo systemctl stop mariadb
$ sudo rsync -av /backups/columnstore/202101291600/storagemanager/ /var/lib/columnstore/storagemanager/
$ sudo chown -R mysql:mysql /var/lib/columnstore/storagemanager
$ sudo mariadb-backup --copy-back \
--target-dir=/backups/mariadb/202101291600/
$ sudo chown -R mysql:mysql /var/lib/mysql
[ObjectStorage]
…
service = S3
…
[S3]
bucket = your_columnstore_bucket_name
endpoint = your_s3_endpoint
aws_access_key_id = your_s3_access_key_id
aws_secret_access_key = your_s3_secret_key
# iam_role_name = your_iam_role
# sts_region = your_sts_region
# sts_endpoint = your_sts_endpoint
[Cache]
cache_size = your_local_cache_size
path = your_local_cache_path
$ sudo systemctl start mariadb
$ sudo systemctl start mariadb-columnstore-cmapi
MariaDB ColumnStore automatically creates logical horizontal partitions across every column. For ordered or semi-ordered data fields such as an order date, this results in a highly effective partitioning scheme based on that column. This allows for increased performance of queries filtering on that column, since partition elimination can be performed. It also allows for data lifecycle management, as data can be disabled or dropped by partition cheaply. Caution should be used when dropping partitions, as the command is destructive.
It is important to understand that a Partition in ColumnStore terms is actually 2 extents (16 million rows) and that extents & partitions are created according to the following algorithm in 1.0.x:
Create 4 extents in 4 files
When these are filled up (after 32M rows), create 4 more extents in the 4 files created in step 1.
When these are filled up (after 64M rows), create a new partition.
Information about all partitions for a given column can be retrieved using the calShowPartitions stored procedure, which takes either two or three parameters: [database_name], table_name, and column_name. If two parameters are provided, the current database is assumed. For example:
select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)
The calDisablePartitions stored procedure allows for disabling of one or more partitions. A disabled partition still exists on the file system (and can be enabled again at a later time) but will not participate in any query, DML, or import activity. The procedure takes either two or three parameters: [database_name], table_name, and partition_numbers separated by commas. If two parameters are provided, the current database is assumed. For example:
select calDisablePartitions('orders','0.0.1');
+----------------------------------------+
| calDisablePartitions('orders','0.0.1') |
+----------------------------------------+
| Partitions are disabled successfully. |
+----------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been disabled:
select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Disabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)
The calEnablePartitions stored procedure allows for enabling of one or more partitions. The procedure takes the same set of parameters as calDisablePartitions. For example:
select calEnablePartitions('orders', '0.0.1');
+----------------------------------------+
| calEnablePartitions('orders', '0.0.1') |
+----------------------------------------+
| Partitions are enabled successfully. |
+----------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been enabled:
select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)
The calDropPartitions stored procedure allows for dropping of one or more partitions. Dropping means that the underlying storage is deleted and the partition is completely removed. A partition can be dropped from either enabled or disabled state. The procedure takes the same set of parameters as calDisablePartitions. Extra caution should be used with this procedure since it is destructive and cannot be reversed. For example:
select calDropPartitions('orders', '0.0.1');
+--------------------------------------+
| calDropPartitions('orders', '0.0.1') |
+--------------------------------------+
| Partitions are enabled successfully |
+--------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been dropped:
select calShowPartitions('orders','orderdate');
+-----------------------------------------+
| calShowPartitions('orders','orderdate') |
+-----------------------------------------+
| Part# Min Max Status
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+-----------------------------------------+
1 row in set (0.05 sec)
Information about a range of partitions for a given column can be retrieved using the calShowPartitionsByValue stored procedure. This procedure takes either four or five parameters: [database_name], table_name, column_name, start_value, and end_value. If four parameters are provided, the current database is assumed. Only casual partition column types (integer types, decimal, date, datetime, char up to 8 bytes, and varchar up to 7 bytes) are supported for this function.
The function returns a list of partitions whose minimum and maximum values for the column 'col_name' fall completely within the range of 'start_value' and 'end_value'. For example:
select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24') |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)
The calDisablePartitionsByValue stored procedure allows for disabling of one or more partitions by value. A disabled partition still exists on the file system (and can be enabled again at a later time) but will not participate in any query, DML, or import activity. The procedure takes the same set of arguments as calShowPartitionsByValue. A good practice is to use calShowPartitionsByValue to identify the partitions to be disabled and then use the same argument values to construct the calDisablePartitionsByValue call. For example:
select calDisablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+---------------------------------------------------------------------------------+
| calDisablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02')   |
+---------------------------------------------------------------------------------+
| Partitions are disabled successfully |
+---------------------------------------------------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been disabled:
select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24')  |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Disabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)
The calEnablePartitionsByValue stored procedure allows for enabling of one or more partitions by value. The procedure takes the same set of arguments as calShowPartitionsByValue. A good practice is to use calShowPartitionsByValue to identify the partitions to be enabled and then use the same argument values to construct the calEnablePartitionsByValue call. For example:
select calEnablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+--------------------------------------------------------------------------------+
| calEnablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02')   |
+--------------------------------------------------------------------------------+
| Partitions are enabled successfully |
+--------------------------------------------------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been enabled:
select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24')  |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.0.1 1992-01-01 1998-08-02 Enabled
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)
The calDropPartitionsByValue stored procedure allows for dropping of one or more partitions by value. Dropping means that the underlying storage is deleted and the partition is completely removed. A partition can be dropped from either enabled or disabled state. The procedure takes the same set of arguments as calShowPartitionsByValue. A good practice is to use calShowPartitionsByValue to identify the partitions to be dropped and then use the same argument values to construct the calDropPartitionsByValue call. Extra caution should be used with this procedure since it is destructive and cannot be reversed. For example:
select calDropPartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
+------------------------------------------------------------------------------+
| calDropPartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02')   |
+------------------------------------------------------------------------------+
| Partitions are enabled successfully. |
+------------------------------------------------------------------------------+
1 row in set (0.28 sec)
The result showing the first partition has been dropped:
select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
+----------------------------------------------------------------------------+
| calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24')  |
+----------------------------------------------------------------------------+
| Part# Min Max Status
0.1.2 1998-08-03 2004-05-15 Enabled
0.2.3 2004-05-16 2010-07-24 Enabled |
+----------------------------------------------------------------------------+
1 row in set (0.05 sec)
Since the partitioning scheme is system maintained the min and max values are not directly specified but influenced by the order of data loading. If the goal is to drop a specific date range then additional deletes are required to achieve this. The following cases may occur:
For semi-ordered data, there may be overlap between min and max values between partitions.
As in the example above, the partition ranged from 1992-01-01 to 1998-08-02. Potentially it may be desirable to drop the remaining 1998 rows.
A bulk delete statement can be used to delete the remaining rows that do not fall exactly within partition ranges. The partition drops will be fastest; however, the system optimizes bulk delete statements to delete by block internally, so they are still relatively fast.
delete from orders where orderdate <= '1998-12-31';
This page is licensed: CC BY-SA / Gnu FDL
When new PM nodes are added to a running instance it may be desirable to redistribute the data in current nodes across all of the nodes. This is not strictly required as ongoing data ingestion will prioritize the new empty nodes for data loading to rebalance the system.
An important point is that the operation works at Partition granularity, so a minimal data set is 64M rows in a table for this to run.
A command redistributeData is available in the admin console to initiate a data distribution:
mcsadmin> redistributeData start
redistributedata Tue Dec 13 04:42:31 2016
redistributeData START
Source dbroots: 1 2
Destination dbroots: 1 2
WriteEngineServer returned status 1: Cleared.
WriteEngineServer returned status 2: Redistribute is started.
The command has 3 possible options:
Start: start a new redistribution to redistribute data equally amongst the current set of DBRoots in the system.
Stop: abort a redistribution, leaving the system in a usable state.
Status: return status information on an active redistribution.
The start command can take an option of Remove. The Start Remove option is used to remove all data from the enumerated list of dbroots and redistribute the data to the remaining dbroots. This should be done before taking a dbroot out of service in order to preserve the data on that dbroot. The dbroot list is a space delimited list of integers representing the dbroots to be emptied.
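For example, a sketch of emptying dbroot 3 ahead of decommissioning it (the dbroot list is illustrative):

mcsadmin> redistributeData start remove 3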
As mentioned above, the operation works at partition granularity, which means that a minimal move is 64 million rows. Any table smaller than that will not be redistributed and there may be as much as one full Partition difference in the resulting balance. The redistribute logic does not currently consolidate individually deleted records.
Redistribute can take a long time. During this time, it is required that all data manipulation, including bulk inserts, be suspended. SuspendDatabaseWrites must be called before redistributeData, and ResumeDatabaseWrites should be called when the redistribution is complete.
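A typical end-to-end sequence is sketched below; the trailing y arguments, which answer the confirmation prompts, are an assumption about your console configuration:

mcsadmin> suspendDatabaseWrites y
mcsadmin> redistributeData start
mcsadmin> redistributeData status -r
mcsadmin> resumeDatabaseWrites y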
If "redistributeData stop" is called, all processing stops where it's at, but in a usable state. "redistributeData status" can be used to see how much has been done. A further "redistributeData start" will start over using the new state of the system. This may lead to a less optimal distribution, so stop-start sequences aren't recommended.
While the system is working, "redistributeData status" can be called to see what is happening. A -r option can be used on the status command line to repeat the call and act as a monitor.
To see how much data resides on any given DBRoot for a table, you can use a query like:
select count(*) from <table> where idbdbroot(<any column>)=<dbrootnum>;
This page is licensed: CC BY-SA / Gnu FDL