Only this pageAll pages
Powered by GitBook
Couldn't generate the PDF for 152 pages, generation stopped at 100.
Extend with 50 more pages.
1 of 100

Analytics

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Loading...

Quickstart Guides

MariaDB ColumnStore Quickstart Guides provide concise, Docker-friendly steps to quickly set up, configure, and explore the ColumnStore analytic engine.

Analytics

MariaDB Enterprise offers powerful solutions to break down the barriers to insight. Whether you need to run ad hoc queries on massive datasets or power the most demanding AI workloads.

MariaDB ColumnStore

For fast, ad hoc analytics at scale, MariaDB ColumnStore is a powerful columnar database that can be deployed as a standalone analytics solution or integrated with MariaDB Enterprise Server to act as a powerful query accelerator. It stores data in a columnar format and can be distributed across a cluster of servers, allowing it to execute complex queries in parallel on petabytes of data.

This integration allows you to access your InnoDB data in near-real time, processing it directly in the ColumnStore engine to run fast, parallel OLAP queries straight from your transactional data. This eliminates the need to maintain a separate pipeline or use delayed batch inserts to analyze your live data.

MariaDB ColumnStore

MariaDB Exa

For the ultimate in analytical performance, the joint solution between MariaDB and Exasol connects your mission-critical transactional data to the world’s fastest analytics engine. Available on-premise or in the cloud on platforms like AWS and Microsoft Azure, this solution brings high-speed analytics to any environment.

MariaDB Exa erases the barrier between live operational data and high-speed analytics, leveraging Exasol’s massively parallel processing (MPP) and in-memory engine. It is the ideal solution for powering your most demanding analytics and AI/ML workloads with unmatched speed and efficiency.

MariaDB ColumnStore

Discover MariaDB ColumnStore, the powerful columnar storage engine for analytical workloads. Learn about its architecture, features, and how it enables high-performance data warehousing and analytics.

Quickstart GuidesColumnStore ArchitectureManaging ColumnStoreSecurityUse CasesHigh AvailabilityClients & ToolsTutorialsReference
MariaDB Exa

MariaDB ColumnStore Hardware Guide

Quickstart guide for MariaDB ColumnStore hardware requirements

Overview

MariaDB ColumnStore is designed for analytical workloads and scales linearly with hardware resources. While the performance generally improves with more CPU cores, memory, and servers, understanding the minimum hardware specifications is crucial for successful deployment, especially in development and production environments.

MariaDB ColumnStore's performance directly benefits from additional hardware resources. More CPU cores enable greater parallel processing, increased memory allows for more data caching (reducing I/O), and more servers enable a larger distributed architecture.

Minimum Hardware Recommendations

The specifications differentiate between a basic development environment and a production-ready setup:

1. For Development Environments:

  • CPU: A minimum of 8 CPU cores.

  • Memory (RAM): A minimum of 32 GB.

  • Storage: Local disk storage is acceptable for development purposes.

2. For Production Environments:

  • CPU: A minimum of 64 CPU cores.

    • Note: This recommendation underscores the highly parallel nature of ColumnStore, which can effectively utilize a large number of cores for analytical processing.

  • Memory (RAM): A minimum of 128 GB.

Network Interconnectivity (for Multi-Server Deployments)

  • Minimum Network: For multi-server ColumnStore deployments, a minimum of a 1 Gigabit (1G) network is recommended.

    • Note: This facilitates efficient data transfer between nodes via TCP/IP for replication and query processing across the distributed architecture. For optimal performance in heavy-load scenarios, higher bandwidth (e.g., 10G or more) is highly beneficial.

Adhering to these minimum specifications will provide a baseline for ColumnStore functionality. For specific workload requirements, it's always advisable to conduct performance testing and scale hardware accordingly.

See Also

Deployment

Installing ColumnStore

This section provides instructions for installing and configuring MariaDB ColumnStore. It covers various deployment scenarios, including single- and multi-node setups with both local and S3 storage.

ColumnStore Architecture

MariaDB ColumnStore uses a shared-nothing, distributed architecture with separate modules for SQL and storage, enabling scalable, high-performance analytics.

Managing ColumnStore

Managing MariaDB ColumnStore involves setup, configuration, and tools like mcsadmin and cpimport for efficient analytics.

Use Cases

MariaDB ColumnStore is ideal for real-time analytics and complex queries on large datasets across industries.

Security

MariaDB ColumnStore uses MariaDB Server’s security—encryption, access control, auditing, and firewall—for secure analytics.

Upgrading ColumnStore

High Availability

MariaDB ColumnStore ensures high availability with multi-node setups and shared storage, while MaxScale adds monitoring and failover for continuous analytics.

Clients & Tools

MariaDB ColumnStore supports standard MariaDB tools, BI connectors (e.g., Tableau, Power BI), data ingestion (cpimport, Kafka), and REST APIs for admin.

Query Plans and Optimizer Trace

MariaDB ColumnStore's query plans and Optimizer Trace show how analytical queries run in parallel across its distributed, columnar architecture, aiding performance tuning.

Backup & Restore

MariaDB ColumnStore backup and restore manage distributed data using snapshots or tools like mariadb-backup, with restoration ensuring cluster sync via cpimport or file system recovery.

Query Tuning

MariaDB ColumnStore query tuning optimizes analytics using data types, joins, projection elimination, WHERE clauses, and EXPLAIN for performance insights.

Note: Adequate memory is critical for caching data and intermediate results, directly impacting query performance.

  • Storage: StorageManager (S3) is recommended.

    • Note: This implies leveraging cloud-object storage (like AWS S3 or compatible services) for scalable and durable data persistence in production.

  • MariaDB ColumnStore Minimum Hardware Specification Documentation
    MariaDB ColumnStore Overview
    MariaDB documentation: MariaDB ColumnStore

    MariaDB ColumnStore Guide

    Quickstart guide for MariaDB ColumnStore

    Quickstart Guide: MariaDB ColumnStore

    MariaDB ColumnStore is a specialized columnar storage engine designed for high-performance analytical processing and big data workloads. Unlike traditional row-based storage engines, ColumnStore organizes data by columns, which is highly efficient for analytical queries that often access only a subset of columns across vast datasets.

    What is MariaDB ColumnStore?

    MariaDB ColumnStore is a columnar storage engine that integrates with MariaDB Server. It employs a massively parallel distributed data architecture, making it ideal for processing petabytes of data with linear scalability. It was originally ported from InfiniDB and is released under the GPL license.

    Key Benefits

    • Exceptional Analytical Performance: Delivers superior performance for complex analytical queries (OLAP) due to its columnar nature, which minimizes disk I/O by reading only necessary columns.

    • High Data Compression: Columnar storage allows for much higher compression ratios compared to row-based storage, reducing disk space usage and improving query speed.

    • Massive Scalability: Designed to scale horizontally across multiple nodes, processing petabytes of data with ease.

    Architecture Concepts (Simplified)

    MariaDB ColumnStore utilizes a distributed architecture with different components working together:

    • User Module (UM): Handles incoming SQL queries, optimizes them for columnar processing, and distributes tasks.

    • Performance Module (PM): Manages data storage, compression, and execution of query fragments on the data segments.

    • Data Files: Data is stored in column-segments across the nodes, highly compressed.

    Installation Overview

    MariaDB ColumnStore is installed as a separate package that integrates with MariaDB Server. The exact installation steps vary depending on your operating system and desired deployment type (single server or distributed cluster).

    General Steps (conceptual):

    1. Install MariaDB Server: Ensure you have a compatible MariaDB Server version installed (e.g., MariaDB 10.5.4 or later).

    2. Install ColumnStore Package: Download and install the specific MariaDB ColumnStore package for your OS. This package includes the ColumnStore storage engine and its associated tools.

      • Linux (e.g., Debian/Ubuntu): You would typically add the MariaDB repository configured for ColumnStore and then install mariadb-plugin-columnstore.

    Basic Usage

    Once MariaDB ColumnStore is installed and configured, you can create and interact with ColumnStore tables using standard SQL.

    Creating a ColumnStore Table

    Specify ENGINE=ColumnStore when creating your table. Note that ColumnStore tables do not support primary keys in the same way as InnoDB, as their primary focus is analytical processing.

    Inserting Data

    You can insert data using standard INSERT statements. For large datasets, bulk loading utilities (for instance, LOAD DATA INFILE) are highly recommended for performance.

    Querying Data

    Perform analytical queries. ColumnStore will efficiently process these, often leveraging its columnar nature and parallelism.

    See Also

    ColumnStore System Databases

    When using ColumnStore, MariaDB Server creates a series of system databases used for operational purposes.

    Database
    Description

    ColumnStore Query Processing

    Clients issue a query to the MariaDB Server, which has the ColumnStore storage engine installed. MariaDB Server parses the SQL, identifies the involved ColumnStore tables, and creates an initial logical query execution plan.

    Using the ColumnStore storage engine interface (ha_columnstore), MariaDB Server converts involved table references into ColumnStore internal objects. These are then handed off to the ExeMgr, which is responsible for managing and orchestrating query execution across the cluster.

    The ExeMgr analyzes the query plan and translates it into a distributed ColumnStore execution plan. It determines the necessary query steps and the execution order, including any required parallelization.

    The ExeMgr then references the extent map to identify which PrimProc instances hold the relevant data segments. It applies extent elimination to exclude any PrimProc nodes whose extents do not match the query’s filter criteria.

    The ExeMgr

    ColumnStore Table Size Limitations

    MariaDB ColumnStore has a hard limit of 4096 columns per table.

    However, it's likely that you run into other limitations before hitting that limit, including:

    • Row size limit of tables. This varies, depending on the storage engine you're using. For example, which indirectly limits the number of columns.

    • Size limit of .frm files. Those files hold the column description of tables. Column descriptions vary in length. Once all column descriptions combined reach a length of 64KB, the table's .frm file is full, limiting the number of columns you can have in a table.

    Node Maintenance for MariaDB Enterprise Columnstore

    Managing ColumnStore Database Environment

    Managing MariaDB ColumnStore means deploying its architecture, scaling modules, and maintaining performance through monitoring, optimization, and backups.

    StorageManager

    The ColumnStore StorageManager manages columnar data storage and retrieval, optimizing analytical queries.

    Just-in-Time Projection: Only the required columns are processed and returned, further optimizing query execution.
  • Real-time Analytics: Capable of handling real-time analytical queries efficiently.

  • Single Server vs. Distributed: For a single-server setup, you install all ColumnStore components on one machine. For a distributed setup, you install and configure components across multiple machines.

  • Configure MariaDB: After installation, you might need to adjust your MariaDB server configuration (my.cnf or equivalent) to properly load and manage the ColumnStore engine.

  • Initialize ColumnStore: Run a specific columnstore-setup or post-install script to initialize the ColumnStore environment.

  • MariaDB ColumnStore Overview
    DigitalOcean: How to Install MariaDB ColumnStore on Ubuntu 20.04
    dispatches commands to the selected PrimProc instances to perform data block I/O operations.

    The PrimProc components perform operations such as

    • Predicate filtering

    • Join processing

    • Initial aggregation

    • Data retrieval from local disk or external storage (e.g., S3 or cloud object storage)

    They then return intermediate result sets to the ExeMgr.

    The ExeMgr handles:

    • Final-stage aggregation

    • Window function evaluation

    • Result-set sorting and shaping

    The completed result set is returned to the MariaDB Server, which performs any remaining SQL operations like ORDER BY, LIMIT, or computed expressions in the SELECT list.

    Finally, the MariaDB Server returns the result set to the client.

    Given that, the maximum number of columns a ColumnStore table can effectively have is around 2000 columns.

    calpontsys

    Database maintains table metadata about ColumnStore tables.

    infinidb_querystats

    Database maintains information about query performance. For more information, see Query Analysis.

    columnstore_info

    The database for stored procedures is used to retrieve information about ColumnStore usage. For more information, see the ColumnStore Information Schema tables.

    Step 5: Bulk Import of Data

    Step 5: Bulk Import of Data

    Overview

    This page details step 5 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Local storage.

    This step bulk imports data to Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Import the Schema

    Before data can be imported into the tables, create a matching schema.

    On the primary server, create the schema:

    1. For each database that you are importing, create the database with the statement:

    1. For each table that you are importing, create the table with the statement:

    Import the Data

    Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.

    cpimport

    MariaDB Enterprise ColumnStore includes , which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, on the primary server run :

    LOAD DATA INFILE

    When data is loaded with the statement, MariaDB Enterprise ColumnStore loads the data using , which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, on the primary server use statement:

    Import from Remote Database

    MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using the statement, and then pipe the results into , which is a command-line utility that is designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a remote MariaDB database:

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    This page was step 5 of 5.

    This procedure is complete.

    Step 6: Install MariaDB MaxScale

    Step 6: Install MariaDB MaxScale

    Overview

    This page details step 6 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step installs MariaDB MaxScale 22.08.

    ColumnStore Object Storage requires 1 or more MaxScale nodes.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Retrieve Customer Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

    Set Up Repository

    1. On the MaxScale node, install the prerequisites for downloading the software from the Web. Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    1. On the MaxScale node, configure package repositories and specify MariaDB MaxScale 22.08:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

    Install MaxScale

    On the MaxScale node, install MariaDB MaxScale.

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 6 of 9.

    .

    Backup and Restore Overview

    Overview

    MariaDB Enterprise ColumnStore supports backup and restore.

    System of Record

    Before you determine a backup strategy for your Enterprise ColumnStore deployment, it is a good idea to determine the system of record for your Enterprise ColumnStore data.

    A system of record is the authoritative data source for a given piece of information. Organizations often store duplicate information in several systems, but only a single system can be the authoritative data source.

    Enterprise ColumnStore is designed to handle analytical processing for OLAP, data warehousing, DSS, and hybrid workloads on very large data sets. Analytical processing does not generally happen on the system of record. Instead, analytical processing generally occurs on a specialized database that is loaded with data from the separate system of record. Additionally, very large data sets can be difficult to back up. Therefore, it may be beneficial to only backup the system of record.

    If Enterprise ColumnStore is not acting as the system of record for your data, you should determine how the system of record affects your backup plan:

    • If your system of record is another database server, you should ensure that the other database server is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the other database server.

    • If your system of record is a set of data files, you should ensure that the set of data files is properly backed up and that your organization has procedures to reload Enterprise ColumnStore from the set of data files.

    Full Backup and Restore

    MariaDB Enterprise ColumnStore supports full backup and restore for all storage types. A full backup includes:

    • Enterprise ColumnStore's data and metadata

    With S3: an S3 snapshot of the and a file system snapshot or copy of the Without S3: a file system snapshot or copy of the .

    • The MariaDB data directory from the primary node

    To see the procedure to perform a full backup and restore, choose the storage type:

    Storage Type
    Diagram

    Step 5: Bulk Import of Data

    Step 5: Bulk Import of Data

    Overview

    This page details step 5 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.

    This step bulk imports data to Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Import the Schema

    Before data can be imported into the tables, create a matching schema.

    On the primary server, create the schema:

    1. For each database that you are importing, create the database with the statement:

    1. For each table that you are importing, create the table with the statement:

    Import the Data

    Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.

    cpimport

    MariaDB Enterprise ColumnStore includes , which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, on the primary server run :

    LOAD DATA INFILE

    When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using , which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, on the primary server use LOAD DATA INFILE statement:

    Import from Remote Database

    MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using the statement, and then pipe the results into , which is a command-line utility that is designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a remote MariaDB database:

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

    This page was step 5 of 5.

    This procedure is complete.

    Switchover of the Primary Node

    To switchover to a new primary node with Enterprise ColumnStore, perform the following procedure.

    Performing Switchover in MaxScale

    The primary node can be switched in MaxScale using :

    • Use or another supported REST client.

    • Call a module command using the call command command.

    • As the first argument, provide the name for the module, which is .

    • As the second argument, provide the module command, which is switchover .

    • As the third argument, provide the name of the monitor.

    For example:

    With the above syntax, MaxScale will choose the most up-to-date replica to be the new primary.

    If you want to manually select a new primary, provide the server name of the new primary as the fourth argument:

    Checking the Replication Status with MaxScale

    MaxScale is capable of checking the status of using :

    • List the servers using the list servers command, like this:

    If switchover was properly performed, the State column of the new primary shows Master, Running.

    ColumnStore Security Vulnerabilities

    1. About CVEs "About CVEs"

    2. CVEs fixed in ColumnStore "CVEs fixed in ColumnStore"

    This page is about security vulnerabilities that have been fixed for or still affect MariaDB ColumnStore. In addition, links are included to fixed security vulnerabilities in MariaDB Server since MariaDB ColumnStore is based on MariaDB Server.

    Sensitive security issues can be sent directly to the persons responsible for MariaDB security: security [AT] mariadb (dot) org.

    About CVEs

    CVE® stands for "Common Vulnerabilities and Exposures". It is a publicly available and free-to-use database of known software vulnerabilities maintained at

    CVEs fixed in ColumnStore

    The appropriate release notes listed document CVEs fixed within a given release. Additional information can also be found at .

    There are no known CVEs on ColumnStore-specific infrastructure outside of the MariaDB server at this time.

    Credentials Management

    Overview

    Starting with MariaDB Enterprise ColumnStore 6.2.3, ColumnStore supports encryption for user passwords stored in Columnstore.xml:

    • Encryption keys are created with the cskeys utility

    • Passwords are encrypted using the cspasswd utility

    Compatibility

    • MariaDB Enterprise ColumnStore 6

    • MariaDB Enterprise ColumnStore 22.08

    • MariaDB Enterprise ColumnStore 23.02

    Encryption Keys

    MariaDB Enterprise ColumnStore stores its password encryption keys in the plain-text file /var/lib/columnstore/.secrets.

    The encryption keys are not created by default, but can be generated by executing the cskeys utility:

    In a multi-node Enterprise ColumnStore cluster, every ColumnStore node should have the same encryption keys. Therefore, it is recommended to execute cskeys on the primary server and then copy /var/lib/columnstore/.secrets to every other ColumnStore node and fix the file's permissions:

    Encrypt a Password

    To encrypt a password:

    Generate an encrypted password using the cspasswd utility:

    • If the --interactive command-line option is specified, cspasswd prompts for the password.

    Set the encrypted password in Columnstore.xml using the mcsSetConfig utility:

    Decrypt a Password

    To decrypt a password, execute the cspasswd utility and specify the --decrypt command-line option:

    Data Ingestion Methods & Tools

    Learn about data ingestion for MariaDB ColumnStore. This section covers various methods and tools for efficiently loading large datasets into your columnar database for analytical workloads.

    ColumnStore provides several mechanisms to ingest data:

    • cpimport provides the fastest performance for inserting data and ability to route data to particular PrimProc nodes. Normally, this should be the default choice for loading data .

    • LOAD DATA INFILE provides another means of bulk inserting data.

      • By default, with autocommit on, it internally streams the data to an instance of the cpimport process.

      • In transactional mode, DML inserts are performed, which is significantly slower and also consumes both binlog transaction files and ColumnStore VersionBuffer files.

    • DML, i.e. INSERT, UPDATE, and DELETE, provide row-level changes. ColumnStore is optimized towards bulk modifications, so these operations are slower than they would be in, for instance, InnoDB.

      • Currently ColumnStore does not support operating as a replication replica target.

      • Bulk DML operations will in general perform better than multiple individual statements.

    • Using ColumnStore Bulk Write SDK or .

    Certified S3 Object Storage Providers

    Hardware (On Premises)

    • Quantum ActiveScale

    • IBM Cloud Object Storage (Formerly known as CleverSafe)

    Cloud (IaaS)

    Software-Based

    Due to the frequent code changes and deviation from the AWS standards, none are approved at this time.

    Execution Plan (CSEP)

    Overview

    The ColumnStore storage engine uses a ColumnStore Execution Plan (CSEP) to represent a query plan internally.

    When the select handler receives the SELECT_LEX object, it transforms it into a CSEP as part of the query planning and optimization process. For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation."

    Viewing the CSEP

    The CSEP for a given query can be viewed by performing the following:

    1. Calling the calSetTrace(1) function:

    1. Executing the query:

    1. Calling the calGetTrace() function:

    Sample storagemanager.cnf

    # Sample storagemanager.cnf
    
    [ObjectStorage]
    service = S3
    object_size = 5M
    metadata_path = /var/lib/columnstore/storagemanager/metadata
    journal_path = /var/lib/columnstore/storagemanager/journal
    max_concurrent_downloads = 21
    max_concurrent_uploads = 21
    common_prefix_depth = 3
    
    [S3]
    region = us-west-1
    bucket = my_columnstore_bucket
    endpoint = s3.amazonaws.com
    aws_access_key_id = AKIAR6P77BUKULIDIL55
    aws_secret_access_key = F38aR4eLrgNSWPAKFDJLDAcax0gZ3kYblU79
    
    [LocalStorage]
    path = /var/lib/columnstore/storagemanager/fake-cloud
    fake_latency = n
    max_latency = 50000
    
    [Cache]
    cache_size = 2g
    path = /var/lib/columnstore/storagemanager/cache

    Note: A region is required even when using an on-premises solution like ActiveScale due to header expectations within the API.

    Collecting Statistics with ANALYZE TABLE

    Overview

    In MariaDB Enterprise ColumnStore 6, the ExeMgr process uses optimizer statistics in its query planning process.

    ColumnStore uses the optimizer statistics to add support for queries that contain circular inner joins.

    In Enterprise ColumnStore 5 and before, ColumnStore would raise the following error when a query containing a circular inner join was executed:

    ERROR 1815 (HY000): Internal error: IDB-1003: Circular joins are not supported.

    The optimizer statistics store each column's NDV (Number of Distinct Values), which can help the ExeMgr process choose the optimal join order for queries with circular joins. When Enterprise ColumnStore executes a query with a circular join, the query's execution can take longer if ColumnStore chooses a sub-optimal join order. When you collect optimizer statistics for your ColumnStore tables, the ExeMgr process is less likely to choose a sub-optimal join order.

    Enterprise ColumnStore's optimizer statistics can be collected for ColumnStore tables by executing :

    Enterprise ColumnStore's optimizer statistics are not updated automatically. To update the optimizer statistics for a ColumnStore table, must be re-executed.

    Enterprise ColumnStore does not implement an interface to show optimizer statistics.

    View and Clear Table Locks

    MariaDB Enterprise ColumnStore acquires table locks for some operations, and it provides utilities to view and clear those locks.

    MariaDB Enterprise ColumnStore acquires table locks for some operations, such as:

    • DDL statements

    • DML statements

    • Bulk data loads

    If an operation fails, the table lock does not always get released. If you try to access the table, you can see errors like the following:

    To solve this problem, MariaDB Enterprise ColumnStore provides two utilities to view and clear the table locks:

    • cleartablelock

    • viewtablelock

    Viewing Table Locks

    The viewtablelock utility shows table locks currently held by MariaDB Enterprise ColumnStore:

    To view all table locks:

    To view table locks for a specific table, specify the database and table:

    Clearing Table Locks

    The cleartablelock utility clears table locks currently held by MariaDB Enterprise ColumnStore.

    To clear a table lock, specify the lock ID shown by the viewtablelock utility:

    ColumnStore Minimum Hardware Specification

    The following table outlines the minimum recommended production server specifications which can be followed for both on premise and cloud deployments:

    Per Server

    Item
    Development Environment
    Production Environment

    Step 9: Import Data

    Step 9: Import Data

    Overview

    This page details step 9 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step bulk imports data to Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 2: Install Enterprise ColumnStore

    Step 2: Install Enterprise ColumnStore

    Overview

    This page details step 2 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.

    This step installs MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Major Release Upgrades for MariaDB Enterprise ColumnStore

    This page provides a major release upgrade procedure for MariaDB Enterprise ColumnStore. A major release upgrade is an upgrade from an older major release to a newer major release, such as an upgrade from MariaDB Enterprise ColumnStore 5 to MariaDB Enterprise ColumnStore 22.08.

    Compatibility

    • Enterprise ColumnStore 5

    MariaDB Enterprise Columnstore Locking

    Overview

    MariaDB Enterprise ColumnStore minimizes locking for analytical workloads, bulk data loads, and online schema changes.

    Lockless Reads

    MariaDB Enterprise ColumnStore supports lockless reads.

    Rejoining a Node

    To rejoin a node with Enterprise ColumnStore, perform the following procedure.

    Performing Rejoin in MaxScale

    The node can be configured to rejoin in MaxScale using :

    • Use or another supported REST client.

    Step 6: Install MariaDB MaxScale

    Step 6: Install MariaDB MaxScale

    Overview

    This page details step 6 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step installs MariaDB MaxScale 22.08. ColumnStore Object Storage requires 1 or more MaxScale nodes.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Using StorageManager With IAM Role

    AWS IAM Role Configuration

    From Columnstore 5.5.2, you can use AWS IAM roles in order to connect to S3 buckets without explicitly entering credentials into the storagemanager.cnf config file.

    You need to modify the IAM role of your Amazon EC2 instance to allow for this. Please follow the AWS before beginning this process.

    It is important to note that you must update the AWS S3 endpoint based on your chosen region; otherwise, you might face delays in propagation as discussed and .

    For a complete list of AWS service endpoints, visit the AWS

    CREATE TABLE sales_data (
        sale_id INT,
        product_name VARCHAR(255),
        category VARCHAR(100),
        sale_date DATE,
        quantity INT,
        price DECIMAL(10, 2)
    ) ENGINE=ColumnStore;
    INSERT INTO sales_data (sale_id, product_name, category, sale_date, quantity, price) VALUES
    (1, 'Laptop', 'Electronics', '2023-01-15', 1, 1200.00),
    (2, 'Mouse', 'Electronics', '2023-01-15', 2, 25.00),
    (3, 'Keyboard', 'Electronics', '2023-01-16', 1, 75.00);
    -- Get total sales per category
    SELECT category, SUM(quantity * price) AS total_sales
    FROM sales_data
    WHERE sale_date BETWEEN '2023-01-01' AND '2023-01-31'
    GROUP BY category
    ORDER BY total_sales DESC;
    
    -- Count distinct products
    SELECT COUNT(DISTINCT product_name) FROM sales_data;

    INSERT INTO SELECT with autocommit behaves similarly to LOAD DATE INFILE because, internally, it is mapped to cpimport for higher performance.

  • Bulk update operations based on a join with a small staging table can be relatively fast, especially if updating a single column.

  • ColumnStore Streaming Data Adapters

    64 Core CPU, 128 GB Memory

    Storage

    Local disk

    StorageManager (S3)

    Network

    Network Interconnect

    In a multi server deployment data will be passed around via TCP/IP networking. At least a 1G network is recommended.

    Details

    These are minimum recommendations and in general the system will perform better with more hardware:

    • More CPU cores and servers will improve query processing response time.

    • More memory will allow the system to cache more data blocks in memory. We have users running system with anywhere from 64G RAM to 2T RAM.

    • Faster network will allow data to flow faster between PrimProc nodes.

    • SSD's may be used, however the system is optimized towards block streaming which may perform well enough with HDD's for lower cost.

    • Where it is an option, it is recommended to use bare metal servers for additional performance since ColumnStore will fully consume CPU cores and memory.

    • In general it makes more sense to use a higher core count / higher memory server for single server or 2 server combined deployments.

    AWS Instance Sizes

    For AWS, our own internal testing generally uses m4.4xlarge instance types as a cost-effective middle ground. The R4.8xlarge has also been tested and performs about twice as fast for about twice the price.

    Physical Server

    8 Core CPU, 32 GB Memory

    Import the Schema

    Before data can be imported into the tables, create a matching schema.

    On the primary server, create the schema:

    1. For each database that you are importing, create the database with the CREATE DATABASE statement:

    1. For each table that you are importing, create the table with the CREATE TABLE statement:

    Import the Data

    Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.

    Interface
    Method
    Benefits

    Shell

    • SQL access is not required

    SQL

    • Shell access is not required

    Remote Database

    • Use normal database client

    • Avoid dumping data to intermediate filed

    cpimport

    MariaDB Enterprise ColumnStore includes cpimport, which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, on the primary server run cpimport:

    LOAD DATA INFILE

    When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using cpimport, which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, on the primary server use LOAD DATA INFILE statement:

    Import from Remote Database

    MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using the SELECT statement, and then pipe the results into cpimport, which is a command-line utility that is designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a remote MariaDB database:

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 9 of 9.

    This procedure is complete.

    Retrieve Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to https://customers.mariadb.com/downloads/token/

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

    Set Up Repository

    1. On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web. Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    1. On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

    Install Enterprise ColumnStore

    Install additional dependencies:

    Install on CentOS / RHEL (YUM)

    Install of Debian 10 and Ubuntu 20.04 (APT):

    Install on Debian 9 and Ubuntu 18.04 (APT):

    Install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

    This page was step 2 of 5.

    Next: Step 3: Start and Configure MariaDB Enterprise ColumnStore.

    Enterprise ColumnStore 6
  • Enterprise ColumnStore 22.08

  • Prerequisites

    This procedure assumes that the new Enterprise ColumnStore version will be installed onto new servers.

    To reuse existing servers for the new Enterprise ColumnStore version, you must adapt the procedure detailed below. After step 1, confirm all data has been backed-up and verify backups. The old version of Enterprise ColumnStore should then be uninstalled, and all Enterprise ColumnStore files should be deleted before continuing with step 2.

    Step 1: Backup/Export Schemas and Data

    On the old ColumnStore cluster, perform a full backup.

    MariaDB recommends backing up the table schemas to a single SQL file and backing up the table data to table-specific CSV files.

    1. For each table, obtain the table's schema by executing the SHOW CREATE TABLE :

      Backup the table schemas by copying the output to an SQL file. This procedure assumes that the SQL file is named schema-backup.sql.

    2. For each table, backup the table data to a CSV file using the SELECT .. INTO OUTFILE :

    3. Copy the SQL file containing the table schemas and the CSV files containing the table data to the primary node of the new ColumnStore cluster.

    Step 2: Install New Major Release

    On the new ColumnStore cluster, follow the deployment instructions of the desired topology for the new ColumnStore version.

    For deployment instructions, see "".

    Step 3: Restore/Import Data

    On the new ColumnStore cluster, restore the table schemas and data.

    1. Restore the schema backup using :

      • HOST and PORT should refer to the following:

        • If you are connecting with MaxScale as a proxy, they should refer to the host and port of the MaxScale listener

        • If you are connecting directly to a multi-node ColumnStore cluster, they should refer to the host and port of the primary ColumnStore node

        • If you are connecting directly to single-node ColumnStore, they should refer to the host and port of the ColumnStore node

      • When the command is executed, mariadb client prompts for the user password

    2. For each table, restore the data from the table's CSV file by executing the on the primary ColumnStore node:

    Step 4: Test

    On the new ColumnStore cluster, verify that the table schemas and data have been restored.

    1. For each table, verify the table's definition by executing the SHOW CREATE TABLE statement:

    2. For each table, verify the number of rows in the table by executing SELECT COUNT(*):

    3. For each table, verify the data in the table executing the statement.

      If the table is very large, you can limit the number of rows in the result set by adding a LIMIT clause:

    Locking for Writes

    MariaDB Enterprise ColumnStore requires a table lock for write operations.

    Locking for Data Loading

    MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.

    When a bulk data load is running:

    • Read queries will not be blocked.

    • Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.

    • The write metadata lock (MDL) can be monitored with the .

    For additional information, see "MariaDB Enterprise ColumnStore Data Loading".

    Online Schema Changes

    MariaDB Enterprise ColumnStore supports online schema changes, so that supported DDL operations can be performed without blocking reads. The supported DDL operations only require a write metadata lock (MDL) on the target table.

    Call a module command using the call command command.

  • As the first argument, provide the name for the module, which is .

  • As the second argument, provide the module command, which is rejoin .

  • As the third argument, provide the name of the monitor.

  • As the fourth argument, provide the name of the server.

  • For example:

    Checking Replication Status with MaxScale

    MaxScale is capable of checking the status of using :

    • List the servers using the list servers command, like this:

    If the node properly rejoined, the State column of the node shows Slave, Running.

    maxctrl call command \
       mariadbmon \
       rejoin \
       mcs_monitor \
       mcs3
    Retrieve Customer Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

    Set Up Repository

    1. On the MaxScale node, install the prerequisites for downloading the software from the Web. Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    1. On the MaxScale node, configure package repositories and specify MariaDB MaxScale 22.08:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

    Install MaxScale

    On the MaxScale node, install MariaDB MaxScale.

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 6 of 9.

    Next: Step 7: Start and Configure MariaDB MaxScale.

    .

    Sample Configuration

    Edit your Storage Manager configuration file located at /etc/columnstore/storagemanager.cnf in order to look similar to the example below (replacing those in the [S3] section with your own custom variables):

    Note: This is an AWS only feature. For other deployment methods, see the example here.

    documentation
    here
    here
    reference guide
    cpimport
    cpimport
    cpimport
    cpimport
    https://customers.mariadb.com/downloads/token/
    Next: Step 7: Start and Configure MariaDB MaxScale

    Enterprise ColumnStore with Object Storage

    Enterprise ColumnStore with Shared Local Storage

    S3-compatible object storage
    Storage Manager directory
    DB Root directories
    cpimport
    cpimport
    cpimport
    cpimport
    maxctrl call command \
       mariadbmon \
       switchover \
       mcs_monitor
    SELECT calSetTrace(1);
    SELECT column1, column2
    FROM columnstore_tab
    WHERE column1 > '2020-04-01'
    AND column1 < '2020-11-01';
    [[analyze-table|ANALYZE TABLE]] columnstore_tab;
    ERROR 1815 (HY000): Internal error: CAL0009: Drop table failed due to IDB-2009: Unable to perform the drop table operation because cpimport with PID 16301 is currently holding the table lock for session -1.
    Security Vulnerabilities Fixed in MariaDB
    DELL EMC
    AWS S3
    Google GCS

    Step 3: Install MariaDB Enterprise Server

    Step 3: Install MariaDB Enterprise Server

    Overview

    This page details step 3 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step installs MariaDB Enterprise Server, MariaDB Enterprise ColumnStore 23.10, CMAPI, and dependencies.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Retrieve Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

    Set Up Repository

    1. On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web. Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    1. On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

    Install Enterprise Server and Enterprise ColumnStore

    1. On each Enterprise ColumnStore node, install additional dependencies:

    Install on CentOS and RHEL (YUM):

    Install on Debian 9 and Ubuntu 18.04 (APT)

    Install on Debian 10 and Ubuntu 20.04 (APT):

    1. On each Enterprise ColumnStore node, install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 3 of 9.

    Step 1: Prepare Systems for Enterprise ColumnStore Nodes

    Step 1: Prepare Systems for Enterprise ColumnStore Nodes

    Overview

    This page details step 1 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Local storage.

    This step prepares the system to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Optimize Linux Kernel Parameters

    MariaDB Enterprise ColumnStore performs best with Linux kernel optimizations.

    On each server to host an Enterprise ColumnStore node, optimize the kernel:

    1. Set the relevant kernel parameters in a sysctl configuration file. To ensure proper change management, use an Enterprise ColumnStore-specific configuration file.

    Create a /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf file:

    1. Use the sysctl command to set the kernel parameters at runtime

    Temporarily Configure Linux Security Modules (LSM)

    The Linux Security Modules (LSM) should be temporarily disabled on each Enterprise ColumnStore node during installation.

    The LSM will be configured and re-enabled later in this deployment procedure.

    The steps to disable the LSM depend on the specific LSM used by the operating system.

    CentOS / RHEL Stop SELinux

    SELinux must be set to permissive mode before installing MariaDB Enterprise ColumnStore.

    To set SELinux to permissive mode:

    1. Set SELinux to permissive mode:

    1. Set SELinux to permissive mode by setting SELINUX=permissive in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    1. Confirm that SELinux is in permissive mode:

    SELinux will be configured and re-enabled later in this deployment procedure. This configuration is not persistent. If you restart the server before configuring and re-enabling SELinux later in the deployment procedure, you must reset the enforcement to permissive mode.

    Debian / Ubuntu AppArmor

    AppArmor must be disabled before installing MariaDB Enterprise ColumnStore.

    1. Disable AppArmor:

    1. Reboot the system.

    2. Confirm that no AppArmor profiles are loaded using aa-status:

    AppArmor will be configured and re-enabled later in this deployment procedure.

    Configure Character Encoding

    When using MariaDB Enterprise ColumnStore, it is recommended to set the system's locale to UTF-8.

    1. On RHEL 8, install additional dependencies:

    1. Set the system's locale to en_US.UTF-8 by executing localedef:

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    This page was step 1 of 5.

    About MariaDB ColumnStore

    MariaDB ColumnStore is a columnar storage engine that utilizes a massively parallel distributed data architecture. It's a columnar storage system built by porting InfiniDB 4.6.7 to MariaDB and released under the GPL license.

    is available as a storage engine for MariaDB Server. Before then, it is available as a separate download.

    Release notes and other documentation for ColumnStore is also available in the Enterprise docs section of the MariaDB website. For example:

    It is designed for big data scaling to process petabytes of data, linear scalability, and exceptional performance with real-time response to analytical queries. It leverages the I/O benefits of columnar storage, compression, just-in-time projection, and horizontal and vertical partitioning to deliver tremendous performance when analyzing large data sets.

    Links:

    • .

    • A Google Group exists for MariaDB ColumnStore that can be used to discuss ideas and issues and communicate with the community: Send email to mariadb-columnstore@googlegroups.com or use the

    • Bugs can be reported in MariaDB Jira: (see ). Please file bugs under the MCOL project and include the output from the if possible.

    MariaDB ColumnStore is released under the GPL license.

    Setting a Node to Maintenance Mode

    To set a node to maintenance mode with Enterprise ColumnStore, perform the following procedure.

    Setting the Server State in MaxScale

    The server object for the node can be set to maintenance mode in MaxScale using :

    • Use or another supported REST client.

    • Set the server object to maintenance mode using the set server command.

    • As the first argument, provide the name for the server.

    • As the second argument, provide maintenance as the state.

    For example:

    If the specified server is a primary server, then MaxScale will allow open transactions to complete before closing any connections.

    If you would like MaxScale to immediately close all connections, the --force option can be provided as a third argument:

    Confirming Maintenance Mode is Set with MaxScale

    Confirm the state of the server object in MaxScale using :

    • List the servers using the list servers command, like this:

    If the node is properly in maintenance mode, then the State column will show Maintenance as one of the states.

    Performing Maintenance

    Now that the server is in maintenance mode in MaxScale, you can perform your maintenance.

    While the server is in maintenance mode:

    • MaxScale doesn't route traffic to the node.

    • MaxScale doesn't select the node to be primary during failover.

    • The node can be rebooted.

    • The node's services can be restarted.

    Clear the Server State in MaxScale

    Maintenance mode for the server object for the node can be cleared in MaxScale using :

    • Use or another supported REST client.

    • Clear the server object's state using the clear server command.

    • As the first argument, provide the name for the server.

    • As the second argument, provide maintenance

    For example:

    Confirming Maintenance Mode is Cleared with MaxScale

    Confirm the state of the server object in MaxScale using :

    • List the servers using the list servers command, like this:

    If the node is no longer in maintenance mode, the State column no longer shows Maintenance as one of the states.

    Step 9: Import Data

    Step 9: Import Data

    Overview

    This page details step 9 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step bulk imports data to Enterprise ColumnStore.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Import the Schema

    Before data can be imported into the tables, create a matching schema.

    On the primary server, create the schema:

    1. For each database that you are importing, create the database with the CREATE DATABASE statement:

    1. For each table that you are importing, create the table with the CREATE TABLE statement:

    Import the Data

    Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.

    Interface
    Method
    Benefits

    cpimport

    MariaDB Enterprise ColumnStore includes , which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, on the primary server run :

    LOAD DATA INFILE

    When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using , which is a command-line utility designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a TSV (tab-separated values) file, on the primary server use LOAD DATA INFILE statement:

    Import from Remote Database

    MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using the SELECT statement, and then pipe the results into , which is a command-line utility that is designed to efficiently load data in bulk. Alternative methods are available.

    To import your data from a remote MariaDB database:

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 9 of 9.

    This procedure is complete.

    Step 3: Install MariaDB Enterprise Server

    Step 3: Install MariaDB Enterprise Server

    Overview

    This page details step 3 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step installs MariaDB Enterprise Server, MariaDB Enterprise ColumnStore 23.10, CMAPI, and dependencies.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Retrieve Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

    Set Up Repository

    1. On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web. Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    1. On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

    Install Enterprise Server and Enterprise ColumnStore

    1. On each Enterprise ColumnStore node, install additional dependencies:

    Install on CentOS and RHEL (YUM):

    Install on Debian 9 and Ubuntu 18.04 (APT)

    Install on Debian 10 and Ubuntu 20.04 (APT):

    1. On each Enterprise ColumnStore node, install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology".

    This page was step 3 of 9.

    Extent Map Backup & Recovery

    Overview

    MariaDB ColumnStore utilizes an Extent Map to manage data distribution across extents—logical blocks within physical segment files ranging from 8 to 64 MB. Each extent holds a consistent number of rows, with the Extent Map cataloging these extents, their corresponding block identifiers (LBIDs), and the minimum and maximum values for each column's data within the extent.​

    The primary node maintains the master copy of the Extent Map. Upon system startup, this map is loaded into memory and propagated to other nodes for redundancy and quick access. Corruption of the master Extent Map can render the system unusable and lead to data loss.​

    Purpose

    ColumnStore's extent map is a smart structure that underpins its performance. By providing a logical partitioning scheme, it avoids the overhead associated with indexing and other common row-based database optimizations.

    The primary node in a ColumnStore cluster holds the master copy of the extent map. Upon system startup, this master copy is read into memory and then replicated to all other participating nodes for high availability and disaster recovery. Nodes keep the extent map in memory for rapid access during query processing. As data within extents is modified, these updates are broadcast to all participating nodes to maintain consistency.

    If the master copy of the extent map becomes corrupted, the entire system could become unusable, potentially leading to data loss. Having a recent backup of the extent map allows for a much faster recovery compared to reloading the entire database in such a scenario.

    Backup Procedure

    Note: MariaDB recommends implementing regular backups to ensure data integrity and recovery. A common default is to back up every 3 hours and retain backups for at least 10 days.

    To safeguard against potential Extent Map corruption, regularly back up the master copy:

    1. Lock Table:

    1. Save BRM:

    1. Create Backup Directory:

    1. Copy Extent Map:

    1. Unlock Tables:

    Recovery Procedures

    Single-Node System

    1. Stop ColumnStore:

    1. Rename Corrupted Map:

    1. Clear Versioning Files:

    1. Restore Backup:

    1. Set Ownership:

    1. Start ColumnStore:

    Clustered System

    1. Shutdown Cluster:

    1. Rename Corrupted Map:

    1. Clear Versioning Files:

    1. Restore Backup:

    1. Set Ownership:

    1. Start Cluster:

    Automation Recommendation

    Incorporate the save_brm command into your data import scripts (e.g., those using cpimport) to automate Extent Map backups. This practice ensures regular backups without manual intervention.

    Refer to the MariaDB ColumnStore Backup Script for an example implementation.​

    Step 3: Start and Configure Enterprise ColumnStore

    Step 3: Start and Configure Enterprise ColumnStore

    Overview

    This page details step 3 of a 5-step procedure for deploying .

    This step starts and configures MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 4: Test Enterprise ColumnStore

    Overview

    This page details step 4 of a 5-step procedure for deploying .

    This step tests MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    ColumnStore Read Replicas

    The ColumnStore Read Replica topology is an Alpha release. Do not use it in production without testing in your development environment first.

    Overview

    The Read Replicas feature in MariaDB ColumnStore enables horizontal scaling of read performance by incorporating read-only nodes into a multi-node cluster. These replicas differ from standard ColumnStore nodes, in that they don't run the WriteEngineServer process

    Step 1: Prepare Systems for Enterprise ColumnStore Nodes

    Step 1: Prepare Systems for Enterprise ColumnStore Nodes

    Overview

    This page details step 1 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.

    This step prepares the system to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Job Steps

    Overview

    When Enterprise ColumnStore executes a query, the on the initiator/aggregator node translates the ColumnStore execution plan (CSEP) into a job list. A job list is a sequence of job steps.

    Enterprise ColumnStore uses many different types of job steps that provide different scalability benefits:

    • Some types of job steps perform operations in a distributed manner, using multiple nodes to operate to different extents. Distributed operations provide horizontal scalability.

    Step 4: Test Enterprise ColumnStore

    Step 4: Test Enterprise ColumnStore

    Overview

    This page details step 4 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore with Object storage.

    This step tests MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 2: Install Enterprise ColumnStore

    Step 2: Install Enterprise ColumnStore

    Overview

    This page details step 2 of a 5-step procedure for deploying .

    This step installs MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    CREATE DATABASE inventory;
    CREATE TABLE inventory.products (
       product_name VARCHAR(11) NOT NULL DEFAULT '',
       supplier VARCHAR(128) NOT NULL DEFAULT '',
       quantity VARCHAR(128) NOT NULL DEFAULT '',
       unit_cost VARCHAR(128) NOT NULL DEFAULT ''
    ) ENGINE=Columnstore DEFAULT CHARSET=utf8;
    $ sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
    LOAD DATA INFILE '/tmp/inventory-products.tsv'
    INTO TABLE inventory.products;
    $ mariadb --quick \
       --skip-column-names \
       --execute="SELECT * FROM inventory.products" \
       | cpimport -s '\t' inventory products
    $ sudo yum install curl
    $ sudo apt install curl apt-transport-https
    $ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    $ echo "${checksum}  mariadb_es_repo_setup" \
          
     | sha256sum -c -
    $ chmod +x mariadb_es_repo_setup
    $ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
          --skip-maxscale \
          --skip-tools \
          --mariadb-server-version="11.4"
    $ sudo yum install epel-release
    
    $ sudo yum install jemalloc
    $ sudo apt install libjemalloc2
    $ sudo apt install libjemalloc1
    $ sudo yum install MariaDB-server \
       MariaDB-backup \
       MariaDB-shared \
       MariaDB-client \
       MariaDB-columnstore-engine
    $ sudo apt install mariadb-server \
       mariadb-backup \
       libmariadb3 \
       mariadb-client \
       mariadb-plugin-columnstore
    SHOW CREATE TABLE DATABASE_NAME.TABLE_NAME\G
    SELECT * INTO OUTFILE '/path/to/DATABASE_NAME-TABLE_NAME.csv'
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    FROM DATABASE_NAME.TABLE_NAME;
    mariadb --host HOST --port PORT --user USER --password < schema-backup.sql
    SHOW CREATE TABLE DATABASE_NAME.TABLE_NAME\G
    SELECT COUNT(*) FROM DATABASE_NAME.TABLE_NAME;
    maxctrl list servers
    $ sudo yum install curl
    $ sudo apt install curl apt-transport-https
    $ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    
    $ echo "${checksum}  mariadb_es_repo_setup" \
           | sha256sum -c -
    
    $ chmod +x mariadb_es_repo_setup
    
    $ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
          --skip-server \
          --skip-tools \
          --mariadb-maxscale-version="22.08"
    $ sudo yum install maxscale
    $ sudo apt install maxscale
    [ObjectStorage]
    service = S3
    object_size = 5M
    metadata_path = /var/lib/columnstore/storagemanager/metadata
    journal_path = /var/lib/columnstore/storagemanager/journal
    max_concurrent_downloads = 21
    max_concurrent_uploads = 21
    common_prefix_depth = 3
    
    [S3]
    ec2_iam_mode=enabled
    bucket = my_mcs_bucket
    region = us-west-2
    endpoint = s3.us-west-2.amazonaws.com
    
    [LocalStorage]
    path = /var/lib/columnstore/storagemanager/fake-cloud
    fake_latency = n
    max_latency = 50000
    
    [Cache]
    cache_size = 2g
    path = /var/lib/columnstore/storagemanager/cache
    CREATE DATABASE inventory;
    CREATE TABLE inventory.products (
       product_name VARCHAR(11) NOT NULL DEFAULT '',
       supplier VARCHAR(128) NOT NULL DEFAULT '',
       quantity VARCHAR(128) NOT NULL DEFAULT '',
       unit_cost VARCHAR(128) NOT NULL DEFAULT ''
    ) ENGINE=Columnstore DEFAULT CHARSET=utf8;
    $ sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
    LOAD DATA INFILE '/tmp/inventory-products.tsv'
    INTO TABLE inventory.products;
    $ mariadb --quick \
       --skip-column-names \
       --execute="SELECT * FROM inventory.products" \
       | cpimport -s '\t' inventory products
    $ sudo yum install curl
    $ sudo apt install curl apt-transport-https
    $ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    
    $ echo "${checksum}  mariadb_es_repo_setup" \
           | sha256sum -c -
    
    $ chmod +x mariadb_es_repo_setup
    
    $ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
          --skip-server \
          --skip-tools \
          --mariadb-maxscale-version="22.08"
    $ sudo yum install maxscale
    $ sudo apt install maxscale
    CREATE DATABASE inventory;
    CREATE TABLE inventory.products (
       product_name VARCHAR(11) NOT NULL DEFAULT '',
       supplier VARCHAR(128) NOT NULL DEFAULT '',
       quantity VARCHAR(128) NOT NULL DEFAULT '',
       unit_cost VARCHAR(128) NOT NULL DEFAULT ''
    ) ENGINE=Columnstore DEFAULT CHARSET=utf8;
    $ sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
    LOAD DATA INFILE '/tmp/inventory-products.tsv'
    INTO TABLE inventory.products;
    $ mariadb --quick \
       --skip-column-names \
       --execute="SELECT * FROM inventory.products" \
       | cpimport -s '\t' inventory products
    maxctrl call command \
       mariadbmon \
       switchover \
       mcs_monitor \
       mcs2
    maxctrl list servers
    $ cskeys
    $ scp 192.0.2.1:/var/lib/columnstore/.secrets /var/lib/columnstore/.secrets
    $ sudo chown mysql:mysql /var/lib/columnstore/.secrets
    $ sudo chmod 0400 /var/lib/columnstore/.secrets
    $ cspasswd util_user_passwd
    $ sudo mcsSetConfig CrossEngineSupport Password util_user_encrypted_passwd
    $ cspasswd --decrypt util_user_encrypted_passwd
    SELECT calGetTrace();
    viewtablelock
     There is 1 table lock
    
      Table                     LockID  Process   PID    Session   Txn  CreationTime               State    DBRoots
      hq_sales.invoices         1       cpimport  16301  BulkLoad  n/a  Wed April 7 14:20:42 2021  LOADING  1
    viewtablelock hq_sales invoices
     There is 1 table lock
    
      Table                     LockID  Process   PID    Session   Txn  CreationTime               State    DBRoots
      hq_sales.invoices         1       cpimport  16301  BulkLoad  n/a  Wed April 7 14:20:42 2021  LOADING  1
    cleartablelock 1
    as the state.
    Configure Enterprise ColumnStore

    Mandatory system variables and options for Single-Node Enterprise ColumnStore include:

    Connector
    MariaDB Connector/R2DBC

    Set this system variable to utf8

    Set this system variable to utf8_general_ci

    columnstore_use_import_for_batchinsert

    Set this system variable to ALWAYS to always use cpimport for and statements.

    Example Configuration

    Start the Enterprise ColumnStore Services

    Start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:

    Start and enable the MariaDB Enterprise ColumnStore service, so that it starts automatically upon reboot:

    Create the Utility User

    Enterprise ColumnStore requires a mandatory utility user account. By default, it connects to the server using the root user with no password. MariaDB Enterprise Server 10.6 will reject this login attempt by default, so you will need to configure Enterprise ColumnStore to use a different user account and password and create this user account on Enterprise Server.

    1. On the Enterprise ColumnStore node, create the user account with the statement:

    1. On the Enterprise ColumnStore node, grant the user account SELECT privileges on all databases with the GRANT statement:

    1. Configure Enterprise ColumnStore to use the utility user:

    1. Set the password:

    For details about how to encrypt the password, see "Credentials Management for MariaDB Enterprise ColumnStore".

    Passwords should meet your organization's password policies. If your MariaDB Enterprise Server instance has a password validation plugin installed, then the password should also meet the configured requirements.

    Configure Linux Security Modules (LSM)

    The specific steps to configure the security module depend on the operating system.

    Configure SELinux (CentOS, RHEL)

    Configure SELinux for Enterprise ColumnStore:

    1. To configure SELinux, you have to install the packages required for audit2allow. On CentOS 7 and RHEL 7, install the following:

    On RHEL 8, install the following:

    1. Allow the system to run under load for a while to generate SELinux audit events.

    2. After the system has taken some load, generate an SELinux policy from the audit events using audit2allow:

    If no audit events were found, this will print the following:

    1. If audit events were found, the new SELinux policy can be loaded using semodule:

    1. Set SELinux to enforcing mode by setting SELINUX=enforcing in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    1. Set SELinux to enforcing mode:

    Configure AppArmor (Ubuntu)

    For information on how to create a profile, see How to create an AppArmor Profile on ubuntu.com.

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    This page was step 3 of 5.

    Next: Step 4: Test MariaDB Enterprise ColumnStore.

    Single-Node Enterprise ColumnStore with Local storage
    Test Local Connection

    Connect to the server using MariaDB Client using the root@localhost user account:

    Test ColumnStore Plugin Status

    Query and confirm that the ColumnStore storage engine plugin is ACTIVE:

    Test ColumnStore Table Creation

    1. Create a test database, if it does not exist:

    1. Create a ColumnStore table:

    1. Add sample data into the table:

    1. Read data from table:

    Test Cross Engine Join

    1. Create an InnoDB table:

    1. Add data to the table:

    1. Perform a cross-engine join:

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    This page was step 4 of 5.

    Next: Step 5: Bulk Import of Data.

    Single-Node Enterprise ColumnStore with Local storage
    . This means Read Replica nodes cannot handle write operations directly — instead, any write queries attempted on a replica are automatically forwarded to a read-write (RW) node.

    Replicas utilize shared storage with other nodes in the cluster, ensuring data consistency without duplication. A key requirement is maintaining at least one RW node — a cluster consisting solely of read replicas is not operational and cannot process reads or writes.

    Read-only nodes are incompatible with S3 as the storage backend.

    Additionally, there is no automatic promotion of a read replica to RW mode if the only RW node fails, which could lead to temporary downtime until manual intervention.

    Key Features

    • Horizontal Read Scaling: Adds compute power for handling more read-intensive queries without impacting write performance.

    • Write Forwarding: Ensures writes on replicas are redirected to RW nodes, maintaining data integrity.

    • Shared Storage: Replicas access the same DBRoots as RW nodes, promoting efficiency and reducing storage overhead.

    Key Commands

    These commands require CMAPI.

    • Add Read Replica. To introduce a read-only node for scaling reads, run this command:

    • Remove Node. To safely remove any node (RW or replica) from the cluster, run this command:

    This reassigns resources as needed without cluster disruption.

    • Verify Status. To monitor the cluster's health and node roles, issue:

    Limitations

    • Node addition is restricted to private IPs only.

    • Incompatible with S3 storage, limiting use to shared file systems.

    • No automatic failover or promotion mechanism if the sole RW node goes down, requiring manual recovery.

    • At least one RW node must always be present for the cluster to function properly, supporting both read and write operations.

    How-To

    Prerequisites

    Ensure shared storage is mounted on all nodes (at /var/lib/columnstore/data1 for non-s3 configuration), to ensure data consistency across RW nodes and read replicas.

    Refer to shared storage setup for exact mount points details.

    Installation and Setup

    1

    Set Up MariaDB Repository

    Run the following to add the MariaDB repository (adjust "11.4" to the latest stable version):

    See for additional details about the ES repo setup.

    2

    Install Packages

    Run the following commands on all nodes.

    For RPM-based systems, run this command:

    Refer to for additional information.

    For DEB-based systems, run these commands:

    3

    Start and Enable Services

    4

    Configure the Initial RW Node

    On the primary RW node, set up the cluster API key (use a secure API key):

    5

    Add the Initial RW Node to the Cluster

    Run this from the primary RW node:

    6

    Add Read Replica Nodes

    From the primary RW node, add each read replica:

    7

    Verify the Cluster

    Check the status to ensure nodes are added and the cluster is healthy:

    8

    Configure Replication Between Nodes

    See for instructions on how to set up replication, and for instructions how to create user accounts and configure replication for multi-node local storage.

    9

    Configure MaxScale

    See for instructions.

    Optimize Linux Kernel Parameters

    MariaDB Enterprise ColumnStore performs best with Linux kernel optimizations.

    On each server to host an Enterprise ColumnStore node, optimize the kernel:

    1. Set the relevant kernel parameters in a sysctl configuration file. To ensure proper change management, use an Enterprise ColumnStore-specific configuration file.

    Create a /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf file:

    1. Use the sysctl command to set the kernel parameters at runtime

    Temporarily Configure Linux Security Modules (LSM)

    The Linux Security Modules (LSM) should be temporarily disabled on each Enterprise ColumnStore node during installation.

    The LSM will be configured and re-enabled later in this deployment procedure.

    The steps to disable the LSM depend on the specific LSM used by the operating system.

    CentOS / RHEL Stop SELinux

    SELinux must be set to permissive mode before installing MariaDB Enterprise ColumnStore.

    To set SELinux to permissive mode:

    1. Set SELinux to permissive mode:

    1. Set SELinux to permissive mode by setting SELINUX=permissive in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    1. Confirm that SELinux is in permissive mode:

    SELinux will be configured and re-enabled later in this deployment procedure. This configuration is not persistent. If you restart the server before configuring and re-enabling SELinux later in the deployment procedure, you must reset the enforcement to permissive mode.

    Debian / Ubuntu AppArmor

    AppArmor must be disabled before installing MariaDB Enterprise ColumnStore.

    1. Disable AppArmor:

    1. Reboot the system.

    2. Confirm that no AppArmor profiles are loaded using aa-status:

    AppArmor will be configured and re-enabled later in this deployment procedure.

    Configure Character Encoding

    When using MariaDB Enterprise ColumnStore, it is recommended to set the system's locale to UTF-8.

    1. On RHEL 8, install additional dependencies:

    1. Set the system's locale to en_US.UTF-8 by executing localedef:

    Create an S3 Bucket

    If you want to use S3-compatible storage, it is important to create the S3 bucket before you start ColumnStore. If you already have an S3 bucket, confirm that the bucket is empty.

    S3 bucket configuration will be performed later in this procedure.

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

    This page was step 1 of 5.

    Next: Step 2: Install MariaDB Enterprise ColumnStore.

    Some types of job steps perform operations in a multi-threaded manner using a thread pool. Performing multi-threaded operations provides vertical scalability.

    As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute job steps.

    For additional information, see "MariaDB Enterprise ColumnStore Query Evaluation.".

    Batch Primitive Step (BPS)

    Enterprise ColumnStore defines a batch primitive step to handle many types of tasks, such as scanning/filtering columns, JOIN operations, aggregation, functional filtering, and projecting (putting values into a SELECT list).

    In calGetTrace() output, a batch primitive step is abbreviated BPS.

    Batch primitive steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the batch primitive step to one extent at a time. The PrimProc process uses a thread pool to operate on individual blocks within the extent in parallel.

    Cross Engine Step (CES)

    Enterprise ColumnStore defines a cross-engine step to perform cross-engine joins, in which a ColumnStore table is joined with a table that uses a different storage engine.

    In calGetTrace() output, a cross-engine step is abbreviated CES.

    Cross-engine steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    Enterprise ColumnStore can perform cross-engine joins when the mandatory utility user is properly configured.

    For additional information, refer to the "Mandatory Utility User Account"

    Dictionary Structure Step (DSS)

    Enterprise ColumnStore defines a dictionary structure step to scan the dictionary extents that ColumnStore uses to store variable-length string values.

    In calGetTrace() output, a dictionary structure step is abbreviated DSS.

    Dictionary structure steps are evaluated on multiple nodes in parallel. The PrimProc process on each node evaluates the dictionary structure step to one extent at a time. It uses a thread pool to operate on individual blocks within the extent in parallel.

    Dictionary structure steps can require a lot of I/O for a couple of reasons:

    • Dictionary structure steps do not support extent elimination, so all extents for the column must be scanned.

    • Dictionary structure steps must read the column extents to find each pointer and the dictionary extents to find each value, so it doubles the number of extents to scan.

    It is generally recommended to avoid queries that will cause dictionary scans.

    For additional information, see "Avoid Creating Long String Columns".

    Hash Join Step (HJS)

    Enterprise ColumnStore defines a hash join step to perform a hash join between two tables.

    In calGetTrace() output, a hash join step is abbreviated HJS.

    Hash join steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    Enterprise ColumnStore performs the hash join in memory by default. If you perform large joins, you may be able get better performance by changing some configuration defaults with mcsSetConfig:

    • Enterprise ColumnStore can be configured to use more memory for in-memory hash joins.

    • Enterprise ColumnStore can be configured to use disk-based joins.

    For additional information, see "Configure in-memory joins" and "Configure Disk-Based Joins".

    Having Step (HVS)

    Enterprise ColumnStore defines a having step to evaluate a HAVING clause on a result set.

    In calGetTrace() output, a having step is abbreviated HVS.

    Subquery Step (SQS)

    Enterprise ColumnStore defines a subquery step to evaluate a subquery.

    In calGetTrace() output, a subquery step is abbreviated SQS.

    Tuple Aggregation Step (TAS)

    Enterprise ColumnStore defines a tuple aggregation step to collect intermediate aggregation prior to the final aggregation and evaluation of the results.

    In calGetTrace() output, a tuple aggregation step is abbreviated TAS.

    Tuple aggregation steps are primarily evaluated by the ExeMgr process on the initiator/aggregator node. However, the PrimProc process on each node also plays a role, since the PrimProc process on each node provides the intermediate aggregation results to the ExeMgr process on the initiator/aggregator node.

    Tuple Annexation Step (TNS)

    Enterprise ColumnStore defines a tuple annexation step to perform the final aggregation and evaluation of the results.

    In calGetTrace() output, a tuple annexation step is abbreviated TNS.

    Tuple annexation steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    Enterprise ColumnStore 5 performs aggregation operations in memory. As a consequence, more complex aggregation operations require more memory in that version.

    In Enterprise ColumnStore 6, disk-based aggregations can be enabled.

    For additional information, see "Configure Disk-Based Aggregations".

    Tuple Union Step (TUS)

    Enterprise ColumnStore defines a tuple union step to perform a union of two subqueries.

    In calGetTrace() output, a tuple union step is abbreviated TUS.

    Tuple union steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    Tuple Constant Step (TCS)

    Enterprise ColumnStore defines a tuple constant step to evaluate constant values.

    In calGetTrace() output, a tuple constant step is abbreviated TCS.

    Tuple constant steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    Window Function Step (WFS)

    Enterprise ColumnStore defines a window function step to evaluate window functions.

    In calGetTrace() output, a window function step is abbreviated WFS.

    Window function steps are evaluated locally by the ExeMgr process on the initiator/aggregator node.

    ExeMgr process
    Test S3 Connection

    MariaDB Enterprise ColumnStore 23.10 includes a testS3Connection command to test the S3 configuration, permissions, and connectivity.

    On each Enterprise ColumnStore node, test the S3 configuration:

    If the testS3Connection command does not return OK, investigate the S3 configuration.

    Test Local Connection

    Connect to the server using using the root@localhost user account:

    Test ColumnStore Plugin Status

    Query and confirm that the ColumnStore storage engine plugin is ACTIVE:

    Test ColumnStore Table Creation

    1. Create a test database, if it does not exist:

    1. Create a ColumnStore table:

    1. Add sample data into the table:

    1. Read data from table:

    Test Cross Engine Join

    1. Create an InnoDB table:

    1. Add data to the table:

    1. Perform a cross-engine join:

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

    This page was step 4 of 5.

    Next: Step 5: Bulk Import of Data.

    Retrieve Download Token

    MariaDB Corporation provides package repositories for CentOS / RHEL (YUM) and Debian / Ubuntu (APT). A download token is required to access the MariaDB Enterprise Repository.

    Customer Download Tokens are customer-specific and are available through the MariaDB Customer Portal.

    To retrieve the token for your account:

    1. Navigate to https://customers.mariadb.com/downloads/token/

    2. Log in.

    3. Copy the Customer Download Token.

    Substitute your token for CUSTOMER_DOWNLOAD_TOKEN when configuring the package repositories.

    Set Up Repository

    1. On each Enterprise ColumnStore node, install the prerequisites for downloading the software from the Web. Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    1. On each Enterprise ColumnStore node, configure package repositories and specify Enterprise Server:

    Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

    Install Enterprise ColumnStore

    Install additional dependencies:

    Install on CentOS / RHEL (YUM)

    Install of Debian 10 and Ubuntu 20.04 (APT):

    Install on Debian 9 and Ubuntu 18.04 (APT):

    Install MariaDB Enterprise Server and MariaDB Enterprise ColumnStore:

    Install on CentOS / RHEL (YUM):

    Install on Debian / Ubuntu (APT):

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    This page was step 2 of 5.

    Next: Step 3: Start and Configure MariaDB Enterprise ColumnStore.

    Single-Node Enterprise ColumnStore with Local storage
    cpimport utility
    https://customers.mariadb.com/downloads/token/
    Next: Step 4: Start and Configure MariaDB Enterprise Server.
    Next: Step 2: Install MariaDB Enterprise ColumnStore.

    Shell

    cpimport

    • SQL access is not required

    SQL

    LOAD DATA INFILE

    • Shell access is not required

    Remote Database

    Remote Database Import

    • Use normal database client

    • Avoid dumping data to intermediate filed

    cpimport
    cpimport
    cpimport
    cpimport
    https://customers.mariadb.com/downloads/token/
    Next: Step 4: Start and Configure MariaDB Enterprise Server.
    Deployment Instructions
    MariaDB Columnstore Blogs
    forum interface
    jira.mariadb.org
    support utility
    cpimport
    LOAD DATA INFILE
    Remote Database Import
    columnstore-topology-s3
    columnstore-topology

    Step 2: Configure Shared Local Storage

    Step 2: Configure Shared Local Storage

    Overview

    This page details step 2 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step configures shared local storage on systems hosting Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Directories for Shared Local Storage

    In a ColumnStore Object Storage topology, MariaDB Enterprise ColumnStore requires the Storage Manager directory to be located on shared local storage.

    The Storage Manager directory is at the following path:

    • /var/lib/columnstore/storagemanager

    The Storage Manager directory must be mounted on every ColumnStore node.

    Choose a Shared Local Storage Solution

    Select a Shared Local Storage solution for the Storage Manager directory:

    For additional information, see "".

    Configure EBS Multi-Attach

    EBS is a high-performance block-storage service for AWS (Amazon Web Services). EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.

    For Enterprise ColumnStore deployments in AWS:

    • EBS Multi-Attach is a recommended option for the Storage Manager directory.

    • Amazon S3 storage is the recommended option for data.

    • Consult the vendor documentation for details on how to configure EBS Multi-Attach.

    Configure Elastic File System (EFS)

    EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services)

    For deployments in AWS:

    • EFS is a recommended option for the Storage Manager directory.

    • Amazon S3 storage is the recommended option for data.

    • Consult the vendor documentation for details on how to configure EFS.

    Configure Filestore

    Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).

    For Enterprise ColumnStore deployments in GCP:

    • Filestore is the recommended option for the Storage Manager directory.

    • Google Object Storage (S3-compatible) is the recommended option for data.

    • Consult the vendor documentation for details on how to configure Filestore.

    Configure GlusterFS

    GlusterFS is a distributed file system. GlusterFS is a shared local storage option, but it is not one of the recommended options.

    For more information, see "".

    Install GlusterFS

    On each Enterprise ColumnStore node, install GlusterFS.

    Install on CentOS / RHEL 8 (YUM):

    Install on CentOS / RHEL 7 (YUM):

    Install on Debian (APT):

    Install on Ubuntu (APT):

    Start the GlusterFS Daemon

    Start the GlusterFS daemon:

    Probe the GlusterFS Peers

    Before you can create a volume with GlusterFS, you must probe each node from a peer node.

    1. On the primary node, probe all of the other cluster nodes:

    1. On one of the replica nodes, probe the primary node to confirm that it is connected:

    1. On the primary node, check the peer status:

    Configure and Mount GlusterFS Volumes

    Create the GlusterFS volumes for MariaDB Enterprise ColumnStore. Each volume must have the same number of replicas as the number of Enterprise ColumnStore nodes.

    1. On each Enterprise ColumnStore node, create the directory for each brick in the /brick directory:

    1. On the primary node, create the GlusterFS volumes:

    1. On the primary node, start the volume:

    1. On each Enterprise ColumnStore node, create mount points for the volumes:

    1. On each Enterprise ColumnStore node, add the mount points to /etc/fstab:

    1. On each Enterprise ColumnStore node, mount the volumes:

    Configure Network File System (NFS)

    NFS is a distributed file system. NFS is available in most Linux distributions. If NFS is used for an Enterprise ColumnStore deployment, the storage must be mounted with the sync option to ensure that each node flushes its changes immediately.

    For on-premises deployments:

    • NFS is the recommended option for the Storage Manager directory.

    • Any S3-compatible storage is the recommended option for data.

    Consult the documentation for your NFS implementation for details on how to configure NFS.

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 2 of 9.

    ColumnStore Storage Engine

    Overview

    MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.

    For deployment instructions and available documentation, see "MariaDB Enterprise ColumnStore."

    The ColumnStore storage engine has the following features:

    Feature
    Detail
    Resources

    Examples

    Creating a ColumnStore Table

    To create a ColumnStore table, use the statement with the ENGINE=ColumnStore option:

    Multi-Node Configuration

    To deploy a multi-node Enterprise ColumnStore deployment, a configuration similar to below is required:

    Configure the Mandatory Utility User Account

    To configure the mandatory utility user account, use the mcsSetConfig command:

    Topologies Overview

    MariaDB offers varied deployment topologies by workload and technology, each named and diagrammed with benefits listed. Custom configurations are also supported.

    MariaDB products can be deployed in many different topologies. The topologies described in this section are representative of the overall structure. MariaDB products can be deployed to form other topologies, leverage advanced product capabilities, or combine the capabilities of multiple topologies.

    Topologies are the arrangements of nodes and links to achieve a purpose. This documentation describes a few of the many topologies that can be deployed using MariaDB database products.

    We group topologies by workload (transactional, analytical, or hybrid) and technologies (Enterprise Spider). Single-node topologies are listed separately.

    To help you select the correct topology:

    • Each topology is named, and this name is used consistently throughout the documentation.

    • A thumbnail diagram provides a small-scale summary of the topology's architecture.

    • Finally, we provide a list of the benefits of the topology.

    Although multiple topologies are listed on this page, the listed topologies are not the only options. MariaDB products are flexible, configurable, and extensible, so it is possible to deploy different topologies that combine the capabilities of multiple topologies listed on this page. The topologies listed on this page are primarily intended to be representative of the most commonly requested use cases.

    Transactional (OLTP)

    Primary/Replica Topology

    Diagram
    Features

    Galera Cluster Topology

    Diagram
    Features

    Analytical (OLAP, Data Warehousing, DSS)

    ColumnStore Shared Local Storage Topology

    Diagram
    Features

    ColumnStore Object Storage Topology

    Diagram
    Features

    Hybrid Workloads

    HTAP Topology

    Diagram
    Features

    Optimizing Linux Kernel Parameters for MariaDB ColumnStore

    This page provides information on optimizing Linux kernel parameters for improved performance with MariaDB ColumnStore.

    Introduction

    MariaDB ColumnStore is a high-performance columnar database designed for analytical workloads. By optimizing the Linux kernel parameters, you can further enhance the performance of your MariaDB ColumnStore deployments.

    Recommended Parameters

    The following table lists the recommended optimized Linux kernel parameters for MariaDB ColumnStore:

    For more information refer to .

    Parameter
    Recommended Value
    Explanation

    Configuration Example

    To configure these parameters, you can add them to the /etc/sysctl.conf file. For example:

    After making changes to the /etc/sysctl.conf file, you need to apply the changes by running the following command:

    Increase the Limit for Memory-Mapped Areas

    Common Use Cases

    These optimized parameters are recommended for all MariaDB ColumnStore deployments, regardless of the specific workload. They can improve performance for various use cases, including:

    • Large-scale data warehousing

    • Real-time analytics

    • Business intelligence

    • Machine learning

    Related Links

    Conclusion

    By optimizing the Linux kernel parameters, you can significantly improve the performance of your MariaDB ColumnStore deployments. These recommendations provide a starting point for optimizing your system, and you may need to adjust the values based on your specific hardware and workload.

    Query Tuning Recommendations

    When tuning queries for MariaDB Enterprise ColumnStore, there are some important details to consider.

    Avoid Selecting Unnecessary Columns

    Enterprise ColumnStore only reads the columns that are necessary to resolve a query.

    For example, the following query selects every column in the table:

    Whereas the following query only selects two columns in the table, so it requires less I/O:

    For best performance, only select the columns that are necessary to resolve a query.

    Avoid Large Sorts

    When Enterprise ColumnStore performs ORDER BY and LIMIT operations, the operations are performed in a single-threaded manner after the rest of the query processing has been completed, and the full unsorted result-set has been retrieved. For large data sets, the performance overhead can be significant.

    Avoid Excessive Aggregations

    When Enterprise ColumnStore 5 performs aggregations (i.e., DISTINCT, GROUP BY, COUNT(*), etc.), all of the aggregation work happens in-memory by default. As a consequence, more complex aggregation operations require more memory in that version.

    For example, the following query could require a lot of memory in Enterprise ColumnStore 5, since it has to calculate many distinct values in memory:

    Whereas the following query could require much less memory in Enterprise ColumnStore 5, since it has to calculate fewer distinct values:

    In Enterprise ColumnStore 6, disk-based aggregations can be enabled.

    For best performance, avoid excessive aggregations or enable disk-based aggregations.

    For additional information, see "".

    Avoid Non-Distributed Functions

    When Enterprise ColumnStore evaluates built-in functions and aggregate functions, it can often evaluate the function in a distributed manner. Distributed evaluation of functions can significantly improve performance.

    Enterprise ColumnStore supports distributed evaluation for some built-in functions. For other built-in functions, the function must be evaluated serially on the final result set.

    Enterprise ColumnStore also supports distributed evaluation for user-defined functions developed with . For functions developed with Enterprise Server's standard User-Defined Function (UDF) API, the function must be evaluated serially on the final result set.

    For best performance, avoid non-distributed functions.

    Optimize Large Joins

    By default, Enterprise ColumnStore performs all joins as in-memory hash joins.

    If the joined tables are very large, the in-memory hash join can require too much memory for the default configuration. There are a couple options to work around this:

    • Enterprise ColumnStore can be configured to use more memory for in-memory hash joins.

    • Enterprise ColumnStore can be configured to use disk-based joins.

    • Enterprise ColumnStore can use optimizer statistics to better optimize the join order.

    For additional information, see "", "", and "".

    Load Ordered Data in Proper Order

    Enterprise ColumnStore uses extent elimination to optimize queries. uses the minimum and maximum values in the to determine which extents can be skipped for a query.

    When data is loaded into Enterprise ColumnStore, it appends the data to the latest extent. When an extent reaches the maximum number of column values, Enterprise ColumnStore creates a new extent. As a consequence, if ordered data is loaded in its proper order, then similar values will be clustered together in the same extent. This can improve query performance, because extent elimination performs best when similar values are clustered together.

    For example, if you expect to query a table with a filter on a timestamp column, you should sort the data using the timestamp column before loading it into Enterprise ColumnStore. Later, when the table is queried with a filter on the timestamp column, Enterprise ColumnStore would be able to skip many extents using extent elimination.

    For best performance, load ordered data in proper order.

    Enable Decimal Overflow Checks

    When Enterprise ColumnStore performs mathematical operations with very big values using the , , and data types, the operation can sometimes overflow ColumnStore's maximum precision or scale. The maximum precision and scale depend on the version of Enterprise ColumnStore:

    • In Enterprise ColumnStore 6, the maximum precision (M) is 38, and the maximum scale (D) is 38.

    • In Enterprise ColumnStore 5, the maximum precision (M) is 18, and the maximum scale (D) is 18.

    In Enterprise ColumnStore 6, applications can configure Enterprise ColumnStore to check for decimal overflows by setting the columnstore_decimal_overflow_check system variable, but only when the column has a decimal precision that is 18 or more:

    When decimal overflow checks are enabled, math operations have extra overhead.

    When the decimal overflow check fails, MariaDB Enterprise ColumnStore raises an error with the ER_INTERNAL_ERROR error SQL, and it writes detailed information about the overflow check failure to the ColumnStore system logs.

    User-Defined Aggregate Function (UDAF) C++ API

    MariaDB Enterprise ColumnStore supports Enterprise Server's standard User-Defined Function (UDF) API. However, UDFs developed using that API cannot be executed in a distributed manner.

    To support distributed execution of custom SQL, MariaDB Enterprise ColumnStore supports a Distributed User Defined Aggregate Functions (UDAF) C++ API:

    • The Distributed User Defined Aggregate Functions (UDAF) C++ API allows anyone to create aggregate functions of arbitrary complexity for distributed execution in the ColumnStore storage engine.

    • These functions can also be used as Analytic (Window) functions just like any built-in aggregate function.

    Performance Related Configuration Settings

    MariaDB ColumnStore

    Introduction

    A number of system configuration variables exist to allow fine tuning of the system to suit the physical hardware and query characteristics. In general the default values will work relatively well for many cases.

    The configuration parameters are maintained in the /etc/Columnstore.xml file. In a multiple server deployment these should only be edited on the PM1 server as this will be automatically replicated to other servers by the system. A system restart will be required for the configuration change to take affect.

    Convenience utility programs getConfig and setConfig are available to safely update the Columnstore.xml without needing to be comfortable with editing XML files. The -h argument will display usage information.

    Memory Management

    NumBlocksPct

    The NumBlocksPct configuration parameter specifies the percentage of physical memory to utilize for disk block caching. The default value is 25, to ensure enough physical memory.

    The NumBlocksPct configuration parameter specifies the percentage of physical memory to utilize for disk block caching. The default value is 50, to ensure enough physical memory.

    TotalUmMemory

    The TotalUmMemory configuration parameter specifies the percentage of physical memory to utilize for joins, intermediate results and set operations. This specifies an upper limit for small table results in joins rather than a pre-allocation of memory. The default value is 50.

    The TotalUmMemory configuration parameter specifies the percentage of physical memory to utilize for joins, intermediate results and set operations. This specifies an upper limit for small table results in joins rather than a pre-allocation of memory. The default value is 25.

    Memory Requirements

    In a single server or combined deployment, the sum of NumBlocksPct and TotalUmMemory should typically not exceed 75% of physical memory. With very large memory servers this could be raised but the key point is to leave enough memory for other processes including mariadbd.

    From version 1.2.2, these can be set to static numeric limits instead of percentages by entering a number with 'M' or 'G' at the end to signify MiB or GiB.

    Query Concurrency - MaxOutstandingRequests

    ColumnStore handles concurrent query execution by managing the rate of concurrent batch primitive steps. This is configured using the MaxOutstandingRequests parameter and has a default value of 20. Each batch primitive step is executed within the context of 1 extent column according to this high level process:

    • ColumnStore issues up to MaxOutstandingRequests number of batch primitive steps.

    • PrimProc processes the request, using many threads and returns its response. These generally take a fraction of a second up to a low number of seconds depending on the amount of Physical I/O and the performance of that storage.

    • ColumnStore issues new requests as prior requests complete maintaining the maximum number of outstanding requests.

    This scheme allows for large queries to use all available resources when not otherwise being consumed and for smaller queries to execute with minimal delay. Lower values optimize for higher throughput of smaller queries while a larger value optimizes for response time of that query. The default value should work well under most circumstances however the value should be increased as the number of nodes is increased.

    How many Queries are running and how many queries are currently in the queue can be checked with

    Join Tuning - PmMaxMemorySmallSide

    ColumnStore maintains statistics for table and utilizes this to determine which is the larger table of the two. This is based both on the number of blocks in that table and estimation of the predicate cardinality. The first step is to apply any filters as appropriate to the smaller table and returning this data set to memory. The size of this data set is compared against the configuration parameter PmMaxMemorySmallSide which has a default value of 64 (MB). This value can be set all the way up to 4GB. This default allows for approximately 1M rows on the small table side to be joined against billions (or trillions) on the large table side. If the size of the small data set is less than PmMaxMemorySmallSide the dataset is sent to PrimProc for creation of a distributed hashmap. Thus this setting is important to tuning of joins and whether the operation can be distributed or not. This should be set to support your largest expected small table join size up to available memory:

    • Although this will increase the size of data between nodes to support the join, it means that the join and subsequent aggregates are pushed down, scaled out, and a smaller data set is returned back.

    • In a multiple server deployment, the sizing should be based from available physical memory on the servers, how much memory to reserve for block caching, and the number of simultaneous join operations that can be expected to run times the average small table join data size.

    Multi-Table Join Tuning

    The above logic for a single table join extrapolates out to multi table joins where the small table values are precalculated and performed as one single scan against the large table. This works well for the typical star schema case joining multiple dimension tables with a large fact table. For some join scenarios it may be necessary to sequence joins to create the intermediate datasets for joining, this would happen for instance with a snowflake schema structure. In some extreme cases it may be hard for the optimizer to be able to determine the most optimal join path. In this case a hint is available to force a join ordering. The INFINIDB_ORDERED hint will force the first table in the from clause to be considered the largest table and override any statistics based decision, for example:

    Note: INFINIDB\_ORDERED is deprecated and does not work anymore for ColumnStore 1.2 and above.

    use set infinidb_ordered_only=ON;

    and for 1.4 set columnstore_ordered_only=ON;

    Disk-Based Joins - AllowDiskBasedJoin

    When a join is very large and exceeds the PmMaxMemorySmallSide setting, it is performed in memory. For very large joins, this could exceed the available memory, in which case this is detected and a query error reported. Several configuration parameters are available to enable and configure usage of disk overflow should this occur:

    • AllowDiskBasedJoin – Controls the option to use disk Based joins or not. Valid values are Y (enabled) or N (disabled). By default, this option is disabled.

    • TempFileCompression – Controls whether the disk join files are compressed or noncompressed. Valid values are Y (use compressed files) or N (use non-compressed files).

    • TempFilePath – The directory path used for the disk joins. By default, this path is the tmp directory for your installation (i.e., /tmp/columnstore_tmp_files/). Files (named infinidb-join-data*

    A MariaDB global or session variable is available to specify a memory limit at which point the query is switched over to disk-based joins:

    • infinidb_um_mem_limit - Memory limit in MB per user (i.e., switch to disk-based join if this limit is exceeded). By default, this limit is not set (value of 0).

    Step 3: Start and Configure Enterprise ColumnStore

    Step 3: Start and Configure Enterprise ColumnStore

    Overview

    This page details step 3 of a 5-step procedure for deploying .

    This step starts and configures MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 1: Prepare ColumnStore Nodes

    Step 1: Prepare ColumnStore Nodes

    Overview

    This page details step 1 of the 9-step procedure "".

    This step prepares systems to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 2: Configure Shared Local Storage

    Step 2: Configure Shared Local Storage

    Overview

    This page details step 2 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step configures shared local storage on systems hosting Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Backup and Restore with Object Storage

    Overview

    MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses for data and shared local storage for the , the S3 bucket, the Storage Manager directory, and the MariaDB data directory must be backed up separately.

    Recovery Planning

    Step 1: Prepare ColumnStore Nodes

    Step 1: Prepare ColumnStore Nodes

    Overview

    This page details step 1 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step prepares systems to host MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Backup and Restore with Shared Local Storage

    Overview

    MariaDB Enterprise ColumnStore supports backup and restore. If Enterprise ColumnStore uses shared local storage for the DB Root directories, the DB Root directories and the MariaDB data directory must be backed up separately.

    Recovery Planning

    Data Import

    Learn how to import data into MariaDB ColumnStore. This section covers various methods and tools for efficiently loading large datasets into your columnar database for analytical workloads.

    Overview

    MariaDB Enterprise ColumnStore supports very efficient bulk data loads.

    MariaDB Enterprise ColumnStore performs bulk data loads very efficiently using a variety of mechanisms, including the cpimport tool, specialized handling of certain SQL statements, and minimal locking during data import.

    maxctrl set server \
       mcs3 \
       maintenance
    maxctrl set server \
       mcs3 \
       maintenance \
       --force
    maxctrl list servers
    maxctrl clear server \
       mcs3 \
       maintenance
    maxctrl list servers
    [mariadb]
    log_error                              = mariadbd.err
    character_set_server                   = utf8
    collation_server                       = utf8_general_ci
    $ sudo systemctl start mariadb
    
    $ sudo systemctl enable mariadb
    $ sudo systemctl start mariadb-columnstore
    
    $ sudo systemctl enable mariadb-columnstore
    CREATE USER 'util_user'@'127.0.0.1'
    IDENTIFIED BY 'util_user_passwd';
    GRANT SELECT, PROCESS ON *.*
    TO 'util_user'@'127.0.0.1';
    $ sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
    
    $ sudo mcsSetConfig CrossEngineSupport Port 3306
    
    $ sudo mcsSetConfig CrossEngineSupport User util_user
    $ sudo mcsSetConfig CrossEngineSupport Password util_user_passwd
    $ sudo yum install policycoreutils policycoreutils-python
    $ sudo yum install policycoreutils python3-policycoreutils policycoreutils-python-utils
    $ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
    $ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
    
    Nothing to do
    $ sudo semodule -i mariadb_local.pp
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=enforcing
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    $ sudo setenforce enforcing
    $ sudo mariadb
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 38
    Server version: 11.4.5-3-MariaDB-Enterprise MariaDB Enterprise Server
    
    Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    MariaDB [(none)]>
    SELECT PLUGIN_NAME, PLUGIN_STATUS
    FROM information_schema.PLUGINS
    WHERE PLUGIN_LIBRARY LIKE 'ha_columnstore%';
    +---------------------+---------------+
    | PLUGIN_NAME         | PLUGIN_STATUS |
    +---------------------+---------------+
    | Columnstore         | ACTIVE        |
    | COLUMNSTORE_COLUMNS | ACTIVE        |
    | COLUMNSTORE_TABLES  | ACTIVE        |
    | COLUMNSTORE_FILES   | ACTIVE        |
    | COLUMNSTORE_EXTENTS | ACTIVE        |
    +---------------------+---------------+
    CREATE DATABASE IF NOT EXISTS test;
    CREATE TABLE IF NOT EXISTS test.contacts (
       first_name VARCHAR(50),
       last_name VARCHAR(50),
       email VARCHAR(100)
    ) ENGINE=ColumnStore;
    INSERT INTO test.contacts (first_name, last_name, email)
       VALUES
       ("Kai", "Devi", "kai.devi@example.com"),
       ("Lee", "Wang", "lee.wang@example.com");
    SELECT * FROM test.contacts;
    +------------+-----------+----------------------+
    | first_name | last_name | email                |
    +------------+-----------+----------------------+
    | Kai        | Devi      | kai.devi@example.com |
    | Lee        | Wang      | lee.wang@example.com |
    +------------+-----------+----------------------+
    CREATE TABLE test.addresses (
       email VARCHAR(100),
       street_address VARCHAR(255),
       city VARCHAR(100),
       state_code VARCHAR(2)
    ) ENGINE = InnoDB;
    INSERT INTO test.addresses (email, street_address, city, state_code)
       VALUES
       ("kai.devi@example.com", "1660 Amphibious Blvd.", "Redwood City", "CA"),
       ("lee.wang@example.com", "32620 Little Blvd", "Redwood City", "CA");
    SELECT name AS "Name", addr AS "Address"
    FROM (SELECT CONCAT(first_name, " ", last_name) AS name,
       email FROM test.contacts) AS contacts
    INNER JOIN (SELECT CONCAT(street_address, ", ", city, ", ", state_code) AS addr,
       email FROM test.addresses) AS addr
    WHERE  contacts.email = addr.email;
    +----------+-----------------------------------------+
    | Name     | Address                                 |
    +----------+-----------------------------------------+
    | Kai Devi | 1660 Amphibious Blvd., Redwood City, CA |
    | Lee Wang | 32620 Little Blvd, Redwood City, CA     |
    +----------+-----------------------------------------+
    
    +-------------------+-------------------------------------+
    | Name              | Address                             |
    +-------------------+-------------------------------------+
    | Walker Percy      | 500 Thomas More Dr., Covington, LA  |
    | Flannery O'Connor | 300 Tarwater Rd., Milledgeville, GA |
    +-------------------+-------------------------------------+
    wget https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup ;
    chmod +x mariadb_es_repo_setup;
    ./mariadb_es_repo_setup --token="xxxxx" --apply --mariadb-server-version="11.4"
    sudo mcs node add --read-replica --node <private-ip>
    sudo mcs node remove --node <private-ip>
    sudo mcs cluster status
    # minimize swapping
    vm.swappiness = 1
    
    # Increase the TCP max buffer size
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    
    # Increase the TCP buffer limits
    # min, default, and max number of bytes to use
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    
    # don't cache ssthresh from previous connection
    net.ipv4.tcp_no_metrics_save = 1
    
    # for 1 GigE, increase this to 2500
    # for 10 GigE, increase this to 30000
    net.core.netdev_max_backlog = 2500
    $ sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
    $ sudo setenforce permissive
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=permissive
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    $ sudo getenforce
    Permissive
    $ sudo systemctl disable apparmor
    $ sudo aa-status
    apparmor module is loaded.
    0 profiles are loaded.
    0 profiles are in enforce mode.
    0 profiles are in complain mode.
    0 processes have profiles defined.
    0 processes are in enforce mode.
    0 processes are in complain mode.
    0 processes are unconfined but have a profile defined.
    $ sudo yum install glibc-locale-source glibc-langpack-en
    $ sudo localedef -i en_US -f UTF-8 en_US.UTF-8
    $ sudo testS3Connection
    StorageManager[26887]: Using the config file found at /etc/columnstore/storagemanager.cnf
    StorageManager[26887]: S3Storage: S3 connectivity & permissions are OK
    S3 Storage Manager Configuration OK
    $ sudo mariadb
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 38
    Server version: 11.4.5-3-MariaDB-Enterprise MariaDB Enterprise Server
    
    Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    MariaDB [(none)]>
    SELECT PLUGIN_NAME, PLUGIN_STATUS
    FROM information_schema.PLUGINS
    WHERE PLUGIN_LIBRARY LIKE 'ha_columnstore%';
    +---------------------+---------------+
    | PLUGIN_NAME         | PLUGIN_STATUS |
    +---------------------+---------------+
    | Columnstore         | ACTIVE        |
    | COLUMNSTORE_COLUMNS | ACTIVE        |
    | COLUMNSTORE_TABLES  | ACTIVE        |
    | COLUMNSTORE_FILES   | ACTIVE        |
    | COLUMNSTORE_EXTENTS | ACTIVE        |
    +---------------------+---------------+
    CREATE DATABASE IF NOT EXISTS test;
    CREATE TABLE IF NOT EXISTS test.contacts (
       first_name VARCHAR(50),
       last_name VARCHAR(50),
       email VARCHAR(100)
    ) ENGINE=ColumnStore;
    INSERT INTO test.contacts (first_name, last_name, email)
       VALUES
       ("Kai", "Devi", "kai.devi@example.com"),
       ("Lee", "Wang", "lee.wang@example.com");
    SELECT * FROM test.contacts;
    
    +------------+-----------+----------------------+
    | first_name | last_name | email                |
    +------------+-----------+----------------------+
    | Kai        | Devi      | kai.devi@example.com |
    | Lee        | Wang      | lee.wang@example.com |
    +------------+-----------+----------------------+
    CREATE TABLE test.addresses (
       email VARCHAR(100),
       street_address VARCHAR(255),
       city VARCHAR(100),
       state_code VARCHAR(2)
    ) ENGINE = InnoDB;
    INSERT INTO test.addresses (email, street_address, city, state_code)
       VALUES
       ("kai.devi@example.com", "1660 Amphibious Blvd.", "Redwood City", "CA"),
       ("lee.wang@example.com", "32620 Little Blvd", "Redwood City", "CA");
    SELECT name AS "Name", addr AS "Address"
    FROM (SELECT CONCAT(first_name, " ", last_name) AS name,
       email FROM test.contacts) AS contacts
    INNER JOIN (SELECT CONCAT(street_address, ", ", city, ", ", state_code) AS addr,
       email FROM test.addresses) AS addr
    WHERE  contacts.email = addr.email;
    +----------+-----------------------------------------+
    | Name     | Address                                 |
    +----------+-----------------------------------------+
    | Kai Devi | 1660 Amphibious Blvd., Redwood City, CA |
    | Lee Wang | 32620 Little Blvd, Redwood City, CA     |
    +----------+-----------------------------------------+
    
    +-------------------+-------------------------------------+
    | Name              | Address                             |
    +-------------------+-------------------------------------+
    | Walker Percy      | 500 Thomas More Dr., Covington, LA  |
    | Flannery O'Connor | 300 Tarwater Rd., Milledgeville, GA |
    +-------------------+-------------------------------------+
    $ sudo yum install curl
    $ sudo apt install curl apt-transport-https
    $ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    $ echo "${checksum}  mariadb_es_repo_setup" \
          
     | sha256sum -c -
    $ chmod +x mariadb_es_repo_setup
    $ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
          --skip-maxscale \
          --skip-tools \
          --mariadb-server-version="11.4"
    $ sudo yum install epel-release
    
    $ sudo yum install jemalloc
    $ sudo apt install libjemalloc2
    $ sudo apt install libjemalloc1
    $ sudo yum install MariaDB-server \
       MariaDB-backup \
       MariaDB-shared \
       MariaDB-client \
       MariaDB-columnstore-engine
    $ sudo apt install mariadb-server \
       mariadb-backup \
       libmariadb3 \
       mariadb-client \
       mariadb-plugin-columnstore
    sudo cpimport -s ',' \
       DATABASE_NAME \
       TABLE_NAME \
       /path/to/DATABASE_NAME-TABLE_NAME.csv
    SELECT * FROM DATABASE_NAME.TABLE_NAME LIMIT 100;
    $ sudo yum install curl
    $ sudo apt install curl apt-transport-https
    $ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    $ echo "${checksum}  mariadb_es_repo_setup" \
           | sha256sum -c -
    $ chmod +x mariadb_es_repo_setup
    $ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
          --skip-maxscale \
          --skip-tools \
          --mariadb-server-version="11.4"
    $ sudo yum install jemalloc jq curl
    $ sudo apt install libjemalloc1 jq curl
    $ sudo apt install libjemalloc2 jq curl
    $ sudo yum install MariaDB-server \
       MariaDB-backup \
       MariaDB-shared \
       MariaDB-client \
       MariaDB-columnstore-engine \
       MariaDB-columnstore-cmapi
    $ sudo apt install mariadb-server \
       mariadb-backup \
       libmariadb3 \
       mariadb-client \
       mariadb-plugin-columnstore \
       mariadb-columnstore-cmapi
    # minimize swapping
    vm.swappiness = 1
    
    # Increase the TCP max buffer size
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    
    # Increase the TCP buffer limits
    # min, default, and max number of bytes to use
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    
    # don't cache ssthresh from previous connection
    net.ipv4.tcp_no_metrics_save = 1
    
    # for 1 GigE, increase this to 2500
    # for 10 GigE, increase this to 30000
    net.core.netdev_max_backlog = 2500
    $ sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
    $ sudo setenforce permissive
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=permissive
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    sudo getenforce
    Permissive
    $ sudo systemctl disable apparmor
    $ sudo aa-status
    apparmor module is loaded.
    0 profiles are loaded.
    0 profiles are in enforce mode.
    0 profiles are in complain mode.
    0 processes have profiles defined.
    0 processes are in enforce mode.
    0 processes are in complain mode.
    0 processes are unconfined but have a profile defined.
    $ sudo yum install glibc-locale-source glibc-langpack-en
    $ sudo localedef -i en_US -f UTF-8 en_US.UTF-8
    CREATE DATABASE inventory;
    CREATE TABLE inventory.products (
       product_name VARCHAR(11) NOT NULL DEFAULT '',
       supplier VARCHAR(128) NOT NULL DEFAULT '',
       quantity VARCHAR(128) NOT NULL DEFAULT '',
       unit_cost VARCHAR(128) NOT NULL DEFAULT ''
    ) ENGINE=Columnstore DEFAULT CHARSET=utf8;
    $ sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
    LOAD DATA INFILE '/tmp/inventory-products.tsv'
    INTO TABLE inventory.products;
    $ mariadb --quick \
       --skip-column-names \
       --execute="SELECT * FROM inventory.products" \
       | cpimport -s '\t' inventory products
    $ sudo yum install curl
    $ sudo apt install curl apt-transport-https
    $ curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    $ echo "${checksum}  mariadb_es_repo_setup" \
           | sha256sum -c -
    $ chmod +x mariadb_es_repo_setup
    $ sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
          --skip-maxscale \
          --skip-tools \
          --mariadb-server-version="11.4"
    $ sudo yum install jemalloc jq curl
    $ sudo apt install libjemalloc1 jq curl
    $ sudo apt install libjemalloc2 jq curl
    $ sudo yum install MariaDB-server \
       MariaDB-backup \
       MariaDB-shared \
       MariaDB-client \
       MariaDB-columnstore-engine \
       MariaDB-columnstore-cmapi
    $ sudo apt install mariadb-server \
       mariadb-backup \
       libmariadb3 \
       mariadb-client \
       mariadb-plugin-columnstore \
       mariadb-columnstore-cmapi
    mariadb -e "FLUSH TABLES WITH READ LOCK;"
    save_brm
    mkdir -p /extent_map_backup
    cp -f /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /extent_map_backup
    mariadb -e "UNLOCK TABLES;"
    systemctl stop mariadb-columnstore
    mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
    > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
    cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
    chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm/
    systemctl start mariadb-columnstore
    curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/shutdown \ --header 'Content-Type:application/json' \ --header 'x-api-key:your_api_key' \ --data '{"timeout":60}' -k
    mv /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_em /tmp/BRM_saves_em.bad
    > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vbbm > /var/lib/columnstore/data1/systemFiles/dbrm/BRM_saves_vss
    mv cp -f /extent_map_backup/BRM_saves_em /var/lib/columnstore/data1/systemFiles/dbrm/
    chown -R mysql:mysql /var/lib/columnstore/data1/systemFiles/dbrm
    curl -s -X PUT https://127.0.0.1:8640/cmapi/0.4.0/cluster/start \ --header 'Content-Type:application/json' \ --header 'x-api-key:your_api_key' \ --data '{"timeout":60}' -k
    SELECT * FROM tab;
    SELECT col1, col2 FROM tab;
    NFS (Network File System)
    EBS (Elastic Block Store) Multi-Attach
    EFS (Elastic File System)
    Filestore
    GlusterFS
    Shared Local Storage Options
    Recommended Storage Options
    Next: Step 3: Install MariaDB Enterprise Server.

    Compression

    Yes

    High Availability (HA)

    Yes

    Main Memory Caching

    Yes

    Transaction Logging

    Yes

    Garbage Collection

    Yes

    Online Schema changes

    Yes

    Non-locking Reads

    Yes

    Storage Engine

    ColumnStore

    Availability

    ES 10.5+, CS 10.5+

    MariaDB Enterprise Server

    Workload Optimization

    OLAP and Hybrid

    OLAP Workloads Hybrid Workloads

    Table Orientation

    Columnar

    Columnar Storage Engine

    ACID-compliant

    Yes

    Indexes

    Unnecessary

    net.core.netdev_max_backlog

    2500

    Sets the maximum number of packets that can be queued for a network device. A higher value allows for more packets to be queued, improving performance.

    net.core.rmem_max

    16777216

    Sets the maximum receive buffer size for TCP sockets. A higher value allows for larger buffers, improving performance.

    net.core.wmem_max

    16777216

    Sets the maximum send buffer size for TCP sockets. A higher value allows for larger buffers, improving performance.

    net.ipv4.tcp_max_syn_backlog

    8192

    Sets the maximum number of queued SYN requests. A higher value allows for more queued requests, improving performance.

    net.ipv4.tcp_timestamps

    0

    Disables TCP timestamps, reducing overhead and improving performance.

    vm.max_map_count

    4,262,144

    Increases the maximum number of memory map areas a process may have. The default is 65,530, which can be too low for workloads like MariaDB ColumnStore. Raising this prevents mapping errors for processes that need large address spaces.

    kernel.pid_max

    4,194,304

    Defines the maximum process ID value. Older Linux versions defaulted to 32,768; newer versions default to 4,194,304. Raising this ensures support for systems running a very large number of processes concurrently.

    kernel.threads-max

    2,000,000

    Specifies the maximum number of threads allowed on the system. The default varies depending on available RAM. A value of 2 million is suitable for systems with 32–64GB RAM. Increase further if running with more RAM or requiring more threads.

    vm.overcommit_memory

    1

    Disables overcommitting of memory, ensuring sufficient memory is available for MariaDB ColumnStore.

    vm.dirty_background_ratio

    5

    Sets the percentage of dirty memory that can be written back to disk in the background. A lower value reduces the amount of dirty memory, improving performance.

    vm.dirty_ratio

    10

    Sets the percentage of dirty memory that can be written back to disk before the kernel starts to write out clean pages. A lower value reduces the amount of dirty memory, improving performance.

    vm.vfs_cache_pressure

    50

    MariaDB ColumnStore Documentation
    Linux Kernel Documentation
    MCOL-5165: Add optimized Linux kernel parameters for MariaDB ColumnStore

    Sets the pressure level for the kernel's VFS cache. A lower value reduces the amount of memory used by the VFS cache, improving performance.

    ) in this directory will be created and cleaned on an as-needed basis. The entire directory is removed and recreated by
    ExeMgr
    at startup. It is strongly recommended that this directory be stored on a dedicated partition.
    Configure Enterprise ColumnStore

    Mandatory system variables and options for Single-Node Enterprise ColumnStore include:

    Connector
    MariaDB Connector/R2DBC

    Set this system variable to utf8

    Set this system variable to utf8_general_ci

    columnstore_use_import_for_batchinsert

    Set this system variable to ALWAYS to always use cpimport for and statements.

    Example Configuration

    Configure the S3 Storage Manager

    Configure Enterprise ColumnStore S3 Storage Manager to use S3-compatible storage by editing the /etc/columnstore/storagemanager.cnf configuration file:

    The S3-compatible object storage options are configured under [S3]:

    • The bucket option must be set to the name of the bucket that you created in "Create an S3 Bucket".

    • The endpoint option must be set to the endpoint for the S3-compatible object storage.

    • The aws_access_key_id and aws_secret_access_key options must be set to the access key ID and secret access key for the S3-compatible object storage.

    • To use a specific IAM role, you must uncomment and set iam_role_name, sts_region, and sts_endpoint.

    • To use the IAM role assigned to an EC2 instance, you must uncomment ec2_iam_mode=enabled.

    The local cache options are configured under [Cache]:

    • The cache_size option is set to 2 GB by default.

    • The path option is set to /var/lib/columnstore/storagemanager/cache by default.

    Ensure that the specified path has sufficient storage space for the specified cache size.

    Start the Enterprise ColumnStore Services

    Start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:

    Start and enable the MariaDB Enterprise ColumnStore service, so that it starts automatically upon reboot:

    Create the Utility User

    Enterprise ColumnStore requires a mandatory utility user account to perform cross-engine joins and similar operations.

    1. Create the user account with the statement:

    1. Grant the user account SELECT privileges on all databases with the statement:

    1. Configure Enterprise ColumnStore to use the utility user:

    1. Set the password:

    For details about how to encrypt the password, see "Credentials Management for MariaDB Enterprise ColumnStore".

    Passwords should meet your organization's password policies. If your MariaDB Enterprise Server instance has a password validation plugin installed, then the password should also meet the configured requirements.

    Configure Linux Security Modules (LSM)

    The specific steps to configure the security module depend on the operating system.

    Configure SELinux (CentOS, RHEL)

    Configure SELinux for Enterprise ColumnStore:

    1. To configure SELinux, you have to install the packages required for audit2allow. On CentOS 7 and RHEL 7, install the following:

    On RHEL 8, install the following:

    1. Allow the system to run under load for a while to generate SELinux audit events.

    2. After the system has taken some load, generate an SELinux policy from the audit events using audit2allow:

    If no audit events were found, this will print the following:

    1. If audit events were found, the new SELinux policy can be loaded using semodule:

    1. Set SELinux to enforcing mode by setting SELINUX=enforcing in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    1. Set SELinux to enforcing mode:

    Configure AppArmor (Ubuntu)

    For information on how to create a profile, see How to create an AppArmor Profile on ubuntu.com.

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

    This page was step 3 of 5.

    Next: Step 4: Test MariaDB Enterprise ColumnStore.

    Single-Node Enterprise ColumnStore with Object storage
    Optimize Linux Kernel Parameters

    MariaDB Enterprise ColumnStore performs best with Linux kernel optimizations.

    On each server to host an Enterprise ColumnStore node, optimize the kernel:

    1. Set the relevant kernel parameters in a sysctl configuration file. To ensure proper change management, use an Enterprise ColumnStore-specific configuration file. Create a /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf file:

    1. Use the sysctl command to set the kernel parameters at runtime

    Temporarily Configure Linux Security Modules (LSM)

    The Linux Security Modules (LSM) should be temporarily disabled on each Enterprise ColumnStore node during installation.

    The LSM will be configured and re-enabled later in this deployment procedure.

    The steps to disable the LSM depend on the specific LSM used by the operating system.

    CentOS / RHEL Stop SELinux

    SELinux must be set to permissive mode before installing MariaDB Enterprise ColumnStore.

    To set SELinux to permissive mode:

    1. Set SELinux to permissive mode:

    1. Set SELinux to permissive mode by setting SELINUX=permissive in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    1. Confirm that SELinux is in permissive mode:

    SELinux will be configured and re-enabled later in this deployment procedure. This configuration is not persistent. If you restart the server before configuring and re-enabling SELinux later in the deployment procedure, you must reset the enforcement to permissive mode.

    Debian / Ubuntu AppArmor

    AppArmor must be disabled before installing MariaDB Enterprise ColumnStore.

    1. Disable AppArmor:

    1. Reboot the system.

    2. Confirm that no AppArmor profiles are loaded using aa-status:

    AppArmor will be configured and re-enabled later in this deployment procedure.

    Temporarily Configure Firewall for Installation

    MariaDB Enterprise ColumnStore requires the following TCP ports:

    TCP Ports
    Description

    3306

    Port used for MariaDB Client traffic

    8600-8630

    Port range used for inter-node communication

    8640

    Port used by CMAPI

    8700

    Port used for inter-node communication

    8800

    Port used for inter-node communication

    The firewall should be temporarily disabled on each Enterprise ColumnStore node during installation.

    The firewall will be configured and re-enabled later in this deployment procedure.

    The steps to disable the firewall depend on the specific firewall used by the operating system.

    CentOS / RHEL Stop firewalld

    1. Check if the firewalld service is running:

    1. If the firewalld service is running, stop it:

    Firewalld will be configured and re-enabled later in this deployment procedure.

    Ubuntu Stop UFW

    1. Check if the UFW service is running:

    1. If the UFW service is running, stop it:

    UFW will be configured and re-enabled later in this deployment procedure.

    Configure the AWS Security Group

    To install Enterprise ColumnStore on Amazon Web Services (AWS), the security group must be modified prior to installation.

    Enterprise ColumnStore requires all internal communications to be open between Enterprise ColumnStore nodes. Therefore, the security group should allow all protocols and all ports to be open between the Enterprise ColumnStore nodes and the MaxScale proxy.

    Configure Character Encoding

    When using MariaDB Enterprise ColumnStore, it is recommended to set the system's locale to UTF-8.

    1. On RHEL 8, install additional dependencies:

    1. Set the system's locale to en_US.UTF-8 by executing localedef:

    Configure DNS

    MariaDB Enterprise ColumnStore requires all nodes to have host names that are resolvable on all other nodes. If your infrastructure does not configure DNS centrally, you may need to configure static DNS entries in the /etc/hosts file of each server.

    On each Enterprise ColumnStore node, edit the /etc/hosts file to map host names to the IP address of each Enterprise ColumnStore node:

    Replace the IP addresses with the addresses in your own environment.

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 1 of 9.

    Multnode Localstorage
    Directories for Shared Local Storage

    In a ColumnStore Object Storage topology, MariaDB Enterprise ColumnStore requires the Storage Manager directory to be located on shared local storage.

    The Storage Manager directory is at the following path:

    • /var/lib/columnstore/storagemanager

    The N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment. For example, with a 3-node Enterprise ColumnStore deployment, this would refer to the following directories:

    • /var/lib/columnstore/data1

    • /var/lib/columnstore/data2

    • /var/lib/columnstore/data3

    The DB Root directories must be mounted on every ColumnStore node.

    Choose a Shared Local Storage Solution

    Select a Shared Local Storage solution for the Storage Manager directory:

    • EBS (Elastic Block Store) Multi-Attach

    • EFS (Elastic File System)

    • Filestore

    • GlusterFS

    • NFS (Network File System)

    For additional information, see "Shared Local Storage Options".

    Configure EBS Multi-Attach

    EBS is a high-performance block-storage service for AWS (Amazon Web Services). EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.

    For Enterprise ColumnStore deployments in AWS:

    • EBS Multi-Attach is a recommended option for the Storage Manager directory.

    • Amazon S3 storage is the recommended option for data.

    • Consult the vendor documentation for details on how to configure EBS Multi-Attach.

    Configure Elastic File System (EFS)

    EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services)

    For deployments in AWS:

    • EFS is a recommended option for the Storage Manager directory.

    • Amazon S3 storage is the recommended option for data.

    • Consult the vendor documentation for details on how to configure EFS.

    Configure Filestore

    Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).

    For Enterprise ColumnStore deployments in GCP:

    • Filestore is the recommended option for the Storage Manager directory.

    • Google Object Storage (S3-compatible) is the recommended option for data.

    • Consult the vendor documentation for details on how to configure Filestore.

    Configure GlusterFS

    GlusterFS is a distributed file system.

    GlusterFS is a shared local storage option, but it is not one of the recommended options.

    For more information, see "Recommended Storage Options".

    Install GlusterFS

    On each Enterprise ColumnStore node, install GlusterFS.

    Install on CentOS / RHEL 8 (YUM):

    Install on CentOS / RHEL 7 (YUM):

    Install on Debian (APT):

    Install on Ubuntu (APT):

    Start the GlusterFS Daemon

    Start the GlusterFS daemon:

    Probe the GlusterFS Peers

    Before you can create a volume with GlusterFS, you must probe each node from a peer node.

    1. On the primary node, probe all of the other cluster nodes:

    1. On one of the replica nodes, probe the primary node to confirm that it is connected:

    1. On the primary node, check the peer status:

    Number of Peers: 2

    Configure and Mount GlusterFS Volumes

    Create the GlusterFS volumes for MariaDB Enterprise ColumnStore. Each volume must have the same number of replicas as the number of Enterprise ColumnStore nodes.

    1. On each Enterprise ColumnStore node, create the directory for each brick in the /brick directory:

    1. On the primary node, create the GlusterFS volumes:

    1. On the primary node, start the volume:

    1. On each Enterprise ColumnStore node, create mount points for the volumes:

    1. On each Enterprise ColumnStore node, add the mount points to /etc/fstab:

    1. On each Enterprise ColumnStore node, mount the volumes:

    Configure Network File System (NFS)

    NFS is a distributed file system. NFS is available in most Linux distributions. If NFS is used for an Enterprise ColumnStore deployment, the storage must be mounted with the sync option to ensure that each node flushes its changes immediately.

    For on-premises deployments:

    • NFS is the recommended option for the Storage Manager directory.

    • Any S3-compatible storage is the recommended option for data.

    Consult the documentation for your NFS implementation for details on how to configure NFS.

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 2 of 9.

    Next: Step 3: Install MariaDB Enterprise Server.

    MariaDB Enterprise ColumnStore supports multiple storage options.

    This page discusses how to backup and restore Enterprise ColumnStore when it uses S3-compatible object storage for data and shared local storage (such as NFS) for the Storage Manager directory.

    Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata become corrupt, Enterprise ColumnStore could become unusable, resulting in data loss.

    If Enterprise ColumnStore is your system of record, it should be backed up regularly.

    If Enterprise ColumnStore uses S3-compatible object storage for data and shared local storage for the Storage Manager directory, the following items must be backed up:

    • The MariaDB Data directory is backed up using .

    • The S3 bucket must be backed up using the vendor's snapshot procedure.

    • The Storage Manager directory must be backed up.

    See the instructions below for more details.

    Backup

    Enterprise-ColumnStore-Backup-with-S3-Flowchart

    Use the following process to take a backup:

    1. Determine which node is the primary server using curl to send the status command to the CMAPI Server:

    The output will show "dbrm_mode": "master" for the primary server:

    1. Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:

    1. Lock the database with the statement:

    Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.

    1. Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.

    For example, to make a copy of the directory with rsync:

    1. Use to backup the MariaDB data directory:

    1. Use to prepare the backup:

    1. Create a snapshot of the S3-compatible storage. Consult the storage vendor's manual for details on how to do this.

    2. Ensure that all previous operations are complete.

    3. In the original client connection to the primary server, unlock the database with the statement:

    Restore

    Use the following process to restore a backup:

    1. Deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.

    2. Ensure that all services are stopped on each node:

    1. Restore the backup of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.

    For example, to restore the backup with rsync:

    1. Use to restore the backup of the MariaDB data directory:

    1. Restore the snapshot of your S3-compatible storage to the new S3 bucket that you plan to use. Consult the storage vendor's manual for details on how to do this.

    2. Update storagemanager.cnf to configure Enterprise ColumnStore to use the S3 bucket. By default, it is located at /etc/columnstore/storagemanager.cnf.

    For example:

    • The default local cache size is 2 GB.

    • The default local cache path is /var/lib/columnstore/storagemanager/cache.

    • Ensure that the local cache path has sufficient store space to store the local cache.

    • The bucket option must be set to the name of the bucket that you created from your snapshot in the previous step.

    • To use an IAM role, you must also uncomment and set iam_role_name, sts_region, and sts_endpoint.

    1. Start the services on each node:

    S3-compatible object storage
    Storage Manager directory
    Optimize Linux Kernel Parameters

    MariaDB Enterprise ColumnStore performs best with Linux kernel optimizations.

    On each server to host an Enterprise ColumnStore node, optimize the kernel:

    1. Set the relevant kernel parameters in a sysctl configuration file. To ensure proper change management, use an Enterprise ColumnStore-specific configuration file.

    Create a /etc/sysctl.d/90-mariadb-enterprise-columnstore.conf file:

    1. Use the sysctl command to set the kernel parameters at runtime

    Temporarily Configure Linux Security Modules (LSM)

    The Linux Security Modules (LSM) should be temporarily disabled on each Enterprise ColumnStore node during installation.

    The LSM will be configured and re-enabled later in this deployment procedure.

    The steps to disable the LSM depend on the specific LSM used by the operating system.

    CentOS / RHEL Stop SELinux

    SELinux must be set to permissive mode before installing MariaDB Enterprise ColumnStore.

    To set SELinux to permissive mode:

    1. Set SELinux to permissive mode:

    1. Set SELinux to permissive mode by setting SELINUX=permissive in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    1. Confirm that SELinux is in permissive mode:

    SELinux will be configured and re-enabled later in this deployment procedure. This configuration is not persistent. If you restart the server before configuring and re-enabling SELinux later in the deployment procedure, you must reset the enforcement to permissive mode.

    Debian / Ubuntu AppArmor

    AppArmor must be disabled before installing MariaDB Enterprise ColumnStore.

    1. Disable AppArmor:

    1. Reboot the system.

    2. Confirm that no AppArmor profiles are loaded using aa-status:

    AppArmor will be configured and re-enabled later in this deployment procedure.

    Temporarily Configure Firewall for Installation

    MariaDB Enterprise ColumnStore requires the following TCP ports:

    TCP Ports
    Description

    3306

    Port used for MariaDB Client traffic

    8600-8630

    Port range used for inter-node communication

    8640

    Port used by CMAPI

    8700

    Port used for inter-node communication

    8800

    Port used for inter-node communication

    The firewall should be temporarily disabled on each Enterprise ColumnStore node during installation.

    The firewall will be configured and re-enabled later in this deployment procedure.

    The steps to disable the firewall depend on the specific firewall used by the operating system.

    CentOS / RHEL Stop firewalld

    1. Check if the firewalld service is running:

    1. If the firewalld service is running, stop it:

    Firewalld will be configured and re-enabled later in this deployment procedure.

    Ubuntu Stop UFW

    1. Check if the UFW service is running:

    1. If the UFW service is running, stop it:

    UFW will be configured and re-enabled later in this deployment procedure.

    Configure the AWS Security Group

    To install Enterprise ColumnStore on Amazon Web Services (AWS), the security group must be modified prior to installation.

    Enterprise ColumnStore requires all internal communications to be open between Enterprise ColumnStore nodes. Therefore, the security group should allow all protocols and all ports to be open between the Enterprise ColumnStore nodes and the MaxScale proxy.

    Configure Character Encoding

    When using MariaDB Enterprise ColumnStore, it is recommended to set the system's locale to UTF-8.

    1. On RHEL 8, install additional dependencies:

    1. Set the system's locale to en_US.UTF-8 by executing localedef:

    Configure DNS

    MariaDB Enterprise ColumnStore requires all nodes to have host names that are resolvable on all other nodes. If your infrastructure does not configure DNS centrally, you may need to configure static DNS entries in the /etc/hosts file of each server.

    On each Enterprise ColumnStore node, edit the /etc/hosts file to map host names to the IP address of each Enterprise ColumnStore node:

    Replace the IP addresses with the addresses in your own environment.

    Create an S3 Bucket

    With the ColumnStore Object Storage topology, it is important to create the S3 bucket before you start ColumnStore. All Enterprise ColumnStore nodes access data from the same bucket.

    If you already have an S3 bucket, confirm that the bucket is empty.

    S3 bucket configuration will be performed later in this procedure.

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 1 of 9.

    Next: Step 2: Configure Shared Local Storage.

    MariaDB Enterprise ColumnStore supports multiple storage options.

    This page discusses how to backup and restore Enterprise ColumnStore when it uses shared local storage (such as NFS) for the DB Root directories.

    Any file can become corrupt due to hardware issues, crashes, power loss, and other reasons. If the Enterprise ColumnStore data or metadata become corrupt, Enterprise ColumnStore could become unusable, resulting in data loss.

    If Enterprise ColumnStore is your system of record, it should be backed up regularly.

    If Enterprise ColumnStore uses shared local storage for the DB Root directories, the following items must be backed up:

    • The MariaDB Data directory is backed up using

    • The Storage Manager directory must be backed up

    • Each DB Root directories must be backed up

    See the instructions below for more details.

    Backup

    Use the following process to take a backup:

    1. Determine which node is the primary server using curl to send the status command to the CMAPI Server:

    The output will show dbrm_mode: master for the primary server:

    1. Connect to the primary server using MariaDB Client as a user account that has privileges to lock the database:

    1. Lock the database with the statement:

    Ensure that the client remains connected to the primary server, so that the lock is held for the remaining steps.

    1. Make a copy or snapshot of the Storage Manager directory. By default, it is located at /var/lib/columnstore/storagemanager.

    For example, to make a copy of the directory with rsync:

    1. Make a copy or snapshot of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where the N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment.

    For example, to make a copy of the directories with rsync in a 3-node deployment:

    1. Use to backup the Storage Manager directory:

    1. Use to prepare the backup:

    1. Ensure that all previous operations are complete.

    2. In the original client connection to the primary server, unlock the database with the statement:

    Restore

    Use the following process to restore a backup:

    1. Deploy Enterprise ColumnStore, so that you can restore the backup to an empty deployment.

    2. Ensure that all services are stopped on each node:

    1. Restore the backup of the Storage Manager director. By default, it is located at /var/lib/columnstore/storagemanager.

    For example, to restore the backup with rsync:

    1. Restore the backup of the DB Root directories. By default, they are located at /var/lib/columnstore/dataN, where the N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment.

    For example, to restore the backup with rsync in a 3-node deployment:

    1. Use to restore the backup of the MariaDB data directory:

    1. Start the services on each node:

    cpimport

    MariaDB Enterprise ColumnStore includes a bulk data loading tool called cpimport, which provides several benefits:

    • Bypasses the SQL layer to decrease overhead

    • Does not block read queries

    • Requires a write metadata lock on the table, which can be monitored with the

    • Appends the new data to the table. While the bulk load is in progress, the newly appended data is temporarily hidden from queries. After the bulk load is complete, the newly appended data is visible to queries.

    • Inserts each row in the order the rows are read from the source file. Users can optimize data loads for Enterprise ColumnStore's automatic partitioning by loading presorted data files. For additional information, see "".

    • Supports parallel distributed bulk loads

    • Imports data from text files

    • Imports data from binary files

    • Imports data from standard input (stdin)

    Batch Insert Mode

    MariaDB Enterprise ColumnStore enables batch insert mode by default.

    When batch insert mode is enabled, MariaDB Enterprise ColumnStore has special handling for the following statements:

    • [[|load-data-infileLOAD DATA [ LOCAL ] INFILE]]

    Enterprise ColumnStore uses the following rules:

    • If the statement is executed outside of a transaction, Enterprise ColumnStore loads the data using cpimport, which is a command-line utility that is designed to efficiently load data in bulk. It executes cpimport using a wrapper called cpimport.bin.

    • If the statement is executed inside of a transaction, Enterprise ColumnStore loads the data using the DML interface, which is slower.

    Batch insert mode can be disabled by setting the columnstore_use_import_for_batchinsert system variable to OFF. When batch insert mode is disabled, Enterprise ColumnStore executes the statements using the DML interface, which is slower.

    Locking

    MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.

    When a bulk data load is running:

    • Read queries will not be blocked.

    • Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.

    • The write metadata lock (MDL) can be monitored with the .

    Choose a Data Load Method

    Performance
    Method
    Interface
    Format(s)
    Location(s)
    Benefits

    Fastest

    Shell

    • Text file. • Binary file • Standard input (stdin)

    • Server file system

    Lowest latency. • Bypasses SQL layer. • Non-blocking

    Fast

    columnstore_info.load_from_s3

    SQL

    this blog post
    this page
    this page

    MariaDB Replication

    • Highly available

    • Asynchronous or semi-synchronous replication

    • Automatic failover via MaxScale

    • Manual provisioning of new nodes from backup

    • Scales read via MaxScale.

    • Enterprise Server 10.3+, MaxScale 2.5+

    Galera Cluster Topology Multi-Primary Cluster Powered by Galera for Transactional/OLTP Workloads

    • InnoDB Storage Engine

    • Highly available

    • Virtually synchronous, certification-based replication

    • Automated provisioning of new nodes (IST/SST)

    • Scales reads via MaxScale Enterprise Server 10.3+, MariaDB Enterprise Cluster (powered by Galera), MaxScale 2.5+

    Columnar storage engine with shared local storage

    • Highly available

    • Automatic failover via MaxScale and CMAPI

    • Scales reads via MaxScale

    • Bulk data import

    • Enterprise Server, Enterprise ColumnStore, MaxScale

    • Optional

    Columnar storage engine with S3-compatible object storage

    • Highly available

    • Automatic failover via MaxScale and CMAPI

    • Scales reads via MaxScale

    • Bulk data import

    • Enterprise Server, Enterprise ColumnStore, MaxScale

    • Single-stack hybrid transactional/analytical workloads

    • ColumnStore for analytics with scalable S3-compatible object storage

    • InnoDB for transactions• Cross-engine JOINs

    • Enterprise Server, Enterprise ColumnStore, MaxScale

    Configure Disk-Based Aggregations
    ColumnStore's User-Defined Aggregate Function (UDAF) C++ API
    Configure In-Memory Joins
    Configure Disk-Based Joins
    Optimizer Statistics
    Extent elimination
    extent map

    Single-Node Localstorage

    This guide provides steps for deploying a single-node ColumnStore, setting up the environment, installing the software, and bulk importing data for online analytical processing (OLAP) workloads.

    This procedure describes the deployment of the Single-Node Enterprise ColumnStore topology with Local storage.

    MariaDB Enterprise ColumnStore 23.10 is a columnar storage engine for MariaDB Enterprise Server 10.6. Enterprise ColumnStore is best suited for Online Analytical Processing (OLAP) workloads.

    This procedure has 5 steps, which are executed in sequence.

    This page provides an overview of the topology, requirements, and deployment procedures.

    Please read and understand this procedure before executing.

    Procedure Steps

    Step
    Description

    Support

    Customers can obtain support by .

    Components

    The following components are deployed during this procedure:

    Component
    Function

    MariaDB Enterprise Server Components

    Component
    Description

    Topology

    The Single-Node Enterprise ColumnStore topology provides support for Online Analytical Processing (OLAP) workloads to MariaDB Enterprise Server.

    The Enterprise ColumnStore node:

    • Receives queries from the application

    • Executes queries

    • Uses the local disk for storage.

    High Availability

    Single-Node Enterprise ColumnStore does not provide high availability (HA) for Online Analytical Processing (OLAP). If you would like to deploy Enterprise ColumnStore with high availability, see .

    Requirements

    These requirements are for the Single-Node Enterprise ColumnStore, when deployed with MariaDB Enterprise Server 10.6 and MariaDB Enterprise ColumnStore 23.10.

    Operating System

    • Debian 11 (x86_64, ARM64)

    • Debian 12 (x86_64, ARM64)

    • Red Hat Enterprise Linux 8 (x86_64, ARM64)

    • Red Hat Enterprise Linux 9 (x86_64, ARM64)

    Minimum Hardware Requirements

    MariaDB Enterprise ColumnStore's minimum hardware requirements are not intended for production environments, but the minimum hardware requirements can be appropriate for development and test environments. For production environments, see the instead.

    The minimum hardware requirements are:

    Component
    CPU
    Memory

    MariaDB Enterprise ColumnStore will refuse to start if the system has less than 3 GB of memory.

    If Enterprise ColumnStore is started on a system with less memory, the following error message will be written to the ColumnStore system log called crit.log:

    And the following error message will be raised to the client:

    Recommended Hardware Requirements

    MariaDB Enterprise ColumnStore's recommended hardware requirements are intended for production analytics.

    The recommended hardware requirements are:

    Component
    CPU
    Memory

    Quick Reference

    MariaDB Enterprise Server Configuration Management

    Method
    Description

    MariaDB Enterprise Server packages are configured to read configuration files from different paths, depending on the operating system. Making custom changes to Enterprise Server default configuration files is not recommended because custom changes may be overwritten by other default configuration files that are loaded later.

    To ensure that your custom changes will be read last, create a custom configuration file with the z- prefix in one of the include directories.

    Distribution
    Example Configuration File Path

    MariaDB Enterprise Server Service Management

    The systemctl command is used to start and stop the MariaDB Enterprise Server service.

    Operation
    Command

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Local storage deployment procedure:

    • Next: Step 1: Install MariaDB Enterprise ColumnStore 23.10.

    Performance Concepts

    Introduction

    The high level components of the ColumnStore architecture are:

    • PrimProc: PrimProc (Primitives Processor) is responsible for parsing the SQL requests into an optimized set of primitive job steps executed by one or more servers. PrimProc is thus responsible for query optimization and orchestration of query execution by the servers. While every instance has their own PrimProc in a multi-server deployment, each query begins and ends on the same PrimProc it originated from. A database load balancer such as MariaDB MaxScale can be deployed to appropriately balance external requests against individual servers. PrimProc also executes granular job steps received from the server (mariadbd) in a multi-threaded manner. ColumnStore allows distribution of the work across many servers.

    • Extent Maps: ColumnStore maintains metadata about each column in a shared distributed object known as the Extent Map. The primary node references the Extent Map to help assist in generating the correct primitive job steps. The primary node server references the Extent Map to identify the correct disk blocks to read. Each column is made up of one or more files and each file can contain multiple extents. As much as possible the system attempts to allocate contiguous physical storage to improve read performance.

    • Storage: ColumnStore can use either local storage or shared storage (e.g. SAN or EBS) to store data. Using shared storage allows for data processing to fail over to another node automatically in case of a server failing.

    Data Loading

    The system supports full MVCC ACID transactional logic via Insert, Update, and Delete statements. The MVCC architecture allows for concurrent query and DML / batch load. Although DML is supported, the system is optimized more for batch inserts and so larger data loads should be achieved through a batch load. The most flexible and optimal way to load data is via the cpimport tool. This tool optimizes the load path and can be run centrally or in parallel on each server.

    If the data contains a time or (time correlated ascending value) column then significant performance gains will be achieved if the data is sorted by this field and also typically queried with a where clause on that column. This is because the system records a minimum and maximum value for each extent providing for a system maintained range partitioning scheme. This allows the system to completely eliminate scanning an extent map if the query includes a where clause for that field limiting the results to a subset of extent maps.

    Query Execution

    MariaDB ColumnStore has its own query optimizer and execution engine distinct from the MariaDB server implementation. This allows for scaling out query execution to multiple servers, and to optimize for handling data stored as columns rather than rows. As such, the factors influencing query performance are very different:

    A query is first parsed by the MariaDB server (mariadbd) process and passed through to the ColumnStore storage engine. This passes the request onto the PrimProc process which is responsible for optimizing and orchestrating execution of the query. The PrimProc module's optimizer creates a series of batch primitive steps that are executed on all nodes in the cluster. Since multiple servers can be deployed, this allows for scale-out execution of the queries. The optimizer attempts to process query execution in parallel. However, certain operations inherently must be executed centrally, for example final result ordering. Filtering, joins, aggregates, and GROUP BY clauses are general.y pushed down and executed in parallel in PrimProc on all servers. In PrimProc, batch primitive steps are performed at a granular level where individual threads operate on individual 1K-8K blocks within an extent. This enables a larger multi-core server to be fully consumed and scale out within a single server. The current batch primitive steps available in the system include:

    • Single Column Scan: Scan one or more Extents for a given column based on a single column predicate, including operators like =, <>, IN (list), BETWEEN, and ISNULL. See the first scan section of for additional details on tuning this.

    • Additional Single Column Filters: Project additional columns for any rows found by a previous scan and apply additional single column predicates as needed. Access of blocks is based on row identifier, going directly to the blocks. See the additional column read section of for additional details on tuning this.

    ColumnStore Query Execution Paradigms

    The following items should be considered when thinking about query execution in ColumnStore vs a row based store such as InnoDB.

    Data Scanning and Filtering

    ColumnStore is optimized for large scale aggregation / OLAP queries over large data sets. As such indexes typically used to optimize query access for row based systems do not make sense since selectivity is low for such queries. Instead ColumnStore gains performance by only scanning necessary columns, utilizing system maintained partitioning, and utilizing multiple threads and servers to scale query response time.

    Since ColumnStore only reads the necessary columns to resolve a query, only include the necessary columns required. For example, SELECT * is significantly slower than SELECT col1, col2 FROM tbl.

    Datatype size is important. If say you have a column that can only have values 0 through 100 then declare this as a tinyint as this will be represented with 1 byte rather than 4 bytes for int. This reduces the I/O cost by 4 times.

    For string types, an important threshold is CHAR(9) and VARCHAR(8) or greater. Each column storage file uses a fixed number of bytes per value. This enables fast positional lookup of other columns to form the row. Currently the upper limit for columnar data storage is 8 bytes. So. for strings longer than this, the system maintains an additional 'dictionary' extent where the values are stored. The columnar extent file then stores a pointer into the dictionary. For example, it is more expensive to read and process a VARCHAR(8) column than a CHAR(8) column. Where possible, you get better performance if you can utilize shorter strings, especially if you avoid the dictionary lookup. All TEXT/BLOB data types in ColumnStore 1.1 onward utilize a dictionary and do a multiple-block 8KB lookup to retrieve that data if required. The longer the data, the more blocks are retrieved, and the greater is a potential performance impact.

    In a row-based system, adding redundant columns adds to the overall query cost, but in a columnar system a cost is only occurred if the column is referenced. Therefore, additional columns should be created to support different access paths. For instance, store a leading portion of a field in one column to allow for faster lookups, but additionally store the long-form value as another column. Scans on a shorter code or a leading-portion column are faster.

    ColumnStore distributes function application across all nodes for greater performance, but this requires a distributed implementation of the function in addition to the MariaDB server implementation. See for the full list.

    Its important to note that ColumnStore does not have a cost based optimizer, so for optimal extent elimination and performance, your first WHERE clause predicate order should be based on the same column order that the data is imported by. Example: Most use cases with a date column benefit from a natural sort. (Today's data are being inserted after yesterday's data.) Having the first column to filter by date helps efficiently filter through records. WHERE DATE='x' outperforms a query based on a column with random values as the first predicate. Compare different query plans using calSetTrace and calGetTrace. Optimizing for the lowest PIO/LIO and highest PBE. See also .

    Joins

    Hash joins are utilized by ColumnStore to optimize for large scale joins and avoid the need for indexes and the overhead of nested loop processing. ColumnStore maintains table statistics so as to determine the optimal join order. This is implemented by first identifying the small table side (based on extent map data) and materializing the necessary rows from that table for the join. If the size of this is less than the configuration setting PmMaxMemorySmallSide, the join is pushed down into PrimProc for distributed in-memory processing. Otherwise, the larger side rows is not processed in a distributed manner for joining, and only the WHERE clause on that side is executed across all PrimProc modules in the cluster. If the join is too large for memory, disk-based join can be enabled to allow the query to complete.

    Aggregations

    Similarly to scalar functions ColumnStore distributes aggregate evaluation as much as possible. However some post processing is required to combine the final results. Enough memory must exist to handle queries with a very large number of values in the aggregate columns.

    Aggregation performance is also influenced by the number of distinct aggregate column values. Generally, the same number of rows with 100 distinct values computes faster than 10000 distinct values. This is due to increased memory management as well as transfer overhead.

    SELECT COUNT() is internally optimized to be SELECT COUNT(COL-N), where COL-N is the column that uses the least number of bytes for storage. For example it would be pick a CHAR(1) column over int column because CHAR(1) uses 1 byte for storage and int uses 4 bytes. The implementation still honors ANSI semantics in that SELECT COUNT() will include nulls in the total count as opposed to an explicit SELECT(COL-N) which excludes NULL values in the count.

    ORDER BY and LIMIT

    ORDER BY and LIMIT are implemented at the very end by the mariadbd server process on the temporary result set table. This means that the unsorted results must be fully retrieved before either are applied. The performance overhead of this is minimal on small to medium results, but for larger results, it can be significant.

    Complex Queries

    Subqueries are executed in sequence thus the subquery intermediate results must be materialized and then the join logic applies with the outer query.

    Window functions are executed as part of final aggregation in PrimProc due to the need for ordering of the window results. The ColumnStore window function engines uses a dedicated faster sort process.

    Partitioning

    Automated system partitioning of columns is provided by ColumnStore. As data is loaded into extent maps, the system will capture and maintain min/max values of column data in that extent map. New rows are appended to each extent map until full at which point a new extent map is created. For column values that are ordered or semi-ordered this allows for very effective data partitioning. By using the min and max values, entire extent maps can be eliminated and not read to filter data. This generally works particularly well for time dimension / series data or similar values that increase over time.

    Removing a Node

    To remove a node from Enterprise ColumnStore, perform the following procedure.

    Unlinking from Service in MaxScale

    The server object for the node must be unlinked from the service using :

    • Unlink the server object from the service using the unlink service command.

    • As the first argument, provide the name of the service.

    • As the second argument, provide the name of the server.

    Checking the Service in MaxScale

    To confirm that the server object was properly unlinked from the service, the service should be checked using :

    • Show the services using the show services command, like this:

    Unlinking from Monitor in MaxScale

    The server object for the node must be unlinked from the monitor using :

    • Unlink a server object from the monitor using the unlink monitor command.

    • As the first argument, provide the name of the monitor.

    • As the second argument, provide the name of the server.

    Checking the Monitor in MaxScale

    To confirm that the server object was properly unlinked from the monitor, the monitor should be checked using :

    • Show the monitors using the show monitors command, like this:

    Removing the Server from MaxScale

    The server object for the node must also be removed from MaxScale using :

    • Use or another supported REST client.

    • Remove the server object using the destroy server command.

    • As the first argument, provide the name for the server.

    For example:

    Checking the Server in MaxScale

    To confirm that the server object was properly removed, the server objects should be checked using :

    • Show the server objects using the show servers command, like this:

    Stopping the Enterprise ColumnStore Services

    The Enterprise Server. Enterprise ColumnStore, and CMAPI services can be stopped using the systemctl command.

    Perform the following procedure on the node:

    1. Stop the MariaDB Enterprise Server service:

    2. Stop the MariaDB Enterprise ColumnStore service:

    3. Stop the CMAPI service:

    Removing the Node from Enterprise ColumnStore

    The node must be removed from Enterprise ColumnStore using :

    • Remove the node using the endpoint path.

    • Use a , such as curl .

    • Format the JSON output using jq for enhanced readability.

    For example, if the primary node's host name is mcs1 and the IP address for the node to remove is 192.0.2.3:

    • In ES 10.5.10-7 and later:

    • In ES 10.5.9-6 and earlier:

    Example output:

    Checking the Enterprise ColumnStore Status

    To confirm that the node was properly removed, the status of Enterprise ColumnStore should be checked using :

    • Check the status using the endpoint path.

    For example, if the primary node's host name is mcs1:

    Example output:

    Step 5: Test MariaDB Enterprise Serverd

    Step 5: Test MariaDB Enterprise Server

    Overview

    This page details step 5 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step tests MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 7: Start and Configure MariaDB MaxScale

    Step 7: Start and Configure MariaDB MaxScale

    Overview

    This page details step 7 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step starts and configures MariaDB MaxScale 22.08.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 7: Start and Configure MariaDB MaxScale

    Step 7: Start and Configure MariaDB MaxScale

    Overview

    This page details step 7 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step starts and configures MariaDB MaxScale 22.08.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    ColumnStore and Recursive CTE Limitations

    The ColumnStore engine does not fully support recursive Common Table Expressions (CTEs). Attempting to use recursive CTEs directly against ColumnStore tables typically results in an error.

    The purpose of the following examples is to demonstrate three potential workarounds for this issue. The best fit for your organization will depend on your specific needs and ability to refactor queries and adjust your approach.

    Setup: Simulating an Org Chart

    It simulates a simple organizational chart with employees and managers to illustrate the problem and the workarounds.

    First, an InnoDB table for comparison:

    ColumnStore Streaming Data Adapters

    The enables the creation of higher performance adapters for ETL integration and data ingestions. The Streaming Data Adapters are out of box adapters using these API for specific data sources and use cases.

    • MaxScale CDC Data Adapter is integration of the MaxScale CDC streams into MariaDB ColumnStore.

    • Kafka Data Adapter is integration of the Kafka streams into MariaDB ColumnStore.

    Query Accelerator

    MariaDB Query Accelerator is an Alpha release. Do not use it in production environments. Query Accelerator works only in ColumnStore 25.10.0 and with MariaDB Enterprise Server 11.8.3+.

    What is Query Accelerator

    Query Accelerator allows MariaDB to use ColumnStore to execute queries that are otherwise executed by InnoDB. Under the hood, Columnstore:

    $ sudo yum install --enablerepo=PowerTools glusterfs-server
    $ sudo yum install centos-release-gluster
    $ sudo yum install glusterfs-server
    $ wget -O - https://download.gluster.org/pub/gluster/glusterfs/LATEST/rsa.pub | apt-key add -
    
    $ DEBID=$(grep 'VERSION_ID=' /etc/os-release | cut -d '=' -f 2 | tr -d '"')
    $ DEBVER=$(grep 'VERSION=' /etc/os-release | grep -Eo '[a-z]+')
    $ DEBARCH=$(dpkg --print-architecture)
    $ echo deb https://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/${DEBID}/${DEBARCH}/apt ${DEBVER} main > /etc/apt/sources.list.d/gluster.list
    $ sudo apt update
    $ sudo apt install glusterfs-server
    $ sudo apt update
    $ sudo apt install glusterfs-server
    $ sudo systemctl start glusterd
    $ sudo systemctl enable glusterd
    $ sudo gluster peer probe mcs2
    $ sudo gluster peer probe mcs3
    $ sudo gluster peer probe mcs1
    peer probe: Host mcs1 port 24007 already in peer list
    $ sudo gluster peer status
    Number of Peers: 2
    
    Hostname: mcs2
    Uuid: 3c8a5c79-22de-45df-9034-8ae624b7b23e
    State: Peer in Cluster (Connected)
    
    Hostname: mcs3
    Uuid: 862af7b2-bb5e-4b1c-8311-630fa32ed451
    State: Peer in Cluster (Connected)
    $ sudo mkdir -p /brick/storagemanager
    $ sudo gluster volume create storagemanager \
          replica 3 \
          mcs1:/brick/storagemanager \
          mcs2:/brick/storagemanager \
          mcs3:/brick/storagemanager \
          force
    $ sudo gluster volume start storagemanager
    $ sudo mkdir -p /var/lib/columnstore/storagemanager
    127.0.0.1:storagemanager /var/lib/columnstore/storagemanager glusterfs defaults,_netdev 0 0
    $ sudo mount -a
    CREATE DATABASE columnstore_db;
    
    CREATE TABLE columnstore_db.analytics_test (
       id INT,
       str VARCHAR(50)
    ) ENGINE = ColumnStore;
    [mariadb]
    log_error                              = mariadbd.err
    character_set_server                   = utf8
    collation_server                       = utf8_general_ci
    log_bin                                = mariadb-bin
    log_bin_index                          = mariadb-bin.index
    relay_log                              = mariadb-relay
    relay_log_index                        = mariadb-relay.index
    log_slave_updates                      = ON
    gtid_strict_mode                       = ON
    
    # This must be unique on each cluster node
    server_id                              = 1
    sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
    sudo mcsSetConfig CrossEngineSupport Port 3306
    sudo mcsSetConfig CrossEngineSupport User cross_engine
    sudo mcsSetConfig CrossEngineSupport Password cross_engine_passwd
    vm.overcommit_memory=1 
    vm.dirty_background_ratio=5 
    vm.dirty_ratio=10 
    vm.vfs_cache_pressure=50 
    net.core.netdev_max_backlog=2500 
    net.core.rmem_max=16777216 
    net.core.wmem_max=16777216 
    net.ipv4.tcp_max_syn_backlog=8192 
    net.ipv4.tcp_timestamps=0
    sudo sysctl -p
    cat /proc/sys/kernel/threads-max
    cat /proc/sys/kernel/pid_max
    cat /proc/sys/vm/max_map_count
    
    
    # Rhel /etc/sysctl.conf
    sudo echo "vm.max_map_count=4262144" >> /etc/sysctl.conf
    sudo echo "kernel.pid_max = 4194304" >> /etc/sysctl.conf
    sudo echo "kernel.threads-max = 2000000" >> /etc/sysctl.conf
    
    # There may be a file called 50-pid-max.conf or perhaps something similar. If so, modify it 
    sudo echo "vm.max_map_count=4262144" > /usr/lib/sysctl.d/50-max_map_count.conf
    sudo echo "kernel.pid_max = 4194304" > /usr/lib/sysctl.d/50-pid-max.conf
    sudo sysctl -p
    SELECT calgetsqlcount();
    SELECT /*! INFINIDB_ORDERED */ r_regionkey     
    FROM region r, customer c, nation n    
    WHERE r.r_regionkey = n.n_regionkey      
    AND n.n_nationkey = c.c_nationkey
    [mariadb]
    log_error                              = mariadbd.err
    character_set_server                   = utf8
    collation_server                       = utf8_general_ci
    [ObjectStorage]
    …
    service = S3
    …
    [S3]
    bucket                = your_columnstore_bucket_name
    endpoint              = your_s3_endpoint
    aws_access_key_id     = your_s3_access_key_id
    aws_secret_access_key = your_s3_secret_key
    # iam_role_name       = your_iam_role
    # sts_region          = your_sts_region
    # sts_endpoint        = your_sts_endpoint
    # ec2_iam_mode        = enabled
    
    [Cache]
    cache_size = your_local_cache_size
    path       = your_local_cache_path
    $ sudo systemctl start mariadb
    
    $ sudo systemctl enable mariadb
    $ sudo systemctl start mariadb-columnstore
    
    $ sudo systemctl enable mariadb-columnstore
    CREATE USER 'util_user'@'127.0.0.1'
    IDENTIFIED BY 'util_user_passwd';
    GRANT SELECT, PROCESS ON *.*
    TO 'util_user'@'127.0.0.1';
    $ sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
    
    $ sudo mcsSetConfig CrossEngineSupport Port 3306
    
    $ sudo mcsSetConfig CrossEngineSupport User util_user
    $ sudo mcsSetConfig CrossEngineSupport Password util_user_passwd
    $ sudo yum install policycoreutils policycoreutils-python
    $ sudo yum install policycoreutils python3-policycoreutils policycoreutils-python-utils
    $ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
    $ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
    
    Nothing to do
    $ sudo semodule -i mariadb_local.pp
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=enforcing
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    $ sudo setenforce enforcing
    # minimize swapping
    vm.swappiness = 1
    
    # Increase the TCP max buffer size
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    
    # Increase the TCP buffer limits
    # min, default, and max number of bytes to use
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    
    # don't cache ssthresh from previous connection
    net.ipv4.tcp_no_metrics_save = 1
    
    # for 1 GigE, increase this to 2500
    # for 10 GigE, increase this to 30000
    net.core.netdev_max_backlog = 2500
    $ sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
    $ sudo setenforce permissive
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=permissive
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    $ sudo getenforce
    Permissive
    $ sudo systemctl disable apparmor
    $ sudo aa-status
    apparmor module is loaded.
    0 profiles are loaded.
    0 profiles are in enforce mode.
    0 profiles are in complain mode.
    0 processes have profiles defined.
    0 processes are in enforce mode.
    0 processes are in complain mode.
    0 processes are unconfined but have a profile defined.
    $ sudo systemctl status firewalld
    $ sudo systemctl stop firewalld
    $ sudo ufw status verbose
    $ sudo ufw disable
    $ sudo yum install glibc-locale-source glibc-langpack-en
    $ sudo localedef -i en_US -f UTF-8 en_US.UTF-8
    192.0.2.1     mcs1
    192.0.2.2     mcs2
    192.0.2.3     mcs3
    192.0.2.100   mxs1
    $ sudo yum install --enablerepo=PowerTools glusterfs-server
    $ sudo yum install centos-release-gluster
    $ sudo yum install glusterfs-server
    $ wget -O - https://download.gluster.org/pub/gluster/glusterfs/LATEST/rsa.pub | apt-key add -
    
    $ DEBID=$(grep 'VERSION_ID=' /etc/os-release | cut -d '=' -f 2 | tr -d '"')
    $ DEBVER=$(grep 'VERSION=' /etc/os-release | grep -Eo '[a-z]+')
    $ DEBARCH=$(dpkg --print-architecture)
    $ echo deb https://download.gluster.org/pub/gluster/glusterfs/LATEST/Debian/${DEBID}/${DEBARCH}/apt ${DEBVER} main > /etc/apt/sources.list.d/gluster.list
    $ sudo apt update
    $ sudo apt install glusterfs-server
    $ sudo apt update
    $ sudo apt install glusterfs-server
    $ sudo systemctl start glusterd
    $ sudo systemctl enable glusterd
    $ sudo gluster peer probe mcs2
    $ sudo gluster peer probe mcs3
    $ sudo gluster peer probe mcs1
    
    
    peer probe: Host mcs1 port 24007 already in peer list
    $ sudo gluster peer status
    Hostname: mcs2
    Uuid: 3c8a5c79-22de-45df-9034-8ae624b7b23e
    State: Peer in Cluster (Connected)
    
    Hostname: mcs3
    Uuid: 862af7b2-bb5e-4b1c-8311-630fa32ed451
    State: Peer in Cluster (Connected)
    $ sudo mkdir -p /brick/storagemanager
    $ sudo gluster volume create storagemanager \
          replica 3 \
          mcs1:/brick/storagemanager \
          mcs2:/brick/storagemanager \
          mcs3:/brick/storagemanager \
          force
    $ sudo gluster volume start storagemanager
    $ sudo mkdir -p /var/lib/columnstore/storagemanager
    127.0.0.1:storagemanager /var/lib/columnstore/storagemanager glusterfs defaults,_netdev 0 0
    $ sudo mount -a
    $ curl -k -s https://mcs1:8640/cmapi/0.4.0/mcs cluster status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       | jq .
    {
      "timestamp": "2020-12-15 00:40:34.353574",
      "192.0.2.1": {
        "timestamp": "2020-12-15 00:40:34.362374",
        "uptime": 11467,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [
          "1"
        ],
        "module_id": 1,
        "services": [
          {
            "name": "workernode",
            "pid": 19202
          },
          {
            "name": "controllernode",
            "pid": 19232
          },
          {
            "name": "PrimProc",
            "pid": 19254
          },
          {
            "name": "ExeMgr",
            "pid": 19292
          },
          {
            "name": "WriteEngine",
            "pid": 19316
          },
          {
            "name": "DMLProc",
            "pid": 19332
          },
          {
            "name": "DDLProc",
            "pid": 19366
          }
        ]
      }
    $ mariadb --host=192.0.2.1 \
       --user=root \
       --password
    FLUSH TABLES WITH READ LOCK;
    $ sudo mkdir -p /backups/columnstore/202101291600/
    $ sudo rsync -av /var/lib/columnstore/storagemanager /backups/columnstore/202101291600/
    $ sudo mkdir -p /backups/mariadb/202101291600/
    $ sudo mariadb-backup --backup \
       --target-dir=/backups/mariadb/202101291600/ \
       --user=mariadb-backup \
       --password=mbu_passwd
    $ sudo mariadb-backup --prepare \
       --target-dir=/backups/mariadb/202101291600/
    UNLOCK TABLES;
    $ sudo systemctl stop mariadb-columnstore-cmapi
    $ sudo systemctl stop mariadb-columnstore
    $ sudo systemctl stop mariadb
    $ sudo rsync -av /backups/columnstore/202101291600/storagemanager/ /var/lib/columnstore/storagemanager/
    $ sudo chown -R mysql:mysql /var/lib/columnstore/storagemanager
    $ sudo mariadb-backup --copy-back \
       --target-dir=/backups/mariadb/202101291600/
    $ sudo chown -R mysql:mysql /var/lib/mysql
    [ObjectStorage]
    …
    service = S3
    …
    [S3]
    bucket = your_columnstore_bucket_name
    endpoint = your_s3_endpoint
    aws_access_key_id = your_s3_access_key_id
    aws_secret_access_key = your_s3_secret_key
    # iam_role_name = your_iam_role
    # sts_region = your_sts_region
    # sts_endpoint = your_sts_endpoint
    
    [Cache]
    cache_size = your_local_cache_size
    path = your_local_cache_path
    $ sudo systemctl start mariadb
    $ sudo systemctl start mariadb-columnstore-cmapi
    # minimize swapping
    vm.swappiness = 1
    
    # Increase the TCP max buffer size
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    
    # Increase the TCP buffer limits
    # min, default, and max number of bytes to use
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    
    # don't cache ssthresh from previous connection
    net.ipv4.tcp_no_metrics_save = 1
    
    # for 1 GigE, increase this to 2500
    # for 10 GigE, increase this to 30000
    net.core.netdev_max_backlog = 2500
    $ sudo sysctl --load=/etc/sysctl.d/90-mariadb-enterprise-columnstore.conf
    $ sudo setenforce permissive
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=permissive
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    $ sudo getenforce
    Permissive
    $ sudo systemctl disable apparmor
    $ sudo aa-status
    apparmor module is loaded.
    0 profiles are loaded.
    0 profiles are in enforce mode.
    0 profiles are in complain mode.
    0 processes have profiles defined.
    0 processes are in enforce mode.
    0 processes are in complain mode.
    0 processes are unconfined but have a profile defined.
    $ sudo systemctl status firewalld
    $ sudo systemctl stop firewalld
    $ sudo ufw status verbose
    $ sudo ufw disable
    $ sudo yum install glibc-locale-source glibc-langpack-en
    $ sudo localedef -i en_US -f UTF-8 en_US.UTF-8
    192.0.2.1     mcs1
    192.0.2.2     mcs2
    192.0.2.3     mcs3
    192.0.2.100   mxs1
    $ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       | jq .
    {
      "timestamp": "2020-12-15 00:40:34.353574",
      "192.0.2.1": {
        "timestamp": "2020-12-15 00:40:34.362374",
        "uptime": 11467,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [
          "1"
        ],
        "module_id": 1,
        "services": [
          {
            "name": "workernode",
            "pid": 19202
          },
          {
            "name": "controllernode",
            "pid": 19232
          },
          {
            "name": "PrimProc",
            "pid": 19254
          },
          {
            "name": "ExeMgr",
            "pid": 19292
          },
          {
            "name": "WriteEngine",
            "pid": 19316
          },
          {
            "name": "DMLProc",
            "pid": 19332
          },
          {
            "name": "DDLProc",
            "pid": 19366
          }
        ]
      }
    $ mariadb --host=192.0.2.1 \
       --user=root \
       --password
    FLUSH TABLES WITH READ LOCK;
    $ sudo mkdir -p /backups/columnstore/202101291600/
    $ sudo rsync -av /var/lib/columnstore/storagemanager /backups/columnstore/202101291600/
    $ sudo rsync -av /var/lib/columnstore/data1 /backups/columnstore/202101291600/
    $ sudo rsync -av /var/lib/columnstore/data2 /backups/columnstore/202101291600/
    $ sudo rsync -av /var/lib/columnstore/data3 /backups/columnstore/202101291600/
    $ sudo mkdir -p /backups/mariadb/202101291600/
    $ sudo mariadb-backup --backup \
       --target-dir=/backups/mariadb/202101291600/ \
       --user=mariadb-backup \
       --password=mbu_passwd
    $ sudo mariadb-backup --prepare \
       --target-dir=/backups/mariadb/202101291600/
    UNLOCK TABLES;
    $ sudo systemctl stop mariadb-columnstore-cmapi
    $ sudo systemctl stop mariadb-columnstore
    $ sudo systemctl stop mariadb
    $ sudo rsync -av /backups/columnstore/202101291600/storagemanager/ /var/lib/columnstore/storagemanager/
    $ sudo chown -R mysql:mysql /var/lib/columnstore/storagemanager
    $ sudo rsync -av /backups/columnstore/202101291600/data1/ /var/lib/columnstore/data1/
    $ sudo rsync -av /backups/columnstore/202101291600/data2/ /var/lib/columnstore/data2/
    $ sudo rsync -av /backups/columnstore/202101291600/data3/ /var/lib/columnstore/data3/
    $ sudo chown -R mysql:mysql /var/lib/columnstore/data1
    $ sudo chown -R mysql:mysql /var/lib/columnstore/data2
    $ sudo chown -R mysql:mysql /var/lib/columnstore/data3
    $ sudo mariadb-backup --copy-back \
       --target-dir=/backups/mariadb/202101291600/
    $ sudo chown -R mysql:mysql /var/lib/mysql
    $ sudo systemctl start mariadb
    $ sudo systemctl start mariadb-columnstore-cmapi
    sudo dnf install -y \
    MariaDB-server MariaDB-columnstore-engine MariaDB-columnstore-cmapi
    sudo apt update
    sudo apt install -y mariadb-server mariadb-plugin-columnstore mariadb-columnstore-cmapi
    sudo systemctl start mariadb
    sudo systemctl enable mariadb
    sudo systemctl start mariadb-columnstore-cmapi
    sudo systemctl enable mariadb-columnstore-cmapi
    sudo mcs cluster set api-key --key <your-api-key-here>
    sudo mcs node add --node <private-ip-of-rw-node>
    sudo mcs node add --read-replica --node <private-ip-of-replica>
    sudo mcs cluster status
    SELECT DISTINCT col1 FROM tab LIMIT 10000;
    SELECT DISTINCT col1 FROM tab LIMIT 100;
    SET SESSION columnstore_decimal_overflow_check=ON;
    
    SELECT (big_decimal1 * big_decimal2) AS product
    FROM columnstore_tab;
    Rocky Linux 8 (x86_64, ARM64)
  • Rocky Linux 9 (x86_64, ARM64)

  • Ubuntu 20.04 LTS (x86_64, ARM64)

  • Ubuntu 22.04 LTS (x86_64, ARM64)

  • Ubuntu 24.04 LTS (x86_64, ARM64)

  • Step 1

    Prepare System for Enterprise ColumnStore

    Step 2

    Install Enterprise ColumnStore

    Step 3

    Start and Configure Enterprise ColumnStore

    Step 4

    Test Enterprise ColumnStore

    Step 5

    Bulk Import Data to Enterprise ColumnStore

    MariaDB Enterprise Server

    Modern SQL RDBMS with high availability, pluggable storage engines, hot online backups, and audit logging.

    MariaDB Enterprise ColumnStore

    • Columnar Storage Engine

    • Optimized for Online Analytical Processing (OLAP) workloads

    Enterprise ColumnStore node

    4+ cores

    16+ GB

    Enterprise ColumnStore node

    64+ cores

    128+ GB

    Configuration File

    Configuration files (such as /etc/my.cnf) can be used to set and . The server must be restarted to apply changes made to configuration files.

    Command-line

    The server can be started with command-line options that set and .

    SQL

    Users can set that support dynamic changes on-the-fly using the statement.

    • CentOS

    • Red Hat Enterprise Linux (RHEL)

    /etc/my.cnf.d/z-custom-mariadb.cnf

    • Debian

    • Ubuntu

    /etc/mysql/mariadb.conf.d/z-custom-mariadb.cnf

    Start

    sudo systemctl start mariadb

    Stop

    sudo systemctl stop mariadb

    Restart

    sudo systemctl restart mariadb

    Enable during startup

    sudo systemctl enable mariadb

    Disable during startup

    sudo systemctl disable mariadb

    Status

    sudo systemctl status mariadb

    submitting a support case
    Enterprise ColumnStore with Shared Local storage
    recommended hardware requirements

    Table Level Filters: Project additional columns as required for any table level filters such as column1 < column2, or more advanced functions and expressions. Access of blocks is again based on row identifier, going directly to the blocks.

  • Project Join Columns for Joins: Project additional join columns as needed for any join operations. Access of blocks is again based on row identifier, going directly to the blocks. See the join tuning section of performance configuration for additional details on tuning this.

  • Execute Multi-Join: Apply one or more hash join operation against projected join columns, and use that value to probe a previously built hash map. Build out tuples as needed to satisfy inner or outer join requirements. See the multi-table join section of performance configuration for additional details on tuning this.

  • Cross-Table Level Filters: Project additional columns from the range of rows for the Primitive Step as needed for any cross-table level filters such as table1.column1 < table2.column2, or more advanced functions and expressions. Access of blocks is again based on row identifier, going directly to the blocks.

  • Aggregation/Distinct Operation Part 1: Apply any local group by, distinct, or aggregation operation against the set of joined rows assigned to a given Batch Primitive. Part 1 of this process is handled by PrimProc.

  • Aggregation/Distinct Operation Part 2: Apply any final group by, distinct, or aggregation operation against the set of joined rows assigned to a given Batch Primitive. This processing is handled by PrimProc. See the memory management section of performance configuration for additional details on tuning this.

  • performance configuration
    performance configuration
    Distributed Functions
    CSEP
    Authenticate using the configured API key.
  • Include the required headers.

  • CMAPI
    remove-node
    supported REST client
    CMAPI
    status
    Test S3 Connection

    MariaDB Enterprise ColumnStore 23.10 includes a testS3Connection command to test the S3 configuration, permissions, and connectivity.

    This action is performed on each Enterprise ColumnStore node.

    Test the S3 configuration by executing the following:

    If the testS3Connection command does not return OK, investigate the S3 configuration.

    Test Enterprise Server Service

    Use Systemd to test whether the MariaDB Enterprise Server service is running.

    This action is performed on each Enterprise ColumnStore node.

    Check if the MariaDB Enterprise Server service is running by executing the following:

    If the service is not running on any node, start the service by executing the following on that node:

    Test Local Client Connections

    Use to test the local connection to the Enterprise Server node.

    This action is performed on each Enterprise ColumnStore node:

    The sudo command is used here to connect to the Enterprise Server node using the root@localhost user account, which authenticates using the unix_socket authentication plugin. Other user accounts can be used by specifying the --user and --password command-line options.

    Test ColumnStore Storage Engine Plugin

    Query the table to confirm that the ColumnStore storage engine is loaded.

    This action is performed on each Enterprise ColumnStore node.

    Execute the following query:

    The PLUGIN_STATUS column for each ColumnStore-related plugin should contain ACTIVE.

    Test CMAPI Service

    Use Systemd to test whether the CMAPI service is running.

    This action is performed on each Enterprise ColumnStore node.

    Check if the CMAPI service is running by executing the following:

    If the service is not running on any node, start the service by executing the following on that node:

    Test ColumnStore Status

    Use CMAPI to request the ColumnStore status. The API key needs to be provided as part of the X-API-key HTML header.

    This action is performed with the CMAPI service on the primary server.

    Check the ColumnStore status using curl by executing the following:

    Test DDL

    Use MariaDB Client to test DDL.

    1. On the primary server, use the MariaDB Client to connect to the node:

    1. Create a test database and ColumnStore table:

    1. On each replica server, use the MariaDB Client to connect to the node:

    1. Confirm that the database and table exist:

    If the database or table do not exist on any node, then check the replication configuration.

    Test DML

    Use MariaDB Client to test DML.

    1. On the primary server, use the MariaDB Client to connect to the node:

    1. Insert sample data into the table created in the DDL test:

    1. On each replica server, use the MariaDB Client to connect to the node:

    1. Execute a SELECT query to retrieve the data:

    If the data is not returned on any node, check the ColumnStore status and the storage configuration.

    Next Step

    Navigation in the procedure 'Deploy ColumnStore Object Storage Topology".

    This page was step 5 of 9.

    Next: Step 6: Install MariaDB MaxScale.

    Replace the Default Configuration File

    MariaDB MaxScale installations include a configuration file with some example objects. This configuration file should be replaced.

    On the MaxScale node, replace the default /etc/maxscale.cnf with the following configuration:

    For additional information, see "Global Parameters".

    Restart MaxScale

    On the MaxScale node, restart the MaxScale service to ensure that MaxScale picks up the new configuration:

    For additional information, see "Start and Stop Services".

    Configure Server Objects

    On the MaxScale node, use maxctrl create to create a server object for each Enterprise ColumnStore node:

    Configure MariaDB Monitor

    MaxScale uses monitors to retrieve additional information from the servers. This information is used by other services in filtering and routing connections based on the current state of the node. For MariaDB Enterprise ColumnStore, use the MariaDB Monitor (mariadbmon).

    On the MaxScale node, use maxctrl create monitor to create a MariaDB Monitor:

    In this example:

    • columnstore_monitor is an arbitrary name that is used to identify the new monitor.

    • mariadbmon is the name of the module that implements the MariaDB Monitor.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to monitor the ColumnStore nodes.

    • password='MAXSCALE_USER_PASSWORD' sets the password parameter to the password used by the database user account that MaxScale uses to monitor the ColumnStore nodes.

    • replication_user=REPLICATION_USER sets the replication_user parameter to the database user account that MaxScale uses to setup replication.

    • replication_password='REPLICATION_USER_PASSWORD' sets the replication_password parameter to the password used by the database user account that MaxScale uses to setup replication.

    • --servers sets the servers parameter to the set of nodes that MaxScale should monitor. All non-option arguments after --servers are interpreted as server names.

    • Other Module Parameters supported by mariadbmon in MaxScale 22.08 can also be specified.

    Choose a MaxScale Router

    Routers control how MaxScale balances the load between Enterprise ColumnStore nodes. Each router uses a different approach to routing queries. Consider the specific use case of your application and database load and select the router that best suits your needs.

    Router
    Configuration Procedure
    Description

    Connection-based load balancing

    • Routes connections to Enterprise ColumnStore nodes designated as replica servers for a read-only pool

    • Routes connections to an Enterprise ColumnStore node designated as the primary server for a read-write pool.

    Query-based load balancing

    • Routes write queries to an Enterprise ColumnStore node designated as the primary server

    • Routes read queries to Enterprise ColumnStore node designated as replica servers

    • Automatically reconnects after node failures

    Configure Read Connection Router

    Use MaxScale Read Connection Router (readconnroute) to route connections to replica servers for a read-only pool.

    On the MaxScale node, use maxctrl create service to create a router:

    In this example:

    • connection_router_service is an arbitrary name that is used to identify the new service.

    • readconnroute is the name of the module that implements the Read Connection Router.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.

    • password=MAXSCALE_USER_PASSWORD sets the password parameter to the password used by the database user account that MaxScale uses to connect to the ColumnStore nodes.

    • router_options=slave sets the router_options parameter to slave, so that MaxScale only routes connections to the replica nodes.

    • --servers sets the servers parameter to the set of nodes to which MaxScale should route connections. All non-option arguments after --servers are interpreted as server names.

    • Other Module Parameters supported by readconnroute in MaxScale 22.08 can also be specified.

    Configure Listener for the Read Connection Router

    These instructions reference TCP port 3308. You can use a different TCP port. The TCP port used must not be bound by any other listener.

    On the MaxScale node, use the maxctrl create listener command to configure MaxScale to use a listener for the Read Connection Router (readconnroute):

    In this example:

    • connection_router_service is the name of the readconnroute service that was previously created.

    • connection_router_listener is an arbitrary name that is used to identify the new listener.

    • 3308 is the TCP port.

    • protocol=MariaDBClient sets the protocol parameter.

    • Other Module Parameters supported by listeners in MaxScale 22.08 can also be specified.

    Configure Read/Write Split Router for Queries

    MaxScale Read/Write Split Router (readwritesplit) performs query-based load balancing. The router routes write queries to the primary and read queries to the replicas.

    On the MaxScale node, use the maxctrl create service command to configure MaxScale to use the Read/Write Split Router (readwritesplit):

    In this example:

    • query_router_service is an arbitrary name that is used to identify the new service.

    • readwritesplit is the name of the module that implements the Read/Write Split Router.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.

    • password=MAXSCALE_USER_PASSWORD sets the password parameter to the password used by the database user account that MaxScale uses to connect to the ColumnStore nodes.

    • --servers sets the servers parameter to the set of nodes to which MaxScale should route queries. All non-option arguments after --servers are interpreted as server names.

    • Other Module Parameters supported by readwritesplit in MaxScale 22.08 can also be specified.

    Configure a Listener for the Read/Write Split Router

    These instructions reference TCP port 3307. You can use a different TCP port. The TCP port used must not be bound by any other listener.

    On the MaxScale node, use the maxctrl create listener command to configure MaxScale to use a listener for the Read/Write Split Router (readwritesplit):

    In this example:

    • query_router_service is the name of the readwritesplit service that was previously created.

    • query_router_listener is an arbitrary name that is used to identify the new listener.

    • 3307 is the TCP port.

    • protocol=MariaDBClient sets the protocol parameter.

    • Other Module Parameters supported by listeners in MaxScale 22.08 can also be specified.

    Start Services

    To start the services and monitors, on the MaxScale node use maxctrl start services:

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 7 of 9.

    Next: Next: Step 8: Test MariaDB MaxScale.

    Replace the Default Configuration File

    MariaDB MaxScale installations include a configuration file with some example objects. This configuration file should be replaced.

    On the MaxScale node, replace the default /etc/maxscale.cnf with the following configuration:

    For additional information, see "Global Parameters".

    Restart MaxScale

    On the MaxScale node, restart the MaxScale service to ensure that MaxScale picks up the new configuration:

    For additional information, see "Start and Stop Services".

    Configure Server Objects

    On the MaxScale node, use maxctrl create to create a server object for each Enterprise ColumnStore node:

    Configure MariaDB Monitor

    MaxScale uses monitors to retrieve additional information from the servers. This information is used by other services in filtering and routing connections based on the current state of the node. For MariaDB Enterprise ColumnStore, use the MariaDB Monitor (mariadbmon).

    On the MaxScale node, use maxctrl create monitor to create a MariaDB Monitor:

    In this example:

    • columnstore_monitor is an arbitrary name that is used to identify the new monitor.

    • mariadbmon is the name of the module that implements the MariaDB Monitor.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to monitor the ColumnStore nodes.

    • password='MAXSCALE_USER_PASSWORD' sets the password parameter to the password used by the database user account that MaxScale uses to monitor the ColumnStore nodes.

    • replication_user=REPLICATION_USER sets the replication_user parameter to the database user account that MaxScale uses to setup replication.

    • replication_password='REPLICATION_USER_PASSWORD' sets the replication_password parameter to the password used by the database user account that MaxScale uses to setup replication.

    • --servers sets the servers parameter to the set of nodes that MaxScale should monitor. All non-option arguments after --servers are interpreted as server names.

    • Other Module Parameters supported by mariadbmon in MaxScale 22.08 can also be specified.

    Choose a MaxScale Router

    Routers control how MaxScale balances the load between Enterprise ColumnStore nodes. Each router uses a different approach to routing queries. Consider the specific use case of your application and database load and select the router that best suits your needs.

    Router
    Configuration Procedure
    Description

    Connection-based load balancing

    • Routes connections to Enterprise ColumnStore nodes designated as replica servers for a read-only pool

    • Routes connections to an Enterprise ColumnStore node designated as the primary server for a read-write pool.|

    Query-based load balancing

    • Routes write queries to an Enterprise ColumnStore node designated as the primary server

    • Routes read queries to Enterprise ColumnStore node designated as replica servers

    • Automatically reconnects after node failures

    Configure Read Connection Router

    Use MaxScale Read Connection Router (readconnroute) to route connections to replica servers for a read-only pool.

    On the MaxScale node, use maxctrl create service to create a router:

    In this example:

    • connection_router_service is an arbitrary name that is used to identify the new service.

    • readconnroute is the name of the module that implements the Read Connection Router.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.

    • password=MAXSCALE_USER_PASSWORD sets the password parameter to the password used by the database user account that MaxScale uses to connect to the ColumnStore nodes.

    • router_options=slave sets the router_options parameter to slave, so that MaxScale only routes connections to the replica nodes.

    • --servers sets the servers parameter to the set of nodes to which MaxScale should route connections. All non-option arguments after --servers are interpreted as server names.

    • Other Module Parameters supported by readconnroute in MaxScale 22.08 can also be specified.

    Configure Listener for the Read Connection Router

    These instructions reference TCP port 3308. You can use a different TCP port. The TCP port used must not be bound by any other listener.

    On the MaxScale node, use the maxctrl create listener command to configure MaxScale to use a listener for the Read Connection Router (readconnroute):

    In this example:

    • connection_router_service is the name of the readconnroute service that was previously created.

    • connection_router_listener is an arbitrary name that is used to identify the new listener.

    • 3308 is the TCP port.

    • protocol=MariaDBClient sets the protocol parameter.

    • Other Module Parameters supported by listeners in MaxScale 22.08 can also be specified.

    Configure Read/Write Split Router for Queries

    MaxScale Read/Write Split Router (readwritesplit) performs query-based load balancing. The router routes write queries to the primary and read queries to the replicas.

    On the MaxScale node, use the maxctrl create service command to configure MaxScale to use the Read/Write Split Router (readwritesplit):

    In this example:

    • query_router_service is an arbitrary name that is used to identify the new service.

    • readwritesplit is the name of the module that implements the Read/Write Split Router.

    • user=MAXSCALE_USER sets the user parameter to the database user account that MaxScale uses to connect to the ColumnStore nodes.

    • password=MAXSCALE_USER_PASSWORD sets the password parameter to the password used by the database user account that MaxScale uses to connect to the ColumnStore nodes.

    • --servers sets the servers parameter to the set of nodes to which MaxScale should route queries. All non-option arguments after --servers are interpreted as server names.

    • Other Module Parameters supported by readwritesplit in MaxScale 22.08 can also be specified.

    Configure a Listener for the Read/Write Split Router

    These instructions reference TCP port 3307. You can use a different TCP port. The TCP port used must not be bound by any other listener.

    On the MaxScale node, use the maxctrl create listener command to configure MaxScale to use a listener for the Read/Write Split Router (readwritesplit):

    In this example:

    • query_router_service is the name of the readwritesplit service that was previously created.

    • query_router_listener is an arbitrary name that is used to identify the new listener.

    • 3307 is the TCP port.

    • protocol=MariaDBClient sets the protocol parameter.

    • Other Module Parameters supported by listeners in MaxScale 22.08 can also be specified.

    Start Services

    To start the services and monitors, on the MaxScale node use maxctrl start services:

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 7 of 9.

    Next: Step 8: Test MariaDB MaxScale

    Next, the ColumnStore table, which is where the CTE issue arises:

    Attempting to run a recursive CTE directly on the employees (ColumnStore) table:

    This will result in the aforementioned error:

    Workarounds

    Here are three potential workarounds to address the recursive CTE limitation with MariaDB ColumnStore.

    Option 1: Toggle ColumnStore Select Handler

    You can temporarily bypass ColumnStore's SELECT handler by disabling it at the session level before executing your recursive CTE and then re-enabling it afterwards.

    Note: This workaround may not always be effective, as its success can depend on the specific MariaDB server version and table definitions.

    Option 2: Use Procedural Simulation via Temporary Table

    If direct recursive CTEs fail or cause server crashes, you can simulate the recursive logic using a stored procedure and a temporary table. This approach iteratively populates the hierarchy.

    First, create a temporary table to store the hierarchical data:

    Next, create a stored procedure to iteratively populate the temp_org_chart table:

    Finally, call the stored procedure and then select from the populated temporary table:

    Option 3: Clone Data into InnoDB

    Another robust workaround is to clone the structure and data of the ColumnStore table into an InnoDB table. Once the data resides in an InnoDB table, you can execute the recursive CTE as usual, as InnoDB fully supports them.

    This approach involves a few steps, often executed via shell commands interacting with the MariaDB client:

    1. Extract and Modify CREATE TABLE Statement: Use SHOW CREATE TABLE to get the definition of your ColumnStore table, then modify it to change the engine to InnoDB and give the new table a different name (e.g., employees2).

    1. Create New Table and Copy Data: Execute the modified CREATE TABLE script to create the new InnoDB table, then insert all data from the original ColumnStore table into it.

    1. Run Recursive CTE on the InnoDB Table: Now, with the data in employees2 (an InnoDB table), you can run your recursive CTE without issues.

    MaxScale CDC Data Adapter

    The MaxScale CDC Data Adapter has been deprecated.

    The MaxScale CDC Data Adapter allows streaming change data events (binary log events) from MariaDB Master hosting non-columnstore engines (InnoDB, MyRocks, MyISAM) to MariaDB ColumnStore. In other words, replicate data from a MariaDB master server to MariaDB ColumnStore. It acts as a CDC Client for MaxScale and uses the events received from MaxScale as input to MariaDB ColumnStore Bulk Data API to push the data to MariaDB ColumnStore. maxscale-cdc-adapter

    It registers with MariaDB MaxScale as a CDC Client using the MaxScale CDC Connector API, receiving change data records from MariaDB MaxScale (that are converted from binlog events received from the Master on MariaDB TX) in a JSON format. Then, using the MariaDB ColumnStore bulk write SDK, it converts the JSON data into API calls and streams it to a MariaDB PM node. The adapter has options to insert all the events in the same schema as the source database table or insert each event with metadata as well as table data. The event meta data includes the event timestamp, the GTID, event sequence and event type (insert, update, delete).

    Installation

    Pre-requisite:

    • Download and install MaxScale CDC Connector API from connector.

    • Download and install MariaDB ColumnStore bulk write SDK from columnstore-bulk-write-sdk.md.

    CentOS 7

    Debian 9/Ubuntu Xenial:

    Debian 8:

    Usage

    Streaming Multiple Tables

    To stream multiple tables, use the -f parameter to define a path to a TSV formatted file. The file must have one database and one table name per line. The database and table must be separated by a TAB character and the line must be terminated in a newline (\n).

    Here is an example file with two tables, t1 and t2 both in the test database:

    Automated Table Creation on ColumnStore

    You can have the adapter automatically create the tables on the ColumnStore instance with the -an option. In this case, the user used for cross-engine queries will be used to create the table (the values in Columnstore.CrossEngineSupport). This user requires CREATE privileges on all streamed databases and tables.

    Data Transformation Mode

    The -z option enables the data transformation mode. In this mode, the data is converted from historical, append-only data to the current version of the data. In practice, this replicates changes from a MariaDB master server to ColumnStore via the MaxScale CDC.

    This mode is not as fast as the append-only mode and might not be suitable for heavy workloads. This is due to the fact that the data transformation is done via various DML statements.

    Quick Start

    Download and install both MaxScale and ColumnStore.

    Copy the Columnstore.xml file from /usr/local/mariadb/columnstore/etc/Columnstore.xml from one of the ColumnStore PrimProc nodes to the server where the adapter is installed.

    Configure MaxScale according to the .

    Create a CDC user by executing the following MaxAdmin command on the MaxScale server. Replace the <service> with the name of the avrorouter service and <user> and <password> with the credentials that are to be created.

    Then we can start the adapter by executing the following command.

    The <database> and <table> define the table that is streamed to ColumnStore. This table should exist on the master server where MaxScale is reading events from. If the table is not created on ColumnStore, the adapter will print instructions on how to define it in the correct way.

    The <user> and <password> are the users created for the CDC user, <host> is the MaxScale address and <port> is the port where the CDC service listener is listening.

    The -c flag is optional if you are running the adapter on the server where ColumnStore is located.

    Kafka to ColumnStore Adapter

    The Kafka data adapter streams all messages published to Apache Kafka topics in Avro format to MariaDB ColumnStore automatically and continuously - enabling data from many sources to be streamed and collected for analysis without complex code. The Kafka adapter is built using librdkafka and the MariaDB ColumnStore bulk write SDK

    kafka-data-adapter

    A tutorial for the Kafka adapter for ingesting Avro formatted data can be found in the kafka-to-columnstore-data-adapter document.

    ColumnStore - Pentaho Data Integration - Data Adapter

    Starting with MariaDB ColumnStore 1.1.4, a data adapter for Pentaho Data Integration (PDI) / Kettle is available to import data directly into ColumnStore’s WriteEngine. It is built on MariaDB’s rapid-paced Bulk Write SDK.

    PDI Plugin Block info graphic

    Compatibility notice

    The plugin was designed for the following software composition:

    • Operating system: Windows 10 / Ubuntu 16.04 / RHEL/CentOS 7+

    • MariaDB ColumnStore >= 1.1.4

    • MariaDB Java Database client* >= 2.2.1

    • Java >= 8

    • Pentaho Data Integration >= 7 +not officially supported by Pentaho.

    *Only needed if you want to execute DDL.

    Installation

    The following steps are necessary to install the ColumnStore Data adapter (bulk loader plugin):

    1. Build the plugin from source or download it from our website

    2. Extract the archive mariadb-columnstore-kettle-bulk-exporter-plugin-*.zip into your PDI installation directory $PDI-INSTALLATION/plugins.

    3. Copy MariaDB's JDBC Client mariadb-java-client-2.2.x.jar into PDI's lib directory $PDI-INSTALLATION/lib.

    4. Install the additional library dependencies

    Ubuntu dependencies

    CentOS dependencies

    Windows 10 dependencies

    On Windows the installation of the Visual Studio 2015/2017 C++ Redistributable (x64) is required.

    Configuration

    Each MariaDB ColumnStore Bulk Loader block needs to be configured. On the one hand, it needs to know how to connect to the underlying Bulk Write SDK to inject data into ColumnStore, and on the other hand, it needs to have a proper JDBC connection to execute DDL.

    Both configurations can be set in each block’s settings tab.

    PDI Plugin Block settings info graphic

    The database connection configuration follows PDI’s default schema.

    By default, the plugin tries to use ColumnStore's default configuration /usr/local/mariadb/columnstore/etc/Columnstore.xml to connect to the ColumnStore instance through the Bulk Write SDK. In addition, individual paths or variables can be used too.

    Information on how to prepare the Columnstore.xml configuration file can be found here.

    Usage

    PDI Plugin Block mapping info graphic

    Once a block is configured and all inputs are connected in PDI, the inputs have to be mapped to ColumnStore’s table format.

    One can either choose “Map all inputs”, which sets target columns of adequate type, or choose a custom mapping based on the structure of the existing table.

    The SQL button can be used to generate DDL based on the defined mapping and to execute it.

    Limitations

    This plugin is a beta release.

    In addition, it can't handle blob data types and only supports multiple inputs to one block if the input field names are equal for all input sources.

    ColumnStore Bulk Data API
    • receives a query;

    • searches for applicable Engine Independent statistics for InnoDB table index column;

    • applies RBO rule to transform its InnoDB tables into a number of UNION queries over non-overlapping ranges of a suitable InnoDB table index;

    • retrieves the data in parallel from MariaDB, and runs it using Columnstore runtime.

    Queries Benefitting From Query Accelerator

    Query Accelerator improves the performance of queries that use aggregation functions, for instance SUM, AVG, MIN, MAX, and GROUP BY, where the performance overhead of pulling the data out of InnoDB can be overcome by the performance optimization of running in the ColumnStore engine.

    This avoids the bottleneck/pipeline of having to move data out of InnoDB and into ColumnStore. Query Accelerator strives to parallelize data out of InnoDB, by utilizing table statistics to optimize multiple threads to data ranges on disk. If the InnoDB table in question uses an index, Query Accelerator is able to get the data much faster.

    Example of a query benefitting from Query Accelerator (assuming column_a is indexed):

    The effectiveness of Query Accelerator can vary depending on the type of queries you run and the specific characteristics of your database schema. Certain types of queries or configurations may not benefit from Query Accelerator, or could potentially experience decreased performance. It's essential to understand when Query Accelerator is most advantageous and when traditional InnoDB operations might be more efficient. Consider the following points to optimize query performance with Query Accelerator:

    • Make sure your query uses tables that are indexed, and the index key has the first integer column.

    • Also, run ANALYZE TABLE before running Query Accelerator.

    Queries not to run in Query Accelerator

    Performance Issues

    Performance issues occur for queries like this:

    InnoDB handles such comparison much better than ColumnStore in general, and in Query Accelerator, that would be even worse.

    • Generally, if your query takes longer than a minute in InnoDB, try Query Accelerator.

    Queries not Working in Query Accelerator

    Query Accelerator has the same limitations as ColumnStore in general, in that it has a limited set of functions and data types it can handle. Therefore, be aware of

    • syntax or functions that Columnstore does not support;

    • data types ColumnStore does not support.

    Enabling Query Accelerator

    1

    Edit the MariaDB configuration file (my.cnf or my.ini)

    Locate (or create) the mariadb section, and add a line enabling Query Accelerator, like this:

    Restart MariaDB Server for the change to take effect.

    2

    Run queries to turn on Query Accelerator

    Set these parameters in a client session:

    In future versions of Query Accelerator, those SET statements will be in stored procedures, allowing to turn Query Accelerator on and off with simpler commands.

    To use Query Accelerator just for one query, you have to run those SET statements per query, not per session. Setting them per session effectively disables the MariaDB Optimizer for subsequent queries that ColumnStore cannot execute.

    Enabling Processing for InnoDB Tables

    There must be engine-independent statistics for an InnoDB table index column so that it can be used for Query Accelerator.

    Control Client Session Variables and Parameters

    • columnstore_unstable_optimizer enables unstable optimizer that is required for Query Accelerator RBO rule.

    • columnstore_select_handler enables/disables ColumnStore processing for InnoDB tables.

    • columnstore_query_accel_parallel_factor controls the number of parallel ranges to be used for Query Accelerator.

    Watch out for max_connections. If you set columnstore_query_accel_parallel_factor to a high value, you may need to increase max_connections to avoid connection pool exhaustion.

    Verifying That Query Accelerator is Being Used

    There are two ways to verify Query Accelerator is being used:

    1. Use select mcs_get_plan('rules') to get a list of the rules that were applied to the query.

    2. Look for patterns like derived table - $added_sub_#db_name_#table_name_X in the optimized plan using select mcs_get_plan('optimized').

    Query Accelerator Quick Start

    This example shows a SUM(x) GROUP BY y query which runs ~2.6s in InnoDB with indexes, and 3x faster via ColumnStore query acceleration ( ~0.7s ), provided there's enough CPU and a high enough parallel_factor.

    1

    In mariadb (MariaDB command-line client), run these statements:

    2

    Turn on Query Accelerator - On CLI:

    3

    In mariadb (MariaDB command-line client), run these statements:

    4

    Log out of mariadb (MariaDB command-line client), and log in again.

    5

    In mariadb (MariaDB command-line client), run these statements:

    6

    Turn off Query Accelerator - On CLI:

    Quick Verifications

    1

    Tail the ColumnStore log debug.log, and confirm parallel access to InnoDB:

    Increase or decrease parallelism with columnstore_ces_optimization_parallel_factor. Keep in mind you need enough max_connections in MariaDB server:

    2

    Check the execution plan via EXPLAIN FORMAT=JSON. It should say Pushed select:

    3

    Verify that mcs_get_plan shows parallel_ces, and that the detailed ColumnStore execution plan shows derived table:

    • Text file.

    • S3-compatible object storage

    • Loads data from the cloud. • Translates operation to cpimport command. • Non-blocking

    Fast

    SQL

    • Text file.

    • Server file system • Client file system

    • Translates operation to cpimport command. • Non-blocking

    Slow

    SQL

    • Other table(s).

    • Same MariaDB server

    • Translates operation to cpimport command. • Non-blocking

    Load Ordered Data in Proper Order
    cpimport
    Read Replica topology
    Extent Elimination
    High Availability and Failover
    Version Buffer
    ersion Buffer
    Online Schema Changes
    Lockless Reads
    character_set_server
    collation_server
    LOAD DATA INFILE
    INSERT...SELECT

    ColumnStore Partition Management

    Introduction

    MariaDB ColumnStore automatically creates logical horizontal partitions across every column. For ordered or semi-ordered data fields such as an order date this will result in a highly effective partitioning scheme based on that column. This allows for increased performance of queries filtering on that column since partition elimination can be performed. It also allows for data lifecycle management as data can be disabled or dropped by partition cheaply. Caution should be used when disabling or dropping partitions as these commands are destructive.

    It is important to understand that a Partition in ColumnStore terms is actually 2 extents (16 million rows) and that extents & partitions are created according to the following algorithm in 1.0.x:

    1. Create 4 extents in 4 files

    2. When these are filled up (after 32M rows), create 4 more extents in the 4 files created in step 1.

    3. When these are filled up (after 64M rows), create a new partition.

    Managing Partitions by Partition Number

    Displaying Partitioning Information

    Information about all partitions for a given column can be retrieved using the calShowPartitions stored procedure which takes either two or three mandatory parameters: [database_name], table_name, and column_name. If two parameters are provided the current database is assumed. For example:

    Enabling Partitions

    The calEnablePartitions stored procedure allows for enabling of one or more partitions. The procedure takes the same set of parameters as calDisablePartitions.

    For example:

    The result showing the first partition has been enabled:

    Disabling Partitions

    The calDisablePartitions stored procedure allows for disabling of one or more partitions. A disabled partition still exists on the file system (and can be enabled again at a later time) but will not participate in any query, DML or import activity. The procedure takes either two or three mandatory parameters: [database_name], table_name, and partition_numbers separated by commas. If two parameters are provided the current database is assumed.

    For example:

    The result showing the first partition has been disabled:

    Dropping Partitions

    The calDropPartitions stored procedure allows for dropping of one or more partitions. Dropping means that the underlying storage is deleted and the partition is completely removed. A partition can be dropped from either enabled or disabled state. The procedure takes the same set of parameters as calDisablePartitions. Extra caution should be used with this procedure since it is destructive and cannot be reversed.

    For example:

    The result showing the first partition has been dropped:

    Managing Partitions by Column Value

    Displaying Partitioning Information

    Information about a range of parititions for a given column can be retrieved using the calShowPartitionsByValue stored procedure. This procedure takes either four or five mandatory parameters: [database_name], table_name,`` column_name,`` start_value, and`` end_value. If four parameters are provided, the current database is assumed. Only casual partition column types (, , , , up to 8 bytes and up to 7 bytes) are supported for this function.

    The function returns a list of partitions whose minimum and maximum values for the column col_name fall completely within the range of start_value and end_value.

    For example:

    Enabling Partitions

    The calEnablePartitionsbyValue stored procedure allows for enabling of one or more partitions by value. The procedure takes the same set of arguments as calShowPartitionsByValue.

    A good practice is to use calShowPartitionsByValue to identify the partitions to be enabled and then the same argument values used to construct the calEnablePartitionsbyValue call.

    For example:

    The result showing the first partition has been enabled:

    Disabling Partitions

    The calDisablePartitionsByValue stored procedure allows for disabling of one or more partitions by value. A disabled partition still exists on the file system (and can be enabled again at a later time) but will not participate in any query, DML or import activity. The procedure takes the same set of arguments as calShowPartitionsByValue.

    A good practice is to use calShowPartitionsByValue to identify the partitions to be disabled and then the same argument values used to construct the calDisablePartitionsByValue call. For example:

    The result showing the first partition has been disabled:

    Dropping Partitions

    The calDropPartitionsByValue stored procedure allows for dropping of one or more partitions by value. Dropping means that the underlying storage is deleted and the partition is completely removed. A partition can be dropped from either enabled or disabled state. The procedure takes the same set of arguments as calShowPartitionsByValue. A good practice is to use calShowPartitionsByValue to identify the partitions to be enabled and then the same argument values used to construct the calDropPartitionsByValue call. Extra caution should be used with this procedure since it is destructive and cannot be reversed.

    For example:

    The result showing the first partition has been dropped:

    Dropping Data Outside of Partitions

    Since the partitioning scheme is system-maintained, the minimum and maximum values are not directly specified, but influenced by the order of data loading. If you want to drop a specific date range, additional deletes are required to achieve this. The following cases may occur:

    • For semi-ordered data, there may be overlap between minimum and maximum values between partitions.

    • As in the example above, the partition ranges from 1992-01-01 to 1998-08-02. It may be desirable to drop the remaining 1998 rows.

    A bulk-delete statement can be used to delete the remaining rows that do not fall exactly within partition ranges. The partition drops will be fastest; however, the system optimizes bulk-delete statements to delete by block internally. This is still relatively fast.

    Mariadb Enterprise Columnstore Query Evaluation

    Overview

    MariaDB Enterprise ColumnStore is a smart storage engine designed to efficiently execute analytical queries using distributed query execution and massively parallel processing (MPP) techniques.

    Scalability

    Multi-Node S3

    This guide provides steps for deploying a multi-node S3 ColumnStore, setting up the environment, installing the software, and bulk importing data for online analytical processing (OLAP) workloads.

    Overview

    This procedure describes the deployment of the Single-Node Enterprise ColumnStore topology with Object storage.

    MariaDB Enterprise ColumnStore 23.10 is a columnar storage engine for MariaDB Enterprise Server and Enterprise ColumnStore is best suited for Online Analytical Processing (OLAP) workloads.

    This procedure has 5 steps, which are executed in sequence.

    This page provides an overview of the topology, requirements, and deployment procedures.

    Please read and understand this procedure before executing.

    Upgrade Multi-Node MariaDB Enterprise ColumnStore from 6 to 23.10

    These instructions detail the upgrade from MariaDB Enterprise ColumnStore 6 to MariaDB Enterprise ColumnStore 23.10 in a Multi-Node topology on a range of .

    Set Replicas to Maintenance Mode

    This action is performed for each replica server on the MaxScale node.

    Prior to upgrading, the replica servers must be in MaxScale. The replicas can be set to maintenance mode in MaxScale using . If you are using , the replicas can be set to maintenance mode using the set server

    Step 5: Test MariaDB Enterprise Server

    Step 5: Test MariaDB Enterprise Server

    Overview

    This page details step 5 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step tests MariaDB Enterprise Server and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Apr 30 21:54:35 a1ebc96a2519 PrimProc[1004]: 35.668435 |0|0|0| C 28 CAL0000: Error total memory available is less than 3GB.
    ERROR 1815 (HY000): Internal error: System is not ready yet. Please try again.
    maxctrl unlink service \
       mcs_service \
       mcs3
    maxctrl show services
    maxctrl unlink monitor \
       mcs_monitor \
       mcs3
    maxctrl show monitors
    maxctrl destroy server \
       mcs3
    maxctrl show servers
    sudo systemctl stop mariadb
    sudo systemctl stop mariadb-columnstore
    sudo systemctl stop mariadb-columnstore-cmapi
    curl -k -s -X DELETE https://mcs1:8640/cmapi/0.4.0/cluster/node \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       --data '{"timeout":20, "node": "192.0.2.3"}' \
       | jq .
    curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/remove-node \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       --data '{"timeout":20, "node": "192.0.2.3"}' \
       | jq .
    {
      "timestamp": "2020-10-28 00:39:14.672142",
      "node_id": "192.0.2.3"
    }
    curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       | jq .
    {
      "timestamp": "2020-12-15 00:40:34.353574",
      "192.0.2.1": {
        "timestamp": "2020-12-15 00:40:34.362374",
        "uptime": 11467,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [
          "1"
        ],
        "module_id": 1,
        "services": [
          {
            "name": "workernode",
            "pid": 19202
          },
          {
            "name": "controllernode",
            "pid": 19232
          },
          {
            "name": "PrimProc",
            "pid": 19254
          },
          {
            "name": "ExeMgr",
            "pid": 19292
          },
          {
            "name": "WriteEngine",
            "pid": 19316
          },
          {
            "name": "DMLProc",
            "pid": 19332
          },
          {
            "name": "DDLProc",
            "pid": 19366
          }
        ]
      },
      "192.0.2.2": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "num_nodes": 2
    }
    $ sudo testS3Connection
    StorageManager[26887]: Using the config file found at /etc/columnstore/storagemanager.cnf
    StorageManager[26887]: S3Storage: S3 connectivity & permissions are OK
    S3 Storage Manager Configuration OK
    $ systemctl status mariadb
    $ sudo systemctl start mariadb
    $ sudo mariadb
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 38
    Server version: 11.4.5-3-MariaDB-Enterprise MariaDB Enterprise Server
    
    Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    MariaDB [(none)]>
    SELECT PLUGIN_NAME, PLUGIN_STATUS
    FROM information_schema.PLUGINS
    WHERE PLUGIN_LIBRARY LIKE 'ha_columnstore%';
    
    +---------------------+---------------+
    | PLUGIN_NAME         | PLUGIN_STATUS |
    +---------------------+---------------+
    | Columnstore         | ACTIVE        |
    | COLUMNSTORE_COLUMNS | ACTIVE        |
    | COLUMNSTORE_TABLES  | ACTIVE        |
    | COLUMNSTORE_FILES   | ACTIVE        |
    | COLUMNSTORE_EXTENTS | ACTIVE        |
    +---------------------+---------------+
    $ systemctl status mariadb-columnstore-cmapi
    $ sudo systemctl start mariadb-columnstore-cmapi
    $ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       | jq .
    
    {
      "timestamp": "2020-12-15 00:40:34.353574",
      "192.0.2.1": {
        "timestamp": "2020-12-15 00:40:34.362374",
        "uptime": 11467,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [
          "1"
        ],
        "module_id": 1,
        "services": [
          {
            "name": "workernode",
            "pid": 19202
          },
          {
            "name": "controllernode",
            "pid": 19232
          },
          {
            "name": "PrimProc",
            "pid": 19254
          },
          {
            "name": "ExeMgr",
            "pid": 19292
          },
          {
            "name": "WriteEngine",
            "pid": 19316
          },
          {
            "name": "DMLProc",
            "pid": 19332
          },
          {
            "name": "DDLProc",
            "pid": 19366
          }
        ]
      },
      "192.0.2.2": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "192.0.2.3": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "num_nodes": 3
    }
    $ sudo mariadb
    CREATE DATABASE IF NOT EXISTS test;
    
    CREATE TABLE IF NOT EXISTS test.contacts (
       first_name VARCHAR(50),
       last_name VARCHAR(50),
       email VARCHAR(100)
    ) ENGINE = ColumnStore;
    $ sudo mariadb
    SHOW CREATE TABLE test.contacts\G;
    $ sudo mariadb
    INSERT INTO test.contacts (first_name, last_name, email)
       VALUES
       ("Kai", "Devi", "kai.devi@example.com"),
       ("Lee", "Wang", "lee.wang@example.com");
    $ sudo mariadb
    SELECT * FROM test.contacts;
    
    +------------+-----------+----------------------+
    | first_name | last_name | email                |
    +------------+-----------+----------------------+
    | Kai        | Devi      | kai.devi@example.com |
    | Lee        | Wang      | lee.wang@example.com |
    +------------+-----------+----------------------+
    [maxscale]
    threads          = auto
    admin_host       = 0.0.0.0
    admin_secure_gui = false
    $ sudo systemctl restart maxscale
    $ maxctrl create server mcs1 192.0.2.101
    $ maxctrl create server mcs2 192.0.2.102
    $ maxctrl create server mcs3 192.0.2.103
    $ maxctrl create monitor columnstore_monitor mariadbmon \
         user=mxs \
         password='MAXSCALE_USER_PASSWORD' \
         replication_user=repl \
         replication_password='REPLICATION_USER_PASSWORD' \
         --servers mcs1 mcs2 mcs3
    $ maxctrl create service connection_router_service readconnroute \
         user=mxs \
         password='MAXSCALE_USER_PASSWORD' \
         router_options=slave \
         --servers mcs1 mcs2 mcs3
    $ maxctrl create listener connection_router_service connection_router_listener 3308 \
         protocol=MariaDBClient
    $ maxctrl create service query_router_service readwritesplit  \
         user=mxs \
         password='MAXSCALE_USER_PASSWORD' \
         --servers mcs1 mcs2 mcs3
    $ maxctrl create listener query_router_service query_router_listener 3307 \
         protocol=MariaDBClient
    $ maxctrl start services
    [maxscale]
    threads          = auto
    admin_host       = 0.0.0.0
    admin_secure_gui = false
    $ sudo systemctl restart maxscale
    $ maxctrl create server mcs1 192.0.2.101
    $ maxctrl create server mcs2 192.0.2.102
    $ maxctrl create server mcs3 192.0.2.103
    $ maxctrl create monitor columnstore_monitor mariadbmon \
         user=mxs \
         password='MAXSCALE_USER_PASSWORD' \
         replication_user=repl \
         replication_password='REPLICATION_USER_PASSWORD' \
         --servers mcs1 mcs2 mcs3
    $ maxctrl create service connection_router_service readconnroute \
         user=mxs \
         password='MAXSCALE_USER_PASSWORD' \
         router_options=slave \
         --servers mcs1 mcs2 mcs3
    $ maxctrl create listener connection_router_service connection_router_listener 3308 \
         protocol=MariaDBClient
    $ maxctrl create service query_router_service readwritesplit  \
         user=mxs \
         password='MAXSCALE_USER_PASSWORD' \
         --servers mcs1 mcs2 mcs3
    $ maxctrl create listener query_router_service query_router_listener 3307 \
         protocol=MariaDBClient
    $ maxctrl start services
    CREATE TABLE employees_innodb (
        id INT PRIMARY KEY,
        name VARCHAR(100),
        manager_id INT  -- references employees.id (nullable for top-level)
    );
    
    INSERT INTO employees_innodb (id, name, manager_id) VALUES
    (1, 'CEO', NULL),
    (2, 'VP of Sales', 1),
    (3, 'Sales Rep A', 2),
    (4, 'VP of Eng', 1),
    (5, 'Eng A', 4),
    (6, 'Eng B', 4);
    
    CREATE TABLE employees (
        id INT,
        name VARCHAR(100),
        manager_id INT  -- references employees.id (nullable for top-level)
    ) engine=columnstore;
    
    INSERT INTO employees (id, name, manager_id) VALUES
    (1, 'CEO', NULL),
    (2, 'VP of Sales', 1),
    (3, 'Sales Rep A', 2),
    (4, 'VP of Eng', 1),
    (5, 'Eng A', 4),
    (6, 'Eng B', 4);
    
    WITH RECURSIVE org_chart AS (
        -- Anchor: start with the top-level manager (CEO)
        SELECT id, name, manager_id, 0 AS level
        FROM employees
        WHERE id = 1
    
        UNION ALL
    
        -- Recursive step: find employees who report to the previous level
        SELECT e.id, e.name, e.manager_id, oc.level + 1
        FROM employees e
        JOIN org_chart oc ON e.manager_id = oc.id
    )
    SELECT * FROM org_chart;
    
    ERROR 1178 (42000): The storage engine for the table doesn't support Recursive CTE
    SET SESSION columnstore_select_handler=OFF;
    
    WITH RECURSIVE org_chart AS (
        -- Anchor: start with the top-level manager (CEO)
        SELECT id, name, manager_id, 0 AS level
        FROM employees
        WHERE id = 1
    
        UNION ALL
    
        -- Recursive step: find employees who report to the previous level
        SELECT e.id, e.name, e.manager_id, oc.level + 1
        FROM employees e
        JOIN org_chart oc ON e.manager_id = oc.id
    )
    SELECT * FROM org_chart;
    
    SET SESSION columnstore_select_handler=ON;
    
    CREATE TABLE temp_org_chart (
        id INT,
        name VARCHAR(100),
        manager_id INT,
        level INT
    );
    
    -- Initialize the temporary table with the top-level employees
    INSERT INTO temp_org_chart (id, name, manager_id, level)
    SELECT id, name, manager_id, 0 AS level FROM employees WHERE manager_id IS NULL;
    DELIMITER //
    
    CREATE OR REPLACE PROCEDURE populate_org_chart()
    BEGIN
      DECLARE v_level INT DEFAULT 1;
      DECLARE rows_inserted INT DEFAULT 1;
    
      -- Loop until no more rows are inserted, indicating the hierarchy is fully traversed
      WHILE rows_inserted > 0 DO
    
        -- Insert employees who report to the previous level
        INSERT INTO temp_org_chart (id, name, manager_id, level)
        SELECT e.id, e.name, e.manager_id, v_level
        FROM employees e
        JOIN temp_org_chart t ON e.manager_id = t.id
        WHERE t.level = v_level - 1
          AND NOT EXISTS (
              SELECT 1 FROM temp_org_chart x WHERE x.id = e.id
          );
    
        -- Get the number of rows inserted in the current iteration
        SET rows_inserted = ROW_COUNT();
        -- Increment the level for the next iteration
        SET v_level = v_level + 1;
    
      END WHILE;
    END //
    
    DELIMITER ;
    CALL populate_org_chart();
    SELECT * FROM temp_org_chart;
    mariadb test -qsNe "SHOW CREATE TABLE employees" \
      | awk -F '\t' '{print $2}' \
      | sed -e 's/ENGINE=Columnstore/ENGINE=InnoDB/' \
            -e 's/CREATE TABLE `employees`/CREATE TABLE `employees2`/' \
      > create_employees2.sql
    
    mariadb test < create_employees2.sql
    mariadb test -e "INSERT INTO employees2 SELECT * FROM employees"
    WITH RECURSIVE org_chart AS (
        -- Anchor: start with the top-level manager (CEO)
        SELECT id, name, manager_id, 0 AS level
        FROM employees2
        WHERE id = 1
    
        UNION ALL
    
        -- Recursive step: find employees who report to the previous level
        SELECT e.id, e.name, e.manager_id, oc.level + 1
        FROM employees2 e
        JOIN org_chart oc ON e.manager_id = oc.id
    )
    SELECT * FROM org_chart;
    sudo yum -y install epel-release
    sudo yum -y install <data adapter>.rpm
    sudo apt-get update
    sudo dpkg -i <data adapter>.deb
    sudo apt-get -f install
    sudo echo "deb http://httpredir.debian.org/debian jessie-backports main contrib non-free" >> /etc/apt/sources.list
    sudo apt-get update
    sudo dpkg -i <data adapter>.deb
    sudo apt-get -f install
    Usage: mxs_adapter [OPTION]... DATABASE TABLE
    
     -f FILE      TSV file with database and table names to stream (must be in `database TAB table NEWLINE` format)
      -h HOST      MaxScale host (default: 127.0.0.1)
      -P PORT      Port number where the CDC service listens (default: 4001)
      -u USER      Username for the MaxScale CDC service (default: admin)
      -p PASSWORD  Password of the user (default: mariadb)
      -c CONFIG    Path to the Columnstore.xml file (default: '/usr/local/mariadb/columnstore/etc/Columnstore.xml')
      -a           Automatically create tables on ColumnStore
      -z           Transform CDC data stream from historical data to current data (implies -n)
      -s           Directory used to store the state files (default: '/var/lib/mxs_adapter')
      -r ROWS      Number of events to group for one bulk load (default: 1)
      -t TIME      Connection timeout (default: 10)
      -n           Disable metadata generation (timestamp, GTID, event type)
      -i TIME      Flush data every TIME seconds (default: 5)
      -l FILE      Log output to FILE instead of stdout
      -v           Print version and exit
      -d           Enable verbose debug output
    test	t1
    test	t2
    maxadmin call command cdc add_user <service> <user> <password>
    mxs_adapter -u <user> -p <password> -h <host> -P <port> -c <path to Columnstore.xml> <database><table>
    sudo apt-get install libuv1 libxml2 libsnappy1v5
    sudo yum install epel-release
    sudo yum install libuv libxml2 snappy
    [mariadb]
    columnstore_innodb_queries_use_mcs = on
    SET columnstore_unstable_optimizer=ON;
    SET optimizer_switch="index_merge=off,index_merge_union=off,index_merge_sort_union=off,index_merge_intersection=off,index_merge_sort_intersection=off,index_condition_pushdown=off,derived_merge=off,derived_with_keys=off,firstmatch=off,loosescan=off,materialization=on,in_to_exists=off,semijoin=off,partial_match_rowid_merge=off,partial_match_table_scan=off,subquery_cache=off,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=off,join_cache_hashed=off,join_cache_bka=off,optimize_join_buffer_size=off,table_elimination=off,extended_keys=off,exists_to_in=off,orderby_uses_equalities=off,condition_pushdown_for_derived=on,split_materialized=off,condition_pushdown_for_subquery=off,rowid_filter=off,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=off,cset_narrowing=off,sargable_casefold=off";
    CREATE DATABASE IF NOT EXISTS test; USE test;
    CREATE TABLE IF NOT EXISTS test.customer_indexed (  `c_d_id` int(2) NOT NULL, `c_w_id` int(6) NOT NULL, `c_first` varchar(16) , `c_middle` char(2) , `c_last` varchar(16) , `c_street_1` varchar(20) , `c_street_2` varchar(20) , `c_city` varchar(20) , `c_state` char(2) , `c_zip` int(5) , `c_phone` char(16) , `c_since` datetime DEFAULT NULL, `c_credit` char(2) , `c_credit_lim` decimal(12,2) DEFAULT NULL, `c_discount` decimal(4,4) DEFAULT NULL, `c_balance` decimal(12,2) DEFAULT NULL, `c_ytd_payment` decimal(12,2) DEFAULT NULL, `c_payment_cnt` int(8) DEFAULT NULL, `c_delivery_cnt` int(8) DEFAULT NULL, `c_data` varchar(500)) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    INSERT INTO test.customer_indexed  SELECT  ROUND(RAND() * 42000, 0), ROUND(RAND() * 42000, 0), substring(MD5(RAND()*1000000000),1,16), substring(MD5(RAND()),1,2), substring(MD5(RAND()*1000000000),1,16), substring(MD5(RAND()*1000000000),1,20), substring(MD5(RAND()*1000000000),1,20), substring(MD5(RAND()*1000000000),1,20), substring(MD5(RAND()),1,2), ROUND(RAND() * 42000, 0), substring(MD5(RAND()),1,16), CURRENT_TIMESTAMP - INTERVAL FLOOR(RAND() * 365 * 24 * 60 *60) SECOND, substring(MD5(RAND()),1,2), ROUND(RAND() * 9999999999, 2), ROUND(RAND() * 0, 4), ROUND(RAND() * 9999999999, 2), ROUND(RAND() * 9999999999, 2), ROUND(RAND() * 42000, 0), ROUND(RAND() * 42000, 0), substring(MD5(RAND()*1000000000),1,500) FROM seq_1_to_8000000; -- 3.5 min
    ALTER TABLE test.customer_indexed ADD INDEX idx_fast (`c_zip`, `c_payment_cnt`); -- ~1.5 min
    -- baseline 
    SELECT c_zip, sum(c_payment_cnt)  FROM test.customer_indexed GROUP BY c_zip ORDER BY c_zip ;  --2.6s 
    sed -i 's/^#columnstore_innodb_queries_use_mcs = on/columnstore_innodb_queries_use_mcs = on/' /etc/my.cnf.d/columnstore.cnf
    systemctl restart mariadb
    # In mariadb (MariaDB command-line client)
    USE test;
    ANALYZE table test.customer_indexed PERSISTENT FOR COLUMNS (c_zip,c_payment_cnt) indexes(); --8s
    SELECT table_name, column_name, hist_type FROM mysql.column_stats WHERE table_name="customer_indexed"; 
    SHOW VARIABLES LIKE "%columnstore_innodb_queries_use_mcs%";
    tail -f /var/log/mariadb/columnstore/debug.log
    SET columnstore_ces_optimization_parallel_factor=100;
    EXPLAIN FORMAT=JSON SELECT c_zip, SUM(c_payment_cnt) FROM test.customer_indexed GROUP BY c_zip ORDER BY c_zip ;
    ...
    | {
      "query_block": {
        "select_id": 1,
        "table": {
          "message": "Pushed select"
        }
      }
    } |
    ...
    SELECT column_a, SUM(column_b) FROM innodb_table GROUP BY column_a
     SELECT column_a FROM tbl WHERE column_a = column_b 
    ANALYZE TABLE table_name PERSISTENT FOR COLUMNS (column_name) indexes();
    Automatically replays transactions after node failures
  • Optionally enforces causal reads

  • Read Connection (readconnroute)
    Configure Read Connection Router
    Read/Write Split (readwritesplit)
    Configure Read/Write Split
    Automatically replays transactions after node failures
  • Optionally enforces causal reads|

  • Read Connection (readconnroute)
    Configure Read Connection Router
    Read/Write Split (readwritesplit)
    Configure Read/Write Split
    MariaDB Enterprise ColumnStore is designed to achieve vertical and horizontal scalability for production analytics using distributed query execution and massively parallel processing (MPP) techniques.

    Enterprise ColumnStore evaluates each query as a sequence of job steps using sophisticated techniques to get the best performance for complex analytical queries. Some types of job steps are designed to scale with the system's resources. As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute those types of job steps.

    Enterprise ColumnStore stores each column on disk in extents. The storage format is designed to maintain scalability, even as the table grows. If an operation does not read parts of a large table, I/O costs are reduced. Enterprise ColumnStore uses a technique called extent elimination that compares the maximum and minimum values in the extent map to the query's conditions, and it avoids scanning extents that don't satisfy the conditions.

    Enterprise ColumnStore provides exceptional scalability for analytical queries. Enterprise ColumnStore's design supports targeted scale-out to address increased workload requirements, whether it is a larger query load or increased storage and query processing capacity.

    Horizontal Scalability

    MariaDB Enterprise ColumnStore provides horizontal scalability by executing some types of job steps in a distributed manner using multiple nodes.

    When Enterprise ColumnStore is evaluating a job step, the ExeMgr process or facility on the initiator/aggregator node requests the PrimProc process on each node to perform the job step on different extents in parallel. As more nodes are added, Enterprise ColumnStore can perform more work in parallel.

    Enterprise ColumnStore also uses massively parallel processing (MPP) techniques to speed up some types of job steps. For some types of aggregation operations, each node can perform an initial local aggregation, and then the initiator/aggregator node only needs to combine the local results and perform a final aggregation. This technique can be very efficient for some types of aggregation operations, such as for queries that use the AVG(), COUNT(), or SUM() aggregate functions.

    Vertical Scalability

    MariaDB Enterprise ColumnStore provides vertical scalability by executing some types of job steps in a multi-threaded manner using a thread pool.

    When the PrimProc process on a node receives work, it executes the job step on an extent in a multi-threaded manner using a thread pool. Each thread operates on a different block within the extent. As more CPUs are added, Enterprise ColumnStore can work on more blocks in parallel.

    Extent Elimination

    ECStore-QueryExecutionExtentElimination

    MariaDB Enterprise ColumnStore uses extent elimination to scale query evaluation as table size increases.

    Most databases are row-based databases that use manually-created indexes to achieve high performance on large tables. This works well for transactional workloads. However, analytical queries tend to have very low selectivity, so traditional indexes are not typically effective for analytical queries.

    Enterprise ColumnStore uses extent elimination to achieve high performance, without requiring manually created indexes. Enterprise ColumnStore automatically partitions all data into extents. Enterprise ColumnStore stores the minimum and maximum values for each extent in the extent map. Enterprise ColumnStore uses the minimum and maximum values in the extent map to perform extent elimination.

    When Enterprise ColumnStore performs extent elimination, it compares the query's join conditions and filter conditions (i.e., WHERE clause) to the minimum and maximum values for each extent in the extent map. If the extent's minimum and maximum values fall outside the bounds of the query's conditions, Enterprise ColumnStore skips that extent for the query.

    Extent elimination is automatically performed for every query. It can significantly decrease I/O for columns with clustered values. For example, extent elimination works effectively for series, ordered, patterned, and time-based data.

    Custom Select Handler

    The ColumnStore storage engine plugin implements a custom select handler to fully take advantage of Enterprise ColumnStore's capabilities.

    All storage engines interact with ES using an internal handler API, which is highly extensible. Storage engines can implement different features by implementing different methods within the handler API.

    For select statements, the handler API transforms each query into a SELECT_LEX object, which is provided to the select handler.

    The generic select handler is not optimal for Enterprise ColumnStore, because:

    • Enterprise ColumnStore selects data by column, but the generic selects handler selects data by row

    • Enterprise ColumnStore supports parallel query evaluation, but the generic select handler does not

    • Enterprise ColumnStore supports distributed aggregations, but the generic select handler does not

    • Enterprise ColumnStore supports distributed functions, but the generic select handler does not

    • Enterprise ColumnStore supports extent elimination, but the generic select handler does not

    • Enterprise ColumnStore has its own query planner, but the generic select handler cannot use it

    Smart Storage Engine

    The ColumnStore storage engine plugin is known as a smart storage engine, because it implements a custom select handler. MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.

    If a storage engine implements a custom select handler, it is known as a smart storage engine.

    As a smart storage engine, the ColumnStore storage engine plugin tightly integrates Enterprise ColumnStore with ES, but it has enough independence to efficiently execute analytical queries using a completely unique approach.

    Configure the Select Handler

    The ColumnStore storage engine can use either the custom select handler or the generic select handler. The select handler can be configured using the columnstore_select_handler system variable:

    Value
    Description

    AUTO

    • When set to AUTO, Enterprise ColumnStore automatically chooses the best select handler for a given SELECT query.

    • AUTO was added in Enterprise ColumnStore 6.

    OFF

    • When set to OFF, Enterprise ColumnStore uses the generic select handlers for all SELECT queries.

    • It is not recommended to use this value, unless recommended by MariaDB Support.

    ON

    • When set to ON, Enterprise ColumnStore uses the custom select handlers for all SELECT queries.

    • ON is the default in Enterprise ColumnStore 5 and Enterprise ColumnStore 6.

    Joins

    MariaDB Enterprise ColumnStore performs join operations using hash joins.

    By default, hash joins are performed in memory.

    Configure In-Memory Joins

    MariaDB Enterprise ColumnStore can be configured to allocate more memory for hash joins.

    The relevant configuration options are:

    Section
    Option
    Description

    HashJoin

    PmMaxMemorySmallSide

    • Configures the amount of memory available for a single join.

    • Valid values are from 0 to 4 GB.

    • Default value is 1 GB.

    HashJoin

    TotalUmMemory

    • Configures the amount of memory available for all joins.

    • Values can be specified as a percentage of total system memory or as a specific amount of memory.

    • Valid percentage values are from 0 to 100%

    For example, to configure Enterprise ColumnStore to use more memory for hash joins using the mcsSetConfig utility:

    Configure Disk-Based Joins

    MariaDB Enterprise ColumnStore can be configured to perform disk-based joins.

    The relevant configuration options are:

    Section
    Option
    Description

    HashJoin

    AllowDiskBasedJoin

    • Enables disk-based joins

    • Valid values are Y and N

    • Default value is N

    HashJoin

    TempFileCompression

    • Enables compression for temporary files used by disk-based joins

    • Valid values are Y and N

    • Default value is N

    SystemConfig

    SystemTempFileDir

    • Configures the directory used for temporary files used by disk-based joins and aggregations

    • Default value is /tmp/columnstore_tmp_files

    For example, to configure Enterprise ColumnStore to perform disk-based joins using the mcsSetConfig utility:

    Aggregations

    MariaDB Enterprise ColumnStore performs aggregation operations on all nodes in a distributed manner, and then all nodes send their results to a single node, which combines the results and performs the final aggregation.

    By default, aggregation operations are performed in memory.

    Configure Disk-Based Aggregations

    In Enterprise ColumnStore 5.6.1 and later, disk-based aggregations can be configured.

    The relevant configuration options are:

    Section
    Option
    Description

    RowAggregation

    AllowDiskBasedAggregation

    • Enables disk-based joins

    • Valid values are Y and N

    • Default value is N

    RowAggregation

    Compression

    • Enables compression for temporary files used by disk-based joins

    • Valid values are Y and N

    • Default value is N

    SystemConfig

    SystemTempFileDir

    • Configures the directory used for temporary files used by disk-based joins and aggregations

    • Default value is /tmp/columnstore_tmp_files

    For example, to configure Enterprise ColumnStore to perform disk-based aggregations using the mcsSetConfig utility:

    Query Planning

    The ColumnStore storage engine plugin is a smart storage engine, so MariaDB Enterprise ColumnStore to plan its own queries using the custom select handler.

    MariaDB Enterprise ColumnStore's query planning is divided into two steps:

    • ES provides the query's SELECT_LEX object to the custom select handler. The custom select handler builds a ColumnStore Execution Plan (CSEP).

    • The custom select handler provides the CSEP to the ExeMgr process or facility on the same node. ExeMgr performs extent elimination and creates a job list.

    ExeMgr Process/Facility

    The ColumnStore storage engine provides the CSEP to the ExeMgr process or facility on the same node, which will act as the initiator/aggregator node for the query.

    Starting with MariaDB Enterprise ColumnStore 22.08, the ExeMgr facility has been integrated into the PrimProc process, so it is no longer a separate process.

    ExeMgr performs multiple tasks:

    • Performs extent elimination.

    • Views the optimizer statistics.

    • Transforms the CSEP to a job list, which consists of job steps.

    • Assigns distributed job steps to the PrimProc process on each node.

    • Evaluates non-distributed job steps itself.

    • Provides final query results to ES.

    Query Evaluation Process

    ECStore-QueryExecutionwith-S3-FlowChart

    When Enterprise ColumnStore executes a query, it goes through the following process:

    1. The client or application sends the query to MariaDB MaxScale's listener port.

    2. The query is processed by the Read/Write Split Router (readwritesplit) service associated with the listener.

    3. The service routes the query to the ES TCP port on a ColumnStore node.

    4. MariaDB Enterprise Server (ES) evaluates the query using the handler interface.

    • The handler interface builds a SELECT_LEX object to represent the query.

    • The handler interface provides the SELECT_LEX object to the ColumnStore storage engine's select handler.

    • The select handler transforms the SELECT_LEX object into a ColumnStore Execution Plan (CSEP).

    • The select handler provides the CSEP to the ExeMgr facility on the same node, which will act as the initiator/aggregator node for the query.

    1. ExeMgr transforms the CSEP into a job list, which consists of job steps.

    2. ExeMgr evaluates each job step sequentially.

    • If it is a non-distributed job step, ExeMgr evaluates the job step itself.

    • If it is a distributed job step, ExeMgr provides the job step to the PrimProc process on each node. The PrimProc process on each node evaluates the job step in a multi-threaded manner using a thread pool. After the PrimProc process on each node evaluates its job step, the results are returned to ExeMgr on the initiator/aggregator node as a Row Group.

    1. After all job steps are evaluated, ExeMgr returns the results to ES.

    2. ES returns the results to MaxScale.

    3. MaxScale returns the results to the client or application.

    Procedure Steps

    Step
    Description

    Step 1

    Step 2

    Step 3

    Step 4

    Step 5

    Support

    Customers can obtain support by submitting a support case.

    Components

    The following components are deployed during this procedure:

    Component
    Function

    Modern SQL RDBMS with high availability, pluggable storage engines, hot online backups, and audit logging.

    MariaDB Enterprise Server Components

    Component
    Description

    • Columnar Storage Engine

    • Optimized for Online Analytical Processing (OLAP) workloads

    • S3-compatible object storage

    Topology

    The Single-Node Enterprise ColumnStore topology provides support for Online Analytical Processing (OLAP) workloads to MariaDB Enterprise Server.

    The Enterprise ColumnStore node:

    • Receives queries from the application

    • Executes queries

    • Use S3-compatible object storage for data

    High Availability

    Single-Node Enterprise ColumnStore does not provide high availability (HA) for Online Analytical Processing (OLAP). If you would like to deploy Enterprise ColumnStore with high availability, see Enterprise ColumnStore with Object storage.

    Requirements

    These requirements are for the Single-Node Enterprise ColumnStore, when deployed with MariaDB Enterprise Server and MariaDB Enterprise ColumnStore.

    Operating System

    • Debian 11 (x86_64, ARM64)

    • Debian 12 (x86_64, ARM64)

    • Red Hat Enterprise Linux 8 (x86_64, ARM64)

    • Red Hat Enterprise Linux 9 (x86_64, PPC64LE, ARM64)

    • Red Hat UBI 8 (x86_64, ARM64)

    • Rocky Linux 8 (x86_64, ARM64)

    • Rocky Linux 9 (x86_64, ARM64)

    • Ubuntu 20.04 LTS (x86_64, ARM64)

    • Ubuntu 22.04 LTS (x86_64, ARM64)

    • Ubuntu 24.04 LTS (x86_64, ARM64)

    Minimum Hardware Requirements

    MariaDB Enterprise ColumnStore's minimum hardware requirements are not intended for production environments, but the minimum hardware requirements can be appropriate for development and test environments. For production environments, see the recommended hardware requirements instead.

    The minimum hardware requirements are:

    Component
    CPU
    Memory

    Enterprise ColumnStore node

    4+ cores

    16+ GB

    MariaDB Enterprise ColumnStore will refuse to start if the system has less than 3 GB of memory.

    If Enterprise ColumnStore is started on a system with less memory, the following error message will be written to the ColumnStore system log called crit.log:

    And the following error message will be raised to the client:

    Recommended Hardware Requirements

    MariaDB Enterprise ColumnStore's recommended hardware requirements are intended for production analytics.

    The recommended hardware requirements are:

    Component
    CPU
    Memory

    Enterprise ColumnStore node

    64+ cores

    128+ GB

    Storage Requirements

    Single-node Enterprise ColumnStore with Object Storage requires the following storage type:

    Storage Type
    Description

    Single-node Enterprise ColumnStore with Object Storage uses S3-compatible object storage to store data.

    S3-Compatible Object Storage Requirements

    Single-node Enterprise ColumnStore with Object Storage uses S3-compatible object storage to store data.

    Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.

    For the preferred S3-compatible object storage providers that provide cloud and hardware solutions, see the following sections:

    • Cloud

    • Hardware

    The use of non-cloud and non-hardware providers is at your own risk.

    If you have any questions about using specific S3-compatible object storage with MariaDB Enterprise ColumnStore, contact us.

    Preferred Object Storage Providers: Cloud

    • Amazon Web Services (AWS) S3

    • Google Cloud Storage

    • Azure Storage

    • Alibaba Cloud Object Storage Service

    Preferred Object Storage Providers: Hardware

    • Cloudian HyperStore

    • Dell EMC

    • Seagate Lyve Rack

    • Quantum ActiveScale

    • IBM Cloud Object Storage

    Quick Reference

    MariaDB Enterprise Server Configuration Management

    Method
    Description

    Configuration File

    Configuration files (such as /etc/my.cnf) can be used to set and . The server must be restarted to apply changes made to configuration files.

    Command-line

    The server can be started with command-line options that set and .

    SQL

    Users can set that support dynamic changes on-the-fly using the statement.

    MariaDB Enterprise Server packages are configured to read configuration files from different paths, depending on the operating system. Making custom changes to Enterprise Server default configuration files is not recommended because custom changes may be overwritten by other default configuration files that are loaded later.

    To ensure that your custom changes will be read last, create a custom configuration file with the z- prefix in one of the include directories.

    Distribution
    Example Configuration File Path
    • CentOS

    • Red Hat Enterprise Linux (RHEL)

    /etc/my.cnf.d/z-custom-mariadb.cnf

    • Debian

    • Ubuntu

    /etc/mysql/mariadb.conf.d/z-custom-mariadb.cnf

    MariaDB Enterprise Server Service Management

    The systemctl command is used to start and stop the MariaDB Enterprise Server service.

    Operation
    Command

    Start

    sudo systemctl start mariadb

    Stop

    sudo systemctl stop mariadb

    Restart

    sudo systemctl restart mariadb

    Enable during startup

    sudo systemctl enable mariadb

    Disable during startup

    sudo systemctl disable mariadb

    Status

    sudo systemctl status mariadb

    Next Step

    Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

    Next: Step 1: Install MariaDB Enterprise ColumnStore.

    command:
    • As the first argument, provide the name for the server

    • As the second argument, provide maintenance as the state

    Confirm Maintenance Mode is Set for Replicas

    This action is performed on the MaxScale node.

    Confirm that the replicas are set to maintenance mode in MaxScale using MaxScale's REST API. If you are using MaxCtrl, the state of the replicas can be viewed using the list servers command:

    If the node is properly in maintenance mode, then the State column will show Maintenance as one of the states.

    Disable GTID Strict Mode

    This action is performed on each replica server.

    The system variable must be disabled for this upgrade procedure. If the gtid_strict_mode system variable is enabled in any configuration files, disable it temporarily until the upgrade procedure is complete.

    You can check if the gtid_strict_mode system variable is set in a configuration file by executing my_print_defaults command with the mysqld option:

    If the gtid_strict_mode system variable is set, you can temporarily disable it by adding # in front of it in the configuration file, so that it will be treated as a comment and ignored:

    Shutdown ColumnStore

    Prior to upgrading, MariaDB Enterprise ColumnStore must be shutdown.

    Stop Services

    This action is performed on each ColumnStore node.

    Prior to upgrading, several services must be stopped on each ColumnStore node:

    1. Stop the CMAPI service:

    2. Stop the MariaDB Enterprise ColumnStore service:

    3. Stop the MariaDB Enterprise Server service:

    Upgrade to the New Version

    MariaDB Corporation provides package repositories for YUM (RHEL, CentOS, Rocky Linux) and APT (Debian, Ubuntu).

    Upgrade via YUM (RHEL, CentOS, Rocky Linux)

    1. Retrieve your Customer Download Token at https://customers.mariadb.com/downloads/token/ and substitute for CUSTOMER_DOWNLOAD_TOKEN in the following directions.

    2. Configure the YUM package repository.

      Enterprise ColumnStore 23.10 is included with MariaDB Enterprise Server 11.4. Pass the version to install using the --mariadb-server-version flag to .

      To configure YUM package repositories:

      1. Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

    3. Update MariaDB Enterprise Server and package dependencies:

    Upgrade via APT (Debian, Ubuntu)

    1. Retrieve your Customer Download Token at https://customers.mariadb.com/downloads/token/ and substitute for CUSTOMER_DOWNLOAD_TOKEN in the following directions.

    2. Configure the APT package repository.

      Enterprise ColumnStore 23.10 is included with MariaDB Enterprise Server 11.4. Pass the version to install using the --mariadb-server-version flag to mariadb_es_repo_setup.

      To configure APT package repositories:

      1. Checksums of the various releases of the mariadb_es_repo_setup script can be found in the section at the bottom of the page. Substitute ${checksum} in the example above with the latest checksum.

    3. Update MariaDB Enterprise Server and package dependencies.

      The update command depends on the installed APT version, which can be determined by executing the following command:

      For versions prior to APT 2.0, execute the following command:

      For APT 2.0 and later, execute the following command:

    Disable ColumnStore Service

    This action is performed on each ColumnStore node.

    After upgrading, the MariaDB Enterprise ColumnStore service should be stopped, since it will be controlled by CMAPI:

    CMAPI disables the Enterprise ColumnStore service in a multi-node deployment. The Enterprise ColumnStore service will be started as-needed by the CMAPI service, so it does not need to start automatically upon reboot.

    Start Services

    This action is performed on each ColumnStore node.

    After upgrading, the CMAPI service and the MariaDB Enterprise Server service must be started on each ColumnStore node:

    1. Start the CMAPI service:

    2. Start the MariaDB Enterprise Server service:

    Write Binary Log

    On the primary server, run mariadb-upgrade to upgrade the data directory with binary logging enabled to update the system tables:

    Start ColumnStore

    After upgrading, MariaDB Enterprise ColumnStore must be started.

    Enable GTID Strict Mode

    This action is performed on each replica server.

    If you temporarily disabled the system variable in the Disable GTID Strict Mode step, it can be re-enabled. If the gtid_strict_mode system variable was temporarily disabled in any configuration files, re-enable it.

    Confirm ColumnStore Version

    This action is performed on each ColumnStore node.

    After upgrading, it is recommended to confirm the Enterprise ColumnStore version on each ColumnStore node. Connect to the node using and query the Columnstore_version status variable with :

    Confirm ES Version

    This action is performed on each ColumnStore node.

    After upgrading, it is recommended to confirm the ES version on each ColumnStore node. Connect to the node using and query the version system variable with :

    Clear Maintenance Mode for Replicas

    This action is performed for each replica server on the MaxScale node.

    After the upgrade, maintenance mode for each replica has been cleared in MaxScale using . If you are using , maintenance mode can be cleared using the clear server command:

    • As the first argument, provide the name for the server

    • As the second argument, provide maintenance as the state

    Confirm Maintenance Mode is Cleared for Replicas

    This action is performed for each replica server on the MaxScale node.

    Confirm that maintenance mode in MaxScale has been cleared for each replica using . If you are using , the state of the replicas can be viewed using the list servers command:

    If the node is no longer in maintenance mode, then the State column will no longer show Maintenance as one of the states.

    supported Operating Systems
    set to maintenance mode
    Test Enterprise Server Service

    Use Systemd to test whether the MariaDB Enterprise Server service is running. This action is performed on each Enterprise ColumnStore node.

    Check if the MariaDB Enterprise Server service is running by executing the following:

    If the service is not running on any node, start the service by executing the following on that node:

    Test Local Client Connections

    Use MariaDB Client to test the local connection to the Enterprise Server node.

    This action is performed on each Enterprise ColumnStore node:

    The sudo command is used here to connect to the Enterprise Server node using the root@localhost user account, which authenticates using the unix_socket authentication plugin. Other user accounts can be used by specifying the --user and --password command-line options.

    Test ColumnStore Storage Engine Plugin

    Query the table to confirm that the ColumnStore storage engine is loaded.

    This action is performed on each Enterprise ColumnStore node.

    Execute the following query:

    The PLUGIN_STATUS column for each ColumnStore-related plugin should contain ACTIVE.

    Test CMAPI Service

    Use Systemd to test whether the CMAPI service is running. This action is performed on each Enterprise ColumnStore node.

    Check if the CMAPI service is running by executing the following:

    If the service is not running on any node, start the service by executing the following on that node:

    Test ColumnStore Status

    Use CMAPI to request the ColumnStore status. The API key needs to be provided as part of the X-API-key HTML header.

    This action is performed with the CMAPI service on the primary server.

    Check the ColumnStore status using curl by executing the following:

    Test DDL

    Use MariaDB Client to test DDL.

    1. On the primary server, use the MariaDB Client to connect to the node:

    1. Create a test database and ColumnStore table:

    1. On each replica server, use the MariaDB Client to connect to the node:

    1. Confirm that the database and table exist:

    If the database or table do not exist on any node, then check the replication configuration.

    Test DML

    Use MariaDB Client to test DML.

    1. On the primary server, use the MariaDB Client to connect to the node:

    1. Insert sample data into the table created in the DDL test:

    1. On each replica server, use the MariaDB Client to connect to the node:

    1. Execute a query to retrieve the data:

    If the data is not returned on any node, check the ColumnStore status and the storage configuration.

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 5 of 9.

    Next: Step 6: Install MariaDB MaxScale.

    Adding a Node

    Adding a Node to MariaDB Enterprise ColumnStore

    To add a new node to Enterprise ColumnStore, perform the following procedure.

    Deploying Enterprise ColumnStore

    Before you can add a node to Enterprise ColumnStore, confirm that the Enterprise ColumnStore software has been deployed on the node in the desired topology.

    For additional information, see "".

    Backing Up MariaDB Data Directory on the Primary Server

    Before the new node can be added, its MariaDB data directory must be consistent with the Primary Server. To ensure that it is consistent, take a backup of the Primary Server:

    The instructions below show how to perform a backup using .

    1. On the Primary Server, take a full backup:

      Confirm successful completion of the backup operation.

    2. On the Primary Server, prepare the backup:

      Confirm successful completion of the prepare operation.

    Restoring the Backup on the New Node

    To make the new node consistent with the Primary Server, restore the new backup on the new node:

    1. On the Primary Server, copy the backup to the new node:

    2. On the new node, restore the backup using .

    3. On the new node, fix the file permissions of the restored backup:

    Starting the Enterprise ColumnStore Services

    The Enterprise Server. Enterprise ColumnStore, and CMAPI services can be started using the systemctl command. In case the services were started during the installation process, use the restart command.

    Perform the following procedure on the new node:

    1. Start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:

    2. Start and disable the MariaDB Enterprise ColumnStore service, so that it does not start automatically upon reboot:

      Note

      The Enterprise ColumnStore service should not be enabled in a multi-node deployment. The Enterprise ColumnStore service will be started as-needed by the CMAPI service, so it does not require starting automatically upon reboot.

    3. Start and enable the CMAPI service, so that it starts automatically upon reboot:

    Configuring MariaDB Replication

    MariaDB Enterprise ColumnStore requires MariaDB Replication, which must be configured.

    1. Get the GTID position that corresponds to the restored backup.

      If the backup was taken with , this position will be located in xtrabackup_binlog_info:

      The GTID position from the above output is 0-1-2001,1-2-5139.

    2. Connect to the Replica Server using using the root@localhost user account:

    Adding the Node to Enterprise ColumnStore

    The new node must be added to Enterprise ColumnStore using :

    • Add the node using the endpoint path

    • Use a , such as curl

    • Format the JSON output using jq for enhanced readability

    For example, if the primary node's host name is mcs1 and the new node's IP address is 192.0.2.3:

    • In ES 10.5.10-7 and later:

    • In ES 10.5.9-6 and earlier:

    Example output:

    Checking Enterprise ColumnStore Status

    To confirm that the node was properly added, the status of Enterprise ColumnStore should be checked using :

    • Check the status using the endpoint path

    For example, if the primary node's host name is mcs1:

    Example output:

    Adding a Server to MaxScale

    A server object for the new node must also be added to MaxScale using :

    • Use or another supported REST client

    • Add the server object using the create server command

    • As the first argument, provide a name for the server

    • As the second argument, provide the IP address for the node

    For example:

    Verifying the Server in MaxScale

    To confirm that the server object was properly added, the server objects should be checked using :

    • Show the server objects using the show servers command

    For example:

    Linking to Monitor in MaxScale

    The server object for the new node must be linked to the monitor using :

    • Link a server object to the monitor using the link monitor command

    • As the first argument, provide the name of the monitor

    • As the second argument, provide the name of the server

    Checking the Monitor in MaxScale

    To confirm that the server object was properly linked to the monitor, the monitor should be checked using :

    • Show the monitors using the show monitors command

    For example:

    Linking to Service in MaxScale

    The server object for the new node must be linked to the service using :

    • Link the server object to the service using the link service command

    • As the first argument, provide the name of the service

    • As the second argument, provide the name of the server

    Checking the Service in MaxScale

    To confirm that the server object was properly linked to the service, the service should be checked using :

    • Show the services using the show services command

    For example:

    Checking the Replication Status with MaxScale

    MaxScale is capable of checking the status of using :

    • List the servers using the list servers command

    For example:

    If the new node is properly replicating, then the State column will show Slave, Running.

    Upgrading MariaDB Enterprise ColumnStore (Alpha)

    This page documents an Alpha version of the upgrade procedure using the mcs install_es command. Behavior may change. Validate in a non‑production environment first.

    This guide explains how to upgrade MariaDB Enterprise Server (ES) and MariaDB Enterprise ColumnStore across all nodes in a cluster using the unified mcs command-line tool that you have to run only once.

    The mcs command must be run as root. Either become root, or prefix the mcs commands on this page with sudo.

    The mcs install_es command:

    • Validates your MariaDB Enterprise Repository access using an ES API token.

    • Stops ColumnStore and MariaDB services in a controlled sequence.

    • Installs/configures the ES repository for the target version.

    • Creates a pre‑upgrade backup of ColumnStore DBRM and config files on each node.

    Prerequisites

    • Administrative privileges on all cluster nodes (package installation and service management required).

    • A valid ES API token with access to the MariaDB Enterprise Repository.

    • Network access from the nodes to the MariaDB Enterprise Repository endpoints.

    • A maintenance window: the upgrade will stop ColumnStore and MariaDB services.

    Related docs:

    • General backup and restore guidance:

    Always back up your data before upgrading. While the tool performs a pre‑upgrade backup of DBRM and configs, it is not a substitute for a full database backup.

    Command Overview

    The command can target a specific ES version, or use the latest tested version (currently latest 10.6 version).

    • Install latest tested version (if you omit the --version option, mcs uses the latest version):

    • Install a specific version:

    • Proceed even if nodes report different installed package versions (use the majority version as baseline):

    Options summary:

    • --token TEXT: ES API Token to use for the upgrade (required) — get it .

    • -v, --version TEXT: ES version to install; if omitted or set to latest, upgrades to the latest tested version.

      • For a different version, specify something like --version 10.6.23-19

    Before you Begin

    • Stop or pause write workloads and heavy ingestion (e.g., cpimport, large INSERT/LOAD DATA jobs).

    • Drain or put traffic managers/proxies (for example, MaxScale) into maintenance/drain mode.

    • Ensure you have administrative/SSH and package manager access on all nodes.

    What mcs install_es Does

    1

    Validate token and target version.

    • If --version=latest, the tool resolves the latest tested ES version.

    • If a specific version is requested, it is validated against the repository. Some versions could exists only for specific operating systems.

    2

    Post-Upgrade Checks

    • Run mcs cluster status to verify all services are up and the cluster is healthy. In case of a failure:

      • Verify CMAPI readiness on all nodes (for example, via mcs or an external monitoring tool).

    • Run a quick smoke test:

    Downgrades

    • Downgrades are supported up to MariaDB 10.6.9-5 and ColumnStore 22.08.4.

    • When downgrading, the tool doesn't automatically restart services. Complete these steps manually:

      1. Start MariaDB on each node (for example, via your service manager).

      2. Start the ColumnStore cluster (for example, using the mcs cluster start

    Downgrades can cause data loss or cluster inconsistency if not planned and validated. Always test and ensure backups are restorable.

    Verification and Logs

    After a successful upgrade, or after downgrading and a manual restart:

    • Validate that CMAPI is ready on all nodes: mcs cmapi is-ready

    • Check ColumnStore and MariaDB services are running and the cluster is healthy: mcs cluster status

    The mcs install_es command writes a detailed run log to:

    • /var/tmp/mcs_cli_install_es.log

    If CMAPI readiness times out or services do not start cleanly, review:

    • CMAPI logs: /var/log/mariadb/columnstore/cmapi_server.log

    • Service logs on each node: /var/log/mariadb/columnstore/

    • The install_es log file (/var/tmp/mcs_cli_install_es.log) for the full sequence and any errors

    Known Issues and Limitations (Alpha/Beta)

    • Mixed package versions across nodes.

      • If nodes report different installed versions of Server/ColumnStore/CMAPI, the command fails with a mismatch message.

      • You can force continuation with --ignore-mismatch; the tool uses the majority version per package as the baseline, but this carries risk—align versions whenever possible.

    Troubleshooting

    • Re‑run with -v/--verbose to enable console debug logging.

    • Inspect /var/tmp/mcs_cli_install_es.log for the complete sequence and API responses.

    • If package repository installation fails, verify token validity and outbound access from all nodes.

    Environment and Network Requirements

    • Cluster state: ColumnStore cluster should be healthy before starting.

    • Node access: All nodes must be reachable (SSH/admin access) and responsive.

    • Disk space: Ensure sufficient free space for package downloads and pre-upgrade backups.

    • Internet access: Nodes must reach MariaDB Enterprise repositories (per your operating system).

    Additional Usage Example (Downgrade)

    Downgrades can be destructive.

    This prompts for confirmation. After downgrade, services are not restarted automatically; start MariaDB and the ColumnStore cluster manually and verify health.

    Recovery Procedures

    If the upgrade fails or CMAPI does not become ready on all nodes:

    1. Review the detailed log at /var/tmp/mcs_cli_install_es.log for errors.

    2. Check service status on each node:

      • systemctl status mariadb

    Best Practices

    • Prior to upgrading:

      • Create a full backup and verify restore procedures.

      • Test the process in staging with similar topology/data.

      • Document current package versions and configs.

    Support and Reporting Issues

    Contact MariaDB Support if you encounter unexpected failures, data issues, or performance regressions. Provide:

    • The complete log file: /var/tmp/mcs_cli_install_es.log .

    • The mcs review logs: mcs review --logs .

    • The exact command used (with parameters, masking sensitive values).

    See Also

    • Command reference: mcs install_es in the command-line tool help and tool README.

    • Backups: mcs backup and Extent Map backup guidance.

    • Cluster management: mcs cluster start|stop|status .

    Using MariaDB With R

    Introduction to R

    R is a language and environment for statistical computing and graphics. R provides a wide variety of statistical (linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, …), graphical techniques, machine learning packages and is highly extensible.

    One of R’s strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.

    The R Environment

    R is an integrated suite of software facilities for data manipulation, calculation, and graphical display.

    It includes:

    • an effective data handling and storage facility,

    • a suite of operators for calculations on arrays, in particular matrices,

    • a large, coherent, integrated collection of intermediate tools for data analysis,

    • graphical facilities for data analysis and display either on-screen or on hardcopy, and

    • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.

    Using R with MariaDB

    R Installation

    Some basic notions / tips on how to use R along with MariaDB are the following:

    A. The recommended R distribution is “Base R”:

    B. The recommended R GUIs are RStudio Desktop, or RStudio Server:

    Alternative GUIs would be:

    • RCode (PGM Solutions): .

    “R” and “MariaDB Server” can be installed either in the same server, or in different servers, as an ODBC communication protocol will be used for the exchange of data between the two environments.

    Data Transfer between R and MariaDB

    Package: "odbc"

    For the transfer of data between MariaDB Server and R Environment, it is recommended R's "odbc" Package:

    • “odbc" is a new R package available on CRAN (Since 2017-02-05), and maintained by RStudio, which is designed to comply with the DBI specification.

    • Tutorials on how to use R's "odbc" package can be found here:

      • Setting up ODBC Drivers:

      • "odbc" R Package:

    The "odbc" package requires to have previously installed the MariaDB or MySQL ODBC connector:

    For installing the "odbc" package from CRAN, execute in R:

    Package: "RMariaDB"

    “RMariaDB” R library, is a modern 'MariaDB' client based on 'Rcpp'.

    For installing RMariaDB package through CRAN, execute the following R statement:

    And for connecting to MariaDB:

    Other Packages: "readr", "RODBC"

    There are other alternatives for data transfer between R and MariaDB:

    • “readr” R package, for writing / reading CSV files. To be used in MariaDB along with “LOAD DATA INFILE”.

    • "RODBC" R package: Robust and well-tested (Since 2000-05-24) package which enables data transfer between R and MariaDB by means of an ODBC connector:

      • It is slightly slower than RStudio's new "odbc" package (See benchmarks):

      • For bug report to the RODBC package maintainer, use the following R statement: bug.report(package = "RODBC")

    R Programming Resources

    A) Programming

    Recommended resources for learning how to program in R are the following:

    B) Statistics

    A recommended book for understanding the underlying statistics in the R packages is:

    C) Cheatsheets: Concept Summary

    • Rstudio Cheatsheets are a recommended and valuable resource:

    • Along with the following Base R reference card:

    D) Search Engine & R Package Spotlight

    • Search Engines:

    • Information on new R packages is regularly published in the following websites:

    E) Statistical / Unsupervised Machine Learning, Deep Learning and Artificial Intelligence

    H2O.AI

    The R Programming language has support for the H2O.ai library (), which enables to create in-memory multi-cluster GPU powered machine learning models.

    For installing H2O.ai through CRAN, execute:

    The following R Statements can be used for importing a MariaDB table to H2O.ai using the R Front End:

    • import_sql_table: "This function imports a SQL table to H2OFrame in memory".

    • import_sql_select: "This function imports the SQL table that is the result of the specified SQL query to H2OFrame in memory".

    NOTE: Be sure to start the h2o.jar in the terminal with your downloaded JDBC driver in the classpath:

    KERAS

    offers an interface to , a high-level neural networks 'API'.

    'Keras' was developed with a focus on enabling fast experimentation, supports both convolution based networks and recurrent networks (as well as combinations of the two), and runs seamlessly on both 'CPU' and 'GPU' devices.

    R LIBRARIES: CARET

    A book which introduces core Machine Learning concepts:

    F) Text Mining

    Documentation on how to perform Text Mining in R can be found in the book "Text Mining With R":

    G) Shiny Web Apps & RMarkdown Documents

    SHINY WEB APPS

    R Package makes it incredibly easy to build interactive web applications with R.

    Automatic "reactive" binding between inputs and outputs and extensive prebuilt widgets make it possible to build beautiful, responsive, and powerful applications with minimal effort.

    For deploy Shiny Web Applications using Open Source Alternatives, you can either use:

    RMARKDOWN DOCUMENTS

    H) Advanced R Resources

    Some of the most advanced R resources for fully understanding the internals and nuances of the R Programming Language are the following:

    Data Loading with cpimport

    Overview

    MariaDB Enterprise ColumnStore includes a bulk data loading tool called cpimport, which bypasses the SQL layer to decrease the overhead of bulk data loading.

    Refer to the cpimport modes for additional information and to ColumnStore Bulk Data Loading.

    The cpimport tool:

    • Bypasses the SQL layer to decrease overhead;

    • Does not block read queries;

    • Requires a write metadata lock on the table, which can be monitored with the ;

    • Appends the new data to the table. While the bulk load is in progress, the newly appended data is temporarily hidden from queries. After the bulk load is complete, the newly appended data is visible to queries;

    • Inserts each row in the order the rows are read from the source file. Users can optimize data loads for Enterprise ColumnStore's automatic partitioning by loading presorted data files;

    • Supports parallel distributed bulk loads;

    • Imports data from text files;

    • Imports data from binary files;

    • Imports data from standard input (stdin).

    Intended Use Cases

    You can load data using the cpimport tool in the following cases:

    • You are loading data into a ColumnStore table from a text file stored on the primary node's file system.

    • You are loading data into a ColumnStore table from a binary file stored on the primary node's file system.

    • You are loading data into a ColumnStore table from the output of a command running on the primary node.

    Locking

    MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.

    When a bulk data load is running:

    • Read queries will not be blocked.

    • Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.

    • The write metadata lock (MDL) can be monitored with the .

    Importing the Schema

    Before data can be imported into the tables, the schema must be created.

    1. Connect to the primary server using :

    After the command is executed, it prompts for a password.

    1. For each imported database, create the database with the statement:

    1. For each imported table, create the table with the statement:

    To get the best performance from Enterprise ColumnStore, make sure to follow Enterprise ColumnStore's best practices for schema design.

    Appending Data

    When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read. Appending data reduces the I/O requirements of bulk data loads, so that larger data sets can be loaded very efficiently.

    While the bulk load is in progress, the newly appended data is temporarily hidden from queries.

    After the bulk load is complete, the newly appended data is visible to queries.

    Sorting the Input File

    When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read.

    The order of data can have a significant effect on performance with Enterprise ColumnStore, so it can be helpful to sort the data in the input file prior to importing it.

    For additional information, see "".

    Confirming the Field Delimiter

    Before importing a file into MariaDB Enterprise ColumnStore, confirm that the field delimiter is not present in the data.

    The default field delimiter for the cpimport tool is a pipe (|).

    To use a different delimiter, you can set the field delimiter.

    Importing from Text Files

    The cpimport tool can import data from a text file if a file is provided as an argument after the database and table name.

    For example, to import the file inventory-products.txt into the products table in the inventory database:

    Importing from Binary Files

    The cpimport tool can import data from a binary file if the -I1 or -I2 option is provided and a file is provided as an argument after the database and table name.

    For example, to import the file inventory-products.bin into the products table in the inventory database:

    The -I1 and -I2 options allow two different binary import modes to be selected:

    Option
    Description

    The binary file should use the following format for data:

    Data Type(s)
    Format

    Binary DATE Format

    In binary input files, the cpimport tool expects columns to be in the following format:

    Binary DATETIME Format

    In binary input files, the cpimport tool expects columns to be in the following format:

    Importing from Standard Input

    The cpimport tool can import data from standard input (stdin) if no file is provided as an argument.

    Importing from standard input is useful in many scenarios.

    One scenario is when you want to import data from a remote database. You can use to query the table using the statement, and then pipe the results into the standard input of the cpimport tool:

    Importing from S3 Using AWS CLI

    The cpimport tool can import data from a file stored in a remote S3 bucket.

    You can use the AWS CLI to copy the file from S3, and then pipe the contents into the standard input of the cpimport tool:

    Alternatively, the columnstore_info.load_from_s3 stored procedure can import data from S3-compatible cloud object storage.

    Setting the Field Delimiter

    The default field delimiter for the cpimport tool is a pipe sign (|).

    If your data file uses a different field delimiter, you can specify the field delimiter with the -s option.

    For a TSV (tab-separated values) file:

    For a CSV (comma-separated values) file:

    Setting the Quoting Style

    By default, the cpimport tool does not expect fields to be quoted.

    If your data file uses quotes around fields, you can specify the quote character with the -E option.

    To load a TSV (tab-separated values) file that uses double quotes:

    To load a CSV (comma-separated values) file that uses optional single quotes:

    Logging

    The cpimport tool writes logs to different directories, depending on the Enterprise ColumnStore version:

    • In Enterprise ColumnStore 5.5.2 and later, logs are written to /var/log/mariadb/columnstore/bulk/

    • In Enterprise ColumnStore 5 releases before 5.5.2, logs are written to /var/lib/columnstore/data/bulk/

    • In Enterprise ColumnStore 1.4, logs are written to /usr/local/mariadb/columnstore/bulk/

    Special Handling

    Column Order

    The cpimport tool requires column values to be in the same order in the input file as the columns in the table definition.

    Date Format

    The cpimport tool requires values to be specified in the format YYYY-MM-DD.

    Transaction Log

    The cpimport tool does not write bulk data loads to the transaction log, so they are not transactional.

    Binary Log

    The cpimport tool does not write bulk data loads to the binary log, so they cannot be replicated using .

    EFS Storage

    When Enterprise ColumnStore uses object storage and the Storage Manager directory uses EFS in the default Bursting Throughput mode, the cpimport tool can have performance problems if multiple data load operations are executed consecutively. The performance problems can occur because the Bursting Throughput mode scales the rate relative to the size of the file system, so the burst credits for a small Storage Manager volume can be fully consumed very quickly.

    When this problem occurs, some solutions are:

    • Avoid using burst credits by using Provisioned Throughput mode instead of Bursting Throughput mode

    • Monitor burst credit balances in AWS and run data load operations when burst credits are available

    • Increase the burst credit balance by increasing the file system size (for example, by creating a dummy file)

    Additional information is available .

    $ mcsSetConfig HashJoin PmMaxMemorySmallSide 2G
    $ mcsSetConfig HashJoin TotalUmMemory '40%'
    mcsSetConfig HashJoin AllowDiskBasedJoin Y
    mcsSetConfig HashJoin TempFileCompression Y
    mcsSetConfig SystemConfig SystemTempFileDir /mariadb/tmp
    $ mcsSetConfig RowAggregation AllowDiskBasedAggregation Y
    $ mcsSetConfig RowAggregation Compression SNAPPY
    $ mcsSetConfig SystemConfig SystemTempFileDir /mariadb/tmp
    Apr 30 21:54:35 a1ebc96a2519 PrimProc[1004]: 35.668435 |0|0|0| C 28 CAL0000: Error total memory available is less than 3GB.
    ERROR 1815 (HY000): Internal error: System is not ready yet. Please try again.
    sudo systemctl stop mariadb-columnstore-cmapi
    sudo systemctl stop mariadb-columnstore
    sudo systemctl stop mariadb
    sudo yum install curl
    curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    echo "${checksum} mariadb_es_repo_setup" | sha256sum -c -
    chmod +x mariadb_es_repo_setup
    sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
       --mariadb-server-version="11.4"
    sudo apt install curl
    curl -LsSO https://dlm.mariadb.com/enterprise-release-helpers/mariadb_es_repo_setup
    echo "${checksum}  mariadb_es_repo_setup" sha256sum -c -
    chmod +x mariadb_es_repo_setup
    sudo ./mariadb_es_repo_setup --token="CUSTOMER_DOWNLOAD_TOKEN" --apply \
       --mariadb-server-version="11.4"
    sudo apt update
    sudo systemctl start mariadb-columnstore-cmapi
    sudo systemctl start mariadb
    maxctrl set server \
       mcs2 \
       maintenance
    maxctrl list servers
    ┌────────┬───────────────┬──────┬─────────────┬──────────────────────┬────────┐
    │ Server │ Address       │ Port │ Connections │ State                │ GTID   │
    ├────────┼───────────────┼──────┼─────────────┼──────────────────────┼────────┤
    │ mcs3   │ 192.0.2.3     │ 3306 │ 0           │ Maintenance, Running │ 0-1-17 │
    ├────────┼───────────────┼──────┼─────────────┼──────────────────────┼────────┤
    │ mcs2   │ 192.0.2.2     │ 3306 │ 0           │ Maintenance, Running │ 0-1-17 │
    ├────────┼───────────────┼──────┼─────────────┼──────────────────────┼────────┤
    │ mcs1   │ 192.0.2.1     │ 3306 │ 0           │ Master, Running      │ 0-1-17 │
    └────────┴───────────────┴──────┴─────────────┴──────────────────────┴────────┘
    my_print_defaults --mysqld \
       | grep "gtid[-_]strict[-_]mode"
    --gtid_strict_mode=1
    [mariadb]
    ...
    # temporarily commented out for upgrade
    # gtid_strict_mode=1
    mcs cluster stop
    sudo systemctl stop mariadb-columnstore
    sudo systemctl disable mariadb-columnstore
    mariadb-upgrade --write-binlog
    mcs cluster start
    SHOW GLOBAL STATUS LIKE 'Columnstore_version';
    +---------------------+---------+
    | Variable_name       | Value   |
    +---------------------+---------+
    | Columnstore_version | 23.10.0 |
    +---------------------+---------+
    SHOW GLOBAL VARIABLES LIKE 'version';
    +---------------+----------------------------------+
    | Variable_name | Value                            |
    +---------------+----------------------------------+
    | version       | 10.6.9-5-MariaDB-enterprise-log  |
    +---------------+----------------------------------+
    maxctrl clear server \
       mcs2 \
       maintenance
    maxctrl list servers
    ┌────────┬───────────────┬──────┬─────────────┬─────────────────┬─────────┐
    │ Server │ Address       │ Port │ Connections │ State           │ GTID    │
    ├────────┼───────────────┼──────┼─────────────┼─────────────────┼─────────┤
    │ mcs3   │ 192.0.2.3     │ 3306 │ 0           │ Slave, Running  │ 0-3-159 │
    ├────────┼───────────────┼──────┼─────────────┼─────────────────┼─────────┤
    │ mcs2   │ 192.0.2.2     │ 3306 │ 0           │ Slave, Running  │ 0-1-88  │
    ├────────┼───────────────┼──────┼─────────────┼─────────────────┼─────────┤
    │ mcs1   │ 192.0.2.1     │ 3306 │ 0           │ Master, Running │ 0-1-88  │
    └────────┴───────────────┴──────┴─────────────┴─────────────────┴─────────┘
    $ systemctl status mariadb
    $ sudo systemctl start mariadb
    $ sudo mariadb
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 38
    Server version: 11.4.5-3-MariaDB-Enterprise MariaDB Enterprise Server
    
    Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    MariaDB [(none)]>
    SELECT PLUGIN_NAME, PLUGIN_STATUS
    FROM information_schema.PLUGINS
    WHERE PLUGIN_LIBRARY LIKE 'ha_columnstore%';
    
    +---------------------+---------------+
    | PLUGIN_NAME         | PLUGIN_STATUS |
    +---------------------+---------------+
    | Columnstore         | ACTIVE        |
    | COLUMNSTORE_COLUMNS | ACTIVE        |
    | COLUMNSTORE_TABLES  | ACTIVE        |
    | COLUMNSTORE_FILES   | ACTIVE        |
    | COLUMNSTORE_EXTENTS | ACTIVE        |
    +---------------------+---------------+
    $ systemctl status mariadb-columnstore-cmapi
    $ sudo systemctl start mariadb-columnstore-cmapi
    $ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       | jq .
    {
      "timestamp": "2020-12-15 00:40:34.353574",
      "192.0.2.1": {
        "timestamp": "2020-12-15 00:40:34.362374",
        "uptime": 11467,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [
          "1"
        ],
        "module_id": 1,
        "services": [
          {
            "name": "workernode",
            "pid": 19202
          },
          {
            "name": "controllernode",
            "pid": 19232
          },
          {
            "name": "PrimProc",
            "pid": 19254
          },
          {
            "name": "ExeMgr",
            "pid": 19292
          },
          {
            "name": "WriteEngine",
            "pid": 19316
          },
          {
            "name": "DMLProc",
            "pid": 19332
          },
          {
            "name": "DDLProc",
            "pid": 19366
          }
        ]
      },
      "192.0.2.2": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "192.0.2.3": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "num_nodes": 3
    }
    $ sudo mariadb
    CREATE DATABASE IF NOT EXISTS test;
    
    CREATE TABLE IF NOT EXISTS test.contacts (
       first_name VARCHAR(50),
       last_name VARCHAR(50),
       email VARCHAR(100)
    ) ENGINE = ColumnStore;
    $ sudo mariadb
    SHOW CREATE TABLE test.contacts\G;
    $ sudo mariadb
    INSERT INTO test.contacts (first_name, last_name, email)
       VALUES
       ("Kai", "Devi", "kai.devi@example.com"),
       ("Lee", "Wang", "lee.wang@example.com");
    $ sudo mariadb
    SELECT * FROM test.contacts;
    
    +------------+-----------+----------------------+
    | first_name | last_name | email                |
    +------------+-----------+----------------------+
    | Kai        | Devi      | kai.devi@example.com |
    | Lee        | Wang      | lee.wang@example.com |
    +------------+-----------+----------------------+
    SET columnstore_unstable_optimizer=ON;
    SET optimizer_switch='index_merge=off,index_merge_union=off,index_merge_sort_union=off,index_merge_intersection=off,index_merge_sort_intersection=off,index_condition_pushdown=off,derived_merge=off,derived_with_keys=off,firstmatch=off,loosescan=off,materialization=on,in_to_exists=off,semijoin=off,partial_match_rowid_merge=off,partial_match_table_scan=off,subquery_cache=off,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=off,semijoin_with_cache=off,join_cache_incremental=off,join_cache_hashed=off,join_cache_bka=off,optimize_join_buffer_size=off,table_elimination=off,extended_keys=off,exists_to_in=off,orderby_uses_equalities=off,condition_pushdown_for_derived=on,split_materialized=off,condition_pushdown_for_subquery=off,rowid_filter=off,condition_pushdown_from_having=on,not_null_range_scan=off,hash_join_cardinality=off,cset_narrowing=off,sargable_casefold=off';
    SELECT c_zip, sum(c_payment_cnt)  FROM test.customer_indexed GROUP BY c_zip ORDER BY c_zip ; -- 0.7s
    sed -i 's/^columnstore_innodb_queries_use_mcs = on/#columnstore_innodb_queries_use_mcs = on/' /etc/my.cnf.d/columnstore.cnf
    systemctl restart mariadb
    SELECT mcs_get_plan('rules');
    +-----------------------+
    | mcs_get_plan('rules') |
    +-----------------------+
    | parallel_ces          |
    +-----------------------+
    
    SELECT mcs_get_plan('optimized');
    +-----------------------+
    | mcs_get_plan('rules') |
    +-----------------------+
    ...
    >>From Tables
      derived table - $added_sub_test_customer_indexed_0
    select calShowPartitions('orders','orderdate');
    +-----------------------------------------+
    | calShowPartitions('orders','orderdate') |
    +-----------------------------------------+
    | Part# Min        Max        Status
      0.0.1 1992-01-01 1998-08-02 Enabled
      0.1.2 1998-08-03 2004-05-15 Enabled
      0.2.3 2004-05-16 2010-07-24 Enabled |
    +-----------------------------------------+
    
    1 row in set (0.05 sec)
    select calEnablePartitions('orders', '0.0.1');
    +----------------------------------------+
    | calEnablePartitions('orders', '0.0.1') |
    +----------------------------------------+
    | Partitions are enabled successfully.   |
    +----------------------------------------+
    1 row in set (0.28 sec)
    select calShowPartitions('orders','orderdate');
    +-----------------------------------------+
    | calShowPartitions('orders','orderdate') |
    +-----------------------------------------+
    | Part# Min        Max        Status
      0.0.1 1992-01-01 1998-08-02 Enabled
      0.1.2 1998-08-03 2004-05-15 Enabled
      0.2.3 2004-05-16 2010-07-24 Enabled |
    +-----------------------------------------+
    1 rows in set (0.05 sec)
    select calDisablePartitions('orders','0.0.1');
    +----------------------------------------+
    | calDisablePartitions('orders','0.0.1') |
    +----------------------------------------+
    | Partitions are disabled successfully.  |
    +----------------------------------------+
    1 row in set (0.28 sec)
    select calShowPartitions('orders','orderdate');
    +-----------------------------------------+
    | calShowPartitions('orders','orderdate') |
    +-----------------------------------------+
    | Part# Min        Max        Status
      0.0.1 1992-01-01 1998-08-02 Disabled
      0.1.2 1998-08-03 2004-05-15 Enabled
      0.2.3 2004-05-16 2010-07-24 Enabled |
    +-----------------------------------------+
    1 row in set (0.05 sec)
    select calDropPartitions('orders', '0.0.1');
    +--------------------------------------+
    | calDropPartitions('orders', '0.0.1') |
    +--------------------------------------+
    | Partitions are enabled successfully  |
    +--------------------------------------+
    1 row in set (0.28 sec)
    select calShowPartitions('orders','orderdate');
    +-----------------------------------------+
    | calShowPartitions('orders','orderdate') |
    +-----------------------------------------+
    | Part# Min        Max        Status
      0.1.2 1998-08-03 2004-05-15 Enabled
      0.2.3 2004-05-16 2010-07-24 Enabled |
    +-----------------------------------------+
    1 row in set (0.05 sec)
    select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
    +----------------------------------------------------------------------------+
    | calShowPartitionsbyvalue('orders','orderdate', '1992-01-02', '2010-07-24') |
    +----------------------------------------------------------------------------+
    | Part# Min        Max        Status
      0.0.1 1992-01-01 1998-08-02 Enabled
      0.1.2 1998-08-03 2004-05-15 Enabled
      0.2.3 2004-05-16 2010-07-24 Enabled |
    +----------------------------------------------------------------------------+
    1 row in set (0.05 sec)
    select calEnablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
    +--------------------------------------------------------------------------------+
    | calenablepartitionsbyvalue ('orders', 'o_orderdate','1992-01-01','1998-08-02') |
    +--------------------------------------------------------------------------------+
    | Partitions are enabled successfully                                            |
    +--------------------------------------------------------------------------------+
    1 row in set (0.28 sec)
    select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
    +----------------------------------------------------------------------------+
    | calShowPartitionsbyvalue('orders','orderdate', '1992-01-02','2010-07-24' ) |
    +----------------------------------------------------------------------------+
    | Part# Min        Max        Status
      0.0.1 1992-01-01 1998-08-02 Enabled
      0.1.2 1998-08-03 2004-05-15 Enabled
      0.2.3 2004-05-16 2010-07-24 Enabled |
    +----------------------------------------------------------------------------+
    1 rows in set (0.05 sec)
    select calDisablePartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
    +---------------------------------------------------------------------------------+
    | caldisablepartitionsbyvalue ('orders', 'o_orderdate','1992-01-01','1998-08-02') |
    +---------------------------------------------------------------------------------+
    | Partitions are disabled successfully                                            |
    +---------------------------------------------------------------------------------+
    1 row in set (0.28 sec)
    select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
    +----------------------------------------------------------------------------+
    | calShowPartitionsbyvalue('orders','orderdate', '1992-01-02','2010-07-24’ ) |
    +----------------------------------------------------------------------------+
    | Part# Min        Max        Status
      0.0.1 1992-01-01 1998-08-02 Disabled
      0.1.2 1998-08-03 2004-05-15 Enabled
      0.2.3 2004-05-16 2010-07-24 Enabled |
    +----------------------------------------------------------------------------+
    1 row in set (0.05 sec)
    select calDropPartitionsByValue('orders','orderdate', '1992-01-01', '1998-08-02');
    +------------------------------------------------------------------------------+
    | caldroppartitionsbyvalue ('orders', 'o_orderdate','1992-01-01','1998-08-02') |
    +------------------------------------------------------------------------------+
    | Partitions are enabled successfully.                                         |
    +------------------------------------------------------------------------------+
    1 row in set (0.28 sec)
    select calShowPartitionsByValue('orders','orderdate', '1992-01-01', '2010-07-24');
    +----------------------------------------------------------------------------+
    | calShowPartitionsbyvalue('orders','orderdate', '1992-01-02','2010-07-24' ) |
    +----------------------------------------------------------------------------+
    | Part# Min        Max        Status
      0.1.2 1998-08-03 2004-05-15 Enabled
      0.2.3 2004-05-16 2010-07-24 Enabled |
    +----------------------------------------------------------------------------+
    1 row in set (0.05 sec)
    DELETE FROM orders WHERE orderdate <= '1998-12-31';
    Set the system variable to the GTID position:
  • Execute the statement to configure the new node to connect to the Primary Server at this position:

    The above statement configures the Replica Server to connect to a Primary Server located at 192.0.2.1 using the repl user account.

  • Start replication using the command:

    The above statement configures the new node to connect to the Primary Server to retrieve new binary log events and replicate them into the local database.

  • Authenticate using the configured
  • Include the required headers

  • CMAPI
    add-node
    supported REST client
    CMAPI
    status
    API key

    Upgrades MariaDB Enterprise Server, ColumnStore, and CMAPI.

  • Waits for CMAPI to come back online on each node and, for upgrades, automatically restarts services.

  • Recent backups:

    • At a minimum, ensure Extent Map and configuration backups exist.

    • Recommended: take a full backup with the mcs backup command.

    or
    --version 11.4.8-5
    .
  • --ignore-mismatch: Continue even if cluster nodes report different package versions; uses majority versions as the baseline.

  • Verify time synchronization across all nodes (NTP/Chrony) to avoid coordination issues.

  • Confirm recent backups are complete and restorable.

  • Stop services.

    • Gracefully stops the ColumnStore cluster.

    • Stops the MariaDB server.

    3

    Configure repository.

    • Installs/configures the MariaDB Enterprise Server repository for the chosen version on each node automatically.

    • Validate installed repo on each node separately

    4

    Pre‑upgrade backups (per node).

    Creates a backup of DBRM and key configuration files with name preupgrade_dbrm_backup in default backup directory.

    5

    Upgrade packages (per node).

    • Upgrades MariaDB Enterprise Server and ColumnStore packages.

    • Upgrades CMAPI and waits for it to become ready again on each node (up to 5 minutes).

    6

    Service handling after upgrade.

    • On upgrades: automatically restarts MariaDB and the ColumnStore cluster.

    • On downgrades: automatic restarts are intentionally skipped; manual steps are required.

    Create a small ColumnStore table, insert a few rows, and run a SELECT query.

  • Check for errors in server/ColumnStore logs.

  • Review /var/tmp/mcs_cli_install_es.log for the full sequence, and ensure no errors were reported.

  • command).
  • Verify cluster health before resuming traffic.

  • CMAPI readiness timeout
    • After upgrading CMAPI, the command waits up to 300 seconds per node for readiness.

    • On slow nodes or constrained environments, this timeout may be insufficient, and the command exits with a failure; verify services manually and adjust operational expectations.

  • Downgrade restarts are skipped by design.

    • After a downgrade, automatic restarts are not performed; you must start MariaDB and the ColumnStore cluster manually and validate health.

    • ColumnStore skips automatic restarts, because it cannot guarantee that all the expected APIs endpoints exist or are backward-compatible.

  • MaxScale maintenance handling not automated.

    • Transitioning MaxScale to maintenance/normal mode during upgrades is not automated at this time; manage traffic routing and maintenance state manually if applicable.

  • Repository access and version validation.

    • Invalid tokens, network restrictions, or unsupported version strings can result in validation errors (for example, HTTP 422). Ensure the token has the correct entitlements and the requested version exists for your platform.

  • Single‑node detection.

    • If no active nodes are detected, the tool falls back to localhost only; ensure this matches your topology.

  • Downgrading to 22.08.4 (10.6.9-5) technically working but finished with known issues:

    • Got ERROR on waiting CMAPI ready. But in fact CMAPI starts and is working fine (check mcs status and systemctl status mariadb-columnstore-cmapi on each node).

    • If you try to run a mariadb command, you got an error due to unknown configuration flag. Tool forcing to save current config files while installing packages, and an older MariaDB version doesn't support never flag obviously. To fix it, remove this flag from the configuration file, or restore the configuration from last installed package.

  • Tool currently supported limited packages.

    • Only MariaDB-server (and dependencies), MariaDB-columnstore-engine (MariaDB-plugin-columnstore) and MariaDB-columnstore-cmapi packages remove and install supported. So packages like MariaDB-backup currently not supported and should be upgraded/downgraded manually.

  • If CMAPI does not become ready, check service logs on each node.
  • For mismatched node versions, align package versions before re‑running, or proceed with --ignore-mismatch , but only after assessing the risk.

  • CMAPI communication: Port 8640 (default) must be reachable between nodes.

  • Time sync: Keep NTP/Chrony synchronized across nodes.

  • systemctl status mariadb-columnstore-cmapi
  • Verify network/ports (CMAPI 8640) and repository reachability.

  • Manually start services if safe to do so:

    • systemctl start mariadb

    • mcs start (or mcs cluster start)

  • If corruption is suspected, follow your backup recovery plan (for example, restore from a recent backup and/or extent map backup).

  • Schedule a maintenance window and inform stakeholders.

  • During upgrading:

    • Monitor the console output and /var/tmp/mcs_cli_install_es.log .

    • Avoid interrupting the process; ensure network stability.

  • After upgrading:

    • Validate services and cluster health (mcs cluster status).

    • Run basic data integrity and application smoke tests.

    • Monitor performance and logs for regressions.

  • Cluster topology (nodes, versions, operating system, network).
  • Source and target versions (Server, ColumnStore, CMAPI).

  • Exact error messages and timestamps.

  • Extent Map backup and recovery
    ColumnStore Backup and Restore
    from here
  • A vignette on how to use the RODBC package can be found here: RODBC CRAN Vignette

  • Mastering Spark with R (O'Reilly; Javier Luraschi, Kevin Kuo, Edgar Ruiz)
  • R Packages (Hadley Wickham; O’Reilly)

  • R-bloggers

  • Towards Data Science

  • MRAN: Package Spotlight

  • Machine Learning with R and H2O (Mark Landry): Booklet Online Version
  • Deep Learning with H2O: Vignette

  • CRAN
    RStudio
    RCode
    CRAN odbc
    DB RStudio Drivers
    DB RStudio odbc Usage
    MariaDB ODBC Connector
    MySQL ODBC Connector
    CRAN RODBC
    RStudio odbc
    R Cookbook Second Edition (O’Reilly Media; Paul Teetor; James (JD) Long)
    R Graphics Cookbook Second Edition (O’Reilly Media; Winston Chang)
    R for Data Science (O’Reilly Media; Garrett Grolemund, Hadley Wickham)
    Advanced R Second Edition (CRC R Series; Hadley Wickham)
    Practical Statistics for Data Scientists (O’Reilly Media; Peter Bruce, Andrew Bruce)
    RStudio Cheatsheets: Webpage
    R Reference Card v2
    RSeek: For searching any R related information (Based on Google).
    RPackages: Search and stats for CRAN packages.
    h2o
    H2O.ai: Webpage
    H2O.ai Algorithms: Cheatsheet
    h2o R Package Functions: Cheatsheet
    Practical Machine Learning with H2O (O'Reilly Media; Darren Cook)
    R package keras
    Python's 'Keras'
    R interface to Keras: Webpage
    Deep Learning With R (François Chollet with J. J. Allaire, Manning)
    Keras Rstudio Cheatsheet
    Introduction to Machine Learning with R (O'Reilly; Scott Burger)
    Text Mining With R: A Tidy Approach (O’Reilly Media; Julia Silge and David Robinson): Book Online Version
    Shiny
    Shiny Written Tutorials
    Shiny R Package Cheatsheet
    RInno: CRAN Webpage (Windows)
    ShinyProxy: Webpage
    Shiny Server (Open Source Edition): Webpage
    R Markdown: The Definitive Guide (Book).
    R Markdown Cheatsheet.
    Chapman & Hall/CRC The R Series: Subject-specific Books

    FLOAT

    Native IEEE floating point format NULL: 0xFFAAAAAA

    INT

    Little-endian integer format Signed NULL: 0x80000000 Unsigned NULL: 0xFFFFFFFE

    SMALLINT

    Little-endian integer format Signed NULL: 0x8000 Unsigned NULL: 0xFFFE

    TINYINT

    Little-endian integer format Signed NULL: 0x80 Unsigned NULL: 0xFE

    VARCHAR

    String padded with '0' to match the length of the field NULL: '0' for the full length of the field

    -I1

    Numeric fields containing NULL will be treated as NULL unless the column has a default value

    -I2

    Numeric fields containing NULL will be saturated

    BIGINT

    Little-endian integer format Signed NULL: 0x8000000000000000ULL Unsigned NULL: 0xFFFFFFFFFFFFFFFEULL

    CHAR

    String padded with '0' to match the length of the field NULL: '0' for the full length of the field

    DATE

    Use the format represented by the struct Date NULL: 0xFFFFFFFE

    DATETIME

    Use the format represented by the struct DateTime NULL: 0xFFFFFFFFFFFFFFFEULL

    DECIMAL

    Use an integer representation of the value without the decimal point Sizing depends on precision: * 1-2: use 2 bytes * 3-4: use 3 bytes * 4-9: use 4 bytes * 10+: use 8 bytes Signed and unsigned NULL: See equivalent-sized integer

    DOUBLE

    Native IEEE floating point format NULL: 0xFFFAAAAAAAAAAAAAULL

    Load Ordered Data in Proper Order
    here
    Default value is 25%
    Prepare System for Enterprise ColumnStore
    Install Enterprise ColumnStore
    Start and Configure Enterprise ColumnStore
    Test Enterprise ColumnStore
    Bulk Import Data to Enterprise ColumnStore
    MariaDB Enterprise ColumnStore
    S3-Compatible Object Storage
    MariaDB Enterprise Server

    Analyzing Queries

    Determining Active Queries

    SHOW PROCESSLIST

    The MariaDB SHOW PROCESSLIST statement is used to see a list of active queries on that UM:

    getActiveSQLStatements

    getActiveSQLStatements is a mcsadmin command that shows which SQL statements are currently being executed on the database:

    Analysis of Individual Queries

    Query Statistics

    The calGetStats function provides statistics about resources used on the node, and network by the last run query. Example:

    The output contains information on:

    • MaxMemPct - Peak memory utilization on the , likely in support of a large (User Module) based hash join operation.

    • NumTempFiles - Report on any temporary files created in support of query operations larger than available memory, typically for unusual join operations where the smaller table join cardinality exceeds some configurable threshold.

    • TempFileSpace - Report on space used by temporary files created in support of query operations larger than available memory, typically for unusual join operations where the smaller table join cardinality exceeds some configurable threshold.

    The output is useful to determine how much physical I/O was required, how much data was cached, and how many partition blocks were eliminated through use of extent map elimination. The system maintains min / max values for each extent and uses these to help implement where clause filters to completely bypass extents where the value is outside of the min/max range. When a column is ordered (or semi-ordered) during load such as a time column this offer very large performance gains as the system can avoid scanning many extents for the column.

    Query Plan / Trace

    While the MariaDB Server's utility can be used to look at the query plan, it is somewhat less helpful for ColumnStore tables as ColumnStore does not use indexes or make use of MariaDB I/O functionality. The execution plan for a query on a ColumnStore table is made up of multiple steps. Each step in the query plan performs a set of operations that are issued from the to the set of in support of a given step in a query.

    • Full Column Scan - an operation that scans each entry in a column using all available threads on the Performance Modules. Speed of operation is generally related to the size of the data type and the total number of rows in the column. The closest analogy for a traditional system is an index scan operation.

    • Partitioned Column Scan - an operation that uses the Extent Map to identify that certain portions of the column do not contain any matching values for a given set of filters. The closest analogy for a traditional row-based DBMS is a partitioned index scan, or partitioned table scan operation.

    • Column lookup by row offset - once the set of matching filters have been applied and the minimal set of rows have been identified; additional blocks are requested using a calculation that determines exactly which block is required. The closest analogy for a traditional system is a lookup by rowid.

    These operations are automatically executed together in order to execute appropriate filters and column lookup by row offset.

    Viewing the ColumnStore Query Plan

    In MariaDB ColumnStore there is a set of SQL tracing stored functions provided to see the distributed query execution plan between the nodes.

    The basic steps to using these SQL tracing stored functions are:

    1. Start the trace for the particular session.

    2. Execute the SQL statement in question.

    3. Review the trace collected for the statement. As an example, the following session starts a trace, issues a query against a 6 million row fact table and 300,000 row dimension table, and then reviews the output from the trace:

    The columns headings in the output are as follows:

    • Desc – Operation being executed. Possible values:

      • BPS - Batch Primitive Step: scanning or projecting the column blocks.

      • CES - Cross Engine Step: Performing Cross engine join

      • DSS - Dictionary Structure Step: a dictionary scan for a particular variable length string value.

    Note: The time recorded is the time from PrimProc and ExeMgr. Execution time from withing mysqld is not tracked here. There could be extra processing time in mysqld due to a number of factors such as ORDER BY.

    Cache Clearing to Enable Cold Testing

    Sometimes it can be useful to clear caches to allow understanding of un-cached and cached query access. The calFlushCache() function will clear caches on all servers. This is only really useful for testing query performance:

    Viewing Extent Map Information

    It can be useful to view details about the extent map for a given column. This can be achieved using the edit item process on any ColumnStore server. Available arguments can be provided by using the -h flag. The most common use is to provide the column object id with the -o argument which will output details for the column and in this case the -t argument is provided to show min / max values as dates:

    Here it can be seen that the extent maps for the o_orderdate (object id 3032) column are well partitioned since the order table source data was sorted by the order_date. This example shows 2 separate DBRoot values as the environment was a 2-node combined deployment.

    Column object ids may be found by querying the calpontsys.syscolumn metadata table (deprecated) or information_schema.columnstore_columns table (version 1.0.6+).

    Query Statistics History

    MariaDB ColumnStore query statistics history can be retrieved for analysis. By default the query stats collection is disabled. To enable the collection of query stats, the element in the ColumnStore.XML configuration file should be set to Y (default is N).

    Cross Engine Support must also be enabled before enabling Query Statistics. See the section.

    For Querystats Cross Engine User needs INSERT Privilege on querystats table.

    Example:

    When enabled the history of query statistics across all sessions along with execution time, and those stats provided by calgetstats() is stored in a table in the infinidb_querystats schema. Only queries in the following ColumnStore syntax are available for statistics monitoring:

    • SELECT

    • INSERT

    • UPDATE

    • DELETE

    Query Statistics Table

    When QueryStats is enabled, the query statistics history is collected in the querystats table in the infinidb_querystats schema.

    The columns of this table are:

    • queryID - A unique identifier assigned to the query

    • Session ID (sessionID) - The session number that executed the statement.

    • queryType - The type of the query whether insert, update, delete, select, delete, insert select or load data infile

    • query - The text of the query

    Query Statistics Viewing

    Users can view the query statistics by selecting the rows from the query stats table in the infinidb_querystats schema. Examples listed below:

    • Example 1: List execution time, rows returned for all the select queries within the past 12 hours:

    • Example 2: List the three slowest running select queries of session 2 within the past 12 hours:

    • Example 3: List the average, min and max running time of all the INSERT SELECT queries within the past 12 hours:

    CHANGE MASTER TO
       MASTER_USER = "repl",
       MASTER_HOST = "192.0.2.1",
       MASTER_PASSWORD = "repl_passwd",
       MASTER_USE_GTID=slave_pos;
    START SLAVE;
    sudo mariadb-backup --backup \
          --user=mariabackup_user \
          --password=mariabackup_passwd \
          --target-dir=/data/backup/replica_backup
    sudo mariadb-backup --prepare \
          --target-dir=/data/backup/replica_backup
    sudo rsync -av /data/backup/replica_backup 192.0.2.3:/data/backup/
    sudo mariadb-backup --copy-back \
       --target-dir=/data/backup/replica_backup
    sudo chown -R mysql:mysql /var/lib/mysql
    sudo systemctl restart mariadb
    sudo systemctl enable mariadb
    sudo systemctl restart mariadb-columnstore
    sudo systemctl disable mariadb-columnstore
    sudo systemctl restart mariadb-columnstore-cmapi
    sudo systemctl enable mariadb-columnstore-cmapi
    cat xtrabackup_binlog_info
    mariadb-bin.000096 568 0-1-2001,1-2-5139
    sudo mariadb
    curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       --data '{"timeout":20, "node": "192.0.2.3"}' \
       | jq .
    curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/add-node \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       --data '{"timeout":20, "node": "192.0.2.3"}' \
       | jq .
    {
      "timestamp": "2020-10-28 00:39:14.672142",
      "node_id": "192.0.2.3"
    }
    curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       | jq .
    {
      "timestamp": "2020-12-15 00:40:34.353574",
      "192.0.2.1": {
        "timestamp": "2020-12-15 00:40:34.362374",
        "uptime": 11467,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [
          "1"
        ],
        "module_id": 1,
        "services": [
          {
            "name": "workernode",
            "pid": 19202
          },
          {
            "name": "controllernode",
            "pid": 19232
          },
          {
            "name": "PrimProc",
            "pid": 19254
          },
          {
            "name": "ExeMgr",
            "pid": 19292
          },
          {
            "name": "WriteEngine",
            "pid": 19316
          },
          {
            "name": "DMLProc",
            "pid": 19332
          },
          {
            "name": "DDLProc",
            "pid": 19366
          }
        ]
      },
      "192.0.2.2": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "192.0.2.3": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "num_nodes": 3
    }
    maxctrl create server \
       mcs3 \
       192.0.2.3
    maxctrl show servers
    maxctrl link monitor \
       mcs_monitor \
       mcs3
    maxctrl show monitors
    maxctrl link service \
       mcs_service \
       mcs3
    maxctrl show services
    maxctrl list servers
    SET GLOBAL gtid_slave_pos='0-1-2001,1-2-5139';
    mcs install_es --token <ES_API_TOKEN> --version latest
    mcs install_es --token <ES_API_TOKEN> --version <ES_VERSION>
    mcs install_es --token <ES_API_TOKEN> --version <ES_VERSION> --ignore-mismatch
    mcs install_es --token <ES_API_TOKEN> --version 10.6.15-10
    install.packages("odbc")
    install.packages("RMariaDB")
    library(RMariaDB)
    
    con <- dbConnect(
      drv = RMariaDB::MariaDB(), 
      username = NULL,
      password = NULL, 
      host = NULL, 
      port = 3306
    )
    install.packages("h2o")
    connection_url <- "jdbc:mariadb://172.16.2.178:3306/ingestSQL?&useSSL=false"
    username <- "root"
    password <- "abc123"
    
    # Whole Table:
    table <- "citibike20k"
    my_citibike_data <- h2o.import_sql_table(connection_url, table, username, password)
    
    # SELECT Query:
    select_query <-  "SELECT  bikeid  FROM citibike20k"
    my_citibike_data <- h2o.import_sql_select(connection_url, select_query, username, password)
    java -cp <path_to_h2o_jar>:<path_to_jdbc_driver_jar> water.H2OApp
    $ mariadb --host 192.168.0.100 --port 5001 \
              --user db_user --password \
              --default-character-set=utf8
    CREATE DATABASE inventory;
    CREATE TABLE inventory.products (
       product_name VARCHAR(11) NOT NULL DEFAULT '',
       supplier VARCHAR(128) NOT NULL DEFAULT '',
       quantity VARCHAR(128) NOT NULL DEFAULT '',
       unit_cost VARCHAR(128) NOT NULL DEFAULT ''
    ) ENGINE=Columnstore DEFAULT CHARSET=utf8;
    $ sudo cpimport \
       inventory products \
       inventory-products.txt
    $ sudo cpimport -I1 \
       inventory products \
       inventory-products.bin
    struct Date
    {
      unsigned spare : 6;
      unsigned day : 6;
      unsigned month : 4;
      unsigned year : 16
    };
    struct DateTime
    {
      unsigned msecond : 20;
      unsigned second : 6;
      unsigned minute : 6;
      unsigned hour : 6;
      unsigned day : 6;
      unsigned month : 4;
      unsigned year : 16
    };
    $ mariadb --quick \
       --skip-column-names \
       --execute="SELECT * FROM inventory.products" \
       | cpimport -s '\t' inventory products
    $ aws s3 cp --quiet s3://columnstore-test/inventory-products.csv - \
       | cpimport -s ',' inventory products
    $ sudo cpimport -s '\t' \
       inventory products \
       inventory-products.tsv
    $ sudo cpimport -s ',' \
       inventory products \
       inventory-products.csv
    $ sudo cpimport -s '\t' -E '"' \
       inventory products \
       inventory-products.tsv
    $ sudo cpimport -s ',' -E "'" \
       inventory products \
       inventory-products.csv
    sudo yum update "MariaDB-*" "MariaDB-columnstore-engine" "MariaDB-columnstore-cmapi"
    apt --version
    apt 2.0.9 (amd64)
    sudo apt install --only-upgrade "mariadb*"
    sudo apt install --only-upgrade '?upgradable ?name(mariadb.*)'
    MariaDB [test]> SHOW PROCESSLIST;
    +----+------+-----------+-------+---------+------+-------+--------------+
    | Id | User | Host | db | Command | Time | State | Info |
    +----+------+-----------+-------+---------+------+-------+--------------+
    | 73 | root | localhost | ssb10 | Query | 0 | NULL | show processlist
    +----+------+-----------+-------+---------+------+-------+--------------+
    1 row in set (0.01 sec)
    PhyI/O - Number of 8k blocks read from disk, SSD, or other persistent storage.
  • CacheI/O - Approximate number of 8k blocks processed in memory, adjusted down by the number of discrete PhyI/O calls required.

  • BlocksTouched - Approximate number of 8k blocks processed in memory.

  • PartitionBlocksEliminated - The number of block touches eliminated via the Extent Map elimination behavior.

  • MsgBytesIn, MsgByteOut - Message size in MB sent between nodes in support of the query.

  • HJS - Hash Join Step: Performing a hash join between 2 tables

  • HVS - Having Step: Performing the having clause on the result set

  • SQS - Sub Query Step: Performing a sub query

  • TAS - Tuple Aggregation step: the process of receiving intermediate aggregation results from other nodes.

  • TNS - Tuple Annexation Step: Query result finishing, e.g. filling in constant columns, limit, order by and final distinct cases.

  • TUS = Tuple Union step: Performing a SQL union of 2 sub queries.

  • TCS = Tuple Constant Step: Process Constant Value Columns

  • WFS = Window Function Step: Performing a window function.

  • Mode – Where the operation was performed within the PrimProc library

  • Table – Table for which columns may be scanned/projected.

  • TableOID – ObjectID for the table being scanned.

  • ReferencedOIDs – ObjectIDs for the columns required by the query.

  • PIO – Physical I/O (reads from storage) executed for the query.

  • LIO – Logical I/O executed for the query, also known as Blocks Touched.

  • PBE – Partition Blocks Eliminated identifies blocks eliminated by Extent Map min/max.

  • Elapsed – Elapsed time for a give step.

  • Rows – Intermediate rows returned.

  • INSERT ... SELECT

  • LOAD DATA INFILE

  • Host (host) - The host that executed the statement.

  • User ID (user) - The user that executed the statement.

  • Priority (priority) The priority the user has for this statement.

  • Query Execution Times (startTime, endTime) Calculated as end time – start time.

    • start time - the time that the query gets to ExeMgr, DDLProc, or DMLProc

    • end time - the time that the last result packet exits ExeMgr, DDLProc or DMLProc

  • Rows returned or affected (rows) -The number of rows returned for SELECT queries, or the number of rows affected by DML queries. Not valid for DDL and other query types.

  • Error Number (errNo) - The IDB error number if this query failed, 0 if it succeeded.

  • Physical I/O (phyIO) - The number of blocks that the query accessed from the disk, including the pre-fetch blocks. This statistic is only valid for the queries that are processed by ExeMgr, i.e. SELECT, DML with WHERE clause, and INSERT SELECT.

  • Cache I/O (cacheIO) - The number of blocks that the query accessed from the cache. This statistic is only valid for queries that are processed by ExeMgr, i.e. SELECT, DML with WHERE clause, and INSERT SELECT.

  • Blocks Touched (blocksTouched) - The total number of blocks that the query accessed physically and from the cache. This should be equal or less than the sum of physical I/O and cache I/O. This statistic is only valid for queries that are processed by ExeMgr, i.e. SELECT, DML with WHERE clause, and INSERT SELECT.

  • Partition Blocks Eliminated (CPBlocksSkipped) - The number of blocks being eliminated by the extent map casual partition. This statistic is only valid for queries that are processed by ExeMgr, i.e. SELECT, DML with WHERE clause, and INSERT SELECT.

  • Messages to other nodes (msgOutUM) - The number of messages in bytes that ExeMgr sends to the PrimProc. If a message needs to be distributed to all the PMs, the sum of all the distributed messages will be counted. Only valid for queries that are processed by ExeMgr, i.e. SELECT, DML with WHERE clause, and INSERT SELECT.

  • Messages from other nodes (msgInUM) - The number of messages in bytes that PrimProc sends to the ExeMgr. Only valid for queries that are processed by ExeMgr, i.e. SELECT, DML with where clause, and INSERT SELECT.

  • Memory Utilization (maxMemPct) - This field shows memory utilization in support of any join, group by, aggregation, distinct, or other operation.

  • Blocks Changed (blocksChanged) - Total number of blocks that queries physically changed on disk. This is only for delete/update statements.

  • Temp Files (numTempFiles) - This field shows any temporary file utilization in support of any join, group by, aggregation, distinct, or other operation.

  • Temp File Space (tempFileSpace) - This shows the size of any temporary file utilization in support of any join, group by, aggregation, distinct, or other operation.

  • User Module
    User Module
    Performance Modules
    Cross Engine Configuration
    mcsadmin> getActiveSQLStatements
    getactivesqlstatements Wed Oct 7 08:38:32 2015
    Get List of Active SQL Statements
    =================================
    Start Time    Time (hh:mm:ss) Session ID SQL Statement
    ---------------- ---------------- -------------------- ------------------------------------------------------------
    Oct 7 08:38:30    00:00:03       73 select c_name,sum(lo_revenue) from customer, lineorder where lo_custkey = c_custkey and c_custkey = 6 group by c_name
    MariaDB [test]> SELECT count(*) FROM wide2;
    +----------+                                       
    | count(*) |
    +----------+
    |  5000000 |
    +----------+
    1 row in set (0.22 sec)
    
    MariaDB [test]> SELECT calGetStats();
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | calGetStats()                                                                                                                                                                                     |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Query Stats: MaxMemPct-0; NumTempFiles-0; TempFileSpace-0B; ApproxPhyI/O-1931; CacheI/O-2446; BlocksTouched-2443; PartitionBlocksEliminated-0; MsgBytesIn-73KB; MsgBytesOut-1KB; Mode-Distributed |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.01 sec)
    MariaDB [test]> SELECT calSetTrace(1);
    +----------------+
    | calSetTrace(1) |
    +----------------+
    |              0 |
    +----------------+
    1 row in set (0.00 sec)
    
    MariaDB [test]> SELECT c_name, sum(o_totalprice)
        -> FROM customer, orders
        -> WHERE o_custkey = c_custkey
        -> AND c_custkey = 5
        -> GROUP BY c_name;
    +--------------------+-------------------+
    | c_name             | sum(o_totalprice) |
    +--------------------+-------------------+
    | Customer#000000005 |         684965.28 |
    +--------------------+-------------------+
    1 row in set, 1 warning (0.34 sec)
    
    MariaDB [test]> SELECT calGetTrace();
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------                                                                                                                    --------------------------------------------------------------------------------------------------------------------------------------------------------------------                                                                                                                    ----------------------------------------------------------------------------------------------------------+
    | calGetTrace()                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------                                                                                                                    --------------------------------------------------------------------------------------------------------------------------------------------------------------------                                                                                                                    ----------------------------------------------------------------------------------------------------------+
    |
    Desc Mode Table           TableOID ReferencedColumns        PIO LIO PBE Elapsed Rows
    BPS  PM   customer        3024     (c_custkey,c_name)       0   43  36  0.006   1
    BPS  PM   orders          3038     (o_custkey,o_totalprice) 0   766 0   0.032   3
    HJS  PM   orders-customer 3038     -                        -   -   -   -----   -
    TAS  UM   -               -        -                        -   -   -   0.021   1
     |
    +-------------------------------------------------------------------------------------------------------------------------------------------------------------------                                                                                                                    --------------------------------------------------------------------------------------------------------------------------------------------------------------------                                                                                                                    ----------------------------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)
    MariaDB [test]> SELECT calFlushCache();
    editem -o 3032 -t
    Col OID = 3032, NumExtents = 10, width = 4
    428032 - 432127 (4096) min: 1992-01-01, max: 1993-06-21, seqNum: 1, state: valid, fbo: 0, DBRoot: 1, part#: 0, seg#: 0, HWM: 0; status: avail
    502784 - 506879 (4096) min: 1992-01-01, max: 1993-06-22, seqNum: 1, state: valid, fbo: 0, DBRoot: 2, part#: 0, seg#: 1, HWM: 0; status: unavail
    708608 - 712703 (4096) min: 1993-06-21, max: 1994-12-11, seqNum: 1, state: valid, fbo: 0, DBRoot: 1, part#: 0, seg#: 2, HWM: 0; status: unavail
    766976 - 771071 (4096) min: 1993-06-22, max: 1994-12-12, seqNum: 1, state: valid, fbo: 0, DBRoot: 2, part#: 0, seg#: 3, HWM: 0; status: unavail
    989184 - 993279 (4096) min: 1994-12-11, max: 1996-06-01, seqNum: 1, state: valid, fbo: 4096, DBRoot: 1, part#: 0, seg#: 0, HWM: 8191; status: avail
    1039360 - 1043455 (4096) min: 1994-12-12, max: 1996-06-02, seqNum: 1, state: valid, fbo: 4096, DBRoot: 2, part#: 0, seg#: 1, HWM: 8191; status: avail
    1220608 - 1224703 (4096) min: 1996-06-01, max: 1997-11-22, seqNum: 1, state: valid, fbo: 4096, DBRoot: 1, part#: 0, seg#: 2, HWM: 8191; status: avail
    1270784 - 1274879 (4096) min: 1996-06-02, max: 1997-11-22, seqNum: 1, state: valid, fbo: 4096, DBRoot: 2, part#: 0, seg#: 3, HWM: 8191; status: avail
    1452032 - 1456127 (4096) min: 1997-11-22, max: 1998-08-02, seqNum: 1, state: valid, fbo: 0, DBRoot: 1, part#: 1, seg#: 0, HWM: 1930; status: avail
    1510400 - 1514495 (4096) min: 1997-11-22, max: 1998-08-02, seqNum: 1, state: valid, fbo: 0, DBRoot: 2, part#: 1, seg#: 1, HWM: 1930; status: avail
    <QueryStats>
    <Enabled>Y</Enabled>
    </QueryStats>
    grant INSERT on infinidb_querystats.querystats to 'cross_engine'@'127.0.0.1';
    grant INSERT on infinidb_querystats.querystats to 'cross_engine'@'localhost';
    MariaDB [infinidb_querystats]> select queryid, query, endtime-starttime, rows from querystats 
    where starttime >= now() - interval 12 hour and querytype = 'SELECT';
    MariaDB [infinidb_querystats]> select a.* from (select endtime-starttime execTime, query from queryStats 
    where sessionid = 2 and querytype = 'SELECT' and starttime >= now()-interval 12 hour
    order by 1 limit 3) a;
    MariaDB [infinidb_querystats]> select min(endtime-starttime), max(endtime-starttime), avg(endtime-starttime) from querystats 
    where querytype='INSERT SELECT' and starttime >= now() - interval 12 hour;

    ColumnStore System Variables

    Variables

    columnstore_diskjoin_force_run

    • Controls whether disk joins are forced to run even if they are not estimated to be the most efficient execution plan. This can be useful for debugging purposes or for situations where the optimizer's estimates are not accurate.

    • Scope: global, session

    • Data type:

    • Default value: OFF

    • Range: ON, OFF

    • Introduced in: MariaDB Enterprise Server 10.6

    columnstore_diskjoin_max_partition_tree_depth

    • Sets the maximum depth of the partition tree that can be used for disk joins. A higher value allows for more complex joins, but may also increase the memory usage and execution time.

    • Scope: global, session

    • Data type:

    • Default value: 10

    columnstore_max_allowed_in_values

    • Sets the maximum number of values that can be used in an IN predicate on a Columnstore table. This limit helps to prevent performance issues caused by queries with a large number of IN values.

    • Scope: global, session

    • Data type:

    • Default value: 10000

    columnstore_max_pm_join_result_count

    • Sets the maximum number of rows that can be returned by a parallel merge join on a Columnstore table. This limit helps to prevent memory issues caused by joins that return a large number of rows.

    • Scope: global, session

    • Data type:

    • Default value: 1000000

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 2

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 8

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 100

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 0

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 0

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: OFF

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 7

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 17

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 0

    infinidb_ordered_only

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: OFF

    infinidb_string_scan_threshold

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 10

    infinidb_stringtable_threshold

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 20

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 0

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: OFF

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: ON

    infinidb_varbin_always_hex

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: ON

    • Command line: Yes

    • Scope: global, session

    • Data type:

    • Default value: 1

    Compression Mode

    MariaDB ColumnStore has the ability to compress data. This is controlled through a compression mode, which can be set as a default for the instance or set at the session level.

    To set the compression mode at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance:

    where n is:

    1. compression is turned off. Any subsequent table create statements run will have compression turned off for that table unless any statement overrides have been performed. Any alter statements run to add a column will have compression turned off for that column unless any statement override has been performed.

    2. compression is turned on. Any subsequent table create statements run will have compression turned on for that table unless any statement overrides have been performed. Any alter statements run to add a column will have compression turned on for that column unless any statement override has been performed. ColumnStore uses snappy compression in this mode.

    ColumnStore Decimal-to-Double Math

    MariaDB ColumnStore has the ability to change intermediate decimal mathematical results from decimal type to double. The decimal type has approximately 17-18 digits of precision, but a smaller maximum range. Whereas the double type has approximately 15-16 digits of precision, but a much larger maximum range.

    In typical mathematical and scientific applications, the ability to avoid overflow in intermediate results with double math is likely more beneficial than the additional two digits of precisions. In banking applications, however, it may be more appropriate to leave in the default decimal setting to ensure accuracy to the least significant digit.

    Enable/Disable Decimal-to-Double Math

    The infinidb\_double\_for\_decimal\_math variable is used to control the data type for intermediate decimal results. This decimal for double math may be set as a default for the instance, set at the session level, or at the statement level by toggling this variable on and off.

    To enable/disable the use of the decimal to double math at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance:

    where n is:

    • off (disabled, default)

    • on (enabled)

    ColumnStore Decimal Scale

    ColumnStore has the ability to support varied internal precision on decimal calculations. infinidb_decimal_scale is used internally by the ColumnStore engine to control how many significant digits to the right of the decimal point are carried through in suboperations on calculated columns. If, while running a query, you receive the message ‘aggregate overflow’, try reducing infinidb_decimal_scale and running the query again.

    Note that, as you decrease infinidb_decimal_scale, you may see reduced accuracy in the least significant digit(s) of a returned calculated column. infinidb_use_decimal_scale is used internally by the ColumnStore engine to turn the use of this internal precision on and off. These two system variables can be set as a default for the instance or at session level.

    Enable/Disable Decimal Scale

    To enable/disable the use of the decimal scale at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance:

    where n is off (disabled) or on (enabled).

    Set Decimal Scale Level

    To set the decimal scale at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.

    where n is the amount of precision desired for calculations.

    Disk-Based Joins

    Introduction

    Joins are performed in memory. When a join operation exceeds the memory allocated for query joins, the query is aborted with an error code IDB-2001.

    Disk-based joins enable such queries to use disk for intermediate join data in case when the memory needed for join exceeds the memory limit. Although slower in performance as compared to a fully in-memory join, and bound by the temporary space on disk, it does allow such queries to complete.

    Disk-based joins does not include aggregation and DML joins.

    The following variables in the HashJoin element in the Columnstore.xml configuration file relate to disk-based joins. Columnstore.xml resides in /usr/local/mariadb/columnstore/etc/.

    • AllowDiskBasedJoin – Option to use disk-based joins. Valid values are Y (enabled) or N (disabled). Default is disabled.

    • TempFileCompression – Option to use compression for disk join files. Valid values are Y (use compressed files) or N (use non-compressed files).

    • TempFilePath – The directory path used for the disk joins. By default, this path is the tmp directory for your installation (i.e., /usr/local/mariadb/columnstore/tmp). Files (named infinidb-join-data*) in this directory will be created and cleaned on an as needed basis. The entire directory is removed and recreated by ExeMgr at startup.)

    When using disk-based joins, it is strongly recommended that the TempFilePath reside on its own partition as the partition may fill up as queries are executed.

    Per user join memory limit

    In addition to the system wide flags, at SQL global and session level, the following system variables exists for managing per user memory limit for joins.

    • infinidb_um_mem_limit - A value for memory limit in MB per user. When this limit is exceeded by a join, it will switch to a disk-based join. By default, the limit is not set (value of 0).

    For modification at the global level: In my.cnf file (typically /usr/local/mariadb/columnstore/mysql):

    where value is the value in MB for in memory limitation per user.

    For modification at the session level, before issuing your join query from the SQL client, set the session variable as follows.

    Batch Insert Mode for INSERT Statements

    Introduction

    MariaDB ColumnStore has the ability to utilize the cpimport fast data import tool for non-transactional and SQL statements. Using this method results in a significant increase in performance in loading data through these two SQL statements. This optimization is independent of the storage engine used for the tables in the select statement.

    Enable/Disable Using cpimport for Batch Insert

    The infinidb_use_import_for_batchinsert variable is used to control if cpimport is used for these statements. This variable may be set as a default for the instance, set at the session level, or at the statement level by toggling this variable on and off.

    To enable/disable the use of the use cpimport for batch insert at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.

    where n is:

    • 0 (disabled)

    • 1 (enabled)

    Changing Default Delimiter for INSERT SELECT

    • The infinidb_import_for_batchinsert_delimiter variable is used internally by MariaDB ColumnStore on a non-transactional INSERT INTO SELECT FROM statement as the default delimiter passed to the cpimport tool. With a default value ascii 7, there should be no need to change this value unless your data contains ascii 7 values.

    To change this variable value at the at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.

    where ascii_value is an ASCII value representation of the delimiter desired.

    Note that this setting may cause issues with multi byte character set data. It is recommended to utilize UTF8 files directly with cpimport.

    Version Buffer File Management

    If the following error is received, most likely with a transaction LOAD DATA INFILE or INSERT INTO SELECT, it is recommended to break up the load into multiple smaller chunks, increase the VersionBufferFileSize setting, consider a nontransactional LOAD DATA INFILE, or use cpimport.

    The VersionBufferFileSize setting is updated in the ColumnStore.xml typically located under /usr/local/mariadb/columnstore/etc. This dictates the size of the version buffer file on disk which provides DML transactional consistency. The default value is '1GB' which reserves up to a 1 Gigabyte file size. Modify this on the primary node and restart the system if you require a larger value.

    Local PrimProc Query Mode

    MariaDB ColumnStore has the ability to query data from just a single node instead of the whole cluster. In order to accomplish this, the infinidb_local_query variable in the my.cnf configuration file is used and maybe set as a default at system wide or set at the session level.

    Enable Local PrimProc Query During Installation

    Local PrimProc query can be enabled system wide during the install process when running the install script postConfigure. Answer 'y' to this prompt during the install process:

    Enable Local PrimProc Query System-Wide

    To enable the use of the local PrimProc query at the instance level, specify infinidb_local_query =1 (enabled) in the my.cnf configuration file at /usr/local/mariadb/columnstore/mysql. The default is 0 (disabled).

    Enable/Disable Local PrimProc Query at the Session Level

    To enable/disable the use of the local PrimProc query at the session level, the following statement is used. Once the session has ended, any subsequent session will return to the default for the instance:

    where n is:

    • 0 (disabled)

    • 1 (enabled)

    At the session level, this variable applies only to executing a query on an individual . The PrimProc must be set up with the local query option during installation.

    Local PrimProc Query Examples

    Example 1 - SELECT from a single table on local PrimProc to import back on local PrimProc:

    With the infinidb_local_query variable set to 1 (default with local PrimProc Query):

    Example 2 - SELECT involving a join between a fact table on the PrimProc node and dimension table across all the nodes to import back on local PrimProc:

    With the infinidb_local_query variable set to 0 (default with local PrimProc Query):

    Create a script (i.e., extract_query_script.sql in our example) similar to the following:

    The infinidb_local_query is set to 0 to allow query across all PrimProc nodes.

    The query is structured so PrimProc gets the fact table data locally from the PrimProc node (as indicated by the use of the function), while the dimension table data is extracted from all the PrimProc nodes.

    Then you can execute the script to pipe it directly into cpimport:

    Operating Mode

    ColumnStore has the ability to support full MariaDB query syntax through an operating mode. This operating mode may be set as a default for the instance or set at the session level. To set the operating mode at the session level, the following command is used. Once the session has ended, any subsequent session will return to the default for the instance.

    where n is:

    1. a generic, highly compatible row-by-row processing mode. Some WHERE clause components can be processed by ColumnStore, but joins are processed entirely by MySQL using a nested loop join mechanism.

    2. (the default) query syntax is evaluated by ColumnStore for compatibility with distributed execution and incompatible queries are rejected. Queries executed in this mode take advantage of distributed execution and typically result in higher performance.

    3. auto-switch mode: ColumnStore will attempt to process the query internally, if it cannot, it will automatically switch the query to run in row-by-row mode.

    Introduced in: MariaDB Enterprise Server 10.6

    Introduced in: MariaDB Enterprise Server 10.6

    Introduced in: MariaDB Enterprise Server 10.6

    Range: 0,2
    Range: OFF, ON
    Range: 0,1
    Range: OFF, ON
    Range: OFF, ON
    Range: OFF, ON
    Range: OFF, ON
    Range: 0,1,2
    infinidb_compression_type
    infinidb_decimal_scale
    infinidb_diskjoin_bucketsize
    infinidb_diskjoin_largesidelimit
    infinidb_diskjoin_smallsidelimit
    infinidb_double_for_decimal_math
    infinidb_import_for_batchinsert_delimiter
    infinidb_import_for_batchinsert_enclosed_by
    infinidb_local_query
    infinidb_um_mem_limit
    infinidb_use_decimal_scale
    infinidb_use_import_for_batchinsert
    infinidb_vtable_mode
    PrimProc
    idbLocalPm()

    ColumnStore Storage Architecture

    Overview

    MariaDB Enterprise ColumnStore's storage architecture is designed to provide great performance for analytical queries.

    Columnar Storage Engine

    MariaDB Enterprise ColumnStore is a columnar storage engine for . MariaDB Enterprise ColumnStore enables ES to perform analytical workloads, including online analytical processing (OLAP), data warehousing, decision support systems (DSS), and hybrid transactional-analytical processing (HTAP) workloads.

    Most traditional relational databases use row-based storage engines. In row-based storage engines, all columns for a table are stored contiguously. Row-based storage engines perform very well for transactional workloads but are less performant for analytical workloads.

    Columnar storage engines store each column separately. Columnar storage engines perform very well for analytical workloads. Analytical workloads are characterized by ad hoc queries on very large data sets by relatively few users.

    MariaDB Enterprise ColumnStore automatically partitions each column into extents, which helps improve query performance without using indexes.

    OLAP Workloads

    MariaDB Enterprise ColumnStore enables MariaDB Enterprise Server to perform analytical or online analytical processing (OLAP) workloads.

    OLAP workloads are generally characterized by ad hoc queries on very large data sets. Some other typical characteristics are:

    • Each query typically reads a subset of columns in the table

    • Most activity typically consists of read-only queries that perform aggregations, window functions, and various calculations

    • Analytical applications typically require only a few concurrent queries

    • Analytical applications typically require the scalability of large, complex queries

    OLAP workloads are typically required for:

    • Business intelligence (BI)

    • Health informatics

    • Historical data mining

    Row-based storage engines have a disadvantage for OLAP workloads. Indexes are not usually very useful for OLAP workloads, because the large size of the data set and the ad hoc nature of the queries preclude the use of indexes to optimize queries.

    Columnar storage engines are much better suited for OLAP workloads. MariaDB Enterprise ColumnStore is a columnar storage engine that is designed for OLAP workloads:

    • When a query reads a subset of columns in the table, Enterprise ColumnStore can reduce I/O by reading those columns and ignoring all others, because each column is stored separately

    • When most activity consists of read-only queries that perform aggregations, window functions, and various calculations, Enterprise ColumnStore is able to efficiently execute those queries using extent elimination, distributed query execution, and massively parallel processing (MPP) techniques

    • When only a few concurrent queries are required, Enterprise ColumnStore is able to maximize the use of system resources by using multiple threads and multiple nodes to perform work for each query

    OLTP Workloads

    MariaDB Enterprise Server has had excellent performance for transactional or online transactional processing (OLTP) workloads since the beginning.

    OLTP workloads are generally characterized by a fixed set of queries using a relatively small data set. Some other typical characteristics are:

    • Each query typically reads and/or writes many columns in the table.

    • Most activity typically consists of small transactions that only read and/or write a small number of rows.

    • Transactional applications typically require many concurrent transactions.

    • Transactional applications typically require a fast response time and low latency.

    OLTP workloads are typically required for:

    • Financial transactions performed by financial institutions and e-commerce sites.

    • Store inventory changes performed by brick-and-mortar stores and e-commerce sites.

    • Account metadata changes performed by many sites that stores personal data.

    Row-based storage engines have several advantages for OLTP workloads:

    • When a query reads and/or writes many columns in the table, row-based storage engines can find all columns on a single page, so the I/O costs of the operation are low.

    • When a transaction reads/writes a small number of rows, row-based storage engines can use an index to find the page for each row without a full table scan.

    • When many concurrent transactions are operating, row-based storage engines can implement transactional isolation by storing multiple versions of changed rows.

    • When a fast response time and low latency are required, row-based storage engines can use indexes to optimize the most common queries.

    is ES's default storage engine, and it is a highly performant row-based storage engine.

    Hybrid Workloads

    MariaDB Enterprise ColumnStore enables MariaDB Enterprise Server to function as a single-stack solution for workloads.

    Hybrid workloads are characterized by a mix of transactional and analytical queries. Hybrid workloads are also known as "Smart Transactions", "Augmented Transactions" "Translytical", or "Hybrid Operational-Analytical Processing (HOAP)".

    Hybrid workloads are typically required for applications that require real-time analytics that lead to immediate action:

    • Financial institutions use transactional queries to handle financial transactions and analytical queries to analyze the transactions for business intelligence.

    • Insurance companies use transactional queries to accept/process claims and analytical queries to analyze those claims for business opportunities or risks.

    • Health providers use transactional queries to track electronic health records (EHR) and analytical queries to analyze the EHRs to discover health trends or prevent adverse drug interactions.

    MariaDB Enterprise Server provides multiple components to perform hybrid workloads:

    • For analytical queries, the Enterprise ColumnStore storage engine can be used.

    • For transactional queries, row-based storage engines, such as InnoDB, can be used.

    • For queries that reference both analytical and transactional data, ES's cross-engine join functionality can be used to join Enterprise ColumnStore tables with InnoDB tables.

    • is a high-performance database proxy that can dynamically route analytical queries to Enterprise ColumnStore and transactional queries to the transactional storage engine.

    Storage Options

    MariaDB Enterprise ColumnStore supports multiple storage types:

    Storage Type
    Description

    Deployment with S3-Compatible Storage

    Deployment with Shared Storage

    S3-Compatible Object Storage

    MariaDB Enterprise ColumnStore supports S3-compatible object storage.

    S3-compatible object storage is optional, but highly recommended. If S3-compatible object storage is used, Enterprise ColumnStore requires the to use (such as NFS) for high availability.

    S3-compatible object storage is:

    • Compatible: Many object storage services are compatible with the Amazon S3 API.

    • Economical: S3-compatible object storage is often very low cost.

    • Flexible: S3-compatible object storage is available for both cloud and on-premises deployments.

    • Limitless: S3-compatible object storage is often virtually limitless.

    Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.

    If you have any questions about using specific S3-compatible object storage with MariaDB Enterprise ColumnStore, contact us.

    S3 API

    MariaDB Enterprise ColumnStore can use any object store that is compatible with the Amazon S3 API.

    Many object storage services are compatible with the Amazon S3 API, and compatible object storage services are available for cloud deployments and on-premises deployments, so vendor lock-in is not a concern.

    Storage Manager

    MariaDB Enterprise ColumnStore's Storage Manager enables remote S3-compatible object storage to be efficiently used. The Storage Manager uses a persistent local disk cache for read/write operations, so that network latency has minimal performance impact on Enterprise ColumnStore. In some cases, it will even perform better than local disk operations.

    Enterprise ColumnStore only uses the Storage Manager when S3-compatible storage is used for data.

    Storage Manager is configured using .

    Storage Manager Directory

    MariaDB Enterprise ColumnStore's Storage Manager directory is at the following path by default:

    /var/lib/columnstore/storagemanager

    To enable high availability when S3-compatible object storage is used, the Storage Manager directory should use and be mounted on every ColumnStore node.

    Configure the S3 Storage Manager

    When you want to use S3-compatible storage for Enterprise ColumnStore, you must configure Enterprise ColumnStore's S3 Storage Manager to use S3-compatible storage.

    To configure Enterprise ColumnStore to use S3-compatible storage, edit /etc/columnstore/storagemanager.cnf:

    The S3-compatible object storage options are configured under [S3]:

    • The bucket option must be set to the name of the bucket.

    • The endpoint option must be set to the endpoint for the S3-compatible object storage.

    • The aws_access_key_id and aws_secret_access_key options must be set to the access key ID and secret access key for the S3-compatible object storage.

    The local cache options are configured under [Cache]:

    • The cache_size option is set to 2 GB by default.

    • The path option is set to /var/lib/columnstore/storagemanager/cache by default.

    Ensure that the specified path has sufficient storage space for the specified cache size.

    Shared Local Storage

    MariaDB Enterprise ColumnStore can use shared local storage.

    Shared local storage is required for high availability. The specific requirements depend on whether Enterprise ColumnStore is configured to use :

    When S3-compatible object storage is used, Enterprise ColumnStore requires the to use shared local storage for high availability.

    When S3-compatible object storage is not used, Enterprise ColumnStore requires the to use shared local storage for high availability.

    The most common shared local storage options for on-premises and cloud deployments are:

    • NFS (Network File System)

    • GlusterFS

    The most common shared local storage options for AWS (Amazon Web Services) deployments are:

    • EBS (Elastic Block Store) Multi-Attach

    • EFS (Elastic File System)

    The most common shared local storage option for GCP (Google Cloud Platform) deployments is:

    • Filestore

    Shared Local Storage Options

    The most common options for shared local storage are:

    Shared Local Storage
    Description

    Directories Requiring Shared Local Storage for HA

    Multi-node MariaDB Enterprise ColumnStore requires some directories to use shared local storage for high availability. The specific requirements depend on if MariaDB Enterprise ColumnStore is configured to use :

    Using S3-Compatible Object Storage?
    Directories to use Shared Local Storage

    Recommended Storage Options

    For best results, MariaDB Corporation would recommend the following storage options:

    Environment
    Object Storage For Data
    Shared Local Storage For Storage Manager

    Storage Format

    MariaDB Enterprise ColumnStore's storage format is optimized for analytical queries.

    DB Root Directories

    MariaDB Enterprise ColumnStore stores data in DB Root directories when S3-compatible object storage is not configured.

    In a multi-node Enterprise ColumnStore, each node has its own DB Root directory.

    The DB Root directories are at the following path by default:

    • /var/lib/columnstore/dataN

    The N in dataN represents a range of integers that starts at 1 and stops at the number of nodes in the deployment. For example, with a 3-node Enterprise ColumnStore deployment, this would refer to the following directories:

    • /var/lib/columnstore/data1

    • /var/lib/columnstore/data2

    • /var/lib/columnstore/data3

    To enable high availability for the DB Root directories, each directory should be mounted on every ColumnStore node using .

    Extents

    Each column in a table is stored in units called extents.

    By default, each extent contains the column values for 8 million rows. The physical size of each extent can range from 8 MB to 64 MB. When an extent reaches the maximum number of column values, Enterprise ColumnStore creates a new extent.

    Each extent is stored in 8 KB blocks, and each block has a logical block identifier (LBID).

    If a string column is longer than 8 characters, the value is stored in a separate dictionary file, and a pointer to the value is stored in the extent.

    Segment Files

    A segment file is used to store Enterprise ColumnStore data within a DB Root directory.

    A segment file always contains two extents. When a segment file reaches its maximum size, Enterprise ColumnStore creates a new segment file.

    The relevant configuration options are:

    Option
    Description

    For example, to configure Enterprise ColumnStore to store more extents in each segment file using the mcsSetConfig utility:

    Column Partitions

    Enterprise ColumnStore automatically groups a column's segment files into column partitions.

    On disk, each column partition is represented by a directory in the DB Root. The directory contains the segment files for the column partition.

    By default, a column partition can contain four segment files, but you can configure Enterprise ColumnStore to store more segment files in each column partition. When a column partition reaches the maximum number of segment files, Enterprise ColumnStore creates a new column partition.

    The relevant configuration options are:

    Option
    Description

    For example, to configure Enterprise ColumnStore to store more segment files in each column partition using the mcsSetConfig utility:

    Extent Map

    Enterprise ColumnStore maintains an Extent Map to determine which values are located in each extent.

    The Extent Map identifies each extent using its logical block identifier (LBID) values, and it maintains the minimum and maximum values within each extent.

    The Extent Map is used to implement a performance optimization called .

    The primary node has a master copy of the Extent Map. When Enterprise ColumnStore is started, the primary node copies the Extent Map to the replica nodes.

    While Enterprise ColumnStore is running, each node maintains a copy of the Extent Map in its main memory, so that it can be accessed quickly without additional I/O.

    If the Extent Map gets corrupted, the mcsRebuildEM utility can rebuild the Extent Map from the contents of the database file system. The mcsRebuildEM utility is available starting in MariaDB Enterprise ColumnStore 6.2.2.

    Compression

    Enterprise ColumnStore automatically compresses all data on disk using either Snappy or LZ4 compression. See the columnstore_compression_type system variable for how to select the desired compression type.

    Since Enterprise ColumnStore stores a single column's data in each segment file, the data in each segment file tends to be very similar. Similar data usually allows for excellent compressibility. However, the specific data compression ratio will depend on a lot of factors, such as the randomness of the data and the number of distinct values.

    Enterprise ColumnStore's compression strategy is tuned to optimize the performance of I/O-bound queries, because the decompression rate is optimized to maximize read performance.

    Version Buffer

    Enterprise ColumnStore uses the version buffer to store blocks that are being modified.

    The version buffer is used for multiple tasks:

    • It is used to roll back a transaction.

    • It is used for multi-version concurrency control (MVCC). With MVCC, Enterprise ColumnStore can implement read snapshots, which allows a statement to have a consistent view of the database, even if some of the underlying rows have changed. The snapshot for a given statement is identified by the system change number (SCN).

    The version buffer is split between data structures that are in-memory and on-disk.

    The in-memory data structures are hash tables that keep track of in-flight transaction. The hash tables store the LBIDs for each block that is being modified by a transaction. The in-memory hash tables start at 4 MB, and they grow as-needed. The size of the hash tables increases as the number of modified blocks increases.

    An on-disk version buffer file is stored in each DB Root. By default, the on-disk version buffer file is 1 GB, but you can configure Enterprise ColumnStore to use a different file size. The relevant configuration options are:

    Option
    Description

    For example, to configure Enterprise ColumnStore to use a larger on-disk version buffer file using the mcsSetConfig utility:

    Extent Elimination

    Using the Extent Map, ColumnStore can perform logical range partitioning and only retrieve the blocks needed to satisfy the query. This is done through Extent Elimination, the process of eliminating Extents from the results that don't meet the given join and filter conditions of the query, which reduces the overall I/O operations.

    In Extent Elimination, ColumnStore scans the columns in join and filter conditions. It then extracts the logical horizontal partitioning information of each extent along with the minimum and maximum values for the column to further eliminate Extents. To eliminate an Extent when a column scan involves a filter, that filter is compared to the minimum and maximum values stored in each extent for the column. If the filter value is outside the Extents minimum and maximum value range, ColumnStore eliminates the Extent.

    This behavior is automatic and well suited for series, ordered, patterned and time-based data, where the data is loaded frequently and often referenced by time. Any column with clustered values is a good candidate for Extent Elimination.

    SET infinidb_compression_type = n
    SET infinidb_double_for_decimal_math = on
    SET infinidb_use_decimal_scale = on
    SET infinidb_decimal_scale = n
    [mysqld]
    ...
    infinidb_um_mem_limit = value
    SET infinidb_um_mem_limit = value
    SET infinidb_use_import_for_batchinsert = n
    SET infinidb_import_for_batchinsert_delimiter = ascii_value
    ERROR 1815 (HY000) at line 1 in file: 'ldi.sql': Internal error: CAL0006: IDB-2008: The version buffer overflowed. Increase VersionBufferFileSize or limit the rows to be processed.
    NOTE: Local Query Feature allows the ability to query data from a single Performance
          Module. Check MariaDB ColumnStore Admin Guide for additional information.
    
    Enable Local Query feature? [y,n] (n) >
    SET infinidb_local_query = n
    mcsmysql -e 'select * from source_schema.source_table;' –N | /usr/local/Calpont/bin/cpimport target_schema target_table -s '\t' –n1
    SET infinidb_local_query=0;
    SELECT fact.column1, dim.column2 
    FROM fact JOIN dim USING (KEY) 
    WHERE idbPm(fact.KEY) = idbLocalPm();
    mcsmysql source_schema -N < extract_query_script.sql | /usr/local/mariadb/columnstore/bin/cpimport target_schema target_table -s '\t' –n1
    SET infinidb_vtable_mode = n

    Analytical applications typically require efficient bulk loads of new data

    When scalability of large, complex queries is required, Enterprise ColumnStore is able to achieve horizontal and vertical scalability using distributed query execution and massively parallel processing (MPP) techniques
  • When efficient bulk loads of new data are required, Enterprise ColumnStore is able to bulk load new data without affecting existing data using automatic partitioning with the extent map

  • Transactional applications typically require ACID properties to protect data.

  • When ACID properties are required, row-based storage engines can implement consistency and durability with fewer performance trade-offs, since each row's columns are stored contiguously.

  • Resilient: S3-compatible object storage is often low maintenance and highly available, since many services use resilient cloud infrastructure.

  • Scalable: S3-compatible object storage is often highly optimized for read and write scaling.

  • Secure: S3-compatible object storage is often encrypted-at-rest.

  • To use a specific IAM role, you must uncomment and set iam_role_name, sts_region, and sts_endpoint.
  • To use the IAM role assigned to an EC2 instance, you must uncomment ec2_iam_mode=enabled.

  • To use a non-default port number, you must set port_number to the desired port.

  • S3-Compatible Object Storage

    • S3-compatible object storage is optional but recommended • Enterprise ColumnStore can use S3-compatible object storage to store data. •With multi-node Enterprise ColumnStore, the Storage Manager directory should use shared local storage for high availability.

    Shared Local Storage

    • Required for multi-node Enterprise ColumnStore with high availability. • Enterprise ColumnStore can use shared local storage to store data and metadata. •If S3-compatible storage is used for data, the shared local storage will only be used for the Storage Manager directory.

    Non-Shared Local Storage

    • Appropriate for single-node Enterprise ColumnStore. • Enterprise ColumnStore can use non-shared local storage to store data and metadata.

    EBS (Elastic Block Store) Multi-Attach

    • EBS is a high-performance block storage service for AWS (Amazon Web Services).

    • EBS multi-attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.

    • For deployments in AWS, EBS Multi-Attach is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.

    EFS (Elastic File System)

    • EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).

    • For deployments in AWS, EFS is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.

    Filestore

    • Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).

    • For deployments in GCP, Filestore is the recommended option for the Storage Manager directory, and Google Object Storage (S3-compatible) is the recommended option for data.

    NFS (Network File System)

    • NFS is a distributed file system.

    • If NFS is used, the storage should be mounted with the sync option to ensure that each node flushes its changes immediately.

    • For on-premises deployments, NFS is the recommended option for the Storage Manager directory, and any S3-compatible storage is the recommended option for data.

    GlusterFS

    • GlusterFS is a distributed file system.

    • GlusterFS supports replication and failover.

    Yes

    Storage Manager directory

    No

    DB Root directories

    AWS

    Amazon S3 storage

    EBS Multi-Attach or EFS

    GCP

    Google Object Storage (S3-compatible)

    Filestore

    On-premises

    Any S3-compatible object storage

    NFS

    ExtentsPerSegmentFile

    • Configures the maximum number of extents that can be stored in each segment file. • Default value is 2.

    FilesPerColumnPartition

    • Configures the maximum number of segment files that can be stored in each column partition.

    • Default value is 4.

    VersionBufferFileSize

    • Configures the size of the on-disk version buffer in each DB Root. • Default value is 1 GB.

    MariaDB Enterprise Server
    Storage Manager directory
    Shared Local Storage
    storagemanager.cnf
    Shared Local Storage
    Shared Local Storage
    S3-compatible object storage
    Storage Manager directory
    DB Root directories
    S3-compatible object storage
    Shared Local Storage
    Extent Elimination
    MariaDB MaxScale
    EntColumnStoreTopologyS3-Network-Diagram
    EntColStoreTopologySharedStorageNetworkDiagram
    EColumnStorePhysicalDataOrganizationColumnExtents
    SegmentFiles
    ColumnPartitions
    DataOrganizationExtentMap
    extent-elimination

    Step 8: Test MariaDB MaxScale

    Step 8: Test MariaDB MaxScale

    Overview

    This page details step 8 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step tests MariaDB MaxScale 22.08.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Step 4: Start and Configure MariaDB Enterprise Server

    Step 4: Start and Configure MariaDB Enterprise Server

    Overview

    This page details step 4 of the 9-step procedure "Deploy ColumnStore Shared Local Storage Topology".

    This step starts and configures MariaDB Enterprise Server, and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    [ObjectStorage]
    …
    service = S3
    …
    [S3]
    region = your_columnstore_bucket_region
    bucket = your_columnstore_bucket_name
    endpoint = your_s3_endpoint
    aws_access_key_id = your_s3_access_key_id
    aws_secret_access_key = your_s3_secret_key
    # iam_role_name = your_iam_role
    # sts_region = your_sts_region
    # sts_endpoint = your_sts_endpoint
    # ec2_iam_mode=enabled
    # port_number = your_port_number
    
    [Cache]
    cache_size = your_local_cache_size
    path = your_local_cache_path
    $ mcsSetConfig ExtentMap ExtentsPerSegmentFile 4
    $ mcsSetConfig ExtentMap FilesPerColumnPartition 8
    $ mcsSetConfig VersionBuffer VersionBufferFileSize 2GB
    Check Global Configuration

    Use maxctrl show maxscale command to view the global MaxScale configuration.

    This action is performed on the MaxScale node:

    Output should align to the global MaxScale configuration in the new configuration file you created.

    Check Server Configuration

    Use the maxctrl list servers and maxctrl show server commands to view the configured server objects.

    This action is performed on the MaxScale node:

    1. Obtain the full list of servers objects:

    For each server object, view the configuration:

    Output should align to the Server Object configuration you performed.

    Check Monitor Configuration

    Use the maxctrl list monitors and maxctrl show monitor commands to view the configured monitors.

    This action is performed on the MaxScale node:

    1. Obtain the full list of monitors:

    1. For each monitor, view the monitor configuration:

    Output should align to the MariaDB Monitor (mariadbmon) configuration you performed.

    Check Service Configuration

    Use the maxctrl list services and maxctrl show service commands to view the configured routing services.

    This action is performed on the MaxScale node:

    1. Obtain the full list of routing services:

    1. For each service, view the service configuration:

    Output should align to the Read Connection Router (readconnroute) or Read/Write Split Router (readwritesplit) configuration you performed.

    Test Application User

    Applications should use a dedicated user account. The user account must be created on the primary server.

    When users connect to MaxScale, MaxScale authenticates the user connection before routing it to an Enterprise Server node. Enterprise Server authenticates the connection as originating from the IP address of the MaxScale node.

    The application users must have one user account with the host IP address of the application server and a second user account with the host IP address of the MaxScale node.

    The requirement of a duplicate user account can be avoided by enabling the proxy_protocol parameter for MaxScale and the proxy_protocol_networks for Enterprise Server.

    Create a User to Connect from MaxScale

    This action is performed on the primary Enterprise ColumnStore node:

    1. Connect to the primary Enterprise ColumnStore node:

    1. Create the database user account for your MaxScale node:

    Replace 192.0.2.10 with the relevant IP address specification for your MaxScale node.

    Passwords should meet your organization's password policies.

    1. Grant the privileges required by your application to the database user account for your MaxScale node:

    The privileges shown are designed to allow the tests in the subsequent sections to work. The user account for your production application may require different privileges.

    Create a User to Connect from the Application Server

    This action is performed on the primary Enterprise ColumnStore node:

    1. Create the database user account for your application server:

    Replace 192.0.2.11 with the relevant IP address specification for your application server.

    Passwords should meet your organization's password policies.

    1. Grant the privileges required by your application to the d database user account for your application server:

    The privileges shown are designed to allow the tests in the subsequent sections to work. The user account for your production application may require different privileges.

    Test Connection with Application User

    To test the connection, use the MariaDB Client from your application server to connect to an Enterprise ColumnStore node through MaxScale.

    This action is performed on a client connected to the MaxScale node:

    Test Connection with Read Connection Router

    If you configured the Read Connection Router, confirm that MaxScale routes connections to the replica servers.

    On the MaxScale node, use the maxctrl list listeners command to view the available listeners and ports:

    1. Open multiple terminals connected to your application server, in each, use MariaDB Client to connect to the listener port for the Read Connection Router (in the example, 3308):

    Use the application user credentials you created for the --user and --password options.

    1. In each terminal, query the hostname and server_id system variable and option to identify to which you're connected:

    Different terminals should return different values since MaxScale routes the connections to different nodes.

    Since the router was configured with the slave router option, the Read Connection Router only routes connections to replica servers.

    Test Write Queries with Read/Write Split Router

    If you configured the Read/Write Split Router, confirm that MaxScale routes write queries on this router to the primary Enterprise ColumnStore node.

    on the MaxScale node, use the maxctrl list listeners command to view the available listeners and ports:

    1. Open multiple terminals connected to your application server, in each, use MariaDB Client to connect to the listener port for the Read/Write Split Router (in the example, 3307):

    Use the application user credentials you created for the --user and --password options.

    1. In one terminal, create the test table:

    1. In each terminal, issue an insert.md statement to add a row to the example table with the values of the hostname and server_id system variable and option:

    1. In one terminal, issue a SELECT statement to query the results:

    While MaxScale is handling multiple connections from different terminals, it routed all connections to the current primary Enterprise ColumnStore node, which in the example is mcs1#.

    Test Read Queries with Read/Write Split Router

    If you configured the Read/Write Split Router (readwritesplit), confirm that MaxScale routes read queries on this router to replica servers.

    1. On the MaxScale node, use the maxctrl list listeners command to view the available listeners and ports:

    1. In a terminal connected to your application server, use MariaDB Client to connect to the listener port for the Read/Write Split Router (readwritesplit) (in the example, 3307):

    Use the application user credentials you created for the --user and --password options.

    1. Query the hostname and server_id to identify which server MaxScale routed you to.

    1. Resend the query:

    Confirm that MaxScale routes the SELECT statements to different replica servers.

    For more information on different routing criteria, see slave_selection_criteria

    Next Step

    "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 8 of 9.

    Next: Step 9: Import Data.

    Stop the Enterprise ColumnStore Services

    The installation process might have started some of the ColumnStore services. The services should be stopped prior to making configuration changes.

    1. On each Enterprise ColumnStore node, stop the MariaDB Enterprise Server service:

    1. On each Enterprise ColumnStore node, stop the MariaDB Enterprise ColumnStore service:

    1. On each Enterprise ColumnStore node, stop the CMAPI service:

    Configure Enterprise ColumnStore

    On each Enterprise ColumnStore node, configure Enterprise Server.

    Connector
    MariaDB Connector/R2DBC

    Set this system variable to utf8

    Set this system variable to utf8_general_ci

    columnstore_use_import_for_batchinsert

    Set this system variable to ALWAYS to always use cpimport for LOAD DATA INFILE and INSERT...SELECT statements.

    Set this system variable to ON.

    Set this option to the file you want to use for the Binary Log. Setting this option enables binary logging.

    Set this option to the file you want to use to track binlog filenames.

    Mandatory system variables and options for ColumnStore Object Storage include:

    Example Configuration

    Start the Enterprise ColumnStore Services

    1. On each Enterprise ColumnStore node, start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:

    1. On each Enterprise ColumnStore node, stop the MariaDB Enterprise ColumnStore service:

    After the CMAPI service is installed in the next step, CMAPI will start the Enterprise ColumnStore service as needed on each node. CMAPI disables the Enterprise ColumnStore service to prevent systemd from automatically starting Enterprise ColumnStore upon reboot.

    1. On each Enterprise ColumnStore node, start and enable the CMAPI service, so that it starts automatically upon reboot:

    For additional information, see "Start and Stop Services".

    Create User Accounts

    The ColumnStore Object Storage topology requires several user accounts. Each user account should be created on the primary server, so that it is replicated to the replica servers.

    Create the Utility User

    Enterprise ColumnStore requires a mandatory utility user account to perform cross-engine joins and similar operations.

    1. On the primary server, create the user account with the CREATE USER statement:

    1. On the primary server, grant the user account SELECT privileges on all databases with the GRANT statement:

    1. On each Enterprise ColumnStore node, configure the ColumnStore utility user:

    1. On each Enterprise ColumnStore node, set the password:

    For details about how to encrypt the password, see "Credentials Management for MariaDB Enterprise ColumnStore".

    Passwords should meet your organization's password policies. If your MariaDB Enterprise Server instance has a password validation plugin installed, then the password should also meet the configured requirements.

    Create the Replication User

    ColumnStore Object Storage uses MariaDB Replication to replicate writes between the primary and replica servers. As MaxScale can promote a replica server to become a new primary in the event of node failure, all nodes must have a replication user.

    The action is performed on the primary server.

    Create the replication user and grant it the required privileges:

    1. Use the CREATE USER statement to create replication user.

    Replace the referenced IP address with the relevant address for your environment.

    Ensure that the user account can connect to the primary server from each replica.

    1. Grant the user account the required privileges with the GRANT statement.

    Create MaxScale User

    ColumnStore Object Storage 23.10 uses MariaDB MaxScale 22.08 to load balance between the nodes.

    This action is performed on the primary server.

    1. Use the statement to create the MaxScale user:

    Replace the referenced IP address with the relevant address for your environment.

    Ensure that the user account can connect from the IP address of the MaxScale instance.

    1. Use the statement to grant the privileges required by the router:

    1. Use the statement to grant privileges required by the MariaDB Monitor.

    Configure MariaDB Replication

    On each replica server, configure MariaDB Replication:

    1. Use the CHANGE MASTER TO statement to configure the connection to the primary server:

    1. Start replication using the START REPLICA statement:

    1. Confirm that replication is working using the SHOW REPLICA STATUS statement:

    Ensure that the replica server cannot accept local writes by setting the read_only system variable to ON using the SET GLOBAL statement:

    Initiate the Primary Server with CMAPI

    Initiate the primary server using CMAPI.

    1. Create an API key for the cluster. This API key should be stored securely and kept confidential, because it can be used to add cluster nodes to the multi-node Enterprise ColumnStore deployment.

    For example, to create a random 256-bit API key using openssl rand:

    This document will use the following API key in further examples, but users should create their own:

    1. Use CMAPI to add the primary server to the cluster and set the API key. The new API key needs to be provided as part of the X-API-key HTML header.

    For example, if the primary server's host name is mcs1 and its IP address is 192.0.2.1, use the following node command:

    1. Use CMAPI to check the status of the cluster node:

    Add Replica Servers with CMAPI

    Add the replica servers with CMAPI:

    1. For each replica server, use CMAPI to add the replica server to the cluster. The previously set API key needs to be provided as part of the X-API-key HTML header.

    For example, if the primary server's host name is mcs1 and the replica server's IP address is 192.0.2.2, use the following node command:

    1. After all replica servers have been added, use CMAPI to confirm that all cluster nodes have been successfully added:

    Configure Linux Security Modules (LSM)

    The specific steps to configure the security module depend on the operating system.

    Configure SELinux (CentOS, RHEL)

    Configure SELinux for Enterprise ColumnStore:

    1. To configure SELinux, you have to install the packages required for audit2allow. On CentOS 7 and RHEL 7, install the following:

    On RHEL 8, install the following:

    1. Allow the system to run under load for a while to generate SELinux audit events.

    2. After the system has taken some load, generate an SELinux policy from the audit events using audit2allow:

    If no audit events were found, this will print the following:

    1. If audit events were found, the new SELinux policy can be loaded using semodule:

    1. Set SELinux to enforcing mode:

    1. Set SELinux to enforcing mode by setting SELINUX=enforcing in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    1. Confirm that SELinux is in enforcing mode:

    Configure AppArmor (Ubuntu)

    For information on how to create a profile, see How to create an AppArmor Profile on Ubuntu.com.

    Configure Firewalls

    The specific steps to configure the firewall service depend on the platform.

    Configure firewalld (CentOS, RHEL)

    Configure firewalld for Enterprise Cluster on CentOS and RHEL:

    1. Check if the firewalld service is running:

    1. If the firewalld service was stopped to perform the installation, start it now:

    For example, if your cluster nodes are in the 192.0.2.0/24 subnet:

    1. Open up the relevant ports using firewall-cmd:

    1. Reload the runtime configuration:

    Configure UFW (Ubuntu)

    Configure UFW for Enterprise ColumnStore on Ubuntu:

    1. Check if the UFW service is running:

    1. If the UFW service was stopped to perform the installation, start it now:

    1. Open up the relevant ports using ufw.

    For example, if your cluster nodes are in the 192.0.2.0/24 subnet in the range 192.0.2.1 - 192.0.2.3:

    1. Reload the runtime configuration:

    Next Step

    Navigation in the procedure "Deploy ColumnStore Shared Local Storage Topology".

    This page was step 4 of 9.

    Next: Step 5: Test MariaDB Enterprise Server.

    ColumnStore Architectural Overview

    MariaDB ColumnStore enhances MariaDB Enterprise Server with a columnar engine for OLAP and HTAP workloads, using MPP for scalability. It supports cross-engine JOINs, integrates with S3 storage, and provides high-speed bulk loading with multi-node management via REST API.

    MariaDB ColumnStore is a columnar storage engine designed for distributed massively parallel processing (MPP), such as for big data analysis. Deployments can be composed of several MariaDB servers or just one, each running several subprocess working together to provide linear scalability and exceptional performance with real-time response to analytical queries.

    It provides a highly available, fault tolerant, and performant columnar storage engine for MariaDB Enterprise Server. MariaDB Enterprise ColumnStore is designed for data warehousing, decision support systems (DSS), online analytical processing (OLAP), and hybrid transactional-analytical processing (HTAP).

    Benefits

    • Columnar storage engine that enables MariaDB Enterprise Server to perform new workloads

    • Optimized for online analytical process (OLAP) workloads including data warehousing, decision support systems, and business intelligence

    • Single-stack solution for hybrid transactional-analytical workloads to eliminate barriers and prevent data silos

    • Implements cross-engine JOINs to join Enterprise ColumnStore tables with tables using row-based storage engines, such as

    • Smart storage engine that plans and optimizes its own queries using a custom select handler

    • Scalable query execution using massively parallel processing (MPP) strategies, parallel query execution, and distributed function evaluation

    • S3-compatible object storage can be used for highly available, low-cost, multi-regional, resilient, scalable, secure, and virtually limitless data storage

    • High availability and automatic failover by leveraging

    • REST API for multi-node administration with the Cluster Management API (CMAPI) server

    • Connectors for popular BI platforms such as Microsoft Power BI and Tableau

    • High-speed bulk data loading that bypasses the SQL layer and does not block concurrent read-only queries

    Topologies

    MariaDB Enterprise ColumnStore supports multiple topologies. Several options are described below. MariaDB Enterprise ColumnStore can be deployed in other topologies. The topologies on this page are representative of basic product capabilities.

    MariaDB products can be deployed to form other topologies that leverage advanced product capabilities and combine the capabilities of multiple topologies.

    Enterprise ColumnStore with Object Storage

    The MariaDB Enterprise ColumnStore topology with Object Storage delivers production analytics with high availability, fault tolerance, and limitless data storage by leveraging S3-compatible storage.

    The topology consists of:

    • One or more MaxScale nodes

    • An odd number of ColumnStore nodes (minimum of 3) running ES, Enterprise ColumnStore, and CMAPI

    The MaxScale nodes:

    • Monitor the health and availability of each ColumnStore node using the MariaDB Monitor (mariadbmon)

    • Accept client and application connections

    • Route queries to ColumnStore nodes using the Read/Write Split Router (readwritesplit)

    The ColumnStore nodes:

    • Receive queries from MaxScale

    • Execute queries

    • Use for data

    • Use for the .

    Enterprise ColumnStore with Shared Local Storage

    The MariaDB Enterprise ColumnStore topology with Shared Local Storage delivers production analytics with high availability and fault tolerance by leveraging shared local storage, such as NFS.

    The topology consists of:

    • One or more MaxScale nodes

    • An odd number of ColumnStore nodes (minimum of 3) running ES, Enterprise ColumnStore, and CMAPI

    The MaxScale nodes:

    • Monitor the health and availability of each ColumnStore node using the MariaDB Monitor (mariadbmon)

    • Accept client and application connections

    • Route queries to ColumnStore nodes using the Read/Write Split Router (readwritesplit)

    The ColumnStore nodes:

    • Receive queries from MaxScale

    • Execute queries

    • Use for the .

    Software Architecture

    Software Component
    Role

    MariaDB Enterprise ColumnStore

    MariaDB Enterprise ColumnStore is the columnar storage engine that handles data storage and query optimization/execution.

    MariaDB Enterprise ColumnStore is a columnar storage engine that is optimized for analytical or online analytical processing (OLAP) workloads, data warehousing, and DSS. MariaDB Enterprise ColumnStore can be used for hybrid transactional-analytical processing (HTAP) workloads when paired with a row-based storage engine, like .

    MariaDB Enterprise Server

    MariaDB Enterprise ColumnStore is built on top of . MariaDB Enterprise ColumnStore 5 is included with the standard MariaDB Enterprise Server 10.5 releases, while MariaDB Enterprise ColumnStore 6 is included with the standard MariaDB Enterprise Server 10.6 releases.

    Enterprise ColumnStore interfaces with the Enterprise Server SQL engine through the ColumnStore storage engine plugin.

    MariaDB has been continually improving the integration of MariaDB Enterprise ColumnStore with MariaDB Enterprise Server:

    • MariaDB ColumnStore required special custom-built releases of MariaDB Server.

    • MariaDB Enterprise ColumnStore was included with the standard MariaDB Enterprise Server 10.5 releases up to ES 10.5.5-3. It was the first release to replace the Operations/Administration/Maintenance (OAM) API with the more modern Cluster Management API (CMAPI), which is still in use.

    • Starting with ES 10.5.6-4, MariaDB Enterprise ColumnStore is included with the standard MariaDB Enterprise Server 10.5 releases.

    ColumnStore Storage Engine Plugin

    MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.

    The ColumnStore storage engine plugin is a smart storage engine that implements a custom select handler to fully take advantage of Enterprise ColumnStore's capabilities, such as:

    • Using a custom query planner

    • Selecting data by column instead of by row

    • Parallel query evaluation

    • Distributed aggregations

    As a smart storage engine, the ColumnStore storage engine plugin tightly integrates Enterprise ColumnStore with ES, but it has enough independence to efficiently execute analytical queries using a completely unique approach.

    For additional information, see "".

    Cluster Management API (CMAPI) Server

    The server provides a REST API that can be used to configure and manage Enterprise ColumnStore.

    CMAPI must run on every ColumnStore node in a multi-node deployment but is not required in a single-node deployment.

    The REST API can be used to perform multiple operations:

    • Add ColumnStore nodes

    • Remove ColumnStore nodes

    • Start Enterprise ColumnStore

    • Shutdown Enterprise ColumnStore

    MariaDB MaxScale

    MariaDB Enterprise ColumnStore leverages as an advanced database proxy and query router.

    Multi-node Enterprise ColumnStore deployments must have one or more nodes. MaxScale performs many different roles:

    • Routing writes queries to the primary server

    • Load balancing read queries on replica servers

    • Monitoring node health

    • Performing automatic failover if a node fails

    Storage Architecture

    MariaDB Enterprise ColumnStore's storage architecture provides a columnar storage engine with high availability, fault tolerance, compression, and automatic partitioning for production analytics and data warehousing.

    For additional information, see " and ".

    Columnar Storage Engine

    MariaDB Enterprise ColumnStore is a columnar storage engine for . MariaDB Enterprise ColumnStore enables ES to perform analytical workloads, including online analytical processing (OLAP), data warehousing, decision support systems (DSS), and hybrid transactional-analytical processing (HTAP) workloads.

    Most traditional relational databases use row-based storage engines. In row-based storage engines, all columns for a table are stored contiguously. Row-based storage engines perform very well for transactional workloads but are less performant for analytical workloads.

    Columnar storage engines store each column separately. Columnar storage engines perform very well for analytical workloads. Analytical workloads are characterized by ad hoc queries on very large data sets by relatively few users.

    MariaDB Enterprise ColumnStore automatically partitions each column into extents, which helps improve query performance without using indexes.

    S3-Compatible Object Storage

    MariaDB Enterprise ColumnStore supports S3-compatible object storage.

    S3-compatible object storage is optional, but highly recommended. If S3-compatible object storage is used, Enterprise ColumnStore requires the to use (such as NFS) for high availability.

    S3-compatible object storage is:

    • Compatible: Many object storage services are compatible with the Amazon S3 API.

    • Economical: S3-compatible object storage is often very low cost.

    • Flexible: S3-compatible object storage is available for both cloud and on-premises deployments.

    • Limitless: S3-compatible object storage is often virtually limitless.

    Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.

    If you have any questions about using specific S3-compatible object storage with MariaDB Enterprise ColumnStore, contact us.

    Shared Local Storage

    MariaDB Enterprise ColumnStore can use shared local storage.

    Shared local storage is required for high availability. The specific depend on whether Enterprise ColumnStore is configured to use :

    • When S3-compatible object storage is used, Enterprise ColumnStore requires the to use shared local storage for high availability.

    • When S3-compatible object storage is not used, Enterprise ColumnStore requires the to use shared local storage for high availability.

    The most common shared local storage options for on-premises and cloud deployments are:

    • NFS (Network File System)

    • GlusterFS

    The most common shared local storage options for AWS (Amazon Web Services) deployments are:

    • EBS (Elastic Block Store) Multi-Attach

    • EFS (Elastic File System)

    The most common shared local storage option for GCP (Google Cloud Platform) deployments is:

    • Filestore

    Query Evaluation Architecture

    MariaDB Enterprise ColumnStore uses distributed query execution and massively parallel processing (MPP) techniques to achieve vertical and horizontal scalability for production analytics and data warehousing.

    For additional information, see "".

    Extent Elimination

    MariaDB Enterprise ColumnStore uses extent elimination to scale query evaluation as the table size increases.

    Most databases are row-based, utilizing manually created indexes to achieve high performance on large tables. This works well for transactional workloads. However, analytical queries tend to have very low selectivity, so traditional indexes are not typically effective for analytical queries.

    Enterprise ColumnStore uses extent elimination to achieve high performance, without requiring manually created indexes. Enterprise ColumnStore automatically partitions all data into extents. Enterprise ColumnStore stores the minimum and maximum values for each in the extent map. Enterprise ColumnStore uses the minimum and maximum values in the to perform extent elimination.

    When Enterprise ColumnStore performs extent elimination, it compares the query's join conditions and filter conditions (i.e., WHERE clause) to the minimum and maximum values for each extent in the extent map. If the extent's minimum and maximum values fall outside the bounds of the query's conditions, Enterprise ColumnStore skips that extent for the query.

    Extent elimination is automatically performed for every query. It can significantly decrease I/O for columns with clustered values. For example, extent elimination works effectively for series, ordered, patterned, and time-based data.

    Custom Select Handler

    The ColumnStore storage engine plugin implements a custom select handler to fully take advantage of Enterprise ColumnStore's capabilities.

    All storage engines interact with ES using an internal handler API, which is highly extensible. Storage engines can implement different features by implementing different methods within the handler API.

    For select statements, the handler API transforms each query into a SELECT_LEX object, which is provided to the select handler.

    The generic select handler is not optimal for Enterprise ColumnStore, because:

    • Enterprise ColumnStore selects data by column, but the generic selects handler selects data by row.

    • Enterprise ColumnStore supports parallel query evaluation, but the generic select handler does not.

    • Enterprise ColumnStore supports distributed aggregations, but the generic select handler does not.

    • Enterprise ColumnStore supports distributed functions, but the generic select handler does not.

    Smart Storage Engine

    The ColumnStore storage engine plugin is known as a smart storage engine, because it implements a custom select handler. MariaDB Enterprise ColumnStore integrates with MariaDB Enterprise Server using the ColumnStore storage engine plugin. The ColumnStore storage engine plugin enables MariaDB Enterprise Server to interact with ColumnStore tables.

    If a storage engine implements a custom select handler, it is known as a smart storage engine.

    As a smart storage engine, the ColumnStore storage engine plugin tightly integrates Enterprise ColumnStore with ES, but it has enough independence to efficiently execute analytical queries using a completely unique approach.

    Query Planning

    The ColumnStore storage engine plugin is a smart storage engine, so MariaDB Enterprise ColumnStore to plan its queries using the .

    MariaDB Enterprise ColumnStore's query planning is divided into two steps:

    1. ES provides the query's SELECT_LEX object to the . The custom selects handler builds a .

    2. The custom select handler provides the CSEP to the on the same node. The ExeMgr process performs and creates a job list.

    Job Steps

    When Enterprise ColumnStore executes a query, the ExeMgr process on the initiator/aggregator node translates the ColumnStore execution plan (CSEP) into a job list. A job list is a sequence of job steps.

    Enterprise ColumnStore uses many different types of job steps that provide different scalability benefits:

    • Some types of job steps perform operations in a distributed manner using multiple nodes to operate on different extents. Distributed operations provide horizontal scalability.

    • Some types of job steps perform operations in a multi-threaded manner using a thread pool. Performing multi-threaded operations provides vertical scalability.

    As you increase the number of ColumnStore nodes or the number of cores on each node, Enterprise ColumnStore can use those resources to more efficiently execute job steps.

    High Availability and Failover

    MariaDB Enterprise ColumnStore leverages common technologies to provide highly available production analytics with automatic failover:

    Technology
    Role

    Shared Local Storage

    MariaDB Enterprise ColumnStore can use shared local storage.

    Shared local storage is required for high availability. The specific depend on whether Enterprise ColumnStore is configured to use :

    • When S3-compatible object storage is used, Enterprise ColumnStore requires the to use shared local storage for high availability.

    • When S3-compatible object storage is not used, Enterprise ColumnStore requires the to use for high availability.

    The most common shared local storage options for on-premises and cloud deployments are:

    • NFS (Network File System)

    • GlusterFS

    The most common shared local storage options for AWS (Amazon Web Services) deployments are:

    • EBS (Elastic Block Store) Multi-Attach

    • EFS (Elastic File System)

    The most common shared local storage option for GCP (Google Cloud Platform) deployments is:

    • Filestore

    MariaDB Replication

    MariaDB Enterprise ColumnStore requires to synchronize various database objects on multiple nodes for high availability.

    MariaDB replication synchronizes:

    • The schemas for all ColumnStore tables on all nodes

    • The schemas and data for all non-ColumnStore tables on all nodes

    • All other databases objects (stored procedures, stored functions, user accounts, and other objects) on all nodes

    MaxScale

    MariaDB Enterprise ColumnStore requires to achieve high availability, automatic failover, and load balancing.

    MariaDB Monitor (mariadbmon) in MaxScale monitors the health of each Enterprise ColumnStore node.

    MaxScale provides load balancing by routing queries and/or connections to healthy nodes by:

    • Providing query-based routing using Read/Write Split Router (readwritesplit).

    • Providing connection-based routing using Read Connection Router (readconnroute).

    When MaxScale's MariaDB Monitor notices the primary node fail, MariaDB Monitor performs automatic failover by:

    • Promoting a replica node to become the new primary node.

    • Re-configuring all replica nodes to replicate from the new primary node.

    Cluster Management API (CMAPI) Server

    MariaDB Enterprise ColumnStore requires the Cluster Management API (CMAPI) Server for high availability.

    The CMAPI server provides a REST API that can be used to manage and configure Enterprise ColumnStore.

    The CMAPI server has a role in automatic failover. After MaxScale performs automatic failover, the CMAPI server detects the topology change and automatically re-configures the roles of each Enterprise ColumnStore node.

    Data Loading

    MariaDB Enterprise ColumnStore performs bulk data loads very efficiently using a variety of mechanisms including the cpimport tool, specialized handling of certain SQL statements, and minimal locking during data import.

    For additional information, see "".

    cpimport

    MariaDB Enterprise ColumnStore includes a bulk data loading tool called cpimport, which provides several benefits:

    • Bypasses the SQL layer to decrease overhead

    • Does not block read queries

    • Requires a write metadata lock on the table, which can be monitored with the .

    • Appends the new data to the table. While the bulk load is in progress, the newly appended data is temporarily hidden from queries. After the bulk load is complete, the newly appended data is visible to queries.

    Batch Insert Mode

    MariaDB Enterprise ColumnStore enables batch insert mode by default.

    When batch insert mode is enabled, MariaDB Enterprise ColumnStore, MariaDB Enterprise ColumnStore has special handling for the following statements:

    • .

    Enterprise ColumnStore uses the following rules:

    • If the statement is executed outside of a transaction, Enterprise ColumnStore loads the data using cpimport, which is a command-line utility that is designed to efficiently load data in bulk. Enterprise ColumnStore executes cpimport using a wrapper called cpimport.bin.

    • If the statement is executed inside of a transaction, Enterprise ColumnStore loads the data using the DML interface, which is slower.

    Batch insert mode can be disabled by setting the columnstore_use_import_for_batchinsert system variable. When batch insert mode is disabled, Enterprise ColumnStore executes the statements using the DML interface, which is slower.

    Locking

    MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.

    When a bulk data load is running:

    • Read queries will not be blocked.

    • Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.

    • The write metadata lock (MDL) can be monitored with the .

    Backup and Restore

    MariaDB Enterprise ColumnStore supports backup and restore using well-known tools and methods.

    Component
    Backup Methods

    For additional information, see "".

    S3-Compatible Object Storage

    MariaDB Enterprise ColumnStore can leverage S3 snapshots to backup S3-compatible object storage when it is used for Enterprise ColumnStore's data.

    The S3-compatible object storage can be backed up by:

    1. Locking the database on the primary node

    2. Performing an S3 snapshot using the vendor's standard snapshot functionality

    Shared Local Storage

    MariaDB Enterprise ColumnStore can leverage file system snapshots or file copy tools (such as rsync) to backup shared local storage when it is used for the or the .

    The shared local storage can be backed up by:

    1. Locking the database on the primary node

    2. Performing a file system snapshot or using a file copy tool (such as rsync) to copy the contents of the or the .

    Enterprise Server Data Directory

    MariaDB Enterprise ColumnStore can leverage the standard utility to back up the Enterprise Server data directory.

    The backup contains:

    • All ColumnStore schemas

    • All non-ColumnStore schemas and data

    • All other database objects

    It does not contain:

    • ColumnStore data

    Step 4: Start and Configure MariaDB Enterprise Server

    Step 4: Start and Configure MariaDB Enterprise Server

    Overview

    This page details step 4 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step starts and configures MariaDB Enterprise Server, and MariaDB Enterprise ColumnStore 23.10.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Stop the Enterprise ColumnStore Services

    The installation process might have started some of the ColumnStore services. The services should be stopped prior to making configuration changes.

    1. On each Enterprise ColumnStore node, stop the MariaDB Enterprise Server service:

    1. On each Enterprise ColumnStore node, stop the MariaDB Enterprise ColumnStore service:

    1. On each Enterprise ColumnStore node, stop the CMAPI service:

    Configure Enterprise ColumnStore

    On each Enterprise ColumnStore node, configure Enterprise Server.

    Connector
    MariaDB Connector/R2DBC

    Mandatory system variables and options for ColumnStore Object Storage include:

    Example Configuration

    Configure the S3 Storage Manager

    On each Enterprise ColumnStore node, configure S3 Storage Manager to use S3-compatible storage by editing the /etc/columnstore/storagemanager.cnf configuration file:

    The S3-compatible object storage options are configured under [S3]:

    • The bucket option must be set to the name of the bucket that you created in "Create an S3 Bucket".

    • The endpoint option must be set to the endpoint for the S3-compatible object storage.

    • The aws_access_key_id and aws_secret_access_key options must be set to the access key ID and secret access key for the S3-compatible object storage.

    • To use a specific IAM role, you must uncomment and set

    The local cache options are configured under [Cache]:

    • The cache_size option is set to 2 GB by default.

    • The path option is set to /var/lib/columnstore/storagemanager/cache by default.

    Ensure that the specified path has sufficient storage space for the specified cache size.

    Start the Enterprise ColumnStore Services

    1. On each Enterprise ColumnStore node, start and enable the MariaDB Enterprise Server service, so that it starts automatically upon reboot:

    1. On each Enterprise ColumnStore node, stop the MariaDB Enterprise ColumnStore service:

    After the CMAPI service is installed in the next step, CMAPI will start the Enterprise ColumnStore service as needed on each node. CMAPI disables the Enterprise ColumnStore service to prevent systemd from automatically starting Enterprise ColumnStore upon reboot.

    1. On each Enterprise ColumnStore node, start and enable the CMAPI service, so that it starts automatically upon reboot:

    For additional information, see "".

    Create User Accounts

    The ColumnStore Object Storage topology requires several user accounts. Each user account should be created on the primary server, so that it is replicated to the replica servers.

    Create the Utility User

    Enterprise ColumnStore requires a mandatory utility user account to perform cross-engine joins and similar operations.

    1. On the primary server, create the user account with the CREATE USER statement:

    1. On the primary server, grant the user account SELECT privileges on all databases with the GRANT statement:

    1. On each Enterprise ColumnStore node, configure the ColumnStore utility user:

    1. On each Enterprise ColumnStore node, set the password:

    For details about how to encrypt the password, see "".

    Passwords should meet your organization's password policies. If your MariaDB Enterprise Server instance has a password validation plugin installed, then the password should also meet the configured requirements.

    Create the Replication User

    ColumnStore Object Storage uses MariaDB Replication to replicate writes between the primary and replica servers. As MaxScale can promote a replica server to become a new primary in the event of node failure, all nodes must have a replication user.

    The action is performed on the primary server.

    Create the replication user and grant it the required privileges:

    1. Use the CREATE USER statement to create replication user.

    Replace the referenced IP address with the relevant address for your environment.

    Ensure that the user account can connect to the primary server from each replica.

    1. Grant the user account the required privileges with the GRANT statement.

    Create MaxScale User

    ColumnStore Object Storage 23.10 uses MariaDB MaxScale 22.08 to load balance between the nodes.

    This action is performed on the primary server.

    1. Use the statement to create the MaxScale user:

    Replace the referenced IP address with the relevant address for your environment.

    Ensure that the user account can connect from the IP address of the MaxScale instance.

    1. Use the statement to grant the privileges required by the router:

    1. Use the statement to grant privileges required by the MariaDB Monitor.

    Configure MariaDB Replication

    On each replica server, configure MariaDB Replication:

    1. Use the CHANGE MASTER TO statement to configure the connection to the primary server:

    1. Start replication using the START REPLICA statement:

    1. Confirm that replication is working using the SHOW REPLICA STATUS statement:

    Ensure that the replica server cannot accept local writes by setting the read_only system variable to ON using the SET GLOBAL statement:

    Initiate the Primary Server with CMAPI

    Initiate the primary server using CMAPI.

    1. Create an API key for the cluster. This API key should be stored securely and kept confidential, because it can be used to add cluster nodes to the multi-node Enterprise ColumnStore deployment.

    For example, to create a random 256-bit API key using openssl rand:

    This document will use the following API key in further examples, but users should create their own:

    1. Use CMAPI to add the primary server to the cluster and set the API key. The new API key needs to be provided as part of the X-API-key HTML header.

    For example, if the primary server's host name is mcs1 and its IP address is 192.0.2.1, use the following node command:

    1. Use CMAPI to check the status of the cluster node:

    Add Replica Servers with CMAPI

    Add the replica servers with CMAPI:

    1. For each replica server, use to add the replica server to the cluster. The previously set API key needs to be provided as part of the X-API-key HTML header.

    For example, if the primary server's host name is mcs1 and the replica server's IP address is 192.0.2.2, use the following node command:

    1. After all replica servers have been added, use CMAPI to confirm that all cluster nodes have been successfully added:

    Configure Linux Security Modules (LSM)

    The specific steps to configure the security module depend on the operating system.

    Configure SELinux (CentOS, RHEL)

    Configure SELinux for Enterprise ColumnStore:

    1. To configure SELinux, you have to install the packages required for audit2allow. On CentOS 7 and RHEL 7, install the following:

    On RHEL 8, install the following:

    1. Allow the system to run under load for a while to generate SELinux audit events.

    2. After the system has taken some load, generate an SELinux policy from the audit events using audit2allow:

    If no audit events were found, this will print the following:

    1. If audit events were found, the new SELinux policy can be loaded using semodule:

    1. Set SELinux to enforcing mode:

    1. Set SELinux to enforcing mode by setting SELINUX=enforcing in /etc/selinux/config.

    For example, the file will usually look like this after the change:

    1. Confirm that SELinux is in enforcing mode:

    Configure AppArmor (Ubuntu)

    For information on how to create a profile, see on Ubuntu.com.

    Configure Firewalls

    The specific steps to configure the firewall service depend on the platform.

    Configure firewalld (CentOS, RHEL)

    Configure firewalld for Enterprise Cluster on CentOS and RHEL:

    1. Check if the firewalld service is running:

    1. If the firewalld service was stopped to perform the installation, start it now:

    For example, if your cluster nodes are in the 192.0.2.0/24 subnet:

    1. Open up the relevant ports using firewall-cmd:

    1. Reload the runtime configuration:

    Configure UFW (Ubuntu)

    Configure UFW for Enterprise ColumnStore on Ubuntu:

    1. Check if the UFW service is running:

    1. If the UFW service was stopped to perform the installation, start it now:

    1. Open up the relevant ports using ufw.

    For example, if your cluster nodes are in the 192.0.2.0/24 subnet in the range 192.0.2.1 - 192.0.2.3:

    1. Reload the runtime configuration:

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 4 of 9.

    Step 8: Test MariaDB MaxScale

    Step 8: Test MariaDB MaxScale

    Overview

    This page details step 8 of the 9-step procedure "Deploy ColumnStore Object Storage Topology".

    This step tests MariaDB MaxScale 22.08.

    Interactive commands are detailed. Alternatively, the described operations can be performed using automation.

    Check Global Configuration

    Use command to view the global MaxScale configuration.

    This action is performed on the MaxScale node:

    Output should align to the global MaxScale configuration in the new configuration file you created.

    Check Server Configuration Use the and commands to view the configured server objects.

    This action is performed on the MaxScale node:

    1. Obtain the full list of servers objects:

    For each server object, view the configuration:

    Output should align to the Server Object configuration you performed.

    Check Monitor Configuration

    Use the and commands to view the configured monitors.

    This action is performed on the MaxScale node:

    1. Obtain the full list of monitors:

    1. For each monitor, view the monitor configuration:

    Output should align to the MariaDB Monitor (mariadbmon) configuration you performed.

    Check Service Configuration

    Use the and commands to view the configured routing services.

    This action is performed on the MaxScale node:

    1. Obtain the full list of routing services:

    1. For each service, view the service configuration:

    Output should align to the or configuration you performed.

    Test Application User

    Applications should use a dedicated user account. The user account must be created on the primary server.

    When users connect to MaxScale, MaxScale authenticates the user connection before routing it to an Enterprise Server node. Enterprise Server authenticates the connection as originating from the IP address of the MaxScale node.

    The application users must have one user account with the host IP address of the application server and a second user account with the host IP address of the MaxScale node.

    The requirement of a duplicate user account can be avoided by enabling the proxy_protocol parameter for MaxScale and the proxy_protocol_networks for Enterprise Server.

    Create a User to Connect from MaxScale

    This action is performed on the primary Enterprise ColumnStore node:

    1. Connect to the primary Enterprise ColumnStore node:

    1. Create the database user account for your MaxScale node:

    Replace 192.0.2.10 with the relevant IP address specification for your MaxScale node.

    Passwords should meet your organization's password policies.

    1. Grant the privileges required by your application to the database user account for your MaxScale node:

    The privileges shown are designed to allow the tests in the subsequent sections to work. The user account for your production application may require different privileges.

    Create a User to Connect from the Application Server

    This action is performed on the primary Enterprise ColumnStore node:

    1. Create the database user account for your application server:

    Replace 192.0.2.11 with the relevant IP address specification for your application server.

    Passwords should meet your organization's password policies.

    1. Grant the privileges required by your application to the d database user account for your application server:

    The privileges shown are designed to allow the tests in the subsequent sections to work. The user account for your production application may require different privileges.

    Test Connection with Application User

    To test the connection, use the MariaDB Client from your application server to connect to an Enterprise ColumnStore node through MaxScale.

    This action is performed on a client connected to the MaxScale node:

    Test Connection with Read Connection Router

    If you configured the Read Connection Router, confirm that MaxScale routes connections to the replica servers.

    On the MaxScale node, use the command to view the available listeners and ports:

    1. Open multiple terminals connected to your application server, in each, use MariaDB Client to connect to the listener port for the Read Connection Router (in the example, 3308):

    Use the application user credentials you created for the --user and --password options.

    1. In each terminal, query the hostname and server_id system variable and option to identify to which you're connected:

    Different terminals should return different values since MaxScale routes the connections to different nodes.

    Since the router was configured with the slave router option, the Read Connection Router only routes connections to replica servers.

    Test Write Queries with Read/Write Split Router

    If you configured the Read/Write Split Router, confirm that MaxScale routes write queries on this router to the primary Enterprise ColumnStore node.

    on the MaxScale node, use the command to view the available listeners and ports:

    1. Open multiple terminals connected to your application server, in each, use MariaDB Client to connect to the listener port for the Read/Write Split Router (in the example, 3307):

    Use the application user credentials you created for the --user and --password options.

    1. In one terminal, create the test table:

    1. In each terminal, issue an insert.md statement to add a row to the example table with the values of the hostname and server_id system variable and option:

    1. In one terminal, issue a SELECT statement to query the results:

    While MaxScale is handling multiple connections from different terminals, it routed all connections to the current primary Enterprise ColumnStore node, which in the example is mcs1#.

    Test Read Queries with Read/Write Split Router

    If you configured the , confirm that MaxScale routes read queries on this router to replica servers.

    1. On the MaxScale node, use the command to view the available listeners and ports:

    1. In a terminal connected to your application server, use MariaDB Client to connect to the listener port for the (in the example, 3307):

    Use the application user credentials you created for the --user and --password options.

    1. Query the hostname and server_id to identify which server MaxScale routed you to.

    1. Resend the query:

    Confirm that MaxScale routes the SELECT statements to different replica servers.

    For more information on different routing criteria, see slave_selection_criteria

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    This page was step 8 of 9.

    ColumnStore Bulk Data Loading

    Overview

    cpimport is a high-speed bulk load utility that imports data into ColumnStore tables in a fast and efficient manner. It accepts as input any flat file containing data that contains a delimiter between fields of data (i.e. columns in a table). The default delimiter is the pipe (‘|’) character, but other delimiters such as commas may be used as well. The data values must be in the same order as the create table statement, i.e. column 1 matches the first column in the table and so on. Date values must be specified in the format 'yyyy-mm-dd'.

    cpimport – performs the following operations when importing data into a MariaDB ColumnStore database:

    • Data is read from specified flat files.

    • Data is transformed to fit ColumnStore’s column-oriented storage design.

    • Redundant data is tokenized and logically compressed.

    • Data is written to disk.

    It is important to note that:

    • The bulk loads are an append operation to a table, so they allow existing data to be read and remain unaffected during the process.

    • The bulk loads do not write their data operations to the transaction log; they are not transactional in nature but are considered an atomic operation at this time. Information markers, however, are placed in the transaction log so the DBA is aware that a bulk operation did occur.

    • Upon completion of the load operation, a high-water mark in each column file is moved in an atomic operation that allows for any subsequent queries to read the newly loaded data. It appends operation provides for consistent read but does not incur the overhead of logging the data.

    There are two primary steps to using the cpimport utility:

    1. Optionally create a job file that is used to load data from a flat file into multiple tables.

    2. Run the cpimport utility to perform the data import.

    Syntax

    The simplest form of cpimport command is

    The full syntax is like this:

    cpimport modes

    Mode 1: Bulk Load from a central location with single data source file

    In this mode, you run the cpimport from your primary node (mcs1). The source file is located at this primary location and the data from cpimport is distributed across all the nodes. If no mode is specified, then this is the default.

    Example:

    Mode 2: Bulk load from central location with distributed data source files

    In this mode, you run the cpimport from your primary node (mcs1). The source data is in already partitioned data files residing on the PMs. Each PM should have the source data file of the same name but containing the partitioned data for the PM

    Example:

    Mode 3: Parallel distributed bulk load

    In this mode, you run cpimport from the individual nodes independently, which will import the source file that exists on that node. Concurrent imports can be executed on every node for the same table.

    Example:

    Note:

    • The bulk loads are an append operation to a table, so they allow existing data to be read and remain unaffected during the process.

    • The bulk loads do not write their data operations to the transaction log; they are not transactional in nature but are considered an atomic operation at this time. Information markers, however, are placed in the transaction log so the DBA is aware that a bulk operation did occur.

    • Upon completion of the load operation, a high-water mark in each column file is moved in an atomic operation that allows for any subsequent queries to read the newly loaded data. It appends operation provides for consistent read but does not incur the overhead of logging the data.

    Bulk loading data from STDIN

    Data can be loaded from STDIN into ColumnStore by simply not including the loadFile parameter

    Example:

    Bulk loading from AWS S3

    Similarly the AWS cli utility can be utilized to read data from an s3 bucket and pipe the output into cpimport allowing direct loading from S3. This assumes the aws cli program has been installed and configured on the host:

    Example:

    For troubleshooting connectivity problems remove the --quiet option which suppresses client logging including permission errors.

    Bulk loading output of SELECT FROM Table(s)

    Standard in can also be used to directly pipe the output from an arbitrary SELECT statement into cpimport. The select statement may select from non-columnstore tables such as or . In the example below, the db2.source_table is selected from, using the -N flag to remove non-data formatting. The -q flag tells the mysql client to not cache results which will avoid possible timeouts causing the load to fail.

    Example:

    Bulk loading from JSON

    Let's create a sample ColumnStore table:

    Now let's create a sample products.json file like this:

    We can then bulk load data from JSON into Columnstore by first piping the data to and then to using a one-line command.

    Example:

    In this example, the JSON data is coming from a static JSON file, but this same method will work for, and output streamed from any datasource using JSON such as an API or NoSQL database. For more information on 'jq', please view the manual here .

    Bulk loading into multiple tables

    There are two ways multiple tables can be loaded:

    1. Run multiple cpimport jobs simultaneously. Tables per import should be unique or for each import should be unique if using mode 3.

    2. Use colxml utility: colxml creates an XML job file for your database schema before you can import data. Multiple tables may be imported by either importing all tables within a schema or listing specific tables using the -t option in colxml. Then, using cpimport, that uses the job file generated by colxml. Here is an example of how to use colxml and cpimport to import data into all the tables in a database schema

    colxml syntax

    Example usage of colxml

    The following tables comprise a database name ‘tpch2’:

    1. First, put delimited input data file for each table in /usr/local/mariadb/columnstore/data/bulk/data/import. Each file should be named .tbl.

    2. Run colxml for the load job for the ‘tpch2’ database as shown here:

    Now actually run cpimport to use the job file generated by the colxml execution

    Handling Differences in Column Order and Values

    If there are some differences between the input file and table definition then the colxml utility can be utilized to handle these cases:

    • Different order of columns in the input file from table order

    • Input file column values to be skipped / ignored.

    • Target table columns to be defaulted.

    In this case run the colxml utility (the -t argument can be useful for producing a job file for one table if preferred) to produce the job xml file and then use this a template for editing and then subsequently use that job file for running cpimport.

    Consider the following simple table example:

    This would produce a colxml file with the following table element:

    If your input file had the data such that hire_date comes before salary then the following modification will allow correct loading of that data to the original table definition (note the last 2 Column elements are swapped):

    The following example would ignore the last entry in the file and default salary to it's default value (in this case null):

    • IgnoreFields instructs cpimport to ignore and skip the particular value at that position in the file.

    • DefaultColumn instructs cpimport to default the current table column and not move the column pointer forward to the next delimiter.

    Both instructions can be used indepedently and as many times as makes sense for your data and table definition.

    Binary Source Import

    It is possible to import using a binary file instead of a CSV file using fixed length rows in binary data. This can be done using the '-I' flag which has two modes:

    • -I1 - binary mode with NULLs accepted Numeric fields containing NULL will be treated as NULL unless the column has a default value

    • -I2 - binary mode with NULLs saturated NULLs in numeric fields will be saturated

    The following table shows how to represent the data in the binary format:

    Datatype
    Description

    For NULL values the following table should be used:

    Datatype
    Signed NULL
    Unsigned NULL

    Date Struct

    The spare bits in the Date struct "must" be set to 0x3E.

    DateTime Struct

    Working Folders & Logging

    As of version 1.4, cpimport uses the /var/lib/columnstore/bulk folder for all work being done. This folder contains:

    1. Logs

    2. Rollback info

    3. Job info

    4. A staging folder

    The log folder typically contains:

    A typical log might look like this:

    Prior to version 1.4, this folder was located at /usr/local/mariadb/columnstore/bulk.

    Multi-Node Localstorage

    This guide provides steps for deploying a multi-node ColumnStore, setting up the environment, installing the software, and bulk importing data for online analytical processing (OLAP) workloads.

    Overview

    Software Version
    Diagram
    Features
    $ maxctrl show maxscale
    ┌──────────────┬───────────────────────────────────────────────────────┐
    │ Version      │ 22.08.15                                              │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Commit       │ 3761fa7a52046bc58faad8b5a139116f9e33364c              │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Started At   │ Thu, 05 Aug 2021 20:21:20 GMT                         │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Activated At │ Thu, 05 Aug 2021 20:21:20 GMT                         │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Uptime       │ 868                                                   │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Config Sync  │ null                                                  │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Parameters   │ {                                                     │
    │              │     "admin_auth": true,                               │
    │              │     "admin_enabled": true,                            │
    │              │     "admin_gui": true,                                │
    │              │     "admin_host": "0.0.0.0",                          │
    │              │     "admin_log_auth_failures": true,                  │
    │              │     "admin_pam_readonly_service": null,               │
    │              │     "admin_pam_readwrite_service": null,              │
    │              │     "admin_port": 8989,                               │
    │              │     "admin_secure_gui": false,                        │
    │              │     "admin_ssl_ca_cert": null,                        │
    │              │     "admin_ssl_cert": null,                           │
    │              │     "admin_ssl_key": null,                            │
    │              │     "admin_ssl_version": "MAX",                       │
    │              │     "auth_connect_timeout": "10000ms",                │
    │              │     "auth_read_timeout": "10000ms",                   │
    │              │     "auth_write_timeout": "10000ms",                  │
    │              │     "cachedir": "/var/cache/maxscale",                │
    │              │     "config_sync_cluster": null,                      │
    │              │     "config_sync_interval": "5000ms",                 │
    │              │     "config_sync_password": "*****",                  │
    │              │     "config_sync_timeout": "10000ms",                 │
    │              │     "config_sync_user": null,                         │
    │              │     "connector_plugindir": "/usr/lib64/mysql/plugin", │
    │              │     "datadir": "/var/lib/maxscale",                   │
    │              │     "debug": null,                                    │
    │              │     "dump_last_statements": "never",                  │
    │              │     "execdir": "/usr/bin",                            │
    │              │     "language": "/var/lib/maxscale",                  │
    │              │     "libdir": "/usr/lib64/maxscale",                  │
    │              │     "load_persisted_configs": true,                   │
    │              │     "local_address": null,                            │
    │              │     "log_debug": false,                               │
    │              │     "log_info": false,                                │
    │              │     "log_notice": true,                               │
    │              │     "log_throttling": {                               │
    │              │         "count": 10,                                  │
    │              │         "suppress": 10000,                            │
    │              │         "window": 1000                                │
    │              │     },                                                │
    │              │     "log_warn_super_user": false,                     │
    │              │     "log_warning": true,                              │
    │              │     "logdir": "/var/log/maxscale",                    │
    │              │     "max_auth_errors_until_block": 10,                │
    │              │     "maxlog": true,                                   │
    │              │     "module_configdir": "/etc/maxscale.modules.d",    │
    │              │     "ms_timestamp": false,                            │
    │              │     "passive": false,                                 │
    │              │     "persistdir": "/var/lib/maxscale/maxscale.cnf.d", │
    │              │     "piddir": "/var/run/maxscale",                    │
    │              │     "query_classifier": "qc_sqlite",                  │
    │              │     "query_classifier_args": null,                    │
    │              │     "query_classifier_cache_size": 289073971,         │
    │              │     "query_retries": 1,                               │
    │              │     "query_retry_timeout": "5000ms",                  │
    │              │     "rebalance_period": "0ms",                        │
    │              │     "rebalance_threshold": 20,                        │
    │              │     "rebalance_window": 10,                           │
    │              │     "retain_last_statements": 0,                      │
    │              │     "session_trace": 0,                               │
    │              │     "skip_permission_checks": false,                  │
    │              │     "sql_mode": "default",                            │
    │              │     "syslog": true,                                   │
    │              │     "threads": 1,                                     │
    │              │     "users_refresh_interval": "0ms",                  │
    │              │     "users_refresh_time": "30000ms",                  │
    │              │     "writeq_high_water": 16777216,                    │
    │              │     "writeq_low_water": 8192                          │
    │              │ }                                                     │
    └──────────────┴───────────────────────────────────────────────────────┘
    $ maxctrl list servers
    ┌────────┬────────────────┬──────┬─────────────┬─────────────────┬────────┐
    │ Server │ Address        │ Port │ Connections │ State           │ GTID   │
    ├────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┤
    │ mcs1   │ 192.0.2.1      │ 3306 │ 1           │ Master, Running │ 0-1-25 │
    ├────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┤
    │ mcs2   │ 192.0.2.2      │ 3306 │ 1           │ Slave, Running  │ 0-1-25 │
    ├────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┤
    │ mcs3   │ 192.0.2.3      │ 3306 │ 1           │ Slave, Running  │ 0-1-25 │
    └────────┴────────────────┴──────┴─────────────┴─────────────────┴────────┘
    $ maxctrl show server mcs1
    ┌─────────────────────┬───────────────────────────────────────────┐
    │ Server              │ mcs1                                      │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Address             │ 192.0.2.1                                 │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Port                │ 3306                                      │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ State               │ Master, Running                           │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Version             │ 11.4.5-3-MariaDB-enterprise-log           │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Last Event          │ master_up                                 │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Triggered At        │ Thu, 05 Aug 2021 20:22:26 GMT             │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Services            │ connection_router_service                 │
    │                     │ query_router_service                      │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Monitors            │ columnstore_monitor                       │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Master ID           │ -1                                        │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Node ID             │ 1                                         │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Slave Server IDs    │                                           │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Current Connections │ 1                                         │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Total Connections   │ 1                                         │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Max Connections     │ 1                                         │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Statistics          │ {                                         │
    │                     │     "active_operations": 0,               │
    │                     │     "adaptive_avg_select_time": "0ns",    │
    │                     │     "connection_pool_empty": 0,           │
    │                     │     "connections": 1,                     │
    │                     │     "max_connections": 1,                 │
    │                     │     "max_pool_size": 0,                   │
    │                     │     "persistent_connections": 0,          │
    │                     │     "reused_connections": 0,              │
    │                     │     "routed_packets": 0,                  │
    │                     │     "total_connections": 1                │
    │                     │ }                                         │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Parameters          │ {                                         │
    │                     │     "address": "192.0.2.1",               │
    │                     │     "disk_space_threshold": null,         │
    │                     │     "extra_port": 0,                      │
    │                     │     "monitorpw": null,                    │
    │                     │     "monitoruser": null,                  │
    │                     │     "persistmaxtime": "0ms",              │
    │                     │     "persistpoolmax": 0,                  │
    │                     │     "port": 3306,                         │
    │                     │     "priority": 0,                        │
    │                     │     "proxy_protocol": false,              │
    │                     │     "rank": "primary",                    │
    │                     │     "socket": null,                       │
    │                     │     "ssl": false,                         │
    │                     │     "ssl_ca_cert": null,                  │
    │                     │     "ssl_cert": null,                     │
    │                     │     "ssl_cert_verify_depth": 9,           │
    │                     │     "ssl_cipher": null,                   │
    │                     │     "ssl_key": null,                      │
    │                     │     "ssl_verify_peer_certificate": false, │
    │                     │     "ssl_verify_peer_host": false,        │
    │                     │     "ssl_version": "MAX"                  │
    │                     │ }                                         │
    └─────────────────────┴───────────────────────────────────────────┘
    $ maxctrl list monitors
    ┌─────────────────────┬─────────┬──────────────────┐
    │ Monitor             │ State   │ Servers          │
    ├─────────────────────┼─────────┼──────────────────┤
    │ columnstore_monitor │ Running │ mcs1, mcs2, mcs3 │
    └─────────────────────┴─────────┴──────────────────┘
    $ maxctrl show monitor columnstore_monitor
    ┌─────────────────────┬─────────────────────────────────────┐
    │ Monitor             │ columnstore_monitor                 │
    ├─────────────────────┼─────────────────────────────────────┤
    │ Module              │ mariadbmon                          │
    ├─────────────────────┼─────────────────────────────────────┤
    │ State               │ Running                             │
    ├─────────────────────┼─────────────────────────────────────┤
    │ Servers             │ mcs1                                │
    │                     │ mcs2                                │
    │                     │ mcs3                                │
    ├─────────────────────┼─────────────────────────────────────┤
    │ Parameters          │ {                                   │
    │                     │     "backend_connect_attempts": 1,  │
    │                     │     "backend_connect_timeout": 3,   │
    │                     │     "backend_read_timeout": 3,      │
    │                     │     "backend_write_timeout": 3,     │
    │                     │     "disk_space_check_interval": 0, │
    │                     │     "disk_space_threshold": null,   │
    │                     │     "events": "all",                │
    │                     │     "journal_max_age": 28800,       │
    │                     │     "module": "mariadbmon",         │
    │                     │     "monitor_interval": 2000,       │
    │                     │     "password": "*****",            │
    │                     │     "script": null,                 │
    │                     │     "script_timeout": 90,           │
    │                     │     "user": "mxs"                   │
    │                     │ }                                   │
    ├─────────────────────┼─────────────────────────────────────┤
    │ Monitor Diagnostics │ {}                                  │
    └─────────────────────┴─────────────────────────────────────┘
    $ maxctrl list services
    ┌───────────────────────────┬────────────────┬─────────────┬───────────────────┬──────────────────┐
    │ Service                   │ Router         │ Connections │ Total Connections │ Servers          │
    ├───────────────────────────┼────────────────┼─────────────┼───────────────────┼──────────────────┤
    │ connection_router_Service │ readconnroute  │ 0           │ 0                 │ mcs1, mcs2, mcs3 │
    ├───────────────────────────┼────────────────┼─────────────┼───────────────────┼──────────────────┤
    │ query_router_service      │ readwritesplit │ 0           │ 0                 │ mcs1, mcs2, mcs3 │
    └───────────────────────────┴────────────────┴─────────────┴───────────────────┴──────────────────┘
    $ maxctrl show service query_router_service
    ┌─────────────────────┬─────────────────────────────────────────────────────────────┐
    │ Service             │ query_router_service                                        │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Router              │ readwritesplit                                              │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ State               │ Started                                                     │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Started At          │ Sat Aug 28 21:41:16 2021                                    │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Current Connections │ 0                                                           │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Total Connections   │ 0                                                           │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Max Connections     │ 0                                                           │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Cluster             │                                                             │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Servers             │ mcs1                                                        │
    │                     │ mcs2                                                        │
    │                     │ mcs3                                                        │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Services            │                                                             │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Filters             │                                                             │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Parameters          │ {                                                           │
    │                     │     "auth_all_servers": false,                              │
    │                     │     "causal_reads": "false",                                │
    │                     │     "causal_reads_timeout": "10000ms",                      │
    │                     │     "connection_keepalive": "300000ms",                     │
    │                     │     "connection_timeout": "0ms",                            │
    │                     │     "delayed_retry": false,                                 │
    │                     │     "delayed_retry_timeout": "10000ms",                     │
    │                     │     "disable_sescmd_history": false,                        │
    │                     │     "enable_root_user": false,                              │
    │                     │     "idle_session_pool_time": "-1000ms",                    │
    │                     │     "lazy_connect": false,                                  │
    │                     │     "localhost_match_wildcard_host": true,                  │
    │                     │     "log_auth_warnings": true,                              │
    │                     │     "master_accept_reads": false,                           │
    │                     │     "master_failure_mode": "fail_instantly",                │
    │                     │     "master_reconnection": false,                           │
    │                     │     "max_connections": 0,                                   │
    │                     │     "max_sescmd_history": 50,                               │
    │                     │     "max_slave_connections": 255,                           │
    │                     │     "max_slave_replication_lag": "0ms",                     │
    │                     │     "net_write_timeout": "0ms",                             │
    │                     │     "optimistic_trx": false,                                │
    │                     │     "password": "*****",                                    │
    │                     │     "prune_sescmd_history": true,                           │
    │                     │     "rank": "primary",                                      │
    │                     │     "retain_last_statements": -1,                           │
    │                     │     "retry_failed_reads": true,                             │
    │                     │     "reuse_prepared_statements": false,                     │
    │                     │     "router": "readwritesplit",                             │
    │                     │     "session_trace": false,                                 │
    │                     │     "session_track_trx_state": false,                       │
    │                     │     "slave_connections": 255,                               │
    │                     │     "slave_selection_criteria": "LEAST_CURRENT_OPERATIONS", │
    │                     │     "strict_multi_stmt": false,                             │
    │                     │     "strict_sp_calls": false,                               │
    │                     │     "strip_db_esc": true,                                   │
    │                     │     "transaction_replay": false,                            │
    │                     │     "transaction_replay_attempts": 5,                       │
    │                     │     "transaction_replay_max_size": 1073741824,              │
    │                     │     "transaction_replay_retry_on_deadlock": false,          │
    │                     │     "type": "service",                                      │
    │                     │     "use_sql_variables_in": "all",                          │
    │                     │     "user": "mxs",                                          │
    │                     │     "version_string": null                                  │
    │                     │ }                                                           │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Router Diagnostics  │ {                                                           │
    │                     │     "avg_sescmd_history_length": 0,                         │
    │                     │     "max_sescmd_history_length": 0,                         │
    │                     │     "queries": 0,                                           │
    │                     │     "replayed_transactions": 0,                             │
    │                     │     "ro_transactions": 0,                                   │
    │                     │     "route_all": 0,                                         │
    │                     │     "route_master": 0,                                      │
    │                     │     "route_slave": 0,                                       │
    │                     │     "rw_transactions": 0,                                   │
    │                     │     "server_query_statistics": []                           │
    │                     │ }                                                           │
    └─────────────────────┴─────────────────────────────────────────────────────────────┘
    $ sudo mariadb
    CREATE USER 'app_user'@'192.0.2.10' IDENTIFIED BY 'app_user_passwd';
    GRANT ALL ON test.* TO 'app_user'@'192.0.2.10';
    CREATE USER 'app_user'@'192.0.2.11' IDENTIFIED BY 'app_user_passwd';
    GRANT ALL ON test.* TO 'app_user'@'192.0.2.11';
    $ mariadb --host 192.0.2.10 --port 3307
          --user app_user --password
    $ maxctrl list listeners
    ┌────────────────────────────┬──────┬──────┬─────────┬───────────────────────────┐
    │ Name                       │ Port │ Host │ State   │ Service                   │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ connection_router_listener │ 3308 │ ::   │ Running │ connection_router_service │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ query_router_listener      │ 3307 │ ::   │ Running │ query_router_service      │
    └────────────────────────────┴──────┴──────┴─────────┴───────────────────────────┘
    $ mariadb --host 192.0.2.10 --port 3308 \
          --user app_user --password
    SELECT @@global.hostname, @@global.server_id;
    
    +-------------------+--------------------+
    | @@global.hostname | @@global.server_id |
    +-------------------+--------------------+
    |              mcs2 |                  2 |
    +-------------------+--------------------+
    $ maxctrl list listeners
    ┌────────────────────────────┬──────┬──────┬─────────┬───────────────────────────┐
    │ Name                       │ Port │ Host │ State   │ Service                   │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ connection_router_listener │ 3308 │ ::   │ Running │ connection_router_service │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ query_router_listener      │ 3307 │ ::   │ Running │ query_router_service      │
    └────────────────────────────┴──────┴──────┴─────────┴───────────────────────────┘
    $ mariadb --host 192.0.2.10 --port 3307 \
          --user app_user --password
    CREATE TABLE test.load_balancing_test (
       id INT PRIMARY KEY AUTO_INCREMENT,
       hostname VARCHAR(256),
       server_id INT
    );
    INSERT INTO test.load_balancing_test (hostname, server_id)
    VALUES (@@global.hostname, @@global.server_id);
    SELECT * FROM test.load_balancing_test;
    +----+----------+-----------+
    | id | hostname | server_id |
    +----+----------+-----------+
    |  1 | mcs1     |         1 |
    |  2 | mcs1     |         1 |
    |  3 | mcs1     |         1 |
    +----+----------+-----------+
    $ maxctrl list listeners
    ┌────────────────────────────┬──────┬──────┬─────────┬───────────────────────────┐
    │ Name                       │ Port │ Host │ State   │ Service                   │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ connection_router_listener │ 3308 │ ::   │ Running │ connection_router_service │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ query_router_listener      │ 3307 │ ::   │ Running │ query_router_service      │
    └────────────────────────────┴──────┴──────┴─────────┴───────────────────────────┘
    $ mariadb --host 192.0.2.10 --port 3307 \
          --user app_user --password
    SELECT @@global.hostname, @@global.server_id;
    +-------------------+--------------------+
    | @@global.hostname | @@global.server_id |
    +-------------------+--------------------+
    |              mcs2 |                  2 |
    +-------------------+--------------------+
    SELECT @@global.hostname, @@global.server_id;
    +-------------------+--------------------+
    | @@global.hostname | @@global.server_id |
    +-------------------+--------------------+
    |              mcs3 |                  3 |
    +-------------------+--------------------+
    $ sudo systemctl stop mariadb
    $ sudo systemctl stop mariadb-columnstore
    $ sudo systemctl stop mariadb-columnstore-cmapi
    [mariadb]
    bind_address                           = 0.0.0.0
    log_error                              = mariadbd.err
    character_set_server                   = utf8
    collation_server                       = utf8_general_ci
    log_bin                                = mariadb-bin
    log_bin_index                          = mariadb-bin.index
    relay_log                              = mariadb-relay
    relay_log_index                        = mariadb-relay.index
    log_slave_updates                      = ON
    gtid_strict_mode                       = ON
    
    # This must be unique on each Enterprise ColumnStore node
    server_id                              = 1
    $ sudo systemctl start mariadb
    $ sudo systemctl enable mariadb
    $ sudo systemctl stop mariadb-columnstore
    $ sudo systemctl start mariadb-columnstore-cmapi
    $ sudo systemctl enable mariadb-columnstore-cmapi
    CREATE USER 'util_user'@'127.0.0.1'
    IDENTIFIED BY 'util_user_passwd';
    GRANT SELECT, PROCESS ON *.*
    TO 'util_user'@'127.0.0.1';
    $ sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
    $ sudo mcsSetConfig CrossEngineSupport Port 3306
    $ sudo mcsSetConfig CrossEngineSupport User util_user
    $ sudo mcsSetConfig CrossEngineSupport Password util_user_passwd
    CREATE USER 'repl'@'192.0.2.%' IDENTIFIED BY 'repl_passwd';
    GRANT REPLICA MONITOR,
       REPLICATION REPLICA,
       REPLICATION REPLICA ADMIN,
       REPLICATION MASTER ADMIN
    ON *.* TO 'repl'@'192.0.2.%';
    CREATE USER 'mxs'@'192.0.2.%'
    IDENTIFIED BY 'mxs_passwd';
    GRANT SHOW DATABASES ON *.* TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.columns_priv TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.db TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.procs_priv TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.proxies_priv TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.roles_mapping TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.tables_priv TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.user TO 'mxs'@'192.0.2.%';
    GRANT BINLOG ADMIN,
       READ_ONLY ADMIN,
       RELOAD,
       REPLICA MONITOR,
       REPLICATION MASTER ADMIN,
       REPLICATION REPLICA ADMIN,
       REPLICATION REPLICA,
       SHOW DATABASES,
       SELECT
    ON *.* TO 'mxs'@'192.0.2.%';
    CHANGE MASTER TO
       MASTER_HOST='192.0.2.1',
       MASTER_USER='repl',
       MASTER_PASSWORD='repl_passwd',
       MASTER_USE_GTID=slave_pos;
    START REPLICA;
    SHOW REPLICA STATUS;
    SET GLOBAL read_only=ON;
    $ openssl rand -hex 32
    
    93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       --data '{"timeout":120, "node": "192.0.2.1"}' \
       | jq .
    {
      "timestamp": "2020-10-28 00:39:14.672142",
      "node_id": "192.0.2.1"
    }
    $ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       | jq .
    {
      "timestamp": "2020-12-15 00:40:34.353574",
      "192.0.2.1": {
        "timestamp": "2020-12-15 00:40:34.362374",
        "uptime": 11467,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [
          "1"
        ],
        "module_id": 1,
        "services": [
          {
            "name": "workernode",
            "pid": 19202
          },
          {
            "name": "controllernode",
            "pid": 19232
          },
          {
            "name": "PrimProc",
            "pid": 19254
          },
          {
            "name": "ExeMgr",
            "pid": 19292
          },
          {
            "name": "WriteEngine",
            "pid": 19316
          },
          {
            "name": "DMLProc",
            "pid": 19332
          },
          {
            "name": "DDLProc",
            "pid": 19366
          }
        ]
      }
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       --data '{"timeout":120, "node": "192.0.2.2"}' \
       | jq .
    {
      "timestamp": "2020-10-28 00:42:42.796050",
      "node_id": "192.0.2.2"
    }
    $ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       | jq .
    {
      "timestamp": "2020-12-15 00:40:34.353574",
      "192.0.2.1": {
        "timestamp": "2020-12-15 00:40:34.362374",
        "uptime": 11467,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [
          "1"
        ],
        "module_id": 1,
        "services": [
          {
            "name": "workernode",
            "pid": 19202
          },
          {
            "name": "controllernode",
            "pid": 19232
          },
          {
            "name": "PrimProc",
            "pid": 19254
          },
          {
            "name": "ExeMgr",
            "pid": 19292
          },
          {
            "name": "WriteEngine",
            "pid": 19316
          },
          {
            "name": "DMLProc",
            "pid": 19332
          },
          {
            "name": "DDLProc",
            "pid": 19366
          }
        ]
      },
      "192.0.2.2": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "192.0.2.3": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "num_nodes": 3
    }
    $ sudo yum install policycoreutils policycoreutils-python
    $ sudo yum install policycoreutils python3-policycoreutils policycoreutils-python-utils
    $ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
    $ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
    
    Nothing to do
    $ sudo semodule -i mariadb_local.pp
    $ sudo setenforce enforcing
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=enforcing
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    $ sudo getenforce
    Enforcing
    $ sudo systemctl status firewalld
    $ sudo systemctl start firewalld
    $ sudo firewall-cmd --permanent --add-rich-rule='
       rule family="ipv4"
       source address="192.0.2.0/24"
       destination address="192.0.2.0/24"
       port port="3306" protocol="tcp"
       accept'
    $ sudo firewall-cmd --permanent --add-rich-rule='
       rule family="ipv4"
       source address="192.0.2.0/24"
       destination address="192.0.2.0/24"
       port port="8600-8630" protocol="tcp"
       accept'
    $ sudo firewall-cmd --permanent --add-rich-rule='
       rule family="ipv4"
       source address="192.0.2.0/24"
       destination address="192.0.2.0/24"
       port port="8640" protocol="tcp"
       accept'
    $ sudo firewall-cmd --permanent --add-rich-rule='
       rule family="ipv4"
       source address="192.0.2.0/24"
       destination address="192.0.2.0/24"
       port port="8700" protocol="tcp"
       accept'
    $ sudo firewall-cmd --permanent --add-rich-rule='
       rule family="ipv4"
       source address="192.0.2.0/24"
       destination address="192.0.2.0/24"
       port port="8800" protocol="tcp"
       accept'
    $ sudo firewall-cmd --reload
    $ sudo ufw status verbose
    $ sudo ufw enable
    $ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 3306 proto tcp
    
    $ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8600:8630 proto tcp
    
    $ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8640 proto tcp
    
    $ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8700 proto tcp
    
    $ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8800 proto tcp
    $ sudo ufw reload
    Distributed functions
  • Extent elimination

  • Check the status of Enterprise ColumnStore

    Resilient: S3-compatible object storage is often low maintenance and highly available, since many services use resilient cloud infrastructure.

  • Scalable: S3-compatible object storage is often highly optimized for read and write scaling.

  • Secure: S3-compatible object storage is often encrypted-at-rest.

  • Enterprise ColumnStore supports extent elimination, but the generic select handler does not.

  • Enterprise ColumnStore has its own query planner, but the generic select handler cannot use it.

  • Inserts each row in the order the rows are read from the source file. Users can optimize data loads for Enterprise ColumnStore's automatic partitioning by loading presorted data files. For additional information, see "Load Ordered Data in Proper Order".

  • Supports parallel distributed bulk loads

  • Imports data from text files

  • Imports data from binary files

  • Imports data from standard input (stdin)

  • MariaDB Enterprise ColumnStore

    • Columnar storage engine

    • Query execution

    • Data storage

    MariaDB Enterprise Server

    • Enterprise-grade database server

    ColumnStore Storage Engine Plugin

    • Storage engine plugin

    • Integrates MariaDB Enterprise ColumnStore into MariaDB Enterprise Server

    Cluster Management API (CMAPI)

    • REST API

    • Used for administrative tasks

    MariaDB MaxScale

    • Database proxy

    • Accepts connections

    • Routes queries

    • Performs auto-failover

    S3-compatible object storage

    • HA for data

    • Optional.

    Shared Local Storage

    • With S3: HA for Storage Manager directory

    • Without S3: HA for DB Root directories

    MariaDB Replication

    • Schema replication (ColumnStore tables)

    • Schema and data replication (non-ColumnStore tables)

    • Database object replication

    MaxScale

    • Monitoring

    • Automatic failover

    • Load balancing

    Cluster Management API (CMAPI) Server

    • REST API

    • Administration

    • Add nodes

    • Remove nodes

    S3-compatible object storage

    • S3 snapshot

    Shared Local Storage

    • File system snapshot

    • File copy

    Enterprise Server Data Directory

    • MariaDB Enterprise Backup

    MariaDB MaxScale
    S3-compatible object storage
    Shared Local Storage
    Storage Manager directory
    Shared Local Storage
    DB Root directories
    MariaDB Enterprise Server
    ColumnStore Storage Engine
    Cluster Management API (CMAPI)
    MariaDB MaxScale
    MaxScale
    MariaDB Enterprise ColumnStore
    ColumnStore Storage Architecture
    MariaDB Enterprise Server
    Storage Manager directory
    Shared Local Storage
    Shared Local Storage requirements
    S3-compatible object storage
    Storage Manager directory
    DB Root directories
    MariaDB Enterprise ColumnStore Query Evaluation
    extent
    extent map
    custom select handler
    custom select handler
    ColumnStore Execution Plan (CSEP)
    ExeMgr process
    extent elimination
    Shared Local Storage requirements
    S3-compatible object storage
    Storage Manager directory
    DB Root directories
    Shared Local Storage
    MariaDB Replication
    MariaDB MaxScale
    MariaDB Enterprise ColumnStore Data Loading
    MariaDB Enterprise ColumnStore Backup and Restore
    Storage Manager directory
    DB Root directories
    Storage Manager directory
    DB Root directories
    columnstore-topology-s3
    es-columnstore-topology-nfs-no-title
    ECStore-QueryExecutionwith-S3-FlowChart
    ECStore-QueryExecutionExtentElimination
    ECStoreDataLoadingS3FlowChart
    EntColStoreBackupS3FlowChart

    Set this system variable to ON.

    Set this option to the file you want to use for the Relay Logs. Setting this option enables relay logging.

    Set this option to the file you want to use to index Relay Log filenames.

    Sets the numeric Server ID for this MariaDB Enterprise Server. The value set on this option must be unique to each node.

    iam_role_name, sts_region, and sts_endpoint
    .
  • To use the IAM role assigned to an EC2 instance, you must uncomment ec2_iam_mode=enabled.

  • Set this system variable to utf8

    Set this system variable to utf8_general_ci

    columnstore_use_import_for_batchinsert

    Set this system variable to ALWAYS to always use cpimport for LOAD DATA INFILE and INSERT...SELECT statements.

    Set this system variable to ON.

    Set this option to the file you want to use for the Binary Log. Setting this option enables binary logging.

    Set this option to the file you want to use to track binlog filenames.

    Credentials Management for MariaDB Enterprise ColumnStore
    CMAPI
    How to create an AppArmor Profile
    Next: Step 5: Test MariaDB Enterprise Server.

    DECIMAL

    As equiv. INT

    As equiv. INT

    FLOAT

    0xFFAAAAAA

    N/A

    DOUBLE

    0xFFFAAAAAAAAAAAAAULL

    N/A

    DATE

    0xFFFFFFFE

    N/A

    DATETIME

    0xFFFFFFFFFFFFFFFEULL

    N/A

    CHAR/VARCHAR

    Fill with '\0'

    N/A

    INT/TINYINT/SMALLINT/BIGINT

    Little-endian format for the numeric data

    FLOAT/DOUBLE

    IEEE format native to the computer

    CHAR/VARCHAR

    Data padded with '\0' for the length of the field. An entry that is all '\0' is treated as NULL

    DATE

    Using the Date struct below

    DATETIME

    Using the DateTime struct below

    DECIMAL

    Stored using an integer representation of the DECIMAL without the decimal point. With precision/width of 2 or less 2 bytes should be used, 3-4 should use 3 bytes, 4-9 should use 4 bytes and 10+ should use 8 bytes

    BIGINT

    0x8000000000000000ULL

    0xFFFFFFFFFFFFFFFEULL

    INT

    0x80000000

    0xFFFFFFFE

    SMALLINT

    0x8000

    0xFFFE

    TINYINT

    0x80

    jq
    cpimport
    here
    PMs
    cpimport-mode1
    cpimport-mode2
    cpimport-mode3

    0xFE

    This procedure describes the deployment of the ColumnStore Shared Local Storage topology with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5.

    MariaDB Enterprise ColumnStore 5 is a columnar storage engine for MariaDB Enterprise Server 10.5. Enterprise ColumnStore is suitable for Online Analytical Processing (OLAP) workloads.

    This procedure has 9 steps, which are executed in sequence.

    This procedure represents basic product capability and deploys 3 Enterprise ColumnStore nodes and 1 MaxScale node.

    This page provides an overview of the topology, requirements, and deployment procedures.

    Please read and understand this procedure before executing.

    Procedure Steps

    Step
    Description

    Prepare ColumnStore Nodes

    Configure Shared Local Storage

    Install MariaDB Enterprise Server

    Start and Configure MariaDB Enterprise Server

    Test MariaDB Enterprise Server

    Install MariaDB MaxScale

    Support

    Customers can obtain support by submitting a support case.

    Components

    The following components are deployed during this procedure:

    Component
    Function

    Modern SQL RDBMS with high availability, pluggable storage engines, hot online backups, and audit logging.

    Database proxy that extends the availability, scalability, and security of MariaDB Enterprise Servers

    MariaDB Enterprise Server Components

    Component
    Description

    • Columnar storage engine • Highly available • Optimized for Online Analytical Processing (OLAP) workloads • Scalable query execution • Cluster Management API (CMAPI) provides a REST API for multi-node administration.

    MariaDB MaxScale Components

    Component
    Description

    Listener

    Listens for client connections to MaxScale then passes them to the router service

    MariaDB Monitor

    Tracks changes in the state of MariaDB Enterprise Servers.

    Read Connection Router

    Routes connections from the listener to any available Enterprise ColumnStore node

    Read/Write Split Router

    Routes read operations from the listener to any available Enterprise ColumnStore node, and routes write operations from the listener to a specific server that MaxScale uses as the primary server

    Server Module

    Connection configuration in MaxScale to an Enterprise ColumnStore node

    Topology

    The MariaDB Enterprise ColumnStore topology with Object Storage delivers production analytics with high availability, fault tolerance, and limitless data storage by leveraging S3-compatible storage.

    The topology consists of:

    • One or more MaxScale nodes

    • An odd number of ColumnStore nodes (minimum of 3) running ES, Enterprise ColumnStore, and CMAPI

    The MaxScale nodes:

    • Monitor the health and availability of each ColumnStore node using the MariaDB Monitor (mariadbmon)

    • Accept client and application connections

    • Route queries to ColumnStore nodes using the Read/Write Split Router (readwritesplit)

    The ColumnStore nodes:

    • Receive queries from MaxScale

    • Execute queries

      • Use shared local storage for the Storage Manager directory

    Requirements

    These requirements are for the ColumnStore Object Storage topology when deployed with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5.

    • Node Count

    • Operating System

    • Minimum Hardware Requirements

    • Recommended Hardware Requirements

    • Storage Requirements

    • S3-Compatible Object Storage Requirements

    • Preferred Object Storage Providers: Cloud

    • Preferred Object Storage Providers: Hardware

    • Shared Local Storage Directories

    • Shared Local Storage Options

    • Recommended Storage Options

    Node Count

    • MaxScale nodes, 1 or more are required.

    • Enterprise ColumnStore nodes, 3 or more are required for high availability. You should always have an odd number of nodes in a multi-node ColumnStore deployment to avoid split brain scenarios.

    Operating System

    In alignment to the enterprise lifecycle, the ColumnStore Object Storage topology with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5 is provided for:

    • CentOS Linux 7 (x86_64)

    • Debian 10 (x86_64)

    • Red Hat Enterprise Linux 7 (x86_64)

    • Red Hat Enterprise Linux 8 (x86_64)

    • Ubuntu 18.04 LTS (x86_64)

    • Ubuntu 20.04 LTS (x86_64)

    Minimum Hardware Requirements

    MariaDB Enterprise ColumnStore's minimum hardware requirements are not intended for production environments, but the minimum hardware requirements can be appropriate for development and test environments. For production environments, see the recommended hardware requirements instead.

    The minimum hardware requirements are:

    Component
    CPU
    Memory

    MaxScale node

    4+ cores

    4+ GB

    Enterprise ColumnStore node

    4+ cores

    4+ GB

    MariaDB Enterprise ColumnStore will refuse to start if the system has less than 3 GB of memory.

    If Enterprise ColumnStore is started on a system with less memory, the following error message will be written to the ColumnStore system log called crit.log:

    And the following error message will be raised to the client:

    Recommended Hardware Requirements

    MariaDB Enterprise ColumnStore's recommended hardware requirements are intended for production analytics.

    The recommended hardware requirements are:

    Component
    CPU
    Memory

    MaxScale node

    8+ cores

    16+ GB

    Enterprise ColumnStore node

    64+ cores

    128+ GB

    Storage Requirements

    The ColumnStore Object Storage topology requires the following storage types:

    Storage Type
    Description

    The ColumnStore Object Storage topology uses shared local storage for the to store metadata.

    Shared Local Storage Directories

    The ColumnStore Object Storage topology uses shared local storage for the Storage Manager directory to store metadata.

    The Storage Manager directory is located at the following path by default:

    • /var/lib/columnstore/storagemanager

    Shared Local Storage Options

    The most common shared local storage options for the ColumnStore Object Storage topology are:

    Shared Local Storage
    Common Usage
    Description

    EBS (Elastic Block Store) Multi-Attach

    AWS

    • EBS is a high-performance block-storage service for AWS (Amazon Web Services). • EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported. • For deployments in AWS, EBS Multi-Attach is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.

    EFS (Elastic File System)

    AWS

    • EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services). • For deployments in AWS, EFS is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data. EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).

    Filestore

    GCP

    • Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform). • For deployments in GCP, Filestore is the recommended option for the Storage Manager directory, and Google Object Storage (S3-compatible) is the recommended option for data.

    GlusterFS

    On-premises

    Enterprise ColumnStore Management with CMAPI

    Enterprise ColumnStore's CMAPI (Cluster Management API) is a REST API that can be used to manage a multi-node Enterprise ColumnStore cluster.

    Many tools are capable of interacting with REST APIs. For example, the curl utility could be used to make REST API calls from the command-line.

    Many programming languages also have libraries for interacting with REST APIs.

    The examples below show how to use the CMAPI with curl.

    URL Endpoint Format for REST API

    For example:

    • shutdown

    • start

    • status

    Required Request Headers

    • 'x-api-key': '93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd'

    • 'Content-Type': 'application/json'

    x-api-key can be set to any value of your choice during the first call to the server. Subsequent connections will require this same key.

    Get Status

    Start Cluster

    Stop Cluster

    Add Node

    Remove Node

    Quick Reference

    MariaDB Enterprise Server Configuration Management

    Method
    Description

    Configuration File

    Configuration files (such as /etc/my.cnf) can be used to set system-variables and options. The server must be restarted to apply changes made to configuration files.

    Command-line

    The server can be started with command-line options that set system-variables and options.

    SQL

    Users can set system-variables that support dynamic changes on-the-fly using the SET statement.

    MariaDB Enterprise Server packages are configured to read configuration files from different paths, depending on the operating system. Making custom changes to Enterprise Server default configuration files is not recommended because custom changes may be overwritten by other default configuration files that are loaded later.

    To ensure that your custom changes will be read last, create a custom configuration file with the z- prefix in one of the include directories.

    Distribution
    Example Configuration File Path
    • CentOS

    • Red Hat Enterprise Linux (RHEL)

    /etc/my.cnf.d/z-custom-mariadb.cnf

    • Debian

    • Ubuntu

    /etc/mysql/mariadb.conf.d/z-custom-mariadb.cnf

    MariaDB Enterprise Server Service Management

    The systemctl command is used to start and stop the MariaDB Enterprise Server service.

    Operation
    Command

    Start

    sudo systemctl start mariadb

    Stop

    sudo systemctl stop mariadb

    Restart

    sudo systemctl restart mariadb

    Enable during startup

    sudo systemctl enable mariadb

    Disable during startup

    sudo systemctl disable mariadb

    Status

    sudo systemctl status mariadb

    For additional information, see "".

    MariaDB Enterprise Server Logs

    MariaDB Enterprise Server produces log data that can be helpful in problem diagnosis.

    Log filenames and locations may be overridden in the server configuration. The default location of logs is the data directory. The data directory is specified by the datadir system variable.

    Log
    System Variable/Option
    Default Filename

    <hostname>.err

    server_audit.log

    <hostname>-slow.log

    Enterprise ColumnStore Service Management

    The systemctl command is used to start and stop the ColumnStore service.

    Operation
    Command

    Start

    sudo systemctl start mariadb-columnstore

    Stop

    sudo systemctl stop mariadb-columnstore

    Restart

    sudo systemctl restart mariadb-columnstore

    Enable during startup

    sudo systemctl enable mariadb-columnstore

    Disable during startup

    sudo systemctl disable mariadb-columnstore

    Status

    sudo systemctl status mariadb-columnstore

    In the ColumnStore Object Storage topology, the mariadb-columnstore service should not be enabled. The CMAPI service restarts Enterprise ColumnStore as needed, so it does not need to start automatically upon reboot.

    Enterprise ColumnStore CMAPI Service Management

    The systemctl command is used to start and stop the CMAPI service.

    Operation
    Command

    Start

    sudo systemctl start mariadb-columnstore-cmapi

    Stop

    sudo systemctl stop mariadb-columnstore-cmapi

    Restart

    sudo systemctl restart mariadb-columnstore-cmapi

    Enable during startup

    sudo systemctl enable mariadb-columnstore-cmapi

    Disable during startup

    sudo systemctl disable mariadb-columnstore-cmapi

    Status

    sudo systemctl status mariadb-columnstore-cmapi

    For additional information on endpoints, see "CMAPI".

    MaxScale Configuration Management

    MaxScale can be configured using several methods. These methods make use of MaxScale's REST API.

    Method
    Benefits

    Command-line utility to perform administrative tasks through the REST API. See MaxCtrl Commands.

    MaxGUI is a graphical utility that can perform administrative tasks through the REST API.

    The REST API can be used directly. For example, the curl utility could be used to make REST API calls from the command-line. Many programming languages also have libraries to interact with REST APIs.

    The procedure on these pages configures MaxScale using MaxCtrl.

    MaxScale Service Management

    The systemctl command is used to start and stop the MaxScale service.>

    Operation
    Command

    Start

    sudo systemctl start maxscale

    Stop

    sudo systemctl stop maxscale

    Restart

    sudo systemctl restart maxscale

    Enable during startup

    sudo systemctl enable maxscale

    Disable during startup

    sudo systemctl disable maxscale

    Status

    sudo systemctl status maxscale

    For additional information, see "Start and Stop Services".

    Next Step

    Navigation in the procedure Shared Local Storage topology

    Next: Step 1: Prepare ColumnStore Nodes.

    • Enterprise Server 10.5

    • Enterprise Server 10.6

    • Enterprise Server 11.4

    Columnar storage engine with S3-compatible object storage

    • Highly available

    • Automatic failover via MaxScale and CMAPI

    • Scales read via MaxScale

    • Bulk data import

    Set this system variable to ON.

    Set this option to the file you want to use for the Relay Logs. Setting this option enables relay logging.

    Set this option to the file you want to use to index Relay Log filenames.

    Sets the numeric Server ID for this MariaDB Enterprise Server. The value set on this option must be unique to each node.

    maxctrl show maxscale
    maxctrl list servers
    maxctrl show server
    maxctrl list monitors
    maxctrl show monitor
    maxctrl list services
    maxctrl show service
    Read Connection Router (readconnroute)
    Read/Write Split Router (readwritesplit)
    maxctrl list listeners
    maxctrl list listeners
    Read/Write Split Router (readwritesplit)
    maxctrl list listeners
    Read/Write Split Router (readwritesplit)
    Next: Step 9: Import Data
    $ sudo systemctl stop mariadb
    $ sudo systemctl stop mariadb-columnstore
    $ sudo systemctl stop mariadb-columnstore-cmapi
    [mariadb]
    bind_address                           = 0.0.0.0
    log_error                              = mariadbd.err
    character_set_server                   = utf8
    collation_server                       = utf8_general_ci
    log_bin                                = mariadb-bin
    log_bin_index                          = mariadb-bin.index
    relay_log                              = mariadb-relay
    relay_log_index                        = mariadb-relay.index
    log_slave_updates                      = ON
    gtid_strict_mode                       = ON
    
    # This must be unique on each Enterprise ColumnStore node
    server_id                              = 1
    [ObjectStorage]
    …
    service = S3
    …
    [S3]
    bucket                = your_columnstore_bucket_name
    endpoint              = your_s3_endpoint
    aws_access_key_id     = your_s3_access_key_id
    aws_secret_access_key = your_s3_secret_key
    # iam_role_name       = your_iam_role
    # sts_region          = your_sts_region
    # sts_endpoint        = your_sts_endpoint
    # ec2_iam_mode        = enabled
    
    [Cache]
    cache_size = your_local_cache_size
    path       = your_local_cache_path
    $ sudo systemctl start mariadb
    $ sudo systemctl enable mariadb
    $ sudo systemctl stop mariadb-columnstore
    $ sudo systemctl start mariadb-columnstore-cmapi
    $ sudo systemctl enable mariadb-columnstore-cmapi
    CREATE USER 'util_user'@'127.0.0.1'
    IDENTIFIED BY 'util_user_passwd';
    GRANT SELECT, PROCESS ON *.*
    TO 'util_user'@'127.0.0.1';
    $ sudo mcsSetConfig CrossEngineSupport Host 127.0.0.1
    $ sudo mcsSetConfig CrossEngineSupport Port 3306
    $ sudo mcsSetConfig CrossEngineSupport User util_user
    $ sudo mcsSetConfig CrossEngineSupport Password util_user_passwd
    CREATE USER 'repl'@'192.0.2.%' IDENTIFIED BY 'repl_passwd';
    GRANT REPLICA MONITOR,
       REPLICATION REPLICA,
       REPLICATION REPLICA ADMIN,
       REPLICATION MASTER ADMIN
    ON *.* TO 'repl'@'192.0.2.%';
    CREATE USER 'mxs'@'192.0.2.%'
    IDENTIFIED BY 'mxs_passwd';
    GRANT SHOW DATABASES ON *.* TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.columns_priv TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.db TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.procs_priv TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.proxies_priv TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.roles_mapping TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.tables_priv TO 'mxs'@'192.0.2.%';
    
    GRANT SELECT ON mysql.user TO 'mxs'@'192.0.2.%';
    GRANT BINLOG ADMIN,
       READ_ONLY ADMIN,
       RELOAD,
       REPLICA MONITOR,
       REPLICATION MASTER ADMIN,
       REPLICATION REPLICA ADMIN,
       REPLICATION REPLICA,
       SHOW DATABASES,
       SELECT
    ON *.* TO 'mxs'@'192.0.2.%';
    CHANGE MASTER TO
       MASTER_HOST='192.0.2.1',
       MASTER_USER='repl',
       MASTER_PASSWORD='repl_passwd',
       MASTER_USE_GTID=slave_pos;
    START REPLICA;
    SHOW REPLICA STATUS;
    SET GLOBAL read_only=ON;
    $ openssl rand -hex 32
    
    93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       --data '{"timeout":120, "node": "192.0.2.1"}' \
       | jq .
    {
      "timestamp": "2020-10-28 00:39:14.672142",
      "node_id": "192.0.2.1"
    }
    $ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       | jq .
    {
      "timestamp": "2020-12-15 00:40:34.353574",
      "192.0.2.1": {
        "timestamp": "2020-12-15 00:40:34.362374",
        "uptime": 11467,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [
          "1"
        ],
        "module_id": 1,
        "services": [
          {
            "name": "workernode",
            "pid": 19202
          },
          {
            "name": "controllernode",
            "pid": 19232
          },
          {
            "name": "PrimProc",
            "pid": 19254
          },
          {
            "name": "ExeMgr",
            "pid": 19292
          },
          {
            "name": "WriteEngine",
            "pid": 19316
          },
          {
            "name": "DMLProc",
            "pid": 19332
          },
          {
            "name": "DDLProc",
            "pid": 19366
          }
        ]
      }
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       --data '{"timeout":120, "node": "192.0.2.2"}' \
       | jq .
    {
      "timestamp": "2020-10-28 00:42:42.796050",
      "node_id": "192.0.2.2"
    }
    $ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
       --header 'Content-Type:application/json' \
       --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
       | jq .
    {
      "timestamp": "2020-12-15 00:40:34.353574",
      "192.0.2.1": {
        "timestamp": "2020-12-15 00:40:34.362374",
        "uptime": 11467,
        "dbrm_mode": "master",
        "cluster_mode": "readwrite",
        "dbroots": [
          "1"
        ],
        "module_id": 1,
        "services": [
          {
            "name": "workernode",
            "pid": 19202
          },
          {
            "name": "controllernode",
            "pid": 19232
          },
          {
            "name": "PrimProc",
            "pid": 19254
          },
          {
            "name": "ExeMgr",
            "pid": 19292
          },
          {
            "name": "WriteEngine",
            "pid": 19316
          },
          {
            "name": "DMLProc",
            "pid": 19332
          },
          {
            "name": "DDLProc",
            "pid": 19366
          }
        ]
      },
      "192.0.2.2": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "192.0.2.3": {
        "timestamp": "2020-12-15 00:40:34.428554",
        "uptime": 11437,
        "dbrm_mode": "slave",
        "cluster_mode": "readonly",
        "dbroots": [
          "2"
        ],
        "module_id": 2,
        "services": [
          {
            "name": "workernode",
            "pid": 17789
          },
          {
            "name": "PrimProc",
            "pid": 17813
          },
          {
            "name": "ExeMgr",
            "pid": 17854
          },
          {
            "name": "WriteEngine",
            "pid": 17877
          }
        ]
      },
      "num_nodes": 3
    }
    $ sudo yum install policycoreutils policycoreutils-python
    $ sudo yum install policycoreutils python3-policycoreutils policycoreutils-python-utils
    $ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
    $ sudo grep mysqld /var/log/audit/audit.log | audit2allow -M mariadb_local
    
    Nothing to do
    $ sudo semodule -i mariadb_local.pp
    $ sudo setenforce enforcing
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=enforcing
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    $ sudo getenforce
    Enforcing
    $ sudo systemctl status firewalld
    $ sudo systemctl start firewalld
    $ sudo firewall-cmd --permanent --add-rich-rule='
       rule family="ipv4"
       source address="192.0.2.0/24"
       destination address="192.0.2.0/24"
       port port="3306" protocol="tcp"
       accept'
    $ sudo firewall-cmd --permanent --add-rich-rule='
       rule family="ipv4"
       source address="192.0.2.0/24"
       destination address="192.0.2.0/24"
       port port="8600-8630" protocol="tcp"
       accept'
    $ sudo firewall-cmd --permanent --add-rich-rule='
       rule family="ipv4"
       source address="192.0.2.0/24"
       destination address="192.0.2.0/24"
       port port="8640" protocol="tcp"
       accept'
    $ sudo firewall-cmd --permanent --add-rich-rule='
       rule family="ipv4"
       source address="192.0.2.0/24"
       destination address="192.0.2.0/24"
       port port="8700" protocol="tcp"
       accept'
    $ sudo firewall-cmd --permanent --add-rich-rule='
       rule family="ipv4"
       source address="192.0.2.0/24"
       destination address="192.0.2.0/24"
       port port="8800" protocol="tcp"
       accept'
    $ sudo firewall-cmd --reload
    $ sudo ufw status verbose
    $ sudo ufw enable
    $ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 3306 proto tcp
    
    $ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8600:8630 proto tcp
    
    $ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8640 proto tcp
    
    $ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8700 proto tcp
    
    $ sudo ufw allow from 192.0.2.0/24 to 192.0.2.3 port 8800 proto tcp
    $ sudo ufw reload
    cpimport dbName tblName [loadFile]
    cpimport dbName tblName [loadFile]
    [-h] [-m mode] [-f filepath] [-d DebugLevel]
    [-c readBufferSize] [-b numBuffers] [-r numReaders]
    [-e maxErrors] [-B libBufferSize] [-s colDelimiter] [-E EnclosedByChar]
    [-C escChar] [-j jobID] [-p jobFilePath] [-w numParsers]
    [-n nullOption] [-P pmList] [-i] [-S] [-q batchQty]
    
    positional parameters:
    	dbName     Name of the database to load
    	tblName    Name of table to load
    	loadFile   Optional input file name in current directory,
    			unless a fully qualified name is given.
    			If not given, input read from STDIN.
    Options:
    	-b	Number of read buffers
    	-c	Application read buffer size(in bytes)
    	-d	Print different level(1-3) debug message
    	-e	Max number of allowable error per table per PM
    	-f	Data file directory path.
    			Default is current working directory.
    			In Mode 1, -f represents the local input file path.
    			In Mode 2, -f represents the PM based input file path.
    			In Mode 3, -f represents the local input file path.
    	-l	Name of import file to be loaded, relative to -f path. (Cannot be used with -p)
    	-h	Print this message.
    	-q	Batch Quantity, Number of rows distributed per batch in Mode 1
    	-i	Print extended info to console in Mode 3.
    	-j	Job ID. In simple usage, default is the table OID.
    			unless a fully qualified input file name is given.
    	-n	NullOption (0-treat the string NULL as data (default);
    			1-treat the string NULL as a NULL value)
    	-p	Path for XML job description file.
    	-r	Number of readers.
    	-s	The delimiter between column values.
    	-B	I/O library read buffer size (in bytes)
    	-w	Number of parsers.
    	-E	Enclosed by character if field values are enclosed.
    	-C	Escape character used in conjunction with 'enclosed by'
    			character, or as part of NULL escape sequence ('\N');
    			default is '\'
    	-I	Import binary data; how to treat NULL values:
    			1 - import NULL values
    			2 - saturate NULL values
    	-P	List of PMs ex: -P 1,2,3. Default is all PMs.
    	-S	Treat string truncations as errors.
    	-m	mode
    			1 - rows will be loaded in a distributed manner across PMs.
    			2 - PM based input files loaded onto their respective PM.
    			3 - input files will be loaded on the local PM.
    cpimport -m1 mytest mytable mytable.tbl
    cpimport -m2 mytest mytable -l /home/mydata/mytable.tbl
    cpimport -m3 mytest mytable /home/mydata/mytable.tbl
    cpimport db1 table1
    aws s3 cp --quiet s3://dthompson-test/trades_bulk.csv - | cpimport test trades -s ","
    mariadb -q -e 'select * from source_table;' -N <source-db> | cpimport -s '\t' <target-db> <target-table>
    CREATE DATABASE `json_columnstore`;
    
    USE `json_columnstore`;
    
    CREATE TABLE `products` (
      `product_name` VARCHAR(11) NOT NULL DEFAULT '',
      `supplier` VARCHAR(128) NOT NULL DEFAULT '',
      `quantity` VARCHAR(128) NOT NULL DEFAULT '',
      `unit_cost` VARCHAR(128) NOT NULL DEFAULT ''
    ) ENGINE=Columnstore DEFAULT CHARSET=utf8;
    [{
      "_id": {
        "$oid": "5968dd23fc13ae04d9000001"
      },
      "product_name": "Sildenafil Citrate",
      "supplier": "Wisozk Inc",
      "quantity": 261,
      "unit_cost": "$10.47"
    }, {
      "_id": {
        "$oid": "5968dd23fc13ae04d9000002"
      },
      "product_name": "Mountain Juniperus Ashei",
      "supplier": "Keebler-Hilpert",
      "quantity": 292,
      "unit_cost": "$8.74"
    }, {
      "_id": {
        "$oid": "5968dd23fc13ae04d9000003"
      },
      "product_name": "Dextromethorphan HBR",
      "supplier": "Schmitt-Weissnat",
      "quantity": 211,
      "unit_cost": "$20.53"
    }]
    cat products.json | jq -r '.[] | [.product_name,.supplier,.quantity,.unit_cost] | @csv' | cpimport json_columnstore products -s ',' -E '"'
    colxml mytest -j299
    cpimport -m1 -j299
    Usage: colxml [options] dbName
    
    Options: 
       -d Delimiter (default '|')
       -e Maximum allowable errors (per table)
       -h Print this message
       -j Job id (numeric)
       -l Load file name
       -n "name in quotes"
       -p Path for XML job description file that is generated
       -s "Description in quotes"
       -t Table name
       -u User
       -r Number of read buffers
       -c Application read buffer size (in bytes)
       -w I/O library buffer size (in bytes), used to read files
       -x Extension of file name (default ".tbl")
       -E EnclosedByChar (if data has enclosed values)
       -C EscapeChar
       -b Debug level (1-3)
    MariaDB[tpch2]> show tables;
    +---------------+
    | Tables_in_tpch2 |
    +--------------+
    | customer    |
    | lineitem    |
    | nation      |
    | orders      |
    | part        |
    | partsupp    |
    | region      |
    | supplier    |
    +--------------+
    8 rows in set (0.00 sec)
    /usr/local/mariadb/columnstore/bin/colxml tpch2 -j500
    Running colxml with the following parameters:
    2015-10-07 15:14:20 (9481) INFO :
    Schema: tpch2
    Tables:
    Load Files:
    -b 0
    -c 1048576
    -d |
    -e 10
    -j 500
    -n
    -p /usr/local/mariadb/columnstore/data/bulk/job/
    -r 5
    -s
    -u
    -w 10485760
    -x tbl
    File completed for tables:
    tpch2.customer
    tpch2.lineitem
    tpch2.nation
    tpch2.orders
    tpch2.part
    tpch2.partsupp
    tpch2.region
    tpch2.supplier
    Normal exit.
    /usr/local/mariadb/columnstore/bin/cpimport -j 500
    Bulkload root directory : /usr/local/mariadb/columnstore/data/bulk
    job description file : Job_500.xml
    2015-10-07 15:14:59 (9952) INFO : successfully load job file /usr/local/mariadb/columnstore/data/bulk/job/Job_500.xml
    2015-10-07 15:14:59 (9952) INFO : PreProcessing check starts
    2015-10-07 15:15:04 (9952) INFO : PreProcessing check completed
    2015-10-07 15:15:04 (9952) INFO : preProcess completed, total run time : 5 seconds
    2015-10-07 15:15:04 (9952) INFO : No of Read Threads Spawned = 1
    2015-10-07 15:15:04 (9952) INFO : No of Parse Threads Spawned = 3
    2015-10-07 15:15:06 (9952) INFO : For table tpch2.customer: 150000 rows processed and 150000 rows inserted.
    2015-10-07 15:16:12 (9952) INFO : For table tpch2.nation: 25 rows processed and 25 rows inserted.
    2015-10-07 15:16:12 (9952) INFO : For table tpch2.lineitem: 6001215 rows processed and 6001215 rows inserted.
    2015-10-07 15:16:31 (9952) INFO : For table tpch2.orders: 1500000 rows processed and 1500000 rows inserted.
    2015-10-07 15:16:33 (9952) INFO : For table tpch2.part: 200000 rows processed and 200000 rows inserted.
    2015-10-07 15:16:44 (9952) INFO : For table tpch2.partsupp: 800000 rows processed and 800000 rows inserted.
    2015-10-07 15:16:44 (9952) INFO : For table tpch2.region: 5 rows processed and 5 rows inserted.
    2015-10-07 15:16:45 (9952) INFO : For table tpch2.supplier: 10000 rows processed and 10000 rows inserted.
    CREATE TABLE emp (
    emp_id INT, 
     dept_id INT,
    name VARCHAR(30), 
    salary INT, 
    hire_date DATE) ENGINE=columnstore;
    <Table tblName="test.emp" 
          loadName="emp.tbl" maxErrRow="10">
       <Column colName="emp_id"/>
       <Column colName="dept_id"/>
       <Column colName="name"/>
       <Column colName="salary"/>
       <Column colName="hire_date"/>
     </Table>
    <Table tblName="test.emp" 
          loadName="emp.tbl" maxErrRow="10">
       <Column colName="emp_id"/>
       <Column colName="dept_id"/>
       <Column colName="name"/>
       <Column colName="hire_date"/>
       <Column colName="salary"/>
     </Table>
    <Table tblName="test.emp"        
               loadName="emp.tbl" maxErrRow="10">
          <Column colName="emp_id"/>
          <Column colName="dept_id"/>
          <Column colName="name"/>
          <Column colName="hire_date"/>
          <IgnoreField/>
          <DefaultColumn colName="salary"/>
        </Table>
    Example
    cpimport -I1 mytest mytable /home/mydata/mytable.bin
    struct Date
    {
      unsigned spare : 6;
      unsigned day : 6;
      unsigned month : 4;
      unsigned year : 16
    };
    struct DateTime
    {
      unsigned msecond : 20;
      unsigned second : 6;
      unsigned minute : 6;
      unsigned hour : 6;
      unsigned day : 6;
      unsigned month : 4;
      unsigned year : 16
    };
    -rw-r--r--. 1 root  root        0 Dec 29 06:41 cpimport_1229064143_21779.err
    -rw-r--r--. 1 root  root     1146 Dec 29 06:42 cpimport_1229064143_21779.log
    2020-12-29 06:41:44 (21779) INFO : Running distributed import (mode 1) on all PMs...
    2020-12-29 06:41:44 (21779) INFO2 : /usr/bin/cpimport.bin -s , -E " -R /tmp/columnstore_tmp_files/BrmRpt112906414421779.rpt -m 1 -P pm1-21779 -T SYSTEM -u388952c1-4ab8-46d6-9857-c44827b1c3b9 bts flights
    2020-12-29 06:41:58 (21779) INFO2 : Received a BRM-Report from 1
    2020-12-29 06:41:58 (21779) INFO2 : Received a Cpimport Pass from PM1
    2020-12-29 06:42:03 (21779) INFO2 : Received a BRM-Report from 2
    2020-12-29 06:42:03 (21779) INFO2 : Received a Cpimport Pass from PM2
    2020-12-29 06:42:03 (21779) INFO2 : Received a BRM-Report from 3
    2020-12-29 06:42:03 (21779) INFO2 : BRM updated successfully
    2020-12-29 06:42:03 (21779) INFO2 : Received a Cpimport Pass from PM3
    2020-12-29 06:42:04 (21779) INFO2 : Released Table Lock
    2020-12-29 06:42:04 (21779) INFO2 : Cleanup succeed on all PMs
    2020-12-29 06:42:04 (21779) INFO : For table bts.flights: 374573 rows processed and 374573 rows inserted.
    2020-12-29 06:42:04 (21779) INFO : Bulk load completed, total run time : 20.3052 seconds
    2020-12-29 06:42:04 (21779) INFO2 : Shutdown of all child threads Finished!!
    Apr 30 21:54:35 a1ebc96a2519 PrimProc[1004]: 35.668435 |0|0|0| C 28 CAL0000: Error total memory available is less than 3GB.
    ERROR 1815 (HY000): Internal error: System is not ready yet. Please try again.
    https://{server}:{port}/cmapi/{version}/{route}/{command}
    $ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          | jq .
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/start \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          --data '{"timeout":20}' \
          | jq .
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/shutdown \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          --data '{"timeout":20}' \
          | jq .
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          --data '{"timeout":20, "node": "192.0.2.2"}' \
          | jq .
    $ curl -k -s -X DELETE https://mcs1:8640/cmapi/0.4.0/cluster/node \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          --data '{"timeout":20, "node": "192.0.2.2"}' \
          | jq .
    $ maxctrl show maxscale
    ┌──────────────┬───────────────────────────────────────────────────────┐
    │ Version      │ 22.08.15                                              │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Commit       │ 3761fa7a52046bc58faad8b5a139116f9e33364c              │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Started At   │ Thu, 05 Aug 2021 20:21:20 GMT                         │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Activated At │ Thu, 05 Aug 2021 20:21:20 GMT                         │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Uptime       │ 868                                                   │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Config Sync  │ null                                                  │
    ├──────────────┼───────────────────────────────────────────────────────┤
    │ Parameters   │ {                                                     │
    │              │     "admin_auth": true,                               │
    │              │     "admin_enabled": true,                            │
    │              │     "admin_gui": true,                                │
    │              │     "admin_host": "0.0.0.0",                          │
    │              │     "admin_log_auth_failures": true,                  │
    │              │     "admin_pam_readonly_service": null,               │
    │              │     "admin_pam_readwrite_service": null,              │
    │              │     "admin_port": 8989,                               │
    │              │     "admin_secure_gui": false,                        │
    │              │     "admin_ssl_ca_cert": null,                        │
    │              │     "admin_ssl_cert": null,                           │
    │              │     "admin_ssl_key": null,                            │
    │              │     "admin_ssl_version": "MAX",                       │
    │              │     "auth_connect_timeout": "10000ms",                │
    │              │     "auth_read_timeout": "10000ms",                   │
    │              │     "auth_write_timeout": "10000ms",                  │
    │              │     "cachedir": "/var/cache/maxscale",                │
    │              │     "config_sync_cluster": null,                      │
    │              │     "config_sync_interval": "5000ms",                 │
    │              │     "config_sync_password": "*****",                  │
    │              │     "config_sync_timeout": "10000ms",                 │
    │              │     "config_sync_user": null,                         │
    │              │     "connector_plugindir": "/usr/lib64/mysql/plugin", │
    │              │     "datadir": "/var/lib/maxscale",                   │
    │              │     "debug": null,                                    │
    │              │     "dump_last_statements": "never",                  │
    │              │     "execdir": "/usr/bin",                            │
    │              │     "language": "/var/lib/maxscale",                  │
    │              │     "libdir": "/usr/lib64/maxscale",                  │
    │              │     "load_persisted_configs": true,                   │
    │              │     "local_address": null,                            │
    │              │     "log_debug": false,                               │
    │              │     "log_info": false,                                │
    │              │     "log_notice": true,                               │
    │              │     "log_throttling": {                               │
    │              │         "count": 10,                                  │
    │              │         "suppress": 10000,                            │
    │              │         "window": 1000                                │
    │              │     },                                                │
    │              │     "log_warn_super_user": false,                     │
    │              │     "log_warning": true,                              │
    │              │     "logdir": "/var/log/maxscale",                    │
    │              │     "max_auth_errors_until_block": 10,                │
    │              │     "maxlog": true,                                   │
    │              │     "module_configdir": "/etc/maxscale.modules.d",    │
    │              │     "ms_timestamp": false,                            │
    │              │     "passive": false,                                 │
    │              │     "persistdir": "/var/lib/maxscale/maxscale.cnf.d", │
    │              │     "piddir": "/var/run/maxscale",                    │
    │              │     "query_classifier": "qc_sqlite",                  │
    │              │     "query_classifier_args": null,                    │
    │              │     "query_classifier_cache_size": 289073971,         │
    │              │     "query_retries": 1,                               │
    │              │     "query_retry_timeout": "5000ms",                  │
    │              │     "rebalance_period": "0ms",                        │
    │              │     "rebalance_threshold": 20,                        │
    │              │     "rebalance_window": 10,                           │
    │              │     "retain_last_statements": 0,                      │
    │              │     "session_trace": 0,                               │
    │              │     "skip_permission_checks": false,                  │
    │              │     "sql_mode": "default",                            │
    │              │     "syslog": true,                                   │
    │              │     "threads": 1,                                     │
    │              │     "users_refresh_interval": "0ms",                  │
    │              │     "users_refresh_time": "30000ms",                  │
    │              │     "writeq_high_water": 16777216,                    │
    │              │     "writeq_low_water": 8192                          │
    │              │ }                                                     │
    └──────────────┴───────────────────────────────────────────────────────┘
    $ maxctrl list servers
    ┌────────┬────────────────┬──────┬─────────────┬─────────────────┬────────┐
    │ Server │ Address        │ Port │ Connections │ State           │ GTID   │
    ├────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┤
    │ mcs1   │ 192.0.2.1      │ 3306 │ 1           │ Master, Running │ 0-1-25 │
    ├────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┤
    │ mcs2   │ 192.0.2.2      │ 3306 │ 1           │ Slave, Running  │ 0-1-25 │
    ├────────┼────────────────┼──────┼─────────────┼─────────────────┼────────┤
    │ mcs3   │ 192.0.2.3      │ 3306 │ 1           │ Slave, Running  │ 0-1-25 │
    └────────┴────────────────┴──────┴─────────────┴─────────────────┴────────┘
    $ maxctrl show server mcs1
    ┌─────────────────────┬───────────────────────────────────────────┐
    │ Server              │ mcs1                                      │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Address             │ 192.0.2.1                                 │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Port                │ 3306                                      │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ State               │ Master, Running                           │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Version             │ 11.4.5-3-MariaDB-enterprise-log           │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Last Event          │ master_up                                 │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Triggered At        │ Thu, 05 Aug 2021 20:22:26 GMT             │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Services            │ connection_router_service                 │
    │                     │ query_router_service                      │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Monitors            │ columnstore_monitor                       │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Master ID           │ -1                                        │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Node ID             │ 1                                         │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Slave Server IDs    │                                           │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Current Connections │ 1                                         │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Total Connections   │ 1                                         │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Max Connections     │ 1                                         │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Statistics          │ {                                         │
    │                     │     "active_operations": 0,               │
    │                     │     "adaptive_avg_select_time": "0ns",    │
    │                     │     "connection_pool_empty": 0,           │
    │                     │     "connections": 1,                     │
    │                     │     "max_connections": 1,                 │
    │                     │     "max_pool_size": 0,                   │
    │                     │     "persistent_connections": 0,          │
    │                     │     "reused_connections": 0,              │
    │                     │     "routed_packets": 0,                  │
    │                     │     "total_connections": 1                │
    │                     │ }                                         │
    ├─────────────────────┼───────────────────────────────────────────┤
    │ Parameters          │ {                                         │
    │                     │     "address": "192.0.2.1",               │
    │                     │     "disk_space_threshold": null,         │
    │                     │     "extra_port": 0,                      │
    │                     │     "monitorpw": null,                    │
    │                     │     "monitoruser": null,                  │
    │                     │     "persistmaxtime": "0ms",              │
    │                     │     "persistpoolmax": 0,                  │
    │                     │     "port": 3306,                         │
    │                     │     "priority": 0,                        │
    │                     │     "proxy_protocol": false,              │
    │                     │     "rank": "primary",                    │
    │                     │     "socket": null,                       │
    │                     │     "ssl": false,                         │
    │                     │     "ssl_ca_cert": null,                  │
    │                     │     "ssl_cert": null,                     │
    │                     │     "ssl_cert_verify_depth": 9,           │
    │                     │     "ssl_cipher": null,                   │
    │                     │     "ssl_key": null,                      │
    │                     │     "ssl_verify_peer_certificate": false, │
    │                     │     "ssl_verify_peer_host": false,        │
    │                     │     "ssl_version": "MAX"                  │
    │                     │ }                                         │
    └─────────────────────┴───────────────────────────────────────────┘
    $ maxctrl list monitors
    ┌─────────────────────┬─────────┬──────────────────┐
    │ Monitor             │ State   │ Servers          │
    ├─────────────────────┼─────────┼──────────────────┤
    │ columnstore_monitor │ Running │ mcs1, mcs2, mcs3 │
    └─────────────────────┴─────────┴──────────────────┘
    $ maxctrl show monitor columnstore_monitor
    ┌─────────────────────┬─────────────────────────────────────┐
    │ Monitor             │ columnstore_monitor                 │
    ├─────────────────────┼─────────────────────────────────────┤
    │ Module              │ mariadbmon                          │
    ├─────────────────────┼─────────────────────────────────────┤
    │ State               │ Running                             │
    ├─────────────────────┼─────────────────────────────────────┤
    │ Servers             │ mcs1                                │
    │                     │ mcs2                                │
    │                     │ mcs3                                │
    ├─────────────────────┼─────────────────────────────────────┤
    │ Parameters          │ {                                   │
    │                     │     "backend_connect_attempts": 1,  │
    │                     │     "backend_connect_timeout": 3,   │
    │                     │     "backend_read_timeout": 3,      │
    │                     │     "backend_write_timeout": 3,     │
    │                     │     "disk_space_check_interval": 0, │
    │                     │     "disk_space_threshold": null,   │
    │                     │     "events": "all",                │
    │                     │     "journal_max_age": 28800,       │
    │                     │     "module": "mariadbmon",         │
    │                     │     "monitor_interval": 2000,       │
    │                     │     "password": "*****",            │
    │                     │     "script": null,                 │
    │                     │     "script_timeout": 90,           │
    │                     │     "user": "mxs"                   │
    │                     │ }                                   │
    ├─────────────────────┼─────────────────────────────────────┤
    │ Monitor Diagnostics │ {}                                  │
    └─────────────────────┴─────────────────────────────────────┘
    $ maxctrl list services
    ┌───────────────────────────┬────────────────┬─────────────┬───────────────────┬──────────────────┐
    │ Service                   │ Router         │ Connections │ Total Connections │ Servers          │
    ├───────────────────────────┼────────────────┼─────────────┼───────────────────┼──────────────────┤
    │ connection_router_Service │ readconnroute  │ 0           │ 0                 │ mcs1, mcs2, mcs3 │
    ├───────────────────────────┼────────────────┼─────────────┼───────────────────┼──────────────────┤
    │ query_router_service      │ readwritesplit │ 0           │ 0                 │ mcs1, mcs2, mcs3 │
    └───────────────────────────┴────────────────┴─────────────┴───────────────────┴──────────────────┘
    $ maxctrl show service query_router_service
    ┌─────────────────────┬─────────────────────────────────────────────────────────────┐
    │ Service             │ query_router_service                                        │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Router              │ readwritesplit                                              │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ State               │ Started                                                     │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Started At          │ Sat Aug 28 21:41:16 2021                                    │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Current Connections │ 0                                                           │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Total Connections   │ 0                                                           │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Max Connections     │ 0                                                           │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Cluster             │                                                             │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Servers             │ mcs1                                                        │
    │                     │ mcs2                                                        │
    │                     │ mcs3                                                        │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Services            │                                                             │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Filters             │                                                             │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Parameters          │ {                                                           │
    │                     │     "auth_all_servers": false,                              │
    │                     │     "causal_reads": "false",                                │
    │                     │     "causal_reads_timeout": "10000ms",                      │
    │                     │     "connection_keepalive": "300000ms",                     │
    │                     │     "connection_timeout": "0ms",                            │
    │                     │     "delayed_retry": false,                                 │
    │                     │     "delayed_retry_timeout": "10000ms",                     │
    │                     │     "disable_sescmd_history": false,                        │
    │                     │     "enable_root_user": false,                              │
    │                     │     "idle_session_pool_time": "-1000ms",                    │
    │                     │     "lazy_connect": false,                                  │
    │                     │     "localhost_match_wildcard_host": true,                  │
    │                     │     "log_auth_warnings": true,                              │
    │                     │     "master_accept_reads": false,                           │
    │                     │     "master_failure_mode": "fail_instantly",                │
    │                     │     "master_reconnection": false,                           │
    │                     │     "max_connections": 0,                                   │
    │                     │     "max_sescmd_history": 50,                               │
    │                     │     "max_slave_connections": 255,                           │
    │                     │     "max_slave_replication_lag": "0ms",                     │
    │                     │     "net_write_timeout": "0ms",                             │
    │                     │     "optimistic_trx": false,                                │
    │                     │     "password": "*****",                                    │
    │                     │     "prune_sescmd_history": true,                           │
    │                     │     "rank": "primary",                                      │
    │                     │     "retain_last_statements": -1,                           │
    │                     │     "retry_failed_reads": true,                             │
    │                     │     "reuse_prepared_statements": false,                     │
    │                     │     "router": "readwritesplit",                             │
    │                     │     "session_trace": false,                                 │
    │                     │     "session_track_trx_state": false,                       │
    │                     │     "slave_connections": 255,                               │
    │                     │     "slave_selection_criteria": "LEAST_CURRENT_OPERATIONS", │
    │                     │     "strict_multi_stmt": false,                             │
    │                     │     "strict_sp_calls": false,                               │
    │                     │     "strip_db_esc": true,                                   │
    │                     │     "transaction_replay": false,                            │
    │                     │     "transaction_replay_attempts": 5,                       │
    │                     │     "transaction_replay_max_size": 1073741824,              │
    │                     │     "transaction_replay_retry_on_deadlock": false,          │
    │                     │     "type": "service",                                      │
    │                     │     "use_sql_variables_in": "all",                          │
    │                     │     "user": "mxs",                                          │
    │                     │     "version_string": null                                  │
    │                     │ }                                                           │
    ├─────────────────────┼─────────────────────────────────────────────────────────────┤
    │ Router Diagnostics  │ {                                                           │
    │                     │     "avg_sescmd_history_length": 0,                         │
    │                     │     "max_sescmd_history_length": 0,                         │
    │                     │     "queries": 0,                                           │
    │                     │     "replayed_transactions": 0,                             │
    │                     │     "ro_transactions": 0,                                   │
    │                     │     "route_all": 0,                                         │
    │                     │     "route_master": 0,                                      │
    │                     │     "route_slave": 0,                                       │
    │                     │     "rw_transactions": 0,                                   │
    │                     │     "server_query_statistics": []                           │
    │                     │ }                                                           │
    └─────────────────────┴─────────────────────────────────────────────────────────────┘
    $ sudo mariadb
    CREATE USER 'app_user'@'192.0.2.10' IDENTIFIED BY 'app_user_passwd';
    GRANT ALL ON test.* TO 'app_user'@'192.0.2.10';
    CREATE USER 'app_user'@'192.0.2.11' IDENTIFIED BY 'app_user_passwd';
    GRANT ALL ON test.* TO 'app_user'@'192.0.2.11';
    $ mariadb --host 192.0.2.10 --port 3307
          --user app_user --password
    $ maxctrl list listeners
    ┌────────────────────────────┬──────┬──────┬─────────┬───────────────────────────┐
    │ Name                       │ Port │ Host │ State   │ Service                   │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ connection_router_listener │ 3308 │ ::   │ Running │ connection_router_service │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ query_router_listener      │ 3307 │ ::   │ Running │ query_router_service      │
    └────────────────────────────┴──────┴──────┴─────────┴───────────────────────────┘
    $ mariadb --host 192.0.2.10 --port 3308 \
          --user app_user --password
    SELECT @@global.hostname, @@global.server_id;
    
    +-------------------+--------------------+
    | @@global.hostname | @@global.server_id |
    +-------------------+--------------------+
    |              mcs2 |                  2 |
    +-------------------+--------------------+
    $ maxctrl list listeners
    ┌────────────────────────────┬──────┬──────┬─────────┬───────────────────────────┐
    │ Name                       │ Port │ Host │ State   │ Service                   │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ connection_router_listener │ 3308 │ ::   │ Running │ connection_router_service │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ query_router_listener      │ 3307 │ ::   │ Running │ query_router_service      │
    └────────────────────────────┴──────┴──────┴─────────┴───────────────────────────┘
    $ mariadb --host 192.0.2.10 --port 3307 \
          --user app_user --password
    CREATE TABLE test.load_balancing_test (
       id INT PRIMARY KEY AUTO_INCREMENT,
       hostname VARCHAR(256),
       server_id INT
    );
    INSERT INTO test.load_balancing_test (hostname, server_id)
    VALUES (@@global.hostname, @@global.server_id);
    SELECT * FROM test.load_balancing_test;
    +----+----------+-----------+
    | id | hostname | server_id |
    +----+----------+-----------+
    |  1 | mcs1     |         1 |
    |  2 | mcs1     |         1 |
    |  3 | mcs1     |         1 |
    +----+----------+-----------+
    $ maxctrl list listeners
    ┌────────────────────────────┬──────┬──────┬─────────┬───────────────────────────┐
    │ Name                       │ Port │ Host │ State   │ Service                   │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ connection_router_listener │ 3308 │ ::   │ Running │ connection_router_service │
    ├────────────────────────────┼──────┼──────┼─────────┼───────────────────────────┤
    │ query_router_listener      │ 3307 │ ::   │ Running │ query_router_service      │
    └────────────────────────────┴──────┴──────┴─────────┴───────────────────────────┘
    $ mariadb --host 192.0.2.10 --port 3307 \
          --user app_user --password
    SELECT @@global.hostname, @@global.server_id;
    +-------------------+--------------------+
    | @@global.hostname | @@global.server_id |
    +-------------------+--------------------+
    |              mcs2 |                  2 |
    +-------------------+--------------------+
    SELECT @@global.hostname, @@global.server_id;
    +-------------------+--------------------+
    | @@global.hostname | @@global.server_id |
    +-------------------+--------------------+
    |              mcs3 |                  3 |
    +-------------------+--------------------+

    Enterprise Server 10.5, Enterprise ColumnStore 5, MaxScale 2.5

  • Enterprise Server 10.6, Enterprise ColumnStore 23.02, MaxScale 22.08

  • Step 7

    Start and Configure MariaDB MaxScale

    Step 8

    Test MariaDB MaxScale

    Step 9

    Import Data

    • GlusterFS is a distributed file system. • GlusterFS supports replication and failover.

    NFS (Network File System)

    On-premises

    • NFS is a distributed file system. • If NFS is used, the storage should be mounted with the sync option to ensure that each node flushes its changes immediately. • For on-premises deployments, NFS is the recommended option for the Storage Manager directory, and any S3-compatible storage is the recommended option for data.

    <hostname>.log

    <hostname>-bin

    Step 1
    Step 2
    Step 3
    Step 4
    Step 5
    Step 6
    MariaDB Enterprise ColumnStore
    Shared Local Storage
    Storage Manager directory
    REST API
    MariaDB Enterprise Server
    MariaDB MaxScale

    Single-Node S3

    This guide provides steps for deploying a single-node S3 ColumnStore, setting up the environment, installing the software, and bulk importing data for online analytical processing (OLAP) workloads.

    Overview

    This procedure describes the deployment of the ColumnStore Object Storage topology with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5.

    MariaDB Enterprise ColumnStore 5 is a columnar storage engine for MariaDB Enterprise Server 10.5. Enterprise ColumnStore is suitable for Online Analytical Processing (OLAP) workloads.

    This procedure has 9 steps, which are executed in sequence.

    This procedure represents basic product capability and deploys 3 Enterprise ColumnStore nodes and 1 MaxScale node.

    This page provides an overview of the topology, requirements, and deployment procedures.

    Please read and understand this procedure before executing.

    Procedure Steps

    Step
    Description

    Prepare ColumnStore Nodes

    Configure Shared Local Storage

    Install MariaDB Enterprise Server

    Start and Configure MariaDB Enterprise Server

    Test MariaDB Enterprise Server

    Install MariaDB MaxScale

    Support

    Customers can obtain support by submitting a support case.

    Components

    The following components are deployed during this procedure:

    Component
    Function

    Modern SQL RDBMS with high availability, pluggable storage engines, hot online backups, and audit logging.

    Database proxy that extends the availability, scalability, and security of MariaDB Enterprise Servers

    MariaDB Enterprise Server Components

    Component
    Description

    • Columnar storage engine

    • Highly available

    • Optimized for Online Analytical Processing (OLAP) workloads

    • Scalable query execution

    MariaDB MaxScale Components

    Component
    Description

    Listener

    Listens for client connections to MaxScale then passes them to the router service

    MariaDB Monitor

    Tracks changes in the state of MariaDB Enterprise Servers.

    Read Connection Router

    Routes connections from the listener to any available Enterprise ColumnStore node

    Read/Write Split Router

    Routes read operations from the listener to any available Enterprise ColumnStore node, and routes write operations from the listener to a specific server that MaxScale uses as the primary server

    Server Module

    Connection configuration in MaxScale to an Enterprise ColumnStore node

    Topology

    The MariaDB Enterprise ColumnStore topology with Object Storage delivers production analytics with high availability, fault tolerance, and limitless data storage by leveraging S3-compatible storage.

    The topology consists of:

    • One or more MaxScale nodes

    • An odd number of ColumnStore nodes (minimum of 3) running ES, Enterprise ColumnStore, and CMAPI

    The MaxScale nodes:

    • Monitor the health and availability of each ColumnStore node using the MariaDB Monitor (mariadbmon)

    • Accept client and application connections

    • Route queries to ColumnStore nodes using the Read/Write Split Router (readwritesplit)

    The ColumnStore nodes:

    • Receive queries from MaxScale

    • Execute queries

    • Use S3-compatible object storage for data

    • Use shared local storage for the Storage Manager directory

    Requirements

    These requirements are for the ColumnStore Object Storage topology when deployed with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5.

    • Node Count

    • Operating System

    • Minimum Hardware Requirements

    • Recommended Hardware Requirements

    • Storage Requirements

    • S3-Compatible Object Storage Requirements

    • Preferred Object Storage Providers: Cloud

    • Preferred Object Storage Providers: Hardware

    • Shared Local Storage Directories

    • Shared Local Storage Options

    Node Count

    • MaxScale nodes, 1 or more are required.

    • Enterprise ColumnStore nodes, 3 or more are required for high availability. You should always have an odd number of nodes in a multi-node ColumnStore deployment to avoid split brain scenarios.

    Operating System

    In alignment to the enterprise lifecycle, the ColumnStore Object Storage topology with MariaDB Enterprise Server 10.5, MariaDB Enterprise ColumnStore 5, and MariaDB MaxScale 2.5 is provided for:

    • CentOS Linux 7 (x86_64)

    • Debian 10 (x86_64)

    • Red Hat Enterprise Linux 7 (x86_64)

    • Red Hat Enterprise Linux 8 (x86_64)

    • Ubuntu 18.04 LTS (x86_64)

    • Ubuntu 20.04 LTS (x86_64)

    Minimum Hardware Requirements

    MariaDB Enterprise ColumnStore's minimum hardware requirements are not intended for production environments, but the minimum hardware requirements can be appropriate for development and test environments. For production environments, see the recommended hardware requirements instead.

    The minimum hardware requirements are:

    Component
    CPU
    Memory

    MaxScale node

    4+ cores

    4+ GB

    Enterprise ColumnStore node

    4+ cores

    4+ GB

    MariaDB Enterprise ColumnStore will refuse to start if the system has less than 3 GB of memory.

    If Enterprise ColumnStore is started on a system with less memory, the following error message will be written to the ColumnStore system log called crit.log:

    And the following error message will be raised to the client:

    Recommended Hardware Requirements

    MariaDB Enterprise ColumnStore's recommended hardware requirements are intended for production analytics.

    The recommended hardware requirements are:

    Component
    CPU
    Memory

    MaxScale node

    8+ cores

    16+ GB

    Enterprise ColumnStore node

    64+ cores

    128+ GB

    Storage Requirements

    The ColumnStore Object Storage topology requires the following storage types:

    Storage Type
    Description

    The ColumnStore Object Storage topology uses S3-compatible object storage to store data.

    The ColumnStore Object Storage topology uses shared local storage for the Storage Manager directory to store metadata.

    S3-Compatible Object Storage Requirements

    The ColumnStore Object Storage topology uses S3-compatible object storage to store data.

    Many S3-compatible object storage services exist. MariaDB Corporation cannot make guarantees about all S3-compatible object storage services, because different services provide different functionality.

    For the preferred S3-compatible object storage providers that provide cloud and hardware solutions, see the following sections:

    • Cloud

    • Hardware

    The use of non-cloud and non-hardware providers is at your own risk.

    If you have any questions about using specific S3-compatible object storage with MariaDB Enterprise ColumnStore, contact us.

    Preferred Object Storage Providers: Cloud

    • Amazon Web Services (AWS) S3

    • Google Cloud Storage

    • Azure Storage

    • Alibaba Cloud Object Storage Service

    Preferred Object Storage Providers: Hardware

    • Cloudian HyperStore

    • Cohesity S3

    • Dell EMC

    • IBM Cloud Object Storage

    • Seagate Lyve Rack

    • Quantum ActiveScale

    Shared Local Storage Directories

    The ColumnStore Object Storage topology uses shared local storage for the Storage Manager directory to store metadata.

    The Storage Manager directory is located at the following path by default:

    • /var/lib/columnstore/storagemanager

    Shared Local Storage Options

    The most common shared local storage options for the ColumnStore Object Storage topology are:

    Shared Local Storage
    Common Usage
    Description

    EBS (Elastic Block Store) Multi-Attach

    AWS

    • EBS is a high-performance block-storage service for AWS (Amazon Web Services).

    • EBS Multi-Attach allows an EBS volume to be attached to multiple instances in AWS. Only clustered file systems, such as GFS2, are supported.

    • For deployments in AWS, EBS Multi-Attach is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data.

    EFS (Elastic File System)

    AWS

    • EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).

    • For deployments in AWS, EFS is a recommended option for the Storage Manager directory, and Amazon S3 storage is the recommended option for data. EFS is a scalable, elastic, cloud-native NFS file system for AWS (Amazon Web Services).

    Filestore

    GCP

    • Filestore is high-performance, fully managed storage for GCP (Google Cloud Platform).

    • For deployments in GCP, Filestore is the recommended option for the Storage Manager directory, and Google Object Storage (S3-compatible) is the recommended option for data.

    GlusterFS

    On-premises

    Recommended Storage Options

    For best results, MariaDB Corporation would recommend the following storage options:

    Environment
    Object Storage For Data
    Shared Local Storage For Storage Manager

    AWS

    Amazon S3 storage

    EBS Multi-Attach or EFS

    GCP

    Google Object Storage (S3-compatible)

    Filestore

    On-premises

    Any S3-compatible object storage

    NFS

    Enterprise ColumnStore Management with CMAPI

    Enterprise ColumnStore's CMAPI (Cluster Management API) is a REST API that can be used to manage a multi-node Enterprise ColumnStore cluster.

    Many tools are capable of interacting with REST APIs. For example, the curl utility could be used to make REST API calls from the command-line.

    Many programming languages also have libraries for interacting with REST APIs.

    The examples below show how to use the CMAPI with curl.

    URL Endpoint Format for REST API

    For example:

    • https://mcs1:8640/cmapi/0.4.0/cluster/shutdown

    • https://mcs1:8640/cmapi/0.4.0/cluster/start

    • https://mcs1:8640/cmapi/0.4.0/cluster/status

    With CMAPI 1.4 and later:

    • https://mcs1:8640/cmapi/0.4.0/cluster/node

    With CMAPI 1.3 and earlier:

    • https://mcs1:8640/cmapi/0.4.0/cluster/add-node

    • https://mcs1:8640/cmapi/0.4.0/cluster/remove-node

    Required Request Headers

    • 'x-api-key': '93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd'

    • 'Content-Type': 'application/json'

    x-api-key can be set to any value of your choice during the first call to the server. Subsequent connections will require this same key.

    Get Status

    curl examples remain valid but are now considered legacy.

    $ mcs cluster status

    Start Cluster

    $ mcs cluster start --timeout 20

    Stop Cluster

    $ mcs cluster shutdown --timeout 20

    Add Node

    • With CMAPI 1.4 and later:

    • With CMAPI 1.3 and earlier:

    Remove Node

    • With CMAPI 1.4 and later:

    • With CMAPI 1.3 and earlier:

    Quick Reference

    MariaDB Enterprise Server Configuration Management

    Method
    Description

    Configuration File

    Configuration files (such as /etc/my.cnf) can be used to set system-variables and options. The server must be restarted to apply changes made to configuration files.

    Command-line

    The server can be started with command-line options that set system-variables and options.

    SQL

    Users can set system-variables that support dynamic changes on-the-fly using the SET statement.

    MariaDB Enterprise Server packages are configured to read configuration files from different paths, depending on the operating system. Making custom changes to Enterprise Server default configuration files is not recommended because custom changes may be overwritten by other default configuration files that are loaded later.

    To ensure that your custom changes will be read last, create a custom configuration file with the z- prefix in one of the include directories.

    Distribution
    Example Configuration File Path

    Distribution

    Example Configuration File Path

    • CentOS

    • Red Hat Enterprise Linux (RHEL)

    /etc/my.cnf.d/z-custom-mariadb.cnf

    • Debian

    • Ubuntu

    /etc/mysql/mariadb.conf.d/z-custom-mariadb.cnf

    MariaDB Enterprise Server Service Management

    The systemctl command is used to start and stop the MariaDB Enterprise Server service.

    Operation
    Command

    Start

    sudo systemctl start mariadb

    Stop

    sudo systemctl stop mariadb

    Restart

    sudo systemctl restart mariadb

    Enable during startup

    sudo systemctl enable mariadb

    Disable during startup

    sudo systemctl disable mariadb

    Status

    sudo systemctl status mariadb

    For additional information, see "".

    MariaDB Enterprise Server Logs

    MariaDB Enterprise Server produces log data that can be helpful in problem diagnosis.

    Log filenames and locations may be overridden in the server configuration. The default location of logs is the data directory. The data directory is specified by the datadir system variable.

    Log
    System Variable/Option
    Default Filename

    <hostname>.err

    server_audit.log

    <hostname>-slow.log

    Enterprise ColumnStore Service Management

    The systemctl command is used to start and stop the ColumnStore service.

    Operation
    Command

    Start

    sudo systemctl start mariadb-columnstore

    Stop

    sudo systemctl stop mariadb-columnstore

    Restart

    sudo systemctl restart mariadb-columnstore

    Enable during startup

    sudo systemctl enable mariadb-columnstore

    Disable during startup

    sudo systemctl disable mariadb-columnstore

    Status

    sudo systemctl status mariadb-columnstore

    In the ColumnStore Object Storage topology, the mariadb-columnstore service should not be enabled. The CMAPI service restarts Enterprise ColumnStore as needed, so it does not need to start automatically upon reboot.

    Enterprise ColumnStore CMAPI Service Management

    The systemctl command is used to start and stop the CMAPI service.

    Operation
    Command

    Start

    sudo systemctl start mariadb-columnstore-cmapi

    Stop

    sudo systemctl stop mariadb-columnstore-cmapi

    Restart

    sudo systemctl restart mariadb-columnstore-cmapi

    Enable during startup

    sudo systemctl enable mariadb-columnstore-cmapi

    Disable during startup

    sudo systemctl disable mariadb-columnstore-cmapi

    Status

    sudo systemctl status mariadb-columnstore-cmapi

    For additional information on endpoints, see "CMAPI".

    MaxScale Configuration Management

    MaxScale can be configured using several methods. These methods make use of MaxScale's REST API.

    Method
    Benefits

    Command-line utility to perform administrative tasks through the REST API. See MaxCtrl Commands.

    MaxGUI is a graphical utility that can perform administrative tasks through the REST API.

    The REST API can be used directly. For example, the curl utility could be used to make REST API calls from the command-line. Many programming languages also have libraries to interact with REST APIs.

    The procedure on these pages configures MaxScale using MaxCtrl.

    MaxScale Service Management

    The systemctl command is used to start and stop the MaxScale service.

    Operation
    Command

    Start

    sudo systemctl start maxscale

    Stop

    sudo systemctl stop maxscale

    Restart

    sudo systemctl restart maxscale

    Enable during startup

    sudo systemctl enable maxscale

    Disable during startup

    sudo systemctl disable maxscale

    Status

    sudo systemctl status maxscale

    For additional information, see "".

    Next Step

    Navigation in the procedure "Deploy ColumnStore Object Storage Topology":

    Next: Step 1: Prepare ColumnStore Nodes.

    • Enterprise Server 10.5

    • Enterprise Server 10.6

    • Enterprise Server 11.4

    Columnar storage engine with S3-compatible object storage

    • Highly available

    • Automatic failover via MaxScale and CMAPI

    • Scales read via MaxScale

    • Bulk data import

    • Enterprise Server 10.5, Enterprise ColumnStore 5, MaxScale 2.5

    • Enterprise Server 10.6, Enterprise ColumnStore 23.02, MaxScale 22.08

    Apr 30 21:54:35 a1ebc96a2519 PrimProc[1004]: 35.668435 |0|0|0| C 28 CAL0000: Error total memory available is less than 3GB.
    ERROR 1815 (HY000): Internal error: System is not ready yet. Please try again.
    https://{server}:{port}/cmapi/{version}/{route}/{command}
    $ curl -k -s https://mcs1:8640/cmapi/0.4.0/cluster/status \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          | jq .
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/start \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          --data '{"timeout":20}' \
          | jq .
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/shutdown \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          --data '{"timeout":20}' \
          | jq .
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/node \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          --data '{"timeout":20, "node": "192.0.2.2"}' \
          | jq .
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/add-node \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          --data '{"timeout":20, "node": "192.0.2.2"}' \
          | jq .
    $ curl -k -s -X DELETE https://mcs1:8640/cmapi/0.4.0/cluster/node \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          --data '{"timeout":20, "node": "192.0.2.2"}' \
          | jq .
    $ curl -k -s -X PUT https://mcs1:8640/cmapi/0.4.0/cluster/remove-node \
          --header 'Content-Type:application/json' \
          --header 'x-api-key:93816fa66cc2d8c224e62275bd4f248234dd4947b68d4af2b29671dd7d5532dd' \
          --data '{"timeout":20, "node": "192.0.2.2"}' \
          | jq .

    provides a REST API for multi-node administration

    Step 7

    Start and Configure MariaDB MaxScale

    Step 8

    Test MariaDB MaxScale

    Step 9

    Import Data

    • GlusterFS is a distributed file system.

    • GlusterFS supports replication and failover.

    NFS (Network File System)

    On-premises

    • NFS is a distributed file system.

    • If NFS is used, the storage should be mounted with the sync option to ensure that each node flushes its changes immediately.

    • For on-premises deployments, NFS is the recommended option for the Storage Manager directory, and any S3-compatible storage is the recommended option for data.

    <hostname>.log

    <hostname>-bin

    Recommended Storage Options
    Step 1
    Step 2
    Step 3
    Step 4
    Step 5
    Step 6
    MariaDB Enterprise ColumnStore
    S3-Compatible Object Storage
    Shared Local Storage
    MaxCtrl
    MaxGUI
    REST API
    MariaDB Enterprise Server
    MariaDB MaxScale
    Reporting Bugs
    MaxScale's REST API
    MaxCtrl
    mariadbmon
    MaxScale's REST API
    MaxScale's REST API
    MaxCtrl
    mariadbmon
    MaxScale's REST API
    MaxScale's REST API
    MaxCtrl
    MaxScale's REST API
    MaxScale's REST API
    MaxCtrl
    MaxScale's REST API
    MaxScale's REST API
    MaxScale's REST API
    MaxScale's REST API
    MaxScale's REST API
    MaxScale's REST API
    MaxCtrl
    MaxScale's REST API
    CDC tutorial
    MaxScale's REST API
    MaxCtrl
    MaxScale's REST API
    MaxCtrl
    MaxScale's REST API
    MaxCtrl
    MaxScale's REST API
    MaxCtrl
    MaxScale's REST API
    MaxScale's REST API
    MaxScale's REST API
    MaxScale's REST API
    MaxScale's REST API
    MaxScale's REST API
    MaxCtrl
    MaxGUI
    here
    MariaDB 10.5.4
    ColumnStore 23.10 Release Notes
    ColumnStore 23.02 Release Notes
    ColumnStore 22.08 Release Notes
    ColumnStore 6 Release Notes
    ColumnStore 5.6 Release Notes
    Cluster Management API (CMAPI)
    InnoDB has a row size of 64KB
    CREATE DATABASE
    CREATE TABLE
    LOAD DATA INFILE
    LOAD DATA INFILE
    SELECT
    Versions
    MariaDB Package Repository Setup and Usage
    CREATE DATABASE
    CREATE TABLE
    SELECT
    MariaDB Replication
    ANALYZE TABLE
    ANALYZE TABLE
    Versions
    MariaDB Package Repository Setup and Usage
    statement
    statement
    MariaDB Topologies
    mariadb client
    SELECT
    METADATA_LOCK_INFO plugin
    MariaDB Replication
    Versions
    MariaDB Package Repository Setup and Usage
    Versions
    MariaDB Package Repository Setup and Usage
    Versions
    MariaDB Package Repository Setup and Usage
    CREATE USER
    character_set_server
    collation_server
    LOAD DATA INFILE
    INSERT...SELECT
    information_schema.PLUGINS
    this page
    this page
    MariaDB Client
    information_schema.PLUGINS
    Versions
    MariaDB Package Repository Setup and Usage
    CREATE TABLE
    Compression
    Optimize Linux Kernel Parameters
    DECIMAL
    NUMERIC
    FIXED
    CREATE USER
    GRANT
    mariadb-backup
    FLUSH TABLES WITH READ LOCK
    MariaDB Backup
    MariaDB Backup
    UNLOCK TABLES
    MariaDB Backup
    MariaDB Backup
    FLUSH TABLES WITH READ LOCK
    MariaDB Backup
    MariaDB Backup
    UNLOCK TABLES
    MariaDB Backup
    METADATA_LOCK_INFO plugin
    INSERT INTO .. SELECT FROM ..
    METADATA_LOCK_INFO plugin
    LOAD DATA [ LOCAL ] INFILE
    INSERT .. SELECT
    MariaDB Client
    information_schema.PLUGINS
    system variables
    options
    system variables
    options
    system variables
    SET
    INTEGER
    DECIMAL
    DATE
    DATETIME
    CHAR
    VARCHAR
    gtid_strict_mode
    mariadb_es_repo_setup
    gtid_strict_mode
    MariaDB Client
    SHOW GLOBAL STATUS
    MariaDB Client
    SHOW GLOBAL VARIABLES
    Versions
    MariaDB Package Repository Setup and Usage
    Versions
    MariaDB Package Repository Setup and Usage
    information_schema.PLUGINS
    SELECT
    Topologies
    MariaDB Backup
    MariaDB Backup
    MariaDB Backup
    MariaDB Client
    MariaDB Replication
    gtid_slave_pos
    CHANGE MASTER TO
    START SLAVE
    METADATA_LOCK_INFO plugin
    METADATA_LOCK_INFO plugin
    MariaDB Client
    CREATE DATABASE
    CREATE TABLE
    DATE
    DATETIME
    MariaDB Client
    SELECT
    DATE
    MariaDB replication
    system variables
    options
    system variables
    options
    system variables
    SET
    EXPLAIN
    enumeration
    numeric
    numeric
    numeric
    numeric
    numeric
    numeric
    numeric
    numeric
    enumeration
    numeric
    numeric
    enumeration
    enumeration
    numeric
    numeric
    numeric
    enumeration
    enumeration
    enumeration
    enumeration
    LOAD DATA INFILE
    INSERT INTO SELECT FROM
    InnoDB
    Hybrid transactional-analytical processing (HTAP)
    CREATE USER
    GRANT
    GRANT
    character_set_server
    collation_server
    gtid_strict_mode
    log_bin
    log_bin_index
    log_slave_updates
    relay_log
    relay_log_index
    server_id
    InnoDB
    InnoDB
    METADATA_LOCK_INFO plugin
    LOAD DATA INFILE
    INSERT INTO .. SELECT FROM
    METADATA_LOCK_INFO plugin
    MariaDB Enterprise Backup
    Starting and Stopping MariaDB
    CREATE USER
    GRANT
    GRANT
    character_set_server
    collation_server
    gtid_strict_mode
    log_bin
    log_bin_index
    log_slave_updates
    relay_log
    relay_log_index
    server_id
    MyISAM
    InnoDB
    Starting and Stopping MariaDB
    MariaDB Error Log
    log_error
    MariaDB Enterprise Audit Log
    server_audit_file_path
    Slow Query Log
    slow_query_log_file
    General Query Log
    general_log_file
    Binary Log
    log_bin
    Starting and Stopping MariaDB
    Starting and Stopping MariaDB
    MariaDB Error Log
    log_error
    MariaDB Enterprise Audit Log
    server_audit_file_path
    Slow Query Log
    slow_query_log_file
    General Query Log
    general_log_file
    Binary Log
    log_bin

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.

    This page is: Copyright © 2025 MariaDB. All rights reserved.