MariaDB ColumnStore is the community version of the ColumnStore storage engine included with MariaDB Community Server. It provides columnar storage for scalable analytical processing and smart transactions.
MariaDB Community Server Convergence
MariaDB ColumnStore has been converging with MariaDB Community Server throughout the past few releases:
In MariaDB ColumnStore 1.2 and earlier, MariaDB ColumnStore required special custom-built releases of MariaDB Server.
The simplified installation process makes open source analytics accessible to a wider audience.
To install the latest version, see Deploy MariaDB ColumnStore 5.4.
MariaDB ColumnStore is available on select platforms:
Red Hat Enterprise Linux 8
Red Hat Enterprise Linux 7
SUSE Linux Enterprise Server 15
SUSE Linux Enterprise Server 12
MariaDB ColumnStore is a columnar database:
Row-based storage engines are very performant for transactional or OLTP workloads.
Transactional workloads are generally characterized by a fixed set of queries using a relatively small data set.
Columnar storage engines are more performant for analytical or OLAP workloads.
Analytical workloads are characterized by ad hoc queries on very large data sets.
Where indexes can be used to optimize query performance for transactional workloads, the size of the data sets and the ad hoc nature of the queries preclude the use of indexes to optimize for analytical queries.
ColumnStore is designed specifically to handle analytical workloads.
Data is written to disk by column rather than row and is automatically partitioned. No indexes are necessary.
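The difference between the two layouts can be sketched in a few lines of Python. This is a toy illustration only: the table, column names, and the `to_columns` helper are hypothetical, and real ColumnStore data files are partitioned binary structures rather than Python lists. It shows why an aggregate over one column touches fewer values when data is organized by column.

```python
# Toy illustration only: the table and helper below are hypothetical and
# do not reflect ColumnStore's on-disk format.

rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: summing one field still walks past every field of every row.
row_values_touched = sum(len(row) for row in rows)
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: only the "amount" column is read.
column_values_touched = len(columns["amount"])
total_from_columns = sum(columns["amount"])

print(total_from_rows, row_values_touched)
print(total_from_columns, column_values_touched)
```

Both layouts produce the same total, but the columnar version reads one third of the values in this three-column table. The gap widens as tables gain columns that a given query does not reference.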
Hybrid Transactional and Analytical Processing
MariaDB ColumnStore supports Hybrid Transactional and Analytical Processing (HTAP) workloads:
Hybrid Transactional and Analytical Processing (HTAP) workloads require both transactional and analytical queries.
HTAP workloads are also known as "Smart Transactions", "Augmented Transactions", "Translytical", or "Hybrid Operational-Analytical Processing (HOAP)".
ColumnStore can perform as the analytical storage engine for HTAP.
MariaDB Replication can replicate data between the transactional and analytical engines to maintain data consistency.
MariaDB MaxScale is a high-performance database proxy, which can dynamically route transactional queries to the transactional storage engine and analytical queries to ColumnStore.
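The idea of routing by query type can be illustrated with a small Python sketch. It is not how MariaDB MaxScale classifies queries: the `route` function, the keyword heuristic, and the backend names `transactional` and `analytical` are all assumptions made up for this example.

```python
import re

# Hypothetical heuristic: a SELECT that aggregates or groups is treated as
# analytical. This is an illustration, not MaxScale's routing logic.
ANALYTICAL_PATTERN = re.compile(
    r"\b(GROUP\s+BY|SUM|AVG|COUNT|MIN|MAX)\b", re.IGNORECASE
)


def route(query: str) -> str:
    """Return the name of the (hypothetical) backend for a query."""
    statement = query.lstrip().upper()
    # Writes and other non-SELECT statements go to the transactional engine.
    if not statement.startswith("SELECT"):
        return "transactional"
    if ANALYTICAL_PATTERN.search(query):
        return "analytical"
    return "transactional"


for query in (
    "INSERT INTO orders VALUES (1, 'EU', 120.0)",
    "SELECT * FROM orders WHERE order_id = 1",
    "SELECT region, SUM(amount) FROM orders GROUP BY region",
):
    print(route(query), "<-", query)
```

A keyword match like this would misclassify many real queries. It is only meant to show the shape of the decision: point lookups and writes stay on the row-based engine, while aggregations go to ColumnStore.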
S3 Storage Manager
MariaDB ColumnStore supports S3-compatible storage:
ColumnStore can use any object store that is compatible with the Amazon S3 API.
Using cloud storage for ColumnStore data allows for practically limitless data storage while also providing high availability.
ColumnStore's "Storage Manager" uses a persistent local disk cache for read/write operations so that network latency has minimal performance impact on ColumnStore.
In some cases, this configuration performs better than local disk storage.
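The effect of a local cache in front of an object store can be sketched as follows. This is a generic read-through, write-through cache, not a description of how the Storage Manager is implemented. The `ObjectStore` and `CachedStorage` classes and the key `extent-0001` are hypothetical, and an in-memory dictionary stands in for the persistent local disk.

```python
class ObjectStore:
    """Stand-in for an S3-compatible bucket; counts remote round trips."""

    def __init__(self):
        self.objects = {}
        self.remote_reads = 0

    def get(self, key):
        self.remote_reads += 1  # each call models one network request
        return self.objects[key]

    def put(self, key, data):
        self.objects[key] = data


class CachedStorage:
    """Read-through, write-through cache in front of an object store."""

    def __init__(self, store):
        self.store = store
        self.cache = {}  # assumption: a real cache would live on local disk

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.store.get(key)  # cache miss: fetch once
        return self.cache[key]

    def write(self, key, data):
        self.cache[key] = data
        self.store.put(key, data)


store = ObjectStore()
store.put("extent-0001", b"column data")

storage = CachedStorage(store)
first = storage.read("extent-0001")   # miss: goes to the object store
second = storage.read("extent-0001")  # hit: served from the local cache

print(store.remote_reads)
```

Only the first read pays the network round trip. Repeated reads of the same object are served locally, which is the property that keeps network latency from dominating query time.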
Connectors to BI Tools
MariaDB ColumnStore includes support for multiple connectors and data adapters to enable the use of popular data ingestion and business intelligence tools.
Data Ingestion Tools
MariaDB ColumnStore supports data ingestion tools:
cpimport, the ColumnStore bulk data ingestion utility, which includes command-line options for loading a CSV file from Amazon S3 (and S3-compatible) buckets
Apache Spark connector to export data directly from Spark DataFrames to MariaDB ColumnStore
Kafka data adapter for rapid data ingestion
UDAF C++ API
MariaDB ColumnStore supports a Distributed User Defined Aggregate Functions (UDAF) C++ API:
The Distributed User Defined Aggregate Functions (UDAF) C++ API allows anyone to create aggregate functions of arbitrary complexity for distributed execution in the ColumnStore storage engine.
These functions can also be used as Analytic (Window) functions, just like any built-in aggregate function.
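What makes an aggregate suitable for distributed execution is that partial results computed on separate nodes can be merged into one final answer. The sketch below shows that pattern for an average. It is written in Python for brevity and does not reproduce the C++ API: the `AvgState` type and the `accumulate`, `merge`, and `finalize` functions are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class AvgState:
    """Partial state for an average: enough to merge without losing accuracy."""
    total: float = 0.0
    count: int = 0


def accumulate(state, value):
    """Fold one input value into a node-local partial state."""
    state.total += value
    state.count += 1
    return state


def merge(left, right):
    """Combine two partial states produced on different nodes."""
    return AvgState(left.total + right.total, left.count + right.count)


def finalize(state):
    """Turn the merged state into the final aggregate value."""
    return state.total / state.count if state.count else None


# Each inner list models the rows held by one node.
partitions = [[10.0, 20.0], [30.0], [40.0, 50.0, 60.0]]

partials = []
for values in partitions:
    state = AvgState()
    for value in values:
        accumulate(state, value)
    partials.append(state)

combined = AvgState()
for partial in partials:
    combined = merge(combined, partial)

result = finalize(combined)
print(result)
```

Carrying the sum and the count separately is the design choice that matters here. Averaging the per-node averages would give the wrong answer whenever nodes hold different numbers of rows.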