Release Notes for MariaDB ColumnStore 1.4.2¶
MariaDB ColumnStore is a columnar storage engine. This is the first release in the ColumnStore 1.4 series. This release contains a variety of new features and fixes, compared to MariaDB ColumnStore 1.2.5.
MariaDB ColumnStore 1.4.2 was released on 2020-01-06.
MariaDB Server Convergence¶
Until now, MariaDB ColumnStore has been maintained as a custom fork of MariaDB Server, to support the unique way that queries are handled for distributed processing.
With this release, a joint project between the MariaDB Server and MariaDB ColumnStore engineering teams, ColumnStore now works as a pluggable storage engine on the standard MariaDB Enterprise Server 10.4 platform.
MariaDB Enterprise Server 10.4 includes distributed processing engine support features. These features are not present in the older 10.3 and 10.2 release series.
A standard MariaDB Server is now used for ColumnStore UM (User Module) nodes. ColumnStore users can now enjoy the benefits of MariaDB Server 10.4, and MariaDB Server 10.4 users are now able to deploy ColumnStore on top of their existing stack.
S3 Storage Manager¶
MariaDB ColumnStore can now use any object store that is compatible with the Amazon S3 API. The new "Storage Manager" uses a persistent disk cache for read/write operations, so it has minimal performance impact on ColumnStore. In some cases it performs better than local disk operations.
postConfigure, edit the storagemanager.cnf configuration file to specify the S3 connection parameters (as detailed in the S3 section of that file) and the local machine configuration (as detailed in the ObjectStorage and Cache sections). The configuration file is documented in-line. Enable the Storage Manager by setting ObjectStorage/Service to
postConfigure, and when prompted for type of storage, select the
cpimport S3 Support¶
cpimport is a high-speed bulk data loading utility for ColumnStore.
cpimport now includes command-line options for loading a CSV file from Amazon S3 (and compatible) buckets.
S3 Authentication Key
S3 Secret Key
S3 Hostname (omit if using Amazon S3, which is the default)
When these options are set, cpimport will use the path/filename provided to load an object from object storage instead of a local file. Currently, the entire file is downloaded into memory before processing.
Expanded Data Type Support¶
Please note that for cpimport the current system time of the PM node is used.
MODA() Mode Average UDAF¶
The MODA() UDAF (User-Defined Aggregate Function) determines the mode average.
When several values tie for the mode, MODA() uses the value closest to the average, and then the value with the smallest absolute value.
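The tie-break rule can be illustrated with a minimal Python sketch. This is not the ColumnStore implementation. It assumes that "the average" means the mean of all input values, and that an empty input yields no result; neither detail is stated in these notes.

```python
from collections import Counter


def moda(values):
    """Return the mode of `values`, breaking ties as described above.

    Assumptions (not confirmed by the release notes): the average is the
    mean of all input values, and an empty input returns None.
    """
    if not values:
        return None
    counts = Counter(values)
    highest = max(counts.values())
    # Every value that occurs with the highest frequency is a candidate.
    candidates = [v for v, c in counts.items() if c == highest]
    average = sum(values) / len(values)
    # Tie-break 1: closest to the average.
    # Tie-break 2: smallest absolute value.
    # If both still tie (for example -2 and 2 around an average of 0), the
    # notes do not say which wins; this sketch keeps the first one seen.
    return min(candidates, key=lambda v: (abs(v - average), abs(v)))
```

For example, in `[1, 1, 2, 2, 9]` both 1 and 2 occur twice; the average is 3, so 2 is returned because it is closer to the average.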
Statement-Based Replication Support¶
Statement-based replication into ColumnStore tables is supported by setting columnstore_replication_slave=on on the UM that will apply the replication data. Row-based replication events on ColumnStore replica (slave) tables will currently fail, generating an error viewable with SHOW SLAVE STATUS.
BRM (Block Resolution Manager) snapshots are now faster, which improves performance when committing data to ColumnStore.
To reduce SSD wear and increase write performance for large data sets containing many columns, ColumnStore now allocates disk as needed, writing only real data and padding to fill the remainder of an 8KB block. ColumnStore previously wrote twice: once to pre-allocate an empty file for each new extent (8 million item file for a column), and a second time to fill the file with real data.
The outer "ORDER BY" of a query is now processed using ColumnStore's engine instead of MariaDB Server. This uses a faster sorting algorithm for higher performance with larger result sets.
Joins use a new hash algorithm which is significantly faster and requires significantly less initial memory to execute.
Memory cleanup after query execution now occurs in a separate thread. This previously occurred in the main ExeMgr thread, which could delay execution of new queries.
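The effect of as-needed disk allocation can be approximated with a little arithmetic. The Python sketch below is an illustration only: the 8KB block size comes from the description above, while the function names, the example extent size, and the simplified accounting of the old behavior (one full pre-allocation write plus one data write) are assumptions.

```python
BLOCK_SIZE = 8 * 1024  # 8KB block, as stated above


def padded_size(data_bytes, block_size=BLOCK_SIZE):
    """Bytes written when real data is padded out to whole blocks."""
    if data_bytes < 0:
        raise ValueError("data_bytes must be non-negative")
    blocks = -(-data_bytes // block_size)  # ceiling division
    return blocks * block_size


def bytes_written(data_bytes, extent_bytes, preallocate):
    """Rough write volume for one column file (hypothetical accounting).

    preallocate=True models the previous behavior: write an empty file
    for the whole extent, then write the real data.
    preallocate=False models the new behavior: write only the real data
    plus padding up to the next block boundary.
    """
    if preallocate:
        return extent_bytes + data_bytes
    return padded_size(data_bytes)
```

With a hypothetical 32 MiB extent file and 10,000 bytes of real data, the pre-allocating path writes a little over 32 MiB, while the as-needed path writes two 8KB blocks.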
InfiniDB Alias Eliminated¶
ColumnStore 1.2 and earlier included the InfiniDB engine as an alias. This alias has now been removed. All ColumnStore tables must now be created with the engine name "columnstore". All MariaDB system variables prefixed with "infinidb_" have now been removed.
vtable Replaced by Query Execution Handlers¶
vtable has been replaced with a set of query execution handlers: Select Handler, Derived Handler, and table API mode.
The vtable mode switch (infinidb_vtable_mode system variable) has been eliminated. Two new session variables have been added: columnstore_select_handler and columnstore_derived_handler.
Select Handler is the replacement for a vtable, and is the default query execution handler. It is expected to provide the fastest execution path for the whole query.
Select Handler lacks support for some vtable features, including:
INSERT .. SELECT
SELECT INTO OUTFILE
If the Select Handler fails to execute a query, an error is returned. If a query fails under the Select Handler, set columnstore_select_handler=off for the session. This will cause the Server to hand off query execution to the Derived Handler. The query must be restarted after the session variable has been set.
If the Derived Handler fails to execute a query, an error is returned. If a query fails under both the Select Handler and Derived Handler, set columnstore_derived_handler=off for the session. This will cause table API execution, equivalent to disabled vtable mode in ColumnStore 1.2.x and earlier. The query must be restarted after the session variables have been set.
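The fallback sequence described above could be automated on the client side. The following Python sketch shows one way to do so. `execute` is a hypothetical callable (not part of any ColumnStore or MariaDB API) that runs one SQL statement on a single, persistent session and raises an exception on failure; only the two session variable names are taken from these notes.

```python
# Settings to apply before each attempt. None means "use the defaults",
# which leaves the Select Handler enabled for the first attempt.
FALLBACK_STEPS = [
    None,
    "SET SESSION columnstore_select_handler=OFF",   # fall back to Derived Handler
    "SET SESSION columnstore_derived_handler=OFF",  # fall back to table API mode
]


def run_with_fallback(execute, query):
    """Run `query`, disabling one handler per failed attempt.

    `execute` is a hypothetical function supplied by the caller. All
    calls must share the same session, because the variables are set at
    session scope.
    """
    last_error = None
    for setting in FALLBACK_STEPS:
        if setting is not None:
            execute(setting)
        try:
            # The query is restarted from scratch after each change.
            return execute(query)
        except Exception as exc:  # a real client would catch its driver's error type
            last_error = exc
    raise last_error
```

Because the handlers remain disabled for the rest of the session, a caller may want to re-enable them afterwards so that later queries can use the faster paths again.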
ColumnStore is included with MariaDB Enterprise Server 10.4 on select Platforms.
ColumnStore is available for deployment from package tarball and repository. ColumnStore is not available for deployment from binary tarballs.
"Distributed Install" Method Eliminated¶
The "distributed install" method, which pushed packages onto other nodes during postConfigure, has been removed. ColumnStore packages must now be installed on all nodes prior to startup.
Configuration Path Changes¶
ColumnStore XML configuration files have moved to
MariaDB Enterprise Server configuration options for ColumnStore have moved to /etc/my.cnf.d/columnstore.cnf, and the default MariaDB Enterprise Server my.cnf will load this file.
Data Directory Path Change¶
The ColumnStore data directory has moved to /var/lib/columnstore and is separate from the MariaDB Server data directory at
Executable Path Changes¶
ColumnStore binaries have moved to /usr/sbin, and the libraries are in the OS-standard /usr library path. Some ColumnStore binaries have been renamed to avoid conflict, including:
User Account for Cross-Engine Joins¶
Cross-engine joins depend on a TCP connection from ExeMgr to the Server process. Since the database root user in MariaDB Enterprise Server 10.4 authenticates only by UNIX socket, a dedicated user must be created to support cross-engine joins. The cross-engine section of Columnstore.xml should be edited accordingly.
Can result in crashes, hangs, stalls¶
Certain window function queries could crash the Server process. (MCOL-3434)
Can result in unexpected behavior¶
DISTINCT concatenates even non-distinct values. (MCOL-2146)
Wrong results could be returned for a complex query with subquery and window functions over
Pipe operator (|) could return wrong results. (MCOL-174)
DIV operator could return wrong results. (MCOL-179)
Comparison of padded strings could provide incorrect results. (MCOL-1559)
CREATE TABLE could fail when the table name contained a space or certain characters other than A-Z a-z 0-9 _. (MCOL-2219)
DISTINCT could be performed in incorrect order relative to window functions and
cpimport outputs value truncation warning when read buffer (-b) is set to
Cross-engine joins with query using
Bulk write API writes were possible when writes were suspended. (MCOL-3576)
IN statements could produce incorrect results. (MCOL-3448)
JOIN could significantly waste memory. (MCOL-1758)
Memory leaks. (MCOL-3621)
Performance of some queries, such as those containing UNION, may be worse than on ColumnStore 1.2.x.
NOT LIKE queries currently fall back to a slower execution method.
Columnstore_commit_hash status variable added
Columnstore_version status variable added
columnstore_compression_type system variable added
columnstore_decimal_scale system variable added
columnstore_derived_handler system variable added
columnstore_diskjoin_bucketsize system variable added
columnstore_diskjoin_largesidelimit system variable added
columnstore_diskjoin_smallsidelimit system variable added
columnstore_double_for_decimal_math system variable added
columnstore_group_by_handler system variable added
columnstore_import_for_batchinsert_delimiter system variable added
columnstore_import_for_batchinsert_enclosed_by system variable added
columnstore_local_query system variable added
columnstore_orderby_threads system variable added
columnstore_ordered_only system variable added
columnstore_replication_slave system variable added
columnstore_select_handler system variable added
columnstore_string_scan_threshold system variable added
columnstore_stringtable_threshold system variable added
columnstore_um_mem_limit system variable added
columnstore_use_decimal_scale system variable added
columnstore_use_import_for_batchinsert system variable added
columnstore_varbin_always_hex system variable added
COLUMNSTORE_COLUMNS information schema table added
COLUMNSTORE_EXTENTS information schema table added
COLUMNSTORE_FILES information schema table added
COLUMNSTORE_TABLES information schema table added
--columnstore-columns command-line option added
--columnstore-compression-type command-line option added
--columnstore-decimal-scale command-line option added
--columnstore-derived-handler command-line option added
--columnstore-diskjoin-bucketsize command-line option added
--columnstore-diskjoin-largesidelimit command-line option added
--columnstore-diskjoin-smallsidelimit command-line option added
--columnstore-double-for-decimal-math command-line option added
--columnstore-extents command-line option added
--columnstore-files command-line option added
--columnstore-group-by-handler command-line option added
--columnstore-import-for-batchinsert-delimiter command-line option added
--columnstore-import-for-batchinsert-enclosed-by command-line option added
--columnstore-local-query command-line option added
--columnstore-orderby-threads command-line option added
--columnstore-ordered-only command-line option added
--columnstore-replication-slave command-line option added
--columnstore-select-handler command-line option added
--columnstore-string-scan-threshold command-line option added
--columnstore-stringtable-threshold command-line option added
--columnstore-tables command-line option added
--columnstore-um-mem-limit command-line option added
--columnstore-use-decimal-scale command-line option added
--columnstore-use-import-for-batchinsert command-line option added
--columnstore-varbin-always-hex command-line option added
--columnstore command-line option added
In alignment with the MariaDB Corporation Engineering Policy, MariaDB ColumnStore 1.4.2 is provided for:
Red Hat Enterprise Linux 8
Red Hat Enterprise Linux 7
SUSE Linux Enterprise Server 15
SUSE Linux Enterprise Server 12