MariaDB Enterprise ColumnStore Data Loading with INSERT .. SELECT
This page is part of MariaDB's Documentation.
The parent of this page is: Data Import with MariaDB Enterprise ColumnStore
Topics on this page:
Overview
MariaDB Enterprise ColumnStore automatically translates INSERT INTO .. SELECT FROM ..
statements into bulk data loads. By default, it translates the statement into a bulk data load that uses cpimport.bin
, which is an internal wrapper around the cpimport
tool.
Intended Use Cases
You can load data using INSERT INTO .. SELECT FROM ..
in the following cases:
You are loading data into a ColumnStore table by querying one or more local tables.
Batch Insert Mode
MariaDB Enterprise ColumnStore enables batch insert mode by default.
When batch insert mode is enabled, MariaDB Enterprise ColumnStore has special handling for INSERT INTO .. SELECT FROM .. statements.
Enterprise ColumnStore uses the following rules:
If the statement is executed outside of a transaction, Enterprise ColumnStore loads the data using cpimport, which is a command-line utility that is designed to efficiently load data in bulk. Enterprise ColumnStore executes cpimport using a wrapper called
cpimport.bin
.If the statement is executed inside of a transaction, Enterprise ColumnStore loads the data using the DML interface, which is slower.
Batch insert mode can be disabled by setting the columnstore_OFF
. When batch insert mode is disabled, Enterprise ColumnStore executes the statements using the DML interface, which is slower.
Locking
MariaDB Enterprise ColumnStore requires a write metadata lock (MDL) on the table when a bulk data load is performed with cpimport.
When a bulk data load is running:
Read queries will not be blocked.
Write queries and concurrent bulk data loads on the same table will be blocked until the bulk data load operation is complete, and the write metadata lock on the table has been released.
The write metadata lock (MDL) can be monitored with the METADATA_
LOCK_ .INFO plugin
Import the Schema
Before data can be imported into the tables, the schema must be created.
Connect to the primary server using MariaDB Client:
$ mariadb --host 192.168.0.100 --port 5001 \ --user db_user --password \ --default-character-set=utf8
After the command is executed, it will prompt you for a password.
For each database that you are importing, create the database with the CREATE DATABASE statement:
CREATE DATABASE inventory;
For each table that you are importing, create the table with the CREATE TABLE statement:
CREATE TABLE inventory.products ( product_name varchar(11) NOT NULL DEFAULT '', supplier varchar(128) NOT NULL DEFAULT '', quantity varchar(128) NOT NULL DEFAULT '', unit_cost varchar(128) NOT NULL DEFAULT '' ) ENGINE=Columnstore DEFAULT CHARSET=utf8;
Note
To get the best performance from Enterprise ColumnStore, make sure to follow Enterprise ColumnStore's best practices for schema design.
Appends Data
When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read. Appending data reduces the I/O requirements of bulk data loads, so that larger data sets can be loaded very efficiently.
While the bulk load is in progress, the newly appended data is temporarily hidden from queries.
After the bulk load is complete, the newly appended data is visible to queries.
Sort the Query Results
When MariaDB Enterprise ColumnStore performs a bulk data load, it appends data to the table in the order in which the data is read.
The order of data can have a significant effect on performance with Enterprise ColumnStore. If your data is not already sorted, it can be helpful to sort the query results using an ORDER BY
clause.
For example:
# Perform import from other table
# and insert in sorted order
INSERT INTO inventory.orders
SELECT *
FROM innodb_inventory.orders
ORDER BY order_date;
For additional information, see "Load Ordered Data in Proper Order".
Confirm the Field Delimiter
Before importing a table's data into MariaDB Enterprise ColumnStore, confirm that the field delimiter is not present in the data.
The field delimiter is determined by the columnstore_7
, which corresponds to the BEL
character.
To use a different delimiter, you can set the field delimiter.
Set the Field Delimiter
When the data is passed to cpimport, each value is separated by a field delimiter. The field delimiter is determined by the columnstore_
By default, Enterprise ColumnStore sets the field delimiter to the ASCII value 7
, which corresponds to the BEL
character. In general, setting the field delimiter is only required if your data contains this value.
Set the field delimiter by setting the columnstore_
For example, if you want to use a comma (,
) as the field delimiter, you would set columnstore_44
:
# Set field delimiter
SET SESSION columnstore_import_for_batchinsert_delimiter=44;
# Perform import from other table
INSERT INTO inventory.products
SELECT *
FROM innodb_inventory.products;
Set the Quoting Style
When the data is passed to cpimport, each value is enclosed by a quote character. The quote character is determined by the columnstore_
By default, Enterprise ColumnStore sets the quote character to the ASCII value 17
, which corresponds to the DC1
character. In general, setting the quote character is only required if your data contains this value.
Set the quote character by setting the columnstore_
For example, if you want to use a double quote ("
) as the quote character, you would set columnstore_34
:
# Set quote character
SET SESSION columnstore_import_for_batchinsert_enclosed_by=34;
# Perform import from other table
INSERT INTO inventory.products
SELECT *
FROM innodb_inventory.products;