Step 5: Bulk Import of Data

Overview

This page details step 5 of a 5-step procedure for deploying Single-Node Enterprise ColumnStore 5 with Object storage.

This step bulk imports data to Enterprise ColumnStore.

This page details interactive commands. Alternatively, the same operations can be performed using automation.

Import the Schema

Before data can be imported into the tables, create a matching schema.

On the primary server, create the schema:

  1. For each database that you are importing, create the database with the CREATE DATABASE statement:

    CREATE DATABASE inventory;
    
  2. For each table that you are importing, create the table with the CREATE TABLE statement:

    CREATE TABLE inventory.products (
       product_name varchar(11) NOT NULL DEFAULT '',
       supplier varchar(128) NOT NULL DEFAULT '',
       quantity varchar(128) NOT NULL DEFAULT '',
       unit_cost varchar(128) NOT NULL DEFAULT ''
    ) ENGINE=Columnstore DEFAULT CHARSET=utf8;
    

Import the Data

Enterprise ColumnStore supports multiple methods to import data into ColumnStore tables.

cpimport

MariaDB Enterprise ColumnStore includes cpimport, which is a command-line utility designed to efficiently load data in bulk.

To import your data from a TSV (tab-separated values) file, on the primary server run cpimport:

$ sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
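
Before running a bulk load, it can help to confirm that every row in the file has the same number of fields as the target table. The following is a minimal sketch, not part of cpimport itself: count_bad_rows is a hypothetical helper name, and the expected field count of 4 is taken from the inventory.products table defined above.

```shell
#!/bin/sh
# Hypothetical helper (not part of ColumnStore): count the rows in a
# tab-separated file whose number of fields differs from the expected one.
# Usage: count_bad_rows FILE EXPECTED_FIELDS
count_bad_rows() {
    awk -F '\t' -v n="$2" 'NF != n { bad++ } END { print bad + 0 }' "$1"
}

# Assumed usage: import only when every row has the 4 fields
# that inventory.products expects.
# if [ "$(count_bad_rows /tmp/inventory-products.tsv 4)" -eq 0 ]; then
#     sudo cpimport -s '\t' inventory products /tmp/inventory-products.tsv
# fi
```

The helper only counts fields. It does not check data types or lengths, so cpimport may still reject or truncate values that do not fit the column definitions.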

LOAD DATA INFILE

When data is loaded with the LOAD DATA INFILE statement, MariaDB Enterprise ColumnStore loads the data using cpimport.

To import your data from a TSV (tab-separated values) file, on the primary server use the LOAD DATA INFILE statement:

LOAD DATA INFILE '/tmp/inventory-products.tsv'
INTO TABLE inventory.products;
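
After an import, one possible sanity check is to compare the number of rows in the source file with the number of rows in the table. The sketch below makes several assumptions: the file has no header row and ends with a newline, the helper names file_row_count and compare_counts are hypothetical, and the commented mariadb call assumes that the local root user can connect without a password prompt.

```shell
#!/bin/sh
# Hypothetical helper: count the newline-terminated lines in the source file.
file_row_count() {
    wc -l < "$1" | tr -d '[:space:]'
}

# Hypothetical helper: compare the file count with the table count.
compare_counts() {
    if [ "$1" -eq "$2" ]; then
        echo "match"
    else
        echo "mismatch: file=$1 table=$2"
    fi
}

# Assumed usage on the primary server:
# table_rows=$(sudo mariadb --batch --skip-column-names \
#     --execute="SELECT COUNT(*) FROM inventory.products")
# compare_counts "$(file_row_count /tmp/inventory-products.tsv)" "$table_rows"
```

The comparison is only meaningful when the table was empty before the import, because COUNT(*) includes any rows that were already present.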

Import from Remote Database

MariaDB Enterprise ColumnStore can also import data directly from a remote database. A simple method is to query the table using the SELECT statement, and then pipe the results into cpimport.

To import your data from a remote MariaDB database:

$ mariadb --quick \
   --skip-column-names \
   --execute="SELECT * FROM inventory.products" \
   | cpimport -s '\t' inventory products
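
When the source database runs on another host, the client also needs connection options. The following is a sketch under stated assumptions: remote_export is a hypothetical wrapper, the host and user names in the usage note are placeholders, and --password without a value makes the client prompt for the password. Setting DRY_RUN=1 prints the client command instead of running it, which is useful for checking the options before a long export.

```shell
#!/bin/sh
# Hypothetical wrapper: export one table from a remote server as
# tab-separated rows on standard output.
# Usage: remote_export HOST USER DATABASE TABLE
remote_export() {
    # Build the client command line from the four arguments.
    set -- mariadb --host="$1" --user="$2" --password \
        --quick --skip-column-names \
        --execute="SELECT * FROM $3.$4"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        echo "$*"
    else
        "$@"
    fi
}
```

Assumed usage, with placeholder host and user names:

    $ remote_export db1.example.com import_user inventory products \
       | cpimport -s '\t' inventory products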

Next Step

Navigation in the Single-Node Enterprise ColumnStore topology with Object storage deployment procedure:

  • This page was step 5 of 5.

  • This procedure is complete.