Bulk Data Loading


cpimport is a high-speed bulk load utility that imports data into ColumnStore tables. It accepts as input any flat file in which a delimiter separates the fields (that is, the columns of a table). The default delimiter is the pipe (‘|’) character, but other delimiters, such as commas, may be used as well. cpimport performs the following operations when importing data into a MariaDB ColumnStore database:

  • Data is read from specified flat files
  • Data is transformed to fit ColumnStore’s column-oriented storage design
  • Redundant data is tokenized and logically compressed
  • Data is written to disk
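
As an illustration of the input format described above, the following minimal Python sketch renders rows as a pipe-delimited flat file, one row per line. The helper name to_flat_file and the sample rows are hypothetical and are not part of cpimport. The sketch assumes that no field contains the delimiter or a newline, and it rejects such fields rather than trying to quote them.

```python
import io


def to_flat_file(rows, delimiter="|"):
    """Render rows as delimiter-separated text, one row per line."""
    out = io.StringIO()
    for row in rows:
        fields = [str(value) for value in row]
        for field in fields:
            # A delimiter or newline inside a field would shift the columns,
            # so this sketch refuses the row instead of guessing how to escape it.
            if delimiter in field or "\n" in field:
                raise ValueError(f"field {field!r} contains the delimiter or a newline")
        out.write(delimiter.join(fields) + "\n")
    return out.getvalue()


# Hypothetical sample data: (id, name, order_date)
rows = [(1, "widget", "2019-01-15"), (2, "gadget", "2019-02-03")]
print(to_flat_file(rows), end="")
```

Passing delimiter="," produces a comma-separated file instead; the same delimiter then has to be given to cpimport when the file is loaded.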

The two most common ways to use cpimport are: 1) from the User Module (UM), where cpimport distributes rows to all Performance Modules (PMs); and 2) from a PM, where cpimport loads the imported rows only on the PM from which it was invoked.

There are two primary steps to using the cpimport utility:

  1. Optionally create a job file that is used to load data from a flat file into multiple tables
  2. Run the cpimport utility to perform the data import
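
For the simplest case, where step 1 is skipped and a single table is loaded, the invocation can be sketched as an argument list built in Python. This is only a sketch under stated assumptions: it assumes the positional form "cpimport dbName tblName loadFile" and the -s option for a non-default delimiter, so check the utility's own help output on your installation before relying on it. The database, table and file names are hypothetical, and the job-file route is not shown.

```python
import shlex


def cpimport_command(database, table, load_file, delimiter="|"):
    """Build (but do not run) an argument list for a single-table load."""
    # Assumed positional form: cpimport dbName tblName loadFile
    args = ["cpimport", database, table, load_file]
    if delimiter != "|":
        # Assumed option: -s sets the column delimiter when it is not the pipe.
        args += ["-s", delimiter]
    return args


# Hypothetical names, for illustration only.
cmd = cpimport_command("mydb", "orders", "/tmp/orders.csv", delimiter=",")
print(shlex.join(cmd))  # shell-quoted form, suitable for logging or review
```

Keeping the command as a list rather than a single string avoids shell-quoting problems with delimiters such as ‘|’ if the list is later passed to subprocess.run.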

Note:

  • Bulk loads append to a table, so existing data can still be read and remains unaffected during the load.
  • Bulk loads do not write their data operations to the transaction log; they are not transactional in nature but are considered an atomic operation at this time. Information markers, however, are placed in the transaction log so the DBA is aware that a bulk operation did occur.
  • Upon completion of the load operation, a high water mark in each column file is moved in an atomic operation, after which any subsequent queries can read the newly loaded data. This append operation provides consistent reads without incurring the overhead of logging the data.
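
The high water mark behaviour in the notes above can be pictured with a toy model. The Python sketch below is not ColumnStore's implementation; the class ColumnFile and its methods are hypothetical and exist only to show why readers see either none or all of a bulk load. Values are appended past the mark, and moving the mark in one step publishes them.

```python
class ColumnFile:
    """Toy stand-in for a column file with a high water mark (HWM)."""

    def __init__(self):
        self._values = []  # everything physically written so far
        self._hwm = 0      # count of values visible to readers

    def read(self):
        # Readers only ever see values below the high water mark.
        return self._values[:self._hwm]

    def bulk_append(self, new_values):
        # New values land past the HWM, so current readers are unaffected.
        self._values.extend(new_values)

    def commit(self):
        # A single assignment moves the HWM and publishes the whole load.
        self._hwm = len(self._values)


col = ColumnFile()
col.bulk_append([10, 20])
col.commit()

col.bulk_append([30, 40, 50])  # load in progress
before = col.read()            # still only the previously committed values
col.commit()                   # load completes, HWM moves
after = col.read()             # now includes the newly loaded values
print(before, after)
```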
