MariaDB Xpand Data Loading

Overview

MariaDB Xpand installations include the clustrix_import command, which bulk loads data into Xpand very efficiently by taking advantage of Xpand's parallel processing capabilities to efficiently bulk load data.

This documentation applies to both Xpand topologies.

Backing Up Data

MariaDB Xpand loads data into Xpand tables using SQL files. SQL files can be produced from existing MariaDB Enterprise Server databases using the mariadb-dump command:

$ mariadb-dump --user root --password \
      --all-databases \
      --ignore-database mysql \
      --single-transaction \
      --events \
      --routines \
      > data.sql

The command redirects the output into a data.sql file, which can be used to import into MariaDB Xpand.

Note

The mariadb-dump command ignores the mysql database. Xpand does not use this database, instead storing internal system information using the system database.

Loading Data

The recommended data load method for MariaDB Xpand is the clustrix_import utility. The clustrix_import utility is included in the MariaDB Xpand installation in the /opt/clustrix/bin directory. The clustrix_import command loads data from an SQL file. When using both Xpand topologies, clustrix_import must be executed on one the Xpand nodes.

InnoDB tables in the SQL file are automatically converted to Xpand tables during the import.

To import data into Xpand Nodes:

$ clustrix_import \
      --user xpandm --password xpandm_passwd \
      --import-file data.sql

At a high level, clustrix_import loads data with the following procedure:

  1. Connects to the local Xpand node

  2. Retrieves a list of available Xpand nodes

  3. Reads the data from the SQL file

  4. Pre-slices the data

  5. Writes multiple replicas of each slice to multiple Xpand nodes in parallel, which ensures fault tolerance

Xpand Loading Data with clustrix_import

Special Considerations

There are some special considerations in some cases.

Xpand Storage Engine Topology

When loading data into an Xpand Storage Engine topology, clustrix_import only interacts with the Xpand nodes. It does not create any database objects on the MariaDB Enterprise Server (ES) nodes.

Several types of objects need to be created directly on the ES nodes, such as:

  • Databases

  • Tables that use other storage engines

  • Views

  • User accounts and privileges

  • Stored procedures and functions

  • Events

ES nodes also require a replication solution, such as MariaDB Replication, to synchronize these database objects. Without replication, these database objects need to be created separately on each ES node.

Once the database is created on an ES node, the ES node can access any Xpand tables in that database on the Xpand nodes:

CREATE DATABSE accounts;

SHOW TABLES IN accounts:
+--------------------+
| Tables_in_accounts |
+--------------------+
| us_accounts        |
| eu_accounts        |
| me_accounts        |
+--------------------+

For additional information, see "MariaDB Xpand Topologies".