MariaDB Xpand Fast Backup and Restore

MariaDB Xpand supports BACKUP and RESTORE statements to allow you to back up and restore data on all Xpand tables in your deployment.

The system variables, system tables, and statements referenced on this page are only available on the Xpand nodes. When using the Xpand Storage Engine topology, you need to connect to an Xpand node to use these features.

Parallel Backup and Restore

Xpand implements Fast Backup and Restore as a binary backup mechanism that works at the row-level. Each node sends its data directly to the backup target in parallel, eliminating bottlenecks and allowing backup to scale with deployment size. Restore is handled similarly; the initiating node coordinates with other participating nodes in parallel to read from the dump file and restore replicas.

The structure of the dumped data consists of text files describing the schema and metadata about the backup. The dumped data also contains binary files of compacted row data and data consistency information. Modifying these files may make it impossible to restore from the backup.

Xpand uses a Cyclic Redundancy Check (CRC) during Backup and Restore. Calculating the CRC allows the Restore function to validate the data written during the Backup function. CDC adherence also requires that any failed backup or restore be restarted from the beginning.

Xpand's Fast Backup and Restore:

  • Is implemented as SQL commands that are executed from a MariaDB client.

  • Supports passive (versus active) mode FTP or SFTP

  • Supports password-based authentication for SFTP

  • Cannot be executed via stored procedures, functions, or triggers.

Note

Xpand recommends running BACKUP and RESTORE in a screen session or using tmux to avoid connection failures.

Xpand also recommends running both BACKUP and RESTORE during off-peak hours.

BACKUP

Backup runs as a single transaction and will pin BigC (Xpand's garbage collection) for the duration of the operation. This may require operational consideration if your deployment does not have enough free space for the undo log created from this long running transaction.

Syntax:

BACKUP identifier [, identifier]
[EXCLUDING excluded_identifier]
TO 'target' [COMPRESSED]

Both the identifier and excluded_identifier should refer to valid system objects. This can be a fully qualified name, or include wildcards (*) for the database, object, or both.

Additionally the excluded_identifier can specify a given table.

Xpand uses username and password authentication. The server path for FTP represents the relative path from the FTP root directory whereas the path for the SFTP server should be the absolute path from the Linux root directory. For example: "ftp://user:password@servername/relative_path/from_ftp_root" or "sftp://user:password@servername:absolute_path/from_Linux_root".

Note

Care should be taken when excluding objects from a backup. This can cause errors during RESTORE if another object depends on the excluded object.

For example, if a trigger or view refers to an excluded object that does not already exist in the restored environment, an error will occur during the RESTORE.

Triggers, views, and foreign keys that reference an excluded object will be backed up, unless they are specifically excluded.

Examples:

BACKUP db_name1.*, db_name2.tbl_name1, db_name2.tbl_name2 TO 'ftp://storage01/uploads/johndoe+kibbles.jun01';

BACKUP *.* EXCLUDING db_name2.tbl_name1 TO 'ftp://storage01/uploads/johndoe.kibbles.jun01';

BACKUP test TO 'sftp://plato:fy569gk@piquet.colo.sproutsys.com:22/tmp/tc5326/tc5326.bkup';

SFTP

To use an SFTP for Backup and Restore, specify the absolute path of the file location (versus relative path for FTP), and provide a username and password for authentication. For example: "sftp://user:password@servername:absolute_path/from_Linux_root"

Compression

To create a compressed backup, specify the COMPRESSED option. This functionality reduces bandwidth requirements for backup to remote locations and reduces the amount of storage utilized, but adds additional latency to complete a backup.

Tuning BACKUP Performance

You may want to tune BACKUP to be slower or faster:

SET GLOBAL backup_backup_concurrency = desired_value;

The global variable backup_backup_concurrency determines the number of tables that can be backed up simultaneously. With the default value of 1, the backup process is limited to one table at a time.

Getting Information on Backups

The following queries show the status of all currently running backups and restores.

  • Displaying all Backups on a Source

    SELECT * FROM system.backups
    WHERE source="ftp://storage01/uploads";
    

    Example output:

    *************************** 1. row **************************
    source: ftp://storage01/uploads
    backup: 1_52T_large_dogfood_dbx2
    cluster_name: cl5d97786267d0d4cb
    version: 5.0.45-clustrix-7.5.2
    status: ERROR
    start_time: 2016-06-15 01:43:36
    completed_time: 2016-06-15 03:19:31
    size: 0
    
  • Displaying the Status of Backup/Restore

    Run the following query to view information for a backup in progress

    SELECT * FROM system.backup_status;
    

    Example output:

    *************************** 1. row **************************
    nodeid: 2
    id: 5756997494884148226
    type: BACKUP
    url: ftp://storage01/uploads/1_52T_large_dogfood_dbx2_2
    objs: (("test" . "")("clustrix_ui" . "")("bugstest_01" ."")("jtunes_01" . "")("longstats_01" . "")("npktest_01" ."")("perfstats_01" . "")("run_resources_01" . "")("statd_01" ."")("bugstest_02" . "")("jtunes_02" . "")("longstats_02" ."")("npktest_02" . "")("perfstats_02" . "")("run_resources_02" ."")("statd_02" . "*"))
    start_time: 2016-06-22 22:50:32
    expected_bytes: 793977192448
    replicas: 1070
    rows: 49589227
    bytes: 23991322310
    
  • Displaying all Backed Up Tables on a Source

    SELECT * FROM system.backup_tables
    WHERE source="ftp://storage01/uploads" limit 2;
    
+-------------------------+--------------------------+-------------+------------+------+
| source                  | backup                   | db          |table       | size |
+-------------------------+--------------------------+-------------+------------+------+
| ftp://storage01/uploads | 1_52T_large_dogfood_dbx2 | bugstest_01 |attach_data | 0    |
+-------------------------+--------------------------+-------------+------------+------+
| ftp://storage01/uploads | 1_52T_large_dogfood_dbx2 | bugstest_01 |attachments | 0    |
+-------------------------+--------------------------+-------------+------------+------+

RESTORE

Use the following syntax to restore objects previously backed up with Fast Backup.

RESTORE identifier [AS identifier] [, identifier [AS identifier]]
[EXCLUDING excluded_identifier]
FROM target [REPLICAS=N]

Both the identifier and excluded_identifier should refer to valid system objects. This can be a fully qualified name, or include wildcards (*) for the database, object, or both.

Additionally, the excluded_identifier can specify a given table.

Tables may be renamed during the restore process by providing a table alias AS identifier.

The server path for FTP represents the relative path from the FTP root directory whereas the path for the SFTP server should be the absolute path from the Linux root directory. For example: "ftp://user:password@servername/relative_path/from_ftp_root" or "sftp://user:password@servername:absolute_path/from_Linux_root".

Note

Care should be taken when restoring from a backup that used the EXCLUDING option. This can cause errors during restoration if another object depends on the excluded object(s).

For example, if a trigger or view refers to an object that does not already exist in the restored environment, an error will occur.

Triggers, views, and foreign keys that reference an object excluded from the backup will be successfully restored only if the object exists in the restored environment.

Examples:

RESTORE db_name.tbl_name AS db_name_backup.tbl_name
FROM 'ftp://username@server.com/backups/backupfolders/backup_file';

RESTORE tbl_name, tbl_name2, tbl_name3
FROM 'ftp://username@ftp://server.com/backups/backupfolders/backup_file';

RESTORE ehms FROM 'sftp://root@kartma6127/root/ehmssftp';

Replicas

The number of replicas for a table or index is recorded within the backups. Normally when RESTORE is run, tables and indexes are restored with the same number of replicas as the original. Using the REPLICAS option allows you to explicitly specify the number of replicas that should be created as part of the RESTORE operation.

For example, if a table had three replicas when backed up, the following example would create the table with only two replicas.

RESTORE * FROM 'ftp://storage01/uploads/johndoe.kibbles.jun01'
REPLICAS = 2;

Tuning RESTORE Performance

You may want to tune RESTORE to be slower or faster.

SET GLOBAL backup_restore_concurrency = desired_value;

The global variable backup_restore_concurrency determines the number of slices that can be restored simultaneously to each node.

Global Variables

The following global variables impact Fast Backup and Restore. The defaults provided are generally acceptable. These variables are not available by session.

System Variable

Description

Default

backup_backup_concurrency

The number of tables that can be backed up simultaneously.

1

backup_restore_concurrency

The maximum number of slices restored concurrently on each node.

16

backup_write_compression_level

Compression level from 1 (fastest) to 9 (best compression)

6

MariaDB Dump

In addition to the BACKUP and RESTORE statements available through the Xpand service, you can also back up your Xpand tables through the Xpand service socket using the mariadb-dump command.

This would only operate on the Xpand tables and data, ignoring tables using other storage engines.

Backing Up Data

To back up Xpand using mariadb-dump, run the following command:

$ mariadb-dump --user user --password \
      --socket /data/clustrix/mysql.sock \
      --single-transaction \
      --master-data=2 \
      --all-databases \
      > backup.sql
  • The --single-transaction flag ensures a consistent snapshot of the database by querying for all of the data in a single transaction and permits continued access to the Xpand while the dump is being created.

  • The --master-data=2 flag inserts a CHANGE MASTER TO statement in a comment near the top of the dump file, indicating the location in the binary log where a Replica Server must start to be consistent with the dump. If no binlog is being created, omit the --master-data flag.

Restoring Data

To restore Xpand, use the MariaDB Client:

$ mariadb --user user --password \
      --socket /data/clustrix/mysql.sock \
      < backup.sql

The client connects to the Xpand service and executes each SQL statement in the backup file to restore the database and the Xpand tables.