Introduction to State Snapshot Transfers (SSTs)


In a State Snapshot Transfer (SST), the cluster provisions nodes by transferring a full data copy from one node to another. When a new node joins the cluster, the new node initiates a State Snapshot Transfer to synchronize its data with a node that is already part of the cluster.

Types of SSTs

There are two conceptually different ways to transfer a state from one MariaDB server to another:

  1. Logical

The only SST method of this type is the mysqldump SST method, which actually uses the mysqldump utility to get a logical dump of the donor. This SST method requires the joiner node to be fully initialized and ready to accept connections before the transfer. This method is, by definition, blocking, in that it blocks the donor node from modifying its own state for the duration of the transfer. It is also the slowest of all, and that might be an issue in a cluster with a lot of load.

  2. Physical

SST methods of this type physically copy the data files from the donor node to the joiner node. This requires that the joiner node is initialized after the transfer. The mariabackup SST method and a few other SST methods fall into this category. These SST methods are much faster than the mysqldump SST method, but they have certain limitations. For example, they can be used only on server startup and the joiner node must be configured very similarly to the donor node (e.g. innodb_file_per_table should be the same and so on). Some of the SST methods in this category are non-blocking on the donor node, meaning that the donor node is still able to process queries while donating the SST (e.g. the mariabackup SST method is non-blocking).

SST Methods

SST methods are supported via a scriptable interface. New SST methods could potentially be developed by creating new SST scripts. The scripts usually have names of the form wsrep_sst_<method> where <method> is one of the SST methods listed below.
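The naming convention can be illustrated with a small shell sketch. The helper names sst_script_name and sst_script_available are hypothetical and not part of MariaDB, and the check assumes the SST scripts are installed in a directory on PATH:

```shell
#!/bin/sh
# Hypothetical helper, not part of MariaDB: map an SST method name to the
# script name that follows the wsrep_sst_<method> convention.
sst_script_name() {
    printf 'wsrep_sst_%s\n' "$1"
}

# Succeed if the script for the given method can be found on PATH.
# Assumption: the SST scripts are installed in a directory on PATH.
sst_script_available() {
    command -v "$(sst_script_name "$1")" >/dev/null 2>&1
}

for method in mariabackup rsync mysqldump; do
    if sst_script_available "$method"; then
        echo "$method: $(sst_script_name "$method") found"
    else
        echo "$method: $(sst_script_name "$method") not found on PATH"
    fi
done
```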

You can choose your SST method by setting the wsrep_sst_method system variable. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor. For example:

SET GLOBAL wsrep_sst_method='mariabackup';

It can also be set in a server option group in an option file prior to starting up a node:

[mariadb]
...
wsrep_sst_method = mariabackup

For an SST to work properly, the donor and joiner node must use the same SST method. Therefore, it is recommended to set wsrep_sst_method to the same value on all nodes, since any node will usually be a donor or joiner node at some point.

MariaDB Galera Cluster comes with the following built-in SST methods:

mariabackup

This SST method uses the Mariabackup utility for performing SSTs. It is one of the two non-blocking methods. This is the recommended SST method if you require the ability to run queries on the donor node during the SST. Note that if you use the mariabackup SST method, you also need to have socat installed on the server, because it is needed to stream the backup from the donor to the joiner. This is a limitation inherited from the xtrabackup-v2 SST method.

This SST method supports GTID.

This SST method supports Data at Rest Encryption.

This SST method is available from MariaDB 10.1.26 and MariaDB 10.2.10.

With this SST method, it is impossible to upgrade the cluster between some major versions; see MDEV-27437.

See mariabackup SST method for more information.

rsync / rsync_wan

rsync is the default method. This method uses the rsync utility to create a snapshot of the donor node. rsync should be available by default on all modern Linux distributions. The donor node is blocked with a read lock during the SST. This is the fastest SST method, especially for large datasets since it copies binary data. Because of that, this is the recommended SST method if you do not need to allow the donor node to execute queries during the SST.

The rsync method runs rsync in --whole-file mode, on the assumption that nodes are connected by fast local network links, where the default delta transfer mode would consume more processing time than it saves in data transfer bandwidth. For a distributed cluster with slow links between nodes, the rsync_wan method runs rsync in the default delta transfer mode, which may reduce data transfer time substantially when an older datadir state is already present on the joiner node. Both methods are implemented by the same script: wsrep_sst_rsync_wan is just a symlink to the wsrep_sst_rsync script, and the rsync mode to use is determined by the name the script was called by.
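Selecting behavior from the name a script was called by is a common shell technique. The following is a minimal sketch of that idea only. It is not the actual wsrep_sst_rsync source, and the function name rsync_mode_for and the /usr/bin paths are hypothetical:

```shell
#!/bin/sh
# Illustrative only: this is NOT the actual wsrep_sst_rsync source.
# Print the rsync transfer-mode option implied by the script name.
rsync_mode_for() {
    case "$(basename "$1")" in
        wsrep_sst_rsync_wan) printf '\n' ;;                   # default delta transfer: no extra option
        wsrep_sst_rsync)     printf '%s\n' '--whole-file' ;;  # copy whole files
        *) echo "unknown script name: $1" >&2; return 1 ;;
    esac
}

# In a real script the argument would be "$0"; the paths here are examples.
echo "rsync:     '$(rsync_mode_for /usr/bin/wsrep_sst_rsync)'"
echo "rsync_wan: '$(rsync_mode_for /usr/bin/wsrep_sst_rsync_wan)'"
```

Because the symlink and its target share one file, a fix to the script applies to both methods.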

This SST method supports GTID.

This SST method supports Data at Rest Encryption.

Use of this SST method could result in data corruption when using innodb_use_native_aio (the default) if the donor is older than MariaDB 10.3.35, MariaDB 10.4.25, MariaDB 10.5.16, MariaDB 10.6.8, or MariaDB 10.7.4; see MDEV-25975. Starting with those donor versions, wsrep_sst_method=rsync is a reliable way to upgrade the cluster to a newer major version.

The rsync SST method does not support tables created with the DATA DIRECTORY or INDEX DIRECTORY clause. Use the mariabackup SST method as an alternative to support this feature.

As of MariaDB 10.1.36, MariaDB 10.2.18, and MariaDB 10.3.10, stunnel can be used to encrypt data over the wire. Be sure to have stunnel installed. You will also need to generate certificates and keys. See the stunnel documentation for information on how to do that. Once you have the keys, you will need to add the tkey and tcert options to the [sst] option group in your MariaDB configuration file, such as:

[sst]
tkey = /etc/my.cnf.d/certificates/client-key.pem
tcert = /etc/my.cnf.d/certificates/client-cert.pem

You also need to run the certificate directory through openssl rehash.

mysqldump

This SST method runs mysqldump on the donor node and pipes the output to the mysql client connected to the joiner node. The mysqldump SST method needs a username/password pair set in the wsrep_sst_auth variable in order to get the dump. The donor node is blocked with a read lock during the SST. This is the slowest SST method.

This SST method supports GTID.

This SST method supports Data at Rest Encryption.
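The external utilities named in the mariabackup, rsync, and mysqldump descriptions above can be checked before a node is started. The following is a rough sketch rather than an official tool. The function names are hypothetical, and the binary names (mariabackup, socat, rsync, mysqldump, mysql) are assumptions that may differ between MariaDB versions and distributions:

```shell
#!/bin/sh
# Hypothetical preflight helper. The binary names are assumptions based on
# the utilities named in the SST method descriptions.
sst_required_tools() {
    case "$1" in
        mariabackup)     echo "mariabackup socat" ;;
        rsync|rsync_wan) echo "rsync" ;;
        mysqldump)       echo "mysqldump mysql" ;;
        *) echo "unknown SST method: $1" >&2; return 1 ;;
    esac
}

# Print each required tool that is missing from PATH.
# Prints nothing when every tool is present.
sst_missing_tools() {
    tools=$(sst_required_tools "$1") || return 1
    for tool in $tools; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

echo "missing for mariabackup: $(sst_missing_tools mariabackup | tr '\n' ' ')"
```

Running the check on both the donor and the joiner is sensible, since any node may take either role.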

xtrabackup-v2

In MariaDB 10.1 and later, Mariabackup is the recommended backup method to use instead of Percona XtraBackup.

In MariaDB 10.3, Percona XtraBackup is not supported. See Percona XtraBackup Overview: Compatibility with MariaDB for more information.

In MariaDB 10.2 and MariaDB 10.1, Percona XtraBackup is only partially supported. See Percona XtraBackup Overview: Compatibility with MariaDB for more information.

This SST method uses the Percona XtraBackup utility for performing SSTs. It is one of the two non-blocking methods. Note that if you use the xtrabackup-v2 SST method, you also need to have socat installed on the server. Since Percona XtraBackup is a third-party product, this SST method requires an additional installation and some additional configuration. Please refer to Percona's xtrabackup SST documentation for information from the vendor.

This SST method does not support GTID.

This SST method does not support Data at Rest Encryption.

This SST method is available from MariaDB Galera Cluster 5.5.37 and MariaDB Galera Cluster 10.0.10.

See xtrabackup-v2 SST method for more information.

xtrabackup

In MariaDB 10.1 and later, Mariabackup is the recommended backup method to use instead of Percona XtraBackup.

In MariaDB 10.3, Percona XtraBackup is not supported. See Percona XtraBackup Overview: Compatibility with MariaDB for more information.

In MariaDB 10.2 and MariaDB 10.1, Percona XtraBackup is only partially supported. See Percona XtraBackup Overview: Compatibility with MariaDB for more information.

This is an older SST method that uses the Percona XtraBackup utility for performing SSTs. Starting from MariaDB 5.5.33, the xtrabackup-v2 SST method should be used instead of the xtrabackup SST method.

This SST method does not support GTID.

This SST method does not support Data at Rest Encryption.

Authentication

All SST methods except rsync require authentication via username and password. You can tell the client what username and password to use by setting the wsrep_sst_auth system variable. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor. For example:

SET GLOBAL wsrep_sst_auth = 'mariabackup:password';

It can also be set in a server option group in an option file prior to starting up a node:

[mariadb]
...
wsrep_sst_auth = mariabackup:password

Some authentication plugins do not require a password. For example, the unix_socket and gssapi authentication plugins do not require a password. If you are using a user account that does not require a password in order to log in, then you can just leave the password component of wsrep_sst_auth empty. For example:

[mariadb]
...
wsrep_sst_auth = mariabackup:

See the relevant description or page for each SST method to find out what privileges need to be granted to the user and whether the privileges are needed on the donor node or joiner node for that method.
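The user:password form of wsrep_sst_auth, including the empty-password case, can be illustrated with plain shell parameter expansion. This is a sketch and not the code the SST scripts use. The function names are hypothetical, and splitting on the first colon (so that a password may itself contain colons) is an assumption:

```shell
#!/bin/sh
# Illustrative parsing of a wsrep_sst_auth value of the form user:password.
# Assumption: the value is split at the FIRST colon.
sst_auth_user() {
    printf '%s\n' "${1%%:*}"              # everything before the first colon
}

sst_auth_password() {
    case "$1" in
        *:*) printf '%s\n' "${1#*:}" ;;   # everything after the first colon
        *)   printf '\n' ;;               # no colon: treat the password as empty
    esac
}

auth='mariabackup:'
echo "user: $(sst_auth_user "$auth")"
if [ -z "$(sst_auth_password "$auth")" ]; then
    echo "password: (empty)"
fi
```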

SSTs and Systemd

MariaDB's systemd unit file has a default startup timeout of about 90 seconds on most systems. If an SST takes longer than this default startup timeout on a joiner node, then systemd will assume that mysqld has failed to start up, which causes systemd to kill the mysqld process on the joiner node. To work around this, you can reconfigure the MariaDB systemd unit to have an infinite timeout, such as by executing one of the following commands:

If you are using systemd 228 or older, then you can execute the following to set an infinite timeout:

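# Create the drop-in directory if it does not exist yet
sudo mkdir -p /etc/systemd/system/mariadb.service.d
sudo tee /etc/systemd/system/mariadb.service.d/timeoutstartsec.conf <<EOF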
[Service]

TimeoutStartSec=0
EOF
sudo systemctl daemon-reload

Systemd 229 added the infinity option, so if you are using systemd 229 or later, then you can execute the following to set an infinite timeout:

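# Create the drop-in directory if it does not exist yet
sudo mkdir -p /etc/systemd/system/mariadb.service.d
sudo tee /etc/systemd/system/mariadb.service.d/timeoutstartsec.conf <<EOF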
[Service]

TimeoutStartSec=infinity
EOF
sudo systemctl daemon-reload

See Configuring the Systemd Service Timeout for more details.

Note that systemd 236 added the EXTEND_TIMEOUT_USEC notification message (sent by a service through sd_notify), which allows services to extend the startup timeout during long-running processes. Starting with MariaDB 10.1.35, MariaDB 10.2.17, and MariaDB 10.3.8, on systems with systemd versions that support it, MariaDB uses this feature to extend the startup timeout during long SSTs. Therefore, if you are using systemd 236 or later, then you should not need to manually override TimeoutStartSec, even if your SSTs run for longer than the configured value. See MDEV-15607 for more information.
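The version rules in this section can be summarized in a short shell sketch. The function name timeout_override_for is hypothetical, and the "none" result assumes a MariaDB release that extends the timeout itself, as described above:

```shell
#!/bin/sh
# Sketch of the version rules in this section. Argument: systemd version number.
# Prints the TimeoutStartSec value to configure, or "none" when no override
# should be needed (systemd 236 or later, assuming a MariaDB release that
# extends the startup timeout itself).
timeout_override_for() {
    if [ "$1" -ge 236 ]; then
        echo "none"
    elif [ "$1" -ge 229 ]; then
        echo "infinity"
    else
        echo "0"
    fi
}

# On a live system the version could be read with:
#   systemctl --version | awk 'NR==1 {print $2}'
for v in 228 229 236; do
    echo "systemd $v -> TimeoutStartSec override: $(timeout_override_for "$v")"
done
```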

SST Failure

An SST failure generally renders the joiner node unusable. Therefore, when an SST failure is detected, the joiner node will abort.

Restarting a node after a mysqldump SST failure may require manual restoration of the administrative tables.

SSTs and Data at Rest Encryption

Look at the description of each SST method to determine which methods support Data at Rest Encryption.

For logical SST methods like mysqldump, each node should be able to have different encryption keys. For physical SST methods, all nodes need to have the same encryption keys, since the donor node will copy encrypted data files to the joiner node, and the joiner node will need to be able to decrypt them.

Minimal Cluster Size

In order to avoid a split-brain condition, the minimum recommended number of nodes in a cluster is 3.

When using an SST method that blocks the donor, there is yet another reason to require a minimum of 3 nodes. In a 3-node cluster, if one node is acting as an SST joiner and one other node is acting as an SST donor, then there is still one more node to continue executing queries.
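The arithmetic behind this can be written out as a small sketch. The function name is hypothetical, and it assumes exactly one joiner and one blocked donor at a time:

```shell
#!/bin/sh
# Number of nodes still able to execute queries while one node is joining
# and one donor is blocked. Assumption: a single SST is running at a time.
nodes_serving_during_sst() {
    remaining=$(( $1 - 2 ))
    if [ "$remaining" -lt 0 ]; then
        remaining=0
    fi
    echo "$remaining"
}

for n in 2 3 5; do
    echo "cluster of $n nodes: $(nodes_serving_during_sst "$n") node(s) still serving queries"
done
```

With only two nodes, no node is left to serve queries during a blocking SST.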

Manual SSTs

In some cases, if Galera Cluster's automatic SSTs repeatedly fail, then it can be helpful to perform a "manual SST". See the following pages on how to do that:

Known Issues

mysqld_multi

SST scripts can't currently read the [mysqldN] option groups in option files that are read by instances managed by mysqld_multi.

See MDEV-18863 for more information.
