Getting Started with MariaDB Galera Cluster

MariaDB Galera Cluster is MariaDB plus the MySQL-wsrep patch from Codership. It is currently available on Unix platforms only.

In MariaDB 5.5 and MariaDB 10.0, MariaDB Galera Server is a separate package installed instead of the standard MariaDB Server package. Since MariaDB 10.1, the MariaDB Server and MariaDB Galera Server packages have been combined and Galera packages and their dependencies get installed automatically when installing MariaDB. The Galera parts remain dormant until configured, like a plugin or storage engine.

The most recent release of MariaDB 10.2 is MariaDB 10.2.44 Stable (GA).

The most recent release of MariaDB 10.1 is MariaDB 10.1.48 Stable (GA).

The current version of the Galera wsrep provider library is 26.4.18 for Galera 4.
For convenience, packages containing this library are included in the MariaDB YUM and APT repositories.

Currently MariaDB Galera Cluster only supports the InnoDB/XtraDB storage engine.

Downloading and installing MariaDB Galera Cluster

There are two things you need:

  1. MariaDB Galera Cluster
  2. The Galera wsrep provider library

Download and install MariaDB Galera Cluster binaries/packages as you would regular MariaDB Server, including when you install with Yum or Apt. In 10.1 they are part of the regular download; in 10.0 and 5.5 they are separate downloads. The primary difference for 10.0 and 5.5 is that you install the MariaDB Galera Server package instead of the MariaDB Server package, and also install the Galera package (the Galera wsrep provider). The Galera package is included in the MariaDB repositories to make installation easier.

If you choose the package manager route, start by configuring your package manager using the Repository Configuration Tool. Then use Yum or Apt (depending on which package manager you use) to install MariaDB Galera Server and Galera. For example, on Ubuntu, you would do the following after configuring Apt:

sudo apt-get update

# For server versions prior to 10.1:
sudo apt-get install mariadb-galera-server

# For 10.1 and later:
sudo apt-get install mariadb-server

Full instructions for installing MariaDB Galera Cluster with yum and apt-get are available in the MariaDB documentation for each package manager.

If MariaDB is already installed on the server, the package manager will uninstall the appropriate packages before installing the MariaDB Galera Cluster packages.

MariaDB Galera Cluster starting with 10.0.24

To unify the location of the libgalera_smm.so library between the bintar, rpm, and deb packages, the library is now found at lib/galera/libgalera_smm.so in the bintar packages, with a symlink in the lib directory that points to it.

One note if you decide to compile from source: when compiling MariaDB Galera Cluster from the source tarball, pass the CMake options -DWITH_WSREP=ON and -DWITH_INNODB_DISALLOW_WRITES=1.

A great resource for Galera users is Codership on Google Groups (codership-team 'at' googlegroups (dot) com). If you use Galera, it is recommended that you subscribe.

Prerequisites

Swap size requirements

During normal operation a MariaDB Galera node does not consume much more memory than a regular MariaDB server. Additional memory is consumed for the certification index and uncommitted writesets, but normally this should not be noticeable in a typical application. There is one exception though:

  1. Writeset caching during state transfer. When a node is receiving a state transfer, it cannot process and apply incoming writesets because it has no state to apply them to yet. Depending on the state transfer mechanism (e.g. mysqldump), the node that sends the state transfer may not be able to apply writesets either. Both nodes therefore need to cache those writesets for a catch-up phase. Currently the writesets are cached in memory, and if the system runs out of memory, either the state transfer will fail or the cluster will block waiting for the state transfer to end.

To control memory usage for writeset caching, check the Galera parameters: gcs.recv_q_hard_limit, gcs.recv_q_soft_limit, and gcs.max_throttle.
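
As a rough illustration, the three parameters can be set together through wsrep_provider_options. The sketch below only writes a config fragment to a temporary file and prints it. The numeric values are placeholders picked for the example, not recommendations, and the temporary file stands in for whatever config file your installation actually reads.

```shell
#!/bin/sh
# Sketch: write a config fragment that bounds writeset caching.
# All values are illustrative assumptions, not tuned recommendations.

hard_limit_bytes=2147483648   # gcs.recv_q_hard_limit: maximum receive queue size (2 GiB here)
soft_limit=0.25               # gcs.recv_q_soft_limit: fraction of the hard limit where throttling starts
max_throttle=0.25             # gcs.max_throttle: how far replication may be slowed (0.0 allows a full stop)

conf_file=$(mktemp)           # stand-in for a file in your server's config directory

cat > "$conf_file" <<EOF
[mysqld]
wsrep_provider_options="gcs.recv_q_hard_limit=${hard_limit_bytes};gcs.recv_q_soft_limit=${soft_limit};gcs.max_throttle=${max_throttle}"
EOF

cat "$conf_file"
```

Provider options are separated by semicolons inside a single quoted string, so any other provider options you already use would go into the same wsrep_provider_options line.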

Application requirements

See the Galera Limitations page for a complete list of requirements and limitations.

Server configuration limits

  • general log and slow query log have to be file type (log_output=FILE) and cannot be CSV or any other storage engine
  • query cache has to be disabled (only for versions prior to 5.5.40-galera and 10.0.14-galera)

Getting started

Bootstrapping a new cluster

To bootstrap a new cluster you need to start the first mysqld server with the option --wsrep-new-cluster on the command line like so:

$ mysqld --wsrep-new-cluster

This tells the server that there is no existing cluster to connect to, so it will create a new history UUID.

Restarting the server with the same setting will cause it to create a new history UUID again; it won't reconnect to the old cluster. See the next section for how to reconnect to an existing cluster.

However, keep in mind that most users are not going to bootstrap a server by executing "mysqld --wsrep-new-cluster" manually. Instead, most users will use one of the following wrappers to bootstrap a node, depending on the server's operating system.

SysVinit

On operating systems that use SysV init, a node can be bootstrapped in the following way:

$ service mysql bootstrap

Systemd

On operating systems that use systemd, a node can be bootstrapped in the following way:

$ galera_new_cluster

MariaDB starting with 10.1.8

Systemd support and the galera_new_cluster script were added in MariaDB 10.1.8.
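
If the same provisioning script has to run on both kinds of hosts, the choice between the two wrappers can be automated. The following is a minimal sketch, assuming that the presence of the /run/systemd/system directory is an acceptable test for a running systemd. The function name bootstrap_command is made up for the example, and the script prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: pick the bootstrap wrapper that matches the init system.
# Prints the command instead of executing it, so it is safe to try anywhere.

bootstrap_command() {
    # /run/systemd/system exists only when systemd is the running init system
    if [ -d /run/systemd/system ]; then
        echo "galera_new_cluster"
    else
        echo "service mysql bootstrap"
    fi
}

echo "To bootstrap this node, run: $(bootstrap_command)"
```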

Adding another node to a cluster

Once you have a cluster running and you want to add/reconnect another node to it, you must supply an address of one of the cluster members in the cluster address URL. E.g. if the first node of the cluster has the address 192.168.0.1, then adding a second node would look like this:

$ mysqld --wsrep_cluster_address=gcomm://192.168.0.1  # DNS names work as well

The new node only needs to connect to one of the existing members. It will automatically retrieve the cluster map and connect to the rest of the nodes.

Once all members agree on the membership, a state exchange will be initiated, during which the new node will be informed of the cluster state. If its state is different from that of the cluster (which is normally the case), it will request a snapshot of the state from the cluster[1] and install it before becoming ready for use.
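
Although one reachable member is enough, the cluster address URL also accepts a comma-separated list, which lets the node join as long as any listed member is up. A small sketch of building such a URL follows. The helper name cluster_address and the second and third addresses are invented for the example, and the script only prints the resulting command line.

```shell
#!/bin/sh
# Sketch: build a gcomm:// URL from a list of known member addresses.

cluster_address() {
    # Join all arguments with commas: a b c -> gcomm://a,b,c
    joined=""
    for node in "$@"; do
        if [ -z "$joined" ]; then
            joined="$node"
        else
            joined="$joined,$node"
        fi
    done
    echo "gcomm://$joined"
}

# 192.168.0.1 comes from the text above; the other two addresses are made up.
echo "mysqld --wsrep_cluster_address=$(cluster_address 192.168.0.1 192.168.0.2 192.168.0.3)"
```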

Restarting the cluster

If you shut down all nodes, you have effectively terminated the cluster (not the data, of course, but the running cluster). The right way to bring it back is to start all the nodes again with gcomm://<node1 address>,<node2 address>,...?pc.wait_prim=no and then, on one of the nodes, run SET GLOBAL wsrep_provider_options="pc.bootstrap=true";.
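
The restart procedure can be laid out as a two-step plan. The sketch below is a dry run that only prints the commands. The three node addresses are hypothetical, and it assumes the mysql client can log in locally without extra options. Note the quoting around the URL: the ? character would otherwise be open to shell filename expansion.

```shell
#!/bin/sh
# Sketch: print the commands for restarting a fully stopped cluster.
# Nothing is executed; the node addresses are hypothetical.

nodes="192.168.0.1,192.168.0.2,192.168.0.3"
address="gcomm://${nodes}?pc.wait_prim=no"

restart_plan() {
    echo "# Step 1: on every node"
    echo "mysqld --wsrep_cluster_address='${address}'"
    echo "# Step 2: on exactly one node, once all nodes are running"
    echo "mysql -e 'SET GLOBAL wsrep_provider_options=\"pc.bootstrap=true\";'"
}

restart_plan
```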

State Snapshot Transfer

There are two conceptually different ways to transfer a state from one MariaDB server to another:

  1. Using mysqldump. This requires the receiving server to be fully initialized and ready to accept connections before the transfer. This method is, by definition, blocking, in that it blocks the donor server from modifying its own state for the duration of the transfer. It is also the slowest method, which might be an issue in a loaded cluster.
  2. Copying data files directly. This requires that the receiving server is initialized after the transfer. rsync, xtrabackup, and other methods fall into this category. These methods are much faster than mysqldump, but they have certain limitations. For example, they can be used only on server startup, and the receiving server must be configured very similarly to the donor (e.g. innodb_file_per_table should be the same, and so on). Some of these methods (e.g. xtrabackup) can potentially be made non-blocking on the donor. Such methods are supported via a scriptable interface.

SST scripts

MariaDB Galera Cluster comes with the following SST scripts:

mysqldump

This is a default method. This script runs only on the sending side and pipes mysqldump output to the mysql client connected to the receiving server. mysqldump needs a username/password pair set in the wsrep_sst_auth variable in order to get the dump. This method requires a READ LOCK (the donor will be read-only during the whole process).

rsync

This method uses the rsync utility for snapshots. rsync should be available by default on all modern Linux distributions. This is the fastest and recommended method, especially for large datasets, since it copies binary data. Like mysqldump, it requires a READ LOCK during the whole process.

xtrabackup

This method uses the xtrabackup open-source hot backup utility to perform snapshot state transfers. It is the only non-locking method; however, it requires some additional setup. Please refer to the xtrabackup SST documentation for more information. Starting with MariaDB 5.5.33, xtrabackup-v2 should always be used.
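
To make the differences between the scripts concrete, the sketch below emits the settings each method would typically need. It relies on two assumptions not spelled out above: that the method is selected with the wsrep_sst_method server variable, and that the xtrabackup-based methods need wsrep_sst_auth credentials just as mysqldump does. The function name and the sst_user:sst_password pair are placeholders.

```shell
#!/bin/sh
# Sketch: emit SST-related settings for a chosen method.
# The credentials are placeholders; use an account that exists on the donor.

sst_settings() {
    method=$1
    case "$method" in
        mysqldump|xtrabackup|xtrabackup-v2)
            # These methods log in to the database, so they need wsrep_sst_auth
            echo "wsrep_sst_method=$method"
            echo "wsrep_sst_auth=sst_user:sst_password"
            ;;
        rsync)
            # rsync copies data files directly and needs no database login
            echo "wsrep_sst_method=rsync"
            ;;
        *)
            echo "unknown SST method: $method" >&2
            return 1
            ;;
    esac
}

sst_settings rsync
```

Calling sst_settings mysqldump instead prints both the method line and the wsrep_sst_auth line, which is the extra setup the mysqldump description refers to.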

State transfer failure

A failure in state transfer generally renders the receiving node unusable. Therefore, if a failure is detected, the receiving node will abort. Restarting a node after a mysqldump failure may require manual restoration of the administrative tables.

Minimal cluster size

In order to avoid a split-brain condition, the minimum recommended number of nodes in a cluster is 3. Blocking state transfer is yet another reason to require a minimum of 3 nodes in order to enjoy service availability in case one of the members fails and needs to be restarted. While two of the members will be engaged in state transfer, the remaining member(s) will be able to keep on serving client requests.
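
The split-brain argument comes down to simple majority arithmetic. The sketch below is a simplified model that assumes every node carries equal weight and that nodes are lost unexpectedly rather than shut down cleanly. The function name has_quorum is invented for the example.

```shell
#!/bin/sh
# Sketch: does the surviving part of a cluster still hold a majority?
# Simplified model: every node has equal weight.

has_quorum() {
    total=$1      # nodes in the cluster before the failure
    alive=$2      # nodes that can still see each other
    # Majority means strictly more than half: alive * 2 > total
    [ $((alive * 2)) -gt "$total" ]
}

for total in 2 3; do
    alive=$((total - 1))
    if has_quorum "$total" "$alive"; then
        echo "$total nodes, 1 lost: remaining $alive keep quorum"
    else
        echo "$total nodes, 1 lost: remaining $alive lose quorum"
    fi
done
```

With two nodes, the single survivor holds exactly half and cannot tell a crashed peer from a network partition, whereas with three nodes the two survivors form a clear majority.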

Configuration and monitoring

A number of parameters need to be set in order for MariaDB Galera to work:

Mandatory settings

  1. wsrep_provider — Path to the Galera library
  2. wsrep_cluster_address — see cluster connection URL
  3. binlog_format=ROW — see Binary Log Formats
  4. default_storage_engine=InnoDB
  5. innodb_autoinc_lock_mode=2
  6. innodb_doublewrite=1 (the default) when using Galera provider of version >= 2.0.
  7. query_cache_size=0 (only for versions prior to 5.5.40-galera, 10.0.14-galera and 10.1.2)
  8. wsrep_on=ON — Enable wsrep replication (starting 10.1.1)

Optional settings

These are just optimizations made relatively safe by synchronous replication — you always recover from another node.

  1. innodb_flush_log_at_trx_commit=0

Configuration
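
Put together, the mandatory and optional settings listed above might end up in a config section like the one this sketch generates. The provider path and the member addresses are assumptions for the example (the library location differs between distributions and package types), and the temporary file stands in for your real config file. The version-specific query_cache_size=0 setting is left out.

```shell
#!/bin/sh
# Sketch: write a minimal Galera config section to a temporary file.
# The provider path and the addresses are assumptions; adjust both.

provider=/usr/lib/galera/libgalera_smm.so                 # assumed install path
address="gcomm://192.168.0.1,192.168.0.2,192.168.0.3"     # hypothetical members

conf_file=$(mktemp)

cat > "$conf_file" <<EOF
[mysqld]
# Mandatory settings
wsrep_on=ON
wsrep_provider=${provider}
wsrep_cluster_address=${address}
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
innodb_doublewrite=1

# Optional: trades local durability for speed
innodb_flush_log_at_trx_commit=0
EOF

cat "$conf_file"
```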

Monitoring

  1. Galera status variables can be queried with the standard SHOW STATUS command:
SHOW STATUS LIKE 'wsrep_%';
  2. A notification command can be defined that is invoked whenever cluster membership or node status changes. It can communicate the event to a monitoring agent.
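
A monitoring agent usually reduces the status variables to a yes/no answer. The sketch below shows one way to do that, assuming that a healthy node reports wsrep_cluster_status as Primary, wsrep_ready as ON and wsrep_local_state_comment as Synced. The function name node_is_healthy is made up, and the sample input is written by hand rather than captured from a server.

```shell
#!/bin/sh
# Sketch: decide whether a node looks healthy from its wsrep status variables.
# Reads "name<TAB>value" lines on stdin, the shape produced by:
#   mysql -N -B -e "SHOW STATUS LIKE 'wsrep_%'"

node_is_healthy() {
    awk -F '\t' '
        $1 == "wsrep_cluster_status"      { cluster = $2 }
        $1 == "wsrep_ready"               { ready   = $2 }
        $1 == "wsrep_local_state_comment" { state   = $2 }
        END { exit !(cluster == "Primary" && ready == "ON" && state == "Synced") }
    '
}

# Hand-written sample input, not captured from a real server
printf 'wsrep_cluster_status\tPrimary\nwsrep_ready\tON\nwsrep_local_state_comment\tSynced\n' |
    node_is_healthy && echo "node looks healthy"
```

On a real node the printf line would be replaced by the mysql command shown in the comment, piped into node_is_healthy.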

Footnotes

  1. The cluster will choose the most suitable state donor or, alternatively, a desired node can be specified for that role on startup
