Get started quickly with MariaDB Galera Cluster using these guides. Follow step-by-step instructions to deploy and configure a highly available, multi-master cluster for your applications.
MariaDB Galera Cluster
Complete reference documentation for implementing, configuring, and using MariaDB Galera Cluster in production.
Articles on upgrading between MariaDB versions with Galera Cluster
Galera Security
MariaDB Galera security encrypts replication/SST traffic and ensures integrity through firewalls, secure credentials, and network isolation.
Installation & Deployment
Galera Architecture
Reference
Galera Cluster for MariaDB offers synchronous multi-master replication with high availability, no data loss, and simplified, consistent scaling.
Galera Management
Galera Management in MariaDB handles synchronous multi-master replication, ensuring high availability, data consistency, failover, and seamless node provisioning across clusters.
General Operations
State Snapshot Transfers (SSTs) in Galera Cluster
State Snapshot Transfers (SSTs) in MariaDB Galera Cluster copy the full dataset from a donor node to a new or recovering joiner node, ensuring data consistency before the joiner joins replication.
Load Balancing
High Availability
MariaDB ensures high availability with Replication for async/semi-sync data copying and Galera Cluster for sync multi-master with failover and zero data loss.
WSREP Variable Details
Configuration
What is Galera Replication?
Summary
In MariaDB Cluster, transactions are replicated using the wsrep API, synchronously ensuring consistency across nodes. Synchronous replication offers high availability and consistency but is complex and potentially slower compared to asynchronous replication. Due to these challenges, asynchronous replication is often preferred for database performance and scalability, as seen in popular systems like MySQL and PostgreSQL, which typically favor asynchronous or semi-synchronous solutions.
In MariaDB Cluster, the server replicates a transaction at commit time by broadcasting the write set associated with the transaction to every node in the cluster. The client connects directly to the DBMS and experiences behavior that is similar to native MariaDB in most cases. The wsrep API (write set replication API) defines the interface between Galera replication and MariaDB.
Synchronous vs. Asynchronous Replication
The basic difference between synchronous and asynchronous replication is that "synchronous" replication guarantees that if a change happened on one node in the cluster, then the change will happen on other nodes in the cluster "synchronously," or at the same time. "Asynchronous" replication gives no guarantees about the delay between applying changes on the "master" node and the propagation of changes to "slave" nodes. The delay with "asynchronous" replication can be short or long. This also implies that if a master node crashes in an "asynchronous" replication topology, then some of the latest changes may be lost.
Theoretically, synchronous replication has several advantages over asynchronous replication:
Clusters utilizing synchronous replication are always highly available. If one of the nodes crashed, then there would be no data loss. Additionally, all cluster nodes are always consistent.
Clusters utilizing synchronous replication allow transactions to be executed on all nodes in parallel.
Clusters utilizing synchronous replication can guarantee causality across the whole cluster. This means that if a SELECT is executed on one cluster node after a transaction has committed on another cluster node, it will see the effects of that transaction.
However, in practice, synchronous database replication has traditionally been implemented via the so-called "2-phase commit" or distributed locking, which proved to be very slow. Low performance and complexity of implementation of synchronous replication led to a situation where asynchronous replication remains the dominant means for database performance scalability and availability. Widely adopted open-source databases such as MySQL or PostgreSQL offer only asynchronous or semi-synchronous replication solutions.
Galera's replication is not completely synchronous. It is sometimes called virtually synchronous replication.
Certification-Based Replication Method
An alternative approach to synchronous replication, based on group communication and transaction ordering techniques, was suggested by a number of researchers. Prototype implementations showed a lot of promise. We combined our experience in synchronous database replication and the latest research in the field to create the Galera Replication library and the wsrep API.
Galera replication is a highly transparent, scalable, and virtually synchronous replication solution for database clustering to achieve high availability and improved performance. Galera-based clusters are:
Highly available
Highly transparent
Highly scalable (near-linear scalability may be reached depending on the application)
Generic Replication Library
Galera replication functionality is implemented as a shared library and can be linked with any transaction processing system that implements the wsrep API hooks.
The Galera replication library is a protocol stack providing functionality for preparing, replicating, and applying transaction write sets. It consists of:
wsrep API: specifies the interface and responsibilities for the DBMS and the replication provider.
wsrep hooks: the wsrep integration in the DBMS engine.
Galera provider: the implementation of the wsrep API by the Galera library.
Many components in the Galera replication library were redesigned and improved with the introduction of Galera 4.
Galera Slave Threads
Although the Galera provider certifies the write set associated with a transaction at commit time on each node in the cluster, the write set is not necessarily applied on that node immediately. Instead, it is placed in the node's receive queue and is eventually applied by one of the node's Galera slave threads.
The number of Galera slave threads can be configured with the wsrep_slave_threads system variable.
The Galera slave threads are able to determine which write sets are safe to apply in parallel. However, if your cluster nodes seem to have frequent consistency problems, setting wsrep_slave_threads to 1 will probably fix the problem.
When a cluster node's state, as seen by wsrep_local_state_comment, is JOINED, increasing the number of slave threads may help the node catch up with the cluster more quickly. In this case, it may be useful to set the number of threads to twice the number of CPUs on the system.
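For example, the node's state and applier thread count can be checked and adjusted at runtime (the value shown is illustrative for a 4-CPU host):

```sql
-- Check the node's current state and applier thread count
SHOW STATUS LIKE 'wsrep_local_state_comment';
SHOW GLOBAL VARIABLES LIKE 'wsrep_slave_threads';

-- Raise the applier thread count, e.g. to twice the number of CPUs
SET GLOBAL wsrep_slave_threads = 8;
```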
Streaming Replication
Streaming replication was introduced in Galera 4.
In older versions of MariaDB Cluster, there was a 2GB limit on the size of the transaction you could run. The node waits on the transaction commit before performing replication and certification. With large transactions, long-running writes, and changes to huge datasets, there was a greater possibility of a conflict forcing a rollback on an expensive operation.
Using streaming replication, the node breaks huge transactions up into smaller and more manageable fragments; it then replicates these fragments to the cluster as it works instead of waiting for the commit. Once certified, the fragment can no longer be aborted by conflicting transactions. As this can have performance consequences both during execution and in the event of rollback, it is recommended that you only use it with large transactions that are unlikely to experience conflict.
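As a sketch, streaming replication can be enabled for a single large transaction using the wsrep_trx_fragment_unit and wsrep_trx_fragment_size session variables (the fragment size here is illustrative):

```sql
-- Replicate a fragment after every 10,000 rows modified (current session only)
SET SESSION wsrep_trx_fragment_unit = 'rows';
SET SESSION wsrep_trx_fragment_size = 10000;

-- ... run the large transaction here ...

-- Turn streaming replication back off for the session
SET SESSION wsrep_trx_fragment_size = 0;
```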
For more information on streaming replication, see the documentation.
Group Commits
Group Commit support for MariaDB Cluster was introduced in Galera 4.
In MariaDB Group Commit, groups of transactions are flushed together to disk to improve performance. In previous versions of MariaDB, this feature was not available in MariaDB Cluster, as it interfered with the global ordering of transactions for replication. MariaDB Cluster can now take advantage of Group Commit.
For more information on Group Commit, see the documentation.
gcomm: the option to use for a working implementation.
dummy: used for running tests and profiling; it does not perform any actual replication, and all subsequent parameters are ignored.
Cluster address
An empty cluster address (gcomm://) causes the node to bootstrap a new cluster, so an empty address should never be hardcoded into any configuration file.
To connect the node to an existing cluster, the cluster address should contain the address of any member of the cluster you want to join.
The cluster address can also contain a comma-separated list of multiple members of the cluster. It is good practice to list all possible members of the cluster, for example: gcomm://<node1 name or IP>,<node2 name or IP>,<node3 name or IP>
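For example, a three-node cluster address might be set in the configuration file like this (the hostnames are placeholders):

```ini
[galera]
wsrep_cluster_address = gcomm://node1.example.com,node2.example.com,node3.example.com
```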
Option list
The wsrep_provider_options system variable is used to set Galera provider options. These parameters can also be provided (and overridden) as part of the URL. Unlike options provided in a configuration file, they do not persist and must be resubmitted with each connection.
A useful option to set is pc.wait_prim=no, to ensure the server starts running even if it cannot determine a primary component. This is useful if all members go down at the same time.
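For example, provider options can be appended to the URL after a ? separator (the hostname is a placeholder):

```ini
wsrep_cluster_address = gcomm://node1.example.com?pc.wait_prim=no
```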
Port
By default, gcomm listens on all interfaces. The port is either provided in the cluster address or will default to 4567 if not set.
This page is licensed: CC BY-SA / Gnu FDL
Configuring Auto-Eviction
Auto-Eviction enhances cluster stability by automatically removing non-responsive or "unhealthy" nodes in MariaDB Galera Cluster. This prevents a single problematic node from degrading the entire cluster's performance. In a Galera Cluster, each node monitors the network response times of other nodes. If a node becomes unresponsive due to reasons like memory swapping, network congestion, or a hung process, it can delay and potentially disrupt cluster operations. Auto-Eviction provides a deterministic method to isolate these misbehaving nodes effectively.
Auto-Eviction Process
The Auto-Eviction process is based on a consensus mechanism among the healthy cluster members.
Galera Replication is a core technology enabling MariaDB Galera Cluster to provide a highly available and scalable database solution. It is characterized by its virtually synchronous replication, ensuring strong data consistency across all cluster nodes.
wsrep_sst_method
Overview
State snapshot transfer method.
Details
ssl_ca
Overview
CA file in PEM format (check OpenSSL docs, implies --ssl).
Galera Replication is a multi-primary replication solution for database clustering. Unlike traditional asynchronous or semi-synchronous replication, Galera ensures that transactions are committed on all nodes (or fail on all) before the client receives a success confirmation. This mechanism eliminates data loss and minimizes replica lag, making all nodes active and capable of handling read and write operations.
2. How Galera Replication Works
The core of Galera Replication revolves around the concept of write sets and the wsrep API:
Write Set Broadcasting: When a client commits a transaction on any node in the cluster, the originating node captures the changes (the "write set") associated with that transaction. This write set is then broadcast to all other nodes in the cluster.
Certification and Application: Each receiving node performs a "certification" test to ensure that the incoming write set does not conflict with any concurrent transactions being committed locally.
If the write set passes certification, it is applied to the local database, and the transaction is committed on that node.
If a conflict is detected, the conflicting transaction (usually the one that was executed locally) is aborted, ensuring data consistency across the cluster.
Virtually Synchronous: The term "virtually synchronous" means that while the actual data application might happen slightly after the commit on the initiating node, the commit order is globally consistent, and all successful transactions are guaranteed to be applied on all active nodes. A transaction is not truly considered committed until it has passed certification on all nodes.
wsrep API: This API defines the interface between the Galera replication library (the "wsrep provider") and the database server (MariaDB). It allows the database to expose hooks for Galera to capture and apply transaction write sets.
3. Key Characteristics
Multi-Primary (Active-Active): All nodes in a Galera Cluster can be simultaneously used for both read and write operations.
Synchronous Replication (Virtual): Data is consistent across all nodes at all times, preventing data loss upon node failures.
Automatic Node Provisioning (SST/IST): When a new node joins or an existing node rejoins, Galera automatically transfers the necessary state to bring it up to date.
State Snapshot Transfer (SST): A full copy of the database is transferred from an existing node to the joining node.
Incremental State Transfer (IST): Only missing write sets are transferred if the joining node is not too far behind.
Automatic Membership Control: Nodes automatically detect and manage cluster membership changes (nodes joining or leaving).
Galera Replication essentially transforms a set of individual MariaDB servers into a robust, highly available, and consistent distributed database system.
Using MariaDB Replication with MariaDB Galera Cluster
MariaDB Galera Cluster provides high availability with synchronous replication, while adding asynchronous replication boosts redundancy for disaster recovery or reporting.
Monitoring and Delay List: Each node in the cluster monitors the group communication response times from all its peers. If a given node fails to respond within the expected timeframes, the other nodes will add an entry for it to their internal "delayed list."
Eviction Trigger: If a majority of the cluster nodes independently add the same peer to their delayed lists, it triggers the Auto-Eviction protocol.
Eviction: The cluster evicts the unresponsive node, removing it from the cluster membership. The evicted node will enter a non-primary state and must be restarted to rejoin the cluster.
The sensitivity of this process is determined by the evs.auto_evict parameter.
Configuration
Auto-Eviction is configured by passing the evs.auto_evict parameter within the wsrep_provider_options system variable in your MariaDB configuration file (my.cnf).
The value of evs.auto_evict determines the threshold for eviction. It defines how many times a peer can be placed on the delayed list before the node votes to evict it.
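For example (the threshold value here is illustrative):

```ini
[galera]
wsrep_provider_options = "evs.auto_evict=5"
```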
In the example above, if a node registers that a peer has been delayed 5 times, it will vote to have that peer evicted from the cluster.
To disable Auto-Eviction, you can set the value to 0:
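```ini
[galera]
wsrep_provider_options = "evs.auto_evict=0"
```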
Even when disabled, the node will continue to monitor response times and log information about delayed peers; it just won't vote to evict them.
Related Parameters for Failure Detection
The Auto-Eviction feature is directly related to the EVS (Extended Virtual Synchrony) protocol parameters that control how the cluster detects unresponsive nodes in the first place. These parameters define what it means for a node to be "delayed."
evs.inactive_check_period: the frequency at which the node checks for inactive peers.
evs.suspect_timeout: the time after which a non-responsive node is marked as "suspect."
evs.inactive_timeout: the time after which a non-responsive node is marked as "inactive" and removed.
Tuning these values in conjunction with evs.auto_evict allows you to define how aggressively the cluster will fence off struggling nodes.
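A combined sketch, assuming ISO 8601 duration values for the EVS timeouts (the values shown are illustrative and should be tuned to your network):

```ini
[galera]
wsrep_provider_options = "evs.auto_evict=5;evs.suspect_timeout=PT5S;evs.inactive_timeout=PT15S"
```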
The recommended strategy for creating a full, consistent backup of a MariaDB Galera Cluster is to perform the backup on a single node. Because all nodes in a healthy cluster contain the same data, a complete backup from one node represents a snapshot of the entire cluster at a specific point in time.
The preferred tool for this is mariadb-backup, which creates a "hot" backup without blocking the node from serving traffic for an extended period.
The Challenge of Consistency in a Live Cluster
While taking a backup, the donor node is still receiving and applying transactions from the rest of the cluster. If the backup process is long, it's possible for the data at the end of the backup to be newer than the data at the beginning, leading to an inconsistent state within the backup files.
To prevent this, it's important to temporarily pause the node's replication stream during the backup process.
Recommended Backup Procedure
This procedure ensures a fully consistent backup with minimal impact on the cluster's availability.
1. Select a Backup Node
Choose a node from your cluster to serve as the backup source. It's a good practice to use a non-primary node if you are directing writes to a single server.
2. Desynchronize the Node (Pause Replication)
To guarantee consistency, you should temporarily pause the node's ability to apply new replicated transactions. This is done by setting the wsrep_desync system variable to ON.
Take the selected node out of your rotation so it no longer receives application traffic.
Connect to the node with a mariadb client and execute:
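```sql
-- Pause replication on this node (it becomes desynced from the cluster)
SET GLOBAL wsrep_desync = ON;
```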
The node will finish applying any transactions already in its queue and then pause, entering a desynced state. The rest of the cluster will continue to operate normally.
3. Perform the Backup
With the node's replication paused, run the mariadb-backup utility to create a full backup.
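A minimal sketch, assuming a backup user and target directory of your choosing (both are placeholders):

```shell
mariadb-backup --backup \
  --target-dir=/var/backups/galera-full \
  --user=backup_user --password=backup_password
```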
4. Resynchronize the Node
Once the backup is complete, you can allow the node to rejoin the cluster's replication stream.
Connect to the node again and execute:
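```sql
-- Resume replication; the node rejoins the cluster's replication stream
SET GLOBAL wsrep_desync = OFF;
```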
The node will now request an Incremental State Transfer (IST) from its peers to receive all the transactions it missed while it was desynchronized and quickly catch up.
Once the node is fully synced (you can verify this by checking that wsrep_local_state_comment is Synced), add it back to your load balancer's rotation.
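To verify:

```sql
SHOW STATUS LIKE 'wsrep_local_state_comment';
-- The node is fully caught up when the value is "Synced"
```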
This procedure ensures you get a fully consistent snapshot of your cluster's data with zero downtime for your application.
Galera Test Repositories
To facilitate development and QA, we have created some test repositories for the Galera wsrep provider.
These are test repositories. There will be periods when they do not work at all, or work incorrectly, or possibly cause earthquakes, typhoons, and tornadoes. You have been warned.
Galera Test Repositories for YUM
Replace ${dist} in the code below with the YUM-based distribution you are testing. Valid distributions are:
centos5-amd64
centos5-x86
centos6-amd64
Galera Test Repositories for APT
Replace ${dist} in the code below with the APT-based distribution you are testing. Valid ones are:
wheezy
jessie
sid
Load Balancing in MariaDB Galera Cluster
While a client application can connect directly to any node in a MariaDB Galera Cluster, this is not a practical approach for a production environment. A direct connection creates a single point of failure and does not allow the application to take advantage of the cluster's high availability and read-scaling capabilities.
A load balancer or database proxy is an essential component that sits between your application and the cluster. Its primary responsibilities are:
Provide a Single Endpoint: Your application connects to the load balancer's virtual IP address, not to the individual database nodes.
Health Checks: The load balancer constantly monitors the health of each cluster node (e.g., is it Synced? is it up or down?).
Traffic Routing: It intelligently distributes incoming client connections and queries among the healthy nodes in the cluster.
Automatic Failover: If a node fails, the load balancer automatically stops sending traffic to it, providing seamless failover for your application.
Recommended Load Balancer: MariaDB MaxScale
For MariaDB Galera Cluster, the recommended load balancer is MariaDB MaxScale. Unlike a generic TCP proxy, MaxScale is a database-aware proxy that understands the Galera Cluster protocol. This allows it to make intelligent routing decisions based on the real-time state of the cluster nodes.
Common Routing Strategies
A database-aware proxy like MaxScale can be configured to use several different routing strategies.
Read-Write Splitting (Recommended)
This is the most common and highly recommended strategy for general-purpose workloads.
How it Works: The load balancer is configured to send all write operations (INSERT, UPDATE, DELETE) to a single, designated primary node. All read operations (SELECT) are then distributed across the remaining available nodes.
Advantages: directing all writes to one node minimizes transaction conflicts, since two nodes modifying the same row at the same time would otherwise lead to deadlocks and rollbacks, while the remaining nodes are fully utilized to scale out read-intensive workloads.
Read Connection Load Balancing
In this simpler strategy, the load balancer distributes all connections evenly across all available nodes.
How it Works: Each new connection is sent to the next available node in a round-robin fashion.
Disadvantages: This approach can easily lead to transaction conflicts if your application sends writes to multiple nodes simultaneously. It is generally only suitable for applications that are almost exclusively read-only.
Other Load Balancing Solutions
While MariaDB MaxScale is the recommended solution, other proxies and load balancers can also be used with Galera Cluster, including:
ProxySQL: Another popular open-source, database-aware proxy.
HAProxy: A very common and reliable TCP load balancer. When used with Galera, HAProxy is typically configured with a simple TCP health check or a custom script to determine node availability.
Cloud Load Balancers: Cloud providers like AWS (ELB/NLB), Google Cloud, and Azure offer native load balancing services that can be used to distribute traffic across a Galera Cluster.
wsrep_certificate_expiration_hours_warning
Overview
Prints a warning if the X509 certificate used for wsrep connections is due to expire within the number of hours given as the value. If the value is 0, warnings are not printed.
Usage
The wsrep_certificate_expiration_hours_warning system variable can be set in a configuration file:
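For example (the 48-hour threshold is illustrative):

```ini
[mariadb]
wsrep_certificate_expiration_hours_warning=48
```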
The global value of the wsrep_certificate_expiration_hours_warning system variable can also be set dynamically at runtime by executing :
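```sql
-- Warn when the wsrep certificate expires within 48 hours (illustrative value)
SET GLOBAL wsrep_certificate_expiration_hours_warning=48;
```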
When the wsrep_certificate_expiration_hours_warning system variable is set dynamically at runtime, its value will be reset the next time the server restarts. To make the value persist on restart, set it in a configuration file too.
Details
The wsrep_certificate_expiration_hours_warning system variable can be used to configure certificate expiration warnings for MariaDB Enterprise Cluster, powered by Galera:
When the wsrep_certificate_expiration_hours_warning system variable is set to 0, certificate expiration warnings are not printed to the MariaDB error log.
When the wsrep_certificate_expiration_hours_warning system variable is set to a value N greater than 0, certificate expiration warnings are printed to the MariaDB error log when the node's certificate expires within N hours.
Parameters
socket.ssl_cert
Overview
Defines the path to the SSL certificate.
The wsrep_provider_options system variable applies to MariaDB Enterprise Cluster, powered by Galera and to Galera Cluster available with MariaDB Community Server. This page relates specifically to the socket.ssl_cert wsrep_provider_options.
Details
The node uses the certificate as a self-signed public key in encrypting replication traffic over SSL. You can use either an absolute path or one relative to the working directory. The file must use PEM format.
Examples
Display Current Value
wsrep_provider_options define optional settings the node passes to the wsrep provider.
To display current wsrep_provider_options values:
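```sql
SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options';
```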
The expected output will display the option and the value. Options with no default value, for example SSL options, will not be displayed in the output.
Set in Configuration File
When changing a setting for a wsrep_provider_options in the config file, you must list EVERY option that is to have a value other than the default value. Options that are not explicitly listed are reset to the default value.
Options are set in the my.cnf configuration file. Use the ; delimiter to set multiple options.
The configuration file must be updated on each node. A restart to each node is needed for changes to take effect.
Use a quoted string that includes every option where you want to override the default value. Options that are not in the list will reset to their default value.
To set the option in the configuration file:
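For example (the paths are placeholders; remember to repeat every option that should keep a non-default value in the quoted string):

```ini
[galera]
wsrep_provider_options="socket.ssl=yes;socket.ssl_cert=/etc/my.cnf.d/certificates/server-cert.pem;socket.ssl_key=/etc/my.cnf.d/certificates/server-key.pem;socket.ssl_ca=/etc/my.cnf.d/certificates/ca.pem"
```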
Set Dynamically
The socket.ssl_cert option cannot be set dynamically. It can only be set in the configuration file.
Trying to change a non-dynamic option with SET results in an error:
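For illustration (the exact error message depends on the server version, and the path is a placeholder):

```sql
-- This fails, because socket.ssl_cert is not a dynamic option
SET GLOBAL wsrep_provider_options = 'socket.ssl_cert=/etc/my.cnf.d/certificates/new-cert.pem';
```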
Overview of Hybrid Replication
Hybrid replication leverages standard, asynchronous MariaDB Replication to copy data from a synchronous MariaDB Galera Cluster to an external server or another cluster. This configuration establishes a one-way data flow, where the entire Galera Cluster serves as the source (primary) for one or more asynchronous replicas. This advanced setup combines the strengths of both replication methods: synchronous replication ensures high availability within the primary site, while asynchronous replication caters to specific use cases, allowing for flexible data distribution.
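As a sketch, an external replica might be pointed at one node of the cluster like this (the hostname and credentials are placeholders, and GTID handling requires careful configuration):

```sql
-- On the external replica: replicate asynchronously from one Galera node
CHANGE MASTER TO
  MASTER_HOST='galera-node1.example.com',
  MASTER_USER='repl',
  MASTER_PASSWORD='repl_password',
  MASTER_USE_GTID=slave_pos;
START SLAVE;
```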
Common Use Cases
Implementing a hybrid replication setup is a powerful technique for solving several common business needs:
wsrep_sst_common
wsrep_sst_common Variables
The wsrep_sst_common script provides shared functionality used by various State Snapshot Transfer (SST) methods in Galera Cluster. It centralizes the handling of common configurations such as authentication credentials, SSL/TLS encryption parameters, and other security-related settings. This ensures consistent and secure communication between cluster nodes during the SST process.
Building the Galera wsrep Package on Ubuntu and Debian
The instructions on this page were used to create the galera package on the Ubuntu and Debian Linux distributions. This package contains the wsrep provider for .
The version of the wsrep provider is 25.3.5. We also provide 25.2.9 for those who need or want it. Prior to that, the wsrep version was 23.2.7.
Install prerequisites:
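A sketch of typical build prerequisites on Debian and Ubuntu (the exact package list may vary by release):

```shell
sudo apt-get update
sudo apt-get install -y build-essential scons check debhelper devscripts
```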
Certification-Based Replication
Certification-based replication uses group communication and transaction ordering techniques to achieve synchronous replication.
Transactions execute optimistically on a single node, or replica, and then, at commit time, run a coordinated certification process to enforce global consistency. Global coordination is achieved with the help of a broadcast service that establishes a global total order among concurrent transactions.
Requirements for Certification-Based Replication
It is not possible to implement certification-based replication for all database systems. It requires certain features of the database in order to work:
socket.ssl_ca
Overview
Defines the path to the SSL Certificate Authority (CA) file.
The wsrep_provider_options system variable applies to MariaDB Enterprise Cluster, powered by Galera and to Galera Cluster available with MariaDB Community Server. This page relates specifically to the socket.ssl_ca wsrep_provider_options.
socket.ssl
Overview
Explicitly enables TLS usage by the wsrep provider.
The wsrep_provider_options system variable applies to MariaDB Enterprise Cluster, powered by Galera and to Galera Cluster available with MariaDB Community Server. This page relates specifically to the socket.ssl wsrep_provider_options.
socket.ssl_key
Overview
Defines the path to the SSL certificate key.
The wsrep_provider_options system variable applies to MariaDB Enterprise Cluster, powered by Galera and to Galera Cluster available with MariaDB Community Server. This page relates specifically to the socket.ssl_key wsrep_provider_options.
gcs.check_appl_proto
Controls whether the node performs application-level protocol version checks when joining a cluster.
The wsrep_provider_options system variable applies to MariaDB Enterprise Cluster, powered by Galera and to Galera Cluster available with MariaDB Community Server. This page relates specifically to the gcs.check_appl_proto wsrep_provider_options.
Disaster Recovery (DR): Galera Cluster provides high availability and automatic failover. Use asynchronous replication for a distant replica, promoting it during site outages.
Feeding Analytics/BI Systems: Replicate from the OLTP Galera Cluster to a data warehouse or analytics server to run heavy queries without affecting production performance.
Upgrades and Migrations: Use an asynchronous replica to test new MariaDB versions or migrate to new hardware with minimal downtime.
Key Challenges and Considerations
Before implementing a hybrid setup, it is critical to understand the technical challenges:
GTID Management: Galera Cluster and MariaDB Replication use different GTID formats and implementations, requiring careful configuration to avoid conflicts.
Replication Lag: The external replica experiences the usual latencies of asynchronous replication, causing it to lag behind the real-time state of the cluster.
Failover Complexity: Failover within Galera Cluster is automatic, but failing over to the asynchronous DR replica is manual and requires careful planning.
Description: Defines the authentication credentials used by the State Snapshot Transfer (SST) process, typically formatted as user:password. These credentials are essential for authenticating the SST user on the donor node, ensuring that only authorized joiner nodes can initiate and receive data during the SST operation. Proper configuration of this variable is critical to maintain the security and integrity of the replication process between Galera cluster nodes.
tca (tcert)
Description: Specifies the Certificate Authority (CA) certificate file used for SSL/TLS encryption during State Snapshot Transfers (SSTs). When encryption is enabled, this certificate allows the joining node (client) to authenticate the identity of the donor node, ensuring secure and trusted communication between them.
tcapath (tcap)
Description: Specifies the path to a directory that contains a collection of trusted Certificate Authority (CA) certificates. Instead of providing a single CA certificate file, this option allows the use of multiple CA certificates stored in separate files within the specified directory. It is useful in environments where trust needs to be established with multiple certificate authorities.
tcert (tpem)
Description: This variable stores the path to the TLS/SSL certificate file for the specific node. The certificate, typically in PEM format, is used by the node to authenticate itself to other nodes during secure SST operations. It is derived from the tcert option in the [sst] section.
tkey (tkey)
Description: Represents the private key file that corresponds to the public key certificate specified by tpem. This private key is essential for decrypting data and establishing a secure connection during State Snapshot Transfer (SST). It enables the receiving node to authenticate encrypted information and participate in secure replication within the cluster.
Example
Set in Configuration File
To configure common SST options, add them to the [sst] group in your configuration file:
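For example (the certificate paths are placeholders):

```ini
[sst]
tca=/etc/my.cnf.d/certificates/ca.pem
tcert=/etc/my.cnf.d/certificates/server-cert.pem
tkey=/etc/my.cnf.d/certificates/server-key.pem
```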
Build the packages by executing build.sh under scripts/ directory with -p switch:
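```shell
# From the top of the galera source tree
./scripts/build.sh -p
```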
When finished, you will have the Debian packages for galera library and arbitrator in the parent directory.
Running galera test suite
If you want to run the Galera test suite (mysql-test-run --suite=galera), you need to install the Galera library as either /usr/lib/galera/libgalera_smm.so or /usr/lib64/galera/libgalera_smm.so.
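For example:

```shell
# Place the built library where mysql-test-run expects it
sudo install -D libgalera_smm.so /usr/lib/galera/libgalera_smm.so
```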
The node uses the CA file to verify the signature on the certificate. You can use either an absolute path or one relative to the working directory. The file must use PEM format.
Option Name
socket.ssl_ca
Default Value
"" (an empty string)
Dynamic
NO
Debug
NO
Examples
Display Current Value
wsrep_provider_options define optional settings the node passes to the wsrep provider.
To display current wsrep_provider_options values:
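For example:

```sql
SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options';
```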
The expected output will display the option and the value. Options with no default value, for example SSL options, will not be displayed in the output.
Set in Configuration File
When changing a setting for a wsrep_provider_options in the config file, you must list EVERY option that is to have a value other than the default value. Options that are not explicitly listed are reset to the default value.
Options are set in the my.cnf configuration file. Use the ; delimiter to set multiple options.
The configuration file must be updated on each node. A restart to each node is needed for changes to take effect.
Use a quoted string that includes every option where you want to override the default value. Options that are not in the list will reset to their default value.
To set the option in the configuration file:
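A sketch of a configuration entry, assuming illustrative certificate paths; the related SSL options are included in the quoted string because unlisted options reset to their defaults:

```ini
[mariadb]
wsrep_provider_options="socket.ssl=YES;socket.ssl_ca=/etc/my.cnf.d/certs/ca-cert.pem;socket.ssl_cert=/etc/my.cnf.d/certs/server-cert.pem;socket.ssl_key=/etc/my.cnf.d/certs/server-key.pem"
```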
Set Dynamically
The socket.ssl_ca option cannot be set dynamically. It can only be set in the configuration file.
Trying to change a non-dynamic option with SET results in an error:
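For example, a statement along these lines fails because the option is not dynamic:

```sql
-- socket.ssl_ca cannot be changed at runtime; this statement returns an error
SET GLOBAL wsrep_provider_options='socket.ssl_ca=/etc/my.cnf.d/certs/ca-cert.pem';
```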
Details
The socket.ssl option specifies whether SSL encryption should be used.
Option Name
socket.ssl
Default Value
NO
Dynamic
NO
Debug
NO
Examples
Display Current Value
wsrep_provider_options define optional settings the node passes to the wsrep provider.
To display current wsrep_provider_options values:
The expected output will display the option and the value. Options with no default value, for example SSL options, will not be displayed in the output.
Set in Configuration File
When changing a setting for a wsrep_provider_options in the config file, you must list EVERY option that is to have a value other than the default value. Options that are not explicitly listed are reset to the default value.
Options are set in the my.cnf configuration file. Use the ; delimiter to set multiple options.
The configuration file must be updated on each node. A restart to each node is needed for changes to take effect.
Use a quoted string that includes every option where you want to override the default value. Options that are not in the list will reset to their default value.
To set the option in the configuration file:
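A sketch of a configuration entry enabling SSL; the certificate paths are illustrative, and the related SSL options are listed because unlisted options reset to their defaults:

```ini
[mariadb]
wsrep_provider_options="socket.ssl=YES;socket.ssl_ca=/etc/my.cnf.d/certs/ca-cert.pem;socket.ssl_cert=/etc/my.cnf.d/certs/server-cert.pem;socket.ssl_key=/etc/my.cnf.d/certs/server-key.pem"
```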
Set Dynamically
The socket.ssl option cannot be set dynamically. It can only be set in the configuration file.
Trying to change a non-dynamic option with SET results in an error:
Details
The node uses the certificate key, that is, the private key corresponding to its certificate, to encrypt replication traffic over SSL. You can use either an absolute path or one relative to the working directory. The file must use PEM format.
Option Name
socket.ssl_key
Default Value
"" (an empty string)
Dynamic
NO
Debug
NO
Examples
Display Current Value
wsrep_provider_options define optional settings the node passes to the wsrep provider.
To display current wsrep_provider_options values:
The expected output will display the option and the value. Options with no default value, for example SSL options, will not be displayed in the output.
Set in Configuration File
When changing a setting for a wsrep_provider_options in the config file, you must list EVERY option that is to have a value other than the default value. Options that are not explicitly listed are reset to the default value.
Options are set in the my.cnf configuration file. Use the ; delimiter to set multiple options.
The configuration file must be updated on each node. A restart to each node is needed for changes to take effect.
Use a quoted string that includes every option where you want to override the default value. Options that are not in the list will reset to their default value.
To set the option in the configuration file:
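A sketch of a configuration entry, assuming an illustrative key path; the other SSL options appear in the quoted string because unlisted options reset to their defaults:

```ini
[mariadb]
wsrep_provider_options="socket.ssl=YES;socket.ssl_ca=/etc/my.cnf.d/certs/ca-cert.pem;socket.ssl_cert=/etc/my.cnf.d/certs/server-cert.pem;socket.ssl_key=/etc/my.cnf.d/certs/server-key.pem"
```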
Set Dynamically
The socket.ssl_key option cannot be set dynamically. It can only be set in the configuration file.
Trying to change a non-dynamic option with SET results in an error:
Place this code block in a file at /etc/yum.repos.d/galera.repo
[galera-test]
name = galera-test
baseurl = http://yum.mariadb.org/galera/repo/rpm/${dist}
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1
# run the following command:
sudo apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 0xcbcb082a1bb943db 0xF1656F24C74CD1D8
# Add the following line to your /etc/apt/sources.list file:
deb http://yum.mariadb.org/galera/repo/deb ${dist} main
SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options';
Transactional Database: The database must be transactional. Specifically, it has to be able to roll back uncommitted changes.
Atomic Changes: Replication events must be able to change the database atomically. All of a series of database operations in a transaction must occur, or nothing occurs.
Global Ordering: Replication events must be ordered globally. Specifically, they are applied on all instances in the same order.
How the Process Works
Certification-Based Replication
The main idea in certification-based replication is that a transaction executes conventionally until it reaches the commit point, assuming there is no conflict. This is called optimistic execution.
When the client issues a COMMIT command, but before the actual commit occurs, all changes made to the database by the transaction and the primary keys of the changed rows are collected into a write-set. The database then sends this write-set to all of the other nodes.
The write-set then undergoes a deterministic certification test, using the primary keys. This is done on each node in the cluster, including the node that originates the write-set. It determines whether or not the node can apply the write-set.
If the certification test fails, the node drops the write-set and the cluster rolls back the original transaction. If the test succeeds, however, the transaction commits and the write-set is applied to the rest of the cluster.
Galera Cluster assigns each transaction a global ordinal sequence number, or seqno, during replication. When a transaction reaches the commit point, the node checks the sequence number against that of the last successful transaction. The interval between the two is the area of concern, given that transactions that occur within this interval have not seen the effects of each other. All transactions in this interval are checked for primary key conflicts with the transaction in question. The certification test fails if it detects a conflict.
The procedure is deterministic and all replicas receive transactions in the same order. Thus, all nodes reach the same decision about the outcome of the transaction. The node that started the transaction can then notify the client application whether or not it has committed the transaction.
Galera Cluster automatically uses the highest protocol version supported by all nodes. This prevents older nodes, which lack support for newer features, from joining or disrupting the cluster until an upgrade solution is available.
However, MySQL and MariaDB have evolved differently, and their internal protocol versions are incomparable. This incompatibility prevents a mixed-node cluster (MySQL nodes and MariaDB nodes) from forming, which blocks rolling migrations.
Migration Usage: When migrating from another Galera-based cluster (e.g., Percona XtraDB Cluster) to MariaDB Galera Cluster, this parameter must be set to FALSE (OFF) on all nodes to disable the protocol check. Once the cluster is fully migrated to MariaDB, it should be set back to TRUE.
Known reporting issue in early versions
In early versions, the variable may appear as OFF even though the default behavior is TRUE. Explicitly configure it to ensure the desired state during migration.
Option Name
gcs.check_appl_proto
Default Value
TRUE
Dynamic
NO
Debug
NO
Examples
Display Current Value
wsrep_provider_options define optional settings the node passes to the wsrep provider.
To display current wsrep_provider_options values:
The expected output will display the option and the value. Options with no default value will not be displayed in the output.
Set in Configuration File
When changing a setting for a wsrep_provider_options in the config file, you must list EVERY option that is to have a value other than the default value. Options that are not explicitly listed are reset to the default value.
Options are set in the my.cnf configuration file. Use the ; delimiter to set multiple options.
The configuration file must be updated on each node. A restart to each node is needed for changes to take effect.
Use a quoted string that includes every option where you want to override the default value. Options that are not in the list will reset to their default value.
To set the option in the configuration file (example for migration):
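A sketch of a migration-time entry, disabling the application protocol check as described above:

```ini
# Disable the protocol check during migration; restore TRUE once complete
[mariadb]
wsrep_provider_options="gcs.check_appl_proto=FALSE"
```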
Set Dynamically
The gcs.check_appl_proto option cannot be set dynamically. It can only be set in the configuration file.
Trying to change a non-dynamic option with SET results in an error:
This system variable specifies the logical name of the cluster. All cluster nodes that connect to one another must have the same logical name in order to form a component or join the Primary Component.
Parameters
Examples
Configuration
Set the cluster name using an options file:
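For example, assuming a hypothetical cluster name of my_cluster:

```ini
[mariadb]
wsrep_cluster_name="my_cluster"
```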
Show Configuration
To view the current cluster name, use the statement:
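For example:

```sql
SHOW GLOBAL VARIABLES LIKE 'wsrep_cluster_name';
```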
MariaDB Galera Cluster Overview
MariaDB Enterprise Cluster is a solution designed to handle high workloads exceeding the capacity of a single server. It is based on Galera Cluster technology integrated with MariaDB Enterprise Server and includes features like data-at-rest encryption for added security. This multi-primary replication alternative is ideal for maintaining data consistency across multiple servers, providing enhanced reliability and scalability.
Overview
MariaDB Enterprise Cluster, powered by Galera, is available with MariaDB Enterprise Server. MariaDB Galera Cluster is available with MariaDB Community Server.
In order to handle increasing load and especially when that load exceeds what a single server can process, it is best practice to deploy multiple MariaDB Enterprise Servers with a replication solution to maintain data consistency between them. MariaDB Enterprise Cluster is a multi-primary replication solution that serves as an alternative to the single-primary MariaDB Replication.
An Introduction to Database Replication
Database replication is the process of continuously copying data from one database server (a "node") to another, creating a distributed and resilient system. The goal is for all nodes in this system to contain the same set of data, forming what is known as a database cluster. From the perspective of a client application, this distributed nature is often transparent, allowing it to interact with the cluster as if it were a single database.
Replication Architectures
Primary/Replica
The most common replication architecture is Primary/Replica (also known as Master/Slave). In this model:
The Primary node is the authoritative source. It is the only node that accepts write operations (e.g., INSERT, UPDATE, DELETE).
The Primary logs these changes and sends them to one or more Replica nodes.
Multi-Primary Replication
In a multi-primary system, every node in the cluster acts as a primary. This means any node can accept write operations. When a node receives an update, it automatically propagates that change to all other primary nodes in the cluster. Each primary node logs its own changes and communicates them to its peers to maintain synchronization.
Replication Protocols: Asynchronous vs. Synchronous
Beyond the architecture, the replication protocol determines how transactions are confirmed across the cluster.
Asynchronous Replication (Lazy Replication)
In asynchronous replication, the primary node commits a transaction locally first and then sends the changes to the replicas in the background. The transaction is confirmed as complete to the client immediately after it's saved on the primary. This means there is a brief period, known as replication lag, where the replicas have not yet received the latest data.
Synchronous Replication (Eager Replication)
In synchronous replication, a transaction is not considered complete (committed) until it has been successfully applied and confirmed on all participating nodes. When the client receives confirmation, it is a guarantee that the data exists consistently across the cluster.
The Trade-offs of Synchronous Replication
Advantages
Synchronous replication offers several powerful advantages over its asynchronous counterpart:
High Availability: Since all nodes are fully synchronized, if one node fails, there is zero data loss. Traffic can be immediately directed to another node without complex failover procedures, as all data replicas are guaranteed to be consistent.
Read-After-Write Consistency: Synchronous replication guarantees causality. A SELECT query issued immediately after a transaction will always see the effects of that transaction, even if the query is executed on a different node in the cluster.
Disadvantages
Traditionally, eager replication protocols coordinate nodes one operation at a time, using two-phase commit or distributed locking. A system with n nodes processing o operations per transaction at a throughput of t transactions per second generates m = n × o × t messages per second.
This means that any increase in the number of nodes leads to an exponential growth in transaction response times and in the probability of conflicts and deadlocks.
For this reason, asynchronous replication remains the dominant replication protocol for database performance, scalability, and availability. Widely adopted open source databases, such as MySQL and PostgreSQL, primarily provide asynchronous or semi-synchronous replication solutions.
Galera's Solution: Modern Synchronous Replication
Galera Cluster solves the traditional problems of synchronous replication by using a modern, certification-based approach built on several key innovations:
Group Communication: A robust messaging layer ensures that information is delivered to all nodes reliably and in the correct order, forming a solid foundation for data consistency.
Write-Set Replication: Instead of coordinating on every individual operation, database changes (writes) are grouped into a single package called a "write-set." This write-set is replicated as a single message, avoiding the high overhead of traditional two-phase commit.
Optimistic Execution: Transactions are first executed optimistically on a local node. The resulting write-set is then broadcast to the cluster for a fast, parallel certification process. If it passes certification (meaning no conflicts), it is committed on all nodes.
The certification-based replication system that Galera Cluster uses is built on these powerful approaches, delivering the benefits of synchronous replication without the traditional performance bottlenecks.
How it Works
MariaDB Enterprise Cluster is built on MariaDB Enterprise Server with Galera Cluster and MariaDB MaxScale. In MariaDB Enterprise Server 10.5 and later, it features enterprise-specific options, such as data-at-rest encryption for the write-set cache, that are not available in other Galera Cluster implementations.
As a multi-primary replication solution, any MariaDB Enterprise Server can operate as a Primary Server. This means that changes made to any node in the cluster replicate to every other node in the cluster, using certification-based replication and global ordering of transactions for the InnoDB storage engine.
MariaDB Enterprise Cluster is only available for Linux operating systems.
Architecture
There are a few things to consider when planning the hardware, virtual machines, or containers for MariaDB Enterprise Cluster.
MariaDB Enterprise Cluster architecture involves deploying multiple instances of MariaDB Enterprise Server. The Servers are configured to use multi-primary replication to maintain consistency between themselves while MariaDB MaxScale routes reads and writes between them.
The application establishes a client connection to MariaDB MaxScale. MaxScale then routes statements to one of the MariaDB Enterprise Servers in the cluster. Writes made to any node in this cluster replicate to all the other nodes of the cluster.
When MariaDB Enterprise Servers start in a cluster:
Each Server attempts to establish network connectivity with the other Servers in the cluster
Groups of connected Servers form a component
When a Server establishes network connectivity with the Primary Component, it synchronizes its local database with that of the cluster
In planning the number of systems to provision for MariaDB Enterprise Cluster, it is important to keep cluster operation in mind: ensure that each Server has enough disk space and that the cluster is able to maintain a Primary Component in the event of outages.
Each Server requires the minimum amount of disk space needed to store the entire database. The upper storage limit for MariaDB Enterprise Cluster is that of the smallest disk in use.
Each switch in use should connect an odd number of Servers, three or more.
In a cluster that spans multiple switches, each data center in use should have an odd number of switches, three or more.
When planning Servers to the switch, switches to the data center, and data centers in the cluster, this model helps preserve the Primary Component. A minimum of three in use means that a single Server or switch can fail without taking down the cluster.
Using an odd number of three or more reduces the risk of a split-brain situation (that is, a case where two separate groups of Servers believe that they are part of the Primary Component and remain operational).
Cluster Configuration
Nodes in MariaDB Enterprise Cluster are individual MariaDB Enterprise Servers configured to perform multi-primary cluster replication. This configuration is set using a series of system variables in the configuration file.
Additional information on system variables is available in the Reference chapter.
General Configuration
The innodb_autoinc_lock_mode system variable must be set to a value of 2 to enable interleaved lock mode. MariaDB Enterprise Cluster does not support other lock modes.
Ensure also that the bind_address system variable is properly set to allow MariaDB Enterprise Server to listen for TCP/IP connections:
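A minimal sketch of these general settings; the bind address shown listens on all interfaces and is illustrative only:

```ini
[mariadb]
# Interleaved lock mode is required by MariaDB Enterprise Cluster
innodb_autoinc_lock_mode=2
# Listen for TCP/IP connections on all interfaces (adjust for your network)
bind_address=0.0.0.0
```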
Cluster Name and Address
MariaDB Enterprise Cluster requires that you set a name for your cluster, using the wsrep_cluster_name system variable. When nodes connect to each other, they check the cluster name to ensure that they've connected to the correct cluster before replicating data. All Servers in the cluster must have the same value for this system variable.
Using the wsrep_cluster_address system variable, you can define the back-end protocol (always gcomm) and comma-separated list of the IP addresses or domain names of the other nodes in the cluster.
It is best practice to list all nodes on this system variable, as this is the list the node searches when attempting to reestablish network connectivity with the primary component.
Note: In certain environments, such as deployments in the cloud, you may also need to set the wsrep_node_address system variable, so that MariaDB Enterprise Server properly informs other Servers how to reach it.
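A sketch of these settings, using hypothetical node addresses from the documentation range 192.0.2.0/24:

```ini
[mariadb]
wsrep_cluster_name="my_cluster"
wsrep_cluster_address="gcomm://192.0.2.1,192.0.2.2,192.0.2.3"
```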
Galera Replicator Plugin
MariaDB Enterprise Server connects to other Servers and replicates data from the cluster through a wsrep Provider called the Galera Replicator plugin. In order to enable clustering, specify the path to the relevant .so file using the wsrep_provider system variable.
MariaDB Enterprise Server 10.4 and later installations use an enterprise-build of the Galera Enterprise 4 plugin. This includes all the features of Galera Cluster 4 as well as enterprise features like GCache encryption.
To enable MariaDB Enterprise Cluster, use the libgalera_enterprise_smm.so library:
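For example, assuming the plugin is installed at a typical library path (the exact path depends on your distribution):

```ini
[mariadb]
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_enterprise_smm.so
```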
Earlier versions of MariaDB Enterprise Server use the older community release of the Galera 3 plugin. This is set using the libgalera_smm.so library:
In addition to system variables, there is a set of options that you can pass to the wsrep Provider to configure or to otherwise adjust its operations. This is done through the wsrep_provider_options system variable:
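A sketch with illustrative values; as with any wsrep Provider configuration, every option that should differ from its default must appear in the quoted string:

```ini
[mariadb]
wsrep_provider_options="gcache.size=2G;gcs.fc_limit=256"
```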
Additional information is available in the Reference chapter.
Cluster Replication
MariaDB Enterprise Cluster implements a multi-primary replication solution.
When you write to a table on a node, the node collects the write into a write-set transaction, which it then replicates to the other nodes in the cluster.
Your application can write to any node in the cluster. Each node certifies the replicated write-set. If the transaction has no conflicts, the nodes apply it. If the transaction does have conflicts, it is rejected and all of the nodes revert the changes.
Quorum
The first node you start in MariaDB Enterprise Cluster bootstraps the Primary Component. Each subsequent node that establishes a connection joins and synchronizes with the Primary Component. A cluster achieves a quorum when more than half the nodes are joined to the Primary Component.
When a component forms that has less than half the nodes in the cluster, it becomes non-operational, since it believes there is a running Primary Component to which it has lost network connectivity.
These quorum requirements, combined with the requisite number of odd nodes, avoid a split brain situation, or one in which two separate components believe they are each the Primary Component.
Dynamically Bootstrapping the Cluster
In cases where the cluster goes down and your nodes become non-operational, you can dynamically bootstrap the cluster.
First, find the most up-to-date node (that is, the node with the highest value for the wsrep_last_committed status variable):
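For example, run the following on each node and compare the results:

```sql
SHOW GLOBAL STATUS LIKE 'wsrep_last_committed';
```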
Once you determine the node with the most recent transaction, you can designate it as the Primary Component by running the following on it:
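The bootstrap is triggered through the pc.bootstrap wsrep provider option:

```sql
SET GLOBAL wsrep_provider_options='pc.bootstrap=YES';
```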
The node bootstraps the Primary Component onto itself. Other nodes in the cluster with network connectivity then submit state transfer requests to this node to bring their local databases into sync with what's available on this node.
State Transfers
From time to time a node can fall behind the cluster. This can occur due to expensive operations being issued to it or due to network connectivity issues that lead to write-sets backing up in the queue. Whatever the cause, when a node finds that it has fallen too far behind the cluster, it attempts to initiate a state transfer.
In a state transfer, the node connects to another node in the cluster and attempts to bring its local database back in sync with the cluster. There are two types of state transfers:
Incremental State Transfer (IST)
State Snapshot Transfer (SST)
When the donor node receives a state transfer request, it checks its write-set cache (that is, the GCache) to see if it has enough saved write-sets to bring the joiner into sync. If the donor node has the intervening write-sets, it performs an IST operation, where the donor node only sends the missing write-sets to the joiner. The joiner applies these write-sets following the global ordering to bring its local databases into sync with the cluster.
When the donor does not have enough write-sets cached for an IST, it runs an SST operation. In an SST, the donor uses a backup solution, like MariaDB Enterprise Backup, to copy its data directory to the joiner. When the joiner completes the SST, it begins to process the write-sets that came in during the transfer. Once it's in sync with the cluster, it becomes operational.
ISTs provide the best performance for state transfers; the size of the GCache may need adjustment to facilitate their use.
Flow Control
MariaDB Enterprise Server uses Flow Control to throttle transactions when necessary and ensure that all nodes keep pace with the cluster.
Write-sets that replicate to a node are collected by the node in its received queue. The node then processes the write-sets according to global ordering. Large transactions, expensive operations, or simple hardware limitations can lead to the received queue backing up over time.
When a node's received queue grows beyond certain limits, the node initiates Flow Control. In Flow Control, the node pauses replication to work through the write-sets it already has. Once it has worked the received queue down to a certain size, it re-initiates replication.
Eviction
A node is removed, or evicted, from the cluster if it becomes non-responsive.
In MariaDB Enterprise Cluster, each node monitors network connectivity and response times from every other node. MariaDB Enterprise Cluster evaluates network performance using the EVS Protocol.
When a node finds another to have poor network connectivity, it adds an entry to the delayed list. If the node becomes active again and its network performance improves for a certain amount of time, its entries are removed from the delayed list. In other words, the longer a node has had network problems, the longer it must perform well before it is cleared from the delayed list.
If the number of entries for a node in the delayed list exceeds a threshold established for the cluster, the EVS Protocol evicts the node from the cluster.
Evicted nodes become non-operational components. They cannot rejoin the cluster until you restart MariaDB Enterprise Server.
Streaming Replication
Under normal operation, huge transactions and long-running transactions are difficult to replicate. MariaDB Enterprise Cluster rejects conflicting transactions and rolls back the changes. A transaction that takes several minutes or longer to run can encounter issues if a small transaction is run on another node and attempts to write to the same table. The large transaction fails because it encounters a conflict when it attempts to replicate.
MariaDB Enterprise Server 10.4 and later support streaming replication for MariaDB Enterprise Cluster. In streaming replication, huge transactions are broken into transactional fragments, which are replicated and applied as the operation runs. This makes it more difficult for intervening sessions to introduce conflicts.
Initiate Streaming Replication
To initiate streaming replication, set the wsrep_trx_fragment_unit and wsrep_trx_fragment_size system variables. You can set the unit to BYTES, ROWS, or STATEMENTS:
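For example, to replicate fragments of ten rows each (the values shown are illustrative):

```sql
SET SESSION wsrep_trx_fragment_unit='ROWS';
SET SESSION wsrep_trx_fragment_size=10;
```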
Then, run your transaction.
Streaming replication works best with very large transactions where you don't expect to encounter conflicts. If the statement does encounter a conflict, the rollback operation is much more expensive than usual. As such, it's best practice to enable streaming replication at a session-level and to disable it by setting the wsrep_trx_fragment_size system variable to 0 when it's not needed.
Galera Arbitrator
Deployments on mixed hardware can introduce issues where some MariaDB Enterprise Servers perform better than others. A Server in one part of the world might perform more reliably or be physically closer to most users than others. In cases where a particular MariaDB Enterprise Server holds logical significance for your cluster, you can weight its value in quorum calculations.
Galera Arbitrator is a separate process that runs alongside MariaDB Enterprise Server. While the Arbitrator does not take part in replication, whenever the cluster performs quorum calculations it gives the Arbitrator a vote as though it were another MariaDB Enterprise Server. In effect this means that the system has the vote of MariaDB Enterprise Server plus any running Arbitrators in determining whether it's part of the Primary Component.
Bear in mind that the Galera Arbitrator is a separate package, galera-arbitrator-4, which is not installed by default with MariaDB Enterprise Server.
Scale-out
MariaDB Enterprise Servers that join a cluster attempt to connect to the IP addresses provided to the wsrep_cluster_address system variable. This variable adjusts itself at runtime to include the addresses of all connected nodes.
To scale-out MariaDB Enterprise Cluster, start new MariaDB Enterprise Servers with the appropriate wsrep_cluster_address list and the same wsrep_cluster_name value. The new nodes establish network connectivity with the running cluster and request a state transfer to bring their local database into sync with the cluster.
Once the MariaDB Enterprise Server reports itself as being in sync with the cluster, MariaDB MaxScale can begin including it in the load distribution for the cluster.
Being a multi-primary replication solution means that any MariaDB Enterprise Server in the cluster can handle write operations, but write scale-out is minimal as every Server in the cluster needs to apply the changes.
Failover
MariaDB Enterprise Cluster does not provide failover capabilities on its own. MariaDB MaxScale is used to route client connections to MariaDB Enterprise Server.
Unlike a traditional load balancer, MaxScale is aware of changes in the node and cluster states.
MaxScale takes nodes out of the distribution that initiate a blocking SST operation or Flow Control or otherwise go down, which allows them to recover or catch up without stopping service to the rest of the cluster.
Backups
With MariaDB Enterprise Cluster, each node contains a replica of all the data in the cluster. As such, you can run a backup on any node to back up the available data. The process for backing up a node is the same as for a single MariaDB Enterprise Server.
Encryption
MariaDB Enterprise Server supports data-at-rest encryption to secure data on disk, and data-in-transit encryption to secure data on the network.
MariaDB Enterprise Server supports data-at-rest encryption of the GCache, the file used by Galera systems to cache write-sets. Encrypting the GCache ensures the Server encrypts both the data it temporarily caches from the cluster and the data it permanently stores in tablespaces.
For data-in-transit, MariaDB Enterprise Cluster supports the same encryption as MariaDB Server and additionally provides data-in-transit encryption for Galera replication traffic and for State Snapshot Transfer (SST) traffic.
Data-in-Transit Encryption
MariaDB Enterprise Server 10.6 encrypts Galera replication and SST traffic using the server's TLS configuration by default. With the wsrep_ssl_mode system variable, you can adjust which TLS configuration the node uses.
MariaDB Enterprise Server 10.5 and earlier support encrypting Galera replication and SST traffic through wsrep provider options.
TLS encryption is only available when used by all nodes in the cluster.
Enabling GCache Encryption
To encrypt data-at-rest such as GCache, stop the server, set encrypt_binlog=ON within the MariaDB Enterprise Server configuration file, and restart the server. This variable also controls encryption of the binary log and the relay log when used.
Disabling GCache Encryption
To stop using encryption on the GCache file, stop the server, set encrypt_binlog=OFF within the MariaDB Enterprise Server configuration file, and restart the server. This variable also controls encryption of the binary log and the relay log when used.
MariaDB Galera Cluster is a Linux-exclusive, multi-primary cluster designed for MariaDB, offering features such as active-active topology, read/write capabilities on any node, automatic membership and node joining, true parallel replication at the row level, and direct client connections, with an emphasis on the native MariaDB experience.
About
MariaDB Galera Cluster is a virtually synchronous multi-primary cluster for MariaDB. It is available on Linux only and supports only the InnoDB storage engine, although there is experimental support for some other storage engines; see the relevant system variables for details.
Features
Active-active multi-primary topology
Read and write to any cluster node
Benefits
The above features yield several benefits for a DBMS clustering solution, including:
No replica lag
No lost transactions
Read scalability
Smaller client latencies
The page has instructions on how to get up and running with MariaDB Galera Cluster.
A great resource for Galera users is the codership-team mailing list (codership-team 'at' googlegroups (dot) com). If you use Galera, it is recommended that you subscribe.
Galera Versions
MariaDB Galera Cluster is powered by:
MariaDB Server.
The Galera wsrep provider library.
The functionality of MariaDB Galera Cluster can be obtained by installing the standard MariaDB Server packages and the Galera wsrep provider package. The following provider version corresponds to each MariaDB Server version:
In MariaDB 10.4 and later, MariaDB Galera Cluster uses Galera 4. This means that the wsrep API version is 26 and the Galera wsrep provider library is version 4.X.
In MariaDB 10.3 and before, MariaDB Galera Cluster uses Galera 3. This means that the wsrep API is version 25 and the Galera wsrep provider library is version 3.X.
See the Galera versions documentation for more information about how to interpret these version numbers.
Galera 4 Versions
The following table lists each version of the Galera 4 wsrep provider, along with the MariaDB version in which each was first released. If you would like to install Galera 4 using yum, apt, or zypper, the package is called galera-4.
Galera Version
Released in MariaDB Version
Cluster Failure and Recovery Scenarios
While a Galera Cluster is designed for high availability, various scenarios can lead to node or cluster outages. This guide describes common failure situations and the procedures to safely recover from them.
Graceful Shutdown Scenarios
This covers situations where nodes are intentionally stopped for maintenance or configuration changes, based on a three-node cluster.
One Node is Gracefully Stopped
When one node is stopped, it sends a message to the other nodes, and the cluster size is reduced. Properties like Quorum calculation are automatically adjusted. As soon as the node is started again, it rejoins the cluster based on its wsrep_cluster_address variable.
If the write-set cache (gcache.size) on a donor node still has all the transactions that were missed, the node will rejoin using a fast Incremental State Transfer (IST). If not, it will automatically fall back to a full State Snapshot Transfer (SST).
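For example, to enlarge the write-set cache so that short outages can be served by IST rather than a full SST, you can set gcache.size in the option file. The 2G figure is illustrative; size it to your write volume and expected downtime:

```ini
[mariadb]
wsrep_provider_options = "gcache.size=2G"
```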
Two Nodes Are Gracefully Stopped
The single remaining node forms a Primary Component and can serve client requests. To bring the other nodes back, you simply start them.
However, the single running node must act as a Donor for the state transfer. During the SST, its performance may be degraded, and some load balancers may temporarily remove it from rotation. For this reason, it's best to avoid running with only one node.
All Three Nodes Are Gracefully Stopped
When the entire cluster is shut down, you must bootstrap it from the most advanced node to prevent data loss.
Identify the most advanced node: On each server, check the seqno value in the /var/lib/mysql/grastate.dat file. The node with the highest seqno was the last to commit a transaction.
Bootstrap from that node: Use the appropriate MariaDB script to start a new cluster from this node only.
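The check-and-bootstrap procedure can be sketched as follows. The grastate.dat contents below are an illustrative sample; on a real node the file lives at /var/lib/mysql/grastate.dat:

```shell
# Create a sample grastate.dat (illustrative contents only)
cat > /tmp/grastate.dat <<'EOF'
# GALERA saved state
version: 2.1
uuid:    5ee99582-bb8d-11e2-b8e3-23de375c1d30
seqno:   8204503945773
safe_to_bootstrap: 0
EOF

# Extract the saved sequence number; the node with the highest seqno wins
awk '/^seqno:/ {print $2}' /tmp/grastate.dat

# On the most advanced node only, bootstrap the new cluster:
#   sudo galera_new_cluster
```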
Unexpected Node Failure (Crash) Scenarios
This covers situations where nodes become unavailable due to a power outage, hardware failure, or software crash.
One Node Disappears from the Cluster
If one node crashes, the two remaining nodes will detect the failure after a timeout period and remove the node from the cluster. Because they still have Quorum (2 out of 3), the cluster continues to operate without service disruption. When the failed node is restarted, it will rejoin automatically as described above.
Two Nodes Disappear from the Cluster
The single remaining node cannot form a Quorum by itself. It will switch to a non-Primary state and refuse to serve queries to protect data integrity. Any query attempt will result in an error:
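For example, a client query against a non-Primary node fails with an error similar to the following (exact wording varies by version):

```sql
SELECT * FROM test.t1;
-- ERROR 1047 (08S01): WSREP has not yet prepared node for application use
```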
Recovery:
If the other nodes come back online, the cluster will re-form automatically.
If the other nodes have permanently failed, you must manually force the remaining node to become a new Primary Component. Warning: Only do this if you are certain the other nodes are permanently down.
All Nodes Go Down Without a Proper Shutdown
In a datacenter power failure or a severe bug, all nodes may crash. The grastate.dat file will not be updated correctly and will show seqno: -1.
Recovery:
On each node, run mysqld with the --wsrep-recover option. This will read the database logs and report the node's last known transaction position (GTID).
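As a sketch, the recovery run looks like this; the printed position is illustrative:

```shell
# Report the last committed position; the result is written to the error log
sudo mysqld --wsrep-recover
# Look for a line such as:
#   [Note] WSREP: Recovered position: 5ee99582-bb8d-11e2-b8e3-23de375c1d30:8204503945773
```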
Compare the sequence numbers from the recovered position on all nodes.
Recovering from a Split-Brain Scenario
A split-brain occurs when a network partition splits the cluster, and no resulting group has a Quorum. This is most common with an even number of nodes. All nodes will become non-Primary.
Recovery:
Choose one of the partitioned groups to become the new Primary Component.
On one node within that chosen group, manually force it to bootstrap:
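Bootstrapping is done by setting the pc.bootstrap provider option on that node:

```sql
SET GLOBAL wsrep_provider_options = 'pc.bootstrap=true';
```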
This group will now become operational. When network connectivity is restored, the nodes from the other partition will automatically detect this Primary Component and rejoin it.
Never execute the bootstrap command on both sides of a partition. This will create two independent, active clusters with diverging data, leading to severe data inconsistency.
See Also
The codership-team mailing list (codership-team 'at' googlegroups (dot) com) - a great mailing list for Galera users.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB Galera Cluster Usage Guide
Quickstart Guide: MariaDB Galera Cluster Usage
This guide provides essential information for effectively using and interacting with a running MariaDB Galera Cluster. It covers connection methods, operational considerations, monitoring, and best practices for applications.
1. Connecting to the Cluster
Since Galera Cluster is multi-primary, any node can accept read and write connections.
a. Using a Load Balancer (Recommended for Production):
Deploying a load balancer or proxy (like MariaDB MaxScale, ProxySQL, or HAProxy) is the recommended approach.
MariaDB MaxScale: Provides intelligent routing (e.g., the readwritesplit and readconnroute routers), connection pooling, and advanced cluster awareness (e.g., binlogrouter for replication clients, switchover for failover).
Other Load Balancers: Configure them to distribute connections across your Galera nodes, typically using health checks on port 3306 or other cluster-specific checks.
b. Direct Connection:
You can connect directly to any individual node's IP address or hostname using standard MariaDB client tools or connectors (e.g., mariadb command-line client, MariaDB Connector/J, Connector/Python).
Example (command-line):
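A minimal sketch of a direct connection; the host, port, and user are illustrative:

```shell
mariadb --host=192.0.2.11 --port=3306 --user=app_user -p
```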
While simple, this method lacks automatic failover; your application would need to handle connection retries and failover logic.
2. Basic Operations (Reads & Writes)
Active-Active: You can perform both read and write operations on any node in the cluster. All successful write operations are synchronously replicated to all other nodes.
Transactions: Standard SQL transactions (START TRANSACTION, COMMIT, ROLLBACK) work as expected. Galera handles the replication of committed transactions.
3. DDL (Data Definition Language) Operations
DDL operations (like CREATE TABLE, ALTER TABLE, DROP TABLE) require special attention in a synchronous multi-primary cluster to avoid conflicts and outages.
Total Order Isolation (TOI) - Default:
This is Galera's default DDL method.
The DDL statement is executed on all nodes in the same order, and it temporarily blocks other transactions on all nodes while it applies.
4. Monitoring Cluster Status
Regularly monitor your Galera Cluster to ensure its health and consistency.
wsrep_cluster_size: Number of nodes currently in the Primary Component.
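For example:

```sql
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';
```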
Expected value: the total number of nodes configured (e.g., 3).
wsrep_local_state_comment / wsrep_local_state: The state of the current node.
5. Handling Node Failures and Recovery
Galera Cluster is designed for automatic recovery, but understanding the process is key.
Node Failure: If a node fails, the remaining nodes continue to operate as the Primary Component. The failed node will automatically attempt to rejoin when it comes back online.
Split-Brain Scenarios: If the network partitions the cluster, nodes will try to form a "Primary Component." The partition with the majority of nodes forms the new Primary Component. If no majority can be formed (e.g., a 2-node cluster splits), the cluster will become inactive. A 3-node or higher cluster is recommended to avoid this.
Manual Bootstrapping (Last Resort): If the entire cluster goes down or a split-brain occurs where no Primary Component forms, you might need to manually "bootstrap" a new Primary Component from one of the healthy nodes.
6. Application Best Practices
Use Connection Pooling: Essential for managing connections efficiently in high-traffic applications.
Short Transactions: Keep transactions as short and concise as possible to minimize conflicts and improve throughput. Long-running transactions increase the risk of rollbacks due to certification failures.
Primary Keys: All tables should have a primary key. Galera relies on primary keys for efficient row-level replication. Tables without primary keys can cause performance degradation and issues.
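One way to locate offending tables is to query information_schema; this is a sketch, and you may want to adjust the excluded schemas for your environment:

```sql
-- List base tables that have no PRIMARY KEY constraint
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys')
  AND c.constraint_type IS NULL;
```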
By following these guidelines, you can effectively manage and operate your MariaDB Galera Cluster for high availability and performance.
Performing Schema Upgrades in Galera Cluster
Performing schema changes (i.e., Data Definition Language or DDL statements like ALTER TABLE, CREATE INDEX) in a MariaDB Galera Cluster requires special handling. Because Galera is a multi-primary cluster where all nodes must remain in sync, a schema change on one node must be safely replicated to all other nodes without causing inconsistencies or blocking the entire cluster for an extended period.
MariaDB Galera Cluster provides two methods for handling schema upgrades:
Total Order Isolation (TOI): Default and safest method. The DDL statement is replicated to all nodes, blocking the entire cluster until all preceding transactions complete.
Rolling Schema Upgrade (RSU): Advanced, non-blocking method. The DDL is executed on the local node, with changes applied manually to each node in sequence, keeping the cluster online.
The method used is controlled by the wsrep_OSU_method system variable.
Total Order Isolation (TOI)
Total Order Isolation is the default method for schema upgrades (wsrep_OSU_method = 'TOI'). It ensures maximum data consistency by treating the DDL statement like any other replicated write, applied in the same total order on every node.
How TOI Works
When you execute a DDL statement, such as ALTER TABLE..., on any node in a cluster, the following process occurs:
Replication: The statement is replicated across all nodes in the cluster.
Transaction Wait: Each node waits for any pre-existing transactions to complete before proceeding.
Execution: Once caught up, the node executes the DDL statement.
Advantages of TOI
Simplicity and Safety: It is the easiest and safest method. It guarantees that the schema is identical on all nodes at all times.
Consistency: There is no risk of data drifting or replication errors due to schema mismatches.
Disadvantages of TOI
A major drawback of TOI is that DDL statements block the entire cluster, preventing any node from processing write transactions during a schema change. This can lead to significant application downtime, especially for large tables that take a long time to alter.
When to Use TOI
TOI is the recommended method for:
Schema changes that are known to be very fast.
Environments where a short period of cluster-wide write unavailability is acceptable.
Situations where schema consistency is the absolute highest priority.
Rolling Schema Upgrade (RSU)
Rolling Schema Upgrade is a non-blocking method (wsrep_OSU_method = 'RSU') that allows you to perform schema changes without taking the entire cluster offline.
How RSU Works
The RSU method tells the cluster to not replicate the DDL statement. The change is only applied to the local node where you execute the command. It is then the administrator's responsibility to apply the same change to the other nodes one by one.
Steps to Apply Schema Changes to a Cluster
Set the RSU Method:
On the first node, set the session to RSU mode:
SET SESSION wsrep_OSU_method = 'RSU';
Remove the Node from Rotation:
Remove the node from the load balancer rotation to stop it from receiving traffic.
Apply the Schema Change:
Execute the DDL statement (e.g., ALTER TABLE ...) on the isolated node.
Advantages of RSU
High Availability: The cluster remains online and available to serve traffic throughout the entire process, as you only ever take one node out of rotation at a time.
No Cluster-Wide Blocking: Application writes can continue on the active nodes.
Disadvantages of RSU
Complexity and Risk: The process is manual and more complex, which introduces a higher risk of human error.
Temporary Inconsistency: For the duration of the upgrade, your cluster will have a mixed schema, where some nodes have the old schema and others have the new one. This can cause replication errors or failed queries if a transaction that relies on the new schema is sent to a node that has not yet been upgraded.
When to Use RSU
RSU is the best method for:
Applying long-running schema changes to large tables where cluster downtime is not acceptable.
Environments where high availability is the top priority.
It requires careful planning and a good understanding of your application's queries to ensure that no replication errors occur during the upgrade process.
This page is licensed: CC BY-SA / Gnu FDL
Quorum Control with Weighted Votes
This page is a deep-dive into the advanced feature of weighted quorum. For a general overview of Quorum, its role in monitoring, and basic recovery, see Understanding Quorum, Monitoring, and Recovery.
MariaDB Galera Cluster supports a weighted quorum, where each node can be assigned a weight in the range of 0 to 255, with which it will participate in quorum calculations. This provides fine-grained control over which nodes are most critical for forming a Primary Component, especially in complex or geographically distributed topologies.
By default, every node has a weight of 1. You can customize a node's weight at runtime by setting the pc.weight provider option:
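For example, to give the current node a weight of 3:

```sql
SET GLOBAL wsrep_provider_options = 'pc.weight=3';
```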
Quorum Calculation with Weights
The quorum is preserved if, and only if, the sum of the weights of the nodes in a new component is strictly more than half the total weight of the preceding Primary Component (minus any nodes that left gracefully).
The formal calculation (with w_i the pc.weight of member i) is:

  sum(w_i : i in M_current) > 1/2 × ( sum(w_i : i in M_last) − sum(w_i : i in M_left) )

Where:
M_last: Members of the last seen Primary Component.
M_left: Members that are known to have left gracefully.
M_current: Members of the current component being evaluated.
Changing a node's weight is a cluster-wide membership event. If a network partition occurs at the exact moment a weight-change message is being delivered, it can lead to a corner case where the entire cluster becomes non-primary.
Practical Examples of Weighted Quorum
Prioritizing a Primary Node
In a three-node cluster, to make node1 the most critical for maintaining the Primary Component:
node1: pc.weight = 2
node2: pc.weight = 1
node3: pc.weight = 0
With these weights (total 3), if node2 and node3 fail, node1 still holds more than half the total weight and remains primary. If node1 fails, the remaining weight is not a majority, so the other two nodes become non-primary.
Simple Primary/Replica Failover
In a two-node cluster, to ensure node1 is always the primary in case of a network split:
node1 (Primary): pc.weight = 1
node2 (Replica): pc.weight = 0
Primary and Secondary Site Scenario
For a four-node cluster with two nodes at a primary site and two at a secondary site:
Primary Site:
node1: pc.weight = 2
node2: pc.weight = 2
Secondary Site:
node3: pc.weight = 1
node4: pc.weight = 1
If the secondary site, or the network link between the sites, fails, the primary site maintains quorum. Additionally, one node at the primary site can fail without causing an outage.
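The arithmetic behind this example can be checked directly against the quorum rule: the remaining weight must be strictly more than half of the previous total.

```shell
prev_total=6    # node1(2) + node2(2) + node3(1) + node4(1)
remaining=4     # primary site survives: node1(2) + node2(2)

# Quorum is kept when 2 * remaining > prev_total (strict majority by weight)
if [ $((2 * remaining)) -gt "$prev_total" ]; then
  echo "quorum kept"
else
  echo "quorum lost"
fi
```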
This page is licensed: CC BY-SA / Gnu FDL
Securing Communications in Galera Cluster
By default, Galera Cluster replicates data between each node without encrypting it. This is generally acceptable when the cluster nodes run on the same host or in networks where security is guaranteed through other means. However, when cluster nodes are on separate networks, or in a high-risk network, the lack of encryption introduces security concerns: a malicious actor could eavesdrop on the traffic, or obtain a complete copy of the data by triggering an SST.
To mitigate this concern, Galera Cluster allows you to encrypt data in transit as it is replicated between each cluster node using the Transport Layer Security (TLS) protocol. TLS was formerly known as Secure Sockets Layer (SSL), but, strictly speaking, SSL is a predecessor of TLS, and that version of the protocol is now considered insecure. The documentation still often uses the term SSL, and for compatibility reasons TLS-related server system and status variables still use the prefix ssl_, but internally MariaDB only supports its secure successors.
In order to secure connections between the cluster nodes, you need to ensure that all servers were compiled with TLS support; see the TLS documentation to determine how to check whether a server was compiled with TLS support.
For each cluster node, you also need a certificate, private key, and the Certificate Authority (CA) chain to verify the certificate. If you want to use self-signed certificates created with OpenSSL, see the certificate-creation documentation for information on how to create them.
Securing Galera Cluster Replication Traffic
In order to enable TLS for Galera Cluster's replication traffic, there are a number of wsrep_provider_options that you need to set:
Set the path to the server's certificate with the socket.ssl_cert wsrep_provider_option.
Set the path to the server's private key with the socket.ssl_key wsrep_provider_option.
Set the path to the certificate authority (CA) chain that can verify the server's certificate with the socket.ssl_ca wsrep_provider_option.
It is also a good idea to set MariaDB Server's regular TLS-related system variables, so that TLS is enabled for regular client connections as well.
For example, to set these variables for the server, add them to a relevant server option group in an option file:
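A sketch of such an option-file fragment, using the socket.ssl_* provider options; the certificate paths are illustrative assumptions:

```ini
[mariadb]
ssl_cert = /etc/my.cnf.d/certificates/server-cert.pem
ssl_key  = /etc/my.cnf.d/certificates/server-key.pem
ssl_ca   = /etc/my.cnf.d/certificates/ca.pem
wsrep_provider_options = "socket.ssl_cert=/etc/my.cnf.d/certificates/server-cert.pem;socket.ssl_key=/etc/my.cnf.d/certificates/server-key.pem;socket.ssl_ca=/etc/my.cnf.d/certificates/ca.pem"
```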
And then restart the server to make the changes persistent.
By setting both MariaDB Server's TLS-related system variables and Galera Cluster's TLS-related wsrep_provider_options, the server can secure both external client connections and Galera Cluster's replication traffic.
Securing State Snapshot Transfers
The method that you would use to enable TLS for State Snapshot Transfers (SSTs) depends on the value of the wsrep_sst_method variable.
mariadb-backup
See the mariadb-backup SST documentation for more information.
xtrabackup-v2
See the xtrabackup-v2 SST documentation on TLS for more information.
mysqldump
This SST method simply uses the mariadb-dump (previously mysqldump) utility, so TLS is enabled by configuring TLS for ordinary client connections.
rsync
This SST method supports encryption in transit via stunnel. See the rsync SST documentation for more information.
This page is licensed: CC BY-SA / Gnu FDL
Upgrading from MariaDB 10.5 to MariaDB 10.6 with Galera Cluster
Galera Cluster ships with MariaDB Server. Upgrading a Galera Cluster node is very similar to upgrading a server from MariaDB 10.5 to MariaDB 10.6. For more information on that process, as well as on incompatibilities between versions, see the upgrade guide.
Performing a Rolling Upgrade
The following steps can be used to perform a rolling upgrade from MariaDB 10.5 to MariaDB 10.6 when using Galera Cluster. In a rolling upgrade, each node is upgraded individually, so the cluster is always operational. There is no downtime from the application's perspective.
First, before you get started:
First, take a look at the release notes and changelog to see what has changed between the major versions.
Check whether any system variables or options have been changed or removed. Make sure that your server's configuration is compatible with the new MariaDB version before upgrading.
Check whether replication has changed in the new MariaDB version in any way that could cause issues while the cluster contains upgraded and non-upgraded nodes.
Before you upgrade, it would be best to take a backup of your database. This is always a good idea to do before an upgrade. We would recommend mariadb-backup.
Then, for each node, perform the following steps:
1
Modify the repository configuration, so the system's package manager installs the new MariaDB version. The exact steps depend on whether you use apt, yum/dnf, or zypper repositories; see your distribution's repository documentation for more information.
When this process is done for one node, move onto the next node.
When upgrading the Galera wsrep provider, sometimes the Galera protocol version can change. The Galera wsrep provider should not start using the new protocol version until all cluster nodes have been upgraded to the new version, so this is not generally an issue during a rolling upgrade. However, this can cause issues if you restart a non-upgraded node in a cluster where the rest of the nodes have been upgraded.
This page is licensed: CC BY-SA / Gnu FDL
Installing Galera from Source
There are binary installation packages available for RPM and Debian-based distributions, which will pull in all required Galera dependencies.
If these are not available, you will need to build Galera from source.
The wsrep API for Galera Cluster is included in MariaDB Server by default, so you can follow the usual instructions for building MariaDB from source.
Preparation
make cannot manage dependencies for the build process, so the following packages need to be installed first:
Upgrading from MariaDB 10.4 to MariaDB 10.5 with Galera Cluster
Galera Cluster ships with MariaDB Server. Upgrading a Galera Cluster node is very similar to upgrading a server from MariaDB 10.4 to MariaDB 10.5. For more information on that process, as well as on incompatibilities between versions, see the upgrade guide.
Performing a Rolling Upgrade
The following steps can be used to perform a rolling upgrade from MariaDB 10.4 to MariaDB 10.5 when using Galera Cluster. In a rolling upgrade, each node is upgraded individually, so the cluster is always operational. There is no downtime from the application's perspective.
Recovering a Primary Component
In a MariaDB Galera Cluster, an individual node is considered to have "failed" when it loses communication with the cluster's Primary Component. This can happen for many reasons, including hardware failure, a software crash, loss of network connectivity, or a critical error during a state transfer.
From the perspective of the cluster, a node has failed when the other members can no longer see it. From the perspective of the failed node itself (assuming it hasn't crashed), it has simply lost its connection to the Primary Component and will enter a non-operational state to protect data integrity.
The EVS Protocol
Node failure detection is handled automatically by Galera's EVS (Extended Virtual Synchrony) protocol.
Understanding Quorum, Monitoring, and Recovery
Quorum is essential for maintaining data consistency in a MariaDB Galera Cluster by safeguarding against network partitions and split-brain scenarios. It ensures that the cluster processes database queries and transactions only when a majority of nodes are operational, healthy, and in communication.
Primary Component
This majority group is known as the Primary Component. Nodes not in this group switch to a non-primary state, halting queries and entering a read-only "safe mode" to prevent data discrepancies. The primary function of Quorum is to avoid "split-brain" scenarios, which occur when network partitions lead to parts of the cluster operating independently and accepting writes. By ensuring only the partition with a majority of nodes becomes the Primary Component, Quorum effectively prevents these inconsistencies.
Using MariaDB Replication with MariaDB Galera Cluster
MariaDB replication and MariaDB Galera Cluster can be used together. However, there are some things that have to be taken into account.
Tutorials
If you want to use MariaDB replication and MariaDB Galera Cluster together, then the following tutorials may be useful:
Manual SST of Galera Cluster Node With mariadb-backup
Perform a manual node provision. This guide details the steps to manually backup a donor and restore it to a joiner node in a Galera Cluster.
Sometimes it can be helpful to perform a "manual SST" when Galera's normal SST methods fail. This can be especially useful when the cluster's dataset is very large, since a normal SST can take a long time to fail in that case.
A manual SST essentially consists of taking a backup of the donor, loading the backup on the joiner, and then manually editing the cluster state on the joiner node. This page shows how to perform this process with mariadb-backup.
Process
1
Resetting the Quorum (Cluster Bootstrap)
This page provides a step-by-step guide for an emergency recovery procedure. For a general overview of what Quorum is and how to monitor it, see
When a network failure or a crash affects over half of your cluster nodes, the cluster might lose its Primary Component. In such cases, the remaining nodes may return an Unknown command error for many queries. This behavior is a safeguard to prevent data inconsistency.
You can confirm this by checking the wsrep_cluster_status on all nodes:
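For example:

```sql
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';
-- A node inside the Primary Component reports 'Primary';
-- a partitioned node reports 'non-Primary'.
```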
Introduction to Galera Architecture
MariaDB Galera Cluster provides a synchronous replication system that uses an approach often called eager replication. In this model, nodes in a cluster synchronize with all other nodes by applying replicated updates as a single transaction. This means that when a transaction COMMITs, all nodes in the cluster have the same value. This process is accomplished using certification-based replication through a group communication framework.
Core Architectural Components
The internal architecture of MariaDB Galera Cluster revolves around four primary components:
Using MariaDB GTIDs with MariaDB Galera Cluster
MariaDB's global transaction IDs (GTIDs) are very useful when used with MariaDB replication, which is primarily what that feature was developed for. Galera Cluster, on the other hand, was developed by Codership for all MySQL and MariaDB variants, and the initial development of the technology pre-dated MariaDB's GTID implementation. As a side effect, Galera Cluster only partially supports MariaDB's GTID implementation, at least in older versions.
GTID Support for Write Sets Replicated by Galera Cluster
Galera Cluster has its own transaction ordering scheme that is substantially different from MariaDB's GTIDs. However, it would still be beneficial if MariaDB were able to associate a Galera Cluster write set with a GTID.
Known Limitations
This article contains information on known problems and limitations of MariaDB Galera Cluster.
Limitations from codership.com:
Currently, replication works only with the InnoDB storage engine. Any writes to tables of other types, including system (mysql.*) tables, are not replicated (this limitation excludes DDL statements such as CREATE USER, which implicitly modify system tables and are replicated at the statement level).
[mariadb]
...
# warn 3 days before certificate expiration
wsrep_certificate_expiration_hours_warning=72
SET GLOBAL wsrep_certificate_expiration_hours_warning=72;
SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options';
Resume Processing: After execution, the node can process new transactions.
Return the Node to Rotation: Once the ALTER statement is complete, add the node back to the load balancer.
Repeat for All Nodes: Repeat steps 1-4 for each node in the cluster, one at a time, until all nodes have the updated schema.
It ensures consistency but can cause brief pauses in application activity, especially on busy clusters.
Best Practice: Execute DDL during maintenance windows or low-traffic periods.
Rolling Schema Upgrade (RSU) / Percona's pt-online-schema-change:
For large tables or critical production systems, use tools like pt-online-schema-change (from Percona Toolkit) which performs DDL without blocking writes.
This tool works by creating a new table, copying data, applying changes, and then swapping the tables. It's generally preferred for minimizing downtime for ALTER TABLE operations.
wsrep_OSU_method:
This system variable controls how DDL operations are executed.
TOI (default): Total Order Isolation.
RSU: Rolling Schema Upgrade (requires applying the DDL manually on each node in turn; often combined with tools like pt-online-schema-change).
NBO (Non-Blocking Operations): A newer method allowing non-blocking DDL for some operations, but not fully implemented for all DDL types. Use with caution and test thoroughly.
Common wsrep_local_state / wsrep_local_state_comment values:
Joining (1): Node is in the process of joining the cluster.
Donor/Desynced (2): Node is transferring state to another node.
Joined (3): Node has joined the cluster and is catching up.
Synced (4): Node is fully synchronized and operational.
wsrep_incoming_addresses: Comma-separated list of the incoming (client) addresses of the current cluster members.
wsrep_cert_deps_distance: Average distance between the highest and lowest sequence numbers that can be applied in parallel; an indicator of the potential degree of parallelization in the workload.
wsrep_flow_control_paused: Percentage of time the node was paused due to flow control. High values indicate a bottleneck.
wsrep_local_recv_queue / wsrep_local_send_queue: Size of the receive/send queue. Ideally, these should be close to 0. Sustained high values indicate replication lag or node issues.
Choose the node that was most up-to-date.
Stop MariaDB on that node.
Start it with sudo galera_new_cluster, which starts the server with the --wsrep-new-cluster option.
Start other nodes normally; they will rejoin the bootstrapped component.
Retry Logic: Implement retry logic in your application for failed transactions (e.g., due to certification failures, deadlock, or temporary network issues).
Connect to a Load Balancer: Always direct your application's connections through a load balancer or proxy to leverage automatic failover and intelligent routing.
The Replicas receive the stream of changes and apply them to their own copy of the data. Replicas are typically used for read-only queries, backups, or as a hot standby for failover.
Transaction Reordering: This clever technique allows for the reordering of non-conflicting transactions before they are committed, which significantly increases parallelism and reduces the rate of transaction rollbacks.
As a member of the Primary Component, the server becomes operational, able to accept read and write queries from clients.
During startup, the Primary Component is the server bootstrapped to run as the Primary Component. Once the cluster is online, the Primary Component is any group of servers that includes more than half the total number of servers.
A Server or group of Servers that loses network connectivity with the majority of the cluster becomes non-operational.
In a cluster that spans multiple data centers, use an odd number of data centers, with a minimum of three.
Each data center in use should have at least one Server dedicated to backup operations. This can be another cluster node or a separate Replica Server kept in sync using MariaDB Replication.
Check whether any new features have been added to the new MariaDB version. If a new feature in the new MariaDB version cannot be replicated to the old MariaDB version, then do not use that feature until all cluster nodes have been upgraded to the new MariaDB version.
Next, make sure that the Galera version numbers are compatible.
If you are upgrading from the most recent release of the previous major series, then the Galera versions will be compatible.
You want to have a large enough gcache to avoid a State Snapshot Transfer (SST) during the rolling upgrade. The gcache size can be configured by setting gcache.size, for example: wsrep_provider_options="gcache.size=2G"
2
If you use a load balancing proxy such as MaxScale or HAProxy, make sure to drain the server from the pool so it does not receive any new connections.
3
Stop MariaDB.
4
Uninstall the old version of MariaDB and the Galera wsrep provider.
sudo apt-get remove mariadb-server galera
sudo yum remove MariaDB-server galera
sudo zypper remove MariaDB-server galera
5
Install the new version of MariaDB and the Galera wsrep provider.
Install the packages using your distribution's package manager (apt, yum/dnf, or zypper); see the corresponding installation documentation for more information.
6
Make any desired changes to configuration options in option files, such as my.cnf. This includes removing any system variables or options that are no longer supported.
7
On Linux distributions that use systemd, you may need to increase the service startup timeout, as the default timeout of 90 seconds may not be sufficient. See the systemd documentation for more information.
8
Start MariaDB.
9
Run mysql_upgrade with the --skip-write-binlog option.
mysql_upgrade does two things:
Ensures that the system tables in the mysql database are fully compatible with the new version.
Does a very quick check of all tables and marks them as compatible with the new version of MariaDB.
RPM-based:
Debian-based:
If you are running on an alternative system, or the packages above are not available, the following packages are required. You will need to check the repositories for the correct package names on your distribution; these may differ between distributions or require additional packages:
MariaDB Database Server with wsrep API
Git, CMake (on Fedora, both cmake and cmake-fedora are required), GCC and GCC-C++, Automake, Autoconf, and Bison, as well as development releases of libaio and ncurses.
Building
You can use Git to download the source code, as MariaDB source code is available through GitHub.
1. Clone the repository:
Check out the branch (e.g., 10.5-galera or 11.1-galera), for example:
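For example (the repository URL is the official MariaDB server mirror on GitHub; pick the branch matching your target version):

```
git clone https://github.com/MariaDB/server.git mariadb-server
cd mariadb-server
git checkout 11.1-galera
```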
Building the Database Server
The standard and Galera Cluster database servers are the same, except that for Galera Cluster, the wsrep API patch is included. Enable the patch with the CMake configuration options WITH_WSREP and WITH_INNODB_DISALLOW_WRITES. To build the database server, run the following commands:
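A minimal sketch of the build (the two flags named above; a production build will usually want additional CMake options):

```
cmake -DWITH_WSREP=ON -DWITH_INNODB_DISALLOW_WRITES=ON .
make
sudo make install
```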
There are also some build scripts in the *BUILD/* directory, which may be more convenient to use. For example, the following pre-configures the build options discussed above:
There are several others as well, so you can select the most convenient.
Besides the server with Galera support, you will also need a Galera provider.
Preparation
make cannot manage dependencies itself, so the following packages need to be installed first:
If you are running on a different distribution, or these commands are not available, the following packages are required. You will need to check the repositories for the correct package names on your distribution, as these may differ between distributions or require additional packages:
Galera Replication Plugin
SCons, as well as development releases of Boost (libboost_program_options, libboost_headers1), Check and OpenSSL.
Building
Run:
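The command referenced here clones the Galera provider source. A sketch using the upstream Codership repository:

```
git clone https://github.com/codership/galera.git
```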
After this, the source files for the Galera provider will be in the galera directory.
Building the Galera Provider
The Galera Replication Plugin both implements the wsrep API and operates as the database server's wsrep Provider. To build, cd into the galera/ directory and do:
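Galera builds with SCons. A sketch (the submodule step is needed when the wsrep headers are tracked as a git submodule in your checkout):

```
cd galera/
git submodule init
git submodule update
scons
```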
The path to libgalera_smm.so needs to be defined in the my.cnf configuration file.
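For example, in my.cnf (the provider path below is a common location, but it varies by distribution and install prefix):

```ini
[mysqld]
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
```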
Building Galera Replication Plugin from source on FreeBSD runs into issues due to Linux dependencies. To overcome these, either install the binary package: pkg install galera, or use the ports build available at /usr/ports/databases/galera.
Configuration
After building, a number of other steps are necessary:
Create the database server user and group:
Install the database (the path may be different if you specified CMAKE_INSTALL_PREFIX):
If you want to install the database in a location other than /usr/local/mysql/data, use the --basedir or --datadir options.
Change the user and group permissions for the base directory.
Create a system unit for the database server.
Galera Cluster can now be started using the service command and is set to start at boot.
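The steps above can be sketched as follows (paths assume the default CMAKE_INSTALL_PREFIX of /usr/local/mysql; the systemd unit path is an assumption, so check support-files/ in your source tree):

```
sudo groupadd mysql
sudo useradd -g mysql mysql
cd /usr/local/mysql
sudo ./scripts/mysql_install_db --user=mysql
sudo chown -R mysql:mysql /usr/local/mysql
# Install the systemd unit shipped with the source (path is an assumption):
sudo cp support-files/systemd/mariadb.service /etc/systemd/system/
sudo systemctl enable mariadb
```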
This page is licensed: CC BY-SA / Gnu FDL
First, before you get started:
First, take a look at the release notes to see what has changed between the major versions.
Check whether any system variables or options have been changed or removed. Make sure that your server's configuration is compatible with the new MariaDB version before upgrading.
Check whether replication has changed in the new MariaDB version in any way that could cause issues while the cluster contains upgraded and non-upgraded nodes.
Check whether any new features have been added to the new MariaDB version. If a new feature in the new MariaDB version cannot be replicated to the old MariaDB version, then do not use that feature until all cluster nodes have been upgraded to the new MariaDB version.
Next, make sure that the Galera version numbers are compatible.
If you are upgrading from the most recent release to , then the versions will be compatible.
See the documentation on Galera versions for information about which MariaDB releases use which Galera wsrep provider versions.
You want to have a large enough gcache to avoid a State Snapshot Transfer (SST) during the rolling upgrade. The gcache size can be configured by setting gcache.size. For example: wsrep_provider_options="gcache.size=2G"
Before you upgrade, it is best to take a backup of your database. This is always a good idea to do before an upgrade.
Then, for each node, perform the following steps:
1
Modify the repository configuration, so the system's package manager installs the new version of MariaDB.
See the installation instructions for your distribution for more information.
2
If you use a load balancing proxy, such as HAProxy, make sure to drain the server from the pool so it does not receive any new connections.
3
Stop MariaDB.
4
Uninstall the old version of MariaDB and the Galera wsrep provider.
5
Install the new version of MariaDB and the Galera wsrep provider.
See the installation instructions for your distribution for more information.
6
Make any desired changes to configuration options in your option files, such as my.cnf. This includes removing any system variables or options that are no longer supported.
7
On Linux distributions that use systemd, you may need to increase the service startup timeout, as the default timeout of 90 seconds may not be sufficient.
8
Start MariaDB.
9
Run mysql_upgrade with the --skip-write-binlog option.
mysql_upgrade does two things:
Ensures that the system tables in the mysql database are fully compatible with the new version.
When this process is done for one node, move onto the next node.
When upgrading the Galera wsrep provider, sometimes the Galera protocol version can change. The Galera wsrep provider should not start using the new protocol version until all cluster nodes have been upgraded to the new version, so this is not generally an issue during a rolling upgrade. However, this can cause issues if you restart a non-upgraded node in a cluster where the rest of the nodes have been upgraded.
If a node hasn't sent a packet within the evs.keepalive_period, other nodes begin sending heartbeat beacons to it.
If the node remains silent for the duration of evs.suspect_timeout, the other nodes will mark it as "suspect."
Once all members of the Primary Component agree that a node is suspect, it is declared inactive.
Additionally, if no messages are received from a node for a period greater than evs.inactive_timeout, it is declared failed immediately, regardless of consensus.
Cluster Fault Tolerance
A safeguard mechanism ensures the cluster remains operational even if some nodes become unresponsive. If a node is active but overwhelmed—perhaps from excessive memory swapping—it will be labeled as failed. This process ensures that one struggling node doesn't disrupt the entire cluster's functionality.
The Availability vs. Partition Tolerance Trade-off
Within the context of the CAP Theorem (Consistency, Availability, Partition Tolerance), Galera Cluster strongly prioritizes Consistency. This leads to a direct trade-off when configuring the failure detection timeouts, especially on unstable networks like a Wide Area Network (WAN).
Low Timeouts
Setting low values for evs.suspect_timeout allows the cluster to detect a genuinely failed node very quickly, minimizing downtime. However, on an unstable network, this can lead to "false positives," where a temporarily slow node is incorrectly evicted.
High Timeouts
Setting higher values makes the cluster more tolerant of network partitions and slow nodes. However, if a node truly fails, the cluster will remain unavailable for a longer period while it waits for the timeout to expire.
Recovering a Single Failed Node
Recovery from a single node failure is typically automatic. If one node in a cluster with three or more members fails, the rest of the cluster maintains Quorum and continues to operate. When the failed node comes back online, it will automatically connect to the cluster and initiate a State Transfer to synchronize its data. No data is lost in a single node failure.
Recovering the Primary Component After a Full Cluster Outage
A full cluster outage occurs when all nodes shut down or when Quorum is lost completely, leaving no Primary Component. In this scenario, you must manually intervene to safely restart the cluster.
Manual Bootstrap (Using grastate.dat)
This is the traditional recovery method. You must manually identify the node with the most recent data and force it to become the first node in a new cluster.
Stop all nodes in the cluster.
Identify the most advanced node by checking the seqno value in the grastate.dat file in each node's data directory. The node with the highest seqno is the correct one to start from.
Bootstrap the new Primary Component by starting the MariaDB service on that single advanced node using a special command (e.g., galera_new_cluster).
Start the other nodes normally. They will connect to the new Primary Component and sync their data.
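The seqno comparison in step 2 can be sketched in shell. This assumes the grastate.dat files were gathered locally under hypothetical names (node1.dat, node2.dat, ...), one copy per node:

```shell
# Print the file (i.e., the node) whose "seqno:" field is highest.
# That node is the one to bootstrap from.
highest_seqno_node() {
  for f in "$@"; do
    # emit "seqno filename" pairs so the list can be sorted numerically
    printf '%s %s\n' "$(awk '/^seqno:/ {print $2}' "$f")" "$f"
  done | sort -n | tail -n 1 | cut -d' ' -f2
}
```

A node that shut down uncleanly may show seqno: -1; the numeric sort naturally ranks it below any node with a recorded position.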
Automatic Recovery with pc.recovery
Modern versions of Galera Cluster enable the pc.recovery parameter by default. This feature attempts to automate the recovery of the Primary Component.
When pc.recovery is enabled, nodes that were part of the last known Primary Component will save the state of that component to a file on disk called gvwstate.dat. If the entire cluster goes down, it can automatically recover its state once all the nodes from that last saved component achieve connectivity with each other.
Understanding the gvwstate.dat file
The gvwstate.dat file is created in the data directory of a node when it is part of a Primary Component and is deleted upon graceful shutdown. It contains the node's own UUID and its view of the other members of the component. An example:
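An illustrative gvwstate.dat (the UUIDs are made up for the example; the #vwbeg and #vwend markers delimit the saved view):

```
my_uuid: c5d5d990-30ee-11e4-aab1-46d0ed84b408
#vwbeg
view_id: 3 bc85bd53-31ac-11e4-9895-1f2ce13f2542 2
bootstrap: 0
member: bc85bd53-31ac-11e4-9895-1f2ce13f2542 0
member: c5d5d990-30ee-11e4-aab1-46d0ed84b408 0
#vwend
```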
my_uuid: The UUID of the node that owns this file.
view_id: An identifier for the specific cluster view.
member: The UUIDs of all nodes that were part of this saved Primary Component.
Advanced Procedure: Modifying the Saved State
Avoid manually editing the gvwstate.dat file unless absolutely necessary. Doing so may cause data inconsistency or prevent the cluster from starting. This action should only be considered in critical recovery situations.
In the rare case that you need to force a specific set of nodes to form a new Primary Component, you can manually edit the gvwstate.dat file on each of those nodes. By ensuring that each node's file lists itself and all other desired members in the member fields, you can force them to recognize each other and form a new component when you start them.
Failures During State Transfers
A node failure can also occur if a State Snapshot Transfer (SST) is interrupted. This will cause the receiving node (the "joiner") to abort its startup process. To recover, simply restart the MariaDB service on the failed joiner node.
Quorum is achieved when more than 50% of the total nodes in the last known membership are in communication.
Odd Number of Nodes (Recommended): In a 3-node cluster, a majority is 2. The cluster can tolerate the failure of 1 node and remain operational.
Even Number of Nodes: In a 2-node cluster, a majority is also 2. If one node fails, the remaining node represents only 50% of the cluster, which is not a majority, and it will lose Quorum. This is why a 2-node cluster has no fault tolerance without an external voting member.
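The majority rule above reduces to a one-line check: a component keeps Quorum only when it holds strictly more than half of the last known membership. A sketch:

```shell
# has_quorum ALIVE TOTAL
# succeeds (exit 0) when ALIVE nodes out of TOTAL form a strict majority
has_quorum() {
  alive=$1
  total=$2
  # 2*alive > total  <=>  alive > total/2, without integer-division pitfalls
  [ $(( 2 * alive )) -gt "$total" ]
}
```

For example, has_quorum 2 3 succeeds (a 3-node cluster survives one failure), while has_quorum 1 2 fails, which is exactly why a 2-node cluster has no fault tolerance.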
The Galera Arbitrator (garbd)
The Galera Arbitrator (garbd) is the standard solution for clusters with an even number of nodes. It is a lightweight, stateless daemon that acts as a voting member in the cluster without being a full database node. It participates in Quorum calculations, effectively turning an even-numbered cluster into an odd-numbered one. In a 2-node cluster, adding garbd makes the total number of voting members 3, allowing the cluster to maintain Quorum if one database node fails.
Understanding and Recovering from a Split-Brain
A split-brain occurs when a network partition divides the cluster and no resulting group of nodes has a majority (e.g., a 4-node cluster splitting into two groups of 2). By design, both halves of the cluster will fail to achieve a majority, and all nodes will enter a non-Primary state.
If you need to restore service before the network issue is fixed, you must manually intervene:
Choose ONE side of the partition to become the new Primary Component.
On a single node within that chosen group, execute the following command:
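The command in question is Galera's pc.bootstrap provider option, executed from a database client on the chosen node:

```sql
SET GLOBAL wsrep_provider_options='pc.bootstrap=true';
```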
The nodes in this group will now form a new Primary Component. When network connectivity is restored, the nodes from the other partition will automatically rejoin.
Never execute the bootstrap command on both sides of a partition, as this will create two independent, active clusters with diverging data.
Advanced Quorum Control
As a more advanced alternative to garbd for fine-grained control, nodes can also be assigned a specific voting weight.
You can check the health of the cluster and its Quorum status at any time by querying the following status variables.
wsrep_cluster_status: Status of the component the node belongs to. Healthy value: Primary.
wsrep_cluster_size: Number of nodes in the current component. Healthy value: matches the expected total.
wsrep_cluster_state_uuid: Unique identifier for the cluster's state. Healthy value: the same on all nodes.
wsrep_cluster_conf_id: Identifier for the cluster membership group. Healthy value: the same on all nodes.
Recovering from a Full Cluster Shutdown
If the entire cluster loses Quorum (e.g., from a simultaneous crash or shutdown), you must manually bootstrap a new Primary Component to restore service. This must be done from the node that contains the most recent data to avoid any data loss.
Identifying the Most Advanced Node
MariaDB Galera Cluster provides a safe_to_bootstrap flag in the /var/lib/mysql/grastate.dat file to make this process safer and easier.
After a Graceful Shutdown
The last node to shut down will be the most up-to-date and will have safe_to_bootstrap: 1 set in its grastate.dat file. You should always look for and bootstrap from this node.
After a Cluster-wide Crash
If all nodes crashed, they will all likely have safe_to_bootstrap: 0. In this case, you must manually determine the most advanced node by finding the one with the highest seqno in its grastate.dat file or by using the --wsrep-recover utility.
Bootstrapping and Restarting
Once you have identified the correct node, you will start the MariaDB service on that node only using a special bootstrap command (e.g., galera_new_cluster). After it comes online and forms a new Primary Component, you can start the other nodes normally, and they will rejoin the cluster.
If the node is a replication master, then its replication slaves only replicate transactions that are in the binary log, so this means that the transactions that correspond to Galera Cluster write-sets would not be replicated by any replication slaves by default. If you would like a node to write its replicated write sets to the binary log, then you will have to set log_slave_updates=ON. If the node has any replication slaves, then this would also allow those slaves to replicate the transactions that corresponded to those write sets.
If a Galera Cluster node is also a replication master or replication slave, then some additional configuration may be needed.
If the node is a replication slave, then the node's slave SQL thread will be applying transactions that it replicates from its replication master. Transactions applied by the slave SQL thread will only generate Galera Cluster write-sets if the node has log_slave_updates=ON set. Therefore, in order to replicate these transactions to the rest of the nodes in the cluster, log_slave_updates=ON must be set.
If the node is a replication slave, then it is probably also a good idea to enable wsrep_restart_slave. When this is enabled, the node will restart its slave threads whenever it rejoins the cluster.
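A sketch of the option-file settings described above for a cluster node that is also a replication slave:

```ini
[mysqld]
log_slave_updates=ON
wsrep_restart_slave=ON
```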
Parallel Replication Support
Historically, Galera Cluster nodes acting as asynchronous replication slaves were restricted to single-threaded execution (slave_parallel_threads=0). Enabling parallel replication often resulted in deadlocks due to conflicts between the replication stream's commit ordering and Galera's internal pre-commit ordering.
As of MariaDB 12.1.1, this limitation has been resolved.
This fix is specific to MariaDB 12.1.1 and newer versions. It has not been backported to earlier release series such as 10.5, 10.6, 10.11, or 11.4.
On supported versions, you can safely configure slave_parallel_threads to a value greater than 0 to improve the performance of incoming replication streams.
It is most common to set server_id to the same value on each node in a given cluster. Since MariaDB Galera Cluster uses a virtually synchronous certification-based replication, all nodes should have the same data, so in a logical sense, a cluster can be considered in many cases a single logical server for purposes related to replication. The binary log of each cluster node might even contain roughly the same transactions, if log_slave_updates=ON is set, if wsrep GTID mode is enabled, and if non-Galera transactions are not being executed on any nodes.
Setting a Different server_id on Each Cluster Node
There are cases when it might make sense to set a different server_id value on each node in a given cluster. For example, if log_slave_updates=ON is set and if another cluster or a standard MariaDB Server is using replication to replicate transactions from each cluster node individually, then it would be required to set a different server_id value on each node for this to work.
Keep in mind that if replication is set up in a scenario where each cluster node has a different server_id value, and if the replication topology is set up in such a way that a cluster node can replicate the same transactions through Galera and through MariaDB replication, then you may need to configure the cluster node to ignore these transactions when setting up MariaDB replication. You can do so by setting IGNORE_SERVER_IDS to the server IDs of all nodes in the same cluster when executing CHANGE MASTER TO. For example, this might be required when circular replication is set up between two separate clusters, and each cluster node has a different server_id value, and each cluster has log_slave_updates=ON set.
Ensure the donor and joiner nodes have the same mariadb-backup version.
2
Create backup directory on donor
3
Take backup
Take a full backup of the donor node with mariadb-backup. The --galera-info option should also be provided, so that the node's cluster state is also backed up.
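For example (the target directory and credentials below are placeholders; substitute your own):

```
mariadb-backup --backup --galera-info \
  --target-dir=/var/mariadb/backup/ \
  --user=backup_user --password=backup_password
```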
4
Verify the MariaDB Server process is stopped
Verify that the MariaDB Server process is stopped on the joiner node. This will depend on your service manager.
For example, on systemd systems, you can execute:
5
Create the backup directory on the joiner node.
6
Copy backup
Copy the backup from the donor node to the joiner node.
7
Prepare backup
Prepare the backup with mariadb-backup --prepare on the joiner node.
8
Get the ID
Get the Galera Cluster version ID from the donor node's grastate.dat file.
For example, a very common version number is "2.1".
1
Get the node's cluster state
Get the state from the Galera info file in the backup that was copied to the joiner node.
The name of this file depends on the MariaDB version:
MariaDB 11.4 and later: mariadb_backup_galera_info
MariaDB 11.3 and earlier: xtrabackup_galera_info
For MariaDB 11.4 and later:
For MariaDB 11.3 and earlier:
The file contains the values of the wsrep_local_state_uuid and wsrep_last_committed status variables. The values are written in the format uuid:seqno.
2
Create a grastate.dat file
Create the file in the backup directory of the joiner node. The Galera Cluster version ID, the cluster uuid, and the seqno from previous steps will be used to fill in the relevant fields.
For example, with the example values from the last two steps, we could do:
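A sketch of creating the file (the uuid and seqno below are hypothetical illustration values; substitute the ones read from the backup's Galera info file, and version 2.1 matches the common Galera Cluster version ID mentioned above):

```shell
# Write a minimal grastate.dat for the joiner.
cat > grastate.dat <<'EOF'
# GALERA saved state
version: 2.1
uuid: d38587ce-246c-11e5-bcb8-6396c9910dcd
seqno: 1352215
safe_to_bootstrap: 0
EOF
```

In the real procedure, write this file into the backup directory on the joiner node (for example /var/mariadb/backup/grastate.dat) before copying the backup into the datadir.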
3
Remove contents
Remove the existing contents of the datadir on the joiner node.
4
Copy contents
Copy the contents of the backup directory to the datadir on the joiner node.
5
Check datadir permissions
Make sure the permissions of the datadir are correct on the joiner node.
6
Start the MariaDB Server process on the joiner node.
This will depend on your service manager. For example, on systemd systems, you may execute:
7
Watch the MariaDB error log
On the joiner node, verify that the node does not need to perform a full State Snapshot Transfer (SST) due to the manual SST.
If none of your nodes return a value of Primary, you must manually intervene to reset the Quorum and bootstrap a new Primary Component.
Find the Most Advanced Node
Before you can reset the Quorum, you must identify the most advanced node in the cluster. This is the node whose local database committed the last transaction. Starting the cluster from any other node can result in data loss.
The "Safe-to-Bootstrap" Feature
To facilitate a safe restart and prevent an administrator from choosing the wrong node, modern versions of Galera Cluster include a "Safe-to-Bootstrap" feature.
When a cluster is shut down gracefully, the last node to be stopped will be the most up-to-date. Galera tracks this and marks only that last node as safe to bootstrap from by setting a flag in its state file. If you attempt to bootstrap from a node marked as unsafe, Galera will refuse and show a message in the logs. In the case of a sudden, simultaneous crash, all nodes will be considered unsafe, requiring manual intervention.
Procedure for Selecting the Right Node
The procedure to select the right node depends on how the cluster was stopped.
Orderly Cluster Shutdown
In the case of a planned, orderly shutdown, you only need to follow the recommendation of the "Safe-to-Bootstrap" feature. On each node, inspect the /var/lib/mysql/grastate.dat file and look for the one where safe_to_bootstrap: 1 is set.
Use this node for the bootstrap.
Full Cluster Crash
In the case of a hard crash, all nodes will likely have safe_to_bootstrap: 0. You must therefore manually determine which node is the most advanced.
On each node, run the mysqld daemon with the --wsrep-recover option. This will read the InnoDB storage engine logs and report the last known transaction position in the MariaDB error log.
Inspect the error log for a line similar to this:
Compare the sequence number (the number after the colon) from all nodes. The node with the highest sequence number is the most advanced.
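The comparison can be scripted. A sketch that pulls the seqno out of an error log produced by mysqld --wsrep-recover (the log filename is an assumption; adjust to your system):

```shell
# Print the seqno from the last "Recovered position" line in a log file.
# The log line has the form:
#   WSREP: Recovered position: <uuid>:<seqno>
recovered_seqno() {
  grep 'WSREP: Recovered position' "$1" | tail -n 1 | awk -F: '{print $NF}'
}
```

Run it against each node's error log and bootstrap from the node that reports the highest number.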
On that most advanced node, you can optionally edit the /var/lib/mysql/grastate.dat file and set safe_to_bootstrap: 1 to signify that you have willfully chosen this node.
Bootstrap the New Primary Component
Once you have identified the most advanced node, there are two methods to bootstrap the new Primary Component from it.
Automatic Bootstrap (Recommended)
This method is recommended if the mysqld process is still running on the most advanced node. It is non-destructive and can preserve the GCache, increasing the chance of a fast Incremental State Transfer (IST) for the other nodes.
To perform an automatic bootstrap, connect to the most advanced node with a database client and execute:
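The command referenced here is Galera's pc.bootstrap provider option:

```sql
SET GLOBAL wsrep_provider_options='pc.bootstrap=true';
```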
This node will now form a new Primary Component by itself.
Manual Bootstrap
This method involves a full shutdown and a special startup of the most advanced node.
Ensure the mysqld service is stopped on all nodes in the cluster.
On the most advanced node only, start the cluster using the galera_new_cluster command:
Start the Remaining Nodes
After the first node is successfully running and has formed the new Primary Component, start the MariaDB service normally on all of the other nodes.
They will detect the existing Primary Component, connect to it, and automatically initiate a State Transfer to synchronize their data and rejoin the cluster.
The foundation of the cluster is the standard MariaDB Server, typically using the InnoDB storage engine. This component serves clients that connect to it and executes queries as a normal database server would.
The wsrep API
The wsrep API is a generic replication plugin interface for databases. It defines a set of application callbacks and replication plugin calls that connect the MariaDB Server to the replication provider. It consists of two main elements:
wsrep Hooks: These hooks integrate with the database server engine to enable write-set replication.
dlopen(): This function is used to make the wsrep_provider (the Galera Plugin) available to the wsrep hooks.
In this model, the wsrep API considers the database to have a "state." When clients modify the database content, its state changes. The API represents these changes as a series of atomic transactions. In a healthy cluster, all nodes always have the same state because they synchronize by replicating and applying these state changes in the same serial order.
From a technical perspective, the process flow is as follows:
A state change (transaction) occurs on one node in the cluster.
Within the MariaDB Server, the wsrep hooks translate these changes into a write-set.
The dlopen() function makes the wsrep provider's functions available to the hooks.
The Galera Replication plugin handles the certification and replication of the write-set to the rest of the cluster.
On each node in the cluster, the write-set is applied as a high-priority transaction.
Galera Replication Plugin
The Galera Replication Plugin implements the wsrep API and acts as the wsrep Provider. It handles the core replication service functionality. The plugin itself consists of the following components:
Layer
Description
Certification Layer
Prepares write-sets and performs certification checks to ensure they can be applied without conflict.
Replication Layer
Manages the replication protocol and provides total ordering capability for transactions.
Group Communication Framework
Provides the plugin architecture for the various group communication systems that connect to Galera Cluster.
Group Communication (GComm) Framework
The Galera Replication Plugin uses a Group Communication (GComm) framework for its messaging layer. The GComm system implements a virtual synchrony Quality of Service (QoS), which unifies data delivery and cluster membership services into a clear, formal model.
While virtual synchrony guarantees data consistency, it does not guarantee the temporal synchrony needed for smooth multi-primary operations. To address this, Galera Cluster implements its own runtime-configurable Flow Control, which keeps nodes synchronized to within a fraction of a second.
The GComm framework also provides a total ordering of messages from multiple sources, which it uses to generate Global Transaction IDs in a multi-primary cluster.
Fundamental Concepts
Global Transaction ID (GTID)
To keep the database state identical across all nodes, the wsrep API uses a Global Transaction ID (GTID). This allows the cluster to uniquely identify every state change and to know the current state of any node in relation to others. An example GTID looks like this:
45eec521-2f34-11e0-0800-2a36050b826b:94530586304
It consists of two components:
State UUID: A unique identifier for the database state and its sequence of changes.
Ordinal Sequence Number (seqno): A 64-bit signed integer that denotes the position of the transaction in the sequence.
Read and Write Scaling
A direct benefit of Galera's multi-master architecture is the ability to scale both read and write operations.
Write Scaling: Because every node in the cluster can accept write operations, you can distribute your application's write traffic across multiple nodes. This can increase write throughput, though it's important to remember that all writes must still be replicated and certified on all nodes, which can introduce contention on high-velocity workloads.
Read Scaling: This is the most significant performance advantage. Since all nodes are kept synchronized, they all contain the same data. This allows you to distribute read queries across all nodes in the cluster, providing excellent horizontal scaling for read-heavy applications. This architecture is ideal for use with a load balancer (like MariaDB MaxScale) that can perform read-write splitting.
This scaling is typically managed by a load balancer, which distributes traffic intelligently across the cluster.
MariaDB has a feature called wsrep GTID mode. When this mode is enabled, MariaDB uses some tricks to try to associate each Galera Cluster write set with a GTID that is globally unique, but that is also consistent for that write set on each cluster node. These tricks work in some cases, but GTIDs can still become inconsistent among cluster nodes.
Enabling Wsrep GTID Mode
Several things need to be configured for wsrep GTID mode to work:
wsrep_gtid_domain_id needs to be set to the same value on all nodes in a given cluster, so that each cluster node uses the same domain when assigning GTIDs for Galera Cluster's write sets. When replicating between two clusters, each cluster should have this set to a different value, so that each cluster uses different domains when assigning GTIDs for their write sets.
log_slave_updates needs to be enabled on all nodes in the cluster. See MDEV-9855.
log_bin needs to be set to the same path on all nodes in the cluster.
And as an extra safety measure:
gtid_domain_id should be set to a different value on all nodes in a given cluster, and each of these values should be different than the configured wsrep_gtid_domain_id value. This is to prevent a node from using the same domain used for Galera Cluster's write sets when assigning GTIDs for non-Galera transactions, such as DDL executed with wsrep_OSU_method=RSU set or DML executed with wsrep_on=OFF set.
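Putting the above together, a per-node option-file sketch (the domain and server IDs are illustrative; wsrep_gtid_domain_id is shared cluster-wide, while gtid_domain_id differs per node):

```ini
[mysqld]
wsrep_gtid_mode=ON
wsrep_gtid_domain_id=1   # same value on every node in this cluster
log_slave_updates=ON
log_bin=mariadb-bin      # same path on all nodes
gtid_domain_id=101       # different on each node, and different from wsrep_gtid_domain_id
```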
If you want to avoid writes accidentally creating local GTIDs, you can prevent this by setting:
If a Galera Cluster node is also a replication slave, then that node's slave SQL thread will be applying transactions that it replicates from its replication master. If the node has log_slave_updates=ON set, then each transaction that the slave SQL thread applies will also generate a Galera Cluster write set that is replicated to the rest of the nodes in the cluster.
The node acting as slave includes the transaction's original Gtid_Log_Event in the replicated write set, so all nodes should associate the write set with its original GTID. See MDEV-13431.
Writes are only replicated for InnoDB tables; writes to tables using other storage engines, including the system (mysql.*) tables, are not replicated. DDL statements, which implicitly modify the mysql.* tables, are replicated at the statement level. There is, however, experimental support for replicating MyISAM tables (see the wsrep_replicate_myisam system variable).
Unsupported explicit locking includes LOCK TABLES and the user-level locking functions (GET_LOCK(), RELEASE_LOCK(), and so on). Using transactions properly should be able to overcome these limitations. Global locking operators like FLUSH TABLES WITH READ LOCK are supported.
All tables should have a primary key (multi-column primary keys are supported). DELETE operations are unsupported on tables without a primary key. Also, rows in tables without a primary key may appear in a different order on different nodes.
The general query log and the slow query log cannot be directed to a table. If you enable these logs, then you must forward the log to a file by setting log_output=FILE.
Transaction size. While Galera does not explicitly limit the transaction size, a write set is processed as a single memory-resident buffer, and as a result, extremely large transactions (e.g., LOAD DATA) may adversely affect node performance. To avoid that, the wsrep_max_ws_rows and wsrep_max_ws_size system variables limit transaction rows to 128K and the transaction size to 2GB by default. If necessary, users may want to increase those limits. Future versions will add support for transaction fragmentation.
Other observations, in no particular order:
If you are using mysqldump for state transfer, and it fails for whatever reason (e.g., you do not have the database account it attempts to connect with, or it does not have the necessary permissions), you will see an SQL SYNTAX error in the server error log. Don't let it fool you; this is just a fancy way to deliver a message (the pseudo-statement inside the bogus SQL will contain the error message).
Do not use transactions of any essential size. Just to insert 100K rows, the server might require an additional 200-300 Mb. In a less fortunate scenario, it can be 1.5 GB for 500K rows, or 3.5 GB for 1M rows. See MDEV-466 for some numbers (you'll see that it's closed, but it's not closed because it was fixed).
Locking is lax when DDL is involved. For example, if your DML transaction uses a table, and a parallel DDL statement is started, in the normal MySQL setup it would have waited for the metadata lock, but in the Galera context it will be executed right away. This happens even if you are running a single node, as long as you have configured it as a cluster node. This behavior might cause various side effects; the consequences have not been investigated yet. Try to avoid such parallelism.
Do not rely on auto-increment values to be sequential. Galera uses a mechanism based on autoincrement increment to produce unique non-conflicting sequences, so on every single node the sequence will have gaps.
A command may fail with ER_UNKNOWN_COM_ERROR, producing a 'WSREP has not yet prepared node for application use' (or 'Unknown command' in older versions) error message. This happens when a cluster is suspected to be split and the node is in the smaller part, for example during a network glitch, when nodes temporarily lose each other. It can also occur during state transfer. The node takes this measure to prevent data inconsistency. It is usually a temporary state, and can be detected by checking the wsrep_ready status variable. The node, however, allows the SHOW and SET commands during this period.
After a temporary split, if the 'good' part of the cluster is still reachable and its state was modified, resynchronization occurs. As a part of it, nodes of the 'bad' part of the cluster drop all client connections. This might be quite unexpected, especially if the client was idle and did not even know anything was happening. Please also note that after the connection to the isolated node is restored, if there is write traffic on the cluster, it can take a long time for the node to synchronize. During that time the 'good' node reports that the cluster is already of normal size and synced, while the rejoining node reports it has only joined (but is not synced), and its connections keep getting 'unknown command'. The condition should pass eventually.
While binlog_format is checked on startup and can only be ROW, it can still be changed at runtime. Do NOT change binlog_format at runtime: it is not only likely to cause replication failure, it may make all other nodes crash.
If you are using rsync for state transfer and a node crashes before the state transfer is over, the rsync process might hang forever, occupying the port and preventing the node from restarting. The problem will show up as 'port in use' in the server error log. Find the orphaned rsync process and kill it manually.
Performance: by design, the performance of the cluster cannot be higher than the performance of the slowest node. However, even if you have only one node, its performance can be considerably lower than running the same server in standalone mode (without the wsrep provider). This is particularly true for big enough transactions, even those well within the transaction-size limitations quoted above.
Windows is not supported.
Replication filters: when using a Galera cluster, replication filters should be used with caution. See the Replication Filters section below for more details.
Flashback isn't supported in Galera due to an incompatible binary log format.
FLUSH PRIVILEGES is not replicated.
The query cache needed to be disabled by setting query_cache_size=0 prior to MariaDB Galera Cluster 5.5.40 and MariaDB Galera Cluster 10.0.14.
In an asynchronous replication setup where a master replicates to a Galera node acting as a slave, parallel replication (slave-parallel-threads > 1) on the slave is currently not supported.
The disk-based Galera gcache is not encrypted.
Nodes may have different table definitions, especially temporarily during schema upgrade operations, but the same restrictions apply as they do for row-based replication.
The wsrep_sst_mariabackup script handles the actual data transfer and processing during an SST. The variables it reads from the [sst] group control aspects of the backup format, compression, transfer mechanism, and logging.
The wsrep_sst_mariabackup script parses the following options:
streamfmt (sfmt)
Default: mbstream
Description: Defines the streaming format used by mariabackup.
transferfmt (tfmt)
Default: socat
Description: Specifies the transfer utility used to move the data stream from the donor to the joiner node; socat is the default.
sockopt (socket options)
Description: Allows additional socket options to be passed to the underlying network communication. This includes settings for TCP buffers, keep-alives, or other network-related tunables to optimize transfer performance.
progress
Description: Controls whether progress information about the SST is displayed or logged. Enabling this option provides visual indicators or detailed log entries about the transfer's advancement.
time (ttime)
Default: 0
Description: When set to 1, logs the time spent on specific operations during the SST process to help with performance analysis.
cpat
Description: Related to a "copy pattern" or specific path handling during the SST. Its function depends on how the wsrep_sst_mariabackup script uses this pattern for file or directory management.
compressor (scomp)
Description: Specifies the compression utility to be used on the data stream before transfer. Common values include gzip, pigz, lz4, or qpress, which reduce the data size for faster transmission over the network.
decompressor (sdecomp)
Description: Specifies the decompression utility to be used on the receiving end (joiner node) to decompress the data stream that was compressed by scomp. It should correspond to the scomp setting.
rlimit (resource limit)
Description: Sets a maximum data transfer rate for State Snapshot Transfers (SSTs) in which the node serves as a donor. The rlimit parameter accepts any value supported by the pv utility's --rate-limit option. Note that using this option requires the pv utility to be installed.
use-extra (uextra)
Default: 0
Description: Controls the SST connection method that mariabackup uses; when enabled, an additional connection is used for the transfer.
sst-special-dirs (speciald)
Default: 1
Description: A boolean flag that controls whether mariabackup applies special handling to non-default InnoDB data and log directory locations during the SST.
sst-initial-timeout (stimeout)
Default: 300
Description: Sets an initial timeout in seconds for the SST process. If the SST operation does not establish a connection or make progress within this period, it will be aborted.
sst-syslog (ssyslog)
Default: 0
Description: A boolean flag (0 or 1) that controls whether SST-related messages should be logged to syslog. This can be useful for centralized logging and monitoring of Galera cluster events.
sst-log-archive (sstlogarchive)
Default: 1
Description: A boolean flag (0 or 1) that determines whether SST logs should be archived. Archiving logs helps in post-mortem analysis and troubleshooting of SST failures.
sst-log-archive-dir (sstlogarchivedir)
Description: Specifies the directory where SST logs should be archived if sstlogarchive is enabled.
Example
Set in Configuration File
To configure wsrep_sst_mariabackup options, add them to the [sst] group in your configuration file:
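A minimal sketch of such an [sst] section follows; the option values here are illustrative choices, not recommendations:

```ini
[sst]
# Streaming and transfer utilities (these are the script defaults)
streamfmt=mbstream
transferfmt=socat
# Compress on the donor and decompress on the joiner (pigz is one common choice)
compressor='pigz'
decompressor='pigz -dc'
# Cap the donor's transfer rate; accepts any value pv's --rate-limit accepts
rlimit=80m
# Log progress and archive SST logs for later troubleshooting
progress=1
sst-log-archive=1
sst-log-archive-dir=/var/log/mysql/sst/
```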
Using Streaming Replication for Large Transactions
Streaming Replication optimizes replication of large or long-running transactions in MariaDB Galera Cluster. Typically, a node executes a transaction fully and replicates the complete write-set to other nodes at commit time. Although efficient for most workloads, this approach can be challenging for very large or lengthy transactions.
With Streaming Replication, the initiating node divides the transaction into smaller fragments. These fragments are certified and replicated to other nodes while the transaction is ongoing. Once a fragment is certified and applied to the replicas, it becomes immune to abortion by conflicting transactions, thus improving the chances of the entire transaction succeeding. This method also supports processing of transaction write-sets over two Gigabytes.
Streaming Replication is available in Galera Cluster 4.0 and later versions. MariaDB Server 10.4 and newer, on supported platforms, include Galera 4.
When to Use Streaming Replication
In most cases, the standard replication method is sufficient. Streaming Replication is a specialized tool for specific scenarios. The best practice is to enable it only at the session level for the specific transactions that require it.
Large Data Transactions
This is the primary use case. When performing a massive INSERT, UPDATE, or DELETE, normal replication requires the originating node to hold the entire transaction locally and then send a very large write-set at commit time. This can cause two problems:
A significant replication lag, as the entire cluster waits for the large write-set to be transferred and applied.
The replica nodes, while busy applying the large transaction, cannot commit other transactions, which can trigger Flow Control and throttle the entire cluster.
With Streaming Replication, the node replicates the data in fragments throughout the transaction's lifetime. This spreads the network load and allows replica nodes to apply other concurrent transactions between fragments, minimizing the impact on the overall cluster.
Long-Running Transactions
A transaction that remains open for a long time has a higher chance of conflicting with another transaction that commits first. When this happens, the long-running transaction is aborted.
Streaming Replication mitigates this by committing the transaction in fragments. Once a fragment is certified and applied, it is "locked in" and cannot be aborted by a new conflicting transaction.
Certification keys derive from record locks, not gap locks. If a streaming transaction holds a gap lock, another node's transaction can still apply an INSERT in that gap, potentially aborting the streaming transaction.
High-Contention ("Hot") Records
For applications that frequently update the same row (e.g., a counter, a job queue, or a locking scheme), Streaming Replication can be used to force a critical update to replicate immediately. This effectively locks the hot record, preventing other transactions from modifying it and increasing the chance that the critical transaction will commit successfully.
How to Enable and Use Streaming Replication
Streaming Replication should be enabled at the session level just for the transactions that need it. This is controlled by two session variables:
wsrep_trx_fragment_unit defines what a "unit" of replication is.
wsrep_trx_fragment_size defines how many units make up a fragment.
To enable streaming, you set both variables:
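For example, to replicate a fragment after every 10 statements:

```sql
SET SESSION wsrep_trx_fragment_unit = 'statements';
SET SESSION wsrep_trx_fragment_size = 10;
```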
In the above example, the node will create, certify, and replicate a fragment after every 10 SQL statements within the transaction.
The available fragment units for wsrep_trx_fragment_unit are:
bytes — The fragment size is measured in bytes of the write-set.
rows — The fragment size is measured in the number of rows modified by the transaction.
statements — The fragment size is measured in the number of SQL statements executed.
To disable Streaming Replication, you can set wsrep_trx_fragment_size back to 0.
Managing a "Hot Record"
Consider an application that manages a work order queue. To prevent two users from getting the same queue position, you can use Streaming Replication for the single critical update.
Begin the transaction:
After reading necessary data, enable Streaming Replication for just the next statement:
Perform the critical update. This statement will be immediately fragmented and replicated:
This ensures the queue_position update is replicated and certified across the cluster before the rest of the transaction proceeds, preventing race conditions.
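The steps above can be sketched as follows; the work_orders table and its columns are hypothetical:

```sql
START TRANSACTION;
-- 1. Read whatever data is needed under normal replication.
SELECT MAX(queue_position) FROM work_orders;
-- 2. Stream the next, critical statement as its own fragment.
SET SESSION wsrep_trx_fragment_unit = 'statements';
SET SESSION wsrep_trx_fragment_size = 1;
-- 3. The critical update is fragmented and replicated immediately.
UPDATE work_orders SET queue_position = queue_position + 1 WHERE order_id = 42;
-- 4. Switch streaming off again for the remainder of the transaction.
SET SESSION wsrep_trx_fragment_size = 0;
COMMIT;
```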
Limitations and Performance Considerations
Before using Streaming Replication, consider the following limitations:
Performance Overhead
When Streaming Replication is enabled, Galera records all write-sets to a log table (mysql.wsrep_streaming_log) on every node to ensure persistence in case of a crash. This adds write overhead and can impact performance, which is why it should only be used when necessary.
Cost of Rollbacks
If a streaming transaction needs to be rolled back after some fragments have already been applied, the rollback operation consumes system resources on all nodes as they undo the previously applied fragments. Frequent rollbacks of streaming transactions can become a performance problem.
For these reasons, it is always a good application design policy to use shorter, smaller transactions whenever possible.
This page is licensed: CC BY-SA / Gnu FDL
Configuring MariaDB Galera Cluster
A number of options need to be set in order for Galera Cluster to work when using MariaDB. These should be set in the MariaDB option file.
Mandatory Options
Several options are mandatory, which means that they must be set in order for Galera Cluster to be enabled or to work properly with MariaDB. The mandatory options are:
wsrep_provider — Path to the Galera library
— See
— See
wsrep_on=ON — Enable wsrep replication
— This is the default value, or alternately (before MariaDB 10.6) or (MariaDB 10.6 and later).
— This is the default value, and should not be changed.
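Put together, a typical mandatory configuration looks something like the following sketch; the provider path and node addresses vary by installation and are only examples:

```ini
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so   # path differs by distribution
wsrep_cluster_address=gcomm://192.0.2.1,192.0.2.2,192.0.2.3
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
```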
Performance-related Options
These are optional optimizations that can be made to improve performance.
— This is not usually recommended in the case of standard MariaDB. However, it is a safer, recommended option with Galera Cluster, since inconsistencies can always be fixed by recovering from another node.
innodb_autoinc_lock_mode=2 — This tells InnoDB to use the interleaved lock mode. Interleaved is the fastest and most scalable lock mode, and it should be used when binlog_format is set to ROW.
By setting the auto-increment lock mode for InnoDB to interleaved, you allow slave threads to operate in parallel.
wsrep_slave_threads — This makes state transfers quicker for new nodes. You should start with four slave threads per CPU core.
The logic here is that, in a balanced system, four slave threads can typically saturate a CPU core. However, I/O performance can increase this figure several times over. For example, a single-core ThinkPad R51 with a 4200 RPM drive can use thirty-two slave threads. The value should not be set higher than the average value of the wsrep_cert_deps_distance status variable.
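As a sketch, the performance-related options above might be set like this; the values are starting points to be tuned for your workload, not recommendations:

```ini
innodb_flush_log_at_trx_commit=0
innodb_autoinc_lock_mode=2
wsrep_slave_threads=16   # e.g. four applier threads per core on a 4-core host
```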
Writing Replicated Write Sets to the Binary Log
Like with MariaDB replication, write sets that are received by a node via Galera's replication are not written to the binary log by default. If you would like a node to write its replicated write sets to the binary log, then you will have to set log_slave_updates=ON. This is especially helpful if the node is a replication master.
Replication Filters
Like with MariaDB replication, replication filters can be used to filter write sets from being replicated by Galera Cluster's replication. However, they should be used with caution because they may not work as you'd expect.
The following replication filters are honored for DML, but not DDL:
The following replication filters are honored for DML and DDL for tables that use both the and storage engines:
However, it should be kept in mind that if replication filters cause inconsistencies that lead to replication errors, then nodes may abort.
See also and .
Network Ports
Galera Cluster needs access to the following ports:
Standard MariaDB Port (default: 3306) - For MySQL client connections and State Snapshot Transfers that use the mysqldump method. This can be changed by setting port.
Galera Replication Port (default: 4567) - For Galera Cluster replication traffic; multicast replication uses both UDP transport and TCP on this port. This can be changed via the gmcast.listen_addr provider option.
If you want to run multiple Galera Cluster instances on one server, then you can do so by starting each instance with , or if you are using , then you can use the relevant .
You need to ensure that each instance is configured with a different .
You also need to ensure that each instance is configured with different .
This page is licensed: CC BY-SA / Gnu FDL
Flow Control in Galera Cluster
Flow Control is a key feature in MariaDB Galera Cluster that ensures nodes remain synchronized. In synchronous replication, no node should lag significantly in processing transactions.
Picture the cluster as an assembly line; if one worker slows down, the whole line must adjust to prevent a breakdown.
Flow Control manages this by aligning all nodes' replication processes:
Preventing Memory Overflow
Without Flow Control, a slow node's replication queue can grow unchecked, consuming all server memory and potentially crashing the MariaDB process due to an Out-Of-Memory (OOM) error.
Maintaining Synchronization
It maintains synchronization across the cluster, ensuring all nodes have nearly identical database states at all times.
Flow Control Sequence
The Flow Control process is an automatic feedback loop triggered by the state of a node's replication queue.
Queue Growth: A node (the "slow node") begins receiving write-sets from its peers faster than it can apply them. This causes its local receive queue, measured by the wsrep_local_recv_queue status variable, to grow.
Upper Limit Trigger: When the receive queue size exceeds the configured upper limit, defined by the gcs.fc_limit provider option, the slow node triggers Flow Control.
Monitoring Flow Control
As an administrator, observing Flow Control is a key indicator of a performance bottleneck in your cluster. You can monitor it using the following global :
Variable Name
Description
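These status variables can be inspected with, for example:

```sql
SHOW GLOBAL STATUS LIKE 'wsrep_flow_control%';
SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue%';
```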
Troubleshooting Flow Control Issues
If you observe frequent Flow Control pauses, it is essential to identify and address the underlying cause.
Key Configuration Parameters
These my.cnf parameters control the sensitivity of Flow Control:
Parameter
Description
Default Value
Modifying these values is an advanced tuning step. In most cases, it is better to fix the underlying cause of the bottleneck rather than relaxing the Flow Control limits.
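If you do decide to adjust them, the Galera provider options can also be changed at runtime; the values shown here are illustrative only:

```sql
SET GLOBAL wsrep_provider_options = 'gcs.fc_limit=256; gcs.fc_factor=0.8';
```

The same settings can be made persistent via wsrep_provider_options in the option file.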
Common Causes and Solutions
Cause
Description
Solution
This page is licensed: CC BY-SA / Gnu FDL
Upgrading from MariaDB 10.3 to MariaDB 10.4 with Galera Cluster
MariaDB starting with
Since MariaDB 10.1, the MySQL-wsrep patch has been merged into MariaDB Server. Therefore, in MariaDB 10.1 and above, the functionality of MariaDB Galera Cluster can be obtained by installing the standard MariaDB Server packages and the Galera wsrep provider library package.
Beginning in MariaDB 10.1, Galera Cluster ships with the MariaDB Server. Upgrading a Galera Cluster node is very similar to upgrading a server from MariaDB 10.3 to MariaDB 10.4. For more information on that process, as well as incompatibilities between versions, see the relevant upgrade guide.
Performing a Rolling Upgrade
The following steps can be used to perform a rolling upgrade from MariaDB 10.3 to MariaDB 10.4 when using Galera Cluster. In a rolling upgrade, each node is upgraded individually, so the cluster is always operational. There is no downtime from the application's perspective.
First, before you get started:
First, take a look at the release notes and changelogs to see what has changed between the major versions.
Check whether any system variables or options have been changed or removed. Make sure that your server's configuration is compatible with the new MariaDB version before upgrading.
Check whether replication has changed in the new MariaDB version in any way that could cause issues while the cluster contains upgraded and non-upgraded nodes.
Before you upgrade, it would be best to take a backup of your database. This is always a good idea to do before an upgrade. We would recommend mariadb-backup.
Then, for each node, perform the following steps:
1
Modify the repository configuration, so the system's package manager installs MariaDB 10.4.
When this process is done for one node, move onto the next node.
When upgrading the Galera wsrep provider, sometimes the Galera protocol version can change. The Galera wsrep provider should not start using the new protocol version until all cluster nodes have been upgraded to the new version, so this is not generally an issue during a rolling upgrade. However, this can cause issues if you restart a non-upgraded node in a cluster where the rest of the nodes have been upgraded.
This page is licensed: CC BY-SA / Gnu FDL
Galera Cluster Deployment Variants
MariaDB Galera Cluster is flexible and can be deployed in several different topologies to meet various business needs, from high availability within a single data center to geographically distributed disaster recovery. The primary deployment patterns are designed for Local Area Networks (LAN) and Wide Area Networks (WAN).
Standard LAN Cluster (Single Data Center)
This is the most common deployment pattern for achieving high availability and read scaling within a single data center.
Topology
The cluster consists of an odd number of nodes (typically 3 or 5) located in the same data center, connected by a high-speed, low-latency network.
Purpose
Purpose
Description
Key Consideration
While this provides excellent protection against server failure, the entire cluster is vulnerable to a full data center outage.
Wide Area Network (WAN) Cluster (Multi-Data Center)
This pattern is designed for disaster recovery and for providing lower latency to a geographically distributed user base.
Topology
The cluster nodes are distributed across two or more physical locations (data centers), connected by a WAN link. To maintain quorum, it is essential to have an odd number of nodes and an odd number of locations. A typical setup involves three data centers with one or more nodes in each.
Purpose:
Aspect
Description
Key Considerations:
Consideration
Description
Two-Node Cluster with a Galera Arbitrator
This is a special deployment variant used to achieve high availability with only two full data nodes.
Topology
The cluster consists of two MariaDB Galera nodes and one lightweight Galera Arbitrator (garbd) process, which runs on a third, separate machine.
Purpose:
Cost-Effective High Availability: It provides automatic failover for a two-node cluster without the resource cost of a third full database server.
How it Works
The Galera Arbitrator acts as a voting member for quorum calculations but does not store any data or handle any client traffic. This creates a 3-member cluster from a voting perspective. If one of the data nodes fails, the remaining data node and the arbitrator still form a majority (2 out of 3), allowing the cluster to maintain quorum and stay online.
Key Consideration
If the node running the arbitrator fails, the cluster reverts to a standard 2-node setup with no automatic failover capability. Therefore, the arbitrator itself should be monitored and kept highly available.
This page is licensed: CC BY-SA / Gnu FDL
Managing Sequences in Galera Cluster
A SEQUENCE allows for the generation of unique integers independent of any specific table. While standard sequences function normally in a standalone MariaDB server, using them in a MariaDB Galera Cluster requires specific configurations to ensure conflict-free operation and optimal performance.
Streaming Replication Support in MariaDB
Starting from MariaDB 10.11.16 (and Galera 26.4.16), sequences are fully supported in transactions utilizing streaming replication. In earlier versions, using NEXTVAL() within a transaction where wsrep_trx_fragment_size > 0 would cause an ERROR 1235. The WSREP API now ensures proper serialization of sequence state in transaction fragments, allowing sequences to be used effectively in large-scale ETL and batch operations. See MDEV-34124.
Configuring Sequences for Galera
Because Galera is a multi-primary system, multiple nodes may attempt to generate sequence values simultaneously. To prevent duplicate values and certification failures, the cluster utilizes an offset-based generation strategy.
Mandatory: INCREMENT BY 0
For a sequence to function correctly in a multi-node environment, it must be defined with INCREMENT BY 0.
This setting instructs the node to ignore the sequence's internal increment logic. Instead, the node applies the cluster-wide wsrep_auto_increment_control logic using the following formula:

Next Value = Node_Offset + (Cluster_Size × N)

Where:
Node_Offset: The unique identifier for the specific node (e.g., 1, 2, or 3).
Cluster_Size: The total number of nodes in the cluster.
N: The iteration count (0, 1, 2, ...).
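A sketch of a sequence defined for Galera, with comments showing the values each node would generate in a hypothetical 3-node cluster:

```sql
CREATE SEQUENCE booking_seq START WITH 1 INCREMENT BY 0 CACHE 0;

-- With wsrep_auto_increment_control active on a 3-node cluster:
--   Node 1 (offset 1) generates 1, 4, 7, 10, ...
--   Node 2 (offset 2) generates 2, 5, 8, 11, ...
--   Node 3 (offset 3) generates 3, 6, 9, 12, ...
SELECT NEXTVAL(booking_seq);
```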
Visualizing the Offset Logic
The following diagram illustrates how two nodes in a 3-node cluster generate unique IDs simultaneously without communicating or locking.
This ensures that nodes generate interleaved, non-conflicting IDs, preventing certification failures (Error 1213) without requiring network locks.
Cache Configuration Strategies
The CACHE option is the primary lever for balancing performance against data continuity. In Galera, replication introduces a "Flush-on-Sync" behavior: when any node commits a sequence update, other nodes must sync their state, discarding any unused values in their local cache.
Cache Setting
Usage Scenario
Description
Use Case: Active-Active Ticket Reservation System
A common requirement for Galera is a distributed system (e.g., ticket sales) where users in different regions must be able to book items simultaneously without "race conditions" or duplicate Booking IDs.
In this example, we configure a sequence to allow high-speed concurrent bookings from multiple nodes.
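A sketch of such a setup; the sequence and table names are hypothetical, and the cache size is an illustrative trade-off between speed and gaps:

```sql
-- INCREMENT BY 0 delegates offsets to the cluster; CACHE pre-allocates
-- values per node so concurrent bookings do not contend over the network.
CREATE SEQUENCE booking_seq START WITH 1000 INCREMENT BY 0 CACHE 100;

CREATE TABLE bookings (
    booking_id BIGINT PRIMARY KEY,
    customer   VARCHAR(100),
    booked_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Any node can run this concurrently without producing duplicate IDs.
INSERT INTO bookings (booking_id, customer)
VALUES (NEXTVAL(booking_seq), 'alice');
```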
Troubleshooting
Error
Cause
Resolution
This page is licensed: CC BY-SA / Gnu FDL
Upgrading from MariaDB 10.6 to MariaDB 10.11 with Galera Cluster
Galera Cluster ships with the MariaDB Server. Upgrading a Galera Cluster node is very similar to upgrading a server from MariaDB 10.6 to MariaDB 10.11. For more information on that process, as well as incompatibilities between versions, see the relevant upgrade guide.
Methods
Stopping all nodes, upgrading all nodes, then starting the nodes
Rolling upgrade with IST
Note that rolling upgrade with SST does not work.
Performing a Rolling Upgrade
The following steps can be used to perform a rolling upgrade from MariaDB 10.6 to MariaDB 10.11 when using Galera Cluster. In a rolling upgrade, each node is upgraded individually, so the cluster is always operational. There is no downtime from the application's perspective.
First, before you get started:
First, take a look at the release notes and changelogs to see what has changed between the major versions.
Check whether any system variables or options have been changed or removed. Make sure that your server's configuration is compatible with the new MariaDB version before upgrading.
Check whether replication has changed in the new MariaDB version in any way that could cause issues while the cluster contains upgraded and non-upgraded nodes.
Before you upgrade, it would be best to take a backup of your database. This is always a good idea to do before an upgrade. We would recommend mariadb-backup.
Then, for each node, perform the following steps:
1
Modify the repository configuration, so the system's package manager installs MariaDB 10.11.
When this process is done for one node, move onto the next node.
When upgrading the Galera wsrep provider, sometimes the Galera protocol version can change. The Galera wsrep provider should not start using the new protocol version until all cluster nodes have been upgraded to the new version, so this is not generally an issue during a rolling upgrade. However, this can cause issues if you restart a non-upgraded node in a cluster where the rest of the nodes have been upgraded.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB Galera Cluster Guide
MariaDB Galera Cluster quickstart guide
Quickstart Guide: MariaDB Galera Cluster
MariaDB Galera Cluster provides a multi-primary (active-active) cluster solution for MariaDB, enabling high availability, read/write scalability, and true synchronous replication. This means any node can handle read and write operations, with changes instantly replicated to all other nodes, ensuring no replica lag and no lost transactions. It's exclusively available on Linux.
Using the Notification Command (wsrep_notify_cmd)
MariaDB Galera Cluster provides a powerful automation feature through the wsrep_notify_cmd system variable. When this variable is configured, the MariaDB server will automatically execute a specified command or script in response to changes in the cluster's membership or the local node's state.
This is extremely useful for integrating the cluster with external systems:
System
Description
Rapid Node Recovery with IST and the GCache
This page provides a deep-dive into Incremental State Transfer (IST), a method for a node to synchronize with the cluster. For information on the fallback mechanism, see State Snapshot Transfers (SSTs).
If one data center experiences a complete outage, the cluster can maintain quorum and continue to operate from the remaining locations.
Reduced Latency
Client applications can be directed to the topologically closest node, reducing network latency for read operations.
Network Latency
WAN replication is sensitive to network latency. The round-trip time between the most distant nodes will set the baseline for transaction commit latency. High latency affects writes.
Network Stability
The WAN link must be stable and reliable. Frequent network partitions can lead to nodes being evicted from the cluster, impacting stability.
Segments
Nodes within the same location can be configured into a segment using the gmcast.segment parameter, optimizing communication by using the fast local network for replication.
Pause Message: The node broadcasts a "Flow Control PAUSE" message to all other nodes in the cluster.
Replication Pauses: Upon receiving this message, all nodes in the cluster temporarily stop replicating new transactions. They continue to process any transactions already in their queues.
Queue Clears: The slow node now has a chance to catch up and apply the transactions from its backlog without new ones arriving.
Lower Limit Trigger: When the node's receive queue size drops below a lower threshold (defined as gcs.fc_limit * gcs.fc_factor), the node broadcasts a "Flow Control RESUME" message.
Replication Resumes: The entire cluster resumes normal replication.
Indicates the fraction of time since the last FLUSH STATUS that the node has been paused by Flow Control. A value near 0.0 is healthy; 0.2 or higher indicates issues.
Check whether any new features have been added to the new MariaDB version. If a new feature in the new MariaDB version cannot be replicated to the old MariaDB version, then do not use that feature until all cluster nodes have been upgraded to the new MariaDB version.
Next, make sure that the Galera version numbers are compatible.
If you are upgrading from the most recent release of MariaDB 10.3 to MariaDB 10.4, then the versions will be compatible. MariaDB 10.3 uses Galera 3 (i.e. Galera wsrep provider versions 25.3.x), and MariaDB 10.4 uses Galera 4 (i.e. Galera wsrep provider versions 26.4.x). This means that upgrading to MariaDB 10.4 also upgrades the system to Galera 4. However, Galera 3 and Galera 4 should be compatible for the purposes of a rolling upgrade, as long as you are using Galera 26.4.2 or later.
Ideally, you want to have a large enough gcache to avoid a State Snapshot Transfer (SST) during the rolling upgrade. The gcache size can be configured by setting gcache.size, for example: wsrep_provider_options="gcache.size=2G"
2
If you use a load balancing proxy such as MaxScale or HAProxy, make sure to drain the server from the pool so it does not receive any new connections.
3
Stop MariaDB.
4
Uninstall the old version of MariaDB and the Galera wsrep provider.
5
Install the new version of MariaDB and the Galera wsrep provider.
6
Make any desired changes to configuration options in option files, such as my.cnf. This includes removing any system variables or options that are no longer supported.
7
On Linux distributions that use systemd, you may need to increase the service startup timeout, as the default timeout of 90 seconds may not be sufficient.
8
Start MariaDB.
9
Run mysql_upgrade with the --skip-write-binlog option.
mysql_upgrade does two things:
Ensures that the system tables in the mysql database are fully compatible with the new version.
Does a very quick check of all tables and marks them as compatible with the new version of MariaDB.
Check whether any new features have been added to the new MariaDB version. If a new feature in the new MariaDB version cannot be replicated to the old MariaDB version, then do not use that feature until all cluster nodes have been upgraded to the new MariaDB version.
Next, make sure that the Galera version numbers are compatible.
If you are upgrading from the most recent release of MariaDB 10.6 to MariaDB 10.11, then the versions will be compatible.
You want to have a large enough gcache to avoid a State Snapshot Transfer (SST) during the rolling upgrade. The gcache size can be configured by setting gcache.size, for example: wsrep_provider_options="gcache.size=2G"
2
If you use a load balancing proxy such as or HAProxy, make sure to drain the server from the pool so it does not receive any new connections.
3
Stop MariaDB.
4
Uninstall the old version of MariaDB and the Galera wsrep provider.
5
Install the new version of MariaDB and the Galera wsrep provider.
6
Make any desired changes to configuration options in option files, such as my.cnf. This includes removing any system variables or options that are no longer supported.
7
On Linux distributions that use systemd, you may need to increase the service startup timeout, as the default timeout of 90 seconds may not be sufficient.
8
Start MariaDB.
9
Run mysql_upgrade with the --skip-write-binlog option.
mysql_upgrade does two things:
Ensures that the system tables in the mysql database are fully compatible with the new version.
Does a very quick check of all tables and marks them as compatible with the new version of MariaDB.
At least three nodes: For redundancy and avoiding split-brain scenarios (bare-metal or virtual machines).
Linux Operating System: A compatible Debian-based (e.g., Ubuntu, Debian) or RHEL-based (e.g., CentOS, Fedora) distribution.
Synchronized Clocks: All nodes should have NTP configured for time synchronization.
SSH Access: Root or sudo access to all nodes for installation and configuration.
Network Connectivity: All nodes must be able to communicate with each other over specific ports (see Firewall section). Low latency between nodes is ideal.
rsync: Install rsync on all nodes, as it's commonly used for State Snapshot Transfers (SST).
sudo apt install rsync (Debian/Ubuntu)
2. Installation (on Each Node)
Install MariaDB Server and the Galera replication provider on all nodes of your cluster.
a. Add MariaDB Repository:
It's recommended to install from the official MariaDB repositories to get the latest stable versions. Use the MariaDB Repository Configuration Tool (search "MariaDB Repository Generator") to get specific instructions for your OS and MariaDB version.
Example for Debian/Ubuntu (MariaDB 10.11):
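A sketch using the official mariadb_repo_setup script; verify the current URL and version syntax against the MariaDB Repository Configuration Tool before running:

```bash
curl -LsS https://r.mariadb.com/downloads/mariadb_repo_setup | sudo bash -s -- --mariadb-server-version="mariadb-10.11"
sudo apt update
```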
b. Install MariaDB Server and Galera:
c. Secure MariaDB Installation:
Run the security script on each node to set the root password and remove insecure defaults.
Set a strong root password.
Answer Y to remove anonymous users, disallow remote root login, remove test database, and reload privilege tables.
3. Firewall Configuration (on Each Node)
Open the necessary ports on each node's firewall to allow inter-node communication.
Adjust for your firewall system (e.g., firewalld for RHEL-based systems).
4. Configure Galera Cluster (galera.cnf on Each Node)
Create a configuration file (e.g., /etc/mysql/conf.d/galera.cnf) on each node. The content will be largely identical, with specific changes for each node's name and address.
Example galera.cnf content:
Important:
wsrep_cluster_address: List the IP addresses of all nodes in the cluster on every node.
wsrep_node_name: Must be unique for each node (e.g., node1, node2, node3).
wsrep_node_address: Set to the specific IP address of the node you are configuring.
5. Start the Cluster
a. Bootstrapping the First Node:
Start MariaDB on the first node with the --wsrep-new-cluster option. This tells it to form a new cluster. Do this only once for the initial node of a new cluster.
b. Starting Subsequent Nodes:
For the second and third nodes, start the MariaDB service normally. They will discover and join the existing cluster using the wsrep_cluster_address specified in their configuration.
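For example, a plain service start on each remaining node is enough; no bootstrap option is used here:

```bash
sudo systemctl start mariadb
```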
6. Verify Cluster Operation
After all nodes are started, verify that they have joined the cluster.
a. Check Cluster Size (on any node):
Connect to MariaDB on any node and check the cluster status:
Inside the MariaDB shell:
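For example, the standard status query for the cluster size:

```sql
SHOW STATUS LIKE 'wsrep_cluster_size';
```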
The value should match the number of nodes in your cluster (e.g., 3).
b. Test Replication:
On node1, create a new database and a table:
On node2 (or node3), connect to MariaDB and check for the new database and table:
Update a service discovery tool with the current list of active cluster members.
Configuration
To use this feature, you set the wsrep_notify_cmd variable in your MariaDB configuration file (my.cnf) to the full path of the script you want to execute:
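For example (the script path is illustrative):

```ini
[mysqld]
wsrep_notify_cmd=/usr/local/bin/wsrep_notify.sh
```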
The MariaDB server user must have execute permission on the specified script.
Passed Parameters
When a cluster event occurs, the server executes the configured script and passes several arguments to it, providing context about the event. The script can then use these arguments to take appropriate action.
The script is called with the following parameters:
Status ($1): The new status of the node (e.g., Synced, Donor).
View ID ($2): A unique identifier for the current cluster membership view.
Members List ($3): A comma-separated list of the names of all members in the current view.
Node Status ($1)
Joining: The node is starting to join the cluster.
Joined: The node has finished a state transfer and is catching up.
Synced: The node is a fully operational member of the cluster.
Donor: The node is currently providing a State Snapshot Transfer (SST).
Desynced: The node has been manually desynchronized (wsrep_desync=ON).
View ID ($2)
The View ID is a unique identifier composed of the view sequence number and the UUID of the node that initiated the view change. It changes every time a node joins or leaves the cluster.
Send custom alerts to a monitoring system when a node's status changes.
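A minimal notification handler, following the positional parameters described above ($1 = status, $2 = view ID, $3 = member list), might look like the sketch below. The function name and actions are illustrative assumptions, not a fixed interface.

```shell
# Sketch of a wsrep_notify_cmd handler. Galera invokes the configured script
# on every status change; here the body is wrapped in a function for clarity.
notify() {
  status="$1"; view_id="$2"; members="$3"
  case "$status" in
    Synced) echo "node synced (view $view_id, members: $members)" ;;
    Donor)  echo "node is SST donor; drain it from the load balancer" ;;
    *)      echo "node status changed to $status" ;;
  esac
}

# Example invocation, as the server would call the script:
notify Synced "a1b2c3d4:42" "node1,node2,node3"
```

In a real deployment, the echo calls would be replaced with calls to your load balancer API or monitoring system.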
An Incremental State Transfer (IST) is the fast and efficient process where a joining node receives only the missing transactions it needs to catch up with the cluster, rather than receiving a full copy of the entire database.
This is the preferred provisioning method because it is:
Fast: Transferring only the missing changes is significantly faster than copying the entire dataset.
Non-Blocking: The donor node can continue to serve read and write traffic while an IST is in progress.
Conditions for IST
IST is an automatic process, but it is only possible if the following conditions are met:
The joining node has previously been a member of the cluster (its state UUID matches the cluster's).
All of the write-sets that the joiner is missing are still available in the donor node's Write-set Cache (GCache).
If these conditions are not met, the cluster automatically falls back to performing a full State Snapshot Transfer (SST).
Skipping Foreign Key Checks
This functionality is available from MariaDB 12.0.
Appliers need to verify foreign key constraints during normal operation in multi-active topologies. Therefore, appliers are configured to enable FK checking.
However, during node joining, in IST and the subsequent catch-up period, the node is still idle from the point of view of local connections, and the only source of incoming transactions is the cluster sending certified write-sets for applying. IST happens with parallel applying, so foreign key checks can cause lock conflicts between appliers accessing FK child and parent tables. Excessive FK checking also slows down the IST process.
To address that issue, you can relax FK checks for appliers during IST and catch-up periods. The relaxed FK check mode is configurable by setting this flag:
When this operation mode is set, and the node is processing IST or catch-up, appliers skip FK checking.
The Write-Set Cache (GCache)
The GCache is a special cache on each node whose primary purpose is to store recent write-sets specifically to facilitate Incremental State Transfers. The size and configuration of the GCache are therefore critical for the cluster's recovery speed and high availability.
How the GCache Enables IST
When a node attempts to rejoin the cluster, it reports the sequence number (seqno) of the last transaction it successfully applied. The potential donor node then checks its GCache for the very next seqno in that sequence.
The donor has the necessary history. It streams all subsequent write-sets from its GCache to the joiner. The joiner applies them in order and quickly becomes Synced.
The node was disconnected for too long, and the required history has been purged from the cache. IST is not possible, and an SST is initiated.
Configuring the GCache
You can control the GCache behavior with several parameters in the [galera] section of your configuration file (my.cnf).
gcache.size: Controls the size of the on-disk ring-buffer file. A larger GCache can hold more history, increasing the chance of a fast IST over SST.
gcache.dir: Specifies where GCache files are stored. Best practice is to place it on the fastest available storage, such as SSD or NVMe.
gcache.recover: Enabled by default, it allows a node to recover its GCache after a restart, enabling immediate service as a donor for IST.
Tuning gcache.size
The gcache.size parameter is the most critical setting for ensuring nodes can use IST. A GCache that is too small is the most common reason for a cluster falling back to a full SST.
The ideal size depends on your cluster's write rate and the amount of downtime you want to tolerate for a node before forcing an SST. For instance, do you want a node that is down for 1 hour for maintenance to recover instantly (IST), or can you afford a full SST?
Calculating Size Based on Write Rate
The most accurate way to size your GCache is to base it on your cluster's write rate.
1
Find your cluster's write rate:
You can calculate this using the wsrep_received_bytes status variable. First, check the value and note the time:
Wait for a significant interval during peak load (e.g., 3600 seconds, or 1 hour). Run the query again:
Now, calculate the rate (bytes per second):
write_rate = (recv2 − recv1) / (time2 − time1)
2
Calculate your desired GCache size:
Decide on the time window you want to support (e.g., 2 hours = 7200 seconds).
In this example, a gcache.size of 140M would allow a node to be down for 2 hours and still rejoin using IST.
3
Check your current GCache validity period:
Conversely, you can use your write rate to see how long your current GCache size is valid:
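Putting these steps together, a worked sketch with illustrative numbers (the variable values are assumptions chosen to roughly match the 140M example above):

```shell
# Worked example of the GCache sizing formula.
recv1=1000000000          # wsrep_received_bytes at time1
recv2=1070000000          # wsrep_received_bytes one hour later
time1=0; time2=3600       # measurement interval in seconds

write_rate=$(( (recv2 - recv1) / (time2 - time1) ))   # bytes per second
window=7200               # tolerate 2 hours of node downtime
gcache_bytes=$(( write_rate * window ))

echo "write rate: ${write_rate} B/s"
echo "suggested gcache.size: ~$(( gcache_bytes / 1000000 ))M"

# Conversely: how long would an existing 2G GCache last at this write rate?
current=2000000000
echo "a 2G GCache covers ~$(( current / write_rate / 3600 )) hours of downtime"
```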
A General Heuristic for Sizing
If you cannot calculate the write rate, you can use a simpler heuristic based on your data directory size as a starting point.
Start with the size of your data directory.
Subtract the size of the GCache's ring buffer file itself (default: galera.cache).
Consider your SST method:
If you use mysqldump for SST, you can also subtract the size of your InnoDB log files (as mysqldump does not copy them).
If you use rsync or xtrabackup, the log files are copied, so they should be part of the total size.
These calculations are guidelines. If your cluster nodes frequently request SSTs, it is a clear sign your gcache.size is too small. In cases where you must avoid SSTs as much as possible, you should use a much larger GCache than suggested, assuming you have the available storage.
CACHE 1 (Multi-Primary): Recommended for multi-primary writes. Disables pre-allocation. Every NEXTVAL() triggers a commit and flush. No values are lost during sync, ensuring continuity, but at the cost of higher I/O and latency.
CACHE 1000 (Single Writer): Recommended for write-heavy single nodes. Allows the writer node to batch updates to disk, offering performance similar to a standalone server. However, if a secondary node writes to the sequence, the primary node's cache is discarded, creating gaps.
CACHE 50 (Hybrid): A balanced approach. Reduces disk flushes significantly compared to CACHE 1 while limiting the size of potential gaps during occasional concurrent writes.
ERROR 1235 ... doesn't yet support SEQUENCEs
Cause: The server version is older than 10.11.16 and Streaming Replication is enabled.
Solution: Upgrade to a supported version or disable Streaming Replication (SET SESSION wsrep_trx_fragment_size=0).
ERROR 1213: Deadlock found when trying to get lock
Cause: The sequence was likely defined with INCREMENT BY 1 (default), causing nodes to contend for the same value.
Solution: Alter the sequence to use the Galera offset logic: ALTER SEQUENCE my_seq INCREMENT BY 0;
Definitive Galera Cluster monitoring: SHOW GLOBAL STATUS wsrep_% variables, Primary quorum checks, wsrep_local_state_comment, and flow control metrics.
From a database client, you can check the status of write-set replication throughout the cluster using standard queries. Status variables that relate to write-set replication have the prefix wsrep_, meaning that you can display them all using the following query:
Understanding Quorum and Cluster Integrity
The most fundamental aspect of a healthy cluster is Quorum. Quorum is a mechanism that ensures data consistency by requiring a majority of nodes to be online and in communication to form a Primary Component. Only the Primary Component will process transactions. This prevents split-brain scenarios, where a network partition could otherwise lead to data conflicts.
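The majority rule can be made concrete with a quick sketch of quorum sizes; this also shows why three nodes is the practical minimum:

```shell
# Quorum is a strict majority of the nodes in the component.
for total in 2 3 4 5; do
  quorum=$(( total / 2 + 1 ))
  echo "$total-node cluster: quorum=$quorum, tolerates $(( total - quorum )) failed node(s)"
done
```

Note that a 2-node cluster tolerates zero failures (and an even split cannot form a majority), which is why at least three nodes are recommended.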
You can check the cluster's integrity and Quorum status using these key variables. For a healthy cluster, the values for these variables must be identical on every node.
Parameter
Description
Expected Value
Checking Individual Node Status
You can monitor the status of individual nodes to ensure they are in working order and able to receive write-sets.
Status Variable
Description
Expected Value
Notes
Understanding Galera Node States
The value of wsrep_local_state_comment tells you exactly what a node is doing. The most common states include:
Node Status
Description
Checking Replication Health
These can help identify performance issues and bottlenecks.
Many status variables are differential and reset after each FLUSH STATUS command.
Metric Name
Description
Recovering a Cluster After a Full Outage
If the entire cluster shuts down or loses its Primary Component, you must manually re-establish a Primary Component by bootstrapping from the most advanced node.
1
Identify the Most Advanced Node
The "most advanced" node is the one that contains the most recent data. You must bootstrap the cluster from this node to avoid any data loss.
This page is licensed: CC BY-SA / Gnu FDL
wsrep_ssl_mode
This system variable is available from MariaDB 11.4 and 10.6.
Select which SSL implementation is used for wsrep provider communications: PROVIDER - wsrep provider internal SSL implementation; SERVER - use server side SSL implementation; SERVER_X509 - as SERVER and require valid X509 certificate.
Usage
The wsrep_ssl_mode system variable is used to configure the WSREP TLS Mode used by MariaDB Enterprise Cluster, powered by Galera.
When set to SERVER or SERVER_X509, MariaDB Enterprise Cluster uses the TLS configuration for MariaDB Enterprise Server:
When set to PROVIDER, MariaDB Enterprise Cluster obtains its TLS configuration from the wsrep_provider_options system variable:
Details
The wsrep_ssl_mode system variable configures the WSREP TLS Mode. The following WSREP TLS Modes are supported:
When the wsrep_ssl_mode system variable is set to PROVIDER, each node obtains its TLS configuration from the wsrep_provider_options system variable. The following options are used:
When the wsrep_ssl_mode system variable is set to SERVER or SERVER_X509, each node obtains its TLS configuration from the node's MariaDB Enterprise Server configuration. The following system variables are used:
Parameters
Galera Cluster System Tables
Starting with Galera 4 (used in and later), several system tables related to replication are available in the mysql database. These tables can be queried by administrators to get a real-time view of the cluster's layout, membership, and current operations.
You can view these tables with the following query:
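For example, listing the Galera system tables by their wsrep prefix:

```sql
SHOW TABLES FROM mysql LIKE 'wsrep%';
```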
mysql vs. mariadb
Galera Load Balancer (glbd)
The Galera Load Balancer (glbd) is no longer under active development.
It is provided here for historical and reference purposes only.
For new deployments, we recommend using a modern, fully supported proxy.
Galera Load Balancer (glbd) is a simple, multi-threaded TCP connection balancer, optimized for database workloads. It was inspired by pen, but unlike pen, GLB focuses only on balancing generic TCP connections.
COMMIT;
SET SESSION wsrep_trx_fragment_unit = 'statements';
SET SESSION wsrep_trx_fragment_size = 10;
START TRANSACTION;
SET SESSION wsrep_trx_fragment_unit = 'statements';
SET SESSION wsrep_trx_fragment_size = 1;
UPDATE work_orders
SET queue_position = queue_position + 1;
SET SESSION wsrep_trx_fragment_size = 0;
sudo apt-get remove mariadb-server galera
sudo yum remove MariaDB-server galera
sudo zypper remove MariaDB-server galera
CREATE DATABASE test_db;
USE test_db;
CREATE TABLE messages (id INT AUTO_INCREMENT PRIMARY KEY, text VARCHAR(255));
INSERT INTO messages (text) VALUES ('Hello from node1!');
SHOW DATABASES; -- test_db should appear
USE test_db;
SELECT * FROM messages; -- 'Hello from node1!' should appear
sudo apt install mariadb-server mariadb-client galera-4 -y # For MariaDB 10.4+ or later, galera-4 is the provider.
# For older versions (e.g., 10.3), use galera-3.
sudo mariadb-secure-installation
# Example for UFW (Ubuntu)
sudo ufw allow 3306/tcp # MariaDB client connections
sudo ufw allow 4567/tcp # Galera replication (multicast and unicast)
sudo ufw allow 4567/udp # Galera replication (multicast)
sudo ufw allow 4568/tcp # Incremental State Transfer (IST)
sudo ufw allow 4444/tcp # State Snapshot Transfer (SST)
sudo ufw reload
sudo ufw enable # If firewall is not already enabled
[mysqld]
# Basic MariaDB settings
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0 # Binds to all network interfaces. Adjust if you have a specific private IP for cluster traffic.
# Galera Provider Configuration
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so # Adjust path if different (e.g., /usr/lib64/galera-4/libgalera_smm.so)
# Galera Cluster Configuration
wsrep_cluster_name="my_galera_cluster" # A unique name for your cluster
# IP addresses of ALL nodes in the cluster, comma-separated.
# Use private IPs if available for cluster communication.
wsrep_cluster_address="gcomm://node1_ip_address,node2_ip_address,node3_ip_address"
# This node's specific configuration
wsrep_node_name="node1" # Must be unique for each node (e.g., node1, node2, node3)
wsrep_node_address="node1_ip_address" # This node's own IP address
sudo systemctl stop mariadb # Ensure it's stopped
sudo galera_new_cluster # This command often wraps the systemctl start --wsrep-new-cluster
# Alternatively: sudo systemctl start mariadb --wsrep-new-cluster
CREATE SEQUENCE seq_tickets START WITH 1 INCREMENT BY 0;
-- 1. Create a sequence optimized for concurrent access
-- CACHE 1 ensures no IDs are skipped if customers bounce between nodes
CREATE SEQUENCE seq_booking_id START WITH 1000 INCREMENT BY 0 CACHE 1;
-- 2. The Application Logic (Run on Node A or Node B)
START TRANSACTION;
-- Generate a unique Booking ID instantly (No cluster lock needed)
SELECT NEXTVAL(seq_booking_id) INTO @new_id;
-- Insert the reservation
INSERT INTO bookings (id, customer, event)
VALUES (@new_id, 'Jane Doe', 'Concert 2025');
COMMIT;
SHOW GLOBAL STATUS LIKE 'wsrep_%'
A boolean indicating if the current component is the Primary Component (1 for true, 0 for false).
sudo yum install rsync (RHEL/CentOS)
WSREP TLS Mode
Values
Description
Provider
PROVIDER
TLS is optional for Enterprise Cluster replication traffic.
Each node obtains its TLS configuration from the wsrep_provider_options system variable. When the provider is not configured to use TLS on a node, the node will connect to the cluster without TLS.
The Provider WSREP TLS Mode is backward compatible with ES 10.5 and earlier. When performing a rolling upgrade from ES 10.5 and earlier, the Provider WSREP TLS Mode can be configured on the upgraded nodes.
Server
SERVER
TLS is mandatory for Enterprise Cluster replication traffic, but X509 certificate verification is not performed.
Each node obtains its TLS configuration from the node's MariaDB Enterprise Server configuration. When MariaDB Enterprise Server is not configured to use TLS on a node, the node will fail to connect to the cluster.
The Server WSREP TLS Mode is the default in ES 10.6.
Server X509
SERVER_X509
TLS and X509 certificate verification are mandatory for Enterprise Cluster replication traffic.
Each node obtains its TLS configuration from the node's MariaDB Enterprise Server configuration. When MariaDB Enterprise Server is not configured to use TLS on a node, the node will fail to connect to the cluster.
Optionally set this system variable to the path of the CA chain directory. The directory must have been processed by openssl rehash. When your CA chain is stored in a single file, use the ssl_ca system variable instead.
Examine the grastate.dat file located in the MariaDB data directory (e.g., /var/lib/mysql/).
Look for the seqno: value in this file. The node with the highest seqno is the most advanced node. If a node crashed or was shut down uncleanly, its seqno may be -1; such nodes should not be used to bootstrap if a node with a positive seqno is available.
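A sketch of the comparison, using sample grastate.dat files created in a temporary directory (the paths, UUID, and seqno values are illustrative):

```shell
# Extract seqno from each node's grastate.dat to find the most advanced node.
d=$(mktemp -d)
mkdir -p "$d/node1" "$d/node2"
printf 'version: 2.1\nuuid: 6c9e8e29-0000-0000-0000-000000000000\nseqno: 1532\n' > "$d/node1/grastate.dat"
printf 'version: 2.1\nuuid: 6c9e8e29-0000-0000-0000-000000000000\nseqno: -1\n'   > "$d/node2/grastate.dat"

for f in "$d"/node*/grastate.dat; do
  seqno=$(awk '/^seqno:/ {print $2}' "$f")
  echo "$f: seqno=$seqno"
done
# Bootstrap from the node with the highest non-negative seqno (node1 here).
```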
2
Bootstrap the New Primary Component
Once you have identified the most advanced node, start the MariaDB service only on that node using the bootstrap command:
Alternatively, you can start mariadbd directly with the --wsrep-new-cluster option.
This node will come online and form a new Primary Component by itself, with a cluster size of 1.
3
Start the Other Nodes
After the first node is successfully running as a new Primary Component, start the MariaDB service normally on all of the other nodes.
They will detect the existing Primary Component, connect to it, and automatically initiate a State Transfer (IST or SST) to synchronize their data and rejoin the cluster.
wsrep_cluster_status
Status of the component the node belongs to. The healthy value is Primary. Any other value indicates issues.
Primary
wsrep_cluster_size
Number of nodes in the current component. This should match the expected total nodes in the cluster.
Matches total expected nodes
wsrep_cluster_state_uuid
Unique identifier for the cluster's state. It must be consistent across all nodes.
Same on all nodes
wsrep_cluster_conf_id
Identifier for the cluster membership group. It must be the same on all nodes.
wsrep_ready
Indicates if the node can accept queries.
ON
If OFF, the node will reject almost all queries.
wsrep_connected
Indicates if the node has network connectivity with other nodes.
ON
If OFF, the node is isolated.
wsrep_local_state_comment
Shows the current node state in a readable format.
N/A
Synced: The node is a healthy, fully operational, and active member of the cluster.
Joining: The node is establishing a connection and synchronizing with the cluster.
Joined: The node has received a state transfer but is applying transactions to catch up before syncing.
Initialized: The node is not connected to any cluster component.
wsrep_local_recv_queue_avg
Average size of the queue of write-sets waiting to be applied. A value consistently higher than 0.0 indicates falling behind and may trigger Flow Control.
wsrep_flow_control_paused
Fraction of time the node has been paused by Flow Control. A value close to 0.0 is ideal; a high value indicates a performance bottleneck.
wsrep_local_send_queue_avg
Average size of the queue of write-sets waiting to be sent to other nodes. Values much greater than 0.0 can indicate network throughput issues.
wsrep_cert_deps_distance
Represents the node's potential for parallel transaction application, helping to optimally tune the wsrep_slave_threads parameter.
You'll see queries referencing the mysql database (e.g., FROM mysql.wsrep_cluster). This is intentional. MariaDB, a MySQL fork, retains the mysql name for its internal system schema to ensure historical and backward compatibility where it manages user permissions and system tables.
This is different from the command-line client, which should always be invoked as mariadb.
These tables are managed by the cluster itself and should not be modified by users, with the exception of wsrep_allowlist.
wsrep_allowlist
This table stores a list of allowed IP addresses that can join the cluster and perform a state transfer (IST/SST). It is a security feature to prevent unauthorized nodes from joining.
To add a new node to the allowlist, you can INSERT its IP address:
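A sketch, assuming the allowlist stores addresses in an ip column (check the table definition on your version first; the address is illustrative):

```sql
INSERT INTO mysql.wsrep_allowlist (ip) VALUES ('192.168.0.4');
```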
If a node attempts to join and its IP address is not in the allowlist, the join will fail. The DONOR nodes will log a warning similar to this:
The joining node will fail with a connection timeout error.
wsrep_cluster
This table contains a single row with a high-level view of the cluster's identity, state, and capabilities.
Attribute
Description
cluster_uuid
The unique identifier for the cluster.
view_id
Corresponds to the wsrep_cluster_conf_id status variable, representing the current membership view ID.
view_seqno
The global transaction sequence number associated with this cluster view.
protocol_version
The wsrep protocol version in use.
capabilities
A bitmask of capabilities provided by the Galera library.
You can query its contents like this:
wsrep_cluster_members
This table provides a real-time list of all the nodes that are currently members of the cluster component.
Node
Description
node_uuid
The unique identifier for each individual node.
cluster_uuid
The UUID of the cluster this node belongs to.
node_name
The human-readable name of the node, set by the wsrep_node_name parameter.
node_incoming_address
The IP address and port where the node is listening for client connections.
Querying this table gives you a quick overview of the current cluster membership:
wsrep_streaming_log
This table contains metadata for Streaming Replication transactions that are currently in progress. Each row represents a write-set fragment. The table is typically empty unless a large or long-running transaction with streaming enabled is active.
Fragment
Description
node_uuid
The UUID of the node where the streaming transaction originated.
trx_id
The transaction identifier.
seqno
The sequence number of the specific write-set fragment.
flags
Flags associated with the fragment.
frag
The binary log events contained in the fragment.
Example of querying the table during a streaming transaction:
Features
Feature
Description
Server Draining
Remove servers smoothly without interrupting active connections.
High Performance
Uses Linux epoll API (2.6+).
Multithreading
Leverages multi-core CPUs for better performance.
Optional Watchdog Module
Monitors server health.
Seamless Client Integration
Uses libglb for load balancing without changing applications, by intercepting connect() calls.
Installation
GLB must be built from source. There are no pre-built packages.
This installs:
glbd (daemon) → /usr/sbin
libglb (shared library)
Running as a Service
To run as a service:
Manage with:
Configuration
GLB can be configured either via command-line options or via a configuration file.
Command-Line Options
Configuration File (glbd.cfg)
Parameter
Description
LISTEN_ADDR
Address/port GLB listens on for client connections
DEFAULT_TARGETS
Space-separated list of backend servers
OTHER_OPTIONS
Extra GLB options (e.g. balancing policy)
Example:
Destination Selection Policies
GLB supports five policies:
Policy
Description
Least Connected (default)
Routes new connections to the server with the fewest active connections (adjusted for weight).
Round Robin
Sequentially cycles through available servers.
Single
Routes all connections to the highest-weight server until it fails or a higher-weight server is available.
Random
Distributes connections randomly among servers.
Source Tracking
Routes all connections from the same client IP to the same server (best-effort).
-T | --top option: restricts balancing to servers with the highest weight.
Runtime Management
GLB can be managed at runtime via:
FIFO file
Control socket (-c <addr:port>)
Commands
Command
Example
Description
Add/Modify server
echo "192.168.0.1:3307:5" | nc 127.0.0.1 4444
Add backend with weight 5
Drain server
echo "192.168.0.1:3307:0" | nc 127.0.0.1 4444
Stop new connections, keep existing
Delete server
echo "192.168.0.1:3307:-1" | nc 127.0.0.1 4444
Remove backend and close active connections
Get routing table
echo "getinfo" | nc 127.0.0.1 4444
Show backends, weight, usage, connections
Performance Statistics
Example:
Field
Description
in / out
Bytes received/sent via client interface
recv / send
Bytes passed and number of recv()/send() calls
conns
Created / concurrent connections
poll
Read-ready / write-ready / total poll calls
elapsed
Time since last report (seconds)
Watchdog
The watchdog module performs asynchronous health checks beyond simple TCP reachability.
Enable with:
Runs mysql.sh with host:port as first argument.
Exit code 0 = healthy, non-zero = failure.
Use -i to set check interval.
With Galera, -D|--discover enables auto-discovery of nodes.
libglb (Shared Library)
libglb enables transparent load balancing by intercepting the connect() system call.
Basic Example
Environment Variables
Variable
Description
GLB_WATCHDOG
Same as --watchdog option
GLB_TARGETS
Comma-separated list of backends (H:P:W)
GLB_BIND
Local bind address for intercepted connections
GLB_POLICY
Balancing policy (single, random, source)
GLB_CONTROL
Control socket for runtime commands
Operational Limits
Limited by system open files (ulimit -n)
With default 1024 → ~493 connections
With 4096 (typical unprivileged user) → ~2029 connections
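These figures are consistent with roughly two file descriptors per proxied connection plus a fixed overhead of about 38 descriptors. This is a back-calculation from the numbers above, not a documented formula:

```shell
# Approximate usable connections for a given open-file limit,
# assuming 2 fds per connection and ~38 fds of fixed overhead.
overhead=38
for fds in 1024 4096; do
  echo "ulimit -n $fds -> ~$(( (fds - overhead) / 2 )) connections"
done
```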
You should have an IBM Cloud account; if you do not, you can register for one.
At the end of the tutorial, you will have a MariaDB Galera cluster up and running. IBM Cloud uses Bitnami charts to deploy MariaDB Galera with Helm.
We will provision a new Kubernetes cluster; if you already have one, skip to Step 2.
We will deploy the IBM Cloud Block Storage plug-in; if you already have it, skip to Step 3.
MariaDB Galera deployment
Step 1: Provision Kubernetes Cluster
Click the Catalog button on the top
Select Service from the catalog
Search for Kubernetes Service and click on it
You are now at the Kubernetes deployment page; you need to specify some details about the cluster.
Choose a standard or free plan; the free plan only has one worker node and no subnet. To provision a standard cluster, you will need to upgrade your account to Pay-As-You-Go.
To upgrade to a Pay-As-You-Go account, complete the following steps:
Now choose your location settings.
Choose Geography (continent)
Choose Single or Multizone. In a single zone, your data is only kept in one datacenter; with Multizone, it is distributed to multiple zones, and thus safer in an unforeseen zone failure.
Choose a Worker Zone if using Single zones or Metro if Multizone
If you wish to use Multizone, please set up your account accordingly.
If there is no available VLAN at your selected location, a new VLAN will be created for you.
Choose a Worker node setup or use the preselected one, set Worker node amount per zone
Choose the Master Service Endpoint. In VRF-enabled accounts, you can choose private-only to make your master accessible on the private network or via VPN tunnel. Choose public-only to make your master publicly accessible. When you have a VRF-enabled account, your cluster is set up by default to use both private and public endpoints.
Give cluster a name
Give desired tags to your cluster.
Click create
Wait for your cluster to be provisioned.
Your cluster is ready for usage
Step 2: Deploy IBM Cloud Block Storage Plug-in
The Block Storage plug-in is a persistent, high-performance iSCSI storage that you can add to your apps by using Kubernetes Persistent Volumes (PVs).
Click the Catalog button on the top
Select Software from the catalog
Search for IBM Cloud Block Storage plug-in and click on it
On the application page, click the dot next to the cluster you wish to use.
Click on Enter or Select Namespace and choose the default namespace or use a custom one (if you get an error, please wait 30 minutes for the cluster to finalize).
Give a name to this workspace
Click install and wait for the deployment
Step 3: Deploy MariaDB Galera
We will deploy MariaDB on our cluster
Click the Catalog button on the top
Select Software from the catalog
Search for MariaDB and click on it
On the application page, click the dot next to the cluster you wish to use.
Click on Enter or Select Namespace and choose the default namespace or use a custom one.
Give a unique name to the workspace, one you can easily recognize.
Select which resource group you want to use; it's for access control and billing purposes.
Give tags to your MariaDB Galera deployment.
Click on Parameters with default values. You can set deployment values or use the default ones.
Please set the MariaDB Galera root password in the parameters
After finishing everything, tick the box next to the agreements and click install
The MariaDB Galera workspace will start installing, wait a couple of minutes
Your MariaDB Galera workspace has been successfully deployed
Verify MariaDB Galera Installation
Go to the IBM Cloud console in your browser.
Click on Clusters
Click on your Cluster
Now you are at your clusters overview, here Click on Actions and Web terminal from the dropdown menu
Click install and wait a couple of minutes.
Click on Actions
Click Web terminal, and a terminal will open up
Type the following in the terminal; please change NAMESPACE to the namespace you chose at the deployment setup:
Enter your pod with bash; please replace PODNAME with your mariadb pod's name
After you are in your pod, please verify that MariaDB is running on your pod's cluster. Enter the root password after the prompt.
You have successfully deployed MariaDB Galera on IBM Cloud!
This page is licensed: CC BY-SA / Gnu FDL
Configuring MariaDB Replication between MariaDB Galera Cluster and MariaDB Server
MariaDB replication can be used to replicate between MariaDB Galera Cluster and MariaDB Server. This article will discuss how to do that.
Configuring the Cluster
Before we set up replication, we need to ensure that the cluster is configured properly. This involves the following steps:
Set log_slave_updates=ON on all nodes in the cluster. See the documentation on configuring a cluster node as a replication primary for more information on why this is important. It is also needed to enable wsrep GTID mode.
Set server_id to the same value on all nodes in the cluster. See the server_id documentation for more information on what this means.
Configuring Wsrep GTID Mode
If you want to use GTID replication, then you also need to configure some things to enable wsrep GTID mode. For example:
wsrep_gtid_mode=ON needs to be set on all nodes in the cluster.
wsrep_gtid_domain_id needs to be set to the same value on all nodes in the cluster so that each cluster node uses the same domain when assigning GTIDs for Galera Cluster's write sets.
log_slave_updates needs to be enabled on all nodes in the cluster. See MDEV-9855 about that.
And as an extra safety measure:
gtid_domain_id should be set to a different value on all nodes in a given cluster, and each of these values should be different than the configured wsrep_gtid_domain_id value. This is to prevent a node from using the same domain used for Galera Cluster's write sets when assigning GTIDs for non-Galera transactions, such as DDL executed with wsrep_OSU_method=RSU set or DML executed with wsrep_on=OFF set.
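Putting the settings above together, each cluster node's configuration might look like the following sketch; the specific IDs and log file name are illustrative assumptions:

```ini
# my.cnf fragment for a Galera cluster node (example values)
[mariadb]
log_bin = mariadb-bin      # same path on all cluster nodes
log_slave_updates = ON     # write replicated transactions to the binary log
server_id = 1              # same value on all cluster nodes
wsrep_gtid_mode = ON
wsrep_gtid_domain_id = 1   # same value on all cluster nodes
gtid_domain_id = 2         # different on each node, and different from wsrep_gtid_domain_id
```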
Configuring the Replica
Before we set up replication, we also need to ensure that the MariaDB Server replica is configured properly. This involves the following steps:
Set server_id to a different value than the one that the cluster nodes are using.
Set gtid_domain_id to a value that is different than the wsrep_gtid_domain_id and gtid_domain_id values that the cluster nodes are using.
Set log_bin and log_slave_updates=ON if you want the replica to log the transactions that it replicates.
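A matching sketch for the MariaDB Server replica; again, the specific values are illustrative assumptions:

```ini
# my.cnf fragment for the MariaDB Server replica (example values)
[mariadb]
server_id = 2           # different from the cluster nodes' server_id
gtid_domain_id = 9      # different from the cluster's wsrep_gtid_domain_id
                        # and gtid_domain_id values
log_bin = mariadb-bin   # optional: log the transactions the replica applies
log_slave_updates = ON  # optional: log replicated transactions
```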
Setting up Replication
Our process to set up replication is going to be similar to the standard process for setting up a replica from a backup, but it will be modified a bit to work in this context.
Start the cluster
The very first step is to start the nodes in the cluster. The first node will have to be bootstrapped. The other nodes can be started normally.
Once the nodes are started, you need to pick a specific node that will act as the replication primary for the MariaDB Server.
1
Backup the Database on the Cluster's Primary Node and Prepare It
The first step is to simply take and prepare a fresh full backup of the node that you have chosen to be the replication primary. For example:
And then you would prepare the backup as you normally would. For example:
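The backup and prepare steps might look like this with mariadb-backup; the target directory and credentials are placeholders:

```shell
# Take a full backup of the chosen primary node
mariadb-backup --backup \
   --target-dir=/var/mariadb/backup/ \
   --user=mariadb-backup --password=mypassword

# Prepare the backup so it is consistent and ready to be restored
mariadb-backup --prepare \
   --target-dir=/var/mariadb/backup/
```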
2
Start the New Replica
Now that the backup has been restored to the MariaDB Server replica, you can start the MariaDB Server process.
1
Create a Replication User on the Cluster's Primary
Before the MariaDB Server replica can begin replicating from the cluster's primary, you need to create a user account on the primary that the replica can use to connect, and you need to grant the user account the REPLICATION SLAVE privilege. For example:
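For instance, a hypothetical replication account; the user name, host, and password are placeholders for illustration:

```sql
CREATE USER 'repl'@'replica.example.com' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'replica.example.com';
```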
2
Setting up Circular Replication
You can also set up circular replication between the cluster and MariaDB Server, which means that the MariaDB Server replicates from the cluster, and the cluster also replicates from the MariaDB Server.
1
Create a Replication User on the MariaDB Server Primary
Before circular replication can begin, you also need to create a user account on the MariaDB Server, since it will be acting as the replication primary to the cluster's replica, and you need to grant the user account the REPLICATION SLAVE privilege. For example:
2
This page is licensed: CC BY-SA / Gnu FDL
Galera Cluster Status Variables
Viewing Galera Cluster Status Variables
Galera status variables can be viewed with the SHOW STATUS statement.
See also the .
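For example, to view all Galera status variables at once:

```sql
SHOW GLOBAL STATUS LIKE 'wsrep_%';
```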
List of Galera Cluster status variables
MariaDB Galera Cluster has the following status variables:
wsrep_applier_thread_count
Description: Stores the current number of applier threads to make clear how many slave threads of this type there are.
wsrep_apply_oooe
Description: How often write sets have been applied out of order, an indicator of parallelization efficiency.
wsrep_apply_oool
Description: How often write sets with a higher sequence number were applied before ones with a lower sequence number, implying slow write sets.
wsrep_apply_waits
Description: Number of times the applier had to wait for an earlier transaction (lower seqno) to be applied before it could apply the next write set.
wsrep_apply_window
Description: Average distance between highest and lowest concurrently applied seqno.
wsrep_cert_deps_distance
Description: Average distance between the highest and the lowest sequence numbers that can possibly be applied in parallel, or the potential degree of parallelization.
wsrep_cert_index_size
Description: The number of entries in the certification index.
wsrep_cert_interval
Description: Average number of transactions received while a transaction replicates.
wsrep_cluster_capabilities
Description:
wsrep_cluster_conf_id
Description: Total number of cluster membership changes that have taken place.
wsrep_cluster_size
Description: Number of nodes currently in the cluster.
wsrep_cluster_state_uuid
Description: UUID state of the cluster. If it matches the value in wsrep_local_state_uuid, the local and cluster nodes are in sync.
wsrep_cluster_status
Description: Cluster component status. Possible values are PRIMARY (primary group configuration, quorum present), NON_PRIMARY (non-primary group configuration, quorum lost), or DISCONNECTED (not connected to group, retrying).
wsrep_cluster_weight
Description: The total weight of the current members in the cluster. The value is counted as a sum of pc.weight of the nodes in the current primary component.
wsrep_commit_oooe
Description: How often a transaction was committed out of order.
wsrep_commit_oool
Description: No meaning.
wsrep_commit_window
Description: Average distance between highest and lowest concurrently committed seqno.
wsrep_connected
Description: Whether or not MariaDB is connected to the wsrep provider. Possible values are ON or OFF.
wsrep_desync_count
Description: Returns the number of operations in progress that require the node to temporarily desync from the cluster.
wsrep_evs_delayed
Description: Provides a comma-separated list of all the nodes this node has registered on its delayed list.
wsrep_evs_evict_list
Description: Lists the UUIDs of all nodes evicted from the cluster. Evicted nodes cannot rejoin the cluster until you restart their mysqld processes.
wsrep_evs_repl_latency
Description: This status variable provides figures for the replication latency on group communication. It measures latency (in seconds) from the time point when a message is sent out to the time point when a message is received. As replication is a group operation, this essentially gives you the slowest ACK and longest RTT in the cluster. The format is min/avg/max/stddev
wsrep_evs_state
Description: Shows the internal state of the EVS protocol.
wsrep_flow_control_paused
Description: The fraction of time since the last FLUSH STATUS command that replication was paused due to flow control.
wsrep_flow_control_paused_ns
Description: The total time spent in a paused state measured in nanoseconds.
wsrep_flow_control_recv
Description: Number of FC_PAUSE events received as well as sent since the most recent status query.
wsrep_flow_control_sent
Description: Number of FC_PAUSE events sent since the most recent status query.
wsrep_gcomm_uuid
Description: The UUID assigned to the node.
wsrep_incoming_addresses
Description: Comma-separated list of incoming server addresses in the cluster component.
wsrep_last_committed
Description: Sequence number of the most recently committed transaction.
wsrep_local_bf_aborts
Description: Total number of local transactions aborted by high-priority (brute-force) replication applier threads.
wsrep_local_cached_downto
Description: The lowest sequence number, or seqno, in the write-set cache (GCache).
wsrep_local_cert_failures
Description: Total number of local transactions that failed the certification test and consequently issued a voluntary rollback.
wsrep_local_commits
Description: Total number of local transactions committed on the node.
wsrep_local_index
Description: The node's index in the cluster. The index is zero-based.
wsrep_local_recv_queue
Description: Current length of the receive queue, which is the number of write sets waiting to be applied.
wsrep_local_recv_queue_avg
Description: Average length of the receive queue since the most recent status query. If this value is noticeably larger than zero, the node is likely to be overloaded and cannot apply the write sets as quickly as they arrive, resulting in replication throttling.
wsrep_local_recv_queue_max
Description: The maximum length of the recv queue since the last FLUSH STATUS command.
wsrep_local_recv_queue_min
Description: The minimum length of the recv queue since the last FLUSH STATUS command.
wsrep_local_replays
Description: Total number of transaction replays due to asymmetric lock granularity.
wsrep_local_send_queue
Description: Current length of the send queue, which is the number of write sets waiting to be sent.
wsrep_local_send_queue_avg
Description: Average length of the send queue since the most recent status query. If this value is noticeably larger than zero, there are most likely network throughput or replication throttling issues.
wsrep_local_send_queue_max
Description: The maximum length of the send queue since the last FLUSH STATUS command.
wsrep_local_send_queue_min
Description: The minimum length of the send queue since the last FLUSH STATUS command.
wsrep_local_state
Description: Internal Galera Cluster FSM state number.
wsrep_local_state_comment
Description: Human-readable explanation of the state.
wsrep_local_state_uuid
Description: The node's UUID state. If it matches the value in wsrep_cluster_state_uuid, the local and cluster nodes are in sync.
wsrep_open_connections
Description: The number of open connection objects inside the wsrep provider.
wsrep_open_transactions
Description: The number of locally running transactions that have been registered inside the wsrep provider. This means transactions that have made operations that have caused write set population to happen. Transactions that are read-only are not counted.
wsrep_protocol_version
Description: The wsrep protocol version being used.
wsrep_provider_name
Description: The name of the provider. The default is "Galera".
wsrep_provider_vendor
Description: The vendor string.
wsrep_provider_version
Description: The version number of the Galera wsrep provider.
wsrep_ready
Description: Whether or not the Galera wsrep provider is ready. Possible values are ON or OFF.
wsrep_received
Description: Total number of write sets received from other nodes.
wsrep_received_bytes
Description: Total size in bytes of all write sets received from other nodes.
wsrep_repl_data_bytes
Description: Total size of data replicated.
wsrep_repl_keys
Description: Total number of keys replicated.
wsrep_repl_keys_bytes
Description: Total size of keys replicated.
wsrep_repl_other_bytes
Description: Total size of other bits replicated.
wsrep_replicated
Description: Total number of write sets replicated to other nodes.
wsrep_replicated_bytes
Description: Total size in bytes of all write sets replicated to other nodes.
wsrep_rollbacker_thread_count
Description: Stores the current number of rollbacker threads to make clear how many slave threads of this type there are.
wsrep_thread_count
Description: Total number of wsrep (applier/rollbacker) threads.
This page is licensed: CC BY-SA / Gnu FDL
Configuring MariaDB Replication between Two MariaDB Galera Clusters
MariaDB replication can be used for replication between two MariaDB Galera Clusters. This article will discuss how to do that.
Configuring the Clusters
Before we set up replication, we need to ensure that the clusters are configured properly. This involves the following steps:
Galera Use Cases
MariaDB Galera Cluster ensures high availability and disaster recovery through synchronous multi-master replication. It's ideal for active-active setups, providing strong consistency and automatic failover, perfect for critical applications needing continuous uptime.
To understand these use cases, it helps to see how Galera's core features are related:
High Availability (HA) for Mission-Critical Applications
Galera's core strength is its synchronous replication, ensuring that data is written to all nodes simultaneously. This makes it ideal for applications where data loss is unacceptable and downtime must be minimal.
As a related hardening measure, recent Galera versions support an allowlist of node IP addresses, stored in the mysql.wsrep_allowlist table, which has a single ip column:

```
+-------+----------+------+-----+---------+-------+
| Field | Type     | Null | Key | Default | Extra |
+-------+----------+------+-----+---------+-------+
| ip    | char(64) | NO   | PRI | NULL    |       |
+-------+----------+------+-----+---------+-------+
```

An address is permitted to join the cluster by inserting it:

```sql
INSERT INTO mysql.wsrep_allowlist(ip) VALUES('18.193.102.155');
```

Connections from addresses not on the allowlist are rejected with a warning such as:

```
[Warning] WSREP: Connection not allowed, IP 3.70.155.51 not found in allowlist.
```

The cluster size can then be verified with:

```shell
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size'"
```
log_bin needs to be set to the same path on all nodes in the cluster. See MDEV-9856 about that.
Copy the Backup to the Replica
Once the backup is done and prepared, you can copy it to the MariaDB Server that will be acting as replica. For example:
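For example, with rsync; the host name and paths are placeholders:

```shell
# Copy the prepared backup to the replica host
rsync -avP /var/mariadb/backup replica.example.com:/var/mariadb/backup
```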
3
Restore the Backup on the Second Cluster's Replica
At this point, you can restore the backup to the replica's datadir, as you normally would. For example:
And adjusting file permissions, if necessary:
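A sketch of the restore and permission steps, assuming the default datadir location:

```shell
# Restore the prepared backup into the (empty) datadir
mariadb-backup --copy-back --target-dir=/var/mariadb/backup/

# Fix ownership so the server can read its files
chown -R mysql:mysql /var/lib/mysql/
```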
Start Replication on the New Replica
At this point, you need to get the replication coordinates of the primary from the original backup.
The coordinates will be in the xtrabackup_binlog_info file.
mariadb-backup dumps replication coordinates in two forms: GTID coordinates, and file and position coordinates, like the ones you would normally see from SHOW MASTER STATUS output. In this case, it is probably better to use the GTID coordinates.
For example:
Regardless of the coordinates you use, you will have to set up the primary connection using CHANGE MASTER TO and then start the replication threads with START SLAVE.
If you want to use GTIDs, then you will have to first set gtid_slave_pos to the GTID coordinates that we pulled from the xtrabackup_binlog_info file, and we would set MASTER_USE_GTID=slave_pos in the CHANGE MASTER TO command. For example:
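A sketch of the GTID-based setup; the GTID value, host, and credentials are placeholders:

```sql
SET GLOBAL gtid_slave_pos = '0-1-1';
CHANGE MASTER TO
   MASTER_HOST = 'cluster-node1.example.com',
   MASTER_USER = 'repl',
   MASTER_PASSWORD = 'password',
   MASTER_USE_GTID = slave_pos;
START SLAVE;
```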
If you want to use the file and position coordinates, then you would set MASTER_LOG_FILE and MASTER_LOG_POS in the CHANGE MASTER TO command to the file and position coordinates that we pulled from the xtrabackup_binlog_info file. For example:
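A sketch of the file-and-position-based setup; the log file name, position, host, and credentials are placeholders:

```sql
CHANGE MASTER TO
   MASTER_HOST = 'cluster-node1.example.com',
   MASTER_USER = 'repl',
   MASTER_PASSWORD = 'password',
   MASTER_LOG_FILE = 'mariadb-bin.000096',
   MASTER_LOG_POS = 568;
START SLAVE;
```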
3
Check the Status of the New Replica
You should be done setting up the replica now, so you should check its status with SHOW SLAVE STATUS. For example:
Now that the MariaDB Server is up, ensure that it does not start accepting writes yet if you want to set up circular replication between the cluster and the MariaDB Server.
Start Circular Replication on the Cluster
How this is done would depend on whether you want to use the coordinates or the file and position coordinates.
Regardless, you need to ensure that the second cluster is not accepting any writes other than those that it replicates from the cluster at this stage.
To get the GTID coordinates on the MariaDB Server, you can check gtid_current_pos by executing:
Then on the node acting as a replica in the cluster, you can set up replication by setting gtid_slave_pos to the GTID that was returned and then executing CHANGE MASTER TO:
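A sketch of both sides of this step; the GTID value, host, and credentials are placeholders:

```sql
-- On the MariaDB Server (the primary for this direction):
SELECT @@GLOBAL.gtid_current_pos;

-- On the cluster node acting as replica:
SET GLOBAL gtid_slave_pos = '1-2-3';
CHANGE MASTER TO
   MASTER_HOST = 'mariadb-server.example.com',
   MASTER_USER = 'repl',
   MASTER_PASSWORD = 'password',
   MASTER_USE_GTID = slave_pos;
START SLAVE;
```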
To get the file and position coordinates on the MariaDB Server, you can execute SHOW MASTER STATUS:
Then on the node acting as a replica in the cluster, you would set master_log_file and master_log_pos in the CHANGE MASTER TO command. For example:
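A sketch using file and position coordinates; the values are placeholders:

```sql
-- On the MariaDB Server:
SHOW MASTER STATUS;

-- On the cluster node acting as replica:
CHANGE MASTER TO
   MASTER_HOST = 'mariadb-server.example.com',
   MASTER_USER = 'repl',
   MASTER_PASSWORD = 'password',
   MASTER_LOG_FILE = 'mariadb-bin.000001',
   MASTER_LOG_POS = 4;
START SLAVE;
```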
3
Check the Status of the Circular Replication
You should be done setting up the circular replication on the node in the first cluster now, so you should check its status with SHOW SLAVE STATUS. For example:
wsrep_gtid_domain_id needs to be set to the same value on all nodes in a given cluster so that each cluster node uses the same domain when assigning GTIDs for Galera Cluster's write sets. Each cluster should have this set to a different value so that each cluster uses different domains when assigning GTIDs for their write sets.
log_slave_updates needs to be enabled on all nodes in the cluster. See MDEV-9855 about that.
log_bin needs to be set to the same path on all nodes in the cluster. See MDEV-9856 about that.
And as an extra safety measure:
gtid_domain_id should be set to a different value on all nodes in a given cluster, and each of these values should be different than the configured wsrep_gtid_domain_id value. This is to prevent a node from using the same domain used for Galera Cluster's write sets when assigning GTIDs for non-Galera transactions, such as DDL executed with wsrep_OSU_method=RSU set or DML executed with wsrep_on=OFF set.
Configuring Parallel Replication
To improve the performance of the replication stream between clusters, it is recommended to enable parallel replication on the nodes in the destination cluster (the cluster acting as the replica).
Setting up Replication
Our process to set up replication is going to be similar to the standard process for setting up a replica from a backup, but it will be modified a bit to work in this context.
1
Start the First Cluster
The very first step is to start the nodes in the first cluster. The first node will have to be bootstrapped. The other nodes can be started normally.
Once the nodes are started, you need to pick a specific node that will act as the replication primary for the second cluster.
2
Backup the Database on the First Cluster's Primary Node and Prepare It
The first step is to simply take and prepare a fresh full backup of the node that you have chosen to be the replication primary. For example:
And then you would prepare the backup as you normally would. For example:
3
Copy the Backup to the Second Cluster's Replica
Once the backup is done and prepared, you can copy it to the node in the second cluster that will be acting as replica. For example:
4
Restore the Backup on the Second Cluster's Replica
At this point, you can restore the backup to the replica's datadir, as you normally would. For example:
And adjusting file permissions, if necessary:
5
Bootstrap the Second Cluster's Replica
Now that the backup has been restored to the second cluster's replica, you can start the server by bootstrapping the node.
6
Create a Replication User on the First Cluster's Primary
Before the second cluster's replica can begin replicating from the first cluster's primary, you need to create a user account on the primary that the replica can use to connect, and you need to grant the user account the REPLICATION SLAVE privilege. For example:
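As before, a hypothetical replication account on the first cluster's primary; the user name, host, and password are placeholders:

```sql
CREATE USER 'repl'@'dc2.example.com' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'dc2.example.com';
```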
7
Start Replication on the Second Cluster's Replica
At this point, you need to get the replication coordinates of the primary from the original backup.
The coordinates will be in the xtrabackup_binlog_info file.
mariadb-backup dumps replication coordinates in two forms: GTID coordinates, and file and position coordinates.
8
Check the Status of the Second Cluster's Replica
You should be done setting up the replica now, so you should check its status with SHOW SLAVE STATUS. For example:
9
Start the Second Cluster
If the replica is replicating normally, then the next step would be to start the MariaDB Server process on the other nodes in the second cluster.
Now that the second cluster is up, ensure that it does not start accepting writes yet if you want to set up circular replication between the two clusters.
Setting up Circular Replication
You can also set up circular replication between the two clusters, which means that the second cluster replicates from the first cluster, and the first cluster also replicates from the second cluster.
1
Create a Replication User on the Second Cluster's Primary
Before circular replication can begin, you also need to create a user account on the second cluster's primary that the first cluster's replica can use to connect, and you need to grant the user account the REPLICATION SLAVE privilege. For example:
2
Start Circular Replication on the First Cluster
How this is done would depend on whether you want to use the coordinates or the file and position coordinates.
Regardless, you need to ensure that the second cluster is not accepting any writes other than those that it replicates from the first cluster at this stage.
To get the GTID coordinates on the second cluster, you can check gtid_current_pos by executing:
3
Check the Status of the Circular Replication
You should be done setting up the circular replication on the node in the first cluster now, so you should check its status with SHOW SLAVE STATUS. For example:
Financial Trading Platforms: These systems demand immediate data consistency across all read and write operations.
E-commerce and Online Retail: Ensures immediate consistency in inventory levels, shopping carts, and order statuses.
Billing and CRM Systems: Applications where customer data must be continuously available and instantly up-to-date, 24/7.
This diagram shows how a proxy like MaxScale handles a node failure. The application is shielded from the downtime, and traffic is automatically rerouted to the healthy nodes.
How It Really Works: The "Synchronous" Nuance
When you hear "synchronous," it doesn't mean every node writes to disk at the exact same millisecond. The process is more elegant:
A client sends a COMMIT to one node (e.g., Node A).
Node A packages the transaction and replicates it to Node B and Node C.
Node B and Node C check the transaction for conflicts (called certification) and signal "OK" back to Node A.
Only after Node A gets an "OK" from all other nodes does it tell the client, "Your transaction is committed."
All nodes then apply the write.
As a result, the data is "safe" on all nodes before the application is ever told the write was successful.
In-Depth Use Case: E-commerce Inventory Control
You have one "Super-Widget" left in stock. Two customers, accessing different nodes, click "Buy" simultaneously.
Without Galera (Traditional Replication):
You risk selling the widget twice due to replication lag.
With Galera Cluster:
Both "buy" transactions (UPDATE inventory SET stock=0...) are sent for cluster certification. The cluster instantly detects the conflict:
One transaction "wins" certification and commits.
The other transaction fails certification and gets a "deadlock" error.
Result: Data integrity is fully maintained.
Always Use a Proxy. Your application shouldn't know about individual nodes. Place a cluster-aware proxy like MariaDB MaxScale in front of your cluster.
Design for 3 (or 5). A Galera Cluster needs a minimum of three nodes (or any odd number) to maintain quorum—the ability to have a "majority vote" and avoid a "split-brain" scenario.
The Trade-Offs
Latency: The synchronous check adds a small amount of latency to every COMMIT.
Application Deadlocks: Your application must be built to handle "deadlock" errors by retrying the transaction.
Zero-Downtime Maintenance and Upgrades
Galera allows for a rolling restart of the cluster members. By taking one node down at a time, performing maintenance, and bringing it back up, the cluster remains operational.
Examples:
Continuous Operations Environments: Organizations with strict SLAs that prohibit maintenance windows.
Database Scaling and Infrastructure Changes: Adding or removing cluster nodes (scaling out or in) without interrupting service.
This flowchart shows the "rolling" process for a 3-node cluster.
How It Really Works: The "Graceful" Maintenance Process
MariaDB Maintenance Process
Isolate the Node
Configure your proxy (e.g., MaxScale) to stop routing new connections to the targeted node for maintenance.
Perform Maintenance
Safely stop the MariaDB service by executing systemctl stop mariadb. Proceed with applying OS patches or upgrading MariaDB binaries.
Restart & Resync
Upon restarting MariaDB, it will automatically synchronize with the cluster. The Incremental State Transfer (IST) ensures only the missed changes are applied.
Rejoin
After syncing, enable the node again in the proxy.
Repeat
Apply these steps to other nodes individually.
Reduced Capacity
While one node is down for maintenance, your 3-node cluster is temporarily running as a 2-node cluster. It's wise to perform maintenance during low-traffic periods.
IST vs. SST
The fast, automatic sync is IST (Incremental State Transfer). If a node is down for too long, it may trigger a State Snapshot Transfer (SST)—a full copy of the entire database. SSTs are resource-intensive.
Disaster Recovery and Geo-Redundancy
Galera can be deployed across multiple physical locations, providing a robust solution for disaster recovery by surviving the complete loss of one site.
Examples:
Multi-Data Center Deployment: Deploying a cluster across three or more geographically separated data centers.
Disaster Recovery Setup: Deploying one cluster in a data center using asynchronous replication to a second cluster in a separate data center.
In-Depth Look: Two Different DR Patterns
This use case covers two distinct architectures with different goals:
This is a single Galera cluster with nodes stretched across multiple data centers. A COMMIT in New York is not "OK'd" until the data is safely certified by the London node. This gives Zero Data Loss (RPO=0) but has a major performance impact.
This is the more common setup. A primary cluster in DC-1 runs at full speed. It asynchronously replicates its data to a separate node/cluster in DC-2. This is fast, but allows for minimal data loss (RPO > 0) in a disaster.
Choosing the right DR pattern
| Feature | Synchronous WAN Cluster | Asynchronous DR Cluster |
| --- | --- | --- |
| Primary Goal | 100% Data Consistency | Primary Site Performance |
| Data Loss (RPO) | Zero (RPO=0) | Seconds to Minutes (RPO > 0) |
| Performance Impact | Very High. All writes are as slow as the RTT to the farthest data center. | None. Primary cluster runs at local network speed. |
| Best For | Financials or other applications where data loss is impossible to tolerate. | |
Scaling Out Write Workloads (Limited)
While synchronous replication adds some overhead, Galera fundamentally allows any node to accept write queries. This is best combined with a proxy like MaxScale to intelligently distribute traffic.
Examples:
Load Balanced Read/Write Traffic: Using MaxScale's Read/Write Split Router to direct reads to any node and writes to a single "Primary" node.
High-Volume Write Environments: Suitable for applications with a high volume of concurrent, non-conflicting write operations.
Myth vs. Reality: Write Throughput in Distributed Systems
Myth: "With 3 nodes, I achieve 3x the write throughput."
Reality: False. Every write must be processed by all three nodes.
Nuance: Enjoy excellent read-scaling. Write scaling is only possible if writes are non-contended (not targeting the same rows).
In-Depth Use Case: The "Read-Write Split" Strategy (Recommended)
This is the most common and recommended architecture. MaxScale's readwritesplit router automatically designates one node as the "Primary" (for writes) and load-balances reads across the others. If the Primary node fails, MaxScale automatically promotes a new one.
| Strategy | "True Multi-Master" | "Read-Write Split" (Recommended) |
| --- | --- | --- |
| How it Works | The application (or proxy) sends writes to all nodes in the cluster. | A proxy (MaxScale) designates one node as "Primary" and sends 100% of writes to it. |
| Pros | Fully utilizes all nodes for writes; no single point of failure for write ingress. | No application deadlocks. Zero certification failures. Simple for the application. |
| Cons | High risk of deadlocks. If two clients update the same row on different nodes, one fails. | Write throughput is limited to what a single node can handle. |
| Best For | Very specific applications that are 100% guaranteed to have no write conflicts. | |
Read-Write Split Strategy
For most applications, using readwritesplit is the safest, most reliable, and effective strategy.
Keep Transactions Small: Large UPDATE operations on a single node can stall the entire cluster during the certification/commit phase.
Trade-Off: readwritesplit is not sharding. Galera focuses on high availability rather than infinite write-scaling. If your application demands more writes than a single powerful server can handle, consider implementing a sharded solution.
These topics will be discussed in more detail below.
Dear Schema Designer:
InnoDB only, always have PK.
Dear Developer:
Check for errors, even after COMMIT.
Moderate sized transactions.
Don't make assumptions about AUTO_INCREMENT values.
Handling of "critical reads" is quite different (arguably better).
Read/Write split is not necessary, but is still advised in case the underlying structure changes in the future.
Dear DBA:
Building the machines is quite different. (Not covered here)
ALTERs are handled differently.
TRIGGERs and EVENTs may need checking.
Overview of cross-colo writing
(This overview is valid even for same-datacenter nodes, but the issues of latency vanish.)
Cross-colo latency is 'different' with Galera than with traditional replication, but not necessarily better or worse. The latency happens at a very different time for Galera.
In 'traditional' replication, these steps occur:
Client talks to Master. If Client and Master are in different colos, this has a latency hit.
Each SQL statement sent to the Master is another latency hit, including(?) the COMMIT (unless using autocommit).
Replication to Slave(s) is asynchronous, so this does not impact the client writing to the Master.
In Galera-based replication:
Client talks to any Master -- possibly with cross-colo latency. Or you could arrange to have Galera nodes co-located with clients to avoid this latency.
At COMMIT time (or end of statement, in the case of autocommit=1), Galera makes one roundtrip to the other nodes.
The COMMIT usually succeeds, but could fail if some other node is messing with the same rows. (Galera retries on autocommit failures.)
For an N-statement transaction in a typical 'traditional' replication setup:
0 or N (N+2?) latency hits, depending on whether the Client is co-located with the Master.
Replication latencies and delays lead to issues with "Critical Reads".
In Galera:
0 latency hits (assuming Client is 'near' some node)
1 latency hit for the COMMIT.
0 (usually) for Critical Read (details below)
Bottom line: Depending on where your Clients are, and whether you clump statements into BEGIN...COMMIT transactions, Galera may be faster or slower than traditional replication in a WAN topology.
AUTO_INCREMENT
By using wsrep_auto_increment_control = ON, the values of auto_increment_increment and auto_increment_offset will be automatically adjusted as nodes come/go.
If you are building a Galera cluster by starting with one node as a Slave to an existing non-Galera system, and if you have multi-row INSERTs that depend on AUTO_INCREMENTs, then read this Percona blog
Bottom line: There may be gaps in AUTO_INCREMENT values. Consecutive rows, even on one connection, will not have consecutive ids.
Beware of Proxies that try to implement a "read/write split". In some situations, a reference to LAST_INSERT_ID() will be sent to a "Slave".
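On a three-node cluster with wsrep_auto_increment_control = ON, the adjusted settings might look like this; the reported values are illustrative:

```sql
SHOW VARIABLES LIKE 'auto_increment%';
-- Node 1 might report: auto_increment_increment = 3, auto_increment_offset = 1
-- Node 2:              auto_increment_increment = 3, auto_increment_offset = 2
-- Node 3:              auto_increment_increment = 3, auto_increment_offset = 3
-- So node 1 generates ids 1, 4, 7, ...; node 2 generates 2, 5, 8, ...; etc.
```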
InnoDB only
For effective replication of data, you must use only InnoDB. This eliminates
FULLTEXT index (until 5.6)
SPATIAL index
MyISAM's PK as second column
You can use MyISAM and MEMORY for data that does not need to be replicated.
Also, you should use "START TRANSACTION READ ONLY" wherever appropriate.
Check after COMMIT
Check for errors after issuing COMMIT. A "deadlock" can occur due to writes on other node(s).
Possible exception (could be useful for legacy code without such checks): Treat the system as single-Master, plus Slaves. By writing only to one node, COMMIT should always succeed(?)
What about autocommit = 1? wsrep_retry_autocommit tells Galera to retry a single autocommitted statement up to N times if it fails certification. So, there is still a chance (very slim) of getting a deadlock on such a statement. The default setting of "1" retry is probably good.
Always have PRIMARY KEY
"Row Based Replication" will be used; this requires a PK on every table. A non-replicated table (eg, MyISAM) does not have to have a PK.
Transaction "size"
(This section assumes you have Galera nodes in multiple colos.) Because of some of the issues discussed, it is wise to group your write statements into moderate sized BEGIN...COMMIT transactions. There is one latency hit per COMMIT or autocommit. So, combining statements will decrease those hits. On the other hand, it is unwise (for other reasons) to make huge transactions, such as inserting/modifying millions of rows in a single transaction.
To deal with failure on COMMIT, design your code so you can redo the SQL statements in the transaction without messing up other data. For example, move "normalization" statements out of the main transaction; there is arguably no compelling reason to roll them back if the main code rolls back.
In any case, doing what is "right" for the business logic overrides other considerations.
Galera's effective transaction isolation level is between SERIALIZABLE and REPEATABLE READ; the tx_isolation variable is ignored.
Set wsrep_log_conflicts to get errors put in the regular MySQL mysqld.err.
XA transactions cannot be supported. (Galera is already doing a form of XA in order to do its thing.)
Critical reads
Here is a 'simple' (but not 'free') way to assure that a read-after-write, even from a different connection, will see the updated data.
Set wsrep_sync_wait before the first SELECT that must see the write; for non-SELECT statements, a different bit of the bitmask applies. (TBD: Would 0xffff always work?) (Before Galera 3.6, this was wsrep_causal_reads = ON.) See the documentation for wsrep_sync_wait.
This setting stalls the SELECT until all current updates have been applied to the node. That is sufficient to guarantee that a previous write will be visible. The time cost is usually zero. However, a large UPDATE could lead to a delay. Because of RBR and parallel application, delays are likely to be less than on traditional replication. Zaitsev's blog
It may be more practical (for a web app) to simply set wsrep_sync_wait right after connecting.
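A sketch of the critical-read pattern described above (table and column names are hypothetical):

```sql
-- Stall this SELECT until all write sets already committed anywhere
-- in the cluster have been applied on this node.
SET SESSION wsrep_sync_wait = 1;   -- bit 1 covers READ statements
SELECT balance FROM accounts WHERE id = 42;
SET SESSION wsrep_sync_wait = 0;   -- turn the stall back off
```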
MyISAM and MEMORY
As said above, use InnoDB only. However, here is more info on the MyISAM (and hence FULLTEXT, SPATIAL, etc) issues. MyISAM and MEMORY tables are not replicated.
Having MyISAM not replicated can be a big benefit -- you can "CREATE TEMPORARY TABLE ... ENGINE=MyISAM" and have it exist on only one node. RBR assures that any data transferred from that temp table into a 'real' table can still be replicated.
Replicating GRANTs
GRANTs and related operations act on the MyISAM tables in the database mysql. The GRANT statements will(?) be replicated, but the underlying tables will not.
ALTERs
Many DDL changes on Galera can be achieved without downtime, even if they take a long time. There are two approaches:
Rolling Schema Upgrade (RSU): manually execute the DDL on each node in the cluster. The node will desync while executing the DDL.
Total Order Isolation (TOI): Galera automatically replicates the DDL to each node in the cluster, and it synchronizes each node so that the statement is executed at the same time (in the replication sequence) on all nodes.
Caution: Since there is no way to synchronize the clients with the DDL, you must make sure that the clients are happy with either the old or the new schema. Otherwise, you will probably need to take down the entire cluster while simultaneously switching over both the schema and the client code.
Fast DDL operations can usually be executed in TOI mode:
DDL operations that support the NOCOPY and INSTANT algorithms are usually very fast.
DDL operations that support the INPLACE algorithm may be fast or slow, depending on whether the table needs to be rebuilt.
For a list of which operations support which algorithms, see .
If you need to use RSU mode, then do the following separately for each node:
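A sketch of one RSU round on a single node (repeat on each node in turn; the table and column are illustrative):

```sql
SET SESSION wsrep_OSU_method = 'RSU';  -- this node desyncs while running the DDL
ALTER TABLE t ADD COLUMN c INT;        -- executed locally, not replicated
SET SESSION wsrep_OSU_method = 'TOI';  -- restore the default method
```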
Single "Master" Configuration
You can 'simulate' Master + Slaves by having clients write only to one node.
No need to check for errors after COMMIT.
Lose the latency benefits.
DBA tricks
Remove node from cluster; back it up; put it back in. Syncup is automatic.
Remove node from cluster; use it for testing, etc; put it back in. Syncup is automatic.
Rolling hardware/software upgrade: Remove; upgrade; put back in. Repeat.
Variables that may need to be different
- If you are writing to multiple nodes, and you use AUTO_INCREMENT, then auto_increment_increment will automatically be set equal to the current number of nodes.
/ - Do not use.
- ROW is required for Galera.
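The corresponding configuration-file settings might look like this (Galera manages the auto_increment variables itself while wsrep_auto_increment_control is ON, the default):

```ini
[mysqld]
binlog_format = ROW   # ROW is required for Galera
# auto_increment_increment / auto_increment_offset are adjusted
# automatically while wsrep_auto_increment_control = ON
```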
Miscellany
Until recently, FOREIGN KEYs were buggy.
LOAD DATA is auto chunked. That is, it is passed to other nodes piecemeal, not all at once.
DROP USER may not replicate?
A slight difference in ROLLBACK on conflict: InnoDB rolls back the smaller transaction; Galera rolls back the last one.
SET GLOBAL wsrep_debug = 1; leads to a lot of debug info in the error log.
Large UPDATEs / DELETEs should be broken up. This admonition is valid for all databases, but there are additional issues in Galera.
WAN: May need to increase (from the defaults) wsrep_provider_options = evs...
MySQL/Percona 5.6 or MariaDB 10 is recommended when going to Galera.
GTIDs
See .
How many nodes to have in a cluster
If all the servers are in the same 'vulnerability zone' -- eg, rack or data center -- have an odd number (at least 3) of nodes.
When spanning colos, you need 3 (or more) data centers in order to be 'always' up, even during a colo failure. With only 2 data centers, Galera can automatically recover from one colo outage, but not the other. (You pick which.)
If you use 3 or 4 colos, these numbers of nodes per colo are safe:
3 nodes: 1+1+1 (1 node in each of 3 colos)
4 nodes: 1+1+1+1 (4 nodes won't work in 3 colos)
5 nodes: 2+2+1, 2+1+1+1 (5 nodes spread 'evenly' across the colos)
Postlog
Posted 2013; VARIABLES: 2015; Refreshed Feb. 2016
See also
Rick James graciously allowed us to use this article in the documentation.
His site has other useful tips, how-tos, optimizations, and debugging tips.
Original source:
This page is licensed: CC BY-SA / Gnu FDL
Introduction to State Snapshot Transfers (SSTs)
In a State Snapshot Transfer (SST), the cluster provisions nodes by transferring a full data copy from one node to another. When a new node joins the cluster, the new node initiates a State Snapshot Transfer to synchronize its data with a node that is already part of the cluster.
Types of SSTs
There are two conceptually different ways to transfer a state from one MariaDB server to another:
Logical: The only SST method of this type is the mysqldump SST method, which uses the mysqldump utility to get a logical dump of the donor. This SST method requires the joiner node to be fully initialized and ready to accept connections before the transfer. This method is, by definition, blocking, in that it blocks the donor node from modifying its state for the duration of the transfer. It is also the slowest of all, which might be an issue for a cluster under heavy load.
Physical: SST methods of this type physically copy the data files from the donor node to the joiner node. This requires that the joiner node be initialized after the transfer. The SST method and a few other SST methods fall into this category. These SST methods are much faster than the mysqldump SST method, but they have certain limitations. For example, they can be used only on server startup, and the joiner node must be configured very similarly to the donor node (e.g., should be the same, and so on). Some of the SST methods in this category are non-blocking on the donor node, meaning that the donor node is still able to process queries while donating the SST (e.g. the SST method is non-blocking).
SST Methods
SST methods are supported via a scriptable interface. New SST methods could potentially be developed by creating new SST scripts. The scripts usually have names of the form wsrep_sst_<method> where <method> is one of the SST methods listed below.
You can choose your SST method by setting the system variable. It can be changed dynamically with on the node that you intend to be an SST donor. For example:
It can also be set in a server in an prior to starting up a node:
For an SST to work properly, the donor and joiner node must use the same SST method. Therefore, it is recommended to set to the same value on all nodes, since any node will usually be a donor or joiner node at some point.
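For instance, the method could be set dynamically on the intended donor node:

```sql
SET GLOBAL wsrep_sst_method = 'mariadb-backup';
```

or, equivalently, persisted as `wsrep_sst_method = mariadb-backup` under the `[mysqld]` (or `[galera]`) option group in a configuration file before starting the node.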
MariaDB Galera Cluster comes with the following built-in SST methods:
mariadb-backup
This SST method uses the mariadb-backup utility for performing SSTs. It is one of the two non-blocking methods. This is the recommended SST method if you require the ability to run queries on the donor node during the SST. Note that if you use the mariadb-backup SST method, then you also need to have socat installed on the server. This is needed to stream the backup from the donor to the joiner. This is a limitation inherited from the xtrabackup-v2 SST method.
This SST method supports
This SST method supports .
This SST method is available from and .
With this SST method, it is impossible to upgrade the cluster between some major versions; see .
See for more information.
rsync / rsync_wan
rsync is the default method. This method uses the utility to create a snapshot of the donor node. rsync should be available by default on all modern Linux distributions. The donor node is blocked with a read lock during the SST. This is the fastest SST method, especially for large datasets since it copies binary data. Because of that, this is the recommended SST method if you do not need to allow the donor node to execute queries during the SST.
The rsync method runs rsync in --whole-file mode, on the assumption that nodes are connected by fast local network links, so that the default delta-transfer mode would consume more processing time than it saves in data transfer bandwidth. If you have a distributed cluster with slow links between nodes, the rsync_wan method runs rsync in the default delta-transfer mode, which may reduce data transfer time substantially when an older datadir state is already present on the joiner node. Both methods are actually implemented by the same script; wsrep_sst_rsync_wan is just a symlink to the wsrep_sst_rsync script, and the actual rsync mode to use is determined by the name by which the script was called.
This SST method supports
This SST method supports .
The rsync SST method does not support tables created with the clause. Use the as an alternative to support this feature.
Use of this SST method could result in data corruption when using (the default). wsrep_sst_method=rsync is a reliable way to upgrade the cluster to a newer major version.
can be used to encrypt data over the wire. Be sure to have stunnel installed. You will also need to generate certificates and keys. See for information on how to do that. Once you have the keys, you will need to add the tkey and tcert options to the [sst] option group in your MariaDB configuration file, such as:
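An illustrative [sst] section (the paths are placeholders for your own key and certificate files):

```ini
[sst]
tkey  = /etc/my.cnf.d/certificates/client-key.pem
tcert = /etc/my.cnf.d/certificates/client-cert.pem
```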
You also need to run the certificate directory through .
cannot be used to encrypt data over the wire.
mysqldump
This SST method runs mysqldump on the donor node and pipes the output to the client connected to the joiner node. The mysqldump SST method needs a username/password pair set in the variable in order to get the dump. The donor node is blocked with a read lock during the SST. This is the slowest SST method.
This SST method supports .
This SST method supports .
xtrabackup-v2
Percona XtraBackup is not supported in MariaDB. is the recommended backup method to use instead of Percona XtraBackup. See for more information.
This SST method uses the utility for performing SSTs. It is one of the two non-blocking methods. Note that if you use the xtrabackup-v2 SST method, you also need to have socat installed on the server. Since Percona XtraBackup is a third-party product, this SST method requires an additional installation and some additional configuration. Please refer to for information from the vendor.
This SST method does not support
This SST method does not support .
This SST method is available from MariaDB Galera Cluster 5.5.37 and MariaDB Galera Cluster 10.0.10.
See xtrabackup-v2 SST method for more information.
xtrabackup
Percona XtraBackup is not supported in MariaDB. is the recommended backup method to use instead of Percona XtraBackup. See for more information.
This SST method is an older SST method that uses the utility for performing SSTs. The xtrabackup-v2 SST method should be used instead of the xtrabackup SST method starting from .
This SST method does not support
This SST method does not support .
Authentication
All SST methods except rsync require authentication via username and password. You can tell the client what username and password to use by setting the system variable. It can be changed dynamically with on the node that you intend to be an SST donor. For example:
It can also be set in a server in an prior to starting up a node:
Some authentication plugins do not require a password. For example, the and authentication plugins do not require a password. If you are using a user account that does not require a password in order to log in, then you can just leave the password component empty. For example:
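As a sketch, with hypothetical user names:

```sql
-- Username and password, separated by a colon
SET GLOBAL wsrep_sst_auth = 'sst_user:sst_password';
-- Passwordless account (e.g. socket-based authentication):
-- leave the password component empty
SET GLOBAL wsrep_sst_auth = 'sst_user:';
```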
See the relevant description or page for each SST method to find out what privileges need to be granted to the user and whether the privileges are needed on the donor node or joiner node for that method.
SSTs and Systemd
MariaDB's unit file has a default startup timeout of about 90 seconds on most systems. If an SST takes longer than this default startup timeout on a joiner node, then systemd will assume that mysqld has failed to start up, which causes systemd to kill the mysqld process on the joiner node. To work around this, you can reconfigure the MariaDB systemd unit to have an infinite timeout, such as by executing one of the following commands:
If you are using systemd 228 or older, then you can execute the following to set an infinite timeout:
Systemd 229 added support for TimeoutStartSec=infinity, so if you are using systemd 229 or later, then you can execute the following to set an infinite timeout:
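For example, a systemd drop-in file (created with `systemctl edit mariadb.service`) could contain:

```ini
[Service]
# systemd 228 and older: a value of 0 disables the start timeout
# systemd 229 and later: 'infinity' is also accepted
TimeoutStartSec=infinity
```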
See for more details.
Note that systemd also provides a mechanism that allows services to extend the startup timeout during long-running processes. Starting with , , and , on systems with systemd versions that support it, MariaDB uses this feature to extend the startup timeout during long SSTs. Therefore, if you are using systemd 236 or later, then you should not need to manually override TimeoutStartSec, even if your SSTs run for longer than the configured value. See for more information.
SST Failure
An SST failure generally renders the joiner node unusable. Therefore, when an SST failure is detected, the joiner node will abort.
Restarting a node after a mysqldump SST failure may require manual restoration of the administrative tables.
SSTs and Data at Rest Encryption
Look at the description of each SST method to determine which methods support .
For logical SST methods like mysqldump, each node should be able to have different . For physical SST methods, all nodes need to have the same , since the donor node will copy encrypted data files to the joiner node, and the joiner node will need to be able to decrypt them.
Minimal Cluster Size
In order to avoid a split-brain condition, the minimum recommended number of nodes in a cluster is 3.
When using an SST method that blocks the donor, there is yet another reason to require a minimum of 3 nodes. In a 3-node cluster, if one node is acting as an SST joiner and one other node is acting as an SST donor, then there is still one more node to continue executing queries.
Manual SSTs
In some cases, if Galera Cluster's automatic SSTs repeatedly fail, then it can be helpful to perform a "manual SST". See the following pages on how to do that:
Known Issues
mysqld_multi
SST scripts can't currently read the mysqld<#> option groups in an option file that are read by instances managed by mysqld_multi.
CREATE USER 'repl'@'dc2-dbserver1' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'dc2-dbserver1';
CREATE USER 'repl'@'c1dbserver1' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'c1dbserver1';
SET GLOBAL slave_parallel_threads = 4; -- Adjust based on workload
SET GLOBAL slave_parallel_mode = 'optimistic';
and file and position coordinates, like the ones you would normally see from output. In this case, it is probably better to use the coordinates.
For example:
Regardless of the coordinates you use, you will have to set up the primary connection using and then start the replication threads with .
If you want to use GTIDs, then you will have to first set to the coordinates that we pulled from the file, and we would set MASTER_USE_GTID=slave_pos in the command. For example:
If you want to use the file and position coordinates, then you would set MASTER_LOG_FILE and MASTER_LOG_POS in the command to the file and position coordinates that we pulled the file. For example:
Then on the first cluster, you can set up replication by setting to the GTID that was returned and then executing :
To get the file and position coordinates on the second cluster, you can execute :
Then on the first cluster, you would set master_log_file and master_log_pos in the command. For example:
Tricks in replication (eg, BLACKHOLE) may not work.
Several variables need to be set differently.
Since replication is asynchronous, a client (on the same or a subsequent connection) cannot be guaranteed to see that data on the slave. This is a "critical read". The async replication delay forces apps to take some evasive action.
Failure of the COMMIT is reported to the Client, who should simply replay the SQL statements from the BEGIN.
Later, the whole transaction will be applied (with possibility of conflict) on the other nodes.
Critical Read -- details below
DDL operations that only support the COPY algorithm are usually very slow.
- 2
- ON: When an IST occurs, want there to be no torn pages? (With FusionIO or other drives that guarantee atomicity, OFF is better.)
- 2 or 0. IST or SST will recover from loss if you have 1.
wsrep_sync_wait (previously wsrep_causal_reads) - used transiently to deal with "critical reads".
6 nodes: 2+2+2, 2+2+1+1
7 nodes: 3+2+2, 3+3+1, 2+2+2+1, 3+2+1+1
There may be a way to "weight" the nodes differently; that would allow a few more configurations. With "weighting", give each colo the same weight; then subdivide the weight within each colo evenly. Four nodes in 3 colos: (1/6+1/6) + 1/3 + 1/3 That way, any single colo failure cannot lead to "split brain".
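Galera exposes node weighting through the pc.weight provider option. A sketch for the "four nodes in three colos" example above, with the fractional weights scaled to integers (1/6 → 1, 1/3 → 2):

```ini
# On each of the two nodes sharing a colo:
wsrep_provider_options = "pc.weight=1"
# On the single node in each of the other two colos:
wsrep_provider_options = "pc.weight=2"
# Total weight = 1+1+2+2 = 6; losing any one colo leaves a majority.
```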
The current version of the Galera wsrep provider library is 26.4.21 for Galera 4. For convenience, packages containing this library are included in the MariaDB repositories.
Currently, MariaDB Galera Cluster only supports the storage engine (although there is experimental support for and, from , ).
Galera Cluster Support in MariaDB Server
MariaDB Galera Cluster is powered by:
MariaDB Server.
The patch for MySQL Server and MariaDB Server. The patch currently supports only Unix-like operating systems.
The .
The patch has been merged into MariaDB Server. This means that the functionality of MariaDB Galera Cluster can be obtained by installing the standard MariaDB Server packages and the Galera wsrep provider library package. The following version corresponds to each MariaDB Server version:
MariaDB Galera Cluster uses Galera 4. This means that the wsrep API is version 26 and the Galera wsrep provider library is version 4.
See for more information about how to interpret these version numbers.
See for more information about which specific version is included in each release of MariaDB Server.
In supported builds, Galera Cluster functionality can be enabled by setting some configuration options that are mentioned below. Galera Cluster functionality is not enabled in a standard MariaDB Server installation unless explicitly enabled with these configuration options.
Prerequisites
Swap Size Requirements
During normal operation, a MariaDB Galera node consumes no more memory than a regular MariaDB server. Additional memory is consumed for the certification index and uncommitted write sets, but normally this should not be noticeable in a typical application. There is one exception, though:
Writeset caching during state transfer
When a node is receiving a state transfer, it cannot process and apply incoming writesets because it has no state to apply them to yet. Depending on the state transfer mechanism, the node that sends the state transfer may not be able to apply writesets either. Thus, both nodes need to cache those writesets for a catch-up phase. Currently, the writesets are cached in memory, and if the system runs out of memory, either the state transfer will fail or the cluster will block waiting for the state transfer to end.
To control memory usage for writeset caching, check the following Galera parameters: gcs.recv_q_hard_limit, gcs.recv_q_soft_limit, and gcs.max_throttle.
Limitations
Before using MariaDB Galera Cluster, we would recommend reading through the , so you can be sure that it is appropriate for your application.
Installing MariaDB Galera Cluster
To use MariaDB Galera Cluster, there are two primary packages that you need to install:
A MariaDB Server version that supports Galera Cluster.
The Galera wsrep provider library.
As mentioned in the previous section, Galera Cluster support is actually included in the standard MariaDB Server packages. That means that installing the MariaDB Galera Cluster package is the same as installing the standard MariaDB Server package in those versions. However, you will also have to install an additional package to obtain the Galera wsrep provider library.
Some SST methods may also require additional packages to be installed. The SST method is generally the best option for large clusters that expect a heavy load.
Installing MariaDB Galera Cluster with a Package Manager
MariaDB Galera Cluster can be installed via a package manager on Linux. In order to do so, your system needs to be configured to install from one of the MariaDB repositories.
You can configure your package manager to install it from MariaDB Corporation's MariaDB Package Repository by using the .
You can also configure your package manager to install it from MariaDB Foundation's MariaDB Repository by using the .
Installing MariaDB Galera Cluster with yum/dnf
On RHEL, CentOS, Fedora, and other similar Linux distributions, it is highly recommended to install the relevant from MariaDB's
repository using or . Starting with RHEL 8 and Fedora 22, yum has been replaced by dnf, which is the next major version of yum. However, yum commands still work on many systems that use dnf.
To install MariaDB Galera Cluster with yum or dnf, follow the instructions at .
Installing MariaDB Galera Cluster with apt-get
On Debian, Ubuntu, and other similar Linux distributions, it is highly recommended to install the relevant from MariaDB's
repository using .
To install MariaDB Galera Cluster with apt-get, follow the instructions at .
Installing MariaDB Galera Cluster with zypper
On SLES, OpenSUSE, and other similar Linux distributions, it is highly recommended to install the relevant from MariaDB's repository using .
To install MariaDB Galera Cluster with zypper, follow the instructions at .
Installing MariaDB Galera Cluster with a Binary Tarball
To install MariaDB Galera Cluster with a binary tarball, follow the instructions at .
To make the location of the libgalera_smm.so library in binary tarballs more similar to its location in other packages, the library is now found at lib/galera/libgalera_smm.so in the binary tarballs, and there is a symbolic link in the lib directory that points to it.
Installing MariaDB Galera Cluster from Source
To install MariaDB Galera Cluster by compiling it from source, you will have to compile both MariaDB Server and the Galera wsrep provider library. For some information on how to do this, see the pages at . The pages at and Galera Cluster Documentation: Building Galera Cluster for MySQL may also be helpful.
Configuring MariaDB Galera Cluster
A number of options need to be set in order for Galera Cluster to work when using MariaDB. See for more information.
Bootstrapping a New Cluster
The first node of a new cluster needs to be bootstrapped by starting on that node with the option. This option tells the node that there is no existing cluster to connect to. The node will create a new UUID to identify the new cluster.
Do not use the option when connecting to an existing cluster. Restarting the node with this option set will cause the node to create a new UUID to identify the cluster again, and the node won't reconnect to the old cluster. See the next section about how to reconnect to an existing cluster.
For example, if you were manually starting on a node, then you could bootstrap it by executing the following:
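A sketch of a manual bootstrap (the server binary is named mysqld in older releases; mariadbd in newer ones):

```shell
# Bootstrap the first node of a brand-new cluster
mariadbd --wsrep-new-cluster
```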
However, keep in mind that most users are not going to be starting manually. Instead, most users will use a to start . See the following sections on how to bootstrap a node with the most common service managers.
Systemd and Bootstrapping
On operating systems that use , a node can be bootstrapped in the following way:
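For example:

```shell
sudo galera_new_cluster
```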
This wrapper uses to run with the option.
If you are using the service that supports the , then you can bootstrap a specific instance by specifying the instance name as a suffix. For example:
Systemd support and the galera_new_cluster script were added.
SysVinit and Bootstrapping
On operating systems that use , a node can be bootstrapped in the following way:
This runs with the option.
Adding Another Node to a Cluster
Once you have a cluster running and you want to add/reconnect another node to it, you must supply an address of one or more of the existing cluster members in the option. For example, if the first node of the cluster has the address 192.168.0.1, then you could add a second node to the cluster by setting the following option in a server in an :
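For example (addresses beyond the 192.168.0.1 from the text are illustrative):

```ini
[galera]
wsrep_cluster_address = "gcomm://192.168.0.1"
# Better: list every cluster member, e.g.
# wsrep_cluster_address = "gcomm://192.168.0.1,192.168.0.2,192.168.0.3"
```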
The new node only needs to connect to one of the existing cluster nodes. Once it connects to one of the existing cluster nodes, it will be able to see all of the nodes in the cluster. However, it is generally better to list all nodes of the cluster in , so that any node can join a cluster by connecting to any of the other cluster nodes, even if one or more of the cluster nodes are down. It is even OK to list a node's own IP address in , since Galera Cluster is smart enough to ignore it.
Once all members agree on the membership, the cluster's state will be exchanged. If the new node's state is different from that of the cluster, then it will request an IST or SST to make itself consistent with the other nodes.
Restarting the Cluster
If you shut down all nodes at the same time, then you have effectively terminated the cluster. Of course, the cluster's data still exists, but the running cluster no longer exists. When this happens, you'll need to bootstrap the cluster again.
If the cluster is not bootstrapped and on the first node is just started normally, then the node will try to connect to at least one of the nodes listed in the option. If no nodes are currently running, then this will fail. Bootstrapping the first node solves this problem.
Determining the Most Advanced Node
In some cases Galera will refuse to bootstrap a node if it detects that it might not be the most advanced node in the cluster. Galera makes this determination if the node was not the last one in the cluster to be shut down or if the node crashed. In those cases, manual intervention is needed.
If you know for sure which node is the most advanced you can edit the grastate.dat file in the . You can set safe_to_bootstrap=1 on the most advanced node.
You can determine which node is the most advanced by checking grastate.dat on each node and looking for the node with the highest seqno. If the node crashed and seqno=-1, then you can find the most advanced node by recovering the seqno on each node with the option. For example:
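A sketch of recovering the position manually (the exact binary name and log destination vary by version):

```shell
mariadbd --wsrep_recover
# Then look in the log output for a line of the form:
#   WSREP: Recovered position: <cluster UUID>:<seqno>
```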
Systemd and Galera Recovery
On operating systems that use , the position of a node can be recovered by running the galera_recovery script. For example:
If you are using the service that supports the , then you can recover the position of a specific instance by specifying the instance name as a suffix. For example:
The galera_recovery script recovers the position of a node by running with the option.
When the galera_recovery script runs , it does not write to the error log. Instead, it redirects log output to a file named with the format /tmp/wsrep_recovery.XXXXXX, where XXXXXX is replaced with random characters.
When Galera is enabled, MariaDB's service automatically runs the galera_recovery script prior to starting MariaDB, so that MariaDB starts with the proper Galera position.
Support for and the galera_recovery script were added.
State Snapshot Transfers (SSTs)
In a State Snapshot Transfer (SST), the cluster provisions nodes by transferring a full data copy from one node to another. When a new node joins the cluster, the new node initiates a State Snapshot Transfer to synchronize its data with a node that is already part of the cluster.
See for more information.
Incremental State Transfers (ISTs)
In an Incremental State Transfer (IST), the cluster provisions nodes by transferring only a node's missing writesets from one node to another. When a node rejoins the cluster, it initiates an Incremental State Transfer to synchronize its data with a node that is already part of the cluster.
If a node has only been out of a cluster for a little while, then an IST is generally faster than an SST.
Data at Rest Encryption
MariaDB Galera Cluster supports . See for some disclaimers on how SSTs are affected when encryption is configured.
Some data still cannot be encrypted:
The disk-based Galera gcache is not encrypted.
Monitoring
Status Variables
can be queried with the standard command. For example:
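A sketch of the query (the variable names shown in the comment are a few of the commonly monitored ones):

```sql
SHOW GLOBAL STATUS LIKE 'wsrep_%';
-- e.g. wsrep_cluster_size, wsrep_cluster_status, wsrep_local_state_comment
```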
Cluster Change Notifications
The cluster nodes can be configured to invoke a command when cluster membership or node status changes. This mechanism can also be used to communicate the event to some external monitoring agent. This is configured by setting . See for more information.
See Also
Footnotes
This page is licensed: CC BY-SA / Gnu FDL
MariaDB Enterprise Cluster Security
The features described on this page are available from MariaDB Enterprise Server 10.6.
WSREP stands for Write-Set Replication.
MariaDB Enterprise Cluster, powered by Galera, adds some security features:
New TLS Modes have been implemented, which can be used to configure mandatory TLS and X.509 certificate verification for Enterprise Cluster:
have been implemented for Enterprise Cluster replication traffic.
have been implemented for SSTs that use MariaDB Enterprise Backup or Rsync.
WSREP TLS Modes
MariaDB Enterprise Cluster, powered by Galera, adds the system variable, which configures the WSREP TLS Mode used for Enterprise Cluster replication traffic.
The following WSREP TLS Modes are supported:
WSREP TLS Mode
Values
Description
WSREP TLS Modes: Provider
MariaDB Enterprise Cluster supports the Provider WSREP TLS Mode, which is equivalent to Enterprise Cluster's TLS implementation in earlier versions of MariaDB Server. The Provider WSREP TLS Mode is primarily intended for backward compatibility, and it is most useful for users who need to perform a rolling upgrade to Enterprise Server 10.6.
The Provider WSREP TLS Mode can be configured by setting the system variable to PROVIDER.
TLS is optional in the Provider WSREP TLS Mode. When the provider is not configured to use TLS on a node, the node will connect to the cluster without TLS.
Each node obtains its TLS configuration from the system variable. The following options are used:
WSREP Provider Option
Description
For example:
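An illustrative setting (the paths are placeholders; socket.ssl_cert, socket.ssl_key, and socket.ssl_ca are standard Galera provider options):

```ini
[mariadb]
wsrep_provider_options = "socket.ssl_cert=/certs/server-cert.pem;socket.ssl_key=/certs/server-key.pem;socket.ssl_ca=/certs/ca.pem"
```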
WSREP TLS Modes: Server and Server X.509
MariaDB Enterprise Cluster adds the Server and Server X.509 WSREP TLS Modes for users who require mandatory TLS.
The Server WSREP TLS Mode can be configured by setting the system variable to SERVER. In the Server WSREP TLS Mode, TLS is mandatory, but X.509 certificate verification is not performed. The Server WSREP TLS Mode is the default.
The Server X.509 WSREP TLS Mode can be configured by setting the system variable to SERVER_X509. In the Server X.509 WSREP TLS Mode, TLS and X.509 certification verification are mandatory.
In MariaDB Enterprise Server 10.6.8-4 and higher, TLS is not mandatory in the Server WSREP TLS Mode. When MariaDB Enterprise Server is not configured to use TLS on a node, or TLS is not working, the Galera library will not activate the TLS service, and connections between nodes will be unencrypted. Prior to 10.6.8-4, TLS is mandatory in the Server WSREP TLS Mode, but X.509 certificate verification is not performed.
For both the Server and Server X.509 WSREP TLS Modes, each node obtains its TLS configuration from the node's MariaDB Enterprise Server configuration. The following system variables are used:
System Variables
Description
For example:
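A minimal sketch for the Server WSREP TLS Mode, using the server's standard TLS system variables; paths are placeholders:

```ini
[mariadb]
...
ssl_cert = /etc/my.cnf.d/certificates/server-cert.pem
ssl_key = /etc/my.cnf.d/certificates/server-key.pem
ssl_ca = /etc/my.cnf.d/certificates/ca.pem
wsrep_ssl_mode = SERVER
```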
SST TLS Modes
MariaDB Enterprise Cluster, powered by Galera, adds the ssl-mode option, which configures the SST TLS Mode for State Snapshot Transfers (SSTs). The ssl-mode option is supported by the following SST methods, which can be configured using the wsrep_sst_method system variable:
SST Method
wsrep_sst_method
The following SST TLS Modes are supported:
SST/TLS Mode
Values
Description
SST TLS Modes: Backward Compatible
In MariaDB Enterprise Server 10.6, MariaDB Enterprise Cluster adds the Backward Compatible SST TLS Mode for SSTs that use MariaDB Enterprise Backup or Rsync. The Backward Compatible SST TLS Mode is primarily intended for backward compatibility with ES 10.5 and earlier, and it is most useful for users who need to perform a rolling upgrade to ES 10.6.
The Backward Compatible SST TLS Mode is the default, but it can also be configured by setting the ssl_mode option to DISABLED in a configuration file in the [sst] group.
TLS is optional in the Backward Compatible SST TLS Mode. When the SST is not configured to use TLS, the SST will occur without TLS.
Each node obtains its TLS configuration from a configuration file in the [sst] group. The following options are used:
Option
Description
For example:
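A minimal sketch for the Backward Compatible SST TLS Mode, using the tca, tcert, and tkey options described above; paths are placeholders:

```ini
[sst]
tca = /etc/my.cnf.d/certificates/ca.pem
tcert = /etc/my.cnf.d/certificates/server-cert.pem
tkey = /etc/my.cnf.d/certificates/server-key.pem
```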
SST TLS Modes: Server and Server X.509
MariaDB Enterprise Cluster adds the Server and Server X.509 SST TLS Modes for SSTs that use MariaDB Enterprise Backup or Rsync. The Server and Server X.509 SST TLS Modes are intended for users who require mandatory TLS.
The Server SST TLS Mode can be configured by setting the ssl_mode option to REQUIRED in a configuration file in the [sst] group. In the Server SST TLS Mode, TLS is mandatory, but X.509 certificate verification is not performed.
The Server X.509 SST TLS Mode can be configured by setting the ssl_mode option to VERIFY_CA or VERIFY_IDENTITY in a configuration file in the [sst] group. In the Server X.509 SST TLS Mode, TLS and X.509 certificate verification are mandatory. Prior to the state transfer, the Donor node will verify the Joiner node's X.509 certificate, and the Joiner node will verify the Donor node's X.509 certificate.
TLS is mandatory in both the Server and Server X.509 SST TLS Modes. When MariaDB Enterprise Server is not configured to use TLS on a node, the node will fail to connect during an SST.
Each node obtains its TLS configuration from the node's MariaDB Enterprise Server configuration. The following system variables are used:
System Variable
Description
For example:
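A minimal sketch for the Server X.509 SST TLS Mode, combining the server's TLS system variables with the ssl-mode option; paths are placeholders:

```ini
[mariadb]
...
ssl_cert = /etc/my.cnf.d/certificates/server-cert.pem
ssl_key = /etc/my.cnf.d/certificates/server-key.pem
ssl_ca = /etc/my.cnf.d/certificates/ca.pem

[sst]
ssl-mode = VERIFY_CA
```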
When the tca, tcert, and tkey options are configured, the Server and Server X.509 SST TLS Modes use those parameters instead of the MariaDB Enterprise Server system variables. In that case, the following message will be written to the MariaDB error log:
Cluster Name Verification
MariaDB Enterprise Cluster, powered by Galera, adds cluster name verification for Joiner nodes, which ensures that the Joiner node does not perform a State Snapshot Transfer (SST) or an Incremental State Transfer (IST) for the wrong cluster.
Prior to performing a State Snapshot Transfer (SST) or Incremental State Transfer (IST), the Donor node checks the wsrep_cluster_name value configured by the Joiner node to verify that the node belongs to the cluster.
Certificate Expiration Warnings
MariaDB Enterprise Cluster, powered by Galera, can be configured to write certificate expiration warnings to the MariaDB error log when the node's X.509 certificate is close to expiration.
Certificate expiration warnings can be configured using the wsrep_certificate_expiration_hours_warning system variable:
When the wsrep_certificate_expiration_hours_warning system variable is set to 0, certificate expiration warnings are not printed to the MariaDB Error Log.
When the wsrep_certificate_expiration_hours_warning system variable is set to a value N, which is greater than 0, certificate expiration warnings are printed to the MariaDB Error Log when the node's certificate expires in N hours or less.
For example:
Enable TLS without Downtime
MariaDB Enterprise Cluster, powered by Galera, adds new capabilities that allow TLS to be enabled for Enterprise Cluster replication traffic without downtime.
Enabling TLS without downtime relies on two new options implemented for the wsrep_provider_options system variable:
Option
Dynamic
Default
Description
SET GLOBAL gtid_slave_pos = "0-1-2";
CHANGE MASTER TO
MASTER_HOST="c1dbserver1",
MASTER_PORT=3310,
MASTER_USER="repl",
MASTER_PASSWORD="password",
MASTER_USE_GTID=slave_pos;
START SLAVE;
CREATE USER 'repl'@'c2dbserver1' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'c2dbserver1';
SHOW SLAVE STATUS\G
SHOW GLOBAL VARIABLES LIKE 'gtid_current_pos';
SHOW SLAVE STATUS\G
mariadb-bin.000096 568 0-1-2
SET GLOBAL gtid_slave_pos = "0-1-2";
CHANGE MASTER TO
MASTER_HOST="c2dbserver1",
MASTER_PORT=3310,
MASTER_USER="repl",
MASTER_PASSWORD="password",
MASTER_USE_GTID=slave_pos;
START SLAVE;
SET SESSION wsrep_sync_wait = 1;
SELECT ...
SET SESSION wsrep_sync_wait = 0;
SET SESSION wsrep_OSU_method='RSU';
ALTER TABLE tab <alter options here>;
SET SESSION wsrep_OSU_method='TOI';
Cluster name verification checks that a Joiner node belongs to the cluster prior to performing a State Snapshot Transfer (SST) or an Incremental State Transfer (IST).
Certificate expiration warnings are written to the MariaDB error log when the node's X.509 certificate is close to expiration.
TLS is optional for Enterprise Cluster replication traffic.
Each node obtains its TLS configuration from the wsrep_provider_options system variable. When the provider is not configured to use TLS on a node, the node will connect to the cluster without TLS.
The Provider WSREP TLS Mode is backward compatible with ES 10.5 and earlier. When performing a rolling upgrade from ES 10.5 and earlier, the Provider WSREP TLS Mode can be configured on the upgraded nodes.
Each node obtains its TLS configuration from the node's MariaDB Enterprise Server configuration.
Starting with MariaDB Enterprise Server 10.6.8-4, TLS is not mandatory in the Server WSREP TLS Mode. If MariaDB Enterprise Server is not configured to use TLS on a node, or TLS is not working, the Galera library will not activate the TLS service; connections will not fail, but they will be unencrypted.
Prior to 10.6.8-4, TLS is mandatory in the Server WSREP TLS Mode; X.509 certificate verification is not performed, and if MariaDB Enterprise Server is not configured to use TLS, the node will fail to connect to the cluster.
The Server WSREP TLS Mode is the default in ES 10.6.
TLS and X.509 certificate verification are mandatory for Enterprise Cluster replication traffic.
Each node obtains its TLS configuration from the node's MariaDB Enterprise Server configuration. When MariaDB Enterprise Server is not configured to use TLS on a node, the node will fail to connect to the cluster.
Optionally set this system variable to the path of the CA chain directory. The directory must have been processed by openssl rehash. When your CA chain is stored in a single file, use the ssl_ca system variable instead.
Each node obtains its TLS configuration from the tca, tcert, and tkey options. When the SST is not configured to use TLS on a node, the node will connect during the SST without TLS.
The Backward Compatible SST TLS Mode is backward compatible with ES 10.5 and earlier, so it is suitable for rolling upgrades.
The Backward Compatible SST TLS Mode is the default in ES 10.6.
TLS is mandatory for SST traffic, but X.509 certificate verification is not performed.
Each node obtains its TLS configuration from the node's MariaDB Enterprise Server configuration. When MariaDB Enterprise Server is not configured to use TLS on a node, the node will fail to connect during an SST.
TLS and X.509 certificate verification are mandatory for SST traffic.
Each node obtains its TLS configuration from the node's MariaDB Enterprise Server configuration. When MariaDB Enterprise Server is not configured to use TLS on a node, the node will fail to connect during an SST.
Prior to the state transfer, the Donor node will verify the Joiner node's X.509 certificate, and the Joiner node will verify the Donor node's X.509 certificate.
tca
Set this option to the path of the CA chain file.
tcert
Set this option to the path of the node's X.509 certificate file.
tkey
Set this option to the path of the node's private key file.
Set this system variable to the path of the node's private key file.
socket.dynamic
No
false
When set to true, the node will allow TLS and non-TLS communications at the same time.
socket.ssl_reload
Yes
N/A
When set to true with the SET GLOBAL statement, Enterprise Cluster dynamically re-initializes its TLS context.
This is most useful if you need to replace a certificate that is about to expire without restarting the server.
The paths to the certificate and key files cannot be changed dynamically, so the updated certificates and keys must be placed at the same paths defined by the relevant TLS variables.
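For example, after the replacement certificate and key files have been placed at the existing paths, the reload can be triggered like this (a sketch of the documented socket.ssl_reload usage):

```sql
-- Replace the certificate and key files on disk first, at the same paths,
-- then ask the provider to re-initialize its TLS context:
SET GLOBAL wsrep_provider_options = 'socket.ssl_reload=1';
```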
new ssl configuration options (ssl-ca, ssl-cert and ssl-key) are ignored by SST due to presence of the tca, tcert and/or tkey in the [sst] section
[mariadb]
...
# warn 3 days before certificate expiration
wsrep_certificate_expiration_hours_warning=72
Building the Galera wsrep Package on Fedora
The instructions on this page were used to create the galera package on the Fedora Linux distribution. This package contains the wsrep provider for MariaDB Galera Cluster.
The following table lists each version of the Galera 4 wsrep provider and the version of MariaDB in which each one was first released. If you would like to install Galera 4 using a package manager, the package is called galera-4.
Galera Version
Released in MariaDB Version
26.4.21
, , , , ,
The following table lists each version of the Galera 3 wsrep provider and the version of MariaDB in which each one was first released. If you would like to install Galera 3 using a package manager, the package is called galera.
Galera Version
Released in MariaDB Version
The following table lists each version of the Galera 2 wsrep provider and the version of MariaDB Galera Cluster in which each one was first released.
Galera Version
Released in MariaDB Galera Cluster Version
For convenience, a galera package containing the preferred wsrep provider is included in the MariaDB repositories (the preferred versions are bolded in the table above).
Install the prerequisites:
Clone the source repository and check out the mariadb-3.x branch:
Build the packages by executing the build.sh script under the scripts/ directory with the -p switch:
When finished, you will have an RPM package containing the Galera library, arbitrator, and related files in the current directory. Note: The same set of instructions can be applied to other RPM-based platforms to generate the Galera package.
This page is licensed: CC BY-SA / Gnu FDL
mariadb-backup SST Method
Configure State Snapshot Transfers for Galera. Learn to use mariadb-backup for non-blocking data transfer when a new node joins a cluster.
The mariabackup SST method uses the mariadb-backup utility for performing SSTs. It is one of the methods that does not block the donor node. mariadb-backup was originally forked from Percona XtraBackup, and similarly, the mariabackup SST method was originally forked from the xtrabackup-v2 SST method.
The socat utility must be installed on the server. It is needed to stream the backup from the donor node to the joiner node. This is a limitation that was inherited from the xtrabackup-v2 SST method.
Choosing mariadb-backup for SSTs
To use the mariadb-backup SST method, you must set wsrep_sst_method=mariabackup on both the donor and joiner nodes. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor. For example:
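```sql
SET GLOBAL wsrep_sst_method = 'mariabackup';
```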
It can also be set in a server option group in an option file prior to starting up a node:
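```ini
[mariadb]
...
wsrep_sst_method = mariabackup
```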
For an SST to work properly, the donor and joiner node must use the same SST method. Therefore, it is recommended to set wsrep_sst_method to the same value on all nodes, since any node will usually be a donor or joiner node at some point.
Major Version Upgrades
The InnoDB redo log format has been changed in some major versions in a way that will not allow crash recovery or the preparation of a backup from an older major version. Because of this, the mariabackup SST method cannot be used for some major-version upgrades, unless you temporarily edit the wsrep_sst_mariabackup script so that the --prepare step on the newer-major-version joiner is executed using the older-major-version mariadb-backup tool.
The default method wsrep_sst_method=rsync works for major-version upgrades; see MDEV-27437.
Configuration Options
The mariabackup SST method is configured by placing options in the [sst] section of a MariaDB configuration file (e.g., /etc/my.cnf.d/server.cnf). These settings are parsed by the wsrep_sst_mariabackup and wsrep_sst_common scripts.
The command-line utility is mariadb-backup; this tool was previously called mariabackup. The SST method itself retains the original name mariabackup (as in wsrep_sst_method=mariabackup).
Primary Transfer and Format Options
These options control the core data transfer mechanism.
Option
Default Value
Description
streamfmt
mbstream
Specifies the backup streaming format. mbstream is the native format for mariadb-backup.
transferfmt
socat
Defines the network utility for data transfer.
sockopt
A string of socket options passed to the socat utility.
rlimit
Compression Options
These options configure on-the-fly compression to reduce network bandwidth.
Option
Description
compressor
The command-line string for compressing the data stream on the donor (e.g., "lz4 -z").
decompressor
The command-line string for decompressing the data stream on the joiner (e.g., "lz4 -d").
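For example, the lz4 commands shown in the table could be wired up as follows, assuming lz4 is installed on both the donor and the joiner:

```ini
[sst]
compressor = "lz4 -z"
decompressor = "lz4 -d"
```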
Authentication and Security (TLS)
These options manage user authentication and stream encryption.
Option
Description
wsrep-sst-auth
The authentication string in user:password format. The user requires RELOAD, PROCESS, LOCK TABLES, and REPLICATION CLIENT privileges.
tcert
Path to the TLS certificate file for securing the transfer.
tkey
Path to the TLS private key file.
tca
Path to the TLS Certificate Authority (CA) file.
Logging and Miscellaneous Options
Option
Default Value
Description
progress
Set to 1 to show transfer progress (requires pv utility).
sst-initial-timeout
300
Timeout in seconds for the initial connection.
sst-log-archive
1
Set to 1 to archive the previous SST log.
cpat
Pass-through mariadb-backup Options
This feature allows mariadb-backup specific options to be passed through the SST script.
Option
Default Value
Description
use-extra
0
Must be set to 1 to enable pass-through functionality.
Example: Using Native Encryption and Threading
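As a sketch, use-extra=1 enables the pass-through behavior described above, and threading can be configured in the [mariabackup] option group, which the mariadb-backup utility reads directly. Verify the exact option handling against your wsrep_sst_mariabackup script; the values here are illustrative:

```ini
[sst]
use-extra = 1

[mariabackup]
# read directly by the mariadb-backup utility
parallel = 4
```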
Authentication and Privileges
To use the mariadb-backup SST method, the mariadb-backup utility must be able to authenticate locally on the donor node to create a backup stream. There are two ways to manage this authentication:
Automatic User Account Management (ES 11.4+)
Starting with MariaDB Enterprise Server 11.4, the cluster can automatically manage the SST user account. This method is more secure and requires less configuration because it avoids storing plain-text passwords in configuration files.
When this feature is used:
The donor node automatically creates a temporary internal user (e.g., 'wsrep.sst.<timestamp>_<node_id>'@localhost) with a generated password when the SST process begins.
The necessary privileges (RELOAD, PROCESS, LOCK TABLES, etc.) are automatically granted to this temporary user.
Once the SST process completes, the donor node automatically drops the user.
To enable automatic user management:
Ensure that the wsrep_sst_auth system variable is not set (or is left blank) in your configuration file.
If you explicitly define wsrep_sst_auth in your configuration, the server will revert to the manual behavior and attempt to authenticate using the credentials provided in that variable.
Manual User Configuration
For versions prior to 11.4, or if you prefer to manage the user manually, you must create a user and provide the credentials to the server.
You can tell the donor node what username and password to use by setting the wsrep_sst_auth system variable. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor:
It can also be set in a server option group in an option file prior to starting up a node:
Some authentication plugins do not require a password. For example, the unix_socket and gssapi authentication plugins do not require a password. If you are using a user account that does not require a password in order to log in, then you can just leave the password component of wsrep_sst_auth empty. For example:
The user account that performs the backup for the SST needs the same privileges as mariadb-backup, which are the RELOAD, PROCESS, LOCK TABLES, BINLOG MONITOR, and REPLICA MONITOR privileges. To be safe, ensure that these privileges are set on each node in your cluster. mariadb-backup connects locally on the donor node to perform the backup, so the following user should be sufficient:
Passwordless Authentication - Unix Socket
It is possible to use the unix_socket authentication plugin for the user account that performs SSTs. This provides the benefit of not needing to configure a plain-text password in wsrep_sst_auth.
The user account would have to have the same name as the operating system user account that is running the mysqld process. On many systems, this is the user account configured as the user option, and it tends to default to mysql.
For example, if the unix_socket authentication plugin is already installed, then you could execute the following to create the user account:
To configure wsrep_sst_auth, set the following in a server option group in an option file prior to starting up a node:
Passwordless Authentication - GSSAPI
It is possible to use the gssapi authentication plugin for the user account that performs SSTs. This provides the benefit of not needing to configure a plain-text password in wsrep_sst_auth.
The following steps would need to be done beforehand:
You will need to install the package containing the gssapi authentication plugin.
You will need to install the plugin in MariaDB, so that the gssapi authentication plugin is available to use.
You will also need to complete the GSSAPI-specific setup, such as creating a keytab for the MariaDB server and configuring the service principal.
For example, you could execute the following to create the user account in MariaDB:
To configure wsrep_sst_auth, set the following in a server option group in an option file prior to starting up a node:
Choosing a Donor Node
When mariadb-backup is used to create the backup for the SST on the donor node, mariadb-backup briefly requires a system-wide lock at the end of the backup. This is done with BACKUP STAGE BLOCK_COMMIT.
If a specific node in your cluster is acting as the primary node by receiving all of the application's write traffic, then this node should not usually be used as the donor node, because the system-wide lock could interfere with the application. In this case, you can define one or more preferred donor nodes by setting the wsrep_sst_donor system variable.
For example, let's say that we have a 5-node cluster with the nodes node1, node2, node3, node4, and node5, and let's say that node1 is acting as the primary node. The preferred donor nodes for node2 could be configured by setting the following in a server option group in an option file prior to starting up a node:
The trailing comma tells the server to allow any other node as donor when the preferred donors are not available. Therefore, if node1 is the only node left in the cluster, the trailing comma allows it to be used as the donor node.
Socat Dependency
During the SST process, the donor node uses socat to stream the backup to the joiner node. Then the joiner node prepares the backup before restoring it. The socat utility must be installed on both the donor node and the joiner node in order for this to work. Otherwise, the MariaDB error log will contain an error like:
This SST method supports three different TLS methods. The specific method can be selected by setting the encrypt option in the [sst] section of the MariaDB configuration file. The options are:
TLS using OpenSSL encryption built into socat (encrypt=2)
TLS using OpenSSL encryption with Galera-compatible certificates and keys (encrypt=3)
TLS using OpenSSL encryption with standard MySQL/MariaDB SSL certificates (encrypt=4)
Note that encrypt=1 refers to a TLS encryption method that has been deprecated and removed.
TLS Using OpenSSL Encryption Built into Socat
To generate keys compatible with this encryption method, follow these directions.
First, generate the keys and certificates:
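A sketch of the key generation, assuming self-signed certificates are acceptable; the file names, key size, and validity period are examples:

```shell
# Generate a self-signed key and certificate for socat
FILENAME=sst
openssl genrsa -out $FILENAME.key 2048
openssl req -new -key $FILENAME.key -x509 -days 3653 -out $FILENAME.crt -subj "/CN=sst"
# Bundle the key and certificate into a single .pem file for socat
cat $FILENAME.key $FILENAME.crt > $FILENAME.pem
chmod 600 $FILENAME.key $FILENAME.pem
```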
On some systems, you may also have to add dhparams to the certificate:
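A sketch of appending DH parameters to the bundle; the 2048-bit size is an example:

```shell
# Generate DH parameters and append them to the socat certificate bundle
openssl dhparam -out dhparams.pem 2048
cat dhparams.pem >> sst.pem
```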
Next, copy the certificate and keys to all nodes in the cluster.
When done, configure the following on all nodes in the cluster:
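A sketch of the node configuration, assuming the bundle generated above was copied to /etc/my.cnf.d/certificates/:

```ini
[sst]
encrypt = 2
tcert = /etc/my.cnf.d/certificates/sst.pem
tca = /etc/my.cnf.d/certificates/sst.crt
```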
Make sure to replace the paths with whatever is relevant on your system. This should allow your SSTs to be encrypted.
TLS Using OpenSSL Encryption With Galera-Compatible Certificates and Keys
To generate keys compatible with this encryption method, follow these directions.
First, generate the keys and certificates:
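A sketch of the key generation; Galera-compatible keys must not be passphrase-protected, and the file names and validity period are examples:

```shell
# Generate a key and self-signed certificate without a passphrase
openssl genrsa -out server-key.pem 2048
openssl req -new -key server-key.pem -x509 -days 3653 -out server-cert.pem -subj "/CN=galera"
```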
Next, copy the certificate and keys to all nodes in the cluster.
When done, configure the following on all nodes in the cluster:
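A sketch of the node configuration, assuming the key and certificate were copied to /etc/my.cnf.d/certificates/:

```ini
[sst]
encrypt = 3
tkey = /etc/my.cnf.d/certificates/server-key.pem
tcert = /etc/my.cnf.d/certificates/server-cert.pem
```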
Make sure to replace the paths with whatever is relevant on your system. This should allow your SSTs to be encrypted.
Logs
The mariadb-backup SST method has its own logging outside of the MariaDB Server logging.
Logging to SST Logs
Logging for mariadb-backup SSTs works the following way.
By default, on the donor node, it logs to mariadb-backup.backup.log. This log file is located in the datadir.
By default, on the joiner node, it logs to mariadb-backup.prepare.log and mariadb-backup.move.log. These log files are also located in the datadir.
By default, before a new SST is started, existing mariadb-backup SST log files are compressed and moved to /tmp/sst_log_archive. This behavior can be disabled by setting sst-log-archive=0 in the [sst] section of an option file. Similarly, the archive directory can be changed by setting sst-log-archive-dir:
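For example, an illustrative sketch; the archive path is a placeholder:

```ini
[sst]
sst-log-archive = 1
sst-log-archive-dir = /var/log/mysql/sst/
```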
The SST logs can be redirected to the syslog instead by setting the following in the [sst] section of an option file:
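A sketch, assuming the sst-syslog option of the SST scripts:

```ini
[sst]
sst-syslog = 1
```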
You can also redirect the SST logs to the syslog by setting the following in the [mysqld_safe] group of an option file:
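A sketch of the mysqld_safe variant:

```ini
[mysqld_safe]
syslog
```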
Performing SSTs With IPv6 Addresses
If you are performing mariadb-backup SSTs with IPv6 addresses, then the socat utility needs to be passed the pf=ip6 option. This can be done by setting the sockopt option in the [sst] section of an option file:
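A sketch; the leading comma appends pf=ip6 to socat's existing socket options:

```ini
[sst]
sockopt = ",pf=ip6"
```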
CREATE USER 'mariadbbackup'@'localhost' IDENTIFIED BY 'mypassword';
GRANT RELOAD, PROCESS, LOCK TABLES,
BINLOG MONITOR ON *.* TO 'mariadbbackup'@'localhost';
CREATE USER 'mysql'@'localhost' IDENTIFIED VIA unix_socket;
GRANT RELOAD, PROCESS, LOCK TABLES,
REPLICATION CLIENT ON *.* TO 'mysql'@'localhost';
[mariadb]
...
wsrep_sst_auth = mysql:
CREATE USER 'mariadbbackup'@'localhost' IDENTIFIED VIA gssapi;
GRANT RELOAD, PROCESS, LOCK TABLES,
BINLOG MONITOR ON *.* TO 'mariadbbackup'@'localhost';
[mariadb]
...
wsrep_sst_auth = mariadbbackup:
[mariadb]
...
wsrep_sst_donor=node3,node4,node5,
WSREP_SST: [ERROR] socat not found in path: /usr/sbin:/sbin:/usr//bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin (20180122 14:55:32.993)
The following options can be set as part of the Galera wsrep_provider_options variable. Dynamic options can be changed while the server is running.
Options need to be provided as a semicolon (;) separated list on a single line. Options that are not explicitly set are set to their default value.
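For example, a sketch of the semicolon-separated format; the option values here are purely illustrative:

```ini
[mariadb]
...
wsrep_provider_options = "gcache.size=2G;gcs.fc_limit=128;evs.send_window=512"
```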
Note that before Galera 3, the repl tag was named replicator.
base_dir
Description: Specifies the data directory
base_host
Description: For internal use. Should not be manually set.
Default: 127.0.0.1 (detected network address)
base_port
Description: For internal use. Should not be manually set.
Default: 4567
cert.log_conflicts
Description: Certification failure log details.
Dynamic: Yes
Default: no
cert.optimistic_pa
Description: Controls parallel application of actions on the replica. If set, the full range of parallelization as determined by the certification algorithm is permitted. If not set, the parallel applying window will not exceed that seen on the primary, and applying will start no sooner than after all actions it has seen on the master are committed.
Dynamic: Yes
Default: yes
debug
Description: Enable debugging.
Dynamic: Yes
Default: no
evs.auto_evict
Description: Number of entries the node permits for a given delayed node before triggering the Auto Eviction protocol. An entry is added to a delayed list for each delayed response from a node. If set to 0, the default, the Auto Eviction protocol is disabled for this node.
Dynamic: No
Default: 0
evs.causal_keepalive_period
Description: Used by the developers only, and not manually serviceable.
Dynamic: No
Default: The value of evs.keepalive_period.
evs.debug_log_mask
Description: Controls EVS debug logging. Only effective when is on.
Dynamic: Yes
Default: 0x1
evs.delay_margin
Description: Time that response times can be delayed before this node adds an entry to the delayed list. Must be set to a higher value than the round-trip delay time between nodes.
Dynamic: No
Default: PT1S
evs.delayed_keep_period
Description: Time that this node requires a previously delayed node to remain responsive before it is removed from the delayed list.
Dynamic: No
Default: PT30S
evs.evict
Description: When set to the gcomm UUID of a node, that node is evicted from the cluster. When set to an empty string, the eviction list is cleared on the node where it is set.
Dynamic: No
Default: Empty string
evs.inactive_check_period
Description: Frequency of checks for peer inactivity (looking for nodes with delayed responses), after which nodes may be added to the delayed list, and later evicted.
Dynamic: No
Default: PT0.5S
evs.inactive_timeout
Description: Time limit that a node can be inactive before being pronounced as dead.
Dynamic: No
Default: PT15S
evs.info_log_mask
Description: Controls extra EVS info logging. Bits:
0x1 – extra view change information
0x2 – extra state change information
evs.install_timeout
Description: Timeout on waits for install message acknowledgments. Replaces evs.consensus_timeout.
Dynamic: Yes
Default: PT7.5S
evs.join_retrans_period
Description: Time period for how often retransmission of EVS join messages when forming cluster membership should occur.
Dynamic: Yes
Default: PT1S
evs.keepalive_period
Description: How often keepalive signals should be transmitted when there's no other traffic.
Dynamic: Yes
Default: PT1S
evs.max_install_timeouts
Description: Number of membership install rounds to attempt before timing out. The total rounds will be this value plus two.
Dynamic: No
Default: 3
evs.send_window
Description: Maximum number of packets that can be replicated at a time. Must be more than evs.user_send_window, which applies to data packets only (double evs.user_send_window is recommended). In WAN environments it can be set much higher than the default, for example 512.
Dynamic: Yes
Default: 4
evs.stats_report_period
Description: Reporting period for EVS statistics.
Dynamic: No
Default: PT1M
evs.suspect_timeout
Description: A node will be suspected to be dead after this period of inactivity. If all nodes agree, the node is dropped from the cluster before evs.inactive_timeout is reached.
Dynamic: No
Default: PT5S
evs.use_aggregate
Description: If set to true (the default), small packets will be aggregated into one where possible.
Dynamic: No
Default: true
evs.user_send_window
Description: Maximum number of data packets that can be replicated at a time. Must be smaller than evs.send_window (half is recommended). In WAN environments it can be set much higher than the default, for example 512.
Dynamic: Yes
Default: 2
evs.version
Description: EVS protocol version. Defaults to 0 for backward compatibility. Certain EVS features (e.g. auto eviction) require more recent versions.
Dynamic: No
Default: 0
evs.view_forget_timeout
Description: Time after which past views will be dropped from the view history.
Dynamic: No
Default: P1D
gcache.dir
Description: Directory where GCache files are placed.
Dynamic: No
Default: The working directory
gcache.keep_pages_size
Description: Total size of the page storage pages for caching. One page is always present if only page storage is enabled.
Dynamic: No
Default: 0
gcache.mem_size
Description: Maximum size of the malloc() store for setups that have spare RAM.
Dynamic: No
Default: 0
gcache.name
Description: GCache ring buffer storage file name. By default it is placed in the working directory; moving it to another location or partition can reduce disk IO.
Dynamic: No
Default: ./galera.cache
gcache.page_size
Description: Size of the page storage page files. These are prefixed by gcache.page. Can be set as large as the disk can handle.
Dynamic: No
Default: 128M
gcache.recover
Description: Whether or not gcache recovery takes place when the node starts up. If it is possible to recover gcache, the node can then provide IST to other joining nodes, which assists when the whole cluster is restarted.
Dynamic: No
Default: no
gcache.size
Description: Gcache ring buffer storage size (the space the node uses for caching write sets), preallocated on startup.
Dynamic: No
Default: 128M
gcomm.thread_prio
The fifo and rr real-time scheduling policies require mariadb service permissions at the OS level.
Description: Gcomm thread policy and priority (in the format policy:priority). Priority is an integer, while policy can be one of:
fifo: First-in, first-out scheduling. Always preempt other, batch or idle threads and can only be preempted by other fifo threads of a higher priority or blocked by an I/O request.
gcs.fc_debug
Description: If set to a value greater than zero (the default is 0), debug statistics about SST flow control will be posted after each specified number of writesets.
Dynamic: No
Default: 0
gcs.fc_factor
Description: Fraction of gcs.fc_limit below which the recv queue must drop before replication resumes.
Dynamic: Yes
Default: 1.0
gcs.fc_limit
Description: If the recv queue exceeds this many writesets, replication is paused. Can be increased greatly in master-slave setups. Replication will resume again according to the gcs.fc_factor setting.
Dynamic: Yes
Default: 16
gcs.fc_master_slave
Description: Whether to assume that the cluster only contains one master. Deprecated since Galera 4.10; see gcs.fc_single_primary.
Dynamic: No
Default: no
gcs.fc_single_primary
Description: Defines whether there is more than one source of replication.
As the number of nodes in the cluster grows, the larger the calculated gcs.fc_limit gets. At the same time, the number of writes from the nodes increases.
When this parameter value is set to NO (multi-primary), the gcs.fc_limit parameter is dynamically modified to give more margin for each node to be a bit further behind applying writes.
The gcs.fc_limit parameter is modified by the square root of the cluster size, that is, in a four-node cluster it is two times higher than the base value. This is done to compensate for the increasing replication rate noise.
Dynamic: No
Default: no
gcs.max_packet_size
Description: Maximum packet size, after which writesets become fragmented.
Dynamic: No
Default: 64500
gcs.max_throttle
Description: How much we can throttle replication rate during state transfer (to avoid running out of memory). Set it to 0.0 if stopping replication is acceptable for the sake of completing state transfer.
Dynamic: No
Default: 0.25
gcs.recv_q_hard_limit
Description: Maximum size of the recv queue. If exceeded, the server aborts. Half of available RAM plus swap is a recommended size.
Dynamic: No
Default: LLONG_MAX
gcs.recv_q_soft_limit
Description: Fraction of gcs.recv_q_hard_limit after which the replication rate is throttled. The rate of throttling increases linearly from zero (the regular, varying rate of replication) at and below gcs.recv_q_soft_limit to one (full throttling) at gcs.recv_q_hard_limit.
Dynamic: No
Default: 0.25
gcs.sync_donor
Description: Whether or not the rest of the cluster should stay in sync with the donor. If set to yes (the default is no), the whole cluster is blocked while the donor is blocked by state transfer.
Dynamic: No
Default: no
gmcast.listen_addr
Description: Address on which Galera listens for connections from other nodes. Can be used to override the default listening port, which is otherwise obtained from the connection address.
Specifying a hostname isn't supported. Use an IP address instead.
Note that supports TCP, SSL, and hostnames.
gmcast.mcast_addr
Description: Not set by default, but if set, UDP multicast will be used for replication. Must be identical on all nodes. For example: gmcast.mcast_addr=239.192.0.11
Dynamic: No
Default: None
gmcast.mcast_ttl
Description: Multicast packet TTL (time to live) value.
Dynamic: No
Default: 1
gmcast.peer_timeout
Description: Connection timeout for initiating message relaying.
Dynamic: No
Default: PT3S
gmcast.segment
Description: Defines the segment to which the node belongs. By default, all nodes are placed in the same segment (0). Usually, you would place all nodes in the same datacenter in the same segment. Galera protocol traffic is only redirected to one node in each segment, and then relayed to other nodes in that same segment, which saves cross-datacenter network traffic at the expense of some extra latency. State transfers are also, preferably but not exclusively, taken from the same segment. If there are no nodes available in the same segment, state transfer will be taken from a node in another segment.
Dynamic: No
Default:
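As a sketch of the segment layout described above (the datacenter names and node roles here are hypothetical), a node in a second datacenter could be configured like this in my.cnf:

```ini
# Nodes in DC1 keep the default segment 0; nodes in DC2 use segment 1,
# so cross-datacenter traffic is relayed through one node per segment.
[mariadb]
wsrep_provider_options="gmcast.segment=1"
```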
gmcast.time_wait
Description: Waiting time before allowing a peer that was declared outside of the stable view to reconnect.
Dynamic: No
Default: PT5S
gmcast.version
Description: Deprecated option. Gmcast version.
Dynamic: No
Default: 0
ist.recv_addr
Description: Address for listening for Incremental State Transfer.
Dynamic: No
Default: <port+1> from
ist.recv_bind
Description: Address to bind to for receiving Incremental State Transfers.
Dynamic: No
Default: Empty string
pc.announce_timeout
Description: Period of time for which cluster joining announcements are sent every 1/2 second.
Dynamic: No
Default: PT3S
pc.checksum
Description: Whether to checksum replicated messages at the PC level; intended for debugging. Safe to turn off. The default is false (true in earlier releases).
Dynamic: No
Default: false
pc.ignore_quorum
Description: Whether to ignore quorum calculations, for example when a master splits from several slaves, it will remain in operation if set to true (false is default). Use with care however, as in master-slave setups, slaves will not automatically reconnect to the master if set.
Dynamic: Yes
Default: false
pc.ignore_sb
Description: Whether to permit updates to be processed even in the case of split brain (when a node is disconnected from its remaining peers). Safe in master-slave setups, but could lead to data inconsistency in a multi-master setup.
Dynamic: Yes
Default: false
pc.linger
Description: Time that the PC protocol waits for EVS termination.
Dynamic: No
Default: PT20S
pc.npvo
Description: If set to true (false is the default), when there are primary component conflicts, the most recent component overrides the older ones.
Dynamic: No
Default: false
pc.recovery
Description: If set to true (the default), the Primary Component state is stored on disk so that, in the case of a full cluster crash (e.g. a power outage), automatic recovery is possible. Subsequent graceful full cluster restarts will require explicit bootstrapping for a new Primary Component.
Dynamic: No
Default: true
pc.version
Description: Deprecated option. PC protocol version.
Dynamic: No
Default: 0
pc.wait_prim
Description: When set to true (the default), the node will wait for a primary component for the period of time specified by pc.wait_prim_timeout. Used to bring up non-primary components and make them primary using pc.bootstrap.
Dynamic: No
Default: true
pc.wait_prim_timeout
Description: Time to wait for a primary component. See pc.wait_prim.
Dynamic: No
Default: PT30S
pc.weight
Description: Node weight, used for quorum calculation. See the Codership article .
Dynamic: Yes
Default: 1
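The weighted quorum rule can be sketched as follows. This is a deliberately simplified model (real Galera also accounts for nodes that left gracefully and for the exact membership history), intended only to illustrate how pc.weight enters the calculation.

```python
def has_quorum(surviving_weights, previous_total_weight):
    """Simplified sketch: the surviving partition stays primary only if
    its combined weight is strictly more than half of the previous
    membership's total weight."""
    return sum(surviving_weights) * 2 > previous_total_weight

# Three nodes with weight 1 each: two survivors keep quorum.
print(has_quorum([1, 1], 3))  # True
# A single survivor of three equal nodes does not.
print(has_quorum([1], 3))     # False
```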
protonet.backend
Description: Deprecated option. Transport backend to use. Only ASIO is supported currently.
Dynamic: No
Default: asio
protonet.version
Description: Deprecated option. Protonet version.
Dynamic: No
Default: 0
repl.causal_read_timeout
Description: Timeout period for causal reads.
Dynamic: Yes
Default: PT30S
repl.commit_order
Description: Whether or not out-of-order committing is permitted, and under what conditions. By default it is not permitted, but setting this can improve parallel performance.
0 BYPASS: No commit order monitoring is done (useful for measuring the performance penalty).
1 OOOC: Out-of-order committing is permitted for all transactions.
repl.key_format
Description: Format for key replication. Can be one of:
FLAT8 - shorter key with a higher probability of false positives when matching
FLAT16 - longer key with a lower probability of false positives when matching
repl.max_ws_size
Description:
Dynamic:
Default: 2147483647
repl.proto_max
Description:
Dynamic:
Default: 9
socket.checksum
Description: Method used for generating checksum. Note: If Galera 25.2.x and 25.3.x are both being used in the cluster, MariaDB with Galera 25.3.x must be started with wsrep_provider_options='socket.checksum=1' in order to make it backward compatible with Galera v2. Galera wsrep providers other than 25.3.x or 25.2.x are not supported.
Dynamic: No
Default: 2
socket.dynamic
Description: Allow both encrypted and unencrypted connections between nodes. Typically this should be set to false (the default). When set to true, encrypted connections are still preferred, but the node falls back to unencrypted connections when encryption is not possible, e.g. when it is not yet enabled on all nodes. It needs to be true on all nodes when enabling or disabling encryption via a rolling restart. As this setting can't be changed at runtime, a rolling restart to enable or disable encryption may need three restarts per node in total: one to enable socket.dynamic on each node, one to change the actual encryption settings on each node, and a final round to set socket.dynamic back to false.
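The first round of that rolling procedure can be sketched in my.cnf as follows (a sketch of the documented workflow, not a complete encryption setup):

```ini
# Round 1: enable dynamic sockets on every node, restarting each node in turn.
[mariadb]
wsrep_provider_options="socket.dynamic=true"

# Round 2 would change the actual TLS settings (socket.ssl and friends)
# node by node, and round 3 sets socket.dynamic=false again.
```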
socket.recv_buf_size
Description: Size in bytes of the receive buffer used on the network sockets between nodes, passed on to the kernel via the SO_RCVBUF socket option.
Dynamic: No
Default:
socket.send_buf_size
Description: Size in bytes of the send buffer used on the network sockets between nodes, passed on to the kernel via the SO_SNDBUF socket option.
Dynamic: No
Default: Auto
socket.ssl
Description: Explicitly enables TLS usage by the wsrep Provider.
Dynamic: No
Default: NO
socket.ssl_ca
Description: Path to Certificate Authority (CA) file. Implicitly enables the option.
Dynamic: No
socket.ssl_cert
Description: Path to TLS certificate. Implicitly enables the option.
Dynamic: No
socket.ssl_cipher
Description: TLS cipher to use. Implicitly enables the option. Since defaults to the value of the system variable.
Dynamic: No
Default: system default, before defaults to AES128-SHA.
socket.ssl_compression
Description: Compression to use on TLS connections. Implicitly enables the option.
Dynamic: No
socket.ssl_key
Description: Path to TLS key file. Implicitly enables the option.
Dynamic: No
socket.ssl_password_file
Description: Path to password file to use in TLS connections. Implicitly enables the option.
Dynamic: No
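The socket.ssl* options above are typically set together. A my.cnf sketch, with hypothetical certificate paths that you would replace with your own:

```ini
# Hypothetical certificate paths; adjust for your deployment.
[mariadb]
wsrep_provider_options="socket.ssl=yes;socket.ssl_ca=/etc/my.cnf.d/certs/ca-cert.pem;socket.ssl_cert=/etc/my.cnf.d/certs/server-cert.pem;socket.ssl_key=/etc/my.cnf.d/certs/server-key.pem"
```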
See Also
This page is licensed: CC BY-SA / Gnu FDL
0x4 – statistics
0x8 – profiling (only available in builds with profiling enabled)
Dynamic: No
Default: 0
Introduced: , ,
rr: Round-robin scheduling. Always preempt other, batch or idle threads. Runs for a fixed period of time after which the thread is stopped and moved to the end of the list, being replaced by another round-robin thread with the same priority. Otherwise runs until preempted by other rr threads of a higher priority or blocked by an I/O request.
other: Default scheduling on Linux. Threads run until preempted by a thread of a higher priority or a superior scheduling designation, or blocked by an I/O request.
Permissions: Using the fifo or rr real-time scheduling policies requires granting the mariadb service the necessary permissions at the OS level. On systemd-based distributions, this is done by adjusting the resource limits for the service.
The recommended method is to create a systemd override file:
Open the MariaDB service unit for editing:
Add the following content to the file. This grants the service the ability to set real-time priorities:
Save the file and exit the editor.
Reload the systemd daemon and restart the MariaDB service to apply the changes:
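The steps above can be sketched as follows; the specific limit value is an assumption and may need adjusting to your policy. Open the unit with `sudo systemctl edit mariadb.service`, add the override below, save, then run `sudo systemctl daemon-reload` followed by `sudo systemctl restart mariadb`:

```ini
# systemd override sketch: allow the mariadb service to request
# real-time scheduling priorities (needed for the fifo/rr policies).
[Service]
LimitRTPRIO=99
```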
Dynamic: No
Default: Empty string
You can specify the setting using either TCP or SSL, like this:
gmcast.listen_addr=tcp://192.168.8.111:4567
gmcast.listen_addr=ssl://192.168.8.111:4567
If your system supports IPv6, you can also specify it like this:
gmcast.listen_addr=tcp://[::]:4567
Here, 4567 is the Galera port.
Dynamic: No
Default: tcp://0.0.0.0:4567
0
Range: 0 to 255
Introduced: , ,
2 LOCAL_OOOC: Out-of-order committing is permitted for local transactions only.
3 NO_OOOC: Out-of-order committing is not permitted at all.
Dynamic: No
Default: 3
FLAT8A - shorter key with a higher probability of false positives when matching, includes annotations for debug purposes
FLAT16A - longer key with a lower probability of false positives when matching, includes annotations for debug purposes
Complete Galera Cluster System Variables reference for MariaDB. Complete guide for configuration values, scope settings, and performance impact.
This page documents system variables related to Galera Cluster. For options that are not system variables, see . See for a complete list of system variables and instructions on setting them. Also see the .
wsrep_allowlist
Description:
Allowed IP addresses, comma-delimited.
Note that when setting gmcast.listen_addr=tcp://[::]:4567 on a dual-stack system (for instance, Linux with net.ipv6.bindv6only = 0), IPv4 addresses need to be allow-listed using their IPv4-mapped IPv6 form (e.g. ::ffff:1.2.3.4).
wsrep_applier_retry_count
Description: Maximum number of applier retry attempts. Previously, replication applying always stopped at the first non-ignored failure during event applying, and the node emergency-aborted (or started inconsistency voting). Some failures, however, can be concurrency related, and applying may succeed if the operation is retried at a later time. This variable controls the retry-applying feature. It is set to 0 by default, which means no retrying.
Command line: --wsrep-applier-retry-count=value
wsrep_auto_increment_control
Description: If set to 1 (the default), automatically adjusts the and variables according to the size of the cluster, and readjusts them when the cluster size changes. This avoids replication conflicts due to . In a primary-replica environment, can be set to OFF.
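The adjustment described above can be sketched as follows. The node-index assignment is internal to Galera, so this is only an illustration of why the scheme avoids AUTO_INCREMENT conflicts, not the server's actual code.

```python
def auto_increment_settings(cluster_size: int, node_index: int):
    """Sketch: each node gets a distinct offset and a common increment
    equal to the cluster size, so no two nodes generate the same
    AUTO_INCREMENT value."""
    return {"auto_increment_increment": cluster_size,
            "auto_increment_offset": node_index}

# Node 2 of a 3-node cluster would generate 2, 5, 8, ...
s = auto_increment_settings(3, 2)
print(s["auto_increment_offset"], s["auto_increment_increment"])
```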
wsrep_causal_reads
Description: If set to ON (OFF is default), enforces read causality across the cluster. In the case that a primary applies an event more quickly than a replica, the two could briefly be out of sync. With this variable set to ON, the replica will wait for the event to be applied before processing further queries. Setting it to ON also results in larger read latencies. Deprecated by wsrep_sync_wait.
Command line: --wsrep-causal-reads[={0|1}]
wsrep_certificate_expiration_hours_warning
This variable is documented in detail here:
wsrep_certification_rules
Description: Certification rules to use in the cluster. Possible values are:
strict: Stricter rules that could result in more certification failures. For example with foreign keys, certification failure could result if different nodes receive non-conflicting insertions at about the same time that point to the same row in a parent table.
optimized
wsrep_certify_nonPK
Description: When set to ON (the default), Galera still certifies transactions for tables with no primary key. However, this can still cause undefined behavior in some circumstances. It is recommended to define primary keys for every InnoDB table when using Galera.
Command line: --wsrep-certify-nonPK[={0|1}]
Scope: Global
wsrep_cluster_address
Description: The addresses of cluster nodes to connect to when starting up.
Good practice is to specify all possible cluster nodes, in the form gcomm://<node1 or ip:port>,<node2 or ip2:port>,<node3 or ip3:port>.
Specifying an empty ip (gcomm://) will cause the node to start a new cluster (which should not be done in the my.cnf file, as after each restart the server will not rejoin the current cluster).
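A my.cnf sketch of the good practice described above, using hypothetical addresses for a three-node cluster:

```ini
# Hypothetical three-node cluster; substitute your nodes' addresses.
[mariadb]
wsrep_cluster_address="gcomm://192.168.1.1,192.168.1.2,192.168.1.3"
```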
wsrep_cluster_name
Description: The name of the cluster. Nodes cannot connect to clusters with a different name, so it needs to be identical on all nodes in the same cluster. The variable can be set dynamically, but note that doing so may be unsafe and cause an outage, as the wsrep provider is unloaded and reloaded.
Command line: --wsrep-cluster-name=value
Scope: Global
wsrep_convert_LOCK_to_trx
Description: Converts LOCK TABLES / UNLOCK TABLES statements to BEGIN and COMMIT. Used mainly for getting older applications to work with a multi-primary setup. Use carefully, as it can result in extremely large writesets.
Command line: --wsrep-convert-LOCK-to-trx[={0|1}]
Scope: Global
wsrep_data_home_dir
Description: Directory where wsrep provider will store its internal files.
Command line: --wsrep-data-home-dir=value
Scope: Global
wsrep_dbug_option
Description: Unused. The mechanism to pass the DBUG options to the wsrep provider hasn't been implemented.
Command line: --wsrep-dbug-option=value
Scope: Global
wsrep_debug
Description: WSREP debug level logging.
Before MariaDB 10.6.1, DDL was only logged on the originating node. From MariaDB 10.6.1, it is logged on other nodes as well.
Data type is . Valid values are:
wsrep_desync
Description: When a node receives more write-sets than it can apply, the transactions are placed in a received queue. If the node's received queue has too many write-sets waiting to be applied (as defined by the gcs.fc_limit WSREP provider option), then the node would usually engage Flow Control. However, when this option is set to ON, Flow Control will be disabled for the desynced node. The desynced node works through the received queue until it reaches a more manageable size. The desynced node continues to receive write-sets from the other nodes in the cluster. The other nodes in the cluster do not wait for the desynced node to catch up, so the desynced node can fall even further behind the other nodes in the cluster. You can check if a node is desynced by checking if the status variable is equal to Donor/Desynced.
Command line: --wsrep-desync[={0|1}]
wsrep_dirty_reads
Description: By default, when not synchronized with the group (wsrep_ready=OFF), a node rejects all queries other than SET and SHOW. If wsrep_dirty_reads is set to 1, queries which do not change data, such as SELECT queries (dirty reads), creating prepared statements, and so forth, are accepted by the node.
wsrep_drupal_282555_workaround
Description: If set to ON, a workaround for Drupal bug #282555 is enabled. This is a bug where, in some cases, when inserting a DEFAULT value into an AUTO_INCREMENT column, a duplicate key error may be returned.
wsrep_forced_binlog_format
Description: A binlog format that overrides any session binlog format settings.
Command line: --wsrep-forced-binlog-format=value
Scope: Global
wsrep_gtid_domain_id
Description: This system variable defines the domain ID that is used for .
When wsrep_gtid_mode is set to ON, wsrep_gtid_domain_id is used in place of gtid_domain_id for all Galera Cluster write sets.
wsrep_gtid_mode
Description: Wsrep GTID mode attempts to keep GTIDs consistent for Galera Cluster write sets on all cluster nodes. GTID state is initially copied to a joiner node during a state snapshot transfer (SST). If you are planning to use Galera Cluster together with MariaDB replication, then wsrep GTID mode can be helpful.
When wsrep_gtid_mode is set to ON, wsrep_gtid_domain_id is used in place of gtid_domain_id for all Galera Cluster write sets.
wsrep_gtid_seq_no
Description: Internal server usage, manually set WSREP GTID seqno.
Command line: None
Scope: Session only
wsrep_ignore_apply_errors
Description: Bitmask determining whether errors are ignored, or reported back to the provider.
0: No errors are skipped.
1: Ignore some DDL errors (DROP DATABASE, DROP TABLE, DROP INDEX
wsrep_load_data_splitting
Description: If set to ON, LOAD DATA supports big data files by introducing transaction splitting. The setting has been deprecated in Galera 4 and defaults to OFF.
Command line: --wsrep-load-data-splitting[={0|1}]
wsrep_log_conflicts
Description: If set to ON (OFF is default), details of conflicting MDL as well as InnoDB locks in the cluster will be logged.
Command line: --wsrep-log-conflicts[={0|1}]
Scope: Global
wsrep_max_ws_rows
Description: Maximum permitted number of rows per write set. Support for this variable was retained in order to be backward compatible; the default value has been changed to 0, which essentially allows write sets to be any size.
Command line: --wsrep-max-ws-rows=#
Scope: Global
wsrep_max_ws_size
Description: Maximum permitted size in bytes per write set. Write sets exceeding 2GB are rejected.
Command line: --wsrep-max-ws-size=#
Scope: Global
wsrep_mode
Description: Turns on WSREP features which are not part of the default behavior.
BINLOG_ROW_FORMAT_ONLY: Only ROW is supported.
wsrep_mysql_replication_bundle
Description: Determines the number of replication events that are grouped together. Experimental implementation aimed to assist with bottlenecks when a single replica faces a large commit time delay. If set to 0 (the default), there is no grouping.
Command line: --wsrep-mysql-replication-bundle=#
Scope: Global
wsrep_node_address
Description: Specifies the node's network address, in the format ip address[:port]. It supports IPv6. The default behavior is for the node to pull the address of the first network interface on the system and the default Galera port. This automatic guessing can be unreliable, particularly in the following cases:
Cloud deployments
Container deployments
wsrep_node_incoming_address
Description: This is the address from which the node listens for client connections. If an address is not specified or it's set to AUTO (default), mysqld uses either or , or tries to get one from the list of available network interfaces, in the same order. See also .
Command line: --wsrep-node-incoming-address=value
wsrep_node_name
Description: Name of this node. This name can be used in as a preferred donor. Note that multiple nodes in a cluster can have the same name.
Command line: --wsrep-node-name=value
Scope: Global
wsrep_notify_cmd
Description: Command to be executed each time the node state or the cluster membership changes. Can be used for raising an alarm, configuring load balancers and so on. See the for more details.
Command line: --wsrep-notify-command=value
Scope: Global
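A minimal notification-script sketch is shown below. The --status/--primary flags follow the wsrep notification interface; treat the exact argument set as an assumption and verify it against your Galera version. A real deployment would append the formatted line to a log file (or drive a load balancer) and install the file as an executable script referenced by wsrep_notify_cmd.

```python
import argparse
import datetime

def format_event(argv):
    """Build one log line from the flags the wsrep provider passes in."""
    p = argparse.ArgumentParser()
    for flag in ("--status", "--uuid", "--primary", "--members", "--index"):
        p.add_argument(flag)
    args, _ = p.parse_known_args(argv)
    stamp = datetime.datetime.now().isoformat()
    return f"{stamp} status={args.status} primary={args.primary}"

# Example invocation the provider might make on reaching Synced state:
print(format_event(["--status", "Synced", "--primary", "yes"]))
```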
wsrep_on
Description: Whether or not wsrep replication is enabled. If the global value is set to OFF, it is not possible to load the provider and join the node in the cluster. If only the session value is set to OFF, the operations from that particular session are not replicated in the cluster, but other sessions and applier threads will continue as normal. The session value of the variable does not affect the node's membership and thus, regardless of its value, the node keeps receiving updates from other nodes in the cluster. It is set to OFF by default and must be turned on to enable Galera replication.
Command line: --wsrep-on[={0|1}]
wsrep_OSU_method
Description: Online schema upgrade method. The default is TOI; specifying the setting without a value sets it to RSU.
TOI: Total Order Isolation. In each cluster node, DDL is processed in the same order regarding other transactions, guaranteeing data consistency. However, affected parts of the database will be locked for the whole cluster.
wsrep_patch_version
Description: Wsrep patch version, for example wsrep_25.10.
Command line: None
Scope: Global
wsrep_provider
Description: Location of the wsrep library, usually /usr/lib/libgalera_smm.so on Debian and Ubuntu, and /usr/lib64/libgalera_smm.so on Red Hat/CentOS.
Command line: --wsrep-provider=value
Scope: Global
wsrep_provider_options
Description: Semicolon (;) separated list of wsrep options (see ).
Command line: --wsrep-provider-options=value
Scope: Global
More details can be found on this page:
wsrep_recover
Description: If set to ON when the server starts, the server will recover the sequence number of the most recent write set applied by Galera, and it will be output to stderr, which is usually redirected to the . At that point, the server will exit. This sequence number can be provided to the system variable.
Command line: --wsrep-recover[={0|1}]
wsrep_reject_queries
Description: Variable to set to reject queries from client connections, useful for maintenance. The node continues to apply write-sets, but an Error 1047: Unknown command error is generated by a client query.
NONE - Not set. Queries are processed as normal.
ALL
wsrep_replicate_myisam
Description: Whether or not DML updates for MyISAM tables will be replicated. This functionality is still experimental and should not be relied upon in production systems. Deprecated in , and removed in ; use wsrep_mode=REPLICATE_MYISAM instead.
Command line: --wsrep-replicate-myisam[={0|1}]
Scope: Global
wsrep_restart_slave
Description: If set to ON, the replica is restarted automatically when the node rejoins the cluster.
Command line: --wsrep-restart-slave[={0|1}]
Scope: Global
wsrep_retry_autocommit
Description: Number of times autocommitted queries are retried due to cluster-wide conflicts before returning an error to the client. If set to 0, no retries are attempted, while a value of 1 (the default) or more specifies the number of retries attempted. Can help applications using autocommit to avoid deadlocks.
Reasons for failures include:
Certification failure: If the transaction reached the replication state and observed the conflict by performing a certification test.
wsrep_slave_FK_checks
Description: If set to ON (the default), the applier replica thread performs foreign key constraint checks.
Command line: --wsrep-slave-FK-checks[={0|1}]
Scope: Global
wsrep_slave_threads
Description: Number of replica threads used to apply Galera write sets in parallel. The Galera replica threads are able to determine which write sets are safe to apply in parallel. However, if your cluster nodes seem to have frequent consistency problems, then setting the value to 1 will probably fix the problem. See for more information.
Command line: --wsrep-slave-threads=#
wsrep_slave_UK_checks
Description: If set to ON, the applier replica thread performs secondary index uniqueness checks.
Command line: --wsrep-slave-UK-checks[={0|1}]
Scope: Global
wsrep_sr_store
Description: Storage for streaming replication fragments.
Command line: --wsrep-sr-store=val
Scope: Global
wsrep_ssl_mode
This variable is documented in details on this page:
wsrep_sst_auth
Description: Username and password of the user to use for replication. Unused if is set to rsync, while for other methods it should be in the format <user>:<password>. The contents are masked in logs and when querying the value with . See for more information.
Command line: --wsrep-sst-auth=value
wsrep_sst_donor
Description: Comma-separated list (from 5.5.33) or name (as per ) of the servers as donors, or the source of the state transfer, in order of preference. The donor-selection algorithm, in general, prefers a donor capable of transferring only the missing transactions (IST) to the joiner node, instead of the complete state (SST). Thus, it starts by looking for an IST-capable node in the given donor list, followed by the rest of the nodes in the cluster. In case multiple candidate nodes are found outside the specified donor list, the node in the same segment () as the joiner is preferred. If none of the existing nodes in the cluster can serve the missing transactions through IST, the algorithm moves on to look for a suitable node to transfer the entire state (SST). It first looks at the nodes specified in the donor list (irrespective of their segment). If a suitable donor is still not found, the rest of the donor nodes are checked for suitability only if the donor list has a terminating comma. Note that a stateless node (the Galera arbitrator) can never be a donor. See for more information.
Although the variable is dynamic, the node does not use the new value unless the node requiring SST or IST disconnects from the cluster. To force this, set to an empty string and back to the nodes list. After setting this variable dynamically, on startup the value from the configuration file will be used again.
Command line: --wsrep-sst-donor=value
Scope: Global
Dynamic: Yes (read note above)
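A my.cnf sketch of the donor-list syntax described above, with hypothetical node names:

```ini
# Prefer node1 then node2 as donors; the terminating comma allows any
# other node in the cluster to be chosen if neither is suitable.
[mariadb]
wsrep_sst_donor="node1,node2,"
```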
wsrep_sst_donor_rejects_queries
Description: If set to ON (OFF is default), the donor node will reject incoming queries, returning an UNKNOWN COMMAND error code. Can be used for informing load balancers that a node is unavailable.
wsrep_sst_method
Description: Method used for taking the state snapshot transfer (SST). See for more information.
Command line: --wsrep-sst-method=value
Scope: Global
See this page for more information about this variable:
wsrep_sst_receive_address
Description: This is the address where other nodes (donors) in the cluster connect in order to send the state-transfer updates. If an address is not specified or it's set to AUTO (the default), mysqld uses 's value as the receiving address. However, if is not set, it uses the address from either or tries to get one from the list of available network interfaces, in the same order. Note: setting it to localhost will make it impossible for nodes running on other hosts to reach this node. See for more information.
Command line: --wsrep-sst-receive-address=value
wsrep_start_position
Description: The start position that the node should use in the format: UUID:seq_no. The proper value to use for this position can be recovered with .
Command line: --wsrep-start-position=value
Scope: Global
wsrep_status_file
Description: wsrep status output filename.
Command line: --wsrep-status-file=value
Scope: Global
wsrep_strict_ddl
Description: If set, DDL statements are rejected on affected tables that do not support Galera replication. This is done by checking whether the table is InnoDB, currently the only storage engine fully supporting Galera replication. MyISAM tables will not trigger the error if the experimental setting is ON. If set, it should be set on all nodes in the cluster. Affected DDL statements include: (e.g. CREATE TABLE t1(a int) engine=Aria)
Statements in , , and are permitted, as the affected tables are only known at execution. Furthermore, the various USER, ROLE, SERVER and DATABASE statements are also allowed, as they do not have an affected table. Deprecated in and removed in . Use wsrep_mode=STRICT_REPLICATION instead.
Command line:
wsrep_sync_wait
Description: Setting this variable causes causality checks to take place before executing an operation of the type specified by the value, ensuring that the statement is executed on a fully synced node. While the check takes place, new queries are blocked on the node to allow the server to catch up with all updates made in the cluster up to the point where the check began. Once reached, the original query is executed on the node. This can result in higher latency. Note that when is ON, values of wsrep_sync_wait become irrelevant. Sample usage (for a critical read that must have the most up-to-date data): SET SESSION wsrep_sync_wait=1; SELECT ...; SET SESSION wsrep_sync_wait=0;
0 - Disabled (default)
wsrep_trx_fragment_size
Description: Size of transaction fragments for streaming replication (measured in the units specified by wsrep_trx_fragment_unit).
Command line: --wsrep-trx-fragment-size=#
Scope: Session
wsrep_trx_fragment_unit
Description: Unit for streaming replication transaction fragments' size:
bytes: transaction’s binlog events buffer size in bytes
rows: number of rows affected by the transaction
This page is licensed: CC BY-SA / Gnu FDL
Scope: Global
Dynamic: Yes
Data Type: INT UNSIGNED
Default Value: 0
Range: 0 to 4294967295
Introduced:
Scope: Global
Dynamic: Yes
Data Type: Boolean
Default Value: ON
Scope: Session
Dynamic: Yes
Data Type: Boolean
Default Value: OFF
Removed:
optimized: Relaxed rules that allow more concurrency and cause fewer certification failures.
Command line: --wsrep-certification-rules
Scope: Global
Dynamic: Yes
Data Type: Enumeration
Default Value: strict
Valid Values: strict, optimized
Dynamic: Yes
Data Type: Boolean
Default Value: ON
The variable can be changed at runtime in some configurations, and will result in the node closing the connection to any current cluster, and connecting to the new address.
If specifying a port, note that this is the Galera port, not the MariaDB port.
Valid Values: STATEMENT, ROW, MIXED or NONE (which resets the forced binlog format state).
When is set to OFF, wsrep_gtid_domain_id is simply ignored to allow for backward compatibility.
There are some additional requirements that need to be met in order for this mode to generate consistent . For more information, see .
Command line: --wsrep-gtid-domain-id=#
Scope: Global
Dynamic: Yes
Data Type: numeric
Default Value: 0
Range: 0 to 4294967295
When wsrep_gtid_mode is set to OFF, is simply ignored to allow for backward compatibility.
There are some additional requirements that need to be met in order for this mode to generate consistent . For more information, see .
Command line: --wsrep-gtid-mode[={0|1}]
Scope: Global
Dynamic: Yes
Data Type: boolean
Default Value: OFF
Dynamic: Yes
Data Type: numeric
Range: 0 to 18446744073709551615
Introduced:
,
ALTER TABLE
).
2: Skip DML errors (Only ignores DELETE errors).
4: Ignore all DDL errors.
Command line: --wsrep-ignore-apply-errors
Scope: Global
Dynamic: Yes
Data Type: Numeric
Default Value: 7
Range: 0 to 7
Scope: Global
Dynamic: Yes
Data Type: Boolean
Default Value: OFF
Deprecated: MariaDB 10.4.2
Removed:
Dynamic: Yes
Data Type: Boolean
Default Value: OFF
Dynamic: Yes
Data Type: Numeric
Default Value: 0
Range: 0 to 1048576
Dynamic: Yes
Data Type: Numeric
Default Value: 2147483647 (2GB)
Range: 1024 to 2147483647
DISALLOW_LOCAL_GTID: Nodes can have GTIDs for local transactions in a number of scenarios. If DISALLOW_LOCAL_GTID is set, these operations produce an error ERROR HY000: Galera replication not supported. Scenarios include:
A DDL statement is executed with wsrep_OSU_method=RSU set.
A DML statement writes to a non-InnoDB table.
A DML statement writes to an InnoDB table with wsrep_on=OFF set.
REPLICATE_ARIA: Whether or not DML updates for Aria tables will be replicated. This functionality is experimental and should not be relied upon in production systems.
REPLICATE_MYISAM: Whether or not DML updates for MyISAM tables will be replicated. This functionality is experimental and should not be relied upon in production systems.
REQUIRED_PRIMARY_KEY: Tables must have a PRIMARY KEY defined.
STRICT_REPLICATION: Same as the old wsrep_strict_ddl setting.
APPLIER_SKIP_FK_CHECKS_IN_IST: When this operation mode is set, and the node is processing IST or catch-up, appliers skip FK checking. See .
This flag is available from MariaDB 12.0.
Command line: --wsrep-mode=value
Scope: Global
Dynamic: Yes
Data Type: Enumeration
Default Value: (Empty)
Valid Values: APPLIER_SKIP_FK_CHECKS_IN_IST, BINLOG_ROW_FORMAT_ONLY, DISALLOW_LOCAL_GTID, REQUIRED_PRIMARY_KEY, REPLICATE_ARIA, REPLICATE_MYISAM and STRICT_REPLICATION
Introduced:
Dynamic: No
Data Type: Numeric
Default Value: 0
Range: 0 to 1000
Servers with multiple network interfaces
Servers running multiple nodes
Network address translation (NAT)
Clusters with nodes in more than one region
See also
Command line: --wsrep-node-address=value
Scope: Global
Dynamic: No
Data Type: String
Default Value: Primary network address, usually eth0 with a default port of 4567, or 0.0.0.0 if no IP address.
Scope: Global
Dynamic: No
Data Type: String
Default Value: AUTO
Dynamic: Yes
Data Type: String
Default Value: The server's hostname.
Dynamic: No
Data Type: String
Default Value: Empty
Scope: Global, Session
Dynamic: Yes
Data Type: Boolean
Default Value: OFF
Valid Values: ON, OFF
RSU: Rolling Schema Upgrade. DDL processing is done only locally on the node, and the user needs to perform the changes manually on each node. The node is desynced from the rest of the cluster while the processing takes place, to avoid blocking other nodes. Schema changes must remain compatible with the old schema to avoid breaking replication; when DDL processing is complete on the single node, replication recommences.
Command line: --wsrep-OSU-method[=value]
Scope: Global, Session
Dynamic: Yes
Data Type: Enum
Default Value: TOI
Valid Values: TOI, RSU
Dynamic: No
Data Type: String
Default Value: None
Data Type: String
Default Value: None
Dynamic: No
Data Type: String
Default Value: Empty
Scope: Global
Dynamic: No
Data Type: Boolean
Default Value: OFF
ALL - All queries from client connections will be rejected, but existing client connections are maintained.
ALL_KILL - All queries from client connections will be rejected, and existing client connections, including the current one, are immediately killed.
Command line: --wsrep-reject-queries[=value]
Scope: Global
Dynamic: Yes
Data Type: Enum
Default Value: NONE
Valid Values: NONE, ALL, ALL_KILL
Dynamic: Yes
Default Value: OFF
Data Type: Boolean
Valid Values: ON, OFF
Deprecated:
Removed:
Dynamic: Yes
Default Value: OFF
Data Type: Boolean
High-priority abort: If the execution of the transaction was interrupted by the replication applier before entering the replication state.