MariaDB ensures high availability through two complementary technologies: Replication, which provides asynchronous and semi-synchronous data copying, and Galera Cluster, which provides synchronous multi-master replication with automatic failover and no data loss.
From a database client, you can check the status of write-set replication throughout the cluster using standard queries. Status variables that relate to write-set replication have the prefix wsrep_, meaning that you can display them all using the following query:
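```sql
SHOW GLOBAL STATUS LIKE 'wsrep_%';
```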
The most fundamental aspect of a healthy cluster is Quorum. Quorum is a mechanism that ensures data consistency by requiring a majority of nodes to be online and in communication to form a Primary Component. Only the Primary Component will process transactions. This prevents "split-brain" scenarios where a network partition could otherwise lead to data conflicts.
You can check the cluster's integrity and Quorum status using these key variables. For a healthy cluster, the values for these variables must be identical on every node.
You can monitor the status of individual nodes to ensure they are in working order and able to receive write-sets.
The value of wsrep_local_state_comment tells you exactly what a node is doing. The most common states include:
The following performance-related status variables can help identify performance issues and bottlenecks.
Many status variables are differential and reset after each FLUSH STATUS command.
If the entire cluster shuts down or crashes, you must manually re-establish a Primary Component by bootstrapping from the most advanced node.
This page is licensed: CC BY-SA / Gnu FDL
Examine the grastate.dat file located in the MariaDB data directory (e.g., /var/lib/mysql/).
Look for the seqno: value in this file. The node with the highest seqno is the most advanced node. If a node was shut down gracefully, its seqno may be -1; these nodes should not be used to bootstrap if a node with a positive seqno is available.
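Once you have identified the most advanced node, bootstrap it as the new Primary Component (on systemd-based installations):

```bash
galera_new_cluster
```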
After the first node is successfully running as a new Primary Component, start the MariaDB service normally on all of the other nodes.
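```bash
systemctl start mariadb
```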
They will detect the existing Primary Component, connect to it, and automatically initiate a State Transfer (IST or SST) to synchronize their data and rejoin the cluster.
Cluster integrity (Quorum) variables:

| Variable | Description | Healthy value |
|---|---|---|
| wsrep_cluster_status | Status of the component the node belongs to. Any value other than Primary indicates issues. | Primary |
| wsrep_cluster_size | Number of nodes in the current component. This should match the expected total nodes in the cluster. | Matches total expected nodes |
| wsrep_cluster_state_uuid | Unique identifier for the cluster's state. It must be consistent across all nodes. | Same on all nodes |
| wsrep_cluster_conf_id | Identifier for the cluster membership group. It must be the same on all nodes. | Same on all nodes |

Node status variables:

| Variable | Description | Healthy value | Notes |
|---|---|---|---|
| wsrep_ready | Indicates if the node can accept queries. | ON | If OFF, the node will reject almost all queries. |
| wsrep_connected | Indicates if the node has network connectivity with other nodes. | ON | If OFF, the node is isolated. |
| wsrep_local_state_comment | Shows the current node state in a readable format. | N/A | Output is human-readable and varies. |

Node states (wsrep_local_state_comment):

| State | Meaning |
|---|---|
| Synced | The node is a healthy, fully operational, and active member of the cluster. |
| Donor | The node is providing a State Snapshot Transfer (SST) to another node; typically read-only. |
| Joining | The node is establishing a connection and synchronizing with the cluster. |
| Joined | The node has received a state transfer but is applying transactions to catch up before syncing. |
| Initialized | The node is not connected to any cluster component. |

Performance and flow control variables:

| Variable | Description |
|---|---|
| wsrep_local_recv_queue_avg | Average size of the queue of write-sets waiting to be applied. A value consistently higher than 0.0 indicates falling behind and may trigger Flow Control. |
| wsrep_flow_control_paused | Fraction of time the node has been paused by Flow Control. A value close to 0.0 is ideal; a high value indicates a performance bottleneck. |
| wsrep_local_send_queue_avg | Average size of the queue of write-sets waiting to be sent to other nodes. Values much greater than 0.0 can indicate network throughput issues. |
| wsrep_cert_deps_distance | Represents the node's potential for parallel transaction application, helping to optimally tune the wsrep_slave_threads parameter. |
MariaDB Galera Cluster provides high availability with synchronous replication, while adding asynchronous replication boosts redundancy for disaster recovery or reporting.
State Snapshot Transfers (SSTs) in MariaDB Galera Cluster copy the full dataset from a donor node to a new or recovering joiner node, ensuring data consistency before the joiner joins replication.
While a client application can connect directly to any node in a MariaDB Galera Cluster, this is not a practical approach for a production environment. A direct connection creates a single point of failure and does not allow the application to take advantage of the cluster's high availability and read-scaling capabilities.
A load balancer or database proxy is an essential component that sits between your application and the cluster. Its primary responsibilities are:
Provide a Single Endpoint: Your application connects to the load balancer's virtual IP address, not to the individual database nodes.
Health Checks: The load balancer constantly monitors the health of each cluster node (e.g., is it Synced? is it up or down?).
Traffic Routing: It intelligently distributes incoming client connections and queries among the healthy nodes in the cluster.
Automatic Failover: If a node fails, the load balancer automatically stops sending traffic to it, providing seamless failover for your application.
For MariaDB Galera Cluster, the recommended load balancer is MariaDB MaxScale. Unlike a generic TCP proxy, MaxScale is a database-aware proxy that understands the Galera Cluster protocol. This allows it to make intelligent routing decisions based on the real-time state of the cluster nodes.
A database-aware proxy like MaxScale can be configured to use several different routing strategies.
This is the most common and highly recommended strategy for general-purpose workloads.
How it Works: The load balancer is configured to send all write operations (INSERT, UPDATE, DELETE) to a single, designated primary node. All read operations (SELECT) are then distributed across the remaining available nodes.
Advantages:
Minimizes Transaction Conflicts: By directing all writes to one node, you significantly reduce the chance of two nodes trying to modify the same row at the same time, which would lead to deadlocks and transaction rollbacks.
In this simpler strategy, the load balancer distributes all connections evenly across all available nodes.
How it Works: Each new connection is sent to the next available node in a round-robin fashion.
Disadvantages: This approach can easily lead to transaction conflicts if your application sends writes to multiple nodes simultaneously. It is generally only suitable for applications that are almost exclusively read-only.
While MariaDB MaxScale is the recommended solution, other proxies and load balancers can also be used with Galera Cluster, including:
ProxySQL: Another popular open-source, database-aware proxy.
HAProxy: A very common and reliable TCP load balancer. When used with Galera, HAProxy is typically configured with a simple TCP health check or a custom script to determine node availability.
Cloud Load Balancers: Cloud providers like AWS (ELB/NLB), Google Cloud, and Azure offer native load balancing services that can be used to distribute traffic across a Galera Cluster.
When a network failure or a crash affects over half of your cluster nodes, the cluster might lose its Quorum. In such cases, the remaining nodes may return an Unknown command error for many queries. This behavior is a safeguard to prevent data inconsistency.
You can confirm this by checking the wsrep_cluster_status on all nodes:
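```sql
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';
```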
If none of your nodes return a value of Primary, you must manually intervene to reset the Quorum and bootstrap a new Primary Component.
Maximizes Read Scalability: It fully utilizes the other nodes in the cluster for scaling out read-intensive workloads.
Before you can reset the Quorum, you must identify the most advanced node in the cluster. This is the node whose local database committed the last transaction. Starting the cluster from any other node can result in data loss.
To facilitate a safe restart and prevent an administrator from choosing the wrong node, modern versions of Galera Cluster include a "Safe-to-Bootstrap" feature.
When a cluster is shut down gracefully, the last node to be stopped will be the most up-to-date. Galera tracks this and marks only that last node as safe to bootstrap from by setting a flag in its state file. If you attempt to bootstrap from a node marked as unsafe, Galera will refuse and show a message in the logs. In the case of a sudden, simultaneous crash, all nodes will be considered unsafe, requiring manual intervention.
The procedure to select the right node depends on how the cluster was stopped.
In the case of a planned, orderly shutdown, you only need to follow the recommendation of the "Safe-to-Bootstrap" feature. On each node, inspect the /var/lib/mysql/grastate.dat file and look for the one where safe_to_bootstrap: 1 is set.
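For example:

```
# GALERA saved state
version: 2.1
uuid: 9acf4d34-acdb-11e6-bcc3-d3e36276629f
seqno: 15
safe_to_bootstrap: 1
```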
Use this node for the bootstrap.
In the case of a hard crash, all nodes will likely have safe_to_bootstrap: 0. You must therefore manually determine which node is the most advanced.
On each node, run the mysqld daemon with the --wsrep-recover option. This will read the InnoDB storage engine logs and report the last known transaction position in the MariaDB error log.
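```bash
mysqld --wsrep-recover
```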
Inspect the error log for a line similar to this:
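```
[Note] WSREP: Recovered position: 37bb872a-ad73-11e6-819f-f3b71d9c5ada:345628
```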
Compare the sequence number (the number after the colon) from all nodes. The node with the highest sequence number is the most advanced.
On that most advanced node, you can optionally edit the /var/lib/mysql/grastate.dat file and set safe_to_bootstrap: 1 to signify that you have willfully chosen this node.
Once you have identified the most advanced node, there are two methods to bootstrap the new Primary Component from it.
This method is recommended if the mysqld process is still running on the most advanced node. It is non-destructive and can preserve the GCache, increasing the chance of a fast Incremental State Transfer (IST) for the other nodes.
To perform an automatic bootstrap, connect to the most advanced node with a database client and execute:
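```sql
SET GLOBAL wsrep_provider_options='pc.bootstrap=YES';
```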
This node will now form a new Primary Component by itself.
This method involves a full shutdown and a special startup of the most advanced node.
Ensure the mysqld service is stopped on all nodes in the cluster.
On the most advanced node only, start the cluster using the galera_new_cluster script:
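```bash
galera_new_cluster
```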
After the first node is successfully running and has formed the new Primary Component, start the MariaDB service normally on all of the other nodes.
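```bash
systemctl start mariadb
```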
They will detect the existing Primary Component, connect to it, and automatically initiate a State Transfer to synchronize their data and rejoin the cluster.
This page is licensed: CC BY-SA / Gnu FDL
Quorum is essential for maintaining data consistency in a MariaDB Galera Cluster by safeguarding against network partitions or node failures. It ensures that the cluster processes database queries and transactions only when a majority of nodes are operational, healthy, and in communication.
This majority group is known as the Primary Component. Nodes not in this group switch to a non-primary state, halting queries and entering a read-only "safe mode" to prevent data discrepancies. The primary function of Quorum is to avoid "split-brain" scenarios, which occur when network partitions lead to parts of the cluster operating independently and accepting writes. By ensuring only the partition with a majority of nodes becomes the Primary Component, Quorum effectively prevents these inconsistencies.
Quorum is achieved when more than 50% of the total nodes in the last known membership are in communication.
Odd Number of Nodes (Recommended): In a 3-node cluster, a majority is 2. The cluster can tolerate the failure of 1 node and remain operational.
Even Number of Nodes: In a 2-node cluster, a majority is also 2. If one node fails, the remaining node represents only 50% of the cluster, which is not a majority, and it will lose Quorum. This is why a 2-node cluster has no fault tolerance without an external arbitrator.
The Galera Arbitrator (garbd) is the standard solution for clusters with an even number of nodes. It is a lightweight, stateless daemon that acts as a voting member in the cluster without being a full database node. It participates in Quorum calculations, effectively turning an even-numbered cluster into an odd-numbered one. For example, in a 2-node cluster, adding garbd makes the total number of voting members 3, allowing the cluster to maintain Quorum if one database node fails.
A split-brain occurs when a network partition divides the cluster and no resulting group of nodes has a majority (e.g., a 4-node cluster splitting into two groups of 2). By design, both halves of the cluster will fail to achieve a majority, and all nodes will enter a non-Primary state.
If you need to restore service before the network issue is fixed, you must manually intervene:
Choose ONE side of the partition to become the new Primary Component.
On a single node within that chosen group, execute the following command:
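```sql
SET GLOBAL wsrep_provider_options='pc.bootstrap=true';
```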
The nodes in this group will now form a new Primary Component. When network connectivity is restored, the nodes from the other partition will automatically rejoin.
Never execute the bootstrap command on both sides of a partition, as this will create two independent, active clusters with diverging data.
As a more advanced alternative to garbd for fine-grained control, nodes can also be assigned a specific voting weight.
You can check the health of the cluster and its Quorum status at any time by querying the following wsrep_ status variables.
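For example, the cluster-level variables can be listed with a single pattern query (a small illustrative example):

```sql
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_%';
```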
If the entire cluster loses Quorum (e.g., from a simultaneous crash or shutdown), you must bootstrap a new Primary Component to restore service. This must be done from the node that contains the most recent data to avoid any data loss.
MariaDB Galera Cluster provides a safe_to_bootstrap flag in the /var/lib/mysql/grastate.dat file to make this process safer and easier.
The last node to shut down will be the most up-to-date and will have safe_to_bootstrap: 1 set in its grastate.dat file. You should always look for and bootstrap from this node.
If all nodes crashed, they will all likely have safe_to_bootstrap: 0. In this case, you must manually determine the most advanced node by finding the one with the highest seqno in its grastate.dat file or by using the --wsrep-recover utility.
Once you have identified the correct node, you will start the MariaDB service on that node only using a special bootstrap command (e.g., galera_new_cluster). After it comes online and forms a new Primary Component, you can start the other nodes normally, and they will rejoin and synchronize automatically.
For detailed, step-by-step instructions on this critical procedure, see the quorum reset procedure described above.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB's global transaction IDs (GTIDs) are very useful when used with MariaDB replication, which is primarily what that feature was developed for. Galera Cluster, on the other hand, was developed by Codership for all MySQL and MariaDB variants, and the initial development of the technology pre-dated MariaDB's GTID implementation. As a side effect, MariaDB Galera Cluster (at least in older versions) only partially supports MariaDB's GTID implementation.
Galera Cluster has its own certification-based replication method that is substantially different from MariaDB replication. However, it would still be beneficial if MariaDB Galera Cluster were able to associate a Galera Cluster write set with a GTID that is globally unique but that is also consistent for that write set on each cluster node.
MariaDB supports wsrep GTID mode for this purpose.
MariaDB has a feature called wsrep GTID mode. When this mode is enabled, MariaDB uses some tricks to try to associate each Galera Cluster write set with a GTID that is globally unique, but that is also consistent for that write set on each cluster node. These tricks work in some cases, but can still become inconsistent among cluster nodes.
Several things need to be configured for wsrep GTID mode to work, such as:
wsrep_gtid_mode=ON needs to be set on all nodes in the cluster.
wsrep_gtid_domain_id needs to be set to the same value on all nodes in a given cluster, so that each cluster node uses the same domain when assigning GTIDs for Galera Cluster's write sets. When replicating between two clusters, each cluster should have this set to a different value, so that each cluster uses different domains when assigning GTIDs for their write sets.
log_slave_updates needs to be enabled on all nodes in the cluster.
And as an extra safety measure:
gtid_domain_id should be set to a different value on all nodes in a given cluster, and each of these values should be different than the configured wsrep_gtid_domain_id value. This is to prevent a node from using the same domain used for Galera Cluster's write sets when assigning GTIDs for non-Galera transactions, such as DDL executed with wsrep_OSU_method=RSU set or DML executed with wsrep_on=OFF set.
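As a sketch, a node's option file might combine these settings as follows (the ID values are only illustrative; wsrep_gtid_domain_id must match on every node in the cluster, while gtid_domain_id must be unique per node and different from wsrep_gtid_domain_id, and the binary log must be enabled for log_slave_updates to take effect):

```ini
[mariadb]
log_bin
log_slave_updates    = ON
wsrep_gtid_mode      = ON
wsrep_gtid_domain_id = 1
gtid_domain_id       = 11
```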
If you want to prevent writes from accidentally creating local GTIDs, you can do so by setting:
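```ini
wsrep_mode = DISALLOW_LOCAL_GTID
```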
In this case, attempting such a write produces an error:
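```
ERROR 4165 (HY000): Galera replication not supported
```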
You can override it temporarily with:
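```sql
SET sql_log_bin = 0;
```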
For information on setting , see .
If a Galera Cluster node is also a replication slave, then that node's slave SQL thread will be applying transactions that it replicates from its replication master. If the node has log_slave_updates enabled, then each transaction that the slave SQL thread applies will also generate a Galera Cluster write set that is replicated to the rest of the nodes in the cluster.
The node acting as slave includes the transaction's original Gtid_Log_Event in the replicated write set, so all nodes should associate the write set with its original GTID.
This page is licensed: CC BY-SA / Gnu FDL
Hybrid replication leverages standard, asynchronous MariaDB Replication to copy data from a synchronous MariaDB Galera Cluster to an external server or another cluster. This configuration establishes a one-way data flow, where the entire Galera Cluster serves as the source (primary) for one or more asynchronous replicas. This advanced setup combines the strengths of both replication methods: synchronous replication ensures high availability within the primary site, while asynchronous replication caters to specific use cases, allowing for flexible data distribution.
Implementing a hybrid replication setup is a powerful technique for solving several common business needs:
Before implementing a hybrid setup, it is critical to understand the technical challenges:
This page is licensed: CC BY-SA / Gnu FDL
Common use cases for hybrid replication:

| Use case | Description |
|---|---|
| Disaster Recovery (DR) | Galera Cluster provides high availability and automatic failover. Use asynchronous replication for a distant replica, promoting it during site outages. |
| Feeding Analytics/BI Systems | Replicate from the OLTP Galera Cluster to a data warehouse or analytics server to run heavy queries without affecting production performance. |
| Upgrades and Migrations | Use an asynchronous replica to test new MariaDB versions or migrate to new hardware with minimal downtime. |

Technical challenges to plan for:

| Challenge | Description |
|---|---|
| GTID Management | Galera Cluster and MariaDB Replication use different GTID formats and implementations, requiring careful configuration to avoid conflicts. |
| Replication Lag | The external replica experiences the usual latencies of asynchronous replication, causing it to lag behind the real-time state of the cluster. |
| Failover Complexity | Failover within Galera Cluster is automatic, but failing over to the asynchronous DR replica is manual and requires careful planning. |
| Variable | Description | Healthy value |
|---|---|---|
| wsrep_cluster_status | Status of the component the node belongs to. | Primary |
| wsrep_cluster_size | Number of nodes in the current component. | Matches expected total |
| wsrep_cluster_state_uuid | Unique identifier for the cluster's state. | Same on all nodes |
| wsrep_cluster_conf_id | Identifier for the cluster membership group. | Same on all nodes |

An Incremental State Transfer (IST) is the fast and efficient process where a joining node receives only the missing transactions it needs to catch up with the cluster, rather than receiving a full copy of the entire database.
This is the preferred provisioning method because it is:
Fast: Transferring only the missing changes is significantly faster than copying the entire dataset.
Non-Blocking: The donor node can continue to serve read and write traffic while an IST is in progress.
IST is an automatic process, but it is only possible if the following conditions are met:
The joining node has previously been a member of the cluster (its state UUID matches the cluster's).
All of the write-sets that the joiner is missing are still available in the donor node's Write-set Cache (GCache).
If these conditions are not met, the cluster automatically falls back to performing a full State Snapshot Transfer (SST).
Appliers need to verify foreign key constraints during normal operation in multi-active topologies, so appliers are configured with foreign key (FK) checking enabled.
However, during node joining, that is during IST and the subsequent catch-up period, the node is still idle from the point of view of local connections, and the only source of incoming transactions is the cluster sending certified write sets for applying. Because IST applies write sets in parallel, FK checks can cause lock conflicts between appliers accessing FK child and parent tables. Excessive FK checking also slows down the IST process.
To address this issue, you can relax FK checks for appliers during IST and catch-up periods. The relaxed FK check mode is enabled via the wsrep_mode system variable:
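```ini
wsrep_mode = SKIP_APPLIER_FK_CHECKS_IN_IST
```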
When this operation mode is set, and the node is processing IST or catch-up, appliers skip FK checking.
The GCache is a special cache on each node whose primary purpose is to store recent write-sets specifically to facilitate Incremental State Transfers. The size and configuration of the GCache are therefore critical for the cluster's recovery speed.
When a node attempts to rejoin the cluster, it reports the seqno (sequence number) of the last transaction it successfully applied. The potential donor node then checks its GCache for the very next seqno in that sequence.
If it is found, the donor has the necessary history. It streams all subsequent write-sets from its GCache to the joiner. The joiner applies them in order and quickly becomes Synced.
If it is not found, the node was disconnected for too long, and the required history has been purged from the cache. IST is not possible, and an SST is initiated.
You can control the GCache behavior with several parameters in the [galera] section of your configuration file (my.cnf).
The gcache.size parameter is the most critical setting for ensuring nodes can use IST. A GCache that is too small is the most common reason for a cluster falling back to a full SST.
The ideal size depends on your cluster's write rate and the amount of downtime you want to tolerate for a node before forcing an SST. For instance, do you want a node that is down for 1 hour for maintenance to recover instantly (IST), or can you afford a full SST?
The most accurate way to size your GCache is to base it on your cluster's write rate.
Find your cluster's write rate:
You can calculate this using the wsrep_received_bytes status variable. First, check the value and note the time:
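```sql
SHOW STATUS LIKE 'wsrep_received_bytes';
```
```
+------------------------+-----------+
| Variable_name          | Value     |
+------------------------+-----------+
| wsrep_received_bytes   | 6637093   |
+------------------------+-----------+
```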
Wait for a significant interval during peak load (e.g., 3600 seconds, or 1 hour). Run the query again:
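```sql
SHOW STATUS LIKE 'wsrep_received_bytes';
```
```
+------------------------+-----------+
| Variable_name          | Value     |
+------------------------+-----------+
| wsrep_received_bytes   | 79883093  |
+------------------------+-----------+
```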
Now, calculate the rate (bytes per second):
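Using the example values above: (79,883,093 - 6,637,093) bytes / 3,600 seconds ≈ 20,346 bytes per second (roughly 20 KB/s).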
If you cannot calculate the write rate, you can use a simpler heuristic based on your data directory size as a starting point.
Start with the size of your data directory.
Subtract the size of the GCache's ring buffer file itself (default: galera.cache).
Consider your SST method:
If you use mysqldump, the log files are not copied, so they do not need to be counted in the total size.
These calculations are guidelines. If your cluster nodes frequently request SSTs, it is a clear sign your gcache.size is too small. In cases where you must avoid SSTs as much as possible, you should use a much larger GCache than suggested, assuming you have the available storage.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB replication and MariaDB Galera Cluster can be used together. However, there are some things that have to be taken into account.
If you want to use MariaDB replication and MariaDB Galera Cluster together, then the following tutorials may be useful:
If a Galera Cluster node is also a replication master, then some additional configuration may be needed.
Like with MariaDB replication, write sets that are received by a node through Galera Cluster's certification-based replication are not written to the binary log by default.
If the node is a replication master, then its replication slaves only replicate transactions that are in the binary log, so this means that the transactions that correspond to Galera Cluster write-sets would not be replicated by any replication slaves by default. If you would like a node to write its replicated write sets to the binary log, then you will have to set log_slave_updates=ON. If the node has any replication slaves, then this would also allow those slaves to replicate the transactions that corresponded to those write sets.
See for more information.
If a Galera Cluster node is also a replication slave, then some additional configuration may be needed.
If the node is a replication slave, then the node's slave SQL thread will be applying transactions that it replicates from its replication master. Transactions applied by the slave SQL thread will only generate Galera Cluster write-sets if the node has log_slave_updates enabled. Therefore, in order to replicate these transactions to the rest of the nodes in the cluster, log_slave_updates must be set.
If the node is a replication slave, then it is probably also a good idea to enable wsrep_restart_slave. When this is enabled, the node will restart its slave threads whenever it rejoins the cluster.
Historically, Galera Cluster nodes acting as asynchronous replication slaves were restricted to single-threaded execution (slave_parallel_threads=0). Enabling parallel replication often resulted in deadlocks due to conflicts between parallel replication's commit ordering and Galera's internal pre-commit ordering.
As of MariaDB 12.1.1, this limitation has been resolved.
On supported versions, you can safely configure slave_parallel_threads to a value greater than 0 to improve the performance of incoming replication streams.
Recommended Configuration (MariaDB 12.1.1+):
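```sql
SET GLOBAL slave_parallel_threads = 4; -- Adjust based on workload
SET GLOBAL slave_parallel_mode = 'optimistic';
```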
Both MariaDB replication and MariaDB Galera Cluster support replication filters, so extra caution must be taken when using these features together; see the documentation on how MariaDB Galera Cluster interprets replication filters for details.
It is most common to set server_id to the same value on each node in a given cluster. Since Galera Cluster uses a virtually synchronous certification-based replication, all nodes should have the same data, so in a logical sense, a cluster can be considered in many cases a single logical server for purposes related to MariaDB replication. The binary logs of each cluster node might even contain roughly the same transactions and GTIDs if log_slave_updates is set and if wsrep_gtid_mode is enabled and if non-Galera transactions are not being executed on any nodes.
There are cases when it might make sense to set a different server_id value on each node in a given cluster. For example, if log_slave_updates is set and if another cluster or a standard MariaDB Server is replicating transactions from each cluster node individually, then it would be required to set a different server_id value on each node for this to work.
Keep in mind that if replication is set up in a scenario where each cluster node has a different server_id value, and if the replication topology is set up in such a way that a cluster node can replicate the same transactions through Galera and through MariaDB replication, then you may need to configure the cluster node to ignore these transactions when setting up MariaDB replication. You can do so by setting IGNORE_SERVER_IDS to the server IDs of all nodes in the same cluster when executing CHANGE MASTER TO. For example, this might be required when circular replication is set up between two separate clusters, and each cluster node has a different server_id value, and each cluster has log_slave_updates set.
This page is licensed: CC BY-SA / Gnu FDL
Perform a manual node provision. This guide details the steps to manually backup a donor and restore it to a joiner node in a Galera Cluster.
Sometimes it can be helpful to perform a "manual SST" when Galera's normal SSTs fail. This can be especially useful when the cluster's dataset is very large, since a normal SST can take a long time to fail in that case.
A manual SST essentially consists of taking a backup of the donor, loading the backup on the joiner, and then manually editing the cluster state on the joiner node. This page will show how to perform this process with mariadb-backup.
For example, a very common version number is "2.1".
Calculate your desired GCache size:
Decide on the time window you want to support (e.g., 2 hours = 7200 seconds).
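With the example write rate from above: 20,346 bytes/s x 7,200 s ≈ 146,491,200 bytes, which is roughly 140M.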
In this example, a gcache.size of 140M would allow a node to be down for 2 hours and still rejoin using IST.
Check your current GCache validity period:
Conversely, you can use your write rate to see how long your current GCache size is valid:
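For example, assuming the default gcache.size of 128M and the write rate calculated above: 134,217,728 bytes / 20,346 bytes/s ≈ 6,600 seconds, or a little under two hours of node downtime before IST is no longer possible.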
If you use rsync or xtrabackup, the log files are copied, so they should be part of the total size.
gcache.size: Controls the size of the on-disk ring-buffer file. A larger GCache can hold more history, increasing the chance of a fast IST over SST.
gcache.dir: Specifies where GCache files are stored. Best practice is to place it on the fastest available storage like SSD or NVMe.
gcache.recover: Enabled by default in recent versions, it allows a node to recover its GCache after a restart, enabling immediate service as a donor for IST.
The name of the Galera info file created by the backup depends on the version: mariadb_backup_galera_info for MariaDB 11.4 and later, and xtrabackup_galera_info for MariaDB 11.3 and earlier.
For MariaDB 11.4 and later:
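```bash
cat $MYSQL_BACKUP_DIR/mariadb_backup_galera_info
```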
For MariaDB 11.3 and earlier:
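```bash
cat $MYSQL_BACKUP_DIR/xtrabackup_galera_info
```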
The file contains the values of the wsrep_local_state_uuid and wsrep_last_committed status variables. The values are written in the following format:
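```
wsrep_local_state_uuid:wsrep_last_committed
```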
For example:
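```
d38587ce-246c-11e5-bcce-6bbd0831cc0f:1352215
```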
On the joiner node, verify that the node does not need to perform a normal SST after the manual SST.
```bash
sudo tee $MYSQL_BACKUP_DIR/grastate.dat <<EOF
# GALERA saved state
version: 2.1
uuid: d38587ce-246c-11e5-bcce-6bbd0831cc0f
seqno: 1352215
safe_to_bootstrap: 0
EOF
```

```bash
MYSQL_DATADIR=/var/lib/mysql
rm -Rf $MYSQL_DATADIR/*
```

```bash
mariadb-backup --copy-back \
   --target-dir=$MYSQL_BACKUP_DIR
```

```bash
chown -R mysql:mysql $MYSQL_DATADIR/
```

```bash
systemctl start mariadb
```

```bash
tail -f /var/log/mysql/mysqld.log
```

```bash
mariadb-backup --version
```

```bash
MYSQL_BACKUP_DIR=/mysql_backup
mkdir $MYSQL_BACKUP_DIR
```

```bash
DB_USER=sstuser
DB_USER_PASS=password
mariadb-backup --backup --galera-info \
   --target-dir=$MYSQL_BACKUP_DIR \
   --user=$DB_USER \
   --password=$DB_USER_PASS
```

```bash
systemctl status mariadb
```

```bash
MYSQL_BACKUP_DIR=/mysql_backup
mkdir $MYSQL_BACKUP_DIR
```

```bash
OS_USER=dba
JOINER_HOST=dbserver2.mariadb.com
rsync -av $MYSQL_BACKUP_DIR/* ${OS_USER}@${JOINER_HOST}:${MYSQL_BACKUP_DIR}
```

```bash
mariadb-backup --prepare \
   --target-dir=$MYSQL_BACKUP_DIR
```

```bash
MYSQL_DATADIR=/var/lib/mysql
cat $MYSQL_DATADIR/grastate.dat | grep version
```

The Galera Load Balancer (glbd) is no longer under active development.
It is provided here for historical and reference purposes only.
For new deployments, we recommend using a modern, fully supported proxy such as MariaDB MaxScale.
Galera Load Balancer (glbd) is a simple, multi-threaded TCP connection balancer, optimized for database workloads.
It was inspired by pen, but unlike it, GLB focuses only on balancing generic TCP connections.
GLB must be built from source. There are no pre-built packages.
This installs:
glbd (daemon) → /usr/sbin
libglb (shared library)
To run as a service:
Manage with:
GLB can be configured either via command-line options or via a configuration file.
The service configuration file is glbd.cfg. GLB supports five balancing policies:
-T | --top option: restricts balancing to servers with the highest weight.
GLB can be managed at runtime via:
FIFO file
Control socket (-c <addr:port>)
Example:
The watchdog module performs asynchronous health checks beyond simple TCP reachability.
Enable with:
Runs mysql.sh with host:port as first argument.
Exit code 0 = healthy, non-zero = failure.
Use -i to set check interval.
libglb enables transparent load balancing by intercepting the connect() system call.
Limited by system open files (ulimit -n)
With default 1024 → ~493 connections
With 4096 (typical unprivileged user) → ~2029 connections
See also: MariaDB MaxScale, the recommended modern proxy.
This page is licensed: CC BY-SA / Gnu FDL
Get stats: echo "getstat" | nc 127.0.0.1 4444 (shows the raw performance counters)
With Galera, -D|--discover enables auto-discovery of nodes.
Increase the limit in /etc/security/limits.conf if needed.

| Feature | Description |
|---|---|
| Server Draining | Remove servers smoothly without interrupting active connections. |
| High Performance | Uses the Linux epoll API (2.6+). |
| Multithreading | Leverages multi-core CPUs for better performance. |
| Optional Watchdog Module | Monitors server health. |
| Seamless Client Integration | Uses libglb for load balancing without changing applications, by intercepting connect() calls. |
glbd.cfg variables:

| Variable | Description |
|---|---|
| LISTEN_ADDR | Address/port GLB listens on for client connections |
| DEFAULT_TARGETS | Space-separated list of backend servers |
| OTHER_OPTIONS | Extra GLB options (e.g. balancing policy) |

Balancing policies:

| Policy | Description |
|---|---|
| Least Connected (default) | Routes new connections to the server with the fewest active connections (adjusted for weight). |
| Round Robin | Sequentially cycles through available servers. |
| Single | Routes all connections to the highest-weight server until it fails or a higher-weight server is available. |
| Random | Distributes connections randomly among servers. |
| Source Tracking | Routes all connections from the same client IP to the same server (best-effort). |
Runtime commands:

Add/Modify server: echo "192.168.0.1:3307:5" | nc 127.0.0.1 4444 (adds a backend with weight 5)
Drain server: echo "192.168.0.1:3307:0" | nc 127.0.0.1 4444 (stops new connections, keeps existing ones)
Delete server: echo "192.168.0.1:3307:-1" | nc 127.0.0.1 4444 (removes the backend and closes active connections)
Get routing table: echo "getinfo" | nc 127.0.0.1 4444 (shows backends, weight, usage, and connections)

Statistics fields (getstat output):

| Field | Meaning |
|---|---|
| in / out | Bytes received/sent via the client interface |
| recv / send | Bytes passed and number of recv()/send() calls |
| conns | Created / concurrent connections |
| poll | Read-ready / write-ready / total poll calls |
| elapsed | Time since the last report (seconds) |

Environment variables for libglb:

| Variable | Description |
|---|---|
| GLB_WATCHDOG | Same as the --watchdog option |
| GLB_TARGETS | Comma-separated list of backends (H:P:W) |
| GLB_BIND | Local bind address for intercepted connections |
| GLB_POLICY | Balancing policy (single, random, source) |
| GLB_CONTROL | Control socket for runtime commands |
```bash
git clone https://github.com/codership/glb
cd glb/
./bootstrap.sh
./configure
make
sudo make install
```

```bash
# cp files/glbd.sh /etc/init.d/glb
# cp files/glbd.cfg /etc/sysconfig/glbd.cfg   # Red Hat / CentOS
# cp files/glbd.cfg /etc/default/glbd.cfg     # Debian / Ubuntu
```

```bash
service glb start|stop|restart|getinfo|getstats|add|remove|drain
```

```bash
glbd --help
```

```
# Galera Load Balancer Configuration
LISTEN_ADDR="8010"
DEFAULT_TARGETS="192.168.1.1 192.168.1.2 192.168.1.3"
OTHER_OPTIONS="--random --top 3"
```

```bash
echo "getstat" | nc -q 1 127.0.0.1 4444
in: 6930 out: 102728 recv: 109658 / 45 send: 109658 / 45 conns: 0 / 4 poll: 45 / 0 / 45 elapsed: 1.03428
```

```bash
glbd -w exec:"mysql.sh -utest -ptestpass" -t 2 3306 192.168.0.1 192.168.0.2
```

```bash
export LD_PRELOAD=/path/to/libglb.so
export GLB_OPTIONS="--random 3306 192.168.0.1 192.168.0.2 192.168.0.3"
mysql -uroot -p -h127.0.0.1 -P3306
```

MariaDB replication can be used to replicate between MariaDB Galera Cluster and MariaDB Server. This article will discuss how to do that.
Before we set up replication, we need to ensure that the cluster is configured properly. This involves the following steps:
Set log_slave_updates=ON on all nodes in the cluster. See Configuring MariaDB Galera Cluster: Writing Replicated Write Sets to the Binary Log and Using MariaDB Replication with MariaDB Galera Cluster: Configuring a Cluster Node as a Replication Master for more information on why this is important.
Set server_id to the same value on all nodes in the cluster.
If you want to use GTID replication, then you also need to configure some things to use wsrep GTID mode. For example:
wsrep_gtid_mode=ON needs to be set on all nodes in the cluster.
wsrep_gtid_domain_id needs to be set to the same value on all nodes in the cluster so that each cluster node uses the same domain when assigning GTIDs for Galera Cluster's write sets.
log_slave_updates needs to be enabled on all nodes in the cluster.
log_bin needs to be set to the same path on all nodes in the cluster.
And as an extra safety measure:
gtid_domain_id should be set to a different value on all nodes in a given cluster, and each of these values should be different than the configured wsrep_gtid_domain_id value. This is to prevent a node from using the same domain used for Galera Cluster's write sets when assigning GTIDs for non-Galera transactions, such as DDL executed with wsrep_OSU_method=RSU set or DML executed with wsrep_on=OFF set.
Before we set up replication, we also need to ensure that the MariaDB Server replica is configured properly. This involves the following steps:
Set server_id to a different value than the one that the cluster nodes are using.
Set gtid_domain_id to a value that is different than the wsrep_gtid_domain_id and gtid_domain_id values that the cluster nodes are using.
Set log_bin and log_slave_updates if you want the replica to log the transactions that it replicates.
Our process to set up replication is going to be similar to the standard process for setting up replication from a backup, but it will be modified a bit to work in this context.
The very first step is to start the nodes in the first cluster. The first node will have to be bootstrapped (for example, with galera_new_cluster). The other nodes can be started normally.
Once the nodes are started, you need to pick a specific node that will act as the replication primary for the MariaDB Server.
Now that the backup has been restored to the MariaDB Server replica, you can start the MariaDB Server process.
You can also set up circular replication between the cluster and MariaDB Server, which means that the MariaDB Server replicates from the cluster, and the cluster also replicates from the MariaDB Server.
This page is licensed: CC BY-SA / Gnu FDL
In a State Snapshot Transfer (SST), the cluster provisions nodes by transferring a full data copy from one node to another. When a new node joins the cluster, the new node initiates a State Snapshot Transfer to synchronize its data with a node that is already part of the cluster.
There are two conceptually different ways to transfer a state from one MariaDB server to another:
Logical: The only SST method of this type is the mysqldump SST method. It requires the joiner node to be fully initialized and ready to accept connections before the transfer.
At this point, you need to get the replication coordinates of the primary from the original backup.
The coordinates will be in the xtrabackup_binlog_info file.
mariadb-backup dumps replication coordinates in two forms: GTID coordinates, and file and position coordinates like the ones you would normally see from SHOW MASTER STATUS output. In this case, it is probably better to use the GTID coordinates.
For example:
Regardless of the coordinates you use, you will have to set up the primary connection using CHANGE MASTER TO and then start the replication threads with START SLAVE.
If you want to use GTIDs, then you will have to first set gtid_slave_pos to the GTID coordinates that we pulled from the xtrabackup_binlog_info file, and we would set MASTER_USE_GTID=slave_pos in the CHANGE MASTER TO command. For example:
If you want to use the file and position coordinates, then you would set MASTER_LOG_FILE and MASTER_LOG_POS in the CHANGE MASTER TO command to the file and position coordinates that we pulled from the xtrabackup_binlog_info file. For example:
How this is done would depend on whether you want to use the GTID coordinates or the file and position coordinates.
Regardless, you need to ensure that the second cluster is not accepting any writes other than those that it replicates from the cluster at this stage.
To get the GTID coordinates on the MariaDB server, you can check gtid_current_pos by executing:
Then on the node acting as a replica in the cluster, you can set up replication by setting gtid_slave_pos to the GTID that was returned and then executing CHANGE MASTER TO and START SLAVE:
To get the file and position coordinates on the MariaDB server, you can execute SHOW MASTER STATUS:
Then on the node acting as a replica in the cluster, you would set master_log_file and master_log_pos in the CHANGE MASTER TO command. For example:
Physical: SST methods of this type physically copy the data files from the donor node to the joiner node. This requires that the joiner node be initialized after the transfer. The mariadb-backup SST method and a few other SST methods fall into this category. These SST methods are much faster than the mysqldump SST method, but they have certain limitations. For example, they can be used only on server startup, and the joiner node must be configured very similarly to the donor node. Some of the SST methods in this category are non-blocking on the donor node, meaning that the donor node is still able to process queries while donating the SST (e.g. the mariadb-backup SST method is non-blocking).
SST methods are supported via a scriptable interface. New SST methods could potentially be developed by creating new SST scripts. The scripts usually have names of the form wsrep_sst_<method> where <method> is one of the SST methods listed below.
You can choose your SST method by setting the wsrep_sst_method system variable. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor. For example:
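```sql
SET GLOBAL wsrep_sst_method='mariadb-backup';
```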
It can also be set in a server option group in an option file prior to starting up a node:
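```ini
[mariadb]
...
wsrep_sst_method = mariadb-backup
```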
For an SST to work properly, the donor and joiner node must use the same SST method. Therefore, it is recommended to set wsrep_sst_method to the same value on all nodes, since any node will usually be a donor or joiner node at some point.
MariaDB Galera Cluster comes with the following built-in SST methods:
This SST method uses the mariadb-backup utility for performing SSTs. It is one of the two non-blocking methods. This is the recommended SST method if you require the ability to run queries on the donor node during the SST. Note that if you use the mariadb-backup SST method, then you also need to have socat installed on the server. This is needed to stream the backup from the donor to the joiner. This is a limitation inherited from the xtrabackup-v2 SST method.
This SST method supports
This SST method supports .
This SST method is available from MariaDB 10.1.26 and MariaDB 10.2.10.
With this SST method, it is impossible to upgrade the cluster between some major versions; see MDEV-27437.
See mariadb-backup SST method for more information.
rsync is the default method. This method uses the rsync utility to create a snapshot of the donor node. rsync should be available by default on all modern Linux distributions. The donor node is blocked with a read lock during the SST. This is the fastest SST method, especially for large datasets since it copies binary data. Because of that, this is the recommended SST method if you do not need to allow the donor node to execute queries during the SST.
The rsync method runs rsync in --whole-file mode, assuming that nodes are connected by fast local network links so that the default delta transfer mode would consume more processing time than it may save on data transfer bandwidth. When having a distributed cluster with slow links between nodes, the rsync_wan method runs rsync in the default delta transfer mode, which may reduce data transfer time substantially when an older datadir state is already present on the joiner node. Both methods are actually implemented by the same script, wsrep_sst_rsync_wan is just a symlink to the wsrep_sst_rsync script and the actual rsync mode to use is determined by the name the script was called by.
This SST method supports
This SST method supports .
The rsync SST method does not support tables created with the DATA DIRECTORY or INDEX DIRECTORY clause. Use the mariadb-backup SST method as an alternative to support this feature.
Use of this SST method could result in data corruption when using (the default). wsrep_sst_method=rsync is a reliable way to upgrade the cluster to a newer major version.
stunnel can be used to encrypt data over the wire. Be sure to have stunnel installed. You will also need to generate certificates and keys. See the stunnel documentation for information on how to do that. Once you have the keys, you will need to add the tkey and tcert options to the [sst] option group in your MariaDB configuration file, such as:
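```ini
[sst]
tkey = /etc/my.cnf.d/certificates/client-key.pem
tcert = /etc/my.cnf.d/certificates/client-cert.pem
```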
You also need to run the certificate directory through openssl rehash.
stunnel cannot be used to encrypt data over the wire.
This SST method runs mysqldump on the donor node and pipes the output to the client connected to the joiner node. The mysqldump SST method needs a username/password pair set in the wsrep_sst_auth variable in order to get the dump. The donor node is blocked with a read lock during the SST. This is the slowest SST method.
This SST method supports .
This SST method supports .
Percona XtraBackup is not supported in MariaDB. mariadb-backup is the recommended backup method to use instead of Percona XtraBackup.
This SST method uses the Percona XtraBackup utility for performing SSTs. It is one of the two non-blocking methods. Note that if you use the xtrabackup-v2 SST method, you also need to have socat installed on the server. Since Percona XtraBackup is a third-party product, this SST method requires an additional installation and some additional configuration. Please refer to Percona's xtrabackup SST documentation for information from the vendor.
This SST method does not support
This SST method does not support .
This SST method is available from MariaDB Galera Cluster 5.5.37 and MariaDB Galera Cluster 10.0.10.
See xtrabackup-v2 SST method for more information.
Percona XtraBackup is not supported in MariaDB. mariadb-backup is the recommended backup method to use instead of Percona XtraBackup.
This SST method is an older SST method that uses the Percona XtraBackup utility for performing SSTs. The xtrabackup-v2 SST method should be used instead of the xtrabackup SST method starting from MariaDB 5.5.33.
This SST method does not support
This SST method does not support .
All SST methods except rsync require authentication via username and password. You can tell the client what username and password to use by setting the wsrep_sst_auth system variable. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor. For example:
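```sql
SET GLOBAL wsrep_sst_auth = 'mariadb-backup:password';
```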
It can also be set in a server option group in an option file prior to starting up a node:
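```ini
[mariadb]
...
wsrep_sst_auth = mariadb-backup:password
```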
Some authentication plugins do not require a password. For example, the unix_socket and gssapi authentication plugins do not require a password. If you are using a user account that does not require a password in order to log in, then you can just leave the password component of wsrep_sst_auth empty. For example:
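```ini
[mariadb]
...
wsrep_sst_auth = mariadb-backup:
```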
See the relevant description or page for each SST method to find out what privileges need to be granted to the user and whether the privileges are needed on the donor node or joiner node for that method.
MariaDB's unit file has a default startup timeout of about 90 seconds on most systems. If an SST takes longer than this default startup timeout on a joiner node, then systemd will assume that mysqld has failed to startup, which causes systemd to kill the mysqld process on the joiner node. To work around this, you can reconfigure the MariaDB systemd unit to have an infinite timeout, such as by executing one of the following commands:
If you are using systemd 228 or older, then you can execute the following to set an infinite timeout:
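```bash
sudo tee /etc/systemd/system/mariadb.service.d/timeoutstartsec.conf <<EOF
[Service]
TimeoutStartSec=0
EOF
sudo systemctl daemon-reload
```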
Systemd 229 added the infinity option, so if you are using systemd 229 or later, then you can execute the following to set an infinite timeout:
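```bash
sudo tee /etc/systemd/system/mariadb.service.d/timeoutstartsec.conf <<EOF
[Service]
TimeoutStartSec=infinity
EOF
sudo systemctl daemon-reload
```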
See Configuring the Systemd Service Timeout for more details.
Note that systemd 236 added the EXTEND_TIMEOUT_USEC environment variable that allows services to extend the startup timeout during long-running processes. Starting with MariaDB 10.1.35, MariaDB 10.2.17, and MariaDB 10.3.8, on systems with systemd versions that support it, MariaDB uses this feature to extend the startup timeout during long SSTs. Therefore, if you are using systemd 236 or later, then you should not need to manually override TimeoutStartSec, even if your SSTs run for longer than the configured value. See MDEV-15607 for more information.
An SST failure generally renders the joiner node unusable. Therefore, when an SST failure is detected, the joiner node will abort.
Restarting a node after a mysqldump SST failure may require manual restoration of the administrative tables.
Look at the description of each SST method to determine which methods support data-at-rest encryption.
For logical SST methods like mysqldump, each node should be able to have different data-at-rest encryption keys. For physical SST methods, all nodes need to have the same data-at-rest encryption keys, since the donor node will copy encrypted data files to the joiner node, and the joiner node will need to be able to decrypt them.
In order to avoid a split-brain condition, the minimum recommended number of nodes in a cluster is 3.
When using an SST method that blocks the donor, there is yet another reason to require a minimum of 3 nodes. In a 3-node cluster, if one node is acting as an SST joiner and one other node is acting as an SST donor, then there is still one more node to continue executing queries.
In some cases, if Galera Cluster's automatic SSTs repeatedly fail, then it can be helpful to perform a "manual SST". See the following pages on how to do that:
SST scripts can't currently read the [mysqld<#>] option groups in an option file that are read by instances managed by mysqld_multi.
See MDEV-18863 for more information.
This page is licensed: CC BY-SA / Gnu FDL
MariaDB replication can be used for replication between two MariaDB Galera Clusters. This article will discuss how to do that.
Before we set up replication, we need to ensure that the clusters are configured properly. This involves the following steps:
Set log_slave_updates=ON on all nodes in both clusters. See Configuring MariaDB Galera Cluster: Writing Replicated Write Sets to the Binary Log and Using MariaDB Replication with MariaDB Galera Cluster: Configuring a Cluster Node as a Replication Master for more information on why this is important.
Set server_id to the same value on all nodes in a given cluster, but be sure to use a different value in each cluster.
If you want to use GTID replication, then you also need to configure some things to use wsrep GTID mode. For example:
wsrep_gtid_mode=ON needs to be set on all nodes in each cluster.
wsrep_gtid_domain_id needs to be set to the same value on all nodes in a given cluster so that each cluster node uses the same domain when assigning GTIDs for Galera Cluster's write sets. Each cluster should have this set to a different value so that each cluster uses different domains when assigning GTIDs for their write sets.
log_slave_updates needs to be enabled on all nodes in the cluster.
And as an extra safety measure:
gtid_domain_id should be set to a different value on all nodes in a given cluster, and each of these values should be different than the configured wsrep_gtid_domain_id value. This is to prevent a node from using the same domain used for Galera Cluster's write sets when assigning GTIDs for non-Galera transactions, such as DDL executed with wsrep_OSU_method=RSU set or DML executed with wsrep_on=OFF set.
To improve the performance of the replication stream between clusters, it is recommended to enable parallel replication (slave_parallel_threads) on the nodes in the destination cluster (the cluster acting as the replica).
Our process to set up replication is going to be similar to the standard process for setting up replication from a backup, but it will be modified a bit to work in this context.
You can also set up circular replication between the two clusters, which means that the second cluster replicates from the first cluster, and the first cluster also replicates from the second cluster.
```bash
$ rsync -avrP /var/mariadb/backup dc2-dbserver1:/var/mariadb/backup
```

```bash
$ mariadb-backup --copy-back \
   --target-dir=/var/mariadb/backup/
```

```bash
$ chown -R mysql:mysql /var/lib/mysql/
```

```
mariadb-bin.000096 568 0-1-2
```

```sql
SHOW SLAVE STATUS\G
```

```sql
SHOW GLOBAL VARIABLES LIKE 'gtid_current_pos';
```

```sql
SET GLOBAL gtid_slave_pos = "0-1-2";
CHANGE MASTER TO
   MASTER_HOST="c2dbserver1",
   MASTER_PORT=3310,
   MASTER_USER="repl",
   MASTER_PASSWORD="password",
   MASTER_USE_GTID=slave_pos;
START SLAVE;
```

```sql
SHOW SLAVE STATUS\G
```

```bash
$ mariadb-backup --backup \
   --target-dir=/var/mariadb/backup/ \
   --user=mariadb-backup --password=mypassword
```

```bash
$ mariadb-backup --prepare \
   --target-dir=/var/mariadb/backup/
```

```sql
CREATE USER 'repl'@'dc2-dbserver1' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'dc2-dbserver1';
```

```sql
CREATE USER 'repl'@'c1dbserver1' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'c1dbserver1';
```

Now that the backup has been restored to the second cluster's replica, you can start the server by bootstrapping the node.
At this point, you need to get the replication coordinates of the primary from the original backup.
The coordinates will be in the xtrabackup_binlog_info file.
mariadb-backup dumps replication coordinates in two forms: GTID coordinates, and file and position coordinates like the ones you would normally see from SHOW MASTER STATUS output. In this case, it is probably better to use the GTID coordinates.
For example:
Regardless of the coordinates you use, you will have to set up the primary connection using CHANGE MASTER TO and then start the replication threads with START SLAVE.
If you want to use GTIDs, then you will have to first set gtid_slave_pos to the GTID coordinates that we pulled from the xtrabackup_binlog_info file, and we would set MASTER_USE_GTID=slave_pos in the CHANGE MASTER TO command. For example:
If you want to use the file and position coordinates, then you would set MASTER_LOG_FILE and MASTER_LOG_POS in the CHANGE MASTER TO command to the file and position coordinates that we pulled from the xtrabackup_binlog_info file. For example:
If the replica is replicating normally, then the next step would be to start the MariaDB Server process on the other nodes in the second cluster.
Now that the second cluster is up, ensure that it does not start accepting writes yet if you want to set up circular replication between the two clusters.
How this is done would depend on whether you want to use the GTID coordinates or the file and position coordinates.
Regardless, you need to ensure that the second cluster is not accepting any writes other than those that it replicates from the first cluster at this stage.
To get the GTID coordinates on the second cluster, you can check gtid_current_pos by executing:
Then on the first cluster, you can set up replication by setting gtid_slave_pos to the GTID that was returned and then executing CHANGE MASTER TO and START SLAVE:
To get the file and position coordinates on the second cluster, you can execute SHOW MASTER STATUS:
Then on the first cluster, you would set master_log_file and master_log_pos in the CHANGE MASTER TO command. For example:
```bash
$ mariadb-backup --backup \
   --target-dir=/var/mariadb/backup/ \
   --user=mariadb-backup --password=mypassword
```

```bash
$ mariadb-backup --prepare \
   --target-dir=/var/mariadb/backup/
```

```bash
$ rsync -avrP /var/mariadb/backup c2dbserver:/var/mariadb/backup
```

```bash
$ mariadb-backup --copy-back \
   --target-dir=/var/mariadb/backup/
```

```bash
$ chown -R mysql:mysql /var/lib/mysql/
```

```sql
CREATE USER 'repl'@'c2dbserver1' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'c2dbserver1';
```

```sql
SHOW SLAVE STATUS\G
```

```sql
SHOW GLOBAL VARIABLES LIKE 'gtid_current_pos';
```

```sql
SET GLOBAL gtid_slave_pos = "0-1-2";
CHANGE MASTER TO
   MASTER_HOST="c2dbserver1",
   MASTER_PORT=3310,
   MASTER_USER="repl",
   MASTER_PASSWORD="password",
   MASTER_USE_GTID=slave_pos;
START SLAVE;
```

```sql
SHOW SLAVE STATUS\G
```

```sql
SET GLOBAL slave_parallel_threads = 4; -- Adjust based on workload
SET GLOBAL slave_parallel_mode = 'optimistic';
```

```sql
CREATE USER 'repl'@'c1dbserver1' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'c1dbserver1';
```

```sql
SET GLOBAL gtid_slave_pos = "0-1-2";
CHANGE MASTER TO
   MASTER_HOST="c1dbserver1",
   MASTER_PORT=3310,
   MASTER_USER="repl",
   MASTER_PASSWORD="password",
   MASTER_USE_GTID=slave_pos;
START SLAVE;
```

```sql
CHANGE MASTER TO
   MASTER_HOST="c1dbserver1",
   MASTER_PORT=3310,
   MASTER_USER="repl",
   MASTER_PASSWORD="password",
   MASTER_LOG_FILE='mariadb-bin.000096',
   MASTER_LOG_POS=568;
START SLAVE;
```

```sql
SHOW MASTER STATUS
```

```sql
CHANGE MASTER TO
   MASTER_HOST="c2dbserver1",
   MASTER_PORT=3310,
   MASTER_USER="repl",
   MASTER_PASSWORD="password",
   MASTER_LOG_FILE='mariadb-bin.000096',
   MASTER_LOG_POS=568;
START SLAVE;
```

Configure State Snapshot Transfers for Galera. Learn to use mariadb-backup for non-blocking data transfer when a new node joins a cluster.
The mariabackup SST method uses the mariadb-backup utility for performing SSTs. It is one of the methods that does not block the donor node. mariadb-backup was originally forked from Percona XtraBackup, and similarly, the mariabackup SST method was originally forked from the xtrabackup-v2 SST method.
If you use the mariadb-backup SST method, then you also need to have socat installed on the server. This is needed to stream the backup from the donor node to the joiner node. This is a limitation that was inherited from the xtrabackup-v2 SST method.
To use the mariadb-backup SST method, you must set the wsrep_sst_method system variable to mariabackup on both the donor and joiner node. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor. For example:
It can also be set in a server option group in an option file prior to starting up a node:
For an SST to work properly, the donor and joiner node must use the same SST method. Therefore, it is recommended to set wsrep_sst_method to the same value on all nodes, since any node will usually be a donor or joiner node at some point.
The InnoDB redo log format has been changed in some major releases in a way that does not allow crash recovery, or the preparation of a backup, from an older major version. Because of this, the mariabackup SST method cannot be used for some major-version upgrades, unless you temporarily edit the wsrep_sst_mariabackup script so that the --prepare step on the newer-major-version joiner is executed using the older-major-version mariadb-backup tool.
The default method, wsrep_sst_method=rsync, does work for major-version upgrades.
The mariabackup SST method is configured by placing options in the [sst] section of a MariaDB configuration file (e.g., /etc/my.cnf.d/server.cnf). These settings are parsed by the wsrep_sst_mariabackup and wsrep_sst_common scripts.
The command-line utility is mariadb-backup; this tool was previously called mariabackup. The SST method itself retains the original name mariabackup (as in wsrep_sst_method=mariabackup).
These options control the core data transfer mechanism (all of the [sst] options discussed here are summarized in the option reference table later on this page).
These options configure on-the-fly compression to reduce network bandwidth.
These options manage user authentication and stream encryption.
mariadb-backup options: this feature allows mariadb-backup-specific options to be passed through the SST script; pass-through requires use-extra=1 in the [sst] option group.
To use the mariadb-backup SST method, the utility must be able to authenticate locally on the donor node to create a backup stream. There are two ways to manage this authentication:
Starting with MariaDB Enterprise Server 11.4, the cluster can automatically manage the SST user account. This method is more secure and requires less configuration because it avoids storing plain-text passwords in configuration files.
When this feature is used:
The donor node automatically creates a temporary internal user (e.g., 'wsrep.sst.<timestamp>_<node_id>'@localhost) with a generated password when the SST process begins.
The necessary privileges (RELOAD, PROCESS, LOCK TABLES, etc.) are automatically granted to this temporary user.
Once the SST process completes, the donor node automatically drops the user.
To enable automatic user management:
Ensure that the wsrep_sst_auth system variable is not set (or is left blank) in your configuration file.
If you explicitly define wsrep_sst_auth in your configuration, the server will revert to the manual behavior and attempt to authenticate using the credentials provided in that variable.
For versions prior to 11.4, or if you prefer to manage the user manually, you must create a user and provide the credentials to the server.
You can tell the donor node what username and password to use by setting the wsrep_sst_auth system variable. It can be changed dynamically with SET GLOBAL on the node that you intend to be an SST donor:
It can also be set in a server option group in an option file prior to starting up a node:
Some authentication plugins do not require a password. For example, the unix_socket and gssapi authentication plugins do not require a password. If you are using a user account that does not require a password in order to log in, then you can just leave the password component of wsrep_sst_auth empty. For example:
The user account that performs the backup for the SST needs the same privileges as mariadb-backup, which are RELOAD, PROCESS, LOCK TABLES, BINLOG MONITOR, and REPLICA MONITOR. To be safe, ensure that these privileges are granted on each node in your cluster. mariadb-backup connects locally on the donor node to perform the backup, so the following user should be sufficient:
It is possible to use the unix_socket authentication plugin for the user account that performs SSTs. This provides the benefit of not needing to configure a plain-text password in wsrep_sst_auth.
The user account would have to have the same name as the operating system user account that is running the mysqld process. On many systems, this is the account configured by the user option, and it tends to default to mysql.
For example, if the unix_socket authentication plugin is installed, then you could execute the following to create the user account:
To configure wsrep_sst_auth, set the following in a server option group in an option file prior to starting up a node:
It is possible to use the gssapi authentication plugin for the user account that performs SSTs. This provides the benefit of not needing to configure a plain-text password in wsrep_sst_auth.
The following steps would need to be done beforehand:
You need a KDC running (such as MIT Kerberos or Microsoft Active Directory).
You will need to create a keytab file for the MariaDB server.
You will need to install the package containing the gssapi authentication plugin.
You will need to install the plugin in MariaDB, so that the gssapi authentication plugin is available to use (a sketch follows this list).
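For example, a minimal sketch of loading the plugin, assuming the auth_gssapi plugin package is already installed on the node:

-- Make the gssapi authentication plugin available
INSTALL SONAME 'auth_gssapi';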
For example, you could execute the following to create the user account in MariaDB:
To configure wsrep_sst_auth, set the following in a server option group in an option file prior to starting up a node:
When mariadb-backup is used to create the backup for the SST on the donor node, mariadb-backup briefly requires a system-wide lock at the end of the backup. This is done with BACKUP STAGE BLOCK_COMMIT.
If a specific node in your cluster is acting as the primary node by receiving all of the application's write traffic, then this node should not usually be used as the donor node, because the system-wide lock could interfere with the application. In this case, you can define one or more preferred donor nodes by setting the wsrep_sst_donor system variable.
For example, let's say that we have a 5-node cluster with the nodes node1, node2, node3, node4, and node5, and let's say that node1 is acting as the primary node. The preferred donor nodes for node2 could be configured by setting the following in a server option group in an option file prior to starting up a node:
The trailing comma tells the server to allow any other node as donor when the preferred donors are not available. Therefore, if node1 is the only node left in the cluster, the trailing comma allows it to be used as the donor node.
During the SST process, the donor node uses socat to stream the backup to the joiner node. Then the joiner node prepares the backup before restoring it. The socat utility must be installed on both the donor node and the joiner node in order for this to work. Otherwise, the MariaDB error log will contain an error like:
On RHEL/CentOS, socat can be installed from the EPEL repository.
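For example, a sketch for RHEL/CentOS-style systems (repository setup and package manager may differ in your environment):

# Enable EPEL, then install socat on every donor and joiner node
sudo yum install epel-release
sudo yum install socat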
This SST method supports three different TLS methods. The specific method can be selected by setting the encrypt option in the [sst] section of the MariaDB configuration file. The options are:
TLS using OpenSSL encryption built into socat (encrypt=2)
TLS using OpenSSL encryption with Galera-compatible certificates and keys (encrypt=3)
TLS using OpenSSL encryption with standard MySQL/MariaDB SSL certificates (encrypt=4)
Note that encrypt=1 refers to a TLS encryption method that has been deprecated and removed.
To generate keys compatible with this encryption method (encrypt=2), proceed as follows.
First, generate the keys and certificates:
On some systems, you may also have to add dhparams to the certificate:
Next, copy the certificate and keys to all nodes in the cluster.
When done, configure the following on all nodes in the cluster:
Make sure to replace the paths with whatever is relevant on your system. This should allow your SSTs to be encrypted.
To generate keys compatible with this encryption method (encrypt=3), proceed as follows.
First, generate the keys and certificates:
Next, copy the certificate and keys to all nodes in the cluster.
When done, configure the following on all nodes in the cluster:
Make sure to replace the paths with whatever is relevant on your system. This should allow your SSTs to be encrypted.
The mariadb-backup SST method has its own logging outside of the MariaDB Server logging.
Logging for mariadb-backup SSTs works the following way.
By default, on the donor node, it logs to mariadb-backup.backup.log. This log file is located in the datadir.
By default, on the joiner node, it logs to mariadb-backup.prepare.log and mariadb-backup.move.log. These log files are also located in the datadir.
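For example, to watch the donor-side log while an SST is running, assuming the default datadir of /var/lib/mysql:

sudo tail -f /var/lib/mysql/mariadb-backup.backup.log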
By default, before a new SST is started, existing mariadb-backup SST log files are compressed and moved to /tmp/sst_log_archive. This behavior can be disabled by setting sst-log-archive=0 in the [sst] option group in an option file. Similarly, the archive directory can be changed by setting sst-log-archive-dir:
You can redirect the SST logs to the syslog instead by setting the following in the [sst] option group in an option file:
You can also redirect the SST logs to the syslog by setting the following in the [mysqld_safe] option group in an option file:
If you are performing SSTs with IPv6 addresses, then the socat utility needs to be passed the pf=ip6 option. This can be done by setting the sockopt option in the [sst] option group in an option file:
If Galera Cluster's automatic SSTs repeatedly fail, it can be helpful to perform a "manual SST" instead.
Reference of [sst] options for the mariabackup SST method:

| Option | Default | Description |
|---|---|---|
| streamfmt | mbstream | Specifies the backup streaming format. mbstream is the native format for mariadb-backup. |
| transferfmt | socat | Defines the network utility for data transfer. |
| sockopt | | A string of socket options passed to the socat utility. |
| rlimit | | Throttles the data transfer rate in bytes per second. Supports K, M, and G suffixes. |
| compressor | | The command-line string for compressing the data stream on the donor (e.g., "lz4 -z"). |
| decompressor | | The command-line string for decompressing the data stream on the joiner (e.g., "lz4 -d"). |
| wsrep-sst-auth | | The authentication string in user:password format. The user requires RELOAD, PROCESS, LOCK TABLES, and REPLICATION CLIENT privileges. |
| tcert | | Path to the TLS certificate file for securing the transfer. |
| tkey | | Path to the TLS private key file. |
| tca | | Path to the TLS Certificate Authority (CA) file. |
| progress | | Set to 1 to show transfer progress (requires the pv utility). |
| sst-initial-timeout | 300 | Timeout in seconds for the initial connection. |
| sst-log-archive | 1 | Set to 1 to archive the previous SST log. |
| cpat | | A space-separated list of extra files/directories to copy from donor to joiner. |
| use-extra | 0 | Must be set to 1 to enable pass-through functionality. |
SET GLOBAL wsrep_sst_method='mariabackup';

[mariadb]
...
wsrep_sst_method = mariabackup

[sst]
# Enable pass-through functionality
use-extra=1
# mariadb-backup native options
encrypt=AES256
encrypt-key-file=/etc/mysql/encrypt/keyfile.key
compress-threads=4

[mariadb]
...
# Do not set wsrep_sst_auth to use automatic user management
# wsrep_sst_auth =

SET GLOBAL wsrep_sst_auth = 'mariadbbackup:mypassword';

[mariadb]
...
wsrep_sst_auth = mariadbbackup:mypassword

[mariadb]
...
wsrep_sst_auth = mariadbbackup:

CREATE USER 'mariadbbackup'@'localhost' IDENTIFIED BY 'mypassword';
GRANT RELOAD, PROCESS, LOCK TABLES,
BINLOG MONITOR ON *.* TO 'mariadbbackup'@'localhost';

CREATE USER 'mysql'@'localhost' IDENTIFIED VIA unix_socket;
GRANT RELOAD, PROCESS, LOCK TABLES,
REPLICATION CLIENT ON *.* TO 'mysql'@'localhost';

[mariadb]
...
wsrep_sst_auth = mysql:

CREATE USER 'mariadbbackup'@'localhost' IDENTIFIED VIA gssapi;
GRANT RELOAD, PROCESS, LOCK TABLES,
BINLOG MONITOR ON *.* TO 'mariadbbackup'@'localhost';

[mariadb]
...
wsrep_sst_auth = mariadbbackup:

[mariadb]
...
wsrep_sst_donor=node3,node4,node5,

WSREP_SST: [ERROR] socat not found in path: /usr/sbin:/sbin:/usr//bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin (20180122 14:55:32.993)

FILENAME=sst
openssl genrsa -out $FILENAME.key 1024
openssl req -new -key $FILENAME.key -x509 -days 3653 -out $FILENAME.crt
cat $FILENAME.key $FILENAME.crt >$FILENAME.pem
chmod 600 $FILENAME.key $FILENAME.pem

openssl dhparam -out dhparams.pem 2048
cat dhparams.pem >> sst.pem

[sst]
encrypt=2
tca=/etc/my.cnf.d/certificates/sst.crt
tcert=/etc/my.cnf.d/certificates/sst.pem

# CA
openssl genrsa 2048 > ca-key.pem
openssl req -new -x509 -nodes -days 365000 \
-key ca-key.pem -out ca-cert.pem
# server1
openssl req -newkey rsa:2048 -days 365000 \
-nodes -keyout server1-key.pem -out server1-req.pem
openssl rsa -in server1-key.pem -out server1-key.pem
openssl x509 -req -in server1-req.pem -days 365000 \
-CA ca-cert.pem -CAkey ca-key.pem -set_serial 01 \
-out server1-cert.pem

[sst]
encrypt=3
tkey=/etc/my.cnf.d/certificates/server1-key.pem
tcert=/etc/my.cnf.d/certificates/server1-cert.pem

[sst]
sst-log-archive=1
sst-log-archive-dir=/var/log/mysql/sst/

[sst]
sst-syslog=1

[mysqld_safe]
syslog

[sst]
sockopt=",pf=ip6"

In a MariaDB Galera Cluster, an individual node is considered to have "failed" when it loses communication with the cluster's Primary Component. This can happen for many reasons, including hardware failure, a software crash, loss of network connectivity, or a critical error during a state transfer.
From the perspective of the cluster, a node has failed when the other members can no longer see it. From the perspective of the failed node itself (assuming it hasn't crashed), it has simply lost its connection to the Primary Component and will enter a non-operational state to protect data integrity.
Node failure detection is handled automatically by Galera's group communication system, which uses an Extended Virtual Synchrony (EVS) protocol. This process is controlled by several evs.* parameters in your configuration file.
The cluster determines a node's health based on the last time it received a network packet from that node. The process is as follows:
The cluster periodically checks for inactive nodes, controlled by evs.inactive_check_period.
If a node has not sent a packet within the expected keepalive interval, other nodes begin sending heartbeat beacons to it.
If the node remains silent for the duration of evs.suspect_timeout, the other nodes will mark it as "suspect."
Once all members of the Primary Component agree that a node is suspect, it is declared inactive and evicted from the cluster.
Additionally, if no messages are received from a node for a period greater than evs.inactive_timeout, it is declared failed immediately, regardless of consensus.
A safeguard mechanism ensures the cluster remains operational even if some nodes become unresponsive. If a node is active but overwhelmed, perhaps because it is heavily loaded or swapping, it will be labeled as failed. This ensures that one struggling node doesn't disrupt the entire cluster's functionality.
Within the context of the CAP Theorem (Consistency, Availability, Partition Tolerance), Galera Cluster strongly prioritizes Consistency. This leads to a direct trade-off when configuring the failure detection timeouts, especially on unstable networks such as a WAN.
Setting low values for evs.suspect_timeout allows the cluster to detect a genuinely failed node very quickly, minimizing downtime. However, on an unstable network, this can lead to "false positives," where a temporarily slow node is incorrectly evicted.
Setting higher values makes the cluster more tolerant of transient network issues and slow nodes. However, if a node truly fails, the cluster will remain unavailable for a longer period while it waits for the timeout to expire.
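As a sketch, the relevant timeouts can be raised for higher-latency networks through wsrep_provider_options; the values below are illustrative assumptions, not recommendations:

[mariadb]
...
# Tolerate longer silences before suspecting or evicting a node
wsrep_provider_options="evs.suspect_timeout=PT10S;evs.inactive_timeout=PT30S;evs.keepalive_period=PT3S"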
Recovery from a single node failure is typically automatic. If one node in a cluster with three or more members fails, the rest of the cluster retains quorum and continues to operate. When the failed node comes back online, it will automatically connect to the cluster and initiate a state transfer (IST or SST) to synchronize its data. No data is lost in a single node failure.
A full cluster outage occurs when all nodes shut down or when no remaining group of nodes can form a quorum, leaving no Primary Component. In this scenario, you must manually intervene to safely restart the cluster.
Manual recovery (using grastate.dat): this is the traditional recovery method. You must manually identify the node with the most recent data and force it to become the first node in a new cluster.
Stop all nodes in the cluster.
Identify the most advanced node by checking the seqno value in the grastate.dat file in each node's data directory. The node with the highest seqno is the correct one to start from.
Bootstrap the new Primary Component by starting the MariaDB service on that single advanced node using a special command (e.g., galera_new_cluster).
Start the other nodes normally. They will connect to the new Primary Component and sync their data.
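A minimal sketch of the whole procedure, assuming the default datadir and systemd packaging:

# On each stopped node, check the last committed sequence number
grep seqno /var/lib/mysql/grastate.dat

# On the node with the highest seqno, bootstrap a new Primary Component
sudo galera_new_cluster

# Then start the remaining nodes normally
sudo systemctl start mariadb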
Automatic recovery (pc.recovery): modern versions of Galera Cluster enable the pc.recovery parameter by default. This feature attempts to automate the recovery of the Primary Component.
When pc.recovery is enabled, nodes that were part of the last known Primary Component will save the state of that component to a file on disk called gvwstate.dat. If the entire cluster goes down, it can automatically recover its state once all the nodes from that last saved component achieve connectivity with each other.
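pc.recovery is enabled by default; if you want to set it explicitly, it can be passed through wsrep_provider_options (illustrative):

[mariadb]
...
wsrep_provider_options="pc.recovery=TRUE"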
The gvwstate.dat file is created in the data directory of a node when it is part of a Primary Component and is deleted upon graceful shutdown. It contains the node's own UUID and its view of the other members of the component. An example:
my_uuid: d3124bc8-1605-11e4-aa3d-ab44303c044a
#vwbeg
view_id: 3 0dae1307-1606-11e4-aa94-5255b1455aa0 12
bootstrap: 0
member: 0dae1307-1606-11e4-aa94-5255b1455aa0 1
member: 47bbe2e2-1606-11e4-8593-2a6d8335bc79 1
member: d3124bc8-1605-11e4-aa3d-ab44303c044a 1
#vwend
my_uuid: The UUID of the node that owns this file.
view_id: An identifier for the specific cluster view.
member: The UUIDs of all nodes that were part of this saved Primary Component.
Avoid manually editing the gvwstate.dat file unless absolutely necessary. Doing so may cause data inconsistency or prevent the cluster from starting. This action should only be considered in critical recovery situations.
In the rare case that you need to force a specific set of nodes to form a new Primary Component, you can manually edit the gvwstate.dat file on each of those nodes. By ensuring that each node's file lists itself and all other desired members in the member fields, you can force them to recognize each other and form a new component when you start them.
A node failure can also occur if a State Snapshot Transfer (SST) is interrupted. This will cause the receiving node (the "joiner") to abort its startup process. To recover, simply restart the MariaDB service on the failed joiner node.
This page is licensed: CC BY-SA / Gnu FDL