Fault Tolerance with MariaDB Xpand
Xpand provides fault tolerance by maintaining multiple copies of data across the cluster. This enables a cluster to lose nodes or zones without data loss and to automatically resume operations.
Built-in Fault Tolerance
By default, Xpand is configured to tolerate a single node failure and automatically maintains 2 copies (replicas) of all data. As long as the cluster has sufficient replicas and a quorum of nodes is available, the cluster can lose a node without experiencing any data loss. Clusters with zones configured can lose a single zone.
The default settings for fault tolerance are generally acceptable for most clusters.
Zones and Fault Tolerance
Xpand can be configured to be zone-aware. A zone is an arbitrary grouping of Xpand nodes. Different zones could represent different availability zones in a cloud environment, different server racks, different network switches, and/or different power sources. Different zones should not be used to represent different regions, because the distance between regions tends to cause higher network latency, which can negatively affect performance.
To configure Xpand to be zone-aware, you can configure the zone for each node using the ALTER CLUSTER .. ZONE statement.
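For example, a five-node cluster spread across three zones might be configured like this (the node IDs and zone numbers are illustrative; substitute the values for your deployment):

```sql
-- Assign each node to a zone (node IDs and zone numbers are examples)
ALTER CLUSTER 1 ZONE 1;
ALTER CLUSTER 2 ZONE 1;
ALTER CLUSTER 3 ZONE 2;
ALTER CLUSTER 4 ZONE 2;
ALTER CLUSTER 5 ZONE 3;
```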
When Xpand is configured to use zones, Xpand's default zone-aware configuration can tolerate a single zone failure without experiencing any data loss. However, Xpand can be configured to tolerate more failures by setting MAX_FAILURES.
For additional information, see "Zones with MariaDB Xpand".
Setting up a disaster recovery (DR) site for your Xpand cluster allows you to recover from catastrophic failures. A secondary Xpand cluster for DR also allows for easier transition to new releases. For information about the replication configurations supported by Xpand, see "Configure Replication with MariaDB Xpand".
MariaDB Xpand's fault tolerance is configurable. By default, MariaDB Xpand is configured to tolerate a single node failure or a single zone failure. If the default configuration does not provide sufficient fault tolerance for your needs, Xpand can be configured to provide even more fault tolerance.
To configure Xpand to provide a higher level of fault tolerance, you can configure the maximum number of node or zone failures using the ALTER CLUSTER SET MAX_FAILURES statement.
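For example, the following statement raises the tolerance to two simultaneous failures (the value shown is illustrative):

```sql
-- Tolerate up to 2 simultaneous node (or zone) failures;
-- Xpand then maintains 3 replicas of each slice
ALTER CLUSTER SET MAX_FAILURES = 2;
```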
When the maximum number of failures is increased, Xpand maintains additional replicas for each slice, ensuring that Xpand can handle a greater simultaneous loss of nodes or zones without experiencing data loss.
When Xpand is configured to maintain additional physical replicas, Xpand's overhead increases for updates to logical slices, because each additional replica is updated synchronously when the slice is updated. Each additional replica requires the transaction to perform:
Additional network communication to transfer the updated data to the node hosting the replica
Additional disk writes to update the replica on disk
In some cases, the additional overhead can decrease throughput and increase commit latency, which can have negative effects for the application. If you choose to configure Xpand to maintain additional replicas, it is recommended to perform performance testing.
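The relationship between the configured failure tolerance and per-write overhead can be sketched as a simplified model (this is an illustration of the fan-out described above, not Xpand internals):

```python
def replicas_required(max_failures: int) -> int:
    """Surviving max_failures simultaneous node/zone losses without
    data loss requires max_failures + 1 replicas of each slice."""
    return max_failures + 1

def write_fanout(max_failures: int) -> int:
    """Each replica is updated synchronously, so every logical write
    fans out into one network transfer and one disk write per replica."""
    return replicas_required(max_failures)

print(write_fanout(1))  # default single-failure tolerance: 2 replicas
print(write_fanout(2))  # tolerating 2 failures: 3 synchronous writes
```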
For additional information, see "MAX_FAILURES".
Effects of a Node or Zone Failure
When Xpand experiences a node or zone failure:
The failed node or zone can no longer communicate with the other nodes, so it fails a heartbeat check.
A short group change occurs, during which Xpand updates membership. Once this is complete, Xpand resumes processing transactions.
The Rebalancer starts a timer that counts down from the value of the global rebalancer_reprotect_queue_interval_s system variable (default = 10 minutes).
The Rebalancer establishes a reprotect queue (Recovery Queue) that tracks all pending changes for the failed node's or zone's data. The reprotect queue is needed only if the failure is temporary.
The next steps depend on whether the failed node or zone returns before the timer reaches rebalancer_reprotect_queue_interval_s:
If the failed node or zone returns within the interval:
The Rebalancer applies the transactions in the reprotect queue to the returned node or zone.
Operations resume normally.
If the failed node or zone does not return within the interval:
The Rebalancer discards the queued transactions.
The Rebalancer reprotects slices from the failed node or zone by creating new replicas to replace the ones that were lost. If the failed node(s) contained any ranking replicas, Xpand assigns that role to another replica.
When the reprotect process completes, Xpand sends a message via Email Alerts indicating that full protection has been restored, and writes a corresponding entry to its logs.
The failed/unavailable node(s) can be safely removed with ALTER CLUSTER DROP.
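The failure-handling sequence above can be sketched as pseudologic (a simplified model for illustration only; the function and variable names are invented, not Xpand internals):

```python
REPROTECT_QUEUE_INTERVAL = 10 * 60  # seconds; mirrors the 10-minute default

def handle_failure(returns_after: float,
                   interval: float = REPROTECT_QUEUE_INTERVAL) -> str:
    """Simplified model of the Rebalancer's decision after a node/zone failure.

    returns_after: seconds until the failed node or zone rejoins
                   (float('inf') if it never returns).
    """
    reprotect_queue = ["pending change 1", "pending change 2"]  # queued writes
    if returns_after <= interval:
        # Node returned within the interval: replay queued changes onto it.
        return f"apply {len(reprotect_queue)} queued changes to returned node"
    # Timer expired: discard the queue and rebuild lost replicas elsewhere.
    reprotect_queue.clear()
    return "discard queue and reprotect slices on remaining nodes"

print(handle_failure(120))           # node came back after 2 minutes
print(handle_failure(float("inf")))  # node never returned
```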
Group Change Effects on Transactions
If a transaction is interrupted by a group change or encounters a non-fatal error, Xpand automatically retries the transaction in some cases.
Transactions are only retried if the global value of the autoretry system variable is set to true.
The following types of transactions can be retried:
A single-statement transaction executed with autocommit = 1
The first statement in an explicit transaction
If Xpand retries a transaction and the transaction fails, Xpand returns an error to the application.
The following types of transactions cannot be retried:
Subsequent statements in an explicit transaction
If Xpand can't retry the transaction and your application is connecting to MaxScale's Read/Write Split (readwritesplit) router, you can configure MaxScale to automatically retry some transactions by configuring the delayed_retry parameter.
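For example, a readwritesplit service could enable delayed retry like this (the service name, server names, and credentials are placeholders; check the MaxScale documentation for your version):

```ini
[Split-Service]
type=service
router=readwritesplit
servers=xpand1,xpand2,xpand3
user=maxscale_user
password=maxscale_pwd
delayed_retry=true
delayed_retry_timeout=10s
```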
Group Change Effects on Connections
When a group change occurs, connections can be affected:
If a connection was opened to a node still in quorum, the connection will remain open after the new group is formed with the available nodes.
If a connection was opened to a node that is no longer in quorum, the connection will be lost.
When the connection is lost, there are a couple of ways to automatically re-establish a connection to a valid node:
If your application is connecting to MaxScale's Read/Write Split (readwritesplit) router, you can configure MaxScale to automatically reconnect by configuring the master_reconnection parameter.
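For example, a readwritesplit service could enable reconnection like this (the service name, server names, and credentials are placeholders; check the MaxScale documentation for your version):

```ini
[Split-Service]
type=service
router=readwritesplit
servers=xpand1,xpand2,xpand3
user=maxscale_user
password=maxscale_pwd
master_reconnection=true
```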
If your application is using a MariaDB Connector that supports automatically reconnecting, you can enable that feature in the connector.
If you are using MariaDB Connector/C, the MYSQL_OPT_RECONNECT option can be set with the mysql_optionsv() function:
/* enable auto-reconnect */
mysql_optionsv(mysql, MYSQL_OPT_RECONNECT, (void *)"1");
If you are using MariaDB Connector/J, the autoReconnect parameter can be set for the connection:
Connection connection = DriverManager.getConnection("jdbc:mariadb://192.168.0.1/test?user=test_user&password=myPassword&autoReconnect=true");
If you are using MariaDB Connector/Python, the auto_reconnect parameter can be set for the connection:
conn = mariadb.connect(user="test_user", password="myPassword", host="192.168.0.1", port=3306, auto_reconnect=True)
Xpand was built for fault tolerance and high availability. For additional information, see the following sections: