Group Changes with MariaDB Xpand 6.1

Overview

MariaDB Xpand performs a group change when the cluster detects a change in cluster membership.

Compatibility

Information provided here applies to:

  • MariaDB Xpand 6.1

What is an Xpand Group?

Xpand uses a distributed group membership protocol to maintain the static set of all nodes known to the cluster and to check that those nodes maintain active communication with one another. Xpand refers to this as a Group.

What happens when nodes are added or removed from an Xpand cluster?

When the set of nodes changes, the group of nodes in the cluster changes; this is referred to as a group change. Depending on the specific state of the cluster and the operation being performed, Xpand performs either a full group change or an online group change:

  • During an online group change, the cluster remains fully online and available.

  • During a full group change, there is a brief period of unavailability, typically lasting a few seconds.

In all scenarios, Xpand is designed to provide fault tolerance and high availability, and will automatically recover as long as there is a quorum of nodes.
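
To confirm that a quorum is available and to see which nodes the cluster currently recognizes, you can query Xpand's system schema. This is a minimal sketch; system.membership and system.nodeinfo are table names from Xpand's system schema, but the available columns vary by version, so verify against your release:

sql> SELECT * FROM system.membership;
sql> SELECT * FROM system.nodeinfo;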

Online Group Changes

In Xpand 6.1+, when a node is added to a cluster, Xpand attempts to perform an online group change. When nodes are added as part of an online group change, the rest of the cluster remains online and available for the duration of the group change.
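
For example, a node is typically added with ALTER CLUSTER. The following is a minimal sketch; the IP address is a placeholder for the new node's back-end address:

sql> ALTER CLUSTER ADD '10.2.14.214';

Once the new node has joined the group, the Rebalancer copies data to it in the background while it remains in the "late" sub-state, as described below.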

During the online group change:

  • The new node(s) are marked with the "late" sub-state

  • The Rebalancer copies data to the new node(s)

  • Existing nodes continue processing transactions

  • The new node(s) are fully able to service transactions and queries

  • MaxScale can detect the new nodes and begin routing connections to the new nodes

  • The new node(s) do not participate in some internal operations, such as the sequence manager and lock manager. The lockman_max_locks and lockman_max_transaction_locks system variables apply to non-late nodes only, and late nodes do not manage any locks (see the example after this list).
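
To inspect the lock manager limits that apply to the non-late nodes, a standard SHOW statement can be used. This is a sketch built from the variable names given above:

sql> SHOW GLOBAL VARIABLES LIKE 'lockman_max%';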

Sample output of clx status for a cluster showing some late nodes after an online group change:

$ clx status
nid |  Hostname | Status | Substate |  IP Address  | TPS |       Used      |   Total
----+-----------+--------+----------+--------------+-----+-----------------+--------
  1 |   host108 |    OK  |   normal |   10.2.13.91 |  29 |  476.5M (0.06%) |  761.9G
  2 |   host042 |    OK  |   normal |  10.2.15.119 |   1 |  457.8M (0.06%) |  761.9G
  3 |   host192 |    OK  |   normal |   10.2.13.92 |   0 |  470.3M (0.06%) |  761.9G
  4 |   host184 |    OK  |   normal |  10.2.12.137 |   2 |  446.8M (0.06%) |  761.9G
  5 |   host171 |    OK  |     late |  10.2.14.214 |   2 |  526.8M (0.07%) |  761.9G
  6 |   host106 |    OK  |   normal |  10.2.12.170 |   0 |  453.8M (0.06%) |  761.9G
----+-----------+--------+----------+--------------+-----+-----------------+--------
                                                      34 |    2.8G (0.06%) |    4.5T

Full Group Changes

A full group change occurs when:

  • A node is removed from the cluster (for example, as part of a scale-down operation, or due to an unexpected node or zone failure)

  • The number of nodes being added to the cluster is greater than or equal to the number of non-late nodes currently in the cluster

  • The COORDINATE option is supplied to ALTER CLUSTER (see the example after this list)

  • The cluster is restarted (via clx dbrestart)
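
For example, forcing a coordinated (full) group change while adding a node might look like the following sketch. The IP address is a placeholder, and the exact placement of the COORDINATE keyword should be verified against the ALTER CLUSTER reference for your Xpand version:

sql> ALTER CLUSTER COORDINATE ADD '10.2.14.215';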

What Happens During a Full Group Change?

With Xpand, full group changes are relatively short (generally measured in seconds), though the duration of each full group change depends on factors such as the number of containers, workload, and deployment size. The underlying steps of a full group change are the same, regardless of deployment size and workload.

Cluster Pauses Processing and Performs Internal Operations

When there is a full group change, the deployment pauses all processing while it determines whether a quorum of nodes is available and performs initialization of various subsystems, including:

  • Synchronizing global deployment state, including internal system catalogs and global variables.

  • Resolving (or re-resolving) in-process transactions, including rolling back transactions that were interrupted by the group change.

  • Invalidating or rebuilding internal caches, such as the Query Plan Cache.

  • Creating recovery queues for downed replicas, or "flipping" queues that are no longer needed.

  • Performing checks for licensing and Resiliency.

  • Resizing device files if necessary.

If any of the following processes are running when a full group change occurs, they will be impacted as described in the sections below.

Queries (DML and DDL)

If a transaction or statement is interrupted by a full group change before it has a chance to commit, it will receive an error.

If the autoretry global variable is enabled and a transaction was submitted with autocommit enabled, the database will automatically retry it. If the retried statements cannot be executed successfully, an error will be surfaced to the application. If MaxScale is in use, it can also retry queries.
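
To check or change the automatic retry behavior, standard variable statements apply. This is a sketch using the autoretry global described above:

sql> SHOW GLOBAL VARIABLES LIKE 'autoretry';
sql> SET GLOBAL autoretry = true;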

Replication

Replication processes will automatically restart at the proper binlog location following a full group change.
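
To verify that replication has resumed from the expected binlog position after a group change, the MySQL-compatible status command can be used. This is a sketch; Xpand replication slaves are named, so your version may expect the slave name as an argument:

sql> SHOW SLAVE STATUS;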

Other Connections

Connections to nodes that are still in quorum will be maintained and will not experience any errors. Connections to non-communicative nodes will be lost.

Log Messages

Once an online or full group change is complete, a new group is formed, and the clustrix.log will contain an informational message that includes details of the new group.

The following example shows a deployment that has re-grouped without node 5 and then resumed normal operations:

[INFO] Node 1 has new group effffe: { 1-4 down: 5 }

Special Considerations

In-Memory tables are also impacted by full group changes. For additional information, see "In-Memory Tables with MariaDB Xpand".