MariaDB Xpand Group Changes

Overview

MariaDB Xpand uses a distributed group membership protocol to maintain the static set of all nodes known to the deployment and to verify that the nodes maintain active communication with each other. Xpand refers to this as a Group.

When the set of nodes changes, a Group Change occurs. During a Group Change, Xpand performs tasks to ensure:

  • Data consistency

  • Data availability

  • Effective query distribution

The system variables, system tables, and ALTER CLUSTER statements referenced on this page are only available on the Xpand nodes. When using the Xpand Storage Engine topology, you need to connect to an Xpand node to use these features.

When Does a Cluster Experience a Group Change?

When Node(s) are Added to a Cluster

A Group Change occurs in conjunction with Scaling Out. Following the Group Change, the Rebalancer will work in the background to move slices to the new node(s). You may notice a slight degradation of performance during that time.

When Node(s) Leave the Cluster

When reducing a deployment’s capacity using the Scale-In procedure, the ALTER CLUSTER REFORM command, used to remove nodes, will invoke a Group Change.

A deployment also experiences a Group Change if node(s) are dropped using the emergency procedure of ALTER CLUSTER DROP.

Additionally, several unscheduled events can cause a Group Change:

  • A deployment experiences unexpected node failure(s) due to hardware failure, network failure, or kernel panic.

  • One or more nodes cannot be reached during a regular heartbeat check of the deployment.

Following a node loss, if the node was not previously soft-failed, the Rebalancer will automatically work to reprotect all data and ensure all data has sufficient copies throughout the deployment. You may notice performance degradation during the reprotect process.

What Happens During a Group Change?

If Xpand detects a change in its group, it will recover automatically as long as a quorum of nodes is available. Your deployment will experience a brief period during which transactions are suspended while the group is being reformed and the consistency of the database is ensured. Connections from applications to surviving nodes will remain open, but transactions and queries on those connections will be temporarily paused.

The deployment can recover from multiple simultaneous node failures if the total number of failed nodes does not exceed the value configured for MAX_FAILURES.
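The two conditions above (a quorum of surviving nodes, and no more failed nodes than MAX_FAILURES) can be combined into a quick sanity check. The following is a minimal sketch, not Xpand's implementation: the function names are hypothetical, and it assumes that "quorum" means a strict majority of all known nodes.

```python
def has_quorum(total_nodes: int, failed_nodes: int) -> bool:
    """Assumption: a quorum is a strict majority of all known nodes."""
    surviving = total_nodes - failed_nodes
    return surviving > total_nodes / 2


def can_recover(total_nodes: int, failed_nodes: int, max_failures: int) -> bool:
    """Return True if the survivors form a quorum and the number of
    failed nodes does not exceed the configured MAX_FAILURES value."""
    if not 0 <= failed_nodes <= total_nodes:
        raise ValueError("failed_nodes must be between 0 and total_nodes")
    return has_quorum(total_nodes, failed_nodes) and failed_nodes <= max_failures
```

Under these assumptions, a five-node deployment with MAX_FAILURES set to 1 passes the check after losing one node (`can_recover(5, 1, 1)`), but not after losing two nodes at once (`can_recover(5, 2, 1)`).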

Details of a Group Change

Group Changes are relatively short (generally measured in seconds), though the duration of each Group Change depends on factors such as the number of containers, workload, and deployment size. The underlying steps of a Group Change are the same, regardless of deployment size and workload.

Cluster Pauses Processing and Performs Internal Operations

When there is a Group Change, the deployment pauses all processing and determines whether a quorum of nodes is available. If so, Xpand performs a series of internal operations in preparation for the new group. Together these operations may take a few seconds or tens of seconds, depending on how large the deployment is, how large the database is, and how many transactions were in process when the Group Change occurred. These operations, which include the following, guarantee the consistency of the database despite the loss of a member of the deployment:

  • Initializing subsystems such as flow control and the Rebalancer.

  • Synchronizing global deployment state, including internal system catalogs and global variables.

  • Resolving (or re-resolving) in-process transactions, including rolling back transactions that were interrupted by the Group Change.

  • Invalidating or rebuilding internal caches, such as the Query Plan Cache.

  • Creating recovery queues for downed replicas, or "flipping" queues that are no longer needed.

  • Performing checks for licensing and Resiliency.

  • Resizing device files if necessary.

Cluster Forms New Group

Once the deployment is ready to resume operations, a new group is formed and the clustrix.log will contain an informational message that includes details of the new group:

[INFO] Node 1 has new group effffe: { 1-4 down: 5 }

This example shows a deployment that has re-grouped without node 5. The database then resumes its operations.

What Happens to Processes That Were Running?

If any of the following processes were running when a Group Change occurred, they will be impacted as shown:

Queries (DML and DDL)

If a transaction or statement is interrupted by a Group Change before it has a chance to commit, it will receive an error.

If the global variable autoretry is true and a transaction was submitted with autocommit enabled, the database will automatically retry it. If the retried statements cannot be executed successfully, the application will receive another error.
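Transactions that are not retried automatically, such as those submitted without autocommit, need to be retried by the application. The following is a minimal sketch of one way to do that. `GroupChangeError` is a hypothetical placeholder for whatever exception your database driver raises when a statement is interrupted, and the retry count and delay are arbitrary illustrative values.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class GroupChangeError(Exception):
    """Hypothetical stand-in for the driver error raised when a
    statement is interrupted by a Group Change."""


def run_with_retry(
    transaction: Callable[[], T],
    attempts: int = 3,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `transaction` from the start, retrying if it is interrupted.

    `transaction` must re-execute the whole unit of work (begin,
    statements, commit) each time it is called.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except GroupChangeError:
            if attempt == attempts:
                raise  # Out of retries: surface the error to the caller.
            sleep(delay_seconds)  # Give the group time to re-form.
    raise AssertionError("unreachable")
```

Because a Group Change typically lasts seconds, a short fixed delay between attempts is used here; a production application might prefer exponential backoff.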

Replication

Replication processes will automatically restart at the proper binlog location following a Group Change.

Other Connections

Connections to nodes that are still in quorum will be maintained and will not experience any errors. Connections to non-communicative nodes will be lost.

Special Considerations

In-Memory tables may be impacted by a Group Change.