Scale-In with MariaDB Xpand 6.1

Overview

MariaDB Xpand 6.1 can be scaled in, which reduces the number of Xpand nodes in the cluster.

Compatibility

Information provided here applies to:

  • MariaDB Xpand 6.1

Use Cases

Some use cases for scale-in:

  • To reduce operating costs following a peak event (e.g., after Cyber Monday).

  • To allocate servers for other purposes.

  • To remove failing hardware. (See ALTER CLUSTER DROP to drop a permanently failed node.)

Review Target Cluster Configuration

  • MariaDB Xpand requires a minimum of three nodes to support production systems. Going from three or more nodes to a single node is not supported via the steps outlined on this page.

  • When zones are configured, Xpand requires a minimum of three zones.

  • For clusters deployed in zones, Xpand requires an equal number of nodes in each zone.

  • Ensure that the target cluster has sufficient disk space to hold the data moved off the removed nodes. See Allocating Disk Space for Fault Tolerance and Availability.
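
The node-count and zone requirements above can be checked before starting. The following is a minimal sketch, not an official procedure. It assumes that system.nodeinfo exposes a zone column on clusters where zones are configured; confirm this against the output of SELECT * FROM system.nodeinfo on your own cluster.

```sql
-- Total nodes currently in the cluster.
-- At least three must remain after the planned removals.
SELECT COUNT(*) AS node_count
FROM system.nodeinfo;

-- Nodes per zone (ASSUMPTION: a zone column exists in system.nodeinfo).
-- After scale-in, every zone should still hold the same number of nodes.
SELECT zone, COUNT(*) AS nodes_in_zone
FROM system.nodeinfo
GROUP BY zone
ORDER BY zone;
```

Subtract the nodes you plan to remove from these counts to see whether the target configuration still meets the requirements.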

Scale-In

Step 1: Obtain Node IDs

On any Xpand node, obtain the node ID for each Xpand node that will be removed from the cluster by querying the system.nodeinfo system table:

SELECT *
FROM system.nodeinfo
ORDER BY nodeid;
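
Because SELECT * returns many columns, a narrower query can make it easier to match node IDs to servers. This sketch assumes that system.nodeinfo includes hostname and iface_ip columns; if your version names them differently, use the column names shown by the SELECT * query above.

```sql
-- List each node ID alongside identifying details.
-- ASSUMPTION: hostname and iface_ip are columns of system.nodeinfo.
SELECT nodeid, hostname, iface_ip
FROM system.nodeinfo
ORDER BY nodeid;
```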

Step 2: Softfail Nodes

On any Xpand node, softfail one or more Xpand nodes by executing ALTER CLUSTER SOFTFAIL and specifying a comma-separated list of node IDs:

ALTER CLUSTER SOFTFAIL nodeid [, nodeid] ...

When one or more nodes are softfailed, Xpand directs the Rebalancer to move all data from the softfailed nodes to other nodes within the cluster. The Rebalancer performs this task in the background without interrupting queries.
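
As a concrete illustration, the statement below softfails two nodes. The node IDs 9 and 10 are example values matching the sample output shown in the following steps; substitute the IDs obtained in Step 1.

```sql
-- Softfail nodes 9 and 10 (example IDs only).
ALTER CLUSTER SOFTFAIL 9, 10;
```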

Step 3: Verify Softfailed Nodes

On any Xpand node, verify that the specified Xpand nodes are softfailed by querying the system.softfailed_nodes system table:

SELECT *
FROM system.softfailed_nodes;
+--------+-------+
| nodeid | ready |
+--------+-------+
|      9 |     0 |
|     10 |     0 |
+--------+-------+

When the ready column for a node is 0, the node is being softfailed, but it is not yet ready for removal.

Step 4: If Binary Logs Are Enabled, Reform the Cluster

If your Xpand cluster does not have binary logs, skip this step and continue with "Step 5: Wait for Softfail".

If your Xpand cluster has binary logs, the cluster should be reformed so that Xpand directs the Rebalancer to reprotect the binary logs.

On any Xpand node, reform the cluster by executing ALTER CLUSTER REFORM:

ALTER CLUSTER REFORM;

This will initiate a brief interruption of service while the cluster is reformed.

Step 5: Wait for Softfail

On any Xpand node, monitor the progress of the softfail operation by querying the system.softfailed_nodes system table:

SELECT *
FROM system.softfailed_nodes;
+--------+-------+
| nodeid | ready |
+--------+-------+
|      9 |     1 |
|     10 |     1 |
+--------+-------+

When the ready column for a node is 1, the softfail is complete for that node and it is ready for removal. You can then proceed to the next step.
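
When several nodes are being removed, it can be convenient to list only the nodes that are still in progress. This sketch uses only the nodeid and ready columns shown in the output above.

```sql
-- List softfailed nodes that are not yet ready for removal.
-- An empty result means every softfailed node reports ready = 1.
SELECT nodeid
FROM system.softfailed_nodes
WHERE ready = 0;
```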

Step 6: Reform Cluster Again

On any Xpand node, reform the cluster again to remove the softfailed nodes by executing ALTER CLUSTER REFORM:

ALTER CLUSTER REFORM;

This will initiate a brief interruption of service while the softfailed nodes are removed from the cluster and the cluster is reformed.

Optional: Cancel the Softfail Operation

On any Xpand node, a SOFTFAIL operation can be canceled before it completes by executing ALTER CLUSTER UNSOFTFAIL and specifying the node IDs:

ALTER CLUSTER UNSOFTFAIL nodeid [, nodeid] ...

The cluster will be restored to its prior configuration.
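
For example, to cancel the softfail of the two nodes used in the earlier illustrations (9 and 10 are example IDs, not values to copy as-is):

```sql
-- Cancel the in-progress softfail for nodes 9 and 10 (example IDs only).
ALTER CLUSTER UNSOFTFAIL 9, 10;

-- Re-check the current softfail state afterward.
SELECT *
FROM system.softfailed_nodes;
```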

Troubleshooting

The SOFTFAIL operation raises an error when certain issues occur, including:

  • Xpand does not have sufficient storage space to rebalance the data stored on the nodes

  • Xpand does not have enough nodes to protect the data