Scale-In with MariaDB Xpand
Overview
MariaDB Xpand can be scaled in, which reduces the number of Xpand nodes in the cluster.
Use Cases
Some use cases for scale-in:
To reduce operating costs following a peak event (e.g., after Cyber Monday).
To allocate servers for other purposes.
To remove failing hardware. (See ALTER CLUSTER DROP to drop a permanently failed node.)
Review Target Cluster Configuration
MariaDB Xpand requires a minimum of three nodes to support production systems. Going from three or more nodes to a single node is not supported via the steps outlined on this page.
When zones are configured, Xpand requires a minimum of three zones.
For clusters deployed in zones, Xpand requires an equal number of nodes in each zone.
Ensure that the target cluster has sufficient space. See Allocating Disk Space for Fault Tolerance and Availability.
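The node and zone rules above can be checked before any node is softfailed. The following is a minimal sketch, not an Xpand tool: the function name check_target_cluster and the zone_by_node mapping are hypothetical, and the caller is assumed to have collected each node's zone beforehand. Disk space is not checked here.

```python
from collections import Counter

MIN_NODES = 3  # minimum node count for production, per this page
MIN_ZONES = 3  # minimum zone count when zones are configured, per this page


def check_target_cluster(zone_by_node, nodes_to_remove):
    """Return a list of problems with the cluster that would remain.

    zone_by_node is a hypothetical mapping of nodeid -> zone label, with
    None as the label for every node when zones are not configured.
    """
    removed = set(nodes_to_remove)
    remaining = {n: z for n, z in zone_by_node.items() if n not in removed}
    problems = []

    if len(remaining) < MIN_NODES:
        problems.append(
            f"only {len(remaining)} node(s) would remain; minimum is {MIN_NODES}"
        )

    zones_configured = any(z is not None for z in zone_by_node.values())
    if zones_configured:
        nodes_per_zone = Counter(remaining.values())
        if len(nodes_per_zone) < MIN_ZONES:
            problems.append(
                f"only {len(nodes_per_zone)} zone(s) would remain; minimum is {MIN_ZONES}"
            )
        if len(set(nodes_per_zone.values())) > 1:
            problems.append(
                f"zones would have unequal node counts: {dict(nodes_per_zone)}"
            )

    return problems
```

An empty list means the planned removal satisfies the node and zone rules. For example, removing one node from each of three two-node zones passes, while removing a single node from one zone is reported as unequal.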
Scale-In
Execute ALTER CLUSTER SOFTFAIL.
Marking a node as softfailed directs the Xpand Rebalancer to move all data from the node(s) specified to others within the cluster. The Rebalancer proceeds in the background while the database continues to serve your ongoing production needs.
If necessary, determine the nodeid assigned to a given IP or hostname using the following query:
SELECT * FROM system.nodeinfo ORDER BY nodeid;
To initiate a SOFTFAIL operation, use ALTER CLUSTER SOFTFAIL:
ALTER CLUSTER SOFTFAIL nodeid [, nodeid] ...
The SOFTFAIL operation will issue an error if there is not sufficient space to complete the operation or if the operation would leave the cluster unable to protect data should an additional node be lost.
To cancel a SOFTFAIL operation before it completes, use ALTER CLUSTER UNSOFTFAIL. Your system will be restored to its prior configuration.
ALTER CLUSTER UNSOFTFAIL nodeid [, nodeid] ...
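When the softfail and cancel statements are issued from a script, the node list has to be turned into the comma-separated form shown in the syntax. The sketch below is one way to do that. The helper name cluster_statement is hypothetical, and the statement is formatted as text from validated integers because it is an assumption that driver parameter binding may not apply to this statement.

```python
def cluster_statement(action, node_ids):
    """Build an ALTER CLUSTER SOFTFAIL or UNSOFTFAIL statement for node_ids."""
    if action not in ("SOFTFAIL", "UNSOFTFAIL"):
        raise ValueError(f"unsupported action: {action!r}")

    ids = list(node_ids)
    if not ids:
        raise ValueError("at least one nodeid is required")

    # Accept only real integers so that nothing else can reach the SQL text.
    for node_id in ids:
        if isinstance(node_id, bool) or not isinstance(node_id, int):
            raise TypeError(f"nodeid must be an integer, got {node_id!r}")

    return f"ALTER CLUSTER {action} " + ", ".join(str(n) for n in ids)
```

For example, cluster_statement("SOFTFAIL", [4, 5]) yields the text ALTER CLUSTER SOFTFAIL 4, 5, which can then be passed to whatever client executes statements against the cluster.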
Monitor the SOFTFAIL operation.
Once a node is marked as softfailed, the Rebalancer moves data off the softfailed node(s). The Rebalancer process runs in the background while foreground processing continues to serve your production workload.
To monitor the progress of the SOFTFAIL operation:
Verify that the node(s) you specified are indeed marked for removal:
SELECT * FROM system.softfailed_nodes;
The system.softfailing_containers table lists the containers that are slated to be moved as part of the SOFTFAIL operation. When the following query returns 0, the data migration is complete:
SELECT COUNT(1) FROM system.softfailing_containers;
The following query shows the list of softfailed node(s) that are ready for removal:
SELECT * FROM system.softfailed_nodes WHERE nodeid NOT IN (SELECT DISTINCT nodeid FROM system.softfailing_containers);
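The monitoring queries above lend themselves to a simple polling loop. The following sketch assumes a Python DB-API 2.0 (PEP 249) cursor that returns rows as tuples and is already connected to the cluster. The function name wait_for_softfail, the 30-second default interval, and the injectable sleep parameter are illustrative choices, not part of Xpand.

```python
import time

PENDING_SQL = "SELECT COUNT(1) FROM system.softfailing_containers"
READY_SQL = (
    "SELECT * FROM system.softfailed_nodes WHERE nodeid NOT IN "
    "(SELECT DISTINCT nodeid FROM system.softfailing_containers)"
)


def wait_for_softfail(cursor, poll_seconds=30, sleep=time.sleep):
    """Poll until no containers remain to be moved, then return the ready nodes."""
    while True:
        cursor.execute(PENDING_SQL)
        (pending,) = cursor.fetchone()  # single-column row holding the count
        if pending == 0:
            break
        sleep(poll_seconds)

    # Migration is complete: list the softfailed nodes that are ready for removal.
    cursor.execute(READY_SQL)
    return cursor.fetchall()
```

Passing sleep as a parameter keeps the loop easy to exercise without a live cluster. In production use, a timeout or a maximum number of polls would be a sensible addition.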
Execute ALTER CLUSTER REFORM.
Once data has been moved off the nodes and there are no more entries in system.softfailing_containers, run ALTER CLUSTER REFORM:
ALTER CLUSTER REFORM;
This causes a brief interruption of service while the cluster is re-formed. If you do not have any binlogs, the softfailed node(s) will be removed from the cluster, and the scale-in operation is complete. If you have binlogs, continue with the steps that follow.
Wait for binlog softfail.
If your cluster has binlogs, the previous ALTER CLUSTER REFORM leaves the softfailed node in the cluster but places it in the LEAVING state, in which it will not be chosen as an acceptor:
INFO dbcore/dbstate.c:292 dbprepare_done(): Running dbstart for membership afffe { 1-3 leaving: 2}
In the meantime, the binlog_commits table is being rebalanced across the non-softfailed nodes. When the following query returns 0, the binlog_commits table is done being rebalanced:
SELECT count(1) FROM system.binlog_commits_segments WHERE softfailed_replicas > 0;
When the binlog_commits table is done being rebalanced, the following log message will appear on all nodes:
INFO dbcore/softfail.ct:27 softfail_node_msg_signal(): softfailing nodes are ready to be removed: 2
Execute ALTER CLUSTER REFORM again.
If there are no more softfailing containers (see step 2, Monitor the SOFTFAIL operation) and if the binlog_commits table is done being rebalanced (see step 4, Wait for binlog softfail), run ALTER CLUSTER REFORM:
ALTER CLUSTER REFORM;
This will remove the softfailed nodes from the cluster.
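The two conditions for the final ALTER CLUSTER REFORM can be combined into a single pre-check. This is a sketch under the same assumptions as any scripted use of these queries: a DB-API 2.0 cursor that returns tuples, and a hypothetical helper name, blockers_for_final_reform. It only reports what is still outstanding and leaves running the REFORM to the operator.

```python
# Count queries taken from this page; a nonzero result means "not ready yet".
CHECKS = {
    "containers still to move": (
        "SELECT COUNT(1) FROM system.softfailing_containers"
    ),
    "binlog segments still rebalancing": (
        "SELECT count(1) FROM system.binlog_commits_segments "
        "WHERE softfailed_replicas > 0"
    ),
}


def blockers_for_final_reform(cursor):
    """Return {description: count} for every check that is still nonzero."""
    blockers = {}
    for description, sql in CHECKS.items():
        cursor.execute(sql)
        (count,) = cursor.fetchone()  # single-column row holding the count
        if count:
            blockers[description] = count
    return blockers
```

An empty result indicates that both queries returned 0, which matches the conditions described for the final REFORM.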