Allocating Disk Space for Fault Tolerance and Availability
A MariaDB Xpand deployment must contain sufficient free disk space to automatically recover from Xpand node or zone failures.
The system variables, system tables, and ALTER CLUSTER statements are only available on the Xpand nodes. When using the Xpand Storage Engine topology, you need to connect to an Xpand node to use these features.
To calculate the maximum amount of disk space that can be utilized while still allowing Xpand to fully reprotect data after a failure, you can use the following formula:
Maximum Disk Utilization % = (Total Xpand Nodes - k) * 80 / Total Xpand Nodes
In the formula above, k represents one of the following (whichever is larger):
MAX_FAILURES global variable (default value = 1)
The total number of Xpand nodes in a zone (if an entire zone were to fail). Refer to ALTER CLUSTER ZONE for more information on how Xpand works in zones.
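The formula can be sketched in a few lines of Python. This is only an illustration: the function name and parameter names are hypothetical and are not part of Xpand, and the sketch does not check whether the surviving Xpand nodes would still form a quorum.

```python
def max_disk_utilization_pct(total_nodes, max_failures=1, nodes_per_zone=0, warn_pct=80.0):
    """Return the maximum disk utilization (%) that still allows full reprotection.

    total_nodes    -- total number of Xpand nodes in the deployment
    max_failures   -- value of the MAX_FAILURES global variable (default 1)
    nodes_per_zone -- number of Xpand nodes in a zone, or 0 if zones are not used
    warn_pct       -- the 80 in the formula; lower it for write-heavy workloads
    """
    # k is the larger of MAX_FAILURES and the number of nodes in a zone.
    k = max(max_failures, nodes_per_zone)
    if k >= total_nodes:
        raise ValueError("k must be smaller than the total number of Xpand nodes")
    return (total_nodes - k) * warn_pct / total_nodes


# A 9-node deployment without zones and MAX_FAILURES = 1.
print(round(max_disk_utilization_pct(9), 2))
```

Passing a smaller `warn_pct` models the more conservative calculation recommended for deployments that fill quickly.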
To examine the disk space usage of your deployment, use clx on one of the Xpand nodes.
If there is not enough free space in your deployment to reprotect data following an Xpand node or zone failure, your deployment will be at risk for data loss or failure if another Xpand node or zone is lost.
80 is the default value for the databasefull_user_warn_percentage threshold. If your applications write data at a rate that fills the deployment aggressively, use a value that is less than 80% in your calculations. This will ensure that data can continue to be written while you are waiting for replacement Xpand node(s) to join the deployment and for data redistribution to complete.
To configure the databasefull_user_warn_percentage threshold, and others related to database space utilization, see ALTER CLUSTER RESIZE DEVICES.
Using this sample chart of pre-calculated thresholds, a 9 Xpand node deployment (not deployed in zones) with MAX_FAILURES = 1 requires that the database not exceed 71.11% capacity to ensure that reprotect actions complete successfully after an Xpand node failure.
* = Not applicable as the remaining number of Xpand nodes will not constitute a quorum.
If the amount of free space in your deployment falls below the amount required to fully reprotect it after an Xpand node or zone failure, Xpand sends an email alert to the list of users configured in Database Alerts. The email includes [WARNING] Insufficient space for reprotection and provides details on the amount of space required.
The same message will also appear in clustrix.log as an error:
ERROR 1 (HY000):  Not enough space to reprotect if another node is lost: 94.4255% usage (without softfailed nodes) is greater than max 80.0000%
You may also encounter this message when soft-failing or removing Xpand nodes.
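If you want to pick this condition out of log output or client errors, a small parser can extract the two percentages. The following is a minimal sketch that assumes the message always has the wording shown in the example above. The pattern and the function name are illustrative and are not part of any Xpand tooling.

```python
import re

# Assumes the message wording matches the example in this section.
REPROTECT_ERROR = re.compile(
    r"Not enough space to reprotect if another node is lost: "
    r"(?P<usage>\d+(?:\.\d+)?)% usage .*?"
    r"is greater than max (?P<max>\d+(?:\.\d+)?)%"
)


def parse_reprotect_error(line):
    """Return (usage_pct, max_pct) as floats, or None if the line does not match."""
    match = REPROTECT_ERROR.search(line)
    if match is None:
        return None
    return float(match.group("usage")), float(match.group("max"))


sample = (
    "ERROR 1 (HY000):  Not enough space to reprotect if another node is lost: "
    "94.4255% usage (without softfailed nodes) is greater than max 80.0000%"
)
print(parse_reprotect_error(sample))
```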