Allocating Disk Space for Fault Tolerance and Availability

A MariaDB Xpand deployment must contain sufficient free disk space to automatically recover from Xpand node or zone failures.

The system variables, system tables, and ALTER CLUSTER statements are only available on the Xpand nodes. When using the Xpand Storage Engine topology, you need to connect to an Xpand node to use these features.

To calculate the maximum amount of disk space that can be utilized while still allowing Xpand to fully reprotect data after a failure, you can use the following formula:

Maximum Disk Utilization % = (Total Xpand Nodes - k) * 80 / Total Xpand Nodes

In the formula above, k is the larger of the following two values:

  • The value of the MAX_FAILURES global variable (default value = 1)

  • The total number of Xpand nodes in a zone (if an entire zone were to fail). Refer to ALTER CLUSTER ZONE for more information on how Xpand works in zones.
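As a minimal sketch, the formula can be expressed as a small Python helper. The function name and its parameters (total_nodes, max_failures, nodes_per_zone, threshold_pct) are illustrative names chosen for this example and are not part of Xpand. Pass nodes_per_zone = 0 for a deployment that does not use zones.

```python
def max_disk_utilization_pct(total_nodes, max_failures=1,
                             nodes_per_zone=0, threshold_pct=80.0):
    """Return the maximum disk utilization (%) that still allows Xpand
    to fully reprotect data after losing k nodes.

    All parameter names are hypothetical; they mirror the terms used in
    the formula above.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    # k is the larger of MAX_FAILURES and the number of nodes in a zone.
    k = max(max_failures, nodes_per_zone)
    if k >= total_nodes:
        raise ValueError("k must be smaller than the total number of nodes")
    return (total_nodes - k) * threshold_pct / total_nodes


# 9 nodes, no zones, MAX_FAILURES = 1
print(f"{max_disk_utilization_pct(9):.2f}%")
# 9 nodes in zones of 3 nodes each: losing a zone removes 3 nodes, so k = 3
print(f"{max_disk_utilization_pct(9, max_failures=1, nodes_per_zone=3):.2f}%")
```

The second call illustrates why zones matter: with the same node count, planning for the loss of a whole zone lowers the usable share of disk space considerably.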

To examine the disk space usage of your deployment, use clx on one of the Xpand nodes.

Note

If there is not enough free space in your deployment to reprotect data following an Xpand node or zone failure, your deployment will be at risk of data loss or failure if another Xpand node or zone is lost.

The 80 in the formula is the default value of the databasefull_user_warn_percentage threshold. If your applications write data at a rate that fills the deployment quickly, use a value lower than 80 in your calculations. This ensures that data can continue to be written while you wait for replacement Xpand nodes to join the deployment and for data redistribution to complete.

To configure the databasefull_user_warn_percentage threshold, and others related to database space utilization, please see ALTER CLUSTER RESIZE DEVICES.

Samples

The following sample chart lists pre-calculated thresholds. For example, a 9-node Xpand deployment (not deployed in zones) with MAX_FAILURES = 1 must not exceed 71.11% of its disk capacity if reprotection is to complete successfully after an Xpand node failure.

k    3 Nodes    6 Nodes    9 Nodes    16 Nodes    32 Nodes
1    53.33%     66.67%     71.11%     75.00%      77.50%
2    *          53.33%     62.22%     70.00%      75.00%
3    *          *          53.33%     65.00%      72.50%
4    *          *          44.44%     60.00%      70.00%
5    *          *          *          55.00%      67.50%

* = Not applicable as the remaining number of Xpand nodes will not constitute a quorum.
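The sample chart can be reproduced with a short script, sketched below. It assumes that a quorum means a strict majority of the original Xpand nodes remains after the failure; that rule is inferred from the "*" entries in the chart and is not stated explicitly here. The function name threshold_or_na is hypothetical.

```python
def threshold_or_na(total_nodes, k, threshold_pct=80.0):
    """Return the formatted threshold, or "*" when the surviving nodes
    would not form a quorum."""
    remaining = total_nodes - k
    # Assumption: quorum requires a strict majority of the original nodes.
    if 2 * remaining <= total_nodes:
        return "*"
    return f"{remaining * threshold_pct / total_nodes:.2f}%"


node_counts = [3, 6, 9, 16, 32]

# Header row, then one row per value of k.
print("k".ljust(4) + "".join(f"{n} Nodes".rjust(10) for n in node_counts))
for k in range(1, 6):
    cells = "".join(threshold_or_na(n, k).rjust(10) for n in node_counts)
    print(str(k).ljust(4) + cells)
```

To plan with a more conservative threshold, pass a lower threshold_pct (for example 70.0) and compare the resulting values against the chart.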

Space Alerts

If the amount of free space in your deployment falls below the amount required to fully reprotect it after an Xpand node or zone failure, Xpand sends an email alert to the list of users configured in Database Alerts. The email includes the text [WARNING] Insufficient space for reprotection and provides details on the amount of space required.

The same message will also appear in clustrix.log as an ERROR.

ERROR 1 (HY000): [32779] Not enough space to reprotect if another node
is lost: 94.4255% usage (without softfailed nodes) is greater than
max 80.0000%
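If you monitor for this message with your own tooling, the two percentages can be extracted with a regular expression. The sketch below assumes the message wording matches the sample above exactly; the pattern and the function name parse_reprotect_message are illustrative and not part of Xpand.

```python
import re

# Assumption: the wording follows the sample message shown above.
PATTERN = re.compile(
    r"(?P<usage>\d+(?:\.\d+)?)% usage .*? max (?P<max>\d+(?:\.\d+)?)%"
)


def parse_reprotect_message(message):
    """Return (usage_pct, max_pct) as floats, or None if no match."""
    # Collapse line breaks so a wrapped message is matched as one line.
    text = " ".join(message.split())
    match = PATTERN.search(text)
    if match is None:
        return None
    return float(match.group("usage")), float(match.group("max"))


sample = """ERROR 1 (HY000): [32779] Not enough space to reprotect if another node
is lost: 94.4255% usage (without softfailed nodes) is greater than
max 80.0000%"""

usage, limit = parse_reprotect_message(sample)
print(f"usage={usage}%, limit={limit}%")
```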

You may also encounter this message when soft-failing or removing Xpand nodes.