Manage the Rebalancer for MariaDB Xpand
This page is part of MariaDB's Documentation.
Overview
Note
The Rebalancer is designed to run automatically as a background process to rebalance data across the cluster. The following section describes how you can configure and monitor the rebalancer, but the majority of deployments should not require user intervention.
The Rebalancer is managed primarily through a set of global variables, and can be monitored through several system tables (or vrels). As described in the Rebalancer section, the rebalancer applies a number of actions such as copying replicas, moving replicas, and splitting slices in order to maintain an optimal distribution of data on the cluster. It is designed to perform these operations in a manner that minimizes impact to user queries, and requires little administrative action. However, there may be circumstances where you wish to either increase or decrease the aggressiveness of the rebalancer, such as quickly rebalancing the cluster after node addition or eliminating any possible interference with user queries during periods of heavy load.
The sections below will discuss monitoring of rebalancer behavior, and specific use cases of rebalancer tuning.
Monitoring the Rebalancer
The table rebalancer_activity_log maintains a record of current and past rebalancer work. To see recent activity, order by started, as shown below. You can also filter for currently executing rebalancer actions with WHERE finished IS NULL.
To check recent Rebalancer activity:
sql> select * from system.rebalancer_activity_log order by started desc limit 10;
+---------------------+-------------+-----------------------------+----------+---------------+------------------------------+------------+---------------------+---------------------+-------+
| id | op | reason | database | relation | representation | bytes | started | finished | error |
+---------------------+-------------+-----------------------------+----------+---------------+------------------------------+------------+---------------------+---------------------+-------+
| 5832803107035702273 | rerank | distribution read imbalance | statd | statd_history | __idx_statd_history__PRIMARY | 236879872 | 2017-01-13 05:35:01 | 2017-01-13 05:35:01 | NULL |
| 5832802677131749377 | rerank | distribution read imbalance | statd | statd_history | __idx_statd_history__PRIMARY | 478674944 | 2017-01-13 05:33:21 | 2017-01-13 05:33:21 | NULL |
| 5832802504311179267 | slice split | slice too big | statd | statd_history | __idx_statd_history__PRIMARY | 473628672 | 2017-01-13 05:32:41 | 2017-01-13 05:34:08 | NULL |
| 5832791312486337538 | rerank | distribution read imbalance | statd | statd_history | __idx_statd_history__PRIMARY | 475987968 | 2017-01-13 04:49:15 | 2017-01-13 04:49:15 | NULL |
| 5832791036763671553 | slice split | slice too big | statd | statd_history | __idx_statd_history__PRIMARY | 1195999232 | 2017-01-13 04:48:11 | 2017-01-13 04:49:15 | NULL |
| 5832788503671368706 | rerank | distribution read imbalance | statd | statd_history | __idx_statd_history__PRIMARY | 754778112 | 2017-01-13 04:38:21 | 2017-01-13 04:38:21 | NULL |
| 5832788202047166465 | slice split | slice too big | statd | statd_history | __idx_statd_history__PRIMARY | 471269376 | 2017-01-13 04:37:11 | 2017-01-13 04:38:29 | NULL |
| 5832674257801927682 | rerank | distribution read imbalance | statd | statd_history | __idx_statd_history__PRIMARY | 754778112 | 2017-01-12 21:15:01 | 2017-01-12 21:15:01 | NULL |
| 5832673827981474818 | rerank | distribution read imbalance | statd | statd_history | __idx_statd_history__PRIMARY | 471400448 | 2017-01-12 21:13:21 | 2017-01-12 21:13:21 | NULL |
| 5832673526398766083 | slice split | slice too big | statd | statd_history | __idx_statd_history__PRIMARY | 755400704 | 2017-01-12 21:12:11 | 2017-01-12 21:13:43 | NULL |
+---------------------+-------------+-----------------------------+----------+---------------+------------------------------+------------+---------------------+---------------------+-------+
10 rows in set (0.32 sec)
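The same log can be narrowed to work that is still running, or to actions that ended with an error. The queries below are a sketch that uses only the column names visible in the sample output above; database is backtick-quoted on the assumption that it is treated as a reserved word, as it is in MySQL.

```sql
-- In-progress rebalancer actions, oldest first.
SELECT id, op, reason, `database`, relation, representation, bytes, started
  FROM system.rebalancer_activity_log
 WHERE finished IS NULL
 ORDER BY started;

-- Most recent actions that recorded an error.
SELECT id, op, reason, `database`, relation, started, finished, error
  FROM system.rebalancer_activity_log
 WHERE error IS NOT NULL
 ORDER BY started DESC
 LIMIT 10;
```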
For details such as target/destination for in-progress rebalancer actions, JOIN (using id) to rebalancer_activity_targets, rebalancer_copy_activity, or rebalancer_splits. These are vrels (virtual relations, as opposed to actual tables), and so are only populated for the duration of the activity.
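As a minimal sketch of such a join, the query below pairs in-progress log entries with their copy details. It assumes that rebalancer_copy_activity lives in the system schema alongside the activity log and exposes the shared column under the name id; the vrel's other columns are not listed on this page, so the query selects everything.

```sql
-- Assumption: system.rebalancer_copy_activity exists and shares the id column.
-- The vrel is only populated while the activity is running, so an empty
-- result simply means no copy is in progress.
SELECT *
  FROM system.rebalancer_activity_log AS l
  JOIN system.rebalancer_copy_activity AS c USING (id)
 WHERE l.finished IS NULL;
```

The same pattern should apply to rebalancer_activity_targets and rebalancer_splits by swapping the joined relation.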
Configuring the Rebalancer
The aggressiveness of the rebalancer is controlled by several global variables.
rebalancer_global_task_limit: specifies the number of concurrent rebalancer actions, applicable to all rebalancer actions.
task_rebalancer_%_interval_ms: defines the interval at which a particular rebalancer task is initiated.
rebalancer_rebalance_task_limit: controls the number of concurrent rebalancing tasks.
rebalancer_vdev_task_limit: limits the number of concurrent rebalancer actions that touch a single device.
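Before changing any of these settings, it helps to record their current values. The statements below are a sketch that assumes Xpand accepts the MySQL-style SHOW GLOBAL VARIABLES ... LIKE syntax; the patterns are built from the variable names listed above.

```sql
-- Concurrency limits (global, rebalance, and per-device).
SHOW GLOBAL VARIABLES LIKE 'rebalancer_%task_limit';

-- Intervals of the periodic rebalancer tasks.
SHOW GLOBAL VARIABLES LIKE 'task_rebalancer_%_interval_ms';
```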
The frequency of the tasks determines how often operations, such as rebalancer moves, may be enqueued. When there are many small containers, the copies and moves take only a few seconds. As such, a default interval of 30 seconds may mean that the rebalancer queues operations less frequently than it could. Most rebalancer tasks enqueue a limited number of operations at a time, because the operations required to achieve ideal balance change over time. The notable exception is SOFTFAIL, which enqueues all work to be performed once a node or disk has been softfailed.
For operations other than reprotect, the rebalancer pauses for 5 seconds (default) after starting the transaction, before commencing the actual copy from source to target replica. This is done to reduce the chances of an outstanding user transaction conflicting with the rebalancer operation, in which case the user transaction will be canceled, with this error:
MVCC serializable scheduler conflict
Note that reprotect has a higher priority and does not apply this delay.
The following are some common use cases for tuning the rebalancer settings. Please consult with support to change parameters not discussed below.
Increasing Rebalance Aggressiveness
By design (as described in "MariaDB Xpand Rebalancer") the rebalancer takes a somewhat leisurely approach to rebalancing data across the cluster. Since data imbalances between nodes typically take some time to manifest and rarely cause significant performance issues, this is usually acceptable. However, in some situations, it is desirable to rebalance much more quickly:
After expanding a cluster to more nodes, particularly where load is very low off-peak (or in an evaluation situation)
After replacing a failed node, where balanced workload is critical to meeting performance requirements
Following are recommended changes to increase rebalancer aggressiveness:
sql> set global rebalancer_rebalance_task_limit = 8;
sql> set global rebalancer_vdev_task_limit = 4;
sql> set global task_rebalancer_rebalance_distribution_interval_ms = 5000;
sql> set global task_rebalancer_rebalance_interval_ms = 5000;
If these settings cause too great a load, reduce the rebalancer_
Once the rebalancer has finished, reset these globals back to default:
sql> SET GLOBAL variable_name = DEFAULT;
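Spelled out for the four variables changed in this section, the reset would look like the following sketch. It only substitutes the names used above into the SET GLOBAL ... = DEFAULT form.

```sql
-- Return the rebalance tuning variables to their defaults.
SET GLOBAL rebalancer_rebalance_task_limit = DEFAULT;
SET GLOBAL rebalancer_vdev_task_limit = DEFAULT;
SET GLOBAL task_rebalancer_rebalance_distribution_interval_ms = DEFAULT;
SET GLOBAL task_rebalancer_rebalance_interval_ms = DEFAULT;
```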
Increasing SOFTFAIL Aggressiveness
As described in "Administering Failure and Recovery with MariaDB Xpand", SOFTFAIL is a means of moving all data from a node (or disk) in preparation for decommissioning or replacing a node. With proper use of SOFTFAIL, the system maintains full protection of all data; if a node is removed without SOFTFAIL, there is a window (until reprotect completes) where a failure could lead to data loss.
SOFTFAIL is treated as a high priority by the rebalancer. It differs from rebalancing in that the per-task limit and task intervals do not apply. Changing these two globals can increase SOFTFAIL aggressiveness:
To increase SOFTFAIL aggressiveness:
sql> set global rebalancer_global_task_limit = 32;
sql> set global rebalancer_vdev_task_limit = 16;
If these settings cause too great a load, reduce the rebalancer_
Once the rebalancer has finished, reset these globals back to Xpand's default:
sql> SET GLOBAL variable_name = DEFAULT;
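One way to decide when it is safe to reset is to confirm that the activity log no longer shows running work. The sketch below reuses the finished IS NULL filter from the monitoring section and then resets the two variables raised above; treating a count of zero as "finished" is an assumption for illustration, not a documented completion check.

```sql
-- Number of rebalancer actions still in progress.
SELECT COUNT(*) AS in_progress
  FROM system.rebalancer_activity_log
 WHERE finished IS NULL;

-- Once no work remains, return the SOFTFAIL tuning variables to their defaults.
SET GLOBAL rebalancer_global_task_limit = DEFAULT;
SET GLOBAL rebalancer_vdev_task_limit = DEFAULT;
```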
Disabling the Rebalancer
To disable the Rebalancer:
sql> set global rebalancer_optional_tasks_enabled = false;
This disables the rerank, split, redistribute and rebalance tasks. The value for rebalancer_
Note
Do not leave the Rebalancer disabled for long periods of time, as the Rebalancer plays a crucial role in maintaining optimal database performance.
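To turn the optional tasks back on, the same variable can be set again. This sketch assumes that true is the normal enabled state, which the disable command above implies but this page does not state explicitly, and that SHOW GLOBAL VARIABLES ... LIKE is available for checking the result.

```sql
-- Re-enable the optional rebalancer tasks (assumed enabled state: true).
SET GLOBAL rebalancer_optional_tasks_enabled = true;

-- Confirm the current value.
SHOW GLOBAL VARIABLES LIKE 'rebalancer_optional_tasks_enabled';
```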
The Rebalancer tasks for reprotecting data (task_
Global Variables
The following global variables impact Rebalancer activity. Note that these variables are global and cannot be set for individual sessions.
| Name | Description | Default Value |
|---|---|---|
| | Maximum number of simultaneous rebalancer operations. | |
| | Rebalance mode. | |
| | Maximum number of operations that rebalancer_ | |
| | Minimum coefficient of overall write load variation that will trigger rebalance activity. | |
| | Queued replicas count as healthy for this many seconds, to give missing nodes the chance to come back online before rebalancer_ | |
| | Size at which the rebalancer splits slices. | |
| | Maximum number of simultaneous rebalancer operations targeting one device. | |
| | Milliseconds between runs of periodic task "rebalancer_ | |
| | Milliseconds between runs of periodic task "rebalancer_ | |
| | Milliseconds between runs of periodic task "rebalancer_ | |
| | Milliseconds between runs of periodic task "rebalancer_ | |
| | Milliseconds between runs of periodic task "rebalancer_ | |
| | Milliseconds between runs of periodic task "rebalancer_ | |