MariaDB MaxScale 6.0 Native Clustering

MaxScale 6.0 introduced native clustering, allowing several MaxScale instances to communicate and share configuration updates with each other. In this blog post, we explain how the feature works and demonstrate how to implement it.

Common MaxScale Topologies

Centralized MaxScale Server  
This is the most common installation, where MaxScale sits between the application servers and the database. It is simple, but provides no high availability: if MaxScale goes down, the application loses all connectivity to the database.

MaxScale HA setup
Another common setup is to install MaxScale on each application server or on separate servers. The benefit of this setup is that the loss of one MaxScale/app server will not bring down the entire application.

Dynamic Configuration Changes

MaxScale allows runtime configuration changes via the MaxScale GUI, MaxCtrl or the REST API. Those runtime changes are saved on disk, inside the persistdir directory. If you run multiple MaxScale instances, you have to apply every runtime change on each MaxScale manually.
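
For example, a runtime change made with MaxCtrl lands only in that node’s persistdir (by default /var/lib/maxscale/maxscale.cnf.d/). A minimal sketch, using the monitor name from the setup later in this post:

# Runtime change on one MaxScale instance; without clustering it is
# persisted locally only and must be repeated on every other MaxScale.
maxctrl alter monitor Galera-Monitor monitor_interval 5000ms
ls /var/lib/maxscale/maxscale.cnf.d/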

Internal workflow 
At a high level, configuration synchronization involves six stages.

Runtime Changes 
All runtime changes are made via the MaxScale GUI, MaxCtrl or the REST API, and are saved on disk.

Data Storage 
The configuration information is stored in the mysql.maxscale_config table. It contains:

  • the cluster name (taken from the MaxScale configuration),
  • the configuration version number, which is incremented on each change,
  • the actual configuration data, stored in JSON format.

The version number starts at 0 and the first dynamic change sets it to 1. In other words, the first update stored in the database has version number 1, and every subsequent configuration update increments the version.
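
Out of curiosity, you can inspect the table directly. A minimal sketch; the exact column names are an assumption based on the fields described above, so check them first:

-- Column names are assumed from the description above; verify with DESCRIBE.
DESCRIBE mysql.maxscale_config;
SELECT cluster, version FROM mysql.maxscale_config;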

Updating the Configuration 
Before applying configuration changes, the current configuration version is read from the database using a SELECT ... FOR UPDATE query. If it matches the version stored in MaxScale, the update can proceed. If it does not, an error is returned to the client.

Version Check 
Once modifications are done, MaxScale attempts to update the row in the mysql.maxscale_config table. Because the version was read with SELECT ... FOR UPDATE, the row stays locked for the duration of the transaction. If the update succeeds, the new configuration and version number are committed. If the version check fails, or the transaction hits a deadlock or conflict, MaxScale returns an error to the client. This guarantees that only one MaxScale succeeds in updating the configuration; a MaxScale that fails to perform the update returns an error to the client and reads the new configuration from the database.
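
Conceptually, the update is an optimistic locking transaction along these lines (a simplified sketch, not MaxScale’s literal SQL; the cluster name and version numbers are illustrative):

START TRANSACTION;
-- Lock the row and read the version currently stored in the database.
SELECT version FROM mysql.maxscale_config
 WHERE cluster = 'Galera-Monitor' FOR UPDATE;
-- Proceed only if it matches the version MaxScale knows locally (here: 2).
UPDATE mysql.maxscale_config
   SET version = 3, config = '...new JSON...'
 WHERE cluster = 'Galera-Monitor' AND version = 2;
COMMIT;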

The runtime configuration changes are stored in JSON format on disk. This is the “cached configuration”. If a cached configuration is available, it is used instead of the static configuration files (/etc/maxscale.cnf). This allows configuration changes to persist through a crash as well as through temporary outages.

If the transition from the current configuration to the configuration stored in the cluster fails, the cached configuration is discarded. This guarantees that only one attempt is made to start with a cached configuration and all subsequent startups use the static configuration files.
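
Since the cached configuration is plain JSON on disk (stored by default at /var/lib/maxscale/maxscale-config.json, as discussed below), it can be inspected directly:

# Pretty-print the cached configuration; jq works just as well.
python3 -m json.tool /var/lib/maxscale/maxscale-config.json | head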

Configuration Synchronization
MaxScale periodically polls the database for the latest configuration version. If the version in the database is newer than the local one, MaxScale computes the difference between the new and current configurations and applies only the changes.

If the configuration update is successful, MaxScale is in sync with other MaxScale nodes. If the configuration update fails, MaxScale attempts to roll back the partial changes.

Rollback
If the configuration update completes on MaxScale but fails to be committed to the database, MaxScale will attempt to revert the local configuration change. If this attempt fails, MaxScale will discard the cached configuration and abort the process.

When synchronizing with the cluster, if MaxScale fails to apply a configuration retrieved from the cluster, it attempts to revert the configuration to the previous version. If MaxScale is successful in reverting the configuration to the previous version, the failed configuration update is ignored. If the failed configuration update cannot be reverted, the MaxScale configuration will be in an indeterminate state, and MaxScale will discard the cached configuration and abort the process.

As an example, let’s assume the current configuration version is 2.

  • If the configuration update is successful, the configuration version becomes 3.
  • If the update fails but the rollback succeeds, the configuration version changes from 3 back to 2.
  • If both the update and the rollback fail, MaxScale discards the cached configuration and aborts; the next startup uses the static configuration file (/etc/maxscale.cnf).

Implementation 
To start, we need at least two nodes. Download and install MaxScale 6.0 or higher and start with a clean installation. Here we have a three-node Galera cluster, with MaxScale 6.4.0 installed on all the database nodes.

node-1 - 172.31.83.144

node-2 - 172.31.81.34

node-3 - 172.31.85.115

Virtual IP - 172.31.xx.xx

 


When configuring MaxScale synchronization for the first time, the same static configuration files should be used on all MaxScale instances. The value of config_sync_cluster must be the same on all MaxScale instances, and the cluster (i.e. the monitor) it points to, along with its servers, must be identical in every configuration.
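
A quick way to confirm this requirement is met is to compare a checksum of the static configuration file across the nodes:

# Run on every MaxScale node; the hashes should be identical.
md5sum /etc/maxscale.cnf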

To form a MaxScale cluster, we need to create a new user on the database nodes that MaxScale will use to synchronize configuration changes across instances.

This user must have the following grants:

MariaDB> GRANT SELECT, INSERT, UPDATE, CREATE ON mysql.maxscale_config TO 'sync_user'@'%';
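
If the user does not exist yet, create it first. A sketch, with the host pattern and password as placeholders to adapt (the password matches the config_sync_password used below):

MariaDB> CREATE USER 'sync_user'@'%' IDENTIFIED BY 'R00t@123';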

We then add the user details to the MaxScale configuration under the [maxscale] section.

Example:

config_sync_cluster="Galera-Monitor"
config_sync_user=sync_user
config_sync_password=R00t@123

Sample Config 
This is our sample config. The configuration file should be the same on all MaxScale nodes.

$ cat /etc/maxscale.cnf
# Global parameters
[maxscale]
threads=auto
config_sync_cluster="Galera-Monitor"
config_sync_user=sync_user
config_sync_password=R00t@123
admin_secure_gui=false
admin_host=0.0.0.0
 
# Server definitions
[node1]
type=server
address=172.31.83.144
port=3306
protocol=MariaDBBackend
[node2]
type=server
address=172.31.81.34
port=3306
protocol=MariaDBBackend
[node3]
type=server
address=172.31.85.115
port=3306
protocol=MariaDBBackend
 
# Monitor for the servers
[Galera-Monitor]
type=monitor
module=galeramon
servers=node1,node2,node3
user=max_user
password=R00t@123
monitor_interval=2000
 
# Service definitions

Here we are using a read/write split router to route the application traffic: node1 handles all the writes, while node2 and node3 handle all the reads.

# ReadWriteSplit 
[Read-Write-Service]
type=service
router=readwritesplit
servers=node1,node2,node3
user=max_user
password=R00t@123
 
# Listener definitions for the services
[Read-Write-Listener]
type=listener
service=Read-Write-Service
protocol=MariaDBClient
port=4006

Once MaxScale is started on all nodes, verify the sync parameters:

[root@ip-172-31-83-144 centos]# maxctrl show maxscale | grep -i 'config_sync'
│              │     "config_sync_cluster": "Galera-Monitor",                    │
│              │     "config_sync_interval": "5000ms",                           │
│              │     "config_sync_password": "*****",                            │
│              │     "config_sync_timeout": "10000ms",                           │
│              │     "config_sync_user": "max_user",                             │

Here we are using the default values for the config_sync_interval and config_sync_timeout parameters.
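
If the defaults don’t suit your environment, both can be overridden in the [maxscale] section. A sketch with illustrative values (MaxScale accepts duration suffixes such as ms and s):

[maxscale]
config_sync_interval=10s
config_sync_timeout=20s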

The configuration has been applied. Internally, every MaxScale node now shares configuration state with the other nodes through the database cluster.

Let’s add a new server to MaxScale without restarting the service and see how this configuration change replicates to the other nodes.

[root@ip-172-31-83-144 centos]# maxctrl create server node4 172.31.80.237 3306
OK

The new server has been created, but MaxScale won’t perform any health checks on it yet. We need to link node4 to “Galera-Monitor”.

[root@ip-172-31-83-144 centos]# maxctrl link monitor Galera-Monitor node4
OK

Node4 is now monitored by “Galera-Monitor”. Let’s add it to the read/write split service.

[root@ip-172-31-83-144 centos]# maxctrl link service Read-Write-Service node4
OK
[root@ip-172-31-83-144 centos]# maxctrl list servers
┌────────┬───────────────┬──────┬─────────────┬─────────────────┬──────────────┐
│ Server │ Address       │ Port │ Connections │ State           │ GTID         │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼──────────────┤
│ node4  │ 172.31.80.237 │ 3306 │ 0           │ Slave, Running  │              │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼──────────────┤
│ node3  │ 172.31.85.115 │ 3306 │ 0           │ Slave, Running  │ 3-3-5,1-1-47 │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼──────────────┤
│ node2  │ 172.31.81.34  │ 3306 │ 0           │ Slave, Running  │ 2-2-5,1-1-47 │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼──────────────┤
│ node1  │ 172.31.83.144 │ 3306 │ 0           │ Master, Running │ 1-1-47       │
└────────┴───────────────┴──────┴─────────────┴─────────────────┴──────────────┘
[root@ip-172-31-83-144 centos]# maxctrl list services
┌────────────────────┬────────────────┬─────────────┬───────────────────┬────────────────────────────┐
│ Service            │ Router         │ Connections │ Total Connections │ Targets                    │
├────────────────────┼────────────────┼─────────────┼───────────────────┼────────────────────────────┤
│ Read-Write-Service │ readwritesplit │ 0           │ 0                 │ node1, node2, node3, node4 │
└────────────────────┴────────────────┴─────────────┴───────────────────┴────────────────────────────┘

Whenever the MaxScale configuration is modified at runtime, the latest configuration is stored in the database cluster in the mysql.maxscale_config table. A local copy of the configuration is stored in the data directory to allow MaxScale to function even if a connection to the cluster cannot be made. By default this file is stored at /var/lib/maxscale/maxscale-config.json.

On startup, MaxScale checks whether a local copy of this configuration exists. If it does and it is a valid cached configuration, the static configuration file as well as any other generated configuration files are ignored. The exception is the [maxscale] section of the main static configuration file, which is always read.

Each configuration has a version number with the initial configuration being version 0. Each time the configuration is modified, the version number is incremented. This version number is used to detect when MaxScale needs to update its configuration.

Now that the new server has been added, let’s verify the change has replicated to the other MaxScale nodes.

From MaxScale node2 and MaxScale node3:

# maxctrl list servers
┌────────┬───────────────┬──────┬─────────────┬─────────────────┬──────────────┐
│ Server │ Address       │ Port │ Connections │ State           │ GTID         │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼──────────────┤
│ node4  │ 172.31.80.237 │ 3306 │ 0           │ Slave, Running  │              │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼──────────────┤
│ node3  │ 172.31.85.115 │ 3306 │ 0           │ Slave, Running  │ 3-3-5,1-1-47 │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼──────────────┤
│ node2  │ 172.31.81.34  │ 3306 │ 0           │ Slave, Running  │ 2-2-5,1-1-47 │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼──────────────┤
│ node1  │ 172.31.83.144 │ 3306 │ 0           │ Master, Running │ 1-1-47       │
└────────┴───────────────┴──────┴─────────────┴─────────────────┴──────────────┘

From the MaxScale logs on node2 and node3:

2022-06-19 08:09:03   notice : (ConfigManager); Updating to configuration version 5
2022-06-19 08:28:43   notice : (ConfigManager); Updating to configuration version 6
2022-06-19 08:28:43   notice : (ConfigManager); Added 'node4' to 'Galera-Monitor'
2022-06-19 08:28:43   notice : 'node4' sent version string '10.6.8-MariaDB'. Detected type: 'MariaDB', version: 10.6.8.

How to Verify the MaxScale Cluster is Synced?
The output of maxctrl show maxscale contains the Config Sync field with information about the current configuration state of the local MaxScale as well as the state of any other nodes using this cluster.

[root@ip-172-31-83-144 centos]# maxctrl show maxscale | grep -i 'Config Sync' -A10
│ Config Sync  │ {                                                           │
│              │     "checksum": "e2f0041a344c41065dc94f5c6d8802f40c396f2a", │
│              │     "nodes": {                                              │
│              │         "ip-172.31.83.144.ec2.internal": "OK",              │
│              │         "ip-172.31.81.34.ec2.internal": "OK",               │
│              │         "ip-172.31.85.115.ec2.internal": "OK",              │
│              │         "ip-172.31.80.237.ec2.internal": "OK"               │
│              │     },                                                      │
│              │     "origin": "172.31.83.144.ec2.internal",                 │
│              │     "status": "OK",                                         │
│              │     "version": 8                                            │
│              │ }                                                           │

The version field is the logical configuration version, and origin is the node where the latest configuration change originated. The checksum field is the checksum of the logical configuration and can be used to check whether two MaxScale instances are in the same configuration state. The nodes field contains the status of each MaxScale instance, mapped to the hostname of the server. This field is updated whenever MaxScale reads the configuration from the cluster and can thus be used to detect which MaxScale instances have updated their configuration.
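
Because the checksum identifies the logical configuration, a quick cross-node comparison can be scripted with MaxCtrl. A sketch assuming the REST API is reachable on the default port 8989 with default credentials:

# The checksum values should be identical when the cluster is in sync.
for host in 172.31.83.144 172.31.81.34 172.31.85.115 172.31.80.237; do
  echo -n "$host: "
  maxctrl --hosts ${host}:8989 show maxscale | grep checksum
done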

Let’s Make Changes from the MaxScale GUI 
Currently monitor_interval is 2000ms, i.e. 2 seconds. We are going to change it from 2000ms to 5000ms in the GUI.

Before update:

Monitor interval before update

[root@ip-172-31-83-144 centos]# maxctrl show monitors | grep -i monitor_interval
│                     │     "monitor_interval": 2000,                       │

Monitor interval update

After update:

Monitor interval after update

From all the MaxScale nodes:

#  maxctrl show monitors | grep -i monitor_interval
│                     │     "monitor_interval": 5000,                       │

[root@ip-172-31-83-144 centos]# maxctrl show maxscale | grep -i 'Config Sync' -A10
│ Config Sync  │ {                                                           │
│              │     "checksum": "aea6f6ef5a2fa869041981b93493aa371d7eb8f5", │
│              │     "nodes": {                                              │
│              │         "ip-172.31.83.144.ec2.internal": "OK",              │
│              │         "ip-172.31.81.34.ec2.internal": "OK",               │
│              │         "ip-172.31.85.115.ec2.internal": "OK",              │
│              │         "ip-172.31.80.237.ec2.internal": "OK"               │
│              │     },                                                      │
│              │     "origin": "172.31.83.144.ec2.internal",                 │
│              │     "status": "OK",                                         │
│              │     "version": 9                                            │
│              │ }                                                           │

Disabling MaxScale Cluster Synchronization 
To disable configuration synchronization, remove config_sync_cluster from the configuration file or set it to an empty string (config_sync_cluster=""). This can also be done at runtime with MaxCtrl by passing an empty string to config_sync_cluster.

Example

[root@ip-172-31-83-144 centos]# maxctrl alter maxscale config_sync_cluster ""
OK

To reset MaxScale synchronization (a shell sketch of these steps follows the list):

  • Stop all MaxScale instances.
  • Remove the cached configuration file stored at /var/lib/maxscale/maxscale-config.json on all MaxScale instances.
  • Drop the mysql.maxscale_config table.
  • Start all MaxScale instances.
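
As a shell sketch of the same steps (adapt credentials to your environment; the DROP TABLE only needs to run once against the cluster):

systemctl stop maxscale                                    # on every MaxScale node
rm -f /var/lib/maxscale/maxscale-config.json               # on every MaxScale node
mariadb -u root -p -e "DROP TABLE mysql.maxscale_config;"  # once, on the cluster
systemctl start maxscale                                   # on every MaxScale node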

Limitations
The synchronization only affects the MaxScale configuration. The state of objects and any external files (e.g. cache filter rules, TLS certificates, data masking rules) are not synchronized by this mechanism.

For More Information