June 23, 2017

Schema Sharding with MariaDB MaxScale 2.1 - Part 1

Most of the time when you start a database design you don’t imagine how your applications need to scale. Sometimes, you need to shard your databases among some different hosts and then, on each shard you want to split reads and writes between master and slaves. This blog is about MariaDB MaxScale being able to handle different databases across shards, and splitting up reads and writes into each of the shards to achieve the maximum level of horizontal scalability with MariaDB Server.

After reading this blog you will be able to:

  • Know how MariaDB MaxScale handles shards with SchemaRouter router;

  • Know how MariaDB MaxScale handles the split of reads and writes with ReadWriteSplit router;

  • Know how we can combine both mentioned router in order to have an scalable environment.

Introduction and Scenario

I have worked with customers and their environments before when it was possible to see customers just duplicating the whole database structure in order to make their systems work. Instead of having a unique database and clients with an ID and their data, some companies have systems that, for each of their clients, they just add a new database with empty tables and start the new client on the system - tables are all the same.

This way, you can imagine that after some time, database servers will have thousands of databases schemas with the same structure to attend different clients. As well, after some time, the capacity problem can arise and a known resolution for this problem is to start sharding the database schemas on more servers.

So, each database server is now a source of databases from a subset of clients and so forth. This is very interesting because the main benefit to use shards is to mitigate the pressure on just one server and spread the world on more servers, while client A’s database is located on shard_a and client B’s database is located on shard_b (for sure, I’m considering an environment with more than 600 databases per server and as said before, I saw customers using up to 6 to 8 shards, sharding clients databases this way). This is not just the number of databases or schemas, but as well, the number of tables each of this schema has (I’m talking about when querying I_S become a complete nightmare and can really crash a database server).

Up on this, we know very well that one of the magic things related to scalability in MySQL world is to split reads and writes, at least, it’s a good start when it can be done. It makes the writes just on master server, while all reads go to the slave or slaves, in case capacity to just one slave is hit. With that, we can say that each of the shards we’re talking about until here should have at least one slave which will be used for reads, while writes go to master, as said previously. With this in mind, I present the following scenario which I will be working with on this blog post:

maxscale - 192.168.50.100 (MaxScale Server)
 
#: SHARD A
box01 - 192.168.50.11 (Shard_A’s master)
    |
    +——— box02 - 192.168.50.12 (Shard_A’s slave)
 
#: SHARD B
box03 - 192.168.50.13 (Shard_B’s master)
    |
    +——— box04 -192.168.50.14 (Shard_B’s slave)

Above, you can see, I setup one server to run MaxScale and four others to run two shards having different databases or schemas in it. The nodes box01 through node 04, they are running MariaDB 10.1.21 (you can also run this on MariaDB 10.2.6, the most recent release of MariaDB Server), setup from MariaDB YUM repository for CentOS 7. You can setup CentOS repository, as well as downloading MaxScale rpm package. I would like to highlight that just the newer version of MariaDB MaxScale is able to combine modules as we’re going to present here: SchemaRouter and ReadWriteSplit.

The replication between servers on each shard is something that you know you need to set up by yourself. Once servers are all up, I’m talking about database nodes from box01 to box04, you can create a replication user on box01 and box03 and, enter the CHANGE MASTER TO, as shown below:

#: create replication user on shard_a
sudo mysql -e "grant replication client, replication slave on *.* to repl@'192.168.%' identified by ‘XXXXXX';"
 
#: create replication user on shard_b
sudo mysql -e "grant replication client, replication slave on *.* to repl@'192.168.%' identified by ‘XXXXXX’;"
 
#: configure replication for shard_a on box02
sudo mysql -e "change master to master_host='192.168.50.11',master_user='repl',master_password='XXXXXX',master_use_gtid=slave_pos,master_connect_retry=5;”
sudo mysql -e "start slave;"
 
#: configure replication for shard_b on box04
sudo mysql -e "change master to master_host='192.168.50.13',master_user='repl',master_password='XXXXXX',master_use_gtid=slave_pos,master_connect_retry=5;"
 sudo mysql -e "start slave;"

MariaDB MaxScale 2.1.3 Setup

Talking about MaxScale, we’re running the version 2.1.3 which supports the combination of SchemaRouter and ReadWriteSplit routers. To set up MaxScale on CentOS 7, you can download the rpm package from MariaDB repository and setup the package as below:

#: setup maxscale
sudo rpm -Uvih https://downloads.mariadb.com/MaxScale/2.1.3/rhel/7server/x86_64/maxscale-2.1.3-1.rhel.7.x86_64.rpm

Create the encrypted password using maxkeys and maxpasswd if you want to encrypt the password for the user that access the database servers: => https://mariadb.com/kb/en/mariadb-enterprise/6360/

#: generate keys
[root@maxscale ~]# maxkeys
Generating .secrets file in /var/lib/maxscale.
 
#: generate the hexadecimal encrypted password to add to the maxscale config file
[root@maxscale ~]# maxpasswd /var/lib/maxscale/ XXXXXX
DF5822F1038A154FEB68E667740B1160
 
#: change .secrets file ownership to permit maxscale user to access it
[root@maxscale log]# chown maxscale:maxscale /var/lib/maxscale/.secrets

After this, add the following configuration file that will combine modules, considering the above servers arrangement, to make the SchemaRouter and ReadWriteSplit happen:

#: maxscale configuration file - /etc/maxscale.cnf
[maxscale]
threads=16
skip_permission_checks=true
 
# Shard-A
 
[Shard-A-Monitor]
type=monitor
module=mysqlmon
servers=server1,server2
user=maxmon
passwd=DF5822F1038A154FEB68E667740B1160
monitor_interval=10000
 
[Shard-A-Router]
type=service
router=readwritesplit
servers=server1,server2
user=maxuser
passwd=DF5822F1038A154FEB68E667740B1160
 
[Shard-A-Listener]
type=listener
service=Shard-A-Router
protocol=MySQLClient
port=4006
 
[server1]
type=server
address=192.168.50.11
port=3306
protocol=MySQLBackend
 
[server2]
type=server
address=192.168.50.12
port=3306
protocol=MySQLBackend
 
# Shard-B
 
[Shard-B-Monitor]
type=monitor
module=mysqlmon
servers=server3,server4
user=maxmon
passwd=DF5822F1038A154FEB68E667740B1160
monitor_interval=10000
 
[Shard-B-Router]
type=service
router=readwritesplit
servers=server3,server4
user=maxuser
passwd=DF5822F1038A154FEB68E667740B1160
 
[Shard-B-Listener]
type=listener
service=Shard-B-Router
protocol=MySQLClient
port=4007
 
[server3]
type=server
address=192.168.50.13
port=3306
protocol=MySQLBackend
 
[server4]
type=server
address=192.168.50.14
port=3306
protocol=MySQLBackend
 
# The two services abstracted as servers
 
[Shard-A]
type=server
address=127.0.0.1
port=4006
protocol=MySQLBackend
 
[Shard-B]
type=server
address=127.0.0.1
port=4007
protocol=MySQLBackend
 
# The sharding service
 
[Sharding-Service]
type=service
router=schemarouter
servers=Shard-A,Shard-B
user=maxuser
passwd=DF5822F1038A154FEB68E667740B1160
router_options=refresh_interval=10
 
[Sharding-Listener]
type=listener
service=Sharding-Service
protocol=MySQLClient
port=3306
 
# Command line interface used by Maxadmin
 
[CLI]
type=service
router=cli

[CLI-Listener]
type=listener
service=CLI
protocol=maxscaled
socket=default

Of course, some other variables and configurations that should be considered as part of the monitor's sections, as well as, if you want to setup an automatic failover on a fail event on the backend servers. But, this is part of a future blog,

Additionally, continuing with the configurations, make sure you create the user for MaxScale to be able to access database servers when it is started up. The below users are created on both masters, box01 and box03, to make the users to exists on those two shards:

sudo mysql -e "grant all on *.* to maxmon@'%'  identified by 'XXXXXX’;"
sudo mysql -e "grant all on *.* to maxuser@'%' identified by 'XXXXXX’;"
 
sudo mysql -e "grant all on *.* to maxmon@'127.0.0.1'  identified by 'XXXXXX’;"
sudo mysql -e "grant all on *.* to maxuser@'127.0.0.1' identified by 'XXXXXX’;"
 
sudo mysql -e "grant all on *.* to maxmon@'192.168.50.100'  identified by 'XXXXXX’;"
sudo mysql -e "grant all on *.* to maxuser@'192.168.50.100' identified by 'XXXXXX’;"

With this, you can start MaxScale:

[root@maxscale log]# systemctl start maxscale
...
MariaDB MaxScale  /var/log/maxscale/maxscale.log  Mon Mar  6 14:04:04 2017
----------------------------------------------------------------------------
2017-03-06 14:04:04   notice : Working directory: /var/log/maxscale
2017-03-06 14:04:04   notice : MariaDB MaxScale 2.1.3 started
2017-03-06 14:04:04   notice : MaxScale is running in process 4203
2017-03-06 14:04:04   notice : Configuration file: /etc/maxscale.cnf
2017-03-06 14:04:04   notice : Log directory: /var/log/maxscale
2017-03-06 14:04:04   notice : Data directory: /var/lib/maxscale
2017-03-06 14:04:04   notice : Module directory: /usr/lib64/maxscale
2017-03-06 14:04:04   notice : Service cache: /var/cache/maxscale
2017-03-06 14:04:04   notice : Loading /etc/maxscale.cnf.
2017-03-06 14:04:04   notice : /etc/maxscale.cnf.d does not exist, not reading.
2017-03-06 14:04:04   notice : [cli] Initialise CLI router module
2017-03-06 14:04:04   notice : Loaded module cli: V1.0.0 from /usr/lib64/maxscale/libcli.so
2017-03-06 14:04:04   notice : [schemarouter] Initializing Schema Sharding Router.
2017-03-06 14:04:04   notice : Loaded module schemarouter: V1.0.0 from /usr/lib64/maxscale/libschemarouter.so
2017-03-06 14:04:04   notice : [readwritesplit] Initializing statement-based read/write split router module.
2017-03-06 14:04:04   notice : Loaded module readwritesplit: V1.1.0 from /usr/lib64/maxscale/libreadwritesplit.so
2017-03-06 14:04:04   notice : [mysqlmon] Initialise the MySQL Monitor module.
2017-03-06 14:04:04   notice : Loaded module mysqlmon: V1.5.0 from /usr/lib64/maxscale/libmysqlmon.so
2017-03-06 14:04:04   notice : Loaded module MySQLBackend: V2.0.0 from /usr/lib64/maxscale/libMySQLBackend.so
2017-03-06 14:04:04   notice : Loaded module MySQLBackendAuth: V1.0.0 from /usr/lib64/maxscale/libMySQLBackendAuth.so
2017-03-06 14:04:04   notice : Loaded module maxscaled: V2.0.0 from /usr/lib64/maxscale/libmaxscaled.so
2017-03-06 14:04:04   notice : Loaded module MaxAdminAuth: V2.1.3 from /usr/lib64/maxscale/libMaxAdminAuth.so
2017-03-06 14:04:04   notice : Loaded module MySQLClient: V1.1.0 from /usr/lib64/maxscale/libMySQLClient.so
2017-03-06 14:04:04   notice : Loaded module MySQLAuth: V1.1.0 from /usr/lib64/maxscale/libMySQLAuth.so
2017-03-06 14:04:04   notice : No query classifier specified, using default 'qc_sqlite'.
2017-03-06 14:04:04   notice : Loaded module qc_sqlite: V1.0.0 from /usr/lib64/maxscale/libqc_sqlite.so
2017-03-06 14:04:04   notice : Using encrypted passwords. Encryption key: '/var/lib/maxscale/.secrets'.
2017-03-06 14:04:04   notice : [MySQLAuth] [Shard-A-Router] Loaded 4 MySQL users for listener Shard-A-Listener.
2017-03-06 14:04:04   notice : Listening connections at 0.0.0.0:4006 with protocol MySQL
2017-03-06 14:04:04   notice : [MySQLAuth] [Shard-B-Router] Loaded 4 MySQL users for listener Shard-B-Listener.
2017-03-06 14:04:04   notice : Listening connections at 0.0.0.0:4007 with protocol MySQL
2017-03-06 14:04:04   notice : [schemarouter] Authentication data is fetched from all servers. To disable this add 'auth_all_servers=0' to the service.
2017-03-06 14:04:04   notice : Server changed state: server3[192.168.50.13:3306]: new_master. [Running] -> [Master, Running]
2017-03-06 14:04:04   notice : Server changed state: server4[192.168.50.14:3306]: new_slave. [Running] -> [Slave, Running]
2017-03-06 14:04:04   notice : [mysqlmon] A Master Server is now available: 192.168.50.13:3306
2017-03-06 14:04:04   notice : Server changed state: server1[192.168.50.11:3306]: new_master. [Running] -> [Master, Running]
2017-03-06 14:04:04   notice : Server changed state: server2[192.168.50.12:3306]: new_slave. [Running] -> [Slave, Running]
2017-03-06 14:04:04   notice : [mysqlmon] A Master Server is now available: 192.168.50.11:3306
2017-03-06 14:04:10   notice : Listening connections at 0.0.0.0:3306 with protocol MySQL
2017-03-06 14:04:10   notice : Listening connections at /tmp/maxadmin.sock with protocol MaxScale Admin
2017-03-06 14:04:10   notice : MaxScale started with 1 server threads.
2017-03-06 14:04:10   notice : Started MaxScale log flusher.

An additional user needs to be created to be used by the application, which will connect to MaxScale and MaxScale will do the router based on modules. I suggest you create a user like the one below (if the user of your app does more than the privileges below, add the privileges you need or even, create a ROLE and give the ROLE privilege to the user:

#: create the user for the app to connect to db nodes through MaxScale
sudo mysql -e "grant all on *.* to appuser@'%' identified by 'XXXXXX’;"
sudo mysql -e "grant all on *.* to appuser@'127.0.0.1' identified by 'XXXXXX’;"
sudo mysql -e "grant all on *.* to appuser@'192.168.50.100' identified by 'XXXXXX’;"

At this point, we can use MaxAdmin on MaxScale server to exhibit the configuration we have until now:

#: configured services
[root@maxscale ~]# maxadmin list services
Services.
--------------------------+-------------------+--------+----------------+-------------------
Service Name              | Router Module     | #Users | Total Sessions | Backend databases
--------------------------+-------------------+--------+----------------+-------------------
Shard-A-Router            | readwritesplit    |      1 |              1 | server1, server2
Shard-B-Router            | readwritesplit    |      1 |              1 | server3, server4
Sharding-Service          | schemarouter      |      1 |              1 | Shard-A, Shard-B
CLI                       | cli               |      2 |              2 |
--------------------------+-------------------+--------+----------------+-------------------
 
#: database nodes and their status on shards
[root@maxscale ~]# maxadmin list servers
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
server1            | 192.168.50.11   |  3306 |           0 | Master, Running
server2            | 192.168.50.12   |  3306 |           0 | Slave, Running
server3            | 192.168.50.13   |  3306 |           0 | Master, Running
server4            | 192.168.50.14   |  3306 |           0 | Slave, Running
Shard-A            | 192.168.50.11   |  3306 |           0 | Running
Shard-B            | 192.168.50.13   |  3306 |           0 | Running
-------------------+-----------------+-------+-------------+--------------------

Maxadmin can tell us the statistics about the modules and the all the work they have done until now, this is just to register that we haven't yet tested the services as it’s going to be done on the next section. Below you can see the statistics of the configured services:

#: Shard-A-Router (readwritesplit)
[root@maxscale log]# maxadmin show service "Shard-A-Router"
     Service:                             Shard-A-Router
     Router:                              readwritesplit
     State:                               Started
 
     use_sql_variables_in:      all
     slave_selection_criteria:  LEAST_CURRENT_OPERATIONS
     master_failure_mode:       fail_instantly
     max_slave_replication_lag: -1
     retry_failed_reads:        true
     strict_multi_stmt:         true
     disable_sescmd_history:    true
     max_sescmd_history:        0
     master_accept_reads:       false
 
     Number of router sessions:                1
     Current no. of router sessions:           1
     Number of queries forwarded:               2
     Number of queries forwarded to master:     0 (0.00%)
     Number of queries forwarded to slave:      2 (100.00%)
     Number of queries forwarded to all:        1 (50.00%)
     Started:                             Mon Mar  6 01:37:27 2017
     Root user access:                    Disabled
     Backend databases:
          192.168.50.11:3306    Protocol: MySQLBackend    Name: server1
          192.168.50.12:3306    Protocol: MySQLBackend    Name: server2
     Total connections:                   2
     Currently connected:                 1
 
#: Shard-B-Router (readwritesplit)
[root@maxscale log]# maxadmin show service "Shard-B-Router"
     Service:                             Shard-B-Router
     Router:                              readwritesplit
     State:                               Started
 
     use_sql_variables_in:      all
     slave_selection_criteria:  LEAST_CURRENT_OPERATIONS
     master_failure_mode:       fail_instantly
     max_slave_replication_lag: -1
     retry_failed_reads:        true
     strict_multi_stmt:         true
     disable_sescmd_history:    true
     max_sescmd_history:        0
     master_accept_reads:       false
 
     Number of router sessions:                1
     Current no. of router sessions:           1
     Number of queries forwarded:               2
     Number of queries forwarded to master:     0 (0.00%)
     Number of queries forwarded to slave:      2 (100.00%)
     Number of queries forwarded to all:        1 (50.00%)
     Started:                             Mon Mar  6 01:37:27 2017
     Root user access:                    Disabled
     Backend databases:
          192.168.50.13:3306    Protocol: MySQLBackend    Name: server3
          192.168.50.14:3306    Protocol: MySQLBackend    Name: server4
     Total connections:                   2
     Currently connected:                 1
 
#: Sharding-Service
[root@maxscale log]# maxadmin show service "Sharding-Service"
     Service:                             Sharding-Service
     Router:                              schemarouter
     State:                               Started
 
Session Commands
Total number of queries: 0
Percentage of session commands: 0.00
Longest chain of stored session commands: 0
Session command history limit exceeded: 0 times
Session command history: enabled
Session command history limit: unlimited
Shard map cache hits: 0
Shard map cache misses: 0
 
     Started:                             Mon Mar  6 01:37:27 2017
     Root user access:                    Disabled
     Backend databases:
          192.168.50.11:3306    Protocol: MySQLBackend    Name: Shard-A
          192.168.50.13:3306    Protocol: MySQLBackend    Name: Shard-B
     Total connections:                   1
     Currently connected:                 1

Conclusion

The main goal of this first part is to show that we can mix together both routers SchemaRouter and ReadWriteSplit to have a robust solution for sharding databases among a number of sharding servers, initially, without changing any code on the application side. We show that the whole job can be done with MariaDB MaxScale after configuring it the way I showed in this blog. The second part, coming up next, will focus on showing the mechanics of the routing, while going through the details of how it works.

About Wagner Bianchi

Bianchi (@wagnerbianchijr) has been working with the MySQL ecosystem for more than 12 years now. He is an Oracle ACE Director since 2014 and has worked on the most complex environments running Open Source solutions. Bianchi joint MariaDB RDBA Team in 2017 where he works as Principal Remote DBA @ MariaDB Remote DBA Team.

Read all posts by Wagner Bianchi