Using MariaDB MaxScale for Schema Sharding With Galera

spacer

Schema Sharding is a useful means of distributing the load over multiple MariaDB Enterprise Server instances when you have multiple autonomous users or sets of users. In some cases, the users run the same application where each group of users has the same database schema but different data and in other cases the groups of users have completely different schema and data, but they run in the same MariaDB Enterprise Server instance.

This blog post shows how to implement Schema Sharding with MariaDB MaxScale and also how to combine this with Galera Cluster and how two routers in MariaDB MaxScale can be daisy-chained using the MariaDB MaxScale targets parameter for MariaDB MaxScale routers.

The Problem to Solve

To distribute load across databases, or schemas, in multiple MariaDB Enterprise Server instances we can separate the groups of users, where each group is granted access to a single database in the MariaDB Enterprise Server instance.

To distribute load across databases, or schemas, in multiple MariaDB Enterprise Server instances we can separate the groups of users, where each group is granted access to a single database in the MariaDB Enterprise Server instance.

 

At some point the load on that single MariaDB Enterprise Server instance might get just too high and you want to distribute the groups of users and their data over several instances, like this:

Diagram: distribute the groups of users and their data over several instances

 

There are a few issues with this though, one issue is that as some groups are relocated, their point of access changes and secondly we also need to move the relevant data with it. In addition, every time we have a need to rebalance, we have to go through the same process of moving the access points for a user. What would be nice to be able to have a single point of access, independent of which database that is being accessed and then have a service that would direct traffic to the appropriate MariaDB Enterprise Server based on the database that is being accessed, MariaDB MaxScale Schema Router does just this.

Diagram: MariaDB MaxScale Schema Router

 

And there are more uses for this, let us say we have a single application with 2 or more distinct databases for different uses, maybe one database is rather static and defines the Web-interface, one is the transactional OLTP database and one database contains data for reports. Due to the load on the transactional database, you do not want to have all these databases in one MariaDB Enterprise Server instance, so you would want two instances, but you still want a single access point, the MariaDB MaxScale Schema Router can do this also.

With this background in place, let’s look at how to get this work. And let’s look at how we add high-availability to this and how MariaDB MaxScale is geared up to do this.

MaxScale schema sharding diagram

Schema Sharding in Practice

With this explanation out of the way, let us have a look at Schema Sharding in practice.

Configuring Schema Sharding

The following assumes that you already have set up the necessary MariaDB Enterprise Servers as well as MariaDB MaxScale. What we need to do then is to define the servers in the MariaDB MaxScale configuration, note that the following assumes that the MaxScale user (in this case maxuser) has been defined in MariaDB Enterprise Server and that all the necessary GRANTs have been made.

[server1]
type=server
address=srv1
protocol=MariaDBBackend
monitoruser=maxuser
monitorpw=maxpwd
proxy_protocol=on

[server2]
type=server
address=srv2
protocol=MariaDBBackend
monitoruser=maxuser
monitorpw=maxpwd
proxy_protocol=on

With this in place we next have to set up monitoring of these two servers

[monitor1]
type=monitor
module=mariadbmon
servers=server1,server2
user=maxuser
password=maxpwd

And then can set up the Schema Router

[router1]
type=service
router=schemarouter
cluster=monitor1
auth_all_servers=true
user=maxuser
password=maxpwd

The only setting here that is somewhat unusual is the auth_all_servers one. What this does is that it gets authorization information from all servers, not just the first one. The reason is that in the case of the Schema Router, not all users and not all GRANTs are the same on all Servers, that is actually the whole idea with this router (in MariaDB MaxScale 24.02 and up this is automatic and this setting is not necessary).

The last thing we need to do to make this work is to set up the listener:

[listener1]
type=listener
service=router1
protocol=MariaDBClient
port=4600

With this in place, restart MariaDB MaxScale and then we can test Schema Sharding and see how it works.

 

Testing Schema Sharding With One Database Per User

We are going to do a simple test with the databases and users outlined above, so first we have to create these users and databases in the appropriate instance, first on the MariaDB Enterprise server on srv1

$ mariadb -h srv1 -u root -p
Enter password: ******
MariaDB> CREATE DATABASE db1;
MariaDB> CREATE DATABASE db3;
MariaDB> CREATE USER 'g1u1'@'%' IDENTIFIED BY 'g1u1';
MariaDB> CREATE USER 'g3u1'@'%' IDENTIFIED BY 'g3u1';
MariaDB> GRANT ALL ON db1.* TO 'g1u1'@'%';
MariaDB> GRANT ALL ON db3.* TO 'g3u1'@'%';

And then on srv2

$ mariadb -h srv2 -u root -p
Enter password: ******
MariaDB> CREATE DATABASE db2;
MariaDB> CREATE USER 'g2u1'@'%' IDENTIFIED BY 'g2u1';
MariaDB> GRANT ALL ON db2.* TO 'g2u1'@'%';

Then we can test this by logging in through MariaDB MaxScale, first on srv1

$ mariadb -h maxscalehost -P 4600 -u g1u1 -p db1
Enter password: ******
MariaDB> SELECT @@hostname;
+------------+
| @@hostname |
+------------+
| srv1       |
+------------+
1 row in set (0.001 sec)
MariaDB> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| db1                |
| information_schema |
+--------------------+
2 rows in set (0.000 sec)

And then we can test connecting on srv2

$ mariadb -h maxscalehost -P 4600 -u g2u1 -p db2
Enter password: ******
MariaDB> SELECT @@hostname;
+------------+
| @@hostname |
+------------+
| srv2       |
+------------+
1 row in set (0.000 sec)

As can be seen, the connection is forwarded to the appropriate host, based on which database is being accessed and each user has access to just one database.

 

Testing Schema Sharding With Multiple Schemas Per User

Here we are going to show how multiple schemas, in multiple MariaDB Enterprise Server instances can be accessed as they were in a single instance. For this we first create the relevant databases and a user first on srv1

$ mariadb -h srv1 -u root -p
Enter password: ******
MariaDB> CREATE DATABASE web;
MariaDB> CREATE DATABASE reporting;
MariaDB> CREATE USER 'user1'@'%' IDENTIFIED BY 'user1';
MariaDB> GRANT ALL ON web.* TO 'user1'@'%';
MariaDB> GRANT ALL ON reporting.* TO 'user1'@'%';

And then on srv2

$ mariadb -h srv2 -u root -p
Enter password: ******
MariaDB> CREATE DATABASE oltp;
MariaDB> CREATE USER 'user1'@'%' IDENTIFIED BY 'user1';
MariaDB> GRANT ALL ON oltp.* TO 'user1'@'%';

With this in place lets connect to MariaDB MaxScale

$ mariadb -h maxscalehost -P 4600 -u g2u1 -p web
Enter password: ******
MariaDB> SELECT @@hostname;
+------------+
| @@hostname |
+------------+
| srv1       |
+------------+
1 row in set (0.002 sec)
MariaDB> use oltp
+------------+
| @@hostname |
+------------+
| srv2       |
+------------+
1 row in set (0.001 sec)

As can be seen, depending on the current database we are connected to two different MariaDB Enterprise Server instances.

 

Introducing Galera for High Availability

Using Galera with MariaDB Enterprise Server is a great way to add High Availability to MariaDB Enterprise Server and using MariaDB MaxScale with Galera makes the use of Galera transparent and efficient, with load balancing and failover and many other features. But how can we combine MariaDB MaxScale configured for Schema Sharding with a configuration for Galera? This is what we will look at next.

Assuming that we have three distinct groups of users, all accessing their own database through a single access point. Each database resides in a MariaDB Enterprise Galera cluster and MariaDB MaxScale needs to be configured both for Galera and Schema Sharding.

Diagram: Galera for high availability configuration

Configuring MariaDB MaxScale for Galera

Here I again assume that the basic MariaDB MaxScale configuration is done and it is also assumed that we have two MariaDB Enterprise Galera clusters configured and running, then the first thing we need to do is to define the 6 servers in the two Galera clusters first.

[server11]
type=server
address=srv1a
protocol=MariaDBBackend

[server12]
type=server
address=srv1b
protocol=MariaDBBackend

[server13]
type=server
address=srv1c
protocol=MariaDBBackend

[server21]
type=server
address=srv2a
protocol=MariaDBBackend

[server22]
type=server
address=srv2b
protocol=MariaDBBackend

[server23]
type=server
address=srv2c
protocol=MariaDBBackend

Following this, we need to set up monitoring for these two clusters

[monitor11]
type=monitor
module=galeramon
servers=server11,server12,server13
user=maxuser
password=maxpwd

[monitor21]
type=monitor
module=galeramon
servers=server21,server22,server23
user=maxuser
password=maxpwd

So far, there is nothing unusual about this, and at this point we are about to set up routers for these two clusters

[router11]
type=service
router=readwritesplit
cluster=monitor11
user=maxuser
password=maxpwd

[router21]
type=service
router=readwritesplit
cluster=monitor21
user=maxuser
password=maxpwd

At this point we would usually create a listener, but we are not going to do that this time, as these two clusters are not to be accessed directly but only through the schema router, so the next step is to set up the schema router

[router31]
type=service
router=schemarouter
targets=router11,router21
auth_all_servers=true
user=maxuser
password=maxpwd

As you can see we are not doing schema sharding over servers or clusters here, instead the targets setting is used to point to the routers we are to shard over, which is the two Galera readwritesplit routers we defined above.

 

Testing Schema Sharding with Galera

To test that Schema Sharding works as it should with Galera Cluster, let’s set up the users and databases as outlined in the illustration above, one cluster at the time

$ mariadb -h srv1a -u root -p
Enter password: ******
MariaDB> CREATE DATABASE db1;
MariaDB> CREATE DATABASE db3;
MariaDB> CREATE USER 'g1u1'@'%' IDENTIFIED BY 'g1u1';
MariaDB> CREATE USER 'g3u1'@'%' IDENTIFIED BY 'g3u1';
MariaDB> GRANT ALL ON db1.* TO 'g1u1'@'%';
MariaDB> GRANT ALL ON db3.* TO 'g3u1'@'%';

And the in the other cluster

$ mariadb -h srv2a -u root -p
Enter password: ******
MariaDB> CREATE DATABASE db2;
MariaDB> CREATE USER 'g2u1'@'%' IDENTIFIED BY 'g2u1';
MariaDB> GRANT ALL ON db2.* TO 'g2u1'@'%';

With this in place, let us connect to MariaDB MaxScale and see which cluster we end up with

$ mariadb -h maxscalehost -P 4700 -u g1u1 -p db1
Enter password: ******
MariaDB> SELECT @@wsrep_cluster_name;
+----------------------+
| @@wsrep_cluster_name |
+----------------------+
| clu1                 |
+----------------------+
1 row in set (0.001 sec)

And then we try the same with the other user

$ mariadb -h maxscalehost -P 4700 -u g3u1 -p db2
Enter password: ******
MariaDB> SELECT @@wsrep_cluster_name;
+----------------------+
| @@wsrep_cluster_name |
+----------------------+
| clu2                 |
+----------------------+
1 row in set (0.002 sec)

Conclusion

Schema Sharding is a smart means of distributing load over several MariaDB Enterprise Server instances and MariaDB MaxScale can make this pretty much transparent. This can also be combined with Galera or a replication cluster by combining the Schema Sharding router with the ReadWriteSplit routers using the Targets parameter.

Give it a try. Download MariaDB MaxScale today at mariadb.com/downloads.