MaxScale 24.02 Simple Sharding with Two Servers
Sharding is the method of splitting a single logical database server into
separate physical databases. This tutorial describes a very simple way of
sharding. Each schema is located on a different database server and MariaDB
MaxScale's schemarouter module is used to combine them into a single logical
database server.
This tutorial was written for Ubuntu 22.04, MaxScale 23.08 and . In addition to
the MaxScale server, you'll need two MariaDB servers which will be used for the
sharding. The installation of MariaDB is not covered by this tutorial.
Installing MaxScale
The easiest way to install MaxScale is to use the MariaDB repositories.
This tutorial uses a broader set of grants than is required for the sake of
brevity and backwards compatibility. For the minimal set of grants, refer to the .
All MaxScale configurations require at least two accounts: one for reading
authentication data and another for monitoring the state of the
database. Services will use the first one and monitors will use the second
one. In addition to this, we want to have a separate account that our
application will use.
All of the users must be created on both of the MariaDB servers.
Creating the Schemas and Tables
Each server will hold one unique schema which contains the data of one specific
customer. We'll also create a shared schema that is present on all shards that
the shard-local tables can be joined into.
Create the tables on the first server:
Create the tables on the second server:
Configuring MaxScale
The MaxScale configuration is stored in /etc/maxscale.cnf.
First, we configure two servers we will use to shard our database. The db-01
server has the customer_01 schema and the db-02 server has the customer_02
schema.
The next step is to configure the service which the users connect to. This
section defines which router to use, which servers to connect to and the
credentials to use. For sharding, we use schemarouter router and the
service_user credentials we defined earlier. By default the schemarouter warns
if two or more nodes have duplicate schemas so we need to ignore them withignore_tables_regex=.*.
After this we configure a listener for the service. The listener is the actual
port that the user connects to. We will use the port 4000.
The final step is to configure a monitor which will monitor the state of the
servers. The monitor will notify MariaDB MaxScale if the servers are down. We
add the two servers to the monitor and use the monitor_user credentials. For
the sharding use-case, the galeramon module is suitable even if we're not
using a Galera cluster. The schemarouter is only interested in whether the
server is in the Running state or in the Down state.
After this we have a fully working configuration and the contents of/etc/maxscale.cnf should look like this.
Then you're ready to start MaxScale.
Testing the Sharding
MariaDB MaxScale is now ready to start accepting client connections and routing
them. Queries are routed to the right servers based on the database they target
and switching between the shards is seamless since MariaDB MaxScale keeps the
session state intact between servers.
To test, we query the schema that's located on the local shard and join it to
the shared table.
The sharding also works even if no default database is selected.
One limitation of this sort of simple sharding is that cross-shard joins are not possible.
In most multi-tenant situations, this is an acceptable limitation. If you do
need cross-shard joins, the storage
engine will provide you this.
This page is licensed: CC BY-SA / Gnu FDL