MaxScale - Just What Can It Do Today?
It seems that in all the writing about what MaxScale is, why I think we need it and what we plan to do with it, we have created some confusion about what it can do as of today, mid-February 2014. So I thought I would try to clear things up a little by writing this short post. The version described here is the one available from the SkySQL download site, labelled version 0.4. Later code, with the changes we are working on, is available on GitHub, but I will not cause more confusion by referring to it.
Our design concept for MaxScale is a core with five plugin interfaces that provide different functionality. Currently we have a core that supports the event-driven model we wanted, but we have implemented only three of the five plugin interfaces: the protocol plugin, the query router and the monitor.
The authentication mechanism we are using currently is built into the core. This is not the long-term goal, but something we have done in order to get a version that we can start sharing with everyone. It offers the standard MySQL authentication protocol: client applications authenticate themselves with MaxScale, and MaxScale then deals with the authentication between itself and the backend databases. The usernames and passwords used to authenticate with MaxScale are extracted from the backend database. In the 0.4 version the client host information is not checked; this is considered a bug and is being worked on.
The core also includes the MariaDB parser, which is used by the routers and other modules via an interface we call the query classifier. This allows us to determine whether a statement is a read, a write or a session modification. The result is that we can create routers such as the read/write splitter that we have, but more of this in a moment.
Currently we have a number of protocol modules, the two most important of which are the MySQL Client protocol and the MySQL Backend protocol modules. The client protocol module is used by client applications that connect to MaxScale. In this case MaxScale acts as a server and offers a full implementation of the MySQL protocol, with the exception that it does not support compression or SSL. We currently turn off these capabilities in the handshake we perform with the clients.
The MySQL Backend protocol module is designed to allow MaxScale to attach to a MySQL, MariaDB or Percona Server database. It operates as the client side of the protocol, although it differs from a normal connector because it is designed to take data in a way that is more efficient for the proxy activities it is performing. In particular, if we are doing connection based routing we want to take advantage of having the ready-made packet formats.
The combination of these two protocol modules means that the only database clients and servers we can support with version 0.4 are those that talk the MySQL protocol. In addition to these two protocol modules used for database connections, we have a telnet protocol module that we use for a debugging interface and an experimental HTTP protocol implementation.
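To make the point about turning off compression and SSL in the handshake a little more concrete, here is a minimal sketch in Python. It is not MaxScale code. The two flag values are the capability bits defined by the MySQL client/server protocol; the function name and the idea of applying a single mask are assumptions made purely for illustration.

```python
# Capability bits as defined by the MySQL client/server protocol.
CLIENT_COMPRESS = 0x00000020  # client and server may compress packets
CLIENT_SSL = 0x00000800       # connection may switch to SSL after the handshake


def mask_unsupported(capabilities: int) -> int:
    """Return the capability bitmask with the compression and SSL bits cleared.

    Hypothetical helper: a proxy that cannot handle either feature would
    advertise only the remaining bits in its initial handshake packet.
    """
    return capabilities & ~(CLIENT_COMPRESS | CLIENT_SSL)


if __name__ == "__main__":
    advertised = mask_unsupported(0xFFFF)
    print(hex(advertised))
```

A client that sees neither bit set in the server handshake will not attempt to negotiate compression or SSL on that connection.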
We have two routing modules currently, a connection based router and a statement based router. The connection based router is known as the readconnrouter, which is perhaps not the best name for it. What this router does is distribute connections evenly between a set of servers. It keeps a count of the number of connections currently open to each server via MaxScale and uses this to find the server with the fewest connections. If multiple servers have the same number of connections, it will round-robin between them. The connection router can be configured with options that narrow the set of servers it considers when it looks for a server to route a connection to. The options supported in the 0.4 version are:
- master - Only route to servers that are marked as replication masters
- slave - Only route to servers that are marked as replication slaves
- synced - Only route to servers that are marked as synced, in a Galera cluster
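The selection logic described above, filtering by router option and then picking the server with the fewest open connections, might be sketched as follows. This is an illustration in Python rather than the actual router code; the class names, the `status` set and the `up` flag are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Server:
    name: str
    status: Set[str] = field(default_factory=set)  # e.g. {"master"}, {"slave"}, {"synced"}
    up: bool = True
    connections: int = 0  # connections currently open via the proxy


class ConnectionRouter:
    """Hypothetical least-connections router with a single optional filter."""

    def __init__(self, servers: List[Server], option: Optional[str] = None):
        self.servers = servers
        self.option = option  # "master", "slave", "synced" or None
        self._turn = 0        # rotating offset used to break ties

    def candidates(self) -> List[Server]:
        # Only servers that are up are considered; the option narrows the set.
        pool = [s for s in self.servers if s.up]
        if self.option is not None:
            pool = [s for s in pool if self.option in s.status]
        return pool

    def route(self) -> Optional[Server]:
        pool = self.candidates()
        if not pool:
            return None
        fewest = min(s.connections for s in pool)
        tied = [s for s in pool if s.connections == fewest]
        # Rotate through the tied servers rather than always taking the first.
        chosen = tied[self._turn % len(tied)]
        self._turn += 1
        chosen.connections += 1
        return chosen


if __name__ == "__main__":
    servers = [
        Server("db1", {"master"}),
        Server("db2", {"slave"}),
        Server("db3", {"slave"}),
    ]
    read_service = ConnectionRouter(servers, option="slave")
    for _ in range(4):
        print(read_service.route().name)
```

The sketch does not decrement the count when a connection closes, which a real router would need to do.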
These router options allow a service in MaxScale to be set up for read/write connections and have them sent only to the master, by specifying the router option "master". Another service, on a different port, can use the same set of servers but, by using the router option "slave", route to just the slave servers. This allows read-only connections to scale the read load by making use of the slaves. The synced option is for Galera clusters and will distribute connections across all the servers that are synced and part of the Galera cluster. If no router option is given, the router will distribute connections across all the servers that are available.
The read/write splitter module is a statement based router. It uses the query classifier to parse the query and determine whether it is a read statement, a write statement or a session modification statement. Read statements are sent to a slave server, whilst write statements are sent to the master server. Session modification statements, such as those that modify a variable or change the default database, are sent to all the servers. Statements that may have unknown side effects, such as selects that use a stored procedure or user defined function, are sent to the master. Any statement that fails to parse is also sent to the master, rather than relying on the parser that is built into MaxScale.
The read/write splitter is only designed to work with MySQL replication and does not support any router options. However, a bug in the current version means that it does not ignore the "master" and "slave" options; these should not be given, as they cause the router to fail. There are also some limitations in the 0.4 version that we are working on. These limitations are:
- Session modification commands may result in multiple returns being sent to the client for a single statement
- Very long statements that require multiple packets fail to parse and are thus always routed to the master
- Prepared statements and the execution of prepared statements may not always occur on the right server
- Transaction support is currently missing - a transaction should be executed on a single server
- The choice of master and slave server is currently only made at connect time; if the master moves, the connection to the client will fail. Reconnection to the new master should be handled by the router without intervention by the client.
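The routing rules of the read/write splitter, leaving the limitations above aside, can be summarised in a small Python sketch. The real router relies on the query classifier and the MariaDB parser; the keyword check below is a deliberately naive stand-in, and the function name and the `calls_routine` and `parsed_ok` parameters are hypothetical.

```python
from enum import Enum


class Target(Enum):
    MASTER = "master"
    SLAVE = "slave"
    ALL = "all"


# Naive stand-in for the classifier: statements that change session state.
SESSION_KEYWORDS = {"SET", "USE"}


def route_statement(sql: str, calls_routine: bool = False, parsed_ok: bool = True) -> Target:
    """Pick a target for one statement, following the rules described above.

    calls_routine: the statement uses a stored procedure or user defined function.
    parsed_ok:     the parser was able to handle the statement.
    """
    if not parsed_ok:
        return Target.MASTER          # unparseable statements go to the master
    words = sql.strip().split(None, 1)
    if not words:
        return Target.MASTER          # nothing to classify, stay on the safe side
    keyword = words[0].upper()
    if keyword in SESSION_KEYWORDS:
        return Target.ALL             # session modifications go to every server
    if keyword == "SELECT" and not calls_routine:
        return Target.SLAVE           # plain reads go to a slave
    return Target.MASTER              # writes and anything with side effects


if __name__ == "__main__":
    for stmt in ("SELECT * FROM t", "INSERT INTO t VALUES (1)", "USE test"):
        print(stmt, "->", route_statement(stmt).value)
```

Defaulting to the master whenever the classification is in doubt mirrors the behaviour described for statements that fail to parse.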
We also have one other module that implements the router API, the debugcli module. This is not a true router, as it does not route queries; instead it provides a debug hook into MaxScale so that its internal state can be examined. It makes use of the telnet protocol to allow connections to MaxScale and supports a simple set of administrative commands. This is documented in one of the PDF files that can be found in the Documentation directory in the GitHub repository.
Currently there are two monitor modules available in MaxScale, the mysqlmon and the galeramon. The MySQL monitor is designed for use with Master/Slave replication environments and gives MaxScale the data it needs to determine whether each database server is up and whether it is currently a master or a slave. This information is used by the routers when they determine the set of servers to which to route connections or queries. The galeramon monitor is designed for Galera clusters and looks at the Galera status variables to determine whether each database server is part of the cluster and eligible for statement execution. It will set the "synced" property on only those servers that report as being synced to the cluster.
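As a rough illustration of what a monitor hands to the routers, the Python sketch below turns probe results into the status flags mentioned above. It is not the monitor code. The function names and inputs are hypothetical, and both the use of the Galera status variable `wsrep_local_state` (where the value 4 means Synced) and the rule that a reachable server which is not replicating is the master are simplifying assumptions about how the checks might be made.

```python
from typing import Mapping, Set


def galera_status(status_vars: Mapping[str, object]) -> Set[str]:
    """Return {"synced"} only if the node reports the Synced state.

    Assumption: the check is made on wsrep_local_state, where 4 means Synced.
    """
    if str(status_vars.get("wsrep_local_state")) == "4":
        return {"synced"}
    return set()


def replication_status(reachable: bool, is_replicating: bool) -> Set[str]:
    """Classify a server in a Master/Slave setup.

    Assumption for this sketch: a reachable server that is not replicating
    from another server is treated as the master.
    """
    if not reachable:
        return set()  # server is down, no flags at all
    role = "slave" if is_replicating else "master"
    return {"running", role}


if __name__ == "__main__":
    print(galera_status({"wsrep_local_state": "4"}))
    print(sorted(replication_status(reachable=True, is_replicating=True)))
```

A router option such as "slave" or "synced" then amounts to a membership test against the set of flags the monitor last produced for each server.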
Was I Clear?
Hopefully I have been clear here about what we support, and have cleared up some of the confusion that I probably caused in my enthusiasm to share our ideas with you all. Things are changing all the time of course; there are already fixes and improvements in the GitHub repository and we are working on many more. In the next few weeks I will post another update on what we have achieved. I hope this all makes sense. Please feel free to comment if there is still some confusion, if you wish to point us in a particular direction, or if you would like to get more involved in what we are doing.
Source Code: https://github.com/SkySQL/MaxScale
Google Group: https://groups.google.com/forum/#!forum/maxscale
Download: http://www.skysql.com/downloads/maxscale-mysql-mariadb