MaxScale read/write splitting, authentication and networking updates
It is now three months since we released MaxScale publicly on GitHub. We have not been sitting idle during that time; we have been adding new features and fixing bugs. We focused on three main areas of functionality: statement routing, authentication and network support.
Our main focus has been on the read/write splitter routing module. As it stood when we first put the code on GitHub, there were a number of limitations on its use. We wanted to remove as many of these as we could and refresh what we had made available, both on GitHub and as a downloadable binary.
This might seem like a strange thing, but one of the considerations when doing statement-based routing, in which statements in a single client connection may be diverted to one of many backend servers, is managing statements that have an effect on the execution environment. It is important to ensure that the same effect is propagated to any server to which a future statement might be sent. A simple example of this is the USE statement, although other statements, such as SET, also modify the client session.
Consider a simple environment in which we have two possible servers to which to route statements: server A and server B. Server A handles any statement that modifies the database content, and server B is used to read data when the statement does not update the database. If we route the USE statement only to server A, we will potentially read the wrong data back when we send a SELECT statement, since this will be routed to server B. The opposite applies if we send the USE statement only to server B: INSERT, DELETE and UPDATE will be run on the wrong database. Therefore we must send the USE statement to both server A and server B.
The problem now is that the client has sent one statement to MaxScale, but MaxScale has sent two statements, albeit identical, to two different backend servers. MaxScale cannot simply send both responses back to the client, otherwise the client would get two responses to the same statement. MaxScale therefore sends back only the first reply it gets from either server. This solves the problem for the client, but gives MaxScale another issue. As soon as the client receives the response it may send another statement, so MaxScale has to be careful not to forward that statement to a server that is still executing a previous session command, such as the USE statement in the example above.
There may also be multiple successive session modification commands in the incoming client stream. MaxScale must therefore preserve the order of these commands, and manage their overlapping execution, across the various backend servers.
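The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not MaxScale's actual implementation (which is written in C); the names Backend, SessionRouter, session_command, backend_reply and route are all hypothetical. Each backend keeps a FIFO queue of work it has not yet acknowledged, session commands are broadcast to every backend, only the first reply is passed to the client, and ordinary statements are held behind any session command still outstanding on their target backend.

```python
from collections import deque


class Backend:
    """Hypothetical stand-in for one backend server connection."""

    def __init__(self, name):
        self.name = name
        # FIFO of (command_id, statement). command_id is None for an
        # ordinary statement waiting behind a session command.
        self.queue = deque()
        self.executed = []  # statements in the order the server ran them


class SessionRouter:
    """Broadcasts session commands and returns one reply per command."""

    def __init__(self, backends):
        self.backends = backends
        self.next_id = 0
        self.answered = set()     # ids of session commands already answered
        self.client_replies = []  # what the client actually sees

    def session_command(self, stmt):
        """Send a session-modifying statement (USE, SET, ...) to every backend."""
        cmd_id = self.next_id
        self.next_id += 1
        for backend in self.backends:
            backend.queue.append((cmd_id, stmt))
        return cmd_id

    def backend_reply(self, backend):
        """Process the reply to the oldest outstanding session command."""
        cmd_id, stmt = backend.queue.popleft()
        backend.executed.append(stmt)
        if cmd_id not in self.answered:
            # First reply for this command: forward it to the client.
            self.answered.add(cmd_id)
            self.client_replies.append((backend.name, stmt))
        # Later replies for the same command are silently discarded.
        # Release ordinary statements queued behind the command, stopping
        # at the next session command so that ordering is preserved.
        while backend.queue and backend.queue[0][0] is None:
            backend.executed.append(backend.queue.popleft()[1])

    def route(self, stmt, backend):
        """Route an ordinary statement, holding it if the backend is behind."""
        if backend.queue:
            backend.queue.append((None, stmt))
        else:
            backend.executed.append(stmt)
```

With two backends A and B, calling session_command("USE test") and then backend_reply(A) gives the client its single reply. A SELECT routed to B at that point is held in B's queue and only runs after backend_reply(B) has been processed, so B never executes it against the wrong database.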
Statement routing needs to be aware of transactional boundaries in the incoming stream of statements from a client, and must ensure that all the statements within a transaction are executed within a single transaction on a single backend database server. The approach the read/write splitter currently takes is to send the entire transaction to the master server. This is done because it is not possible to determine in advance whether the transaction will at any point cause a database update, so the safest option is to send all transactions to the master server. There is currently a limitation in the read/write splitter implementation: it does not take note of implicit commit operations caused by enabling autocommit within a running transaction.
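As a rough illustration of this rule, the sketch below tracks whether the client is inside an explicit transaction and picks a target for each statement. It is a simplification written for this post, not the splitter's real classifier: the class name TxnAwareSplitter and the keyword matching are assumptions, a real router would rely on a proper SQL parser, and the sketch shares the limitation noted above in that it ignores implicit commits.

```python
class TxnAwareSplitter:
    """Hypothetical sketch: choose 'master' or 'slave' for each statement."""

    def __init__(self):
        self.in_transaction = False

    def target(self, stmt):
        words = stmt.strip().rstrip(";").upper().split()
        first = words[0] if words else ""

        # An explicit transaction start pins the session to the master.
        if first == "BEGIN" or words[:2] == ["START", "TRANSACTION"]:
            self.in_transaction = True
            return "master"

        # COMMIT or a full ROLLBACK ends the transaction. The statement
        # itself still goes to the master, where the transaction lives.
        # ROLLBACK TO SAVEPOINT does not end the transaction.
        if first == "COMMIT" or (first == "ROLLBACK" and words[1:2] != ["TO"]):
            self.in_transaction = False
            return "master"

        # Inside a transaction everything goes to the master, because a
        # later statement might still modify data.
        if self.in_transaction:
            return "master"

        # Outside a transaction, plain reads may go to a slave.
        return "slave" if first == "SELECT" else "master"
```

A SELECT issued outside a transaction is sent to a slave, while the same SELECT issued between BEGIN and COMMIT is sent to the master along with the rest of the transaction.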
MaxScale has been updated to handle authentication in the same way as MySQL itself. It is not just a username and password that is used to authenticate, but rather a triple consisting of username, password and client host. This allows MaxScale to apply the same rules as the backend servers themselves apply, so user A from host X is treated as different from user A at host Y. It should be noted, however, that the backend database sees all users as connecting from the host that runs MaxScale, so users must also be granted permission to connect from the MaxScale host. The password for the MaxScale host must be the same as the password used to connect from the originating host.
One problem with previous versions of MaxScale was that the user data was only loaded at startup. This meant that any changes or additions to the users within the backend database were not reflected within MaxScale until the next restart. This restriction has been removed, with MaxScale now reloading the user data whenever there is an authentication failure. This reload facility is rate limited to prevent bogus authentication requests being used to launch a denial of service attack on MaxScale or on the backend servers themselves.
The other change related to authentication is the introduction of an explicit configuration option, set per MaxScale service, that can be used to enable or disable access with the root user.
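The combination of a user and host lookup with a rate-limited reload can be sketched as follows. Everything here is illustrative: the class UserCache, the loader callback and the 30 second minimum interval are assumptions made for the example, and the sketch matches hosts exactly rather than implementing MySQL's wildcard host patterns.

```python
import time


class UserCache:
    """Hypothetical cache of (user, host) -> password hash entries."""

    def __init__(self, loader, min_reload_interval=30.0, clock=time.monotonic):
        self.loader = loader                    # fetches users from the backend
        self.min_reload_interval = min_reload_interval
        self.clock = clock
        self.users = loader()
        self.last_reload = clock()

    def _matches(self, user, host, password_hash):
        key = (user, host)
        return key in self.users and self.users[key] == password_hash

    def authenticate(self, user, host, password_hash):
        if self._matches(user, host, password_hash):
            return True
        # On failure, refresh the user data, but no more often than the
        # minimum interval, so that repeated bogus logins cannot force
        # a flood of reloads against the backend servers.
        now = self.clock()
        if now - self.last_reload >= self.min_reload_interval:
            self.users = self.loader()
            self.last_reload = now
            return self._matches(user, host, password_hash)
        return False
```

A user added to the backend after startup is rejected at first, then accepted on the first failed attempt that arrives after the minimum interval has elapsed, without any restart.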
The two main changes in the networking area are the ability to bind the listeners for a service to a particular address, and the option to use Unix domain sockets for connections from clients to MaxScale. Unix domain sockets allow for a more efficient implementation when MaxScale is installed on the same host as the client software, e.g. an application server. They also give the option of improved security by blocking network access to those MaxScale instances.
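For readers less familiar with the two listener types, the following Python sketch shows the difference at the socket level. It is not MaxScale code; the function names and the socket file permissions are assumptions chosen for the example.

```python
import os
import socket


def tcp_listener(address, port):
    """Listen only on one local address instead of every interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((address, port))  # e.g. "127.0.0.1" rather than "0.0.0.0"
    sock.listen(16)
    return sock


def unix_listener(path):
    """Listen on a Unix domain socket, reachable only from the same host."""
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket file from an earlier run
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    os.chmod(path, 0o660)  # access is then governed by file permissions
    sock.listen(16)
    return sock
```

A listener bound to 127.0.0.1 cannot be reached from other machines, and a Unix domain socket has no network address at all, which is what makes it possible to block network access entirely when the client and MaxScale share a host.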
A great many bug fixes and smaller changes have been incorporated into MaxScale. Details of the bug fixes can be found on the SkySQL Bugzilla system (http://bugs.skysql.com). A release note is available within the GitHub repository for MaxScale, as is all of the code. For those of you who would rather download a binary than compile the source, one is also available on the SkySQL download site.
Source Code: https://github.com/SkySQL/MaxScale
GoogleGroup: https://groups.google.com/forum/#!forum/maxscale
Download: https://downloads.skysql.com/files/SkySQL/MaxScale
IRC: #maxscale on freenode