January 13, 2015

MaxScale and Transparent Session Handling

Applications are often built on top of single MySQL-compliant database instance but often there is a need for more performance and/or availability than what one database instance can provide. Adding slaves or replacing standalone database server with full-fledged MySQL-compliant cluster often requires changes to the application.

MaxScale and its Read Write Split router (rwsplit, for short) attempts to hide the complexity brought by multiple backend servers and to minimize the need for application changes by taking care of authentication, managing connections and providing session management on behalf of the client.

Read Write Split router is a plug-in module, which can be used with MaxScale to receive, examine and forward queries sent by clients. Routing decision is based on query type and some other criteria, such as transaction state, and replication lag. For rwsplit, every cluster consists of a master and slaves. In case of Galera or NDB cluster, the monitor plugin for the particular database environment elects one of the nodes to act as the master node, which rwsplit treats as a master where it routes writes and transactional queries.

The session data can be considered as the state of the client connection, this includes any SQL variables that have been set, the current database and transaction state. Any command that might alter this state is considered to be a session command. Examples of such commands are 'SET @myvar:=5', 'SET AUTOCOMMIT=false', or 'USE test'. Queries after session data modification must see the new data regardless of which backend server rwsplit routes them.

MaxScale's rwsplit router allows for queries to be routed to master or any of the slaves used by the session. Therefore subsequent queries may be executed in different servers, which makes it necessary for an rwsplit router to send session data modifications to all backend servers. It detects when an incoming query includes session update and multiplies the command to all session's backends. Since the client waits for a single reply to the query, and multiple servers send multiple replies to MaxScale, only the first is routed to the client while redundant ones are discarded. On the contrary, queries which read session variable values, for example, SELECT * from mysql.user where host=@myhost, can be routed to any available backend server.

MaxScaleGraphics1.png
Direct client/server connection

 

MaxScaleGraphics2.png
Client connection to MySQL Replication cluster through Read/Write Split router

Session modification commands and execution order

In a single thread execution mode handling session variable is easy: detect, route, receive and filter out all but one response which is then sent to the client. MaxScale, however, is a multi-threaded, non-blocking server where threads are not dedicated to specific client or task. Moreover, rwsplit router executes session variable writes in an optimistic way, which means that first response to session variable update is sent to client - which may then continue sending queries to MaxScale. And of course, responses from backend servers may arrive anytime, in any order, and in multiple pieces. Therefore, it is possible that one of the slaves updates session variable, say @myvar:=5, faster than other backends and its response packet is sent to the client, which immediately tries to read @myvar. Since session variable read can be executed in any backend, rwsplit router may send the read to a slave which haven't even received the variable write yet. In order to maintain the execution order between session updates and subsequent queries, rwsplit suspends the execution of further queries until the backend server has executed all session commands that were initiated before the query.

Adding a backend database node to session

When client connects to rwsplit router service, master and slaves are selected for the session. If master fails during session, session is aborted, but if slave fails it can be replaced with another slave or the failed node can rejoin the session. Search for replacement is triggered by monitor as soon as the failure is detected. A node selection routine is called, and necessary amount of slaves searched and connected. New backend candidate nodes must have all session modifications executed before they can join. That is, new node candidates must catch up with other members first.

Rwsplit saves all session modification commands to session's command history. Each session member is provided with a cursor to the beginning of the command history list and executing the listed commands is part of the process of becoming a session member. As a result, every backend node in the session shares same session data with other session nodes, which makes it possible to balance reads and writes between the cluster nodes.

Limitations

The known limitations are partially due to the nature of variables in MySQL and partially due to limitations in rwsplit implementation. Since session management relies on statement replication (from MaxScale to backend servers), they are executed in multiple backends under different conditions (location, time, etc.). Thus, the result of execution of a session modification command may differ in different nodes. If the new value depends on environment or time, the result typically differs in each of the nodes. Example of such a command is : SELECT @mytime:=now() .

Another known limitation is that if session modification command is embedded into write statement, it won't be replicated to slaves. Rwsplit detects the query type correctly but extraction and replication of partial query is not implemented yet. An example of an embedded session modification command : INSERT INTO test.t1 values (@myvar:=5) .

Current implementation is based on optimistic method, which assumes that a session modification command either fails or succeeds in all backend nodes. Thus, cases, where the first reply (routed to the client) contains different result than the responses from other backends include a risk of invalid behavior. Imagine a case where 'USE test' succeeds in one of the typically slowest slaves but fails in other, more stressed backends - and especially in the master. The client assumes that it is using 'test' instead of 'product' and executes set of writes, which are routed to master where administrator just dropped the test database before the execution of 'USE test'. Thus, optimism and asynchronous replication can potentially cause unwanted damage.

If session modification command routing fails to any of the backends due to backend failure, for example, client session will be closed. This behavior is limited to session modification command routing only. If backend failure is noticed by monitor or in normal SQL query routing, query must be re-executed but session remains untouched.

Summary

In addition to all other its features, MaxScale's Read/Write Split router provides transparent client session management, which, despite its current limitations, makes it possible to use an application originally designed for single database server, with a MySQL-compatible database cluster without modifications. Rwsplit handles the query routing and response transfer in an optimistic way, which every time works as fast as fastest backend server. The current method includes known, yet small risks, but the implementation is also likely to evolve as we get more experiences from the users and from testing.

About Vilho Raatikka

Vilho is a Senior Software Engineer working mainly on MaxScale. Vilho has worked with databases server technologies since the year 2000 for IBM, Solid Information Technology and Helsinki University as software engineer, teaching assistant and researcher.

Read all posts by Vilho Raatikka