Getting Started with the MariaDB HandlerSocket Plugin

About 4 years ago, the guys at DeNA created the HandlerSocket plugin for MySQL. In Yoshinori Matsunobu’s blog, he benchmarks HandlerSocket as more than 7 times the throughput of using the standard libmysql, and nearly twice that of a memcache front end.

So what is HandlerSocket? It is a plugin that bypasses the SQL layer and therefore misses out on the overhead required to parse the SQL. This does mean that you don’t have access to SQL statements, but it does provide CRUD (Create, Read, Update, Delete) operations that effectively gives you a NoSQL access to your database.

HandlerSocket, when started, creates a set of listener processes that wait for connections. Two separate TCP ports are required, one for read operations, and one for write operations.

Let’s take a look at getting HandlerSocket started and some simple use-cases.

I’ve started with a stock Ubuntu 14.04LTS release of Linux, and installed their default mariadb-server and mariadb-client packages by running the command:

sudo apt-get install mariadb-server mariadb-client

Following the MariaDB manual we need to install the plugin to get it running:

INSTALL PLUGIN handlersocket SONAME 'handlersocket.so';

At this point there are supposed to be some processes showing in the SHOW PROCESSLIST output.

MariaDB [(none)]> SHOW PROCESSLIST;
+----+------+-----------+------+---------+------+-------+------------------+----------+
| Id | User | Host      | db   | Command | Time | State | Info             | Progress |
+----+------+-----------+------+---------+------+-------+------------------+----------+
| 28 | root | localhost | NULL | Query   |    0 | NULL  | SHOW PROCESSLIST |    0.000 |
+----+------+-----------+------+---------+------+-------+------------------+----------+
1 row in set (0.00 sec)

Hmm. Not there. I wonder if the plugin is set up?

MariaDB [(none)]> SHOW PLUGINS;
+--------------------------------+----------+--------------------+------------------+---------+
| Name                           | Status   | Type               | Library          | License |
+--------------------------------+----------+--------------------+------------------+---------+
| binlog                         | ACTIVE   | STORAGE ENGINE     | NULL             | GPL     |

...

| handlersocket                  | ACTIVE   | DAEMON             | handlersocket.so | BSD     |
+--------------------------------+----------+--------------------+------------------+---------+
45 rows in set (0.03 sec)

I’ve deleted most of the output, but you can see the last line is showing that the handlersocket daemon is active, so where is it? Let’s take a look at the handlersocket related variables:

MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'handlersocket%';
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
| handlersocket_accept_balance  | 0     |
| handlersocket_address         |       |
| handlersocket_backlog         | 32768 |
| handlersocket_epoll           | 1     |
| handlersocket_plain_secret    |       |
| handlersocket_plain_secret_wr |       |
| handlersocket_port            |       |
| handlersocket_port_wr         |       |
| handlersocket_rcvbuf          | 0     |
| handlersocket_readsize        | 0     |
| handlersocket_sndbuf          | 0     |
| handlersocket_threads         | 16    |
| handlersocket_threads_wr      | 1     |
| handlersocket_timeout         | 300   |
| handlersocket_verbose         | 10    |
| handlersocket_wrlock_timeout  | 12    |
+-------------------------------+-------+
16 rows in set (0.00 sec)

It looks like the defaults as mentioned in the MariaDB manual are a bit off, at least in the version supplied with Ubuntu 14.04. There is no handlersocket_address, handlersocket_port or handlersocket_port_wr set. These specify the bind address, the TCP port to be used for read operations and the TCP port to be used for write operations. This turns out to be a blessing in disguise as the default address of 0.0.0.0 (or INADDR_ANY) would mean that anyone could connect from anywhere, unless there were firewall rules to stop this. Instead let’s use the localhost address of 127.0.0.1 so connections can only be made from the same host as the database server. To do this we need to edit the my.cnf file. In Ubuntu this is found in /etc/mysql/my.cnf.

We will add the following lines to the my.cnf file

handlersocket_address = 127.0.0.1
handlersocket_port = 9998
handlersocket_port_wr = 9999

Now we restart the server

$ sudo service mysql restart

Now let’s take another look at the SHOW PROCESSLIST output:

MariaDB [(none)]> SHOW PROCESSLIST;
+----+-------------+-----------------+---------------+---------+------+-------------------------------------------+------------------+----------+
| Id | User        | Host            | db            | Command | Time | State                                     | Info             | Progress |
+----+-------------+-----------------+---------------+---------+------+-------------------------------------------+------------------+----------+
|  1 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
|  2 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
|  3 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
|  4 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
|  5 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
|  6 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
|  7 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
|  8 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
|  9 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
| 10 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
| 11 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
| 12 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
| 13 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
| 14 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
| 15 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
| 16 | system user | connecting host | NULL          | Connect | NULL | handlersocket: mode=rd, 0 conns, 0 active | NULL             |    0.000 |
| 17 | system user | connecting host | handlersocket | Connect | NULL | handlersocket: mode=wr, 0 conns, 0 active | NULL             |    0.000 |
| 45 | root        | localhost       | NULL          | Query   |    0 | NULL                                      | SHOW PROCESSLIST |    0.000 |
+----+-------------+-----------------+---------------+---------+------+-------------------------------------------+------------------+----------+
18 rows in set (0.00 sec)

That is more like it! Now we have handlersocket installed and listening, what can we do with it?

Having a look at the Protocol document it turns out the protocol is extremely simple. So simple that I’m going to use a fairly standard Linux command, netcat, most commonly aliased as nc, which is a simple network utility for making connections and passing input from a file or the command line. First though we need to create a few tables and populate them.

MariaDB [(none)]> CREATE DATABASE `hs_test`;
Query OK, 1 row affected (0.03 sec)

MariaDB [(none)]> USE `hs_test`
Database changed

MariaDB [hs_test]> CREATE TABLE `people` ( 
  `id` INTEGER NOT NULL AUTO_INCREMENT, 
  `name` VARCHAR(60) NOT NULL DEFAULT '',
  `dob` DATE,
  PRIMARY KEY (`id`),
  UNIQUE KEY `name_key` (`name`) );
Query OK, 0 rows affected (0.08 sec)

MariaDB [hs_test]> INSERT INTO `people` (`name`, `dob`) VALUES
    ('Joe', '1985-02-23'), ('Mary', '1982-03-12'), ('Gordon', '1978-09-02');
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

MariaDB [hs_test]> SELECT * FROM `people`;
+----+--------+------------+
| id | name   | dob        |
+----+--------+------------+
|  1 | Joe    | 1985-02-23 |
|  2 | Mary   | 1982-03-12 |
|  3 | Gordon | 1978-09-02 |
+----+--------+------------+
3 rows in set (0.00 sec)

MariaDB [hs_test]> SELECT `dob` FROM `people` WHERE `name` = 'Gordon';
+------------+
| dob        |
+------------+
| 1978-09-02 |
+------------+
1 row in set (0.00 sec)

Now for some fun. The Handlersocket protocol is a plain-ASCII protocol using simple TAB-separated fields terminated by a LineFeed. First we need to identify the database, table, index and fields we are interested in using, and then we can query using that index. If we want to use the primary index we can use the literal PRIMARY. So, to replicate the last two queries we can do the following:

$ echo -e "Pt1ths_testtpeopletPRIMARYtid,name,dob
1t<=t1t3t10t0" | nc localhost 9998
0	1
0	3	3	Gordon	1978-09-02	2	Mary	1982-03-12	1	Joe	1985-02-23

This might need some explaining. t represents the TAB character, so the first line is sending a literal P, followed by a literal 1, then the name of the database (hs_test), the table name (people), the key name (PRIMARY) and the list of fields we want to retrieve, all separated by tabs. The next line uses the literal 1 from the previous line to indicate the table/index we are interested in and queries for up to 10 records starting at the first record that has the first field (id) less than or equal to 3. The way it is structured is key index (1), comparison operation (<=), the number of variables that follow (1), the value of the field to compare (3), and optionally the limit (10) and offset (0) to use. This would equate to the SQL statement SELECT `id`,`name`,`dob` FROM `hs_test`.`people` WHERE `id` <= 3 LIMIT 10 OFFSET 0;.

The result is again two lines, each with a 0 in the first field, indicating no error. If an error occurred at any stage this would be non-zero and the next field would be an error message. The next field shows the number of results returned, then if there are results, they are provided as tab separated fields.

So now lets look at changing the index and pulling a single entry out.

$ echo -e "Pt1ths_testtpeopletname_keytname,dob
1t=t1tGordon" | nc localhost 9998
0	1
0	2	Gordon	1978-09-02

In this case we’ve used the `name_key` index, and asked for the name to equal ‘Gordon’. Because we’ve left out the limit and offset, they default to 1 and 0 respectively.

A great start! What about adding something in? We can use a similar construction, however we use the + operation to insert a new record.

$ echo -e "Pt1ths_testtpeopletname_keytname,dob
1t+t2tFredt1965-07-07" | nc localhost 9999
0	1
0	1	4

Seems OK, but how does it look via SQL?

MariaDB [hs_test]> SELECT * FROM `people`;
+----+--------+------------+
| id | name   | dob        |
+----+--------+------------+
|  1 | Joe    | 1985-02-23 |
|  2 | Mary   | 1982-03-12 |
|  3 | Gordon | 1978-09-02 |
|  4 | Fred   | 1965-07-07 |
+----+--------+------------+
4 rows in set (0.00 sec)

Wonderful! We’ve only just scratched the surface here but you can already see that there are some interesting possibilities opened up by the handlersocket plugin. A word of caution though, and that is that as you’ve probably noticed I didn’t have to log in to the database to be able to add or extract data. This is because we are working below the access control layer, so we bypass all of those GRANT statements. There is a layer of authentication that can be added, but it uses a plain-text secret, one for each of the read and write ports. They are controlled by the handlersocket_plain_secret and handlersocket_plain_secret_wr variables. The only problem is that some of the clients do not support authentication yet.

I’d love to hear if you are using handlersocket, or if you’ve got an idea for how to exploit its capabilities. I know I have some ideas, and I hope to share them with you soon.