How to get MySQL 5.6 parallel replication and Percona XtraBackup to play nice together

spacer

MySQL 5.6 introduces a new “multi-threaded slave” feature (aka “parallel replication”) that turns the “slave SQL thread” into a number of distinct threads that each apply events to different schemas. This means that separate slave “worker threads” will be executing different positions in the slave’s relay log (corresponding, of course, to different positions in the master’s binary log) and that the Relay_Master_Log_File and Exec_Master_Log_Pos values in SHOW SLAVE STATUS output are no longer an accurate indicator of the current position of the slave. This causes severe problems if using Percona XtraBackup/innobackupex with the --slave-info option to get a backup that can be used to seed a new slave. Let’s take a look at how multi-threaded slave works in MySQL 5.6, how innobackupex gets position information, how to recognize problems, and how to work around them to get a consistent backup that can be used for seeding a new slave. When a MySQL 5.6 slave is started with slave_parallel_workers set great than 0 (the default), you can identify the additional worker threads in SHOW PROCESSLIST output. Here’s an example with slave_parallel_workers=3:

mysql> show processlist;
+----+-------------+-----------+------+---------+------+-----------------------------------------------------------------------------+------------------+
| Id | User        | Host      | db   | Command | Time | State                                                                       | Info             |
+----+-------------+-----------+------+---------+------+-----------------------------------------------------------------------------+------------------+
|  1 | root        | localhost | NULL | Query   |    0 | init                                                                        | show processlist |
| 27 | system user |           | NULL | Connect |    4 | Waiting for master to send event                                            | NULL             |
| 28 | system user |           | NULL | Connect |    4 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL             |
| 29 | system user |           | NULL | Connect |    4 | Waiting for an event from Coordinator                                       | NULL             |
| 30 | system user |           | NULL | Connect |    4 | Waiting for an event from Coordinator                                       | NULL             |
| 31 | system user |           | NULL | Connect |    4 | Waiting for an event from Coordinator                                       | NULL             |
+----+-------------+-----------+------+---------+------+-----------------------------------------------------------------------------+------------------+
6 rows in set (0.00 sec)

Thread 1 is the thread executing SHOW FULL PROCESSLIST, thread 27 is the slave IO thread, thread 28 is what would traditionally be the “slave SQL thread”, known in the context of parallel replication as the “Coordinator”, and threads 29, 30, and 31 are the slave “worker” threads, the threads that actually execute events on the slave. The multi-threaded slave feature in MySQL 5.6 works by segragating traffic by schema. That is, queries/transactions affecting separate schemas can be handled by different worker threads. I’ve created two schemas (“db1” and “db2”), each of which has a table “t1” with a single auto-increment primary key column:

CREATE TABLE `t1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB

I am executing two simple bash commands that insert rows into those two tables:

{ while :; do printf 'INSERT INTO t1 () VALUES ();n'; done; } | 
    ./bin/mysql --defaults-file=master.cnf db1 & 
{ while :; do printf 'INSERT INTO t1 () VALUES ();n'; done; } | 
    ./bin/mysql --defaults-file=master.cnf db2 &

I’m using relay_log_info_repository=TABLE to make it more easy to see what is happening with each of the worker threads. When using the table-based relay log info repository, you can consult mysql.slave_worker_info for information about the state of each of the worker threads:

mysql> select * from mysql.slave_worker_infoG
*************************** 1. row ***************************
                        Id: 1
            Relay_log_name:
             Relay_log_pos: 0
           Master_log_name:
            Master_log_pos: 0
 Checkpoint_relay_log_name:
  Checkpoint_relay_log_pos: 0
Checkpoint_master_log_name:
 Checkpoint_master_log_pos: 0
          Checkpoint_seqno: 0
     Checkpoint_group_size: 64
   Checkpoint_group_bitmap:
*************************** 2. row ***************************
                        Id: 2
            Relay_log_name: ./db1-relay-bin.000007
             Relay_log_pos: 685319
           Master_log_name: db1-bin.000005
            Master_log_pos: 38233290
 Checkpoint_relay_log_name: ./db1-relay-bin.000007
  Checkpoint_relay_log_pos: 617615
Checkpoint_master_log_name: db1-bin.000005
 Checkpoint_master_log_pos: 38165586
          Checkpoint_seqno: 363
     Checkpoint_group_size: 64
   Checkpoint_group_bitmap: ...
*************************** 3. row ***************************
                        Id: 3
            Relay_log_name: ./db1-relay-bin.000007
             Relay_log_pos: 685505
           Master_log_name: db1-bin.000005
            Master_log_pos: 38233476
 Checkpoint_relay_log_name: ./db1-relay-bin.000007
  Checkpoint_relay_log_pos: 617615
Checkpoint_master_log_name: db1-bin.000005
 Checkpoint_master_log_pos: 38165586
          Checkpoint_seqno: 364
     Checkpoint_group_size: 64
   Checkpoint_group_bitmap: ...
3 rows in set (0.01 sec)

I executed that query after traffic to the master had been stopped. Here are the relevant parts of the corresponding SHOW SLAVE STATUS output:

mysql> SHOW SLAVE STATUSG
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 4001
                Connect_Retry: 60
              Master_Log_File: db1-bin.000005
          Read_Master_Log_Pos: 38233476
               Relay_Log_File: db1-relay-bin.000007
                Relay_Log_Pos: 685505
        Relay_Master_Log_File: db1-bin.000005
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
          Exec_Master_Log_Pos: 38233476
              Relay_Log_Space: 685837
        Seconds_Behind_Master: 0
             Master_Server_Id: 1
                  Master_UUID: f13feb2c-441f-11e4-846c-0800273876d5
             Master_Info_File: mysql.slave_master_info
1 row in set (0.00 sec)

We can see there that Exec_Master_Log_Pos is 38233476, which matches the value of Master_log_pos for worker thread #3 that we got from mysql.slave_worker_info. Worker thread #2, on the other hand, has a value of 38233290, though, even though there’s no traffic coming from the master. Worker thread #1 isn’t doing anything; because I’m only working with only 2 databases, only 2 worker threads can be utilized. So why aren’t both the active worker threads at the same position? Let’s look at the binary log to find out. Here’s a snippet of the binary log starting from the transaction that ends at the position of worker thread #2 (38233290) until the position shown by worker thread #3 and Exec_Master_Log_Pos (38233476):

# at 38233104
#140925 12:26:32 server id 1  end_log_pos 38233175 CRC32 0x2dde38dd     Query   thread_id=11    exec_time=0     error_code=0
SET TIMESTAMP=1411673192/*!*/;
BEGIN
/*!*/;
# at 38233175
#140925 12:26:32 server id 1  end_log_pos 38233219 CRC32 0xf87580f7     Table_map: `db2`.`t1` mapped to number 77
# at 38233219
#140925 12:26:32 server id 1  end_log_pos 38233259 CRC32 0x08c61960     Write_rows: table id 77 flags: STMT_END_F

BINLOG '
aGwkVBMBAAAALAAAAINkRwIAAE0AAAAAAAEAA2RiMgACdDEAAQMAAPeAdfg=
aGwkVB4BAAAAKAAAAKtkRwIAAE0AAAAAAAEAAgAB//4ikQEAYBnGCA==
'/*!*/;
### INSERT INTO `db2`.`t1`
### SET
###   @1=102690 /* INT meta=0 nullable=0 is_null=0 */
# at 38233259
#140925 12:26:32 server id 1  end_log_pos 38233290 CRC32 0xb1314ada     Xid = 205705
COMMIT/*!*/;
# at 38233290
#140925 12:26:32 server id 1  end_log_pos 38233361 CRC32 0xbe4c8c21     Query   thread_id=12    exec_time=0     error_code=0
SET TIMESTAMP=1411673192/*!*/;
BEGIN
/*!*/;
# at 38233361
#140925 12:26:32 server id 1  end_log_pos 38233405 CRC32 0x69f53824     Table_map: `db1`.`t1` mapped to number 76
# at 38233405
#140925 12:26:32 server id 1  end_log_pos 38233445 CRC32 0xe7cc526f     Write_rows: table id 76 flags: STMT_END_F
BINLOG '
aGwkVBMBAAAALAAAAD1lRwIAAEwAAAAAAAEAA2RiMQACdDEAAQMAACQ49Wk=
aGwkVB4BAAAAKAAAAGVlRwIAAEwAAAAAAAEAAgAB//7OkQEAb1LM5w==
'/*!*/;
### INSERT INTO `db1`.`t1`
### SET
###   @1=102862 /* INT meta=0 nullable=0 is_null=0 */
# at 38233445
#140925 12:26:32 server id 1  end_log_pos 38233476 CRC32 0x680a36cd     Xid = 205706
COMMIT/*!*/;
# at 38233476

The first transaction is operating on database db2 and the second is operating on database db1. Remember that parallel slave workers in MySQL 5.6 operate on different databases. That means that slave worker #2 will never see the 2nd transaction shown, because worker thread #2 isn’t operating on the database in that transaction’s table map. Thus, worker thread #2 will simply leap-frog over that transaction when there is more relay log to be applied. The problem here comes when trying to grab a snapshot of the state of the slave. When there is traffic coming to the slave, one of the slave workers might be a couple events behind the position reported by Exec_Master_Log_Pos in SHOW SLAVE STATUS output at the time that innobackupex executes FLUSH TABLES WITH READ LOCK. In that case, relying on Exec_Master_Log_Pos to give you a snapshot of the slave’s position will fail. Here’s a demonstration of that:

mysql> flush tables with read lock;
Query OK, 0 rows affected (0.00 sec)

mysql> select Id, Master_log_name, Master_log_pos from mysql.slave_worker_info;
+----+-----------------+----------------+
| Id | Master_log_name | Master_log_pos |
+----+-----------------+----------------+
|  1 |                 |              0 |
|  2 | db1-bin.000010  |        2207196 |
|  3 | db1-bin.000010  |        2206266 |
+----+-----------------+----------------+
3 rows in set (0.00 sec)

mysql> show slave statusG
*************************** 1. row ***************************
...
        Relay_Master_Log_File: db1-bin.000010
...
          Exec_Master_Log_Pos: 2206638

SHOW SLAVE STATUS output reports Exec_Master_Log_Pos of 2206638, slave worker #2 reports Master_log_pos of 2207196, and slave worker #3 reports Master_log_pos of 2206266. None of them match up! Here’s the binlog snippet from the beginning of the event that ends at the position reported by worker thread #3, the lowest of the three positions we see, 2206266. The snippet goes through the position of worker thread #2, the highest position value, at 2207196. The text for the highest and lowest positions should be red and the text for the middle number reported by SHOW SLAVE STATUS should be blue.

#140930 17:40:19 server id 1 end_log_pos 2206151 CRC32 0x0b522e06 Query thread_id=3 exec_time=0 error_code=0 SET TIMESTAMP=1412124019/*!*/; BEGIN /*!*/; # at 2206151 #140930 17:40:19 server id 1 end_log_pos 2206195 CRC32 0xe2d479fd Table_map: `db2`.`t1` mapped to number 71 # at 2206195 #140930 17:40:19 server id 1 end_log_pos 2206235 CRC32 0xee492a80 Write_rows: table id 71 flags: STMT_END_F BINLOG ' c00rVBMBAAAALAAAAPOpIQAAAEcAAAAAAAEAA2RiMgACdDEAAQMAAP151OI= c00rVB4BAAAAKAAAABuqIQAAAEcAAAAAAAEAAgAB//7rkhIAgCpJ7g== '/*!*/; ### INSERT INTO `db2`.`t1` ### SET ### @1=1217259 /* INT meta=0 nullable=0 is_null=0 */ # at 2206235 #140930 17:40:19 server id 1 end_log_pos 2206266 CRC32 0x3c27dd24 Xid = 11874 COMMIT/*!*/; # at 2206266

#140930 17:40:19 server id 1 end_log_pos 2206337 CRC32 0xe8d590b0 Query thread_id=2 exec_time=0 error_code=0 SET TIMESTAMP=1412124019/*!*/; BEGIN /*!*/; # at 2206337 #140930 17:40:19 server id 1 end_log_pos 2206381 CRC32 0xe6622795 Table_map: `db1`.`t1` mapped to number 70 # at 2206381 #140930 17:40:19 server id 1 end_log_pos 2206421 CRC32 0xd573758f Write_rows: table id 70 flags: STMT_END_F BINLOG ' c00rVBMBAAAALAAAAK2qIQAAAEYAAAAAAAEAA2RiMQACdDEAAQMAAJUnYuY= c00rVB4BAAAAKAAAANWqIQAAAEYAAAAAAAEAAgAB//4rqBIAj3Vz1Q== '/*!*/; ### INSERT INTO `db1`.`t1` ### SET ### @1=1222699 /* INT meta=0 nullable=0 is_null=0 */ # at 2206421 #140930 17:40:19 server id 1 end_log_pos 2206452 CRC32 0xf9dc9029 Xid = 11875 COMMIT/*!*/; # at 2206452

#140930 17:40:19 server id 1 end_log_pos 2206523 CRC32 0x6450488d Query thread_id=2 exec_time=0 error_code=0 SET TIMESTAMP=1412124019/*!*/; BEGIN /*!*/; # at 2206523 #140930 17:40:19 server id 1 end_log_pos 2206567 CRC32 0x22709a74 Table_map: `db1`.`t1` mapped to number 70 # at 2206567 #140930 17:40:19 server id 1 end_log_pos 2206607 CRC32 0xb61b2645 Write_rows: table id 70 flags: STMT_END_F BINLOG ' c00rVBMBAAAALAAAAGerIQAAAEYAAAAAAAEAA2RiMQACdDEAAQMAAHSacCI= c00rVB4BAAAAKAAAAI+rIQAAAEYAAAAAAAEAAgAB//4sqBIARSYbtg== '/*!*/; ### INSERT INTO `db1`.`t1` ### SET ### @1=1222700 /* INT meta=0 nullable=0 is_null=0 */ # at 2206607 #140930 17:40:19 server id 1 end_log_pos 2206638 CRC32 0x1719512d Xid = 11877 COMMIT/*!*/; # at 2206638

#140930 17:40:19 server id 1 end_log_pos 2206709 CRC32 0xdb47b048 Query thread_id=3 exec_time=0 error_code=0 SET TIMESTAMP=1412124019/*!*/; BEGIN /*!*/; # at 2206709 #140930 17:40:19 server id 1 end_log_pos 2206753 CRC32 0x90a42f1d Table_map: `db2`.`t1` mapped to number 71 # at 2206753 #140930 17:40:19 server id 1 end_log_pos 2206793 CRC32 0xe841042c Write_rows: table id 71 flags: STMT_END_F BINLOG ' c00rVBMBAAAALAAAACGsIQAAAEcAAAAAAAEAA2RiMgACdDEAAQMAAB0vpJA= c00rVB4BAAAAKAAAAEmsIQAAAEcAAAAAAAEAAgAB//7skhIALARB6A== '/*!*/; ### INSERT INTO `db2`.`t1` ### SET ### @1=1217260 /* INT meta=0 nullable=0 is_null=0 */ # at 2206793 #140930 17:40:19 server id 1 end_log_pos 2206824 CRC32 0x9cc8044e Xid = 11876 COMMIT/*!*/; # at 2206824 #140930 17:40:19 server id 1 end_log_pos 2206895 CRC32 0x056b4ff2 Query thread_id=3 exec_time=0 error_code=0 SET TIMESTAMP=1412124019/*!*/; BEGIN /*!*/; # at 2206895 #140930 17:40:19 server id 1 end_log_pos 2206939 CRC32 0x5136c994 Table_map: `db2`.`t1` mapped to number 71 # at 2206939 #140930 17:40:19 server id 1 end_log_pos 2206979 CRC32 0xaf308e76 Write_rows: table id 71 flags: STMT_END_F BINLOG ' c00rVBMBAAAALAAAANusIQAAAEcAAAAAAAEAA2RiMgACdDEAAQMAAJTJNlE= c00rVB4BAAAAKAAAAAOtIQAAAEcAAAAAAAEAAgAB//7tkhIAdo4wrw== '/*!*/; ### INSERT INTO `db2`.`t1` ### SET ### @1=1217261 /* INT meta=0 nullable=0 is_null=0 */ # at 2206979 #140930 17:40:19 server id 1 end_log_pos 2207010 CRC32 0x90f5f7dc Xid = 11878 COMMIT/*!*/; # at 2207010

#140930 17:40:19 server id 1 end_log_pos 2207081 CRC32 0xc9894ac6 Query thread_id=2 exec_time=0 error_code=0 SET TIMESTAMP=1412124019/*!*/; BEGIN /*!*/; # at 2207081 #140930 17:40:19 server id 1 end_log_pos 2207125 CRC32 0x1ebe77c6 Table_map: `db1`.`t1` mapped to number 70 # at 2207125 #140930 17:40:19 server id 1 end_log_pos 2207165 CRC32 0x9257429d Write_rows: table id 70 flags: STMT_END_F BINLOG ' c00rVBMBAAAALAAAAJWtIQAAAEYAAAAAAAEAA2RiMQACdDEAAQMAAMZ3vh4= c00rVB4BAAAAKAAAAL2tIQAAAEYAAAAAAAEAAgAB//4tqBIAnUJXkg== '/*!*/; ### INSERT INTO `db1`.`t1` ### SET ### @1=1222701 /* INT meta=0 nullable=0 is_null=0 */ # at 2207165 #140930 17:40:19 server id 1 end_log_pos 2207196 CRC32 0xfed8fe65 Xid = 11879 COMMIT/*!*/; # at 2207196

The MySQL Reference Manual says this about about Exec_Master_Log_Pos when using parallel replication in MySQL 5.6:

When using a multi-threaded slave (by setting slave_parallel_workers to a nonzero value in MySQL 5.6.3 and later), the value in this column actually represents a “low-water” mark, before which no uncommitted transactions remain. Because the current implementation allows execution of transactions on different databases in a different order on the slave than on the master, this is not necessarily the position of the most recently executed transaction.

There are 6 transactions in that snippet. Indeed, all three transactions before position 2206638 have been executed, one in database db2 and two in database db1. But the problem here is that worker thread #2, executing statements against db1, has already progressed beyond this position and has executed another event which ends at position 2207196. If you were to execute a CHANGE MASTER TO statement with MASTER_LOG_POS=2206638, which would be the recommendation of innobackupex when using --slave-info, you’d immediately hit a duplicate key error when the transaction against db1 ending at position #2207196 was executed again. I filed a bug about this: innobackupex –slave-info doesn’t handle slave_parallel_workers>0. In the bug report, I originally proposed a complicated sequence that would stop the slave, set slave_parallel_workers back to 0, and start the slave again. But there’s an easier solution, and that’s to use the --safe-slave-backup option to innobackupex. The --safe-slave-backup causes innobackupex to execute STOP SLAVE SQL_THREAD before getting the slave’s replication position. When you execute STOP SLAVE SQL_THREAD on a slave with slave_parallel_workers>0, the server waits until the furthest-behind slave can catch up before the STOP SLAVE command returns. At that time, it is safe to trust Exec_Master_Log_Pos in SHOW SLAVE STATUS output. Here are the relevant parts of mysql.slave_worker_info and SHOW SLAVE STATUS after executing STOP SLAVE SQL_THREAD on a slave that’s receiving and executing replication events from an active master:

mysql> stop slave;
Query OK, 0 rows affected (2.24 sec)

mysql> select Id, Master_log_name, Master_log_pos from mysql.slave_worker_info;
+----+-----------------+----------------+
| Id | Master_log_name | Master_log_pos |
+----+-----------------+----------------+
|  1 |                 |              0 |
|  2 | db1-bin.000010  |        3534678 |
|  3 | db1-bin.000010  |        3534864 |
+----+-----------------+----------------+
3 rows in set (0.00 sec)

mysql> show slave statusG
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 127.0.0.1
                  Master_User: root
                  Master_Port: 4001
                Connect_Retry: 60
              Master_Log_File: db1-bin.000010
          Read_Master_Log_Pos: 8177610
               Relay_Log_File: db1-relay-bin.000009
                Relay_Log_Pos: 3535025
        Relay_Master_Log_File: db1-bin.000010
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
...
          Exec_Master_Log_Pos: 3534864
...

I challenge readers to verify the safety of this technique on their own using the same techniques I used above to view the status of the slave worker threads, compare them to SHOW SLAVE STATUS output, and correlate those with binary log contents via mysqlbinlog. If you’re using slave_parallel_workers>0 along with innobackupex --slave-info, you should use innobackupex‘s --safe-slave-backup option in order to get a consistent view of the slave’s position with respect to the master. If slave SQL thread downtime is problematic for you, you should consider using a dedicated backup slave. Remember that even --slave-info by itself entails FLUSH TABLES WITH READ LOCK which will result in some amount of downtime for the SQL thread. The additional time to stop the slave SQL thread(s) is unlikely to be considerably more drastic.