Using pt-table-checksum with MariaDB Galera

A rapidly increasing number of large, high traffic sites are using MariaDB Galera for their database needs. This makes sense since it works so well. Administrators will connect asynchronous slaves to Galera for reporting, back-ups and disaster recovery.

MariaDB Galera is a very dependable system, involving the coupling of MariaDB with Galera. Nevertheless, Galera administrators need to check regularly the consistency of their slaves. An excellent tool for doing this is the pt-table-checksum. However, when you try to run it on Galera, depending on the version you’re using, you may encounter some results similar to the following:

root@lab-galera01:~# pt-table-checksum localhost  
--create-replicate-table --no-check-replication-filters 

12-12T11:00:37 Failed to /*!50108 SET @@binlog_format := 'STATEMENT'*/: 
DBD::mysql::db do failed: 
Variable 'binlog_format' can't be set to the value of 'STATEMENT' 
[for Statement "/*!50108 SET @@binlog_format := 'STATEMENT'*/"] 
at /usr/bin/pt-table-checksum line 9148.

The pt-table-checksum tool requires the binlog_format variable to be set to a value of STATEMENT. This is because the developers of Galera at Codership added a hook to disable any replication mode other than ROW. Incidentally, this problem was resolved starting with version 5.5.42 and version 10.0.16 of MariaDB.

Looking at the results shown above, the binlog_format variable was set to ROW and that an error occurred while it attempted to change that variable. The problem is that if the server is running on MySQL version 5.1.29 or higher, setting binlog_format requires the SUPER privilege. So you will need manually to set the binlog_format variable to STATEMENT before running pt-table-checksum.

Having to set manually that variable before running pt-table-checksum may be inconvenient. If you’re incorporating its use within a script, this will require the script to use a user account with the SUPER privilege. That might be a security vulnerability.

As an alternative, you can disable Galera replication on a per-session basis. If you do this, though, you will need to be very careful with what you are doing. Be sure to resume replication on the server for which you’re checking. Otherwise, you may shutdown replication and cause other problems.

Below is an example of how this might be done, using pt-table-checksum with the –set-vars option. Note, in this example the Galera node which is replicating to the asynchronous slaves is named, lab-galera01.

root@lab-galera01:~# mysql -e "CREATE DATABASE IF NOT EXISTS mariadb_monitoring" 

root@lab-galera01:~# pt-table-checksum localhost --set-vars="wsrep_on=off" 
                     --create-replicate-table --no-check-replication-filters

Diffs cannot be detected because no slaves were found. 

Please read the --recursion-method documentation for information. 

mysql.column_stats 12-12T11:00:53 0 0 0 1 0 0.002 
mysql.columns_priv 12-12T11:00:53 0 0 0 1 0 0.001 
mysql.db 12-12T11:00:53 0 0 0 1 0 0.008 
mysql.event 12-12T11:00:53 0 0 0 1 0 0.001 
mysql.func 12-12T11:00:53 0 0 0 1 0 0.002 
mysql.gtid_slave_pos 12-12T11:00:53 0 0 39 1 0 0.001 
mysql.help_category 12-12T11:00:53 0 0 464 1 0 0.015 
mysql.help_keyword 12-12T11:00:53 0 0 1028 1 0 0.005 
mysql.help_relation 12-12T11:00:53 0 0 508 1 0 0.009 
mysql.help_topic (...)

After executing this, we should then connect to the slave and verify consistency. We would do that like so:

root@lab-slave01:~# mysql -e "SELECT * FROM mariadb_monitoring.checksums
                              WHERE this_crc <> master_crc"

Warning: The table has not been replicated on the galera cluster.

As you can see, that worked just fine. This allows you to work around the problem. However, you may encounter problems later if you drop the table without disabling replication first. Therefore, the best solution is to upgrade to the latest version of MariaDB.