MaxScale redirects writes in slave
Hello,
I'm new to MaxScale. I have a working read-write splits setup with one master and one replica, which worked for months without any issue.
Today I was running some queries to compare the slave with the master and I realized that the slave has more rows than the master. I've refreshed the slave from master to make sure slave's data is refreshed and identical with the master. After a few hours, I've ended up in the same sittuation, slave had more data than the master. I've modified the slave's configuration and set gtid_strict_mode to ON and gtid_ignore_duplicates to OFF. A few minutes later the slave stopped its replication thread and complained about duplicate GTIDs or GTID sequence out of order. After turning the slave's general_log ON, I've found write queries (UPDATE) being sent from MaxScale to the slave instead of the master.
In which cases MaxScale might miss-direct a write query to a slave instead of the master?
- relevant SQL log:
304 Connect [email protected] on radius using TCP/IP
304 Query UPDATE radacct SET acctupdatetime = (@acctupdatetime_old:=acctupdatetime), acctupdatetime = FROM_UNIXTIME(1659609569), acctinterval = 1659609569 - UNIX_TIMESTAMP(@acctupdatetime_old), framedipaddress = '10.10.40.16', acctsessiontime = 517608, acctinputoctets = '0' << 32 | '225486073', acctoutputoctets = '0' << 32 | '566662648' WHERE AcctUniqueId = '06f4e5716e55c04126ec86fdcce458a6'
304 Query UPDATE radacct SET acctupdatetime = (@acctupdatetime_old:=acctupdatetime), acctupdatetime = FROM_UNIXTIME(1659609569), acctinterval = 1659609569 - UNIX_TIMESTAMP(@acctupdatetime_old), framedipaddress = '10.10.40.11', acctsessiontime = 25981, acctinputoctets = '0' << 32 | '22814945', acctoutputoctets = '0' << 32 | '324705193' WHERE AcctUniqueId = '64bb8e55485cb66186daecec6dbe52cf'
- /var/log/mysqld/mysql.log 2022-08-04 10:39:29 6 [ERROR] Slave SQL: An attempt was made to binlog GTID 0-2003-5320035 which would create an out-of-order sequence number with existing GTID 0-2002-5320035, and gtid strict mode is enabled, Gtid 0-2003-5320035, Internal MariaDB error code: 1950 2022-08-04 10:39:29 6 [Warning] Slave: An attempt was made to binlog GTID 0-2003-5320035 which would create an out-of-order sequence number with existing GTID 0-2002-5320035, and gtid strict mode is enabled Error_code: 1950 2022-08-04 10:39:29 6 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mariadb-2-3-bin.000020' position 20299053; GTID position '0-2003-5320034' 2022-08-04 10:39:29 6 [Note] Slave SQL thread exiting, replication stopped in log 'mariadb-2-3-bin.000020' at position 20299053; GTID position '0-2003-5320034', master: adm.stl:3306
- binlog:
- 220804 10:46:48 server id 2002 end_log_pos 315 CRC32 0x8d41d30e Gtid list [0-2003-5320034,
- 0-2002-5320036]
- at 315
- 220804 10:46:48 server id 2002 end_log_pos 364 CRC32 0x9ab687c9 Binlog checkpoint mariadb-2-2-bin.000008
- at 364
- 220804 10:39:29 server id 2003 end_log_pos 406 CRC32 0xb9280fad GTID 0-2003-5320035 trans /*!100101 SET @@session.skip_parallel_replication=0**!*/; /*!100001 SET @@session.gtid_domain_id=0**!*/; /*!100001 SET @@session.server_id=2003**!*/; /*!100001 SET @@session.gtid_seq_no=5320035**!*/; START TRANSACTION /*!*/;
- at 406
- 220804 10:39:29 server id 2003 end_log_pos 872 CRC32 0xedb462c0 Query thread_id=52079 exec_time=440 error_code=0 xid=0 use `radius`/*!*/; SET TIMESTAMP=1659609569/*!*/; SET @@session.pseudo_thread_id=52079/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1, @@session.check_constraint_checks=1, @@session.sql_if_exists=0/*!*/; SET @@session.sql_mode=1411383296/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C latin1 **!*/; SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=45/*!*/; SET @@session.time_zone='SYSTEM'/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; UPDATE radacct SET acctupdatetime = (@acctupdatetime_old:=acctupdatetime), acctupdatetime = FROM_UNIXTIME(1659609569), acctinterval = 1659609569 - UNIX_TIMESTAMP(@acctupdatetime_old), framedipaddress = '10.10.40.16', acctsessiontime = 517608, acctinputoctets = '0' << 32 | '225486073', acctoutputoctets = '0' << 32 | '566662648' WHERE AcctUniqueId = '06f4e5716e55c04126ec86fdcce458a6' /*!*/;
- at 872
- 220804 10:39:29 server id 2003 end_log_pos 903 CRC32 0xcc4d6049 Xid = 2 COMMIT/*!*/;
- at 903
- 220804 10:39:29 server id 2003 end_log_pos 945 CRC32 0x5453fa9d GTID 0-2003-5320036 trans /*!100001 SET @@session.gtid_seq_no=5320036**!*/; START TRANSACTION /*!*/;
- at 945
- 220804 10:39:29 server id 2003 end_log_pos 1019 CRC32 0x4f9e770f User_var SET @`acctupdatetime_old`:=_latin1 X'323032322D30382D30342031303A32393A3238' COLLATE `latin1_swedish_ci`/*!*/;
- at 1019
- 220804 10:39:29 server id 2003 end_log_pos 1483 CRC32 0x22080a13 Query thread_id=52079 exec_time=440 error_code=0 xid=0 SET TIMESTAMP=1659609569/*!*/; UPDATE radacct SET acctupdatetime = (@acctupdatetime_old:=acctupdatetime), acctupdatetime = FROM_UNIXTIME(1659609569), acctinterval = 1659609569 - UNIX_TIMESTAMP(@acctupdatetime_old), framedipaddress = '10.10.40.11', acctsessiontime = 25981, acctinputoctets = '0' << 32 | '22814945', acctoutputoctets = '0' << 32
'324705193' WHERE AcctUniqueId = '64bb8e55485cb66186daecec6dbe52cf' |
/*!*/;
- at 1483
- 220804 10:39:29 server id 2003 end_log_pos 1514 CRC32 0x91e04961 Xid = 4 COMMIT/*!*/;
Thank you very much for any responses.
George
Answer Answered by Markus Mäkelä in this comment.
I opened MXS-4269 for this bug.