Comments - Replication stops after one hour or so

11 years ago Martin Trenz

SHOW SLAVE STATUS says that SQL and IO Slave-threads are running. Exec_Master_Log_Pos is static (which is bad), but Relay_Log_Space is increasing, as well as Read_Master_Log_Pos, so IO seems to work. Seconds_Behind_Master (SBM) is ticking up one second per second (bad). Slave_IO_State is "Waiting for master to send event".

In the initial state, right after setting up replication and starting the slave, he goes down from like 1700 SBM to zero SBM and the servers are in full sync. And then, after "some amount of time" the slave simply stops executing the changes made to the master. We have large tables (27 million records) but they are well indexed.

With "show full processlist" i see that he is executing "Reading event from the relay log" for the same time as SBM. I also see that the slave is running at 100% (or even more, multicore) CPU. I just checked the binlog AFTER the last executed entry and saw a lot of "at <some number>" with nothing behind (so i guess its a read-query on the master) but then a large block of deletes.

I will setup replication again, wait for it to fail and check that sequence again. Maybe this is the right track...

 
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.