Can't optimize database in cluster?

I'm using a cluster of 3 MariaDB 10.1.10 servers on Ubuntu. The servers are located in different datacenters and have a network latency of 6 ms.

Previously I found that I couldn't do a database dump or optimize, because the job would hang and took the cluster down with it. This issue was caused by flow control, it has been fixed by increasing the gcs.fc_limit.

I currently have the problem that I can't run a optimize job for databases. For example: mysqlcheck --auto-repair --optimize --databases joomla The optimize job runs fine on the server I execute the job on and finishes in about 10 seconds. Let's call the server that executes this job the "master server". According to syslog the job runs on the other servers as well, I see entries like:

Jan 23 16:01:32 Netherlands mysqld[1064]: 2016-01-23 16:01:32 140116180166400 [Note] InnoDB: Online DDL : Start reading clustered index of the table and create temporary files Jan 23 16:01:32 Netherlands mysqld[1064]: 2016-01-23 16:01:32 140116180166400 [Note] InnoDB: Online DDL : End of reading clustered index of the table and create temporary files Jan 23 16:01:32 Netherlands mysqld[1064]: 2016-01-23 16:01:32 140116180166400 [Note] InnoDB: Online DDL : Start merge-sorting index `PRIMARY` (1 / 3), estimated cost : 10.0000 Jan 23 16:01:32 Netherlands mysqld[1064]: 2016-01-23 16:01:32 140116180166400 [Note] InnoDB: Online DDL : End of merge-sorting index `PRIMARY` (1 / 3) Jan 23 16:01:32 Netherlands mysqld[1064]: 2016-01-23 16:01:32 140116180166400 [Note] InnoDB: Online DDL : Start building index `PRIMARY` (1 / 3), estimated cost : 15.0000 Jan 23 16:01:32 Netherlands mysqld[1064]: 2016-01-23 16:01:32 140116180166400 [Note] InnoDB: Online DDL : End of building index `PRIMARY` (1 / 3) Jan 23 16:01:32 Netherlands mysqld[1064]: 2016-01-23 16:01:32 140116180166400 [Note] InnoDB: Online DDL : Completed

These other servers go down silently, they stop responding to query's from the users/applications any more but there are no error messages in the syslog. This doesn't happen on the "master" server, which continues to work like nothing happened.

I've found that this issue doesn't happen when I run a optimize job for one single table. I believe the issue is that the operation for a entire database is to resource intensive or takes to long to finish.

I use the following wsrep_provider_options = gcache.size = 32G;evs.keepalive_period = PT3S;evs.suspect_timeout = PT30S;evs.inactive_timeout = PT1M;evs.install_timeout = PT1M;evs.send_window = 512;evs.user_send_window = 512;gcs.fc_limit=2048 ;gcs.fc_master_slave=YES; gcs.fc_factor=1.0

Answer Answered by Geoff Montee in this comment.

I would recommend setting wsrep_OSU_method to "RSU" before executing the command, so the OPTIMIZE TABLE command does not get replicated to the other nodes. That would help prevent all nodes from being overloaded at the same time. If you want the table to be optimized on all nodes, just execute OPTIMIZE TABLE on each node, one node at a time.

http://galeracluster.com/documentation-webpages/schemaupgrades.html

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.