Comments - Introduction to State Snapshot Transfers (SSTs)
I have a 2-node cluster running MariaDB 10.4.12 and Galera 4 v26.4.3. Normally the nodes run smoothly as a cluster. After rebooting both servers, I started node 1 with the /usr/bin/galera_new_cluster command and node 2 with 'systemctl start mariadb'. However, node 2 could not join the cluster, and the following error messages appeared in /var/log/messages:
May 20 15:54:18 uodbdb2 mysqld: 2020-05-20 15:54:18 1 [Note] WSREP: GCache history reset: old(8a371dc2-6762-11ea-b859-274830d9bb21:1261 -> 8a371dc2-6762-11ea-b859-274830d9bb21:1324
May 20 15:54:18 uodbdb2 mysqld: 2020-05-20 15:54:18 1 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): full reset
May 20 15:54:19 uodbdb2 mysqld: 2020-05-20 15:54:19 0 [Warning] WSREP: 1.0 (JOAL_node1): State transfer to 0.0 (JOAL_node2) failed: -255 (Unknown error 255)
May 20 15:54:19 uodbdb2 mysqld: 2020-05-20 15:54:19 0 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():1178: Will never receive state. Need to abort.
May 20 15:54:19 uodbdb2 mysqld: 2020-05-20 15:54:19 0 [Note] WSREP: gcomm: terminating thread
May 20 15:54:19 uodbdb2 mysqld: 2020-05-20 15:54:19 0 [Note] WSREP: gcomm: joining thread
May 20 15:54:19 uodbdb2 mysqld: 2020-05-20 15:54:19 0 [Note] WSREP: gcomm: closing backend
May 20 15:54:20 uodbdb2 mysqld: 2020-05-20 15:54:20 0 [Note] WSREP: view(view_id(NON_PRIM,1b90d2ca,62) memb {
May 20 15:54:20 uodbdb2 mysqld: 1b90d2ca,0
May 20 15:54:20 uodbdb2 mysqld: } joined {
May 20 15:54:20 uodbdb2 mysqld: } left {
May 20 15:54:20 uodbdb2 mysqld: } partitioned {
May 20 15:54:20 uodbdb2 mysqld: 834dc803,0
May 20 15:54:20 uodbdb2 mysqld: })
May 20 15:54:20 uodbdb2 mysqld: 2020-05-20 15:54:20 0 [Note] WSREP: PC protocol downgrade 1 -> 0
May 20 15:54:20 uodbdb2 mysqld: 2020-05-20 15:54:20 0 [Note] WSREP: view((empty))
May 20 15:54:20 uodbdb2 mysqld: 2020-05-20 15:54:20 0 [Note] WSREP: gcomm: closed
May 20 15:54:20 uodbdb2 mysqld: 2020-05-20 15:54:20 0 [Note] WSREP: /usr/sbin/mysqld: Terminated.
May 20 15:54:20 uodbdb2 systemd: mariadb.service: main process exited, code=killed, status=6/ABRT
May 20 15:54:20 uodbdb2 mysqld: Terminated
May 20 15:54:20 uodbdb2 mysqld: WSREP_SST: [INFO] Joiner cleanup. rsync PID: 10392 (20200520 15:54:20.647)
May 20 15:54:21 uodbdb2 rsyncd[10392]: sent 0 bytes received 0 bytes total size 0
May 20 15:54:21 uodbdb2 mysqld: WSREP_SST: [INFO] Joiner cleanup done. (20200520 15:54:21.155)
What's wrong?
Hi,
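This is an important clue:
May 20 15:54:20 uodbdb2 systemd: mariadb.service: main process exited, code=killed, status=6/ABRT
It appears that the mysqld process was killed by systemd. To check for sure, look at what systemd logged for the mariadb service around that time. This usually happens because your SST timed out, in which case you most likely have to increase the systemd timeout for the service.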
See here: https://mariadb.com/kb/en/systemd/#configuring-the-systemd-service-timeout
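As a rough sketch of what raising the timeout could look like: the helper below writes a systemd drop-in that sets TimeoutStartSec for the service. The function name write_timeout_dropin, the 15min default, and the assumption that the unit is called mariadb.service are illustrative and not taken from this thread; the file name timeoutsec.conf follows the one mentioned in the reply below.

```shell
#!/bin/sh
# Sketch only: write a systemd drop-in that raises the start timeout.
# Assumptions (not from the thread): the unit is mariadb.service, and
# 15min is an arbitrary example value. Pick a value that covers your SST.

write_timeout_dropin() {
    # $1: drop-in directory (defaults to the usual location for mariadb.service)
    # $2: timeout value in systemd time syntax, e.g. 15min
    dropin_dir="${1:-/etc/systemd/system/mariadb.service.d}"
    timeout="${2:-15min}"

    mkdir -p "$dropin_dir" || return 1

    # A drop-in only needs the section header and the setting it overrides.
    printf '[Service]\nTimeoutStartSec=%s\n' "$timeout" \
        > "$dropin_dir/timeoutsec.conf"
}
```

Run as root, calling write_timeout_dropin with no arguments writes the file under /etc/systemd/system/mariadb.service.d; systemd only picks it up after `systemctl daemon-reload`. `systemctl show mariadb -p TimeoutStartUSec` should then report the new value, and `journalctl -u mariadb` shows what systemd logged for the unit when the process exited.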
Thank you for your advice. I did not change any cluster configuration or timeoutsec.conf. I found that there were SELinux issues and fixed them, after which node 2 joined the cluster successfully.
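For anyone hitting the same symptom, one possible way to spot SELinux denials that affect an SST is to filter the audit log for AVC denials from the processes involved. This is only a sketch: the function name sst_avc_denials, the default log path /var/log/audit/audit.log, and the list of process names are assumptions, and the thread does not say which denials were actually involved.

```shell
#!/bin/sh
# Sketch only: list SELinux AVC denials from processes that take part in an
# rsync-based SST. The process names below are assumptions for illustration.

sst_avc_denials() {
    # $1: audit log to scan (defaults to the usual auditd location)
    log="${1:-/var/log/audit/audit.log}"

    # Keep AVC denial records, then only those issued for the server or rsync.
    grep -E 'avc: +denied' "$log" \
        | grep -E 'comm="(mysqld|mariadbd|rsync)"'
}
```

If this prints anything with a timestamp close to the failed join, SELinux is a likely suspect. `getenforce` reports whether SELinux is currently enforcing.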