Mariadb Galera Cluster | Table Update Issue

We created a 3 node galera cluster in our system and then moved our existing DB/tables from existing single node standalone mariadb. Then when we started our application on this 3 node cluster then it runs fine for update of few records but after that our update gets struck indefinitely (lock), then we are unable to do any write or update to the table.. We are constantly facing this issue we tried multiple times to rebuild our cluster again but same issues happens again and again. Please help in providing some direction for this issue. In this particular case, I am (application) just running update command on records one after the another. Total records it is updating is around 250 in one go (one after the another). Total records in table are 0.4 million. This particular table does not have any primary key. Update query example which is used inside this java application - ``` "update dost_master_prod set a2d_api_request = ?, a2d_api_response = ?, a2d_api_status = ?, a2d_api_description = ?, d2r_api_request = ?, d2r_api_response = ?, d2r_api_status = ?, d2r_api_description = ?, updated_date_time = ?, update_description = ? where initiator_agent = ? and report_date = ?;" ``` Connecting to cluster through application this way - jdbc:mariadb:sequential.

Error logs - ``` SHOW ENGINE INNODB STATUS ------------ TRANSACTIONS ------------ Trx id counter 11316748 Purge done for trx's n:o < 11316747 undo n:o < 0 state: running History list length 29 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 421357846471208, not started 0 lock struct(s), heap size 1128, 0 row lock(s) ---TRANSACTION 421357846466944, not started 0 lock struct(s), heap size 1128, 0 row lock(s) ---TRANSACTION 11316747, ACTIVE 1785 sec mysql tables in use 1, locked 1 51495 lock struct(s), heap size 5726328, 3641448 row lock(s), undo log entries 1 MySQL thread id 9627, OS thread handle 139882031507200, query id 5693745 injiffyan01.india.airtel.itm 10.92.228.55 root Commit update dost_master_prod set a2d_api_request = '{ \"amount\" : \"3823.75\", \"currency\" : \"XXX\", \"subType\" : \"commission_otf\", \"type\" : \"adjustment\", \"requestDate\" : \"2021-05-13T11:15:44.011+05:30\", \"debitParty\" : [ { \"key\" : \"accountid\", \"value\" : \"994\" } ], \"creditParty\" : [ { \"key\" : \"accountid\", \"value\" : \"09944890425\" } ], \"metadata\" : [ { \"key\" : \"slab_next\", \"value\" : \"Slab for next OTF payout is 61-80 GA\" }, { \"key\" : \"slab_achieved\", \"value\" : \"41-60 GA\" }, { \"key\" : \"de ---TRANSACTION 421357846475472, not started 0 lock struct(s), heap size 1128, 0 row lock(s) ---TRANSACTION 421357846462680, not started 0 lock struct(s), heap size 1128, 0 row lock(s) -------- ```

Question 1) Does this issue is because we don't have primary key in our table ? Question 2) What all are limitations of Galera cluster ? Is it feasible to move more than 10 high impacting applications from single node mariadb to 3 node galera cluster ? If yes, what precautions one needs to take ? Question 3) Why does whole cluster gets struck/hang/locked due to one table (on one DB) issue ? Question 4) Which mode should we use sequential, replication or loadbalanced ?

Answer Answered by Daniel Black in this comment.

Q1)

its recommended that innodb tables always have a primary key, especially with galera. I can't say it will definately fix this problem you are encountering, however it will help a lot of operations be able to be quicker in galera.

Q2a)

Galera limitations are in https://mariadb.com/kb/en/mariadb-galera-cluster-known-limitations/

Q2b)

More information as the the "high impact" is needed to assess this. Provided the known limitations are addressed a migration should be possible. Like all major change test in a development environment.

Q3)

" 3641448 row lock(s)" is likely a symptom of no primary key.

Q4)

sequential and loadbalance are the only options that should be used - https://mariadb.com/kb/en/about-mariadb-connector-j/#failover-and-load-balancing-modes . loadbalance will fail badly until you have a primary key as galera is opportunistic locking. If (initiator,report_date) isn't the primary key I'd stick to sequential.

Comments

Comments loading...
Content reproduced on this site is the property of its respective owners, and this content is not reviewed in advance by MariaDB. The views, information and opinions expressed by this content do not necessarily represent those of MariaDB or any other party.