Recent issues with "Waiting for backup lock"

tucj7 · June 27, 2020, 3:14pm

Hi, some help please. Over the last couple of days, we’ve been having issues with our cluster. When a node goes down for some reason and starts coming back over SST, we’ve been getting a status of “Waiting for backup lock” when a “CREATE” or “DROP” table query is executed. Both are executed on InnoDB tables.

The lock is held until the node is completely back up, which can take up to an hour. This renders the whole cluster unavailable during this time. Any ideas what is causing this? I’ve been running this cluster for 4+ years and this is the first time I’m seeing this. Thanks!

vaibhav_upadhyay40 · June 27, 2020, 4:32pm

Hi @tucj7
It would be great if you could share the version details of xtrabackup and the cluster.

tucj7 · June 27, 2020, 4:35pm

Hi,
No problem:

Server version: 5.7.26-29-57 Percona XtraDB Cluster (GPL), Release rel29, Revision 03540a3, WSREP version 31.37, wsrep_31.37
xtrabackup version 2.4.15 based on MySQL server 5.7.19 Linux (x86_64) (revision id: 544842a)

tucj7 · June 28, 2020, 1:38am

This is what’s running during the SST

tucj7 · June 28, 2020, 5:20am

It looks like it could be related to this: https://jira.percona.com/browse/PXC-2365
Can someone confirm that this is the same issue? Is there a resolution to this? Or, at least, a workaround?

vaibhav_upadhyay40 · June 28, 2020, 8:43am

Hi @tucj7
Yes, you are right.
As I understood it, I expected this issue only on older versions, but going by the versions you shared, that is not the case here.
Looking at the Jira ticket above, it appears to be a bug. Maybe someone can recommend an interim solution, if there is one.

tucj7 · July 2, 2020, 7:10am

@lorraine.pocklington is this something you could check out?

DGB · July 14, 2020, 9:12am

Hi
Let me start by saying that the behaviour you are experiencing is expected by design. On 5.7, the xtrabackup command used for SST includes the parameter --lock-ddl (https://github.com/percona/percona-xtradb-cluster/blob/5.7/scripts/wsrep_sst_xtrabackup-v2.sh#L1582), which executes the LOCK TABLES FOR BACKUP command (https://www.percona.com/doc/percona-xtrabackup/LATEST/xtrabackup_bin/xbk_option_reference.html#cmdoption-lock-ddl).
All this is to guarantee consistency of the backup.
The fix is to avoid running DDLs on the cluster during the SST process (or at least at the beginning of it, when the backup lock is taken).
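As a rough illustration of avoiding DDL during SST, a job could read the wsrep_local_state_comment status variable on each node before issuing DDL, and skip the statement unless every node reports Synced (a node serving an SST reports Donor/Desynced). The sketch below is only that, a sketch: it assumes Python with any DB-API 2.0 cursor per node, the function names are hypothetical, and a node that cannot be reached should be treated by the caller as not Synced. The check is best effort, since an SST can still start right after it passes.

```python
from typing import Iterable

SAFE_STATE = "Synced"  # wsrep_local_state_comment value on a fully synced node


def fetch_state(cursor) -> str:
    """Read wsrep_local_state_comment through a DB-API 2.0 cursor (assumed)."""
    cursor.execute("SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'")
    row = cursor.fetchone()
    # The row has the form (Variable_name, Value); no row means no wsrep status.
    return row[1] if row else "Unknown"


def ddl_is_safe(states: Iterable[str]) -> bool:
    """True only if at least one state was given and all of them are Synced."""
    states = list(states)
    return bool(states) and all(s == SAFE_STATE for s in states)


def run_ddl_if_safe(node_cursors, ddl_cursor, statement: str) -> bool:
    """Run the DDL only when no node looks like a donor or joiner.

    Returns True if the statement was executed, False if it was skipped.
    """
    states = [fetch_state(c) for c in node_cursors]
    if not ddl_is_safe(states):
        return False  # hypothetical policy: skip now, let the job retry later
    ddl_cursor.execute(statement)
    return True
```

A cron job could call run_ddl_if_safe with one cursor per cluster node and reschedule itself when it returns False, rather than queueing a CREATE or DROP behind the backup lock.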

tucj7 · July 14, 2020, 9:20am

Thanks for the feedback - that makes sense. The trouble I have is that when this happens it locks up the DB for the entire SST process, which currently runs around 1.5 hours.
We have a bunch of jobs (and client-initiated jobs) that run regularly and create/drop temp tables, and it’s impossible to know when a node will go down and cause an SST. It turns out this was likely happening due to a memory issue on one node.
Do you have any recommendations on how to anticipate an SST and then push a change to crons/etc. to prevent DDL during this process? Or is there a better way to get around this? My constant fear is that this happens at critical times or after hours and causes considerable problems.
