Database sometimes locks during backup

Hi,

I created an account here because we are having some problems with the database locking itself when we are doing our backups. This doesn’t happen that frequently but since it takes 15-20 minutes to restart the database I’d like to find what is causing this problem and solve it.

It all started early this year and then I found out that DDL statements were not good while taking the backups and we had a cronjob running that ran a TRUNCATE and sometime that cronjob timed the backup and it locked itself. After that I tried to find any occurances of DDL statements in any script so it wouldn’t happen again. And it was fine for months.

Then now during summer it started to happen again. I have told others working with the database (two others) not to do any such statements and have them check their code as well and it should be fine.

Except, we do run a lot of CREATE TEMPORARY TABLE and DROP TEMPORARY TABLE.

So, my first question is if those are safe to run during backup?

Also, what would be the best way to find out what locked the database? Just turning on logging to general log file?

I read that -lock-ddl since 8.0.13(?) is on by default and I haven’t touched that so it should be on. Is there anything else to think about concerning this that would be beneficial to us taking these backups.

Note: the version I’m running of xtrabackup is 8.0.32.

Best regards,

Henrik

Hi Henrik,

Can you please paste the xtrabackup command line you use? Also, when the database is locked, can you also get the output of “SHOW PROCESSLIST” and “SHOW ENGINE INNODB STATUS” queries?

How big is your database? Remember that when xtrabackup takes BACKUP LOCK, all DDLS will wait until the backup is over. DMLs are not affected.

Hi Satya,

thanks for the reply!

I found a tutorial with scripts on Digital Ocean for how to setup xtrabackup but what that script mostly does is to create folders for each day (and keep the 3 last day’s backups). The arguments for xtrabackup is set before the command but it should be:

xtrabackup --defaults-dir=${defaults_file} --backup --extra-lsndir=${todays_dir} --compress --stream=xbstream --encrypt=AES256 – encrypt-key-file={$encryption_key_file} --parallel=${processors} --compress-threads=${processors} --encrypt-threads=${processors} --target-dir= and so on.

(wrote it by hand, sorry if there were any typo)

I’m leaving out the directories as they aren’t important. The backup works most of the time so there is nothing wrong with the locations or encryption.

And no sorry, I can’t give any SHOW PROCESSLIST or SHOW ENGINE INNODB STATUS as I restarted the database when this happened this morning and it has worked since. It’s just that I know this will happen again and if I’m not at the computer it might take a long time to get it restrated.

But, when I tried to access the database through sqlYOG nothing happens, like everything is locked. I can’t even run SHOW PROCESSLIST I think. I did try this another time it locked up but will try again when this happens again if this thread is still ongoing. Also will try it directly on the server instead of through a program like sqlYOG.

All of the databases on the server is around 50 GB.

But when this happens, nothing works… no SELECTS, INSERT, UPDATES. The web pages goes down because they are waiting on a response that never comes. It’s like the whole database locks. Or I’m mistaken about the SHOW PROCESSLIST not working and just a single table is locked, but it doesn’t feel like that.

Edit: All tables in the database(s) are InnoDB except the system tables. Also, when checking, views aren’t reported to have an engine so are those treated differently when taking backups? I read that when the backup is taken non-InnoDB tables are locked?

Just an update. This happened once again now and first I could check a table from sqlYOG albeit slow. But as soon as I tried SHOW PROCESSLIST it got stuck at executing query.

I’m restaring the database now so can’t check much else as it is crucial it is back up ASAP. But any suggestions is helpful for the next time this happens.

I want to solve this, but suggestions of how to locate what happened is welcome.

It is weird to see that SELECTs and DMLs get blocked. Is it possible to redirect xtrabackup command output to log file and share the contents here? Also, please check the server’s error log.

Please share the xtrabackup version and the server version.

Are you connecting via UI? Can you try to connect using MySQL client command line tool?

I did get the content from one table I ‘clicked on’ in sqlYOG so SELECTs might work at first but as soon as I tried SHOW PROCESSLIST it at least hang that program since it didn’t give a response from the DB.

The MYSQL error log didn’t have anything in it except these ’ [SERVER] No suitable ‘keyring_component_metadata_query’ service implementation found to fulfill the request’ messages that is written in it at the time the backup process starts. I tried to search for this message last week but it seems this is from a bug in the version I am using. It is also written even if the backup is successful.

There is a log from xtrabackup now that I think of it. But the one from this incident has been overwritten by the backup taken the hour after (which went well).

The versions are:

xtrabackup 8.0.32-25-1
percona 8.0.31-23-1

I will try to connect to the DB next time through the CLI instead of sqlYOG. But as I said, I need to get the DB back online asap when this happens so I need some sort of checklist I can go through fast.

I forgot… there is a plan to get the servers updated end of next week to:

xtrabackup 8.0.34-29-1
percona 8.0.33-25-1