Timeouts on "Starting to lock all tables..."

We are trying to use innobackupex to back up our MySQL 5.1 instance, which is used by our Zarafa email server. The zarafa database is all InnoDB and I’m using the --rsync option. Roughly 95% of the time, the incremental backup times out. Additionally, the Zarafa server logs show it’s having difficulty accessing the database at that time. I’ve tried shutting down httpd to eliminate the push traffic from ActiveSync phones, but we still have the problem. The backup runs at 2 a.m.

Here’s a snippet of the failed part of the backup:

130619 02:33:49 innobackupex: Finished a prep copy of non-InnoDB tables and files

130619 02:33:49 innobackupex: Starting to lock all tables…
innobackupex: Error: Connection to mysql child process (pid=9774) timedout. (Time limit of 900 seconds exceeded. You may adjust time limit by editing the value of parameter “$mysql_response_timeout” in this script.) while waiting for reply to MySQL request: ‘FLUSH TABLES WITH READ LOCK;’ at /usr/bin/innobackupex line 386.
Starting httpd: [ OK ]

I’m not really inclined to increase the timeout above the 15-minute default. I’m not sure I can use --no-lock, because of its restriction that no DDL may run during the backup. I’m pretty sure there is some small chance of ALTER statements being issued.

Here’s my incremental command line. The OS is Oracle Linux 6.4.

innobackupex --rsync --no-timestamp --incremental --user=xxx --password=xxx /mailbackup/db/$dayno --incremental-basedir=/mailbackup/db/$prevd

Any advice?

If the backup is failing on FLUSH TABLES WITH READ LOCK, make sure there is no long-running query or transaction on the server when you run the backup. Otherwise, if it cannot acquire the read lock on ANY table, it will time out.
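One way to check for this just before the backup starts is to filter the process list by run time. The following is a minimal sketch, not a tested tool: `long_running` is a hypothetical helper name, the 60-second threshold is arbitrary, and it assumes the standard SHOW FULL PROCESSLIST column order (Id, User, Host, db, Command, Time, State, Info) in tab-separated batch output.

```shell
#!/bin/sh
# Hypothetical helper: print processlist rows that have been running for at
# least the given number of seconds.
# Input: tab-separated SHOW FULL PROCESSLIST rows on stdin, columns assumed to be
#        Id, User, Host, db, Command, Time, State, Info.
# "Sleep" and "Binlog Dump" rows are skipped on the assumption that idle
# connections and replication dump threads are not what is holding up the lock.
long_running() {
    threshold="$1"
    awk -F '\t' -v t="$threshold" '
        $5 != "Sleep" && $5 != "Binlog Dump" && $6 + 0 >= t + 0 {
            printf "id=%s user=%s time=%ss query=%s\n", $1, $2, $6, $8
        }'
}

# Example use before starting innobackupex (credentials are placeholders):
#   mysql --batch --skip-column-names --user=xxx --password=xxx \
#         -e 'SHOW FULL PROCESSLIST' | long_running 60
```

If this prints anything shortly before 2 a.m., those statements are the first candidates to look at.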

Another option would be to set up a slave database to perform the backup on. If locking is still an issue on the slave, you could simply stop replication, take the backup, and then restart replication after the backup is successful.
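The stop/backup/restart sequence on a slave could look roughly like the sketch below. It is an illustration under assumptions: `backup_on_slave`, `MYSQL` and `BACKUP_CMD` are hypothetical names, both commands default to `echo` so the script only prints what it would run, and stopping just the SQL thread (rather than the whole slave) is one possible choice.

```shell
#!/bin/sh
# Sketch only: pause the slave SQL thread, run the backup, then resume replication.
# MYSQL and BACKUP_CMD are placeholders; by default they only echo the commands
# (a dry run). Point them at the real mysql client and innobackupex to use this.
MYSQL="${MYSQL:-echo mysql}"
BACKUP_CMD="${BACKUP_CMD:-echo innobackupex}"

backup_on_slave() {
    # Stop applying replicated changes so nothing writes during the backup.
    $MYSQL -e 'STOP SLAVE SQL_THREAD' || return 1
    # Run the backup with whatever arguments were passed in.
    $BACKUP_CMD "$@"
    status=$?
    # Resume replication whether or not the backup succeeded.
    $MYSQL -e 'START SLAVE SQL_THREAD' || return 1
    return $status
}

# Example (dry run, prints the three commands):
#   backup_on_slave --rsync --no-timestamp /mailbackup/db/full
```

innobackupex also has a --safe-slave-backup option that appears to cover similar ground; check the documentation for your version before relying on it.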

As for troubleshooting your existing setup, try running “show engine innodb status \G” while your backup is running to see what is happening with the locking.

Hey @deisenlord, you say you would prefer not to change the timeout, but are you aware it’s at session scope, not global?

Some good ideas here, thanks. I’ve also been grepping through my binary logs looking for DDL and haven’t found any yet, so maybe --no-lock is a possibility.
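For anyone wanting to repeat the binary log check, a rough sketch of such a filter is below. `find_ddl` is a hypothetical helper name and the keyword list is an assumption about what counts as DDL here; it only matches statements whose first line begins with one of those keywords, so it may miss unusual formatting.

```shell
#!/bin/sh
# Hypothetical filter: print lines of decoded binary log text that begin with
# a DDL keyword (case-insensitive). The keyword list is an assumption.
find_ddl() {
    grep -iE '^[[:space:]]*(ALTER|CREATE|DROP|RENAME|TRUNCATE)[[:space:]]'
}

# Example (the binlog path is a placeholder):
#   mysqlbinlog /var/lib/mysql/mysql-bin.000123 | find_ddl
```

No output over a representative set of logs suggests DDL is rare, though it does not prove that none will run during a future backup window.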