Giving up on xtrabackup

geek_prophet · February 17, 2022, 5:26am

After days of failing to track down the reason for “you may have a corrupt database” and “lsn is in the future” errors, I am throwing in the towel. This is a big disappointment as I have looked forward to switching to xtrabackup for years.

Instead, I am going to use the following strategy:

1. flush tables with read lock
2. flush logs
3. take LVM snapshot of the whole MySQL directory structure
4. release the table lock
5. rsync the snapshot to a separate server.
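
For readers who want to script the sequence above, here is a minimal sketch rather than a tested production tool. It assumes a DB-API cursor from a MySQL driver and that the script runs with enough privileges for LVM and mount. The volume path, snapshot name and size, mount point, and rsync destination are all hypothetical placeholders. The point it illustrates is that the read lock belongs to the session that took it, so the snapshot has to be created while that same connection is still open, and the lock should be released even if the snapshot fails.

```python
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> None:
    """Run a command and raise if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def snapshot_backup(
    cursor,
    run: Callable[[Sequence[str]], None] = _run,
    origin_lv: str = "/dev/vg0/mysql",          # hypothetical logical volume
    snap_name: str = "mysql_snap",              # hypothetical snapshot name
    snap_size: str = "5G",                      # hypothetical copy-on-write space
    mount_point: str = "/mnt/mysql_snap",       # hypothetical, must already exist
    dest: str = "backup-host:/backups/mysql/",  # hypothetical rsync target
) -> None:
    """Lock, flush logs, snapshot, unlock, then copy the snapshot away."""
    snap_dev = origin_lv.rsplit("/", 1)[0] + "/" + snap_name

    # The global read lock is held only while this session stays open.
    cursor.execute("FLUSH TABLES WITH READ LOCK")
    try:
        cursor.execute("FLUSH LOGS")
        run(["lvcreate", "--snapshot", "--size", snap_size,
             "--name", snap_name, origin_lv])
    finally:
        # Release the lock even if the snapshot could not be created.
        cursor.execute("UNLOCK TABLES")

    # The slow part runs after the lock is gone. Mount options are
    # filesystem dependent (XFS, for example, usually also needs nouuid).
    run(["mount", "-o", "ro", snap_dev, mount_point])
    try:
        run(["rsync", "-a", mount_point + "/", dest])
    finally:
        run(["umount", mount_point])
        run(["lvremove", "-f", snap_dev])
```

Passing a different `run` callable (one that only prints each command, for instance) gives a dry run without touching LVM.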

This is fast, easy, results in almost zero downtime (the flush and snapshot complete in about 1 second), and produces a 100% perfect copy of the source MySQL instance.

Anybody see any issues with this approach?

Marcelo_Altmann · February 17, 2022, 11:33am

Hi @geek_prophet .
I’m sorry to hear you gave up on xtrabackup. If you still want to, you can raise a bug at jira.percona.com with a reproducible test case in order for us to investigate it.
Just to set expectations (I see you have posted 3 forum questions on the same subject in a day or two): we do try to be as active and engaged here as we can, but please note that we do not offer any SLA for community bugs. So regarding xtrabackup, I would advise you to go down the bug report road.

Regarding LVM, please make sure to test it under significant load, similar to what you will face in production. For example:
If a user opens a transaction, makes some changes to tableA, and then either runs another operation that takes a long time to complete or simply never commits or rolls back the transaction, your FTWRL (step 1) will lock some of the tables and wait for the ones that cannot be locked. This will cause downtime.
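
One way to reduce that risk is to check for long-running statements before issuing FTWRL and to postpone the backup if any are found. The sketch below is illustrative only. It assumes a DB-API cursor from a driver that uses `%s` placeholders (as PyMySQL and mysqlclient do), and the 30-second threshold is an arbitrary example value. It reads `information_schema.PROCESSLIST`, so it only sees statements that are currently executing; an idle session holding an open transaction would need a separate check.

```python
# Threads that are actively running a statement for longer than a threshold.
# INFO is NULL for idle and background threads, so those are skipped.
LONG_RUNNING_SQL = (
    "SELECT ID, USER, TIME, INFO "
    "FROM information_schema.PROCESSLIST "
    "WHERE COMMAND <> 'Sleep' AND INFO IS NOT NULL AND TIME > %s "
    "ORDER BY TIME DESC"
)


def long_running_statements(cursor, max_seconds: int = 30) -> list:
    """Return (id, user, seconds, statement) rows for slow running statements."""
    cursor.execute(LONG_RUNNING_SQL, (max_seconds,))
    return list(cursor.fetchall())


def safe_to_lock(cursor, max_seconds: int = 30) -> bool:
    """True when no statement has been running longer than max_seconds."""
    return not long_running_statements(cursor, max_seconds)
```

A backup script could call `safe_to_lock(cursor)` first and retry later, or alert, when it returns `False`.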

There is a good blog post on this subject that still applies, and I recommend reading it: Using LVM for MySQL Backup and Replication Setup - Percona Database Performance Blog

geek_prophet · February 17, 2022, 6:27pm

Hi Marcelo,

We’d rather not give up on xtrabackup but we’re under time pressure. I’ve been messing with it for a week and can’t get it running reliably, which is why I have been bombarding the forum with questions. I don’t know if the problems are user error (typically true) or if there is something amiss in the code (seems unlikely) but I’m running out of time to figure it out. Like I said earlier, we can run a backup multiple times without a problem, and then suddenly it will start throwing “lsn is in the future” and “you may have a corrupt database” errors.

Regarding LVM, we’ve been using the method I described for years. Backups only kick off at night when user activity is minimal, but there are rare occasions when table locks may cause issues. That’s the main reason why we were eager to switch to xtrabackup. Thanks for the link.

geek_prophet · February 17, 2022, 6:32pm

Quick clarification. The link states…

Connect to MySQL and run FLUSH TABLES WITH READ LOCK
Note – this command may take a while to complete if you have long running queries. The catch here is FLUSH TABLES WITH READ LOCK actually waits for all statements to complete, even selects. So be careful if you have any long running queries. If you’re using only Innodb tables and do not need to synchronize binary log position with backup you can skip this step.

Just to be clear, is this saying that if all tables are on InnoDB, and there is no slave in production, then it is not necessary to FTWRL?
