Xtrabackup 2.4.8 produces corrupt backups 90% of the time - reports "Completed OK!"

Hi guys

I’m using Percona 5.7.18-16 for my MySQL (Percona) DB server.

I’m using Percona xtrabackup 2.4.8 to make live backups of the running server.

I’ve set up a testing server to verify backups made from the above primary.

The testing server is also running Percona 5.7.18-16.

I’ve found that Percona xtrabackup 2.4.8 almost -never- makes a viable backup, even when it reports “Completed OK!” in the CLI at the end of a backup phase. About 80% of the time, the restored backup on the testing server has corrupt indices, and sometimes the tables themselves are corrupt.

Running “check table tablename” on the verification server crashes it inside InnoDB, with various table corruptions reported for each backup iteration.

The DB is about 400GB in size when live. I’ve verified via sha256sum that the .tar I restore on the testing server is byte-for-byte identical to the one on the source / main server. I use

rsync -avh --progress --no-whole-file --partial user@source_server:/path_to_tar.tar .

to transfer the .tar of the xtrabackup-produced directory to the verification/testing server.
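
For reference, the checksum comparison described above can be sketched roughly as follows. This is only an illustration: the helper names `make_checksum` and `verify_tar` and any paths passed to them are hypothetical placeholders, and it assumes a `sha256sum` that supports `-c` (as in GNU coreutils).

```shell
#!/bin/sh
# Sketch: record a checksum next to the archive on the source host,
# then verify the transferred copy against it on the testing host.
# Helper names and paths are hypothetical placeholders.

# On the source server: write "<hash>  <filename>" into a sidecar file.
make_checksum() {
    tarfile=$1
    ( cd "$(dirname "$tarfile")" &&
      sha256sum "$(basename "$tarfile")" > "$(basename "$tarfile").sha256" )
}

# On the testing server, after copying both files:
# exit status 0 means the archive matches the recorded hash.
verify_tar() {
    tarfile=$1
    ( cd "$(dirname "$tarfile")" &&
      sha256sum -c "$(basename "$tarfile").sha256" > /dev/null )
}
```

A matching checksum only shows that the transfer did not alter the archive; it says nothing about whether the backup inside it is consistent.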

I’m calling it via the innobackupex symlink, e.g.

innobackupex --user=myuser --password=mypass --parallel=4 --rsync ...

and then

innobackupex --apply-log --user=myuser --password=mypass --use-memory=2GB ...

About 92% of the backups produced this way are corrupt, with no errors or failures reported by xtrabackup after the backup is produced.
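
Since the only success signal mentioned here is the “Completed OK!” message, a wrapper script could check both the exit status and the final log line of each step. The sketch below assumes the tool's output has been redirected to a log file; `backup_log_ok` and `backup.log` are hypothetical names, and the match is case-insensitive because the exact capitalisation of the message is an assumption.

```shell
#!/bin/sh
# Sketch: treat a backup or prepare step as successful only if the
# LAST line of its log contains "completed OK!" (case-insensitive).
# The function name and log path are hypothetical.
backup_log_ok() {
    logfile=$1
    # An empty or missing log is never a success.
    [ -s "$logfile" ] || return 1
    tail -n 1 "$logfile" | grep -qi 'completed OK!'
}

# Hypothetical usage (the "..." stands for the arguments elided above):
#   innobackupex ... 2> backup.log; status=$?
#   if [ "$status" -ne 0 ] || ! backup_log_ok backup.log; then
#       echo "backup step failed" >&2
#   fi
```

This confirms only that the tool believes it finished cleanly, which is exactly the situation described above; it does not prove the restored data is intact.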

Any ideas on what I can try to address this? Xtrabackup appears a common and popular tool.

I have run a “check table …” on the live server’s tables, and they are clean. The 5.7 Percona instance on it has been running for about 370 days at this point. It has handled about 25 million inserts and about 30 million updates every 24 hours for the past year, plus hundreds of thousands of selects every 24 hours, with no InnoDB crashes or problems.

But it is proving almost impossible to back up consistently with xtrabackup 2.4.8.

Any pointers? Anybody encounter this before?

Thanks!

Stefan

Hi Stefan, I noticed your discussion on Stack Exchange… I will see if I can get someone to help you out here. Very many sites rely on PXB, so what you are experiencing is unusual, and I think we should try to figure it out.

Hi again, we could really do with seeing any logs that you have: the xtrabackup log from the backup, the log from the prepare phase, and the system-wide logs and dmesg output from both steps. We’d like to help get to the bottom of this in case the backup issue is masking another problem that needs attention.

Ah, one more thing. Could I just check in with you that you are using InnoDB as your storage engine for all tables and not MyISAM? Thanks!
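
In case it helps, one way to check this is to count tables per storage engine in `information_schema.tables`. The sketch below is only a suggestion: the function name `engine_summary_sql` is hypothetical, the user name in the usage comment is the placeholder from the commands above, and the list of excluded system schemas is an assumption.

```shell
#!/bin/sh
# Sketch: print a query that counts user tables per storage engine.
# The function name is hypothetical; the excluded schemas are an
# assumption about which schemas count as "system" schemas.
engine_summary_sql() {
    cat <<'SQL'
SELECT engine, COUNT(*) AS table_count
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema NOT IN
      ('mysql', 'information_schema', 'performance_schema', 'sys')
GROUP BY engine;
SQL
}

# Hypothetical usage with placeholder credentials:
#   engine_summary_sql | mysql --user=myuser -p
```

Any row in the result with an engine other than InnoDB would be worth mentioning in your reply.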