Xtrabackup corrupts incremental backup when node changes.

I have a 3-node PXC 5.6 cluster with Xtrabackup 2.3.10 on CentOS 7.6.

Percona-XtraDB-Cluster-56-5.6.41-28.28.1.el7.x86_64

I use only a single node, Node-1, for both reads and writes.

I used to take backups from another node, Node-2, which is not used for reads or writes.

Yesterday my Node-1 went down and reads/writes switched to Node-2. We then continued taking backups from the remaining unused node, Node-3.

On my backup machine, I decided to extract and prepare the backups. The preparation log shows everything was fine up to and including the backups taken from Node-2. All the incremental backups from Node-3 had errors.

Why does it seem that incremental backups must always be taken from the same node as the full backup? Percona XtraDB Cluster is master-master, so all nodes should be the same. Then why did this happen?

P.S. I also tried, without success, to skip one incremental backup by deleting it and changing the LSN of the next incremental backup so that the remaining incremental backups had sequential LSNs. Xtrabackup hung for more than an hour while preparing the incremental backup with the edited LSN, until I pressed CTRL+C.

Please check the attached log file.

prepare-progress.txt (74.4 KB)

The answer is simple - the data is supposed to be logically identical on all PXC nodes. Physically, however, it is NOT. Galera replication does not sync at the file system level; it syncs row images. InnoDB internal structures are still handled independently on each node, so what we have on disk is never a 1:1 copy.

With that said, yes, incremental backups (Xtrabackup) are always meant to be done against the very same node.
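
To make this concrete, here is a toy Python sketch. It is not how InnoDB or Xtrabackup work internally: the "tablespace" is just a list of small pages, and the helper names (to_pages, page_delta, apply_delta and so on) are made up for illustration. The delta here is computed against the node's own earlier state, whereas real Xtrabackup selects changed pages by LSN, but the mismatch it shows is of the same kind: a page-level delta from one node applied on top of a base copied from another node.

```python
# Toy model only: a "tablespace" is a list of pages, each page a tuple of rows.
# Rows are (id, value) pairs. None of this mirrors the real InnoDB page format.

PAGE_SIZE = 2  # rows per page, an arbitrary number for illustration


def to_pages(rows):
    """Pack rows into pages in the order they arrived on this node."""
    return [tuple(rows[i:i + PAGE_SIZE]) for i in range(0, len(rows), PAGE_SIZE)]


def logical_rows(pages):
    """Logical content: all rows, regardless of which page they live in."""
    return sorted(row for page in pages for row in page)


def update(pages, row_id, value):
    """Change one row in place, keeping the page layout untouched."""
    return [tuple((i, value) if i == row_id else (i, v) for i, v in page)
            for page in pages]


def page_delta(old, new):
    """A toy 'incremental backup': only the changed pages, keyed by page number."""
    return {n: page for n, page in enumerate(new)
            if n >= len(old) or old[n] != page}


def apply_delta(base, delta):
    """A toy 'prepare': overwrite or append the changed pages on a base copy."""
    pages = list(base)
    for n, page in sorted(delta.items()):
        if n < len(pages):
            pages[n] = page
        else:
            pages.append(page)
    return pages


# Same four rows on both nodes, but stored in a different physical order.
node2_full = to_pages([(1, "a"), (2, "a"), (3, "a"), (4, "a")])
node3_old = to_pages([(3, "a"), (4, "a"), (1, "a"), (2, "a")])

# The same logical change (row 1 becomes "b") happens on both nodes.
node2_new = update(node2_full, 1, "b")
node3_new = update(node3_old, 1, "b")

# Incremental from the same node as the full backup vs. from another node.
same_node = apply_delta(node2_full, page_delta(node2_full, node2_new))
other_node = apply_delta(node2_full, page_delta(node3_old, node3_new))

print("logically equal before:", logical_rows(node2_full) == logical_rows(node3_old))
print("physically equal before:", node2_full == node3_old)
print("same-node restore:", logical_rows(same_node))
print("cross-node restore:", logical_rows(other_node))
```

In the toy, the cross-node restore ends up with duplicated and missing rows because the changed page landed at a different page number on the other node.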

przemek

If I understood you correctly, are you suggesting that I should not take backups from an unused node, and should only take full or incremental backups from the node that is used for reads and writes?
I am a bit confused here.

Thanks

No, this is not what I meant.
Let me explain it with a simple example. You have a table named t1, which has 100 rows, and all PXC nodes have these identical 100 rows; there is no data difference. But that does not necessarily mean the file t1.ibd is identical on all nodes. Each server may have its own, unique, binary representation of the data on disk.
Even if the .ibd file is indeed identical, there are very likely some binary differences in the shared tablespace - ibdata1. But that is perfectly fine. It does not mean one node is better or worse, and if the data is logically the same on all nodes, you may take a backup from any of them and it will be valid.

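So, when taking binary backups (and Xtrabackup does that), you need to be consistent. If you took the initial full backup from node1, you cannot take a later incremental from node2; it also has to come from node1. After switching to a different node, start a new chain there with a fresh full backup.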
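
One way to catch a mixed chain before preparing it is sketched below in Python, with some assumptions. It reads the xtrabackup_checkpoints file that Xtrabackup writes into each backup directory (the from_lsn and to_lsn lines). It also reads a source_node file, which is purely hypothetical: Xtrabackup does not create it, so your backup script would have to write the node's hostname there itself. The function names are made up too. Note that matching LSNs alone may not reveal a node switch, and a hand-edited LSN would pass the numeric check even though the pages no longer match, which is why the sketch compares the node name as well.

```python
from pathlib import Path


def read_checkpoints(backup_dir):
    """Parse the 'key = value' lines of xtrabackup_checkpoints into a dict."""
    info = {}
    text = (Path(backup_dir) / "xtrabackup_checkpoints").read_text()
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip()
    return info


def read_source_node(backup_dir):
    # ASSUMPTION: the backup script stored the node's hostname in a file
    # called 'source_node'. Xtrabackup itself does not write this file.
    return (Path(backup_dir) / "source_node").read_text().strip()


def check_chain(backup_dirs):
    """Check a chain given as [full, inc1, inc2, ...]; return found problems."""
    problems = []
    base_node = read_source_node(backup_dirs[0])
    prev_to_lsn = int(read_checkpoints(backup_dirs[0])["to_lsn"])
    for d in backup_dirs[1:]:
        cp = read_checkpoints(d)
        # Each incremental should start where the previous backup ended.
        if int(cp["from_lsn"]) != prev_to_lsn:
            problems.append(
                f"{d}: from_lsn {cp['from_lsn']} != previous to_lsn {prev_to_lsn}")
        # Every backup in the chain should come from the same node.
        node = read_source_node(d)
        if node != base_node:
            problems.append(
                f"{d}: taken from {node}, but the full backup came from {base_node}")
        prev_to_lsn = int(cp["to_lsn"])
    return problems
```

An empty list from check_chain means only that these two checks passed; it is not proof that the prepare step will succeed.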