We use xtrabackup 2.4.9 to back up a 4-node PXC 5.7.14 cluster.
All write operations, including pt-online-schema-change, are isolated to a single node (node 1).
Xtrabackup runs on a different node (node 4).
The backup is scheduled by crontab, and pt-online-schema-change is triggered by our ops system.
Following https://www.percona.com/blog/2017/08…na-xtrabackup/ we run xtrabackup with --lock-ddl so that the backup completes successfully.
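To make the setup concrete, the backup on node 4 is invoked roughly as in the sketch below. The schedule, script name, and target directory are placeholders for illustration, not our exact values. Only the --lock-ddl option is the point here.

```shell
# Illustrative sketch only: the schedule, script path, and target directory
# below are placeholders (assumptions), not our real values.

# crontab entry on node 4 (hypothetical schedule and script path):
# 0 2 * * * /usr/local/bin/pxc_backup.sh

# Body of the hypothetical pxc_backup.sh:
TARGET_DIR="/data/backup/$(date +%F)"   # placeholder path

# --lock-ddl takes LOCK TABLES FOR BACKUP at the start of the backup,
# so DDL is blocked on this node until the backup finishes.
xtrabackup --backup --lock-ddl --target-dir="$TARGET_DIR"
```

Because the DDL lock is held for the whole run, any DDL replicated from node 1 has to wait on node 4 until xtrabackup exits.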
Yesterday we ran into a serious problem.
Node 4 was running xtrabackup. A successful backup takes nearly 40 minutes (more than 1 TB of data).
Meanwhile, the ops system ran pt-online-schema-change on node 1, and it completed successfully there.
On node 4, however, xtrabackup was holding the backup lock, so the replicated DROP TABLE could not run. Node 4 then sent flow control messages, and the whole cluster was close to going down.
Once the backup finished, the cluster recovered.
Is there a way to run xtrabackup and pt-online-schema-change together without this conflict?
Sorry for my English, I’m not a native speaker.