XtraBackup 2.4 Full Backup Taking >24 Hours on 10TB MySQL 5.7 Galera Cluster over NFS (HDD Storage)

Hello,

I am running a production MySQL 5.7 (Galera Cluster, 3 nodes) environment and facing extremely long backup times with XtraBackup (innobackupex 2.4.29). I would like to validate whether this behavior is expected given our architecture or if there are recommended optimizations.

Environment Details

  • MySQL Version: 5.7.44 (Galera)

  • XtraBackup Version: 2.4.29

  • Cluster Size: 3 nodes (wsrep_cluster_status = Primary)

  • Engine: InnoDB (file_per_table=1)

  • Datadir: /DB/mysql

  • Database Size: ~9.7 TB

  • OS: Linux (VMware Virtual Machines)

  • CPU: 24 vCPU

  • RAM: 78 GB (InnoDB Buffer Pool: 40 GB)

Storage Architecture

DB Node :

  • Datadir: Local LVM volume (/DB/mysql)

  • Disk Type: HDD (ROTA=1, virtual disks)

  • Size: ~11 TB

Backup Target:

  • Mounted via NFS v4.2 to /BACKUP

  • NFS Export: /DBBACKUP *(rw,sync,no_root_squash,insecure)

  • Filesystem: XFS on LVM (22 TB total, ~12 TB used)

  • Underlying disks: All HDD (multiple virtual disks, ROTA=1)

  • Network: 10 Gbps

Important: DB node and backup server are both virtual machines and may be on the same underlying storage pool (VMware datastore).

Current Backup Method

Backup is executed from one Galera node using:

innobackupex --user=root --password=XXX --no-timestamp /BACKUP/DBBackup/
innobackupex --apply-log --use-memory=4G /BACKUP/DBBackup/

Backups are written directly to the NFS mount.

Observed Behavior

  • Full backup duration: 24–30+ hours

  • Database is highly active (write-heavy Galera cluster)

  • NFS export is configured with sync

  • Disks on both DB and backup server are HDD (not SSD/NVMe)

Cluster status during backup:

  • wsrep_cluster_status = Primary

  • wsrep_local_state_comment = Synced

  • Flow control mostly OFF

  • One node handles heavy traffic (~280 connections), backup is taken from a lower-traffic node (~10–20 connections)

Constraints / Requirements

  • Production environment (cannot stop writes)

  • Must maintain Galera consistency

  • Prefer not to use aggressive settings that may cause replication lag

  • Need to ensure at least one valid full backup is always available (no destructive rotation)

  • Backup disk is remote NFS storage

Questions

  1. Is >24 hour backup time expected for ~10 TB dataset over NFS (sync) on HDD storage?

  2. Would switching NFS export from sync to async be considered safe/recommended for XtraBackup workloads?

  3. Is it better practice to:

    • Backup to local disk first, then rsync to NFS?

    • Or write directly to NFS in large environments?

  4. Are the following options recommended for large HDD + NFS setups?

    • --parallel

    • --rsync

    • --throttle

    • --compress / --stream

  5. For Galera specifically, is there any official guidance on:

    • Best node selection for backups

    • Avoiding cluster performance impact during 10TB+ full backups

Additional Concern

Because backups take longer than 24 hours, cron jobs can overlap and risk deleting previous valid backups before a new one is fully completed. We are redesigning the rotation logic to be non-destructive and atomic.
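For context, the rotation shape we are testing looks like this sketch (directory names and retention count are simplified placeholders, not our real layout; the actual innobackupex call replaces the comment). `flock` prevents overlapping cron runs, and the atomic `mv` guarantees a finished backup is never visible half-written:

```shell
# rotate_backup DIR — promote the freshest backup atomically and prune,
# keeping the two newest full_* copies. Overlapping cron runs are
# rejected via a non-blocking flock.
rotate_backup() {
    root="$1"
    exec 9>"$root/.rotate.lock"
    flock -n 9 || { echo "previous backup still running" >&2; return 1; }

    stamp=$(date +%F_%H%M%S)
    work="$root/.incoming_$stamp"          # hidden while in progress
    mkdir -p "$work"

    # innobackupex --no-timestamp ... "$work"   # real backup goes here

    # mv within one filesystem is atomic: a full_* directory is either
    # absent or complete, never partial; older copies stay untouched.
    mv "$work" "$root/full_$stamp"

    # Prune only AFTER the new backup exists; keep the two newest.
    ls -1d "$root"/full_* | sort | head -n -2 |
        while read -r old; do rm -rf "$old"; done
}
```

The key ordering is prune-after-promote: even if the backup itself fails, the previous valid copies are never deleted.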

Any best practices for very large (8–12 TB) Galera clusters using XtraBackup over NFS would be greatly appreciated.

By the way, they are currently on the same storage; I will move the backup target to separate storage in about a month.

Thank you.


Hello @aycelen,

production MySQL 5.7 (Galera Cluster, 3 nodes)

5.7 has been dead for many years. 8.0 will EOL in less than 60 days. You are missing out on many, many security fixes, performance improvement patches, etc.

XtraBackup (innobackupex 2.4.29)

This has also been dead for many years. Again, you’re missing out on new features in Xtrabackup 8.4 that improve backup processes.

both virtual machines and may be on the same underlying storage pool

That will hurt performance of everything. Not only are the PXC VMs fighting for disk IO, now your backup is also fighting for the same IO. And with HDDs, the IO is extremely low, and slow. Are you monitoring HDD performance anywhere?

  1. Is >24 hour backup time expected for ~10 TB dataset over NFS (sync) on HDD storage?

10 TB = 10,485,760 MB. Let’s say your 7200 RPM HDD writes at 120 MB/s, assuming absolutely nothing else is using that disk: 10,485,760 / 120 = 87,381 s ≈ 24.3 hrs. Now add all the VM overhead, network overhead, application traffic, etc., and you’ll see that a >24 hr backup time is absolutely expected.

Backup to local disk first, then rsync to NFS?

If you have the local disk space for this, this would be faster, but not by much considering the volume of data you have.

Any best practices for very large (8–12 TB) Galera clusters using XtraBackup over NFS would be greatly appreciated.

  • Stop using NFS. Instead, stream the backup directly to the storage server. This bypasses all the NFS network and disk overhead. Docs: Take a streaming backup - Percona XtraBackup. Look at the example “Send the backup to another server using netcat”.
  • --parallel, always. Use 4 or 8, depending on how much CPU you have available.
  • --compress, always. Modern versions of PXB use zstd, which is much better/faster than the older zlib.
  • With datasets > 2 TB, we recommend one of the following approaches to backups:
    1. Use snapshots. Disk/VM snapshots are cheap on disk (since they only store deltas) and should complete quickly.
    2. Switch to an incremental backup style. Example: every Sunday, take a full backup. On Monday, take an incremental using Sunday as the base; it only backs up what changed since Sunday. On Tuesday, take another incremental using Monday as the base, and so on. If only a few GB change per day, the incremental backups will go blazingly fast.
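The netcat streaming approach mentioned above can be sketched as follows (hostname, port, and paths are placeholders; see the linked docs for the canonical example):

```shell
# On the backup server: listen on a port and write the raw stream
# straight to local disk
nc -l -p 9999 > /data/backups/full_$(date +%F).xbstream

# On a Galera node: compress in parallel and stream over plain TCP,
# bypassing the NFS layer entirely
innobackupex --user=root --password=XXX \
  --parallel=4 --compress --compress-threads=4 \
  --stream=xbstream /tmp | nc -w 2 backup-server 9999
```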

Hi @aycelen,

As @matthewb described, your 24+ hour times are expected given the HDD/NFS path. I benchmarked PS 5.7.44 and XtraBackup 2.4.29 on a ~2 GB dataset (20 InnoDB tables, innodb_file_per_table=ON) to quantify his recommendations. The relative speedups scale to your 10 TB setup:

| Flags | Time | Size | vs Baseline |
| --- | --- | --- | --- |
| (default, no flags) | 25 s | 2583 MB | 1.0x |
| `--parallel=4` | 10 s | 2583 MB | 2.5x |
| `--compress --compress-threads=4` | 20 s | 63 MB | 1.2x |
| Combined + `--stream=xbstream` | 7 s | 62 MB | 3.5x |
| Incremental (after 1% data change) | 18 s | 42 MB | — |

The 97% compression ratio here is inflated by synthetic data. Expect 60-80% on production InnoDB data, which still means writing 2-4 TB instead of 10 TB over your NFS link. Since HDD write throughput is the bottleneck, that alone could cut your backup time in half.

The biggest win is streaming directly to the backup server, bypassing NFS entirely as matthewb recommended:

innobackupex --user=root --password=XXX \
  --parallel=4 --compress --compress-threads=4 \
  --stream=xbstream --galera-info /tmp | \
  ssh backup-server "cat > /BACKUP/$(date +%F).xbstream"

The --galera-info flag captures the GTID position, which you will need for cluster-aware restores. For decompression and prepare on the backup server, see the streaming backup docs.
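For reference, the extract-and-prepare steps on the backup server look roughly like this (paths and filename are illustrative; `--decompress` requires the qpress binary on PATH):

```shell
# Unpack the xbstream archive into a working directory
mkdir -p /BACKUP/restore && \
  xbstream -x -C /BACKUP/restore < /BACKUP/full.xbstream

# Decompress the .qp files produced by --compress
innobackupex --decompress /BACKUP/restore

# Prepare (apply the redo log) so the backup is restorable
innobackupex --apply-log --use-memory=4G /BACKUP/restore
```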

For the incremental strategy matthewb described: weekly full + daily incrementals. With 1% daily change on 10 TB, each incremental writes roughly 100 GB instead of 10 TB and finishes in minutes rather than hours.
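A minimal weekly-full / daily-incremental cycle with innobackupex 2.4 might look like this (directory names are illustrative):

```shell
# Sunday: full backup that acts as the base
innobackupex --user=root --password=XXX --no-timestamp /BACKUP/full

# Monday: incremental containing only pages changed since the full
innobackupex --user=root --password=XXX --no-timestamp \
  --incremental --incremental-basedir=/BACKUP/full /BACKUP/inc1

# Tuesday: incremental based on Monday's backup
innobackupex --user=root --password=XXX --no-timestamp \
  --incremental --incremental-basedir=/BACKUP/inc1 /BACKUP/inc2

# Restore: prepare the full with --redo-only, roll in each incremental
# in order (the last one without --redo-only), then a final prepare
innobackupex --apply-log --redo-only /BACKUP/full
innobackupex --apply-log --redo-only /BACKUP/full --incremental-dir=/BACKUP/inc1
innobackupex --apply-log /BACKUP/full --incremental-dir=/BACKUP/inc2
innobackupex --apply-log /BACKUP/full
```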

If you must keep using NFS temporarily, check your mount options. A small rsize/wsize (32 KB on older setups) adds severe per-request overhead on large sequential writes; raise both to 1 MB. Also note that the sync in your export is enforced server-side, so change it to async in /etc/exports (and re-export) in addition to the client mount. For a backup target this durability trade-off is usually acceptable: if the backup server crashes mid-write, you simply retake the backup.

mount -t nfs4 -o rsize=1048576,wsize=1048576,async,noatime backup-server:/DBBACKUP /BACKUP

Your node selection (low-traffic node) is correct. If you see wsrep_flow_control_paused rising during backups, --throttle can cap XtraBackup’s read I/O rate at the cost of longer backup times.
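If you do reach for it, --throttle caps the number of read/write I/O operation pairs XtraBackup issues per second; the value below is a starting guess to tune against your flow-control metrics, not a recommendation:

```shell
# Cap XtraBackup at ~40 I/O operation pairs per second on the source node
innobackupex --user=root --password=XXX --throttle=40 \
  --no-timestamp /BACKUP/DBBackup/
```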