Decompress and copy in the same step

Hello,

Is there a way to copy and decompress a backup in the same step?
By default, --decompress does not delete the old files. This is good, because I want the original backup to remain at the old location.

In other words, I would like to be able to decompress from a folder on an NFS share into the (planned) data directory of the target restore instance.

 /usr/bin/xtrabackup --defaults-file=/root/.my.cnf --backup --target-dir /mnt/db_backup/instance_name/full-2024-11-04_15-24-12 --user=bgbackup --parallel=4 --compress --compress-threads=4 --no-lock

The documentation somewhat suggests that --datadir can be used for this in the decompress step, but when I tried it, it did not work.

Is there another way to do this? Perhaps by streaming the backup that was already created and decompressing it on the receiving side?

Thank you,
Michaël

Hello @michaeldg,
You cannot copy and decompress at the same time with xtrabackup. You could probably do something with scripting and iterating through the files.

Yes, you can do this. Example, on the destination host:

socat - TCP-LISTEN:3306 | xbstream -vx -C /path/to/datadir

Then you can run xtrabackup --decompress --remove-original, followed by xtrabackup --prepare, and so on.
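Put together, a rough end-to-end sketch for an already-taken compressed backup could look like this (desthost, the port, and the paths are placeholders; re-packing the existing files with xbstream -c is just one way to produce the stream):

On the destination host, listen and unpack the incoming stream into the future datadir:

socat - TCP-LISTEN:3306 | xbstream -vx -C /path/to/datadir

On the source host, re-pack the existing backup directory and send it over:

cd /mnt/db_backup/instance_name/full-2024-11-04_15-24-12
find . -type f | xargs xbstream -c | socat - TCP:desthost:3306

Then, still on the destination host:

xtrabackup --decompress --remove-original --parallel=4 --target-dir=/path/to/datadir
xtrabackup --prepare --target-dir=/path/to/datadir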

Thank you for the quick reply. I figured out how to do this:

backuppath="/mnt/db_backup/full-2024-11-06_04-07-02"; cd "$backuppath"; find . -type f | xargs xbstream -c | xbstream -vx --decompress -C /var/lib/mysql1/

For a 73 GB compressed backup, this takes 13m29s to run. With 16 decompress threads it is only about 10 seconds faster.
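For reference, the 16-thread variant just adds a decompress-threads option to the same pipeline (assuming an xbstream version that supports --decompress-threads):

find . -type f | xargs xbstream -c | xbstream -vx --decompress --decompress-threads=16 -C /var/lib/mysql1/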

Copying first and then running xtrabackup --decompress --parallel=16 takes 5m34s (single-threaded copy) + 7m13s = 12m47s, so doing it in two steps is actually faster.
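Roughly, those two steps were (the copy command shown here is just illustrative):

cp -r /mnt/db_backup/full-2024-11-06_04-07-02/. /var/lib/mysql1/
xtrabackup --decompress --parallel=16 --target-dir=/var/lib/mysql1/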

I tried to make it parallel:

[root@server full-2024-11-06_04-07-02]# cat /root/stream-copy-decompress.sh
#!/bin/bash

# Re-stream one source file ($1) and extract/decompress it into the target directory ($2)
xbstream -c "$1" | xbstream -vx --decompress -C "$2"

And the script:

find . -type f | xargs -P 16 -n 1 -I {} /root/stream-copy-decompress.sh {} /var/lib/mysql1/ryout_restore

It finishes in 10m9s, so this is a bit faster than the single-stream version.
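A slightly more defensive variant of the same xargs call, using null-delimited file names in case any path ever contains whitespace (untested sketch):

find . -type f -print0 | xargs -0 -P 16 -n 1 -I {} /root/stream-copy-decompress.sh {} /var/lib/mysql1/ryout_restore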

I do get a bunch of errors from the parallel-running stream processes:
xbstream: Can't create/write to file './xtrabackup_punch_hole' (OS errno 17 - File exists)

Where is this error coming from? Does it do any harm?

Do you see any possible improvements? I find the result not very satisfying, and I have the feeling that if xtrabackup were optimized to do this, it would be a lot faster.