Optimise Xbcloud upload to GCS while streaming from xtrabackup

Xtrabackup version: 2.4.26
MySQL version: 5.7

Command used:

xtrabackup --backup --stream=xbstream --extra-lsndir=/tmp/xtrabackup-lsndir --target-dir=/tmp/xtrabackup --read-buffer-size=128M --parallel=10 | xbcloud put --storage=google --google-endpoint='storage.googleapis.com' --google-access-key='XXXXX' --google-secret-key='XXXXXX' --google-bucket='XXXXBucket' --google-storage-class='STANDARD' --google-region='asia-south1' --parallel=10 $(date "+%F-%H-%M-%S")-backup-full

For 53GB of data, this full backup takes nearly 30 minutes on an 8-core VM with 40GB of RAM.

An xtrabackup backup without xbstream completes within 3-4 minutes. Increasing --parallel does not help much in reducing the overall time.

What optimisation parameters can be used to improve the overall backup process?

Hi @Rachit_Saxena ,

An xtrabackup backup without xbstream completes within 3-4 minutes. Increasing --parallel does not help much in reducing the overall time.

Here you are comparing local writes vs. WAN uploads (to GCP). You might be looking at this from the wrong angle.

First, I would ask what the link speed is between your machine and GCP.
When talking about xbcloud to object storage, the link speed from source to destination is normally the bottleneck. Since you already mentioned that increasing parallelism does not help, we are not talking about a disk/CPU bottleneck.

You can test it by creating a 53GB local file and using the Google Cloud client to attempt to upload it.
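
As a back-of-the-envelope check: 53GB in ~30 minutes is roughly 30MB/s, i.e. about 240Mbps. A minimal sketch of such a test, assuming gsutil is installed and using a placeholder bucket name:

# create a 53GB zero-filled test file
fallocate -l 53G /tmp/upload-test.bin
# time a single upload to GCS (replace the bucket name with your own)
time gsutil cp /tmp/upload-test.bin gs://XXXXBucket/upload-test.bin
rm /tmp/upload-test.bin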

Hi @Marcelo_Altmann

I checked with sar -d while running the same command, and found that the system is mostly idle, with disk ops visible only while the files are being uploaded, i.e. during log events like the one below.

230807 20:24:18 xbcloud: successfully uploaded chunk: 2023-08-07-20-19-58-backup-full/sbtest/sbtest9.ibd.00000000000000000003, size: 134217780

Mostly, the backup process spends time on 'log scanned up to':

230807 20:24:17 >> log scanned up to (102778278298)
230807 20:24:18 >> log scanned up to (102778278298)
230807 20:24:18 xbcloud: successfully uploaded chunk: 2023-08-07-20-19-58-backup-full/sbtest/sbtest9.ibd.00000000000000000003, size: 134217780
230807 20:24:19 >> log scanned up to (102778278298)
230807 20:24:20 >> log scanned up to (102778278298)
230807 20:24:20 xbcloud: successfully uploaded chunk: 2023-08-07-20-19-58-backup-full/sbtest/sbtest5.ibd.00000000000000000004, size: 134217780
230807 20:24:21 >> log scanned up to (102778278298)

Just to add a note about the data spread: there are 10 tables, 5.3GB each.

We tried an internal backup tool (Python 3.9, using the GCS libraries, with parallel processes instead of threads), which transfers the files within 13 minutes.

Please refer to FIFO data sink - Percona XtraBackup, more specifically to https://docs.percona.com/percona-xtrabackup/8.0/_static/backup-streamed-to-object-storage.png

Mostly, the backup process spends time on 'log scanned up to'

This is xtrabackup waiting for the STDOUT pipe to become free, meaning xbcloud has its event queue full and is waiting for event handler slots to open up.
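
If you want to observe this from the outside, one option (assuming the pv utility is available) is to place pv between xtrabackup and xbcloud and watch the transfer rate (credentials and bucket options elided):

xtrabackup --backup --stream=xbstream --parallel=10 --target-dir=/tmp/xtrabackup | pv -trab | xbcloud put --storage=google ... --parallel=10 backup-name

If the rate reported by pv stalls while xtrabackup keeps printing 'log scanned up to', the pipe is blocked on the xbcloud side.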

We tried an internal backup tool (Python 3.9, using the GCS libraries, with parallel processes instead of threads), which transfers the files within 13 minutes.

Can you detail how many parallel requests you are doing in this Python script?

From your original command, I see you are using --parallel=10 on xtrabackup, which means you can stream up to 10 files in parallel (each file is split into 128M chunks via --read-buffer-size=128M). On the other side of the pipe, there are up to 10 parallel requests to GCP, 128MB each.
I would suggest you try to:
1 - leave the read size at the default of 10M
2 - increase the --parallel on xbcloud to a multiple of --parallel from xtrabackup; for example, if you want to upload 5 chunks of each file in parallel and xtrabackup has parallel=10, you should use xbcloud --parallel=50 (see the command sketched below)
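
Putting both suggestions together, the pipeline would look something like this (GCS credentials and bucket options elided, exactly as in your original command):

xtrabackup --backup --stream=xbstream --extra-lsndir=/tmp/xtrabackup-lsndir --target-dir=/tmp/xtrabackup --parallel=10 | xbcloud put --storage=google ... --parallel=50 $(date "+%F-%H-%M-%S")-backup-full

Dropping --read-buffer-size=128M lets the read size fall back to the 10M default.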

Hi @Marcelo_Altmann

Can you detail how many parallel requests you are doing in this Python script?

5 parallel requests for chunking
10 parallel requests for uploading

Sure, I will get back after trying it out with the 10M read size and parallelism (10, 50).

Please refer to FIFO data sink - Percona XtraBackup, more specifically to https://docs.percona.com/percona-xtrabackup/8.0/_static/backup-streamed-to-object-storage.png

Is percona-xtrabackup 8.0 compatible with MySQL 5.7?

EDIT: I tried it; it is not supported.

2023-08-08T01:16:03.834200+05:30 0 [Note] [MY-011825] [Xtrabackup] Connecting to MySQL server host: localhost, user: not set, password: not set, port: not set, socket: not set
2023-08-08T01:16:03.837431+05:30 0 [ERROR] [MY-011825] [Xtrabackup] Unsupported server version: '5.7.39-42-log'
2023-08-08T01:16:03.837465+05:30 0 [ERROR] [MY-011825] [Xtrabackup] This version of Percona XtraBackup can only perform backups and restores against MySQL 8.0 and Percona Server 8.0
2023-08-08T01:16:03.837480+05:30 0 [Note] [MY-011825] [Xtrabackup] Please use Percona XtraBackup 2.4 for this database.

I would suggest you try to:
1 - leave the read size at the default of 10M
2 - increase the --parallel on xbcloud to a multiple of --parallel from xtrabackup; for example, if you want to upload 5 chunks of each file in parallel and xtrabackup has parallel=10, you should use xbcloud --parallel=50

This configuration took 29 minutes, which is similar to the original command's timeline. Any thoughts?

Is percona-xtrabackup 8.0 compatible with MySQL 5.7?

No, it's not compatible; however, 2.4 and 8.0 are identical in terms of xtrabackup and xbcloud streaming to STDOUT.

This configuration took 29 minutes, which is similar to the original command's timeline. Any thoughts?

I still don’t think there is an issue with xbcloud here. A single thread on xbcloud is capable of streaming 1.8Gbps.

I tried pushing to GCS with s3cmd (using GCS's S3 interoperability with HMAC keys). I created 10 parallel calls like s3cmd -v put -r sbtest/test1.ibd s3://some-bucket/some-folder/

There were 10 files (test1.ibd to test10.ibd). The entire process, running in parallel, transferred 53GB in 4 minutes.
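
For reference, this is roughly how the 10 parallel calls were launched (bucket/path placeholders as above):

# upload the 10 test files concurrently, then wait for all of them
for i in $(seq 1 10); do
  s3cmd -v put -r sbtest/test${i}.ibd s3://some-bucket/some-folder/ &
done
wait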

The xtrabackup backup with xbstream is not the problem, as it completes the stream of the same data in 3-4 minutes; it is xbcloud that is getting bottlenecked. The disk in use is an SSD, and I checked with our platform team that sufficient bandwidth is available between our infra and GCS. Bandwidth as the cause of the slowness has been ruled out by two tools (the internal Python script and s3cmd).
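
The stream-only timing can be reproduced by discarding the xbstream output, which isolates xtrabackup/xbstream from the upload path; a minimal sketch:

# stream only: measures xtrabackup + xbstream throughput without any upload
time xtrabackup --backup --stream=xbstream --parallel=10 --target-dir=/tmp/xtrabackup > /dev/null

If this finishes in 3-4 minutes while the full pipeline takes ~30, the time difference sits on the xbcloud side of the pipe.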

Hi @Rachit_Saxena

Can you do an extra test? Download the latest XtraBackup 8.0 tarball, run xbcloud from 8.0, and validate whether you get different results, something like:

/path/to/pxb_24/xtrabackup --backup --parallel=10 --stream=xbstream | /path/to/pxb_80/xbcloud put --parallel=50

The xbstream format is compatible between 2.4 and 8.0, and we recently ran a performance test on xbcloud using STDOUT/STDIN where we were able to achieve a max throughput of 1.8Gbps. What I see different from your tests is that you are using 2.4, while we used 8.0.

Hi Marcelo,

Is there a timeline for when the FIFO streams feature will be GA or removed? We explored FIFO streams and are seeing good backup timelines. Depending on FIFO streams reaching GA, we will use it in our production environment.
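
For reference, we tested it along the lines of the documented FIFO data sink example; the exact option values below are illustrative sketches based on the --fifo-streams/--fifo-dir options described in the Percona XtraBackup 8.0 docs, not a tuned recommendation:

# xtrabackup writes the stream into named pipes instead of STDOUT
xtrabackup --backup --stream=xbstream --parallel=10 --fifo-streams=4 --fifo-dir=/tmp/fifo --target-dir=/tmp/xtrabackup &
# xbcloud consumes the pipes from the same directory and uploads in parallel
xbcloud put --storage=google ... --fifo-dir=/tmp/fifo --parallel=40 $(date "+%F-%H-%M-%S")-backup-full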