Xtrabackup and xbcloud to S3 for migration


We are migrating our infrastructure from on-prem to AWS and we’re using xtrabackup to facilitate the migration. I have successfully used xtrabackup to create the files locally, copied them to S3 via the AWS CLI, and performed a “restore from S3”, which worked perfectly. The command I used to perform the backup was:

xtrabackup --backup --compress --stream=xbstream --target-dir=/awsmysqlmigration | split -d --bytes=500MB - /awsmysqlmigration/backup.xbstream

What I would like to do for our larger databases is have the data streamed directly to S3 to avoid any disk space issues on-prem. I have tried piping to xbcloud, and it creates a ton of files in the S3 bucket, but none have the .xbstream extension. The file names all just end in long numeric suffixes like 000000000001. Is there a way to have the backup files created properly while streaming to S3, so I can do a restore from S3 like I did when copying the files in two steps? Here is the command I’m running. It copies files, but the restore process doesn’t seem to know how to process them.

xtrabackup --backup --compress --stream=xbstream --target-dir=/awsmysqlmigration | xbcloud put TEST --parallel=8 --storage=S3 --s3-access-key=xxx --s3-secret-key=yyyy --s3-bucket=zzzzz

Thank you.


Hello @ghansen11,
You cannot use xbcloud to create a backup which can be used for ‘restore from S3’. xbcloud can only be used for your own backup/restore processes, not AWS’s.

You need to use the AWS CLI like so: xtrabackup --backup --stream=xbstream | aws s3 cp - s3://bucket/backup.xbstream

If your backup is larger than 50GB (the threshold the AWS CLI documentation gives for streamed uploads), you will need to add the --expected-size flag to the cp command: aws s3 cp - s3://… --expected-size <sizeOfDBInBytes>
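
As a rough illustration, the streamed upload with a size hint could be wrapped like this. It is a minimal sketch, assuming GNU `du` is available and that the on-disk size of the datadir is an acceptable estimate for the uncompressed stream. The function names, the datadir path, and the bucket/key in the example are hypothetical placeholders, not anything from this thread.

```shell
#!/usr/bin/env bash
# Sketch only: function names, paths, and the bucket/key are placeholders.

# Estimate the stream size in bytes from the on-disk size of the datadir.
# Assumes GNU du (-s summarize, -b apparent size in bytes).
estimate_backup_bytes() {
    du -sb "$1" | cut -f1
}

# Stream the backup straight to S3 without staging it on local disk.
# --expected-size lets the CLI choose a multipart chunk size large enough
# to stay under S3's 10,000-part limit for a single upload.
stream_backup_to_s3() {
    local datadir="$1" dest="$2" size
    size="$(estimate_backup_bytes "$datadir")" || return 1
    xtrabackup --backup --stream=xbstream \
        | aws s3 cp - "$dest" --expected-size "$size"
}

# Example (not run here):
#   set -o pipefail
#   stream_backup_to_s3 /var/lib/mysql s3://bucket/backup.xbstream
```

Setting `pipefail` in the calling shell is worth considering, since otherwise a failed xtrabackup can be masked by a successful `aws s3 cp` exit status.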

I performed this exact procedure for a client back in January. They had a 5TB MySQL database moving from on-prem to RDS.

ah great thank you very much!


Quick update and question. This worked very well and I was able to restore the DB. What I didn’t realize is that Aurora has a significantly smaller max_binlog_size, and thus relay log size, than the defaults for my on-prem MySQL 5.7. Aurora limits it to 100 MB and the local is 100 GB. I’m hoping I can just adjust this via a SET GLOBAL statement on-prem without restarting. My question is, since I already have the large backup on S3, is there any way to have an incremental backup run from the point where the replication stopped? It basically stopped in Aurora while it waited for more relay storage, which won’t happen. Since I streamed the entire backup, I don’t see any of the checkpoint files to find the LSN entries, but I do have the slave status, so I’m wondering if there’s a way to do it without starting all over.

thank you,
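
On the SET GLOBAL part of the question: max_binlog_size is a dynamic global variable in MySQL 5.7, so it can be changed on the source without a restart. Below is a minimal sketch, assuming a 100 MB target to match the Aurora limit mentioned above. The helper name is hypothetical, and whether this change alone unblocks the stalled replica is not established in this thread.

```shell
# Sketch only: the helper name is a placeholder.

# Print the statements that cap new binlog files at the given size in MB.
# max_binlog_size is set in bytes; its valid range is 4096 to 1073741824.
binlog_size_sql() {
    local mb="$1"
    local bytes=$(( mb * 1024 * 1024 ))
    # FLUSH BINARY LOGS rotates to a new file so the new cap applies
    # from the next binlog onward; existing files are not re-split.
    printf 'SET GLOBAL max_binlog_size = %d;\nFLUSH BINARY LOGS;\n' "$bytes"
}

# Example (not run here): apply on the on-prem source
#   binlog_size_sql 100 | mysql
```

Note that a transaction is written to the binary log in one chunk, so a single large transaction can still produce a binlog file bigger than the configured maximum.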