Auto-create buckets for S3 backup

Hello,

I’m starting a new cluster from scratch. I gave the CRD AWS keys with full S3 privileges, including bucket creation. However, I’m getting the backup error “The specified bucket does not exist” unless the bucket is pre-created.

Is there a way to make the backup pod auto-create the bucket if it doesn’t exist? I’m aware that I could create the bucket manually beforehand, but since I plan to run several databases, it would be useful to change the behaviour so the bucket is created automatically when it is not found.

Here is the full log of the backup container:

+ LIB_PATH=/usr/lib/pxc
+ . /usr/lib/pxc/backup.sh
++ set -o errexit
++ SST_INFO_NAME=sst_info
++ CURL_RET_ERRORS_ARG=--curl-retriable-errors=7
++ INSECURE_ARG=
++ '[' -n true ']'
++ [[ true == \f\a\l\s\e ]]
++ S3_BUCKET_PATH=cluster2-2023-06-14-01:35:47-full
+++ date +%F-%H-%M
++ BACKUP_PATH=cluster2-pxc-2023-06-14-01-38-xtrabackup.stream
+ GARBD_OPTS=
+ check_ssl
+ CA=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
+ '[' -f /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt ']'
+ SSL_DIR=/etc/mysql/ssl
+ '[' -f /etc/mysql/ssl/ca.crt ']'
+ CA=/etc/mysql/ssl/ca.crt
+ SSL_INTERNAL_DIR=/etc/mysql/ssl-internal
+ '[' -f /etc/mysql/ssl-internal/ca.crt ']'
+ CA=/etc/mysql/ssl-internal/ca.crt
+ KEY=/etc/mysql/ssl/tls.key
+ CERT=/etc/mysql/ssl/tls.crt
+ '[' -f /etc/mysql/ssl-internal/tls.key -a -f /etc/mysql/ssl-internal/tls.crt ']'
+ KEY=/etc/mysql/ssl-internal/tls.key
+ CERT=/etc/mysql/ssl-internal/tls.crt
+ '[' -f /etc/mysql/ssl-internal/ca.crt -a -f /etc/mysql/ssl-internal/tls.key -a -f /etc/mysql/ssl-internal/tls.crt ']'
+ GARBD_OPTS='socket.ssl_ca=/etc/mysql/ssl-internal/ca.crt;socket.ssl_cert=/etc/mysql/ssl-internal/tls.crt;socket.ssl_key=/etc/mysql/ssl-internal/tls.key;socket.ssl_cipher=;pc.weight=0;'
+ '[' -n giba-sample-apps-prod-cluster2-backup ']'
+ clean_backup_s3
+ mc_add_bucket_dest
+ echo '+ mc -C /tmp/mc  config host add dest https://s3.amazonaws.com ACCESS_KEY_ID SECRET_ACCESS_KEY '
+ mc -C /tmp/mc  config host add dest https://s3.amazonaws.com ACCESS_KEY_ID SECRET_ACCESS_KEY 
Added `dest` successfully.
+ is_object_exist giba-sample-apps-prod-cluster2-backup cluster2-2023-06-14-01:35:47-full.sst_info
+ local bucket=giba-sample-apps-prod-cluster2-backup
+ local object=cluster2-2023-06-14-01:35:47-full.sst_info
++ mc -C /tmp/mc --json ls dest/giba-sample-apps-prod-cluster2-backup/cluster2-2023-06-14-01:35:47-full.sst_info
++ jq .status
+ [[ -n "error" ]]
+ return 1
+ xbcloud delete --curl-retriable-errors=7 --storage=s3 --s3-bucket=giba-sample-apps-prod-cluster2-backup cluster2-2023-06-14-01:35:47-full.sst_info
230614 01:38:39 xbcloud: Successfully connected.
230614 01:38:39 xbcloud: Failed to list objects. Error message: The specified bucket does not exist
230614 01:38:39 xbcloud: Delete failed. Cannot list cluster2-2023-06-14-01:35:47-full.sst_info.
Stream closed EOF for giba/xb-cron-cluster2-s3-backup-202361413547-8fa30-gstx6 (xtrabackup)

I found a workaround. The idea is to use something like:

s3:
  bucket: bucketname/subfolder
...

Here bucketname must be pre-created as a bucket in S3 beforehand, while subfolder is created dynamically at the start of the backup.
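
For reference, here is a minimal sketch of how this workaround looks in the PerconaXtraDBCluster custom resource, under spec.backup.storages; the storage name, bucket, secret name, and region below are hypothetical placeholders:

spec:
  backup:
    storages:
      s3-us-east:                # hypothetical storage name
        type: s3
        s3:
          # pre-created bucket plus a per-cluster subfolder created on the fly
          bucket: my-backups-bucket/cluster2
          credentialsSecret: my-cluster-backup-s3   # secret holding the AWS keys
          region: us-east-1

With this layout, one pre-created bucket can hold backups for several clusters, each under its own subfolder.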

Please let me know if there are any caveats with this workaround.

Thanks

Hello @gmautner ,

You are absolutely right, this is the intended way of using a single bucket for multiple clusters.

As for bucket creation, this operation could be automated, but it comes with a lot of IFs (a manual pre-creation sketch follows the list):

  1. proper access to the object storage (the credentials would need bucket-creation privileges)
  2. the name of the bucket (on AWS S3, bucket names are globally unique, so we would have to be creative and generate some random hashes)
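
In the meantime, one way to handle pre-creation is a small script run once before deploying a new cluster. A minimal sketch, using the MinIO client already present in the backup image (as seen in the log above) and assuming an mc version that supports mb --ignore-existing; the bucket name and region are hypothetical placeholders:

# register the S3 endpoint under an alias, mirroring what the backup script does
mc -C /tmp/mc config host add dest https://s3.amazonaws.com "$ACCESS_KEY_ID" "$SECRET_ACCESS_KEY"
# create the bucket only if it does not already exist
mc -C /tmp/mc mb --ignore-existing --region us-east-1 dest/my-backups-bucket

Once the bucket exists, the bucketname/subfolder form described above keeps each cluster’s backups separated without any further manual steps.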