How do you configure multiple backup targets to different S3 buckets on different schedules?

Hey Guys

How do you configure Percona Backup for MongoDB to back up to multiple S3 buckets on different schedules?

We currently use Strata with MongoDB 3.4. Strata does not work with MongoDB 3.6+ because the RocksDB storage engine was removed, so we had to investigate other backup options. It looks like Percona Backup for MongoDB is the only option around for MongoDB 3.6+.

With Strata, we used to have several S3 backup locations to achieve the following goals:

  • different backup schedules: half-hourly, daily, monthly
  • different backup retention periods: 1 month for half-hourly backups, several months for daily backups, several years for monthly backups
  • different backup regions: US, Europe
  • different AWS accounts billed to different credit cards

With Percona Backup for MongoDB, none of this seems possible. I have tried the following:

  • different backup config on different nodes
  • separate pbm db user accounts used by different db nodes specified through --mongodb-uri command line argument
  • separate auth databases for those pbm db user accounts

Subsequent “pbm config” command line calls from different nodes all seem to modify the same config. For example, calling “pbm config” from node2 with pbm_config2.yaml (pointing to s3Bucket2 and using pbmUser2) leads to node1 also pointing to s3Bucket2.

Can anyone from Percona or from the community think of a way to set up multiple backup schedules to different S3 buckets?

I should mention that our Strata backups are all incremental, so they are fairly quick to do. The storage size across all our DBs using the RocksDB storage engine is 1.2 TB. The half-hourly job takes 5-10 minutes to run with an incremental size of 5-10 GB. The monthly job takes around 1-3 hours to run with an incremental size of 100-500 GB.

I wonder how all this would work with Percona Backup for MongoDB. Has anyone taken this solution to production? Apart from the standard setup described in the Percona Backup for MongoDB docs, what other contingencies have people in the community put in place for their production environments?

Thanks

Hello @asdf01 ,

The only solution for now is to use different PBM configs. You are correct that all nodes will switch to whichever config was applied last. To switch to another bucket, you just change the config again.

We do something similar in our k8s operator where we have an option to use multiple buckets.
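
The config switching described above could be scripted roughly as follows. This is only a sketch, not a procedure confirmed by Percona: the target name, file path and connection string are made up for illustration, and it assumes the pbm CLI is on the PATH and that “pbm config --file” and “pbm backup” behave as in the PBM docs for your version. It does not wait for a backup to finish, and depending on the PBM version, applying a config may start a resync of the backup list, so a real scheduler would need to wait or retry before switching again.

```python
import shlex
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """One backup destination. Names and paths here are hypothetical."""
    name: str
    config_file: str  # PBM YAML config that points at one S3 bucket


def build_commands(target, mongodb_uri):
    """Return the pbm invocations for one backup run against `target`."""
    uri = ["--mongodb-uri", mongodb_uri]
    return [
        # Apply the config for this bucket; this changes it cluster-wide.
        ["pbm", "config", "--file", target.config_file] + uri,
        # Start a backup to whatever storage the config now points at.
        ["pbm", "backup"] + uri,
    ]


def run_backup(target, mongodb_uri, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    for cmd in build_commands(target, mongodb_uri):
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical target; a cron entry per schedule could call this script.
    daily_eu = Target("daily-eu", "/etc/pbm/daily_eu.yaml")
    run_backup(daily_eu, "mongodb://localhost:27017", dry_run=True)
```

Each schedule (half-hourly, daily, monthly) would get its own Target and its own cron entry, with the caveat that only one config is active at any time.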

Hi @Sergey_Pronin.

Thanks for your insight.

When I tried switching from s3Bucket1 to s3Bucket2, subsequent backups seemed to be written to s3Bucket2 without problems. However, the pbm status command still lists the backups made against s3Bucket1 and reports that it can’t find those backup files in s3Bucket2.

That’s the bit I’m concerned about. It seems that pbm wasn’t built with this config switching or multiple config use case in mind. My fear is that if we use pbm in a way that’s not intended, there might be consequences down the road.

The way I see it, for a backup tool the consequences are these: if the DB fails and the restore tool fails, the company fails, and everyone in the company updates their CV and looks for a new job.

With PITR being turned off by pbm each time you do a full backup, my guess is that this config switching approach will not work with PITR incremental backups. So does everyone who uses this config switching strategy use full backups exclusively?

Multiple backup destinations aside, does everyone just do full backups daily? I can’t imagine a backup interval longer than daily backups being acceptable to management. The database will eventually grow to the extent that full backups take more than a day to create. Without the ability to do multiple backup configs and schedules, the backup interval will inevitably be pushed to be longer than a day as the db size grows larger.

Eventually we would get to a point where we are starting a new full backup as soon as the previous one finishes, and even that would no longer be frequent enough to meet the maximum acceptable data loss in our organisation’s disaster recovery policy.
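
As a rough back-of-envelope check of that worry, the arithmetic can be sketched like this. The 1 GB/min throughput is an assumption loosely taken from the Strata half-hourly numbers earlier in the thread (5-10 GB in 5-10 minutes), not a measured PBM figure, and the 1.2 TB is the RocksDB storage size, which may differ under another storage engine.

```python
def full_backup_hours(db_size_gb, throughput_gb_per_min):
    """Hours needed to copy db_size_gb at a constant throughput."""
    return db_size_gb / throughput_gb_per_min / 60


def max_db_size_gb(interval_hours, throughput_gb_per_min):
    """Largest DB size whose full backup still fits inside the interval."""
    return interval_hours * 60 * throughput_gb_per_min


if __name__ == "__main__":
    DB_SIZE_GB = 1200         # 1.2 TB, from the first post
    ASSUMED_GB_PER_MIN = 1.0  # assumption, not a PBM measurement
    print(full_backup_hours(DB_SIZE_GB, ASSUMED_GB_PER_MIN))
    print(max_db_size_gb(24, ASSUMED_GB_PER_MIN))
```

Under that assumed rate, a full backup of 1.2 TB takes 20 hours, and a daily full backup stops fitting into its 24-hour window once the data grows past 1440 GB.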

You would probably need a dedicated DB node just for backups. If a secondary node is constantly doing those full backups, it probably won’t be able to keep up with the more important replication operations.

I hope I’m missing something here. Percona Backup for MongoDB sounds a lot better than whatever MongoDB is offering, which seems to be nothing outside of their managed services. I don’t want to sound ungrateful, but it just doesn’t seem like Percona Backup is ready for production usage when the livelihood of all the company staff is on the line.

Hi @Sergey_Pronin. Thanks for sharing your insight. My other longer reply has been deleted by the admins.
