How to restore from scheduled backups using the Percona Operator for MongoDB, and how to validate that the restoration was successful?

Description:

I am trying to restore from scheduled backups, but it is not successful. Please also guide me on how to validate a restoration after it completes.

Steps to Reproduce:

I have the following:

  1. cr.yaml - configured for scheduled backups
  2. backup.yaml - for on-demand backups
  3. restore.yaml - for on-demand restoration
  4. Both the on-demand backup and the scheduled backups are available.
  5. I am able to restore using the on-demand backup - how do I validate it? Please share the steps.
  6. I am unable to restore using the scheduled backups - please guide me.

Version:

1.15.0

Logs:

Logs after applying the restore for the on-demand backup (restore1):

kubectl get psmdb-restore -n psmdb
NAME       CLUSTER           STATUS   AGE
restore1   my-cluster-name   ready    14d
kubectl exec -it my-cluster-name-rs0-0 -c backup-agent -n psmdb -- bash
bash-4.4$ 
bash-4.4$ pbm logs
2024-04-24T06:19:45Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] oplog replay finished on {1712671192 10}
2024-04-24T06:19:45Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] restoring indexes for admin.pbmOpLog: opid_1_replset_1
2024-04-24T06:19:45Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] restoring indexes for admin.system.roles: role_1_db_1
2024-04-24T06:19:45Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] restoring indexes for admin.pbmBackups: name_1, start_ts_1_status_1
2024-04-24T06:19:45Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] restoring indexes for admin.system.users: user_1_db_1
2024-04-24T06:19:45Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] restoring indexes for admin.pbmLockOp: replset_1_type_1
2024-04-24T06:19:45Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] restoring indexes for admin.pbmPITRChunks: rs_1_start_ts_1_end_ts_1, start_ts_1_end_ts_1
2024-04-24T06:19:45Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] restoring indexes for admin.pbmLock: replset_1
2024-04-24T06:19:45Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] restoring indexes for config.shards: host_1
2024-04-24T06:19:45Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] restoring indexes for config.chunks: uuid_1_min_1, uuid_1_shard_1_min_1, uuid_1_lastmod_1
2024-04-24T06:19:45Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] restoring indexes for config.tags: ns_1_min_1, ns_1_tag_1
2024-04-24T06:19:46Z I [cfg/my-cluster-name-cfg-1.my-cluster-name-cfg.psmdb.svc.cluster.local:27017] [restore/2024-04-24T06:19:36.910138296Z] recovery successfully finished

After applying restore3, I didn't get any pbm logs - only the older logs were available.

Expected Result:

restore3 should reach the ready status.

Actual Result:

kubectl get psmdb-restore -n psmdb
NAME       CLUSTER           STATUS   AGE
restore1   my-cluster-name   ready    14d
restore3   my-cluster-name            43h
restore4   my-cluster-name            77m

Additional Information:

kubectl get psmdb-backup -n psmdb
NAME                                        CLUSTER           STORAGE      DESTINATION                                  TYPE      STATUS   COMPLETED   AGE
backup-mycluster                            my-cluster-name   azure-blob   azure://percona/psmdb/2024-04-09T13:59:44Z   logical   ready    14d         14d
cron-my-cluster-name-20240422104500-m9nwh   my-cluster-name   azure-blob   azure://percona/psmdb/2024-04-22T10:45:21Z   logical   ready    42h         42h
cron-my-cluster-name-20240423104500-7qtzx   my-cluster-name   azure-blob   azure://percona/psmdb/2024-04-23T10:45:21Z   logical   ready    18h         18h
cron-my-cluster-name-20240423133000-n6jrv   my-cluster-name   azure-blob   azure://percona/psmdb/2024-04-23T13:30:21Z   logical   ready    15h         15h
cat restore.yaml 
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: restore3
spec:
  clusterName: my-cluster-name
  backupName: cron-my-cluster-name-20240422104500-m9nwh
  #pitr:
    #type: date
    #date: 2024-04-22T10:45:28Z
  backupSource:
    # type: physical
    destination: azure://percona/psmdb/
#    s3:
#      credentialsSecret: my-cluster-name-backup-s3
#      serverSideEncryption:
#        kmsKeyID: 1234abcd-12ab-34cd-56ef-1234567890ab
#        sseAlgorithm: AES256
#        sseCustomerAlgorithm: AES256
#        sseCustomerKey: Y3VzdG9tZXIta2V5
#      region: us-west-2
#      bucket: S3-BACKUP-BUCKET-NAME-HERE
#      endpointUrl: https://s3.us-west-2.amazonaws.com/
#      prefix: ""
    azure:
      credentialsSecret: my-cluster-azure-secret2
      prefix: psmdb
      container: percona

Hello @Raji ,

can you please show the output of these two commands:

kubectl describe psmdb-restore restore3
kubectl get psmdb-restore restore3 -o yaml 

I’m curious if it throws any errors there.

Also you might look into the Pods’ logs for errors:
kubectl logs my-cluster-name-rs0-X -c backup-agent - check those for errors in addition to pbm logs, as the Operator might run PBM from a different Pod (not only rs0-0).
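For example, a quick way to scan the backup-agent logs across all replica set members (a sketch, assuming a three-member rs0 replica set in the psmdb namespace as in the outputs above) would be:

for i in 0 1 2; do
  echo "--- my-cluster-name-rs0-$i ---"
  kubectl logs my-cluster-name-rs0-$i -c backup-agent -n psmdb --tail=200 | grep -iE 'error|fail' || true
done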

Hi Sergey_Pronin,

I am able to restore from the scheduled backups using the cron names.
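For reference, a minimal restore manifest along these lines (referencing the scheduled backup by its psmdb-backup name; the restore name restore-from-cron is just an illustrative placeholder) is enough:

apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: restore-from-cron
spec:
  clusterName: my-cluster-name
  backupName: cron-my-cluster-name-20240423133000-n6jrv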

Could you please let me know the validation procedure for a restoration?

Hello @Raji ,

I assume that by cron names you mean the custom resources - psmdb-backup and psmdb-restore.

As for validation - there are two things:

  1. Validating the restoration process itself - just check kubectl get psmdb-restore RESTORENAME - it should show a success (ready) status. The status is set to ready only once the backup has been restored in full (see the check sketched after this list).

  2. Validating the data in the backup - this is not something the Operator does. It should be validated through your internal procedures or application checks (a purely illustrative check is sketched below); we simply don't know what is in your data.
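For the restore status, a quick check would look roughly like this (assuming the restore object restore3 in the psmdb namespace, as above; the STATUS column should map to .status.state):

kubectl get psmdb-restore restore3 -n psmdb -o jsonpath='{.status.state}{"\n"}'
# expected output once the backup has been restored in full:
# ready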
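For the data itself, one purely illustrative option is to compare a document count taken before the backup with the count after the restore. The database appdb, the collection orders, and the credentials below are hypothetical placeholders; for a sharded cluster like the one above, connect through the mongos service (the default service name should be my-cluster-name-mongos), and use mongo instead of mongosh on older images:

kubectl exec -it my-cluster-name-rs0-0 -c mongod -n psmdb -- \
  mongosh "mongodb://databaseAdmin:PASSWORD@my-cluster-name-mongos.psmdb.svc.cluster.local/admin" \
  --eval 'db.getSiblingDB("appdb").orders.countDocuments()'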