Hi everyone
Here is the configuration of my test cluster for backup/restore/update:
Kubernetes: 1.29.10
Helm chart psmdb-operator: 1.18.0 (1.19.0 could not be deployed)
Helm chart psmdb-db: 1.18.0 (1.19.0 could not be deployed)
Mongod: 6.0.19-16-multi
Operator: 1.18.0
PBM: 2.7.0-multi (2.8.0-multi gives an index error on restore)
9 pods in total for 1 replica set: 1 operator, 3 cfg, 3 rs0, 2 mongos routers
So I want to do a full restore of my MongoDB databases.
For that I want to use an existing full physical backup of 22 GiB.
But when I launch the restore with a K8s manifest, it starts correctly and then never finishes: I left it running for 24 hours and the restore status is still running!
Here is the manifest k8s file that I used:
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: xxxxx-restaure-physical-full-test-1
spec:
  clusterName: "psmdb-db-test"
  backupSource:
    type: "physical"
    destination: "s3://xxxxx-mongo-backup-physical-test/2025-02-04T05:30:21Z"
    s3:
      credentialsSecret: "pbm-mongodb-xxxxxx-xx"
      region: "fr-par"
      bucket: "xxxxx-mongo-backup-physical-test"
      endpointUrl: "https://s3.fr-par.xxx.xxxx"
And here is what happens after applying the manifest with kubectl:
- The restore goes to status waiting
- The PBM containers of each pod stop
- Both mongos routers stop
- The restore goes to status running
- The restoration begins …
The kubectl describe output for my restore:
Name:         xxxxx-restaure-physical-full-test-1
Namespace:    percona-mongodb-test
Labels:       <none>
Annotations:  <none>
API Version:  psmdb.percona.com/v1
Kind:         PerconaServerMongoDBRestore
Metadata:
  Creation Timestamp:  2025-02-04T16:53:06Z
  Generation:          1
  Resource Version:    41116260131
  UID:                 ed9749e0-6850-4b6f-b757-f81043750ed3
Spec:
  Backup Source:
    Destination:  s3://xxxxx-mongo-backup-physical-test/2025-02-04T05:30:21Z
    s3:
      Bucket:              xxxxx-mongo-backup-physical-test
      Credentials Secret:  pbm-mongodb-xxxxxx-xx
      Endpoint URL:        https://s3.fr-par.xxx.xxxx
      Region:              fr-par
    Type:  physical
  Cluster Name:  psmdb-db-xxxxx
Status:
  Pbm Name:  2025-02-04T17:08:15.142876273Z
  State:     running
Events:      <none>
And now the log of psmdb-operator:
2025-02-05T09:31:11.911Z DEBUG PBM restore status {"controller": "psmdbrestore-controller", "object": {"name":"xxxxx-restaure-physical-full-test-1","namespace":"percona-mongodb-test"}, "namespace": "percona-mongodb-test", "name": "xxxxx-restaure-physical-full-test-1", "reconcileID": "b9491d27-98f2-4fdc-a57d-4e1bcc98c0d7", "status": {"type":"physical","opid":"","name":"2025-02-04T17:08:15.142876273Z","replsets":[{"name":"cfg","start_ts":0,"status":"done","last_transition_ts":1738689045,"first_write_ts":{"T":0,"I":0},"last_write_ts":{"T":0,"I":0},"node":"","conditions":null},{"name":"rs0","start_ts":0,"status":"down","last_transition_ts":1738688969,"first_write_ts":{"T":0,"I":0},"last_write_ts":{"T":0,"I":0},"node":"","conditions":null}],"compression":"","store":{"type":""},"size":0,"mongodb_version":"","fcv":"","start_ts":0,"last_transition_ts":1738688924,"first_write_ts":{"T":0,"I":0},"last_write_ts":{"T":0,"I":0},"hb":{"T":0,"I":0},"status":"running","conditions":null,"n":null,"pbm_version":"","balancer":""}}
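To make that log line easier to read, here is a small Python sketch that summarizes the status per replica set. It uses a trimmed copy of the "status" object from the log above (only the fields it needs). The helper names to_utc and summarize are my own, and I am assuming that last_transition_ts is a Unix timestamp in seconds (UTC).

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the "status" object from the operator log line above;
# fields that are empty or unused here have been left out.
status_json = """
{"type": "physical",
 "name": "2025-02-04T17:08:15.142876273Z",
 "replsets": [
   {"name": "cfg", "status": "done", "last_transition_ts": 1738689045},
   {"name": "rs0", "status": "down", "last_transition_ts": 1738688969}],
 "last_transition_ts": 1738688924,
 "status": "running"}
"""

def to_utc(ts):
    # Assumption: the timestamp is Unix seconds in UTC.
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

def summarize(status):
    # First row is the overall restore, then one row per replica set.
    rows = [("restore", status["status"], to_utc(status["last_transition_ts"]))]
    for rs in status.get("replsets") or []:
        rows.append((rs["name"], rs["status"], to_utc(rs["last_transition_ts"])))
    return rows

for name, state, when in summarize(json.loads(status_json)):
    print(f"{name:8} {state:8} {when}")
```

With the values from the log, cfg reached done at 2025-02-04T17:10:45Z, while rs0 went to down at 2025-02-04T17:09:29Z and has not changed since, even though the log line itself was written at 2025-02-05T09:31:11Z.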
Can you please help me understand what the problem is, and how I can stop the physical restore cleanly so that the cluster returns to a normal operating state?
Thank you