Restore to New Kubernetes cluster from S3 - No Oplog

Description:

I’m trying to restore to a new Kubernetes cluster from a backup in S3, but it is failing.

Steps to Reproduce:

  1. Create a backup from an existing cluster to S3 (PITR is enabled).
  2. Create a new PerconaServerMongoDB resource new-mongo in a new Kubernetes cluster and wait for mongo to start successfully.
  3. Create a new PerconaServerMongoDBRestore resource like this:
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: manual-restore-85kdj
  namespace: newmongo
spec:
  backupSource:
    destination: s3://mongo-backup/2024-11-29T16:52:03Z
    s3:
      bucket: mongo-backup
      credentialsSecret: mongo-backup-s3
      endpointUrl: https://minio-ha-hl.svc.cluster.local:9000
      insecureSkipTLSVerify: true
      prefix: ''
      region: us-east-1
    type: logical
  clusterName: new-mongo
  pitr:
    type: latest

Version:

percona-server-mongodb-operator:1.16.2

Logs:

2024-12-02T11:15:08+01:00 2024-12-02T10:15:08.427Z INFO Warning: Reconciler returned both a non-zero result and a non-nil error. The result will always be ignored if the error is non-nil and the non-nil error causes reqeueuing with exponential backoff. For more details, see: https://pkg.go.dev/sigs.k8s.io/controller-runtime/pkg/reconcile#Reconciler {"controller": "psmdbrestore-controller", "object": {"name":"manual-restore-85kdj","namespace":"newmongo"}, "namespace": "newmongo", "name": "manual-restore-85kdj", "reconcileID": "560be844-5dc0-47d9-90b7-6b665bb453ae"}
2024-12-02T11:15:08+01:00 2024-12-02T10:15:08.427Z ERROR Reconciler error {"controller": "psmdbrestore-controller", "object": {"name":"manual-restore-85kdj","namespace":"newmongo"}, "namespace": "newmongo", "name": "manual-restore-85kdj", "reconcileID": "560be844-5dc0-47d9-90b7-6b665bb453ae", "error": "reconcile logical restore: there is no oplogs that can cover the date/time or no oplogs at all", "errorVerbose": "there is no oplogs that can cover the date/time or no oplogs at all\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb/backup.init\n\t<autogenerated>:1\nruntime.doInit1\n\t/usr/local/go/src/runtime/proc.go:7176\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:7143\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:253\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695\nreconcile logical restore\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodbrestore.(*ReconcilePerconaServerMongoDBRestore).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodbrestore/perconaservermongodbrestore_controller.go:167\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:114\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:311\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:261\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:222\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"}
2024-12-02T11:15:08+01:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
2024-12-02T11:15:08+01:00 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:324
2024-12-02T11:15:08+01:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
2024-12-02T11:15:08+01:00 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:261
2024-12-02T11:15:08+01:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
2024-12-02T11:15:08+01:00 /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.18.1/pkg/internal/controller/controller.go:222

Expected Result:

Since I specified pitr.type as “latest”, I expect the database to be restored and any oplogs available after the backup time to be applied, not an error message.

Actual Result:

An error: “there is no oplogs that can cover the date/time or no oplogs at all”

Additional Information:

I have also tried specifying a target date like this:

  pitr:
    date: '2024-11-29 16:52:12'
    type: date

and had the same result.
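For what it’s worth, that target date (interpreted as UTC) lands exactly on the backup’s last_write_ts from the metadata excerpt below, so it should be within the covered window. A quick check in plain Python:

```python
from datetime import datetime, timezone

# The pitr date I passed, interpreted as UTC
target = datetime(2024, 11, 29, 16, 52, 12, tzinfo=timezone.utc)

# last_write_ts.T from the backup metadata (see excerpt below)
last_write_ts = 1732899132

print(int(target.timestamp()) == last_write_ts)  # True
```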
Here is an excerpt from the file 2024-11-29T16_52_03Z.pbm.json:

	"replsets": [
		{
			"name": "rs0",
			"backup_name": "2024-11-29T16:52:03Z/rs0/metadata.json",
			"oplog_name": "2024-11-29T16:52:03Z/rs0/oplog",
			"start_ts": 1732899117,
			"status": "done",
			"last_transition_ts": 1732899127,
			"first_write_ts": {
				"T": 1732899124,
				"I": 18
			},
			"last_write_ts": {
				"T": 1732899132,
				"I": 3
			},

Here is what I see in the oplog directory in S3:

20241129165204-18.20241129165212-3.gz
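As far as I can tell, that chunk filename encodes the backup’s first_write_ts and last_write_ts as compact UTC timestamps plus the increment, so the oplog chunk in S3 does match the metadata. A quick sanity check decoding the epoch values from the pbm.json excerpt above (plain Python, my own conversion, not a PBM API):

```python
from datetime import datetime, timezone

# first_write_ts / last_write_ts (T, I) from 2024-11-29T16_52_03Z.pbm.json
first = (1732899124, 18)
last = (1732899132, 3)

def fmt(ts, inc):
    # Oplog chunk names appear to use compact UTC timestamps: YYYYMMDDHHMMSS-<increment>
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y%m%d%H%M%S") + f"-{inc}"

print(f"{fmt(*first)}.{fmt(*last)}.gz")
# -> 20241129165204-18.20241129165212-3.gz, i.e. the chunk present in S3
```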

If I remove the pitr section from the restore resource, the restore is successful.

Hey @Stefan_Badenhorst ,

since some time has passed, I want to check before I start reproducing it: have you figured it out?

I have not found a reason why this happens, but I had to find another solution since I was under time pressure. I ended up deleting the cluster and starting over, so I no longer have it around to debug with.

So just to confirm: PITR works, and the error you had before was a one-time thing that you can't reproduce anymore. Right?

I can try again to specifically test this.