Description:
The PSMDB operator fails to complete a smart update, leaving 3 mongos pods and 2 cfg pods in a non-ready state.
Steps to Reproduce:
It happens intermittently, when the operator tries to run a smart update.
Operator Logs:
2023-10-04T15:01:37.167Z INFO StatefulSet is not up to date {"controller": "psmdb-controller", "object": {"name":"psmdb-db","namespace":"percona"}, "namespace": "percona", "name": "psmdb-db", "reconcileID": "73af19f4-7e82-4805-a3c9-9b56a741a0e0", "sts": "psmdb-db-mongos"}
2023-10-04T15:01:42.401Z INFO StatefulSet is changed, starting smart update {"controller": "psmdb-controller", "object": {"name":"psmdb-db","namespace":"percona"}, "namespace": "percona", "name": "psmdb-db", "reconcileID": "b60264f8-fc2c-4ead-af8e-5eb22bc253e3", "name": "psmdb-db-mongos"}
2023-10-04T15:01:42.401Z INFO can't start/continue 'SmartUpdate': waiting for all replicas are ready {"controller": "psmdb-controller", "object": {"name":"psmdb-db","namespace":"percona"}, "namespace": "percona", "name": "psmdb-db", "reconcileID": "b60264f8-fc2c-4ead-af8e-5eb22bc253e3"}
2023-10-04T15:01:42.401Z INFO StatefulSet is not up to date {"controller": "psmdb-controller", "object": {"name":"psmdb-db","namespace":"percona"}, "namespace": "percona", "name": "psmdb-db", "reconcileID": "b60264f8-fc2c-4ead-af8e-5eb22bc253e3", "sts": "psmdb-db-mongos"}
2023-10-04T15:02:24.096Z INFO balancer enabled {"controller": "psmdb-controller", "object": {"name":"psmdb-db","namespace":"percona"}, "namespace": "percona", "name": "psmdb-db", "reconcileID": "c05b58d0-058a-4bf1-9c3a-4ca866f1c154"}
2023-10-04T15:02:24.098Z INFO Cluster state changed {"controller": "psmdb-controller", "object": {"name":"psmdb-db","namespace":"percona"}, "namespace": "percona", "name": "psmdb-db", "reconcileID": "c05b58d0-058a-4bf1-9c3a-4ca866f1c154", "previous": "initializing", "current": "ready"}
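For triage, the entries above can be split into timestamp, level, message, and the trailing JSON fields, which makes it easier to group lines by "reconcileID" or "sts". The following is only a minimal sketch: the helper name parse_operator_log is hypothetical, and it assumes every line has the shape "TIMESTAMP LEVEL message {json}" with no "{" inside the message, as in the lines shown here.

```python
import json


def parse_operator_log(line: str) -> dict:
    """Split one operator log line into timestamp, level, message and JSON fields."""
    # Assumption: everything before the first "{" is "TIMESTAMP LEVEL message",
    # and everything from that "{" onward is a single JSON object.
    head, brace, tail = line.partition("{")
    timestamp, level, message = head.strip().split(None, 2)
    fields = json.loads(brace + tail) if brace else {}
    return {"timestamp": timestamp, "level": level, "message": message, "fields": fields}


# First entry from the operator logs above.
line = (
    '2023-10-04T15:01:37.167Z INFO StatefulSet is not up to date '
    '{"controller": "psmdb-controller", "object": {"name":"psmdb-db","namespace":"percona"}, '
    '"namespace": "percona", "name": "psmdb-db", '
    '"reconcileID": "73af19f4-7e82-4805-a3c9-9b56a741a0e0", "sts": "psmdb-db-mongos"}'
)
record = parse_operator_log(line)
print(record["level"], record["message"], record["fields"]["sts"])
```

Note that the "starting smart update" entry carries the key "name" twice in its JSON payload; Python's json.loads keeps only the last value, so that entry would report "psmdb-db-mongos" under "name".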
Version:
[Insert the version number of the software]
Logs:
2023-10-04T16:50:31+02:00 2023-10-04T14:50:31.704Z INFO doing step down... {"controller": "psmdb-controller", "object": {"name":"psmdb-db","namespace":"percona"}, "namespace": "percona", "name": "psmdb-db", "reconcileID": "b8484a79-d1d9-4ffd-843f-501082f88774", "force": false}
2023-10-04T16:50:56+02:00 2023-10-04T14:50:56.508Z ERROR Reconciler error {"controller": "psmdb-controller", "object": {"name":"psmdb-db","namespace":"percona"}, "namespace": "percona", "name": "psmdb-db", "reconcileID": "b8484a79-d1d9-4ffd-843f-501082f88774", "error": "reconcile StatefulSet for cfg: failed to run smartUpdate: failed to do step down: replSetStepDown: connection pool for psmdb-db-cfg-0.psmdb-db-cfg.percona.svc.cluster.local:27017 was cleared because another operation failed with: (KeyNotFound) No keys found for HMAC that is valid for time: { ts: Timestamp(1696431014, 2) } with id: 7241585499830222852", "errorVerbose": "reconcile StatefulSet for cfg: failed to run smartUpdate: failed to do step down: replSetStepDown: connection pool for psmdb-db-cfg-0.psmdb-db-cfg.percona.svc.cluster.local:27017 was cleared because another operation failed with: (KeyNotFound) No keys found for HMAC that is valid for time: { ts: Timestamp(1696431014, 2) } with id: 7241585499830222852\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:412\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594"}
2023-10-04T16:50:56+02:00 sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
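To line the KeyNotFound error up with the rest of the log, the cluster time it reports can be converted to a wall-clock time. The sketch below assumes the first component of Timestamp(1696431014, 2) is seconds since the Unix epoch, as in BSON timestamps; the helper name hmac_error_time is hypothetical and is not part of the operator or of any MongoDB driver.

```python
import re
from datetime import datetime, timezone
from typing import Optional


def hmac_error_time(error_text: str) -> Optional[datetime]:
    """Return the UTC time embedded in a 'Timestamp(seconds, increment)' fragment."""
    match = re.search(r"Timestamp\((\d+),\s*(\d+)\)", error_text)
    if match is None:
        return None
    # Assumption: group 1 is seconds since the Unix epoch; group 2 is an
    # ordinal within that second and is ignored here.
    seconds = int(match.group(1))
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


error = (
    "(KeyNotFound) No keys found for HMAC that is valid for time: "
    "{ ts: Timestamp(1696431014, 2) } with id: 7241585499830222852"
)
print(hmac_error_time(error).isoformat())  # 2023-10-04T14:50:14+00:00
```

Under that assumption the timestamp is 2023-10-04 14:50:14 UTC, roughly 17 seconds before the "doing step down..." entry at 14:50:31.704Z.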
Expected Result:
The smart update completes and the PSMDB database functions normally.
Actual Result:
The PSMDB cluster and database are in a not-ready state.