Hi guys.
I recently installed version 1.13 of the Percona Operator for MongoDB on OpenShift, and almost every month the mongos instances (and only those) lose their connection to the database.
In the Percona Operator logs I found this:
2023-04-24T16:58:04.386Z ERROR controller.psmdb-controller Reconciler error {"name": "mongo-cluster", "namespace": "mongodb-ebagagem", "error": "reconcile StatefulSet for cfg: failed to run smartUpdate: failed to stop balancer: failed to get mongos connection: ping mongo: server selection error: context deadline exceeded, current topology: { Type: Unknown, Servers: [{ Addr: mongo-cluster-mongos.mongodb-ebagagem.svc.cluster.local:27017, Type: Unknown }, ] }", "errorVerbose": "reconcile StatefulSet for cfg: failed to run smartUpdate: failed to stop balancer: failed to get mongos connection: ping mongo: server selection error: context deadline exceeded, current topology: { Type: Unknown, Servers: [{ Addr: mongo-cluster-mongos.mongodb-ebagagem.svc.cluster.local:27017, Type: Unknown }, ] }\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermong…
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
This seems to cause a series of cascading errors that ends at the mongos pod, where the database goes down. mongos log:
"id":20883, "ctx":"conn18227628","msg":"Interrupted operation as its client disconnected","attr":{"opId":51384861}}
{"t":{"$date":"2023-04-24T20:09:20.010+00:00"},"s":"I", "c":"NETWORK", "id":22944, "ctx":"conn18227629","msg":"Connection ended","attr":{"remote":"127.0.0.1:52016","uuid":"53938424-f855-4643-b99a-faa1f69e5e55","connectionId":18227629,"connectionCount":2}}
mongos pod event:
(combined from similar events): Readiness probe failed: {"level":"info","msg":"Running Kubernetes readiness check for mongos","time":"2023-04-24T20:07:31Z"} {"level":"error","msg":"Member failed Kubernetes readiness check: run listDatabases: (Unauthorized) command listDatabases requires authentication","time":"2023-04-24T20:07:31Z"}
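Since the operator error is a server selection timeout against the mongos service, one thing I can try next time it happens is a plain TCP check from a pod in the same namespace. Below is a minimal sketch in Python, assuming Python is available in that pod. The helper name `can_connect` is my own, and the host and port are copied from the operator log above. It only tests whether the port accepts connections; it does not authenticate and is not the operator's actual ping.

```python
import socket

# Service address and port copied from the operator error message.
MONGOS_HOST = "mongo-cluster-mongos.mongodb-ebagagem.svc.cluster.local"
MONGOS_PORT = 27017


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    `timeout` bounds the connection attempt in seconds.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS resolution failures, refused connections, and timeouts.
        return False


# Example (run inside the cluster):
# print(can_connect(MONGOS_HOST, MONGOS_PORT))
```

If this returns True while the operator still reports the timeout, the problem is probably not basic network reachability of the service.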
Can anybody help me?
Regards.