Controller.psmdb-controller Reconciler error

Hi guys.

Recently I installed the Percona Operator for MongoDB version 1.13 on OpenShift, and almost every month the mongos instances lose their connection to the database.
Reading the Percona Operator logs, I saw this:

2023-04-24T16:58:04.386Z ERROR controller.psmdb-controller Reconciler error {"name": "mongo-cluster", "namespace": "mongodb-ebagagem", "error": "reconcile StatefulSet for cfg: failed to run smartUpdate: failed to stop balancer: failed to get mongos connection: ping mongo: server selection error: context deadline exceeded, current topology: { Type: Unknown, Servers: [{ Addr: mongo-cluster-mongos.mongodb-ebagagem.svc.cluster.local:27017, Type: Unknown }, ] }", "errorVerbose": "reconcile StatefulSet for cfg: failed to run smartUpdate: failed to stop balancer: failed to get mongos connection: ping mongo: server selection error: context deadline exceeded, current topology: { Type: Unknown, Servers: [{ Addr: mongo-cluster-mongos.mongodb-ebagagem.svc.cluster.local:27017, Type: Unknown }, ] }\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermong…

sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/src/github.com/percona/percona-server-mongodb-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2

This seems to be causing a series of cascading errors, ending at the mongos pod, where the database goes down. mongos log:

"id":20883, "ctx":"conn18227628","msg":"Interrupted operation as its client disconnected","attr":{"opId":51384861}}

{"t":{"$date":"2023-04-24T20:09:20.010+00:00"},"s":"I", "c":"NETWORK", "id":22944, "ctx":"conn18227629","msg":"Connection ended","attr":{"remote":"127.0.0.1:52016","uuid":"53938424-f855-4643-b99a-faa1f69e5e55","connectionId":18227629,"connectionCount":2}}

mongos event:

(combined from similar events): Readiness probe failed: {"level":"info","msg":"Running Kubernetes readiness check for mongos","time":"2023-04-24T20:07:31Z"} {"level":"error","msg":"Member failed Kubernetes readiness check: run listDatabases: (Unauthorized) command listDatabases requires authentication","time":"2023-04-24T20:07:31Z"}

Can anybody help me?

Regards.

Well, even though no one answered me, I would like to share the solution for this case.

I figured out that the error was occurring because, during the cluster installation, I made a mistake and created an extra secret. As a result, the MongoDB namespace had two or more secrets for the environment's users. Every so often, for some reason, the operator got confused about which secret to use as the reference, and then the database service stopped.
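In case it helps someone else confirm the same situation, this is roughly how the duplicates can be spotted; a minimal sketch, assuming the cluster name and namespace from the logs above (mongo-cluster in mongodb-ebagagem):

```bash
# List all secrets in the cluster's namespace; with this problem present you
# will see more than one users secret (the default one plus the extra one
# created by mistake).
kubectl get secrets -n mongodb-ebagagem

# Check which users secret the Custom Resource currently points at, if any.
kubectl get psmdb mongo-cluster -n mongodb-ebagagem \
  -o jsonpath='{.spec.secrets.users}'
```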

The solution was to delete all the secrets, leaving only the one created by default during installation.
I also had to change the MongoDB instance YAML (the operator's Custom Resource), pointing it at the secret it should use.
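For reference, a sketch of the cleanup and of the Custom Resource change. The same change can also be made by editing the CR YAML directly; the kubectl patch below is just an equivalent one-liner, and the secret names are placeholders for whatever exists in your environment:

```bash
# Delete the extra secret(s), keeping only the one created during installation
# ("extra-users-secret" is a hypothetical name here).
kubectl delete secret extra-users-secret -n mongodb-ebagagem

# Point the Custom Resource explicitly at the remaining users secret so the
# operator no longer has to guess ("mongo-cluster-secrets" is a placeholder).
kubectl patch psmdb mongo-cluster -n mongodb-ebagagem --type merge \
  -p '{"spec":{"secrets":{"users":"mongo-cluster-secrets"}}}'
```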

Thank you for sharing the solution @fjareta! Real community spirit!

I'm not sure, though, how the operator could pick up some random secret. It seems you used the name that the operator uses by default, or specified one in the Custom Resource.