Description:
The MongoDB cluster cannot fail over or recover after all of its pods go down at the same time
Steps to Reproduce:
kubectl delete pod <all pods in the replica set> --force -n <namespace>
or shut down all nodes in the K8s cluster
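The first reproduction step can be sketched as a small script. This is only an illustration: the pod names are derived from the hostnames that appear in the operator log (mongo-psmdb-db-rs0-0 through mongo-psmdb-db-rs0-2 in namespace tungdt), and the variable names, the replica set size of 3, and the DRY_RUN switch are assumptions added for the example. By default it only prints the command; set DRY_RUN=0 to actually delete the pods.

```shell
#!/bin/sh
# Sketch: force-delete every pod of one replica set at the same time.
# All values below are assumptions taken from the log in this report.
set -eu

NAMESPACE="${NAMESPACE:-tungdt}"
CLUSTER="${CLUSTER:-mongo-psmdb-db}"
REPLSET="${REPLSET:-rs0}"
SIZE="${SIZE:-3}"          # assumed number of replica set members
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the command (hypothetical switch)

# Print the StatefulSet-style pod names, separated by spaces.
pod_names() {
  i=0
  while [ "$i" -lt "$SIZE" ]; do
    printf '%s-%s-%s ' "$CLUSTER" "$REPLSET" "$i"
    i=$((i + 1))
  done
}

# Delete all pods in a single kubectl call so they go down together.
delete_all_pods() {
  # Word splitting is intended here: pod names contain no whitespace.
  set -- $(pod_names)
  if [ "$DRY_RUN" = "1" ]; then
    echo "kubectl delete pod $* --force -n $NAMESPACE"
  else
    kubectl delete pod "$@" --force -n "$NAMESPACE"
  fi
}

delete_all_pods
```

Deleting all pods in one call matters for the reproduction: if the pods are removed one at a time, the remaining members may still hold a majority and elect a primary, which would hide the problem.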
Version:
Percona Operator for MongoDB 1.15.0
Logs:
Logs in Operator:
2024-04-25T08:48:42.241Z ERROR failed to reconcile cluster {"controller": "psmdb-controller", "object": {"name":"mongo-psmdb-db","namespace":"tungdt"}, "namespace": "tungdt", "name": "mongo-psmdb-db", "reconcileID": "7614bf82-1bf1-43e0-b4ab-f07b6c5a358c", "replset": "rs0", "error": "dial: ping mongo: server selection error: context deadline exceeded, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: mongo-psmdb-db-rs0-0.mongo-psmdb-db-rs0.tungdt.svc.cluster.local:27017, Type: Unknown, Last error: dial tcp: lookup mongo-psmdb-db-rs0-0.mongo-psmdb-db-rs0.tungdt.svc.cluster.local on 10.43.0.10:53: no such host }, { Addr: mongo-psmdb-db-rs0-1.mongo-psmdb-db-rs0.tungdt.svc.cluster.local:27017, Type: Unknown, Last error: dial tcp: lookup mongo-psmdb-db-rs0-1.mongo-psmdb-db-rs0.tungdt.svc.cluster.local on 10.43.0.10:53: no such host }, { Addr: mongo-psmdb-db-rs0-2.mongo-psmdb-db-rs0.tungdt.svc.cluster.local:27017, Type: Unknown, Last error: dial tcp: lookup mongo-psmdb-db-rs0-2.mongo-psmdb-db-rs0.tungdt.svc.cluster.local on 10.43.0.10:53: no such host }, ] }", "errorVerbose": "server selection error: context deadline exceeded, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: mongo-psmdb-db-rs0-0.mongo-psmdb-db-rs0.tungdt.svc.cluster.local:27017, Type: Unknown, Last error: dial tcp: lookup mongo-psmdb-db-rs0-0.mongo-psmdb-db-rs0.tungdt.svc.cluster.local on 10.43.0.10:53: no such host }, { Addr: mongo-psmdb-db-rs0-1.mongo-psmdb-db-rs0.tungdt.svc.cluster.local:27017, Type: Unknown, Last error: dial tcp: lookup mongo-psmdb-db-rs0-1.mongo-psmdb-db-rs0.tungdt.svc.cluster.local on 10.43.0.10:53: no such host }, { Addr: mongo-psmdb-db-rs0-2.mongo-psmdb-db-rs0.tungdt.svc.cluster.local:27017, Type: Unknown, Last error: dial tcp: lookup mongo-psmdb-db-rs0-2.mongo-psmdb-db-rs0.tungdt.svc.cluster.local on 10.43.0.10:53: no such host }, ] }\nping mongo\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo.Dial\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo/mongo.go:112\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb.MongoClient\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/psmdb/client.go:62\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*mongoClientProvider).Mongo\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/connections.go:38\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).mongoClientWithRole\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/connections.go:60\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileCluster\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:87\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:498\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.1/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.1/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.1/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.1/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598\ndial\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileCluster\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:93\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:498\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.1/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.1/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.1/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.1/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598"}
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile
/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:500
Expected Result:
The cluster returns to normal operation and rs.status() displays information about the ready state of the cluster
Actual Result:
The cluster enters the RS Ghost state and becomes inoperable