Non-sharding cluster with exposed pods not working properly

jamoser · December 21, 2022, 4:48pm

Hello

It looks like there is a severe bug with a cluster in a non-sharded setup:

expose:
  enabled: true
  exposeType: LoadBalancer

NAME                             TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)           AGE
service/mongodb00-cluster-rs     ClusterIP      None           <none>         27017/TCP         14m
service/mongodb00-cluster-rs-0   LoadBalancer   10.16.9.235    34.65.55.114   27017:31821/TCP   10m
service/mongodb00-cluster-rs-1   LoadBalancer   10.16.12.110   34.65.86.207   27017:30988/TCP   10m
service/mongodb00-cluster-rs-2   LoadBalancer   10.16.4.11     34.65.88.39    27017:31658/TCP   10m

After deleting and recreating the cluster:

NAME                             TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)           AGE
service/mongodb00-cluster-rs     ClusterIP      None           <none>          27017/TCP         24m
service/mongodb00-cluster-rs-0   LoadBalancer   10.16.1.199    34.65.55.114    27017:31842/TCP   24m
service/mongodb00-cluster-rs-1   LoadBalancer   10.16.3.199    34.65.23.217    27017:32569/TCP   22m
service/mongodb00-cluster-rs-2   LoadBalancer   10.16.11.209   34.65.188.191   27017:31526/TCP   17m

As you can see, the external IPs are different after recreation, which is not a problem per se, but why is the cluster still looking for the old IPs?

{"t":{"$date":"2022-12-21T16:45:54.788+00:00"},"s":"I", "c":"CONNPOOL", "id":22576, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"34.65.88.39:27017"}}

{"t":{"$date":"2022-12-21T16:45:54.788+00:00"},"s":"I", "c":"CONNPOOL", "id":22576, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"34.65.86.207:27017"}}

=> How can this be fixed?
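To make the mismatch explicit, the addresses in the CONNPOOL log lines can be checked against the external IPs the services report after recreation. The following is a minimal sketch in plain JavaScript; the function name findStaleHosts is hypothetical, and the IPs and log lines are copied from this post rather than read from a live cluster.

```javascript
// External IPs reported by the services after the cluster was recreated
// (copied from the second service listing in this post).
const currentExternalIps = new Set([
  "34.65.55.114",
  "34.65.23.217",
  "34.65.188.191",
]);

// The two CONNPOOL log lines quoted in this post.
const logLines = [
  '{"t":{"$date":"2022-12-21T16:45:54.788+00:00"},"s":"I", "c":"CONNPOOL", "id":22576, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"34.65.88.39:27017"}}',
  '{"t":{"$date":"2022-12-21T16:45:54.788+00:00"},"s":"I", "c":"CONNPOOL", "id":22576, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"34.65.86.207:27017"}}',
];

// Hypothetical helper: return every hostAndPort seen in the log whose host
// part is not one of the currently assigned external IPs.
function findStaleHosts(lines, knownIps) {
  const stale = new Set();
  for (const line of lines) {
    let entry;
    try {
      entry = JSON.parse(line);
    } catch (err) {
      continue; // skip lines that are not structured JSON log entries
    }
    const hostAndPort = entry.attr && entry.attr.hostAndPort;
    if (typeof hostAndPort !== "string") continue;
    const idx = hostAndPort.lastIndexOf(":");
    const host = idx === -1 ? hostAndPort : hostAndPort.slice(0, idx);
    if (!knownIps.has(host)) stale.add(hostAndPort);
  }
  return Array.from(stale);
}

const staleHosts = findStaleHosts(logLines, currentExternalIps);
console.log(staleHosts);
```

Both addresses in the log belong to the first service listing (rs-2 and rs-1 before recreation), so the members are still dialing the old load balancer IPs.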

jamoser · December 22, 2022, 8:04am

It looks like these external IPs are persisted somewhere in the “Percona env”. The whole issue could be solved if that information could be reset/deleted.

Sergey_Pronin · December 26, 2022, 9:27am

Hey @jamoser ,

The problem is that these old IPs are still present on the PVCs and the new IPs are not used.

I’m not sure yet what the solution to this problem is or how we can keep the IPs intact. We will analyze what can be done.

At the same time we had the same issue when pausing the cluster: [K8SPSMDB-423] Unpause of a cluster doen't work with enabled replsets expose set to LB - Percona JIRA

So the recommended workaround for you is to pause the cluster, not delete it. Will that work for you?

jamoser · December 27, 2022, 8:47am

Hi @Sergey_Pronin

The solution would be to remove the old IPs - it would be sufficient if this could be done manually.

The issue is that if, for whatever reason, the cluster needs to be recreated, that basically “kills” the whole thing.

Re pause: yes, I already noticed that, but as I said above, the cluster must be recoverable after a “delete”.

Thanks & Regards
John

jamoser · December 28, 2022, 5:11pm

Tried to fix it with rs.reconfig()

> rs.reconfig(cfg);
{
	"topologyVersion" : {
		"processId" : ObjectId("63ac6f8eca8b6bf258024f65"),
		"counter" : NumberLong(1)
	},
	"ok" : 0,
	"errmsg" : "New config is rejected :: caused by :: replSetReconfig should only be run on a writable PRIMARY. Current state REMOVED;",
	"code" : 10107,
	"codeName" : "NotWritablePrimary"
}

But it seems another user of the Percona distribution had the same problem.

I mean we are going in circles … ?!
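For reference, the error above only says that a plain rs.reconfig() is accepted on a writable PRIMARY. MongoDB also has a forced form, rs.reconfig(cfg, { force: true }), which is meant for running on a non-primary member. Whether it is accepted in the REMOVED state on this deployment, and whether the operator would leave the edited config alone afterwards, is not confirmed anywhere in this thread, so treat the following as an untested sketch. The helper rewriteMemberHosts and the sample config (including the replica set name "rs") are hypothetical; the old and new addresses are taken from the two service listings above.

```javascript
// Old -> new external addresses, taken from the two service listings above.
const hostMap = new Map([
  ["34.65.86.207:27017", "34.65.23.217:27017"],  // rs-1
  ["34.65.88.39:27017", "34.65.188.191:27017"],  // rs-2
]);

// Hypothetical helper: return a copy of a replica set config in which every
// member host found in the map is replaced; the input object is not modified.
function rewriteMemberHosts(cfg, map) {
  const members = cfg.members.map((m) =>
    Object.assign({}, m, { host: map.has(m.host) ? map.get(m.host) : m.host })
  );
  return Object.assign({}, cfg, { members: members });
}

// Made-up config for illustration only; the real one would come from rs.conf().
const sampleCfg = {
  _id: "rs",
  version: 3,
  members: [
    { _id: 0, host: "34.65.55.114:27017" },
    { _id: 1, host: "34.65.86.207:27017" },
    { _id: 2, host: "34.65.88.39:27017" },
  ],
};

const newCfg = rewriteMemberHosts(sampleCfg, hostMap);
console.log(newCfg.members.map((m) => m.host));

// In the mongo shell, on one member, the corresponding (unverified) steps
// would be roughly:
//   cfg = rewriteMemberHosts(rs.conf(), hostMap);
//   rs.reconfig(cfg, { force: true });
```

The helper copies the config instead of editing it in place so the original from rs.conf() stays available for comparison if the forced reconfig is rejected as well.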

jamoser · December 29, 2022, 1:53pm

Please close this ticket. It’s really not going to work (soon) unless the MongoDB team takes some action.
