Non-sharding cluster with exposed pods not working properly

Hello

Looks there is a severe bug with cluster in non-sharding setup:

expose:
  enabled: true
  exposeType: LoadBalancer

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/mongodb00-cluster-rs ClusterIP None 27017/TCP 14m
service/mongodb00-cluster-rs-0 LoadBalancer 10.16.9.235 34.65.55.114 27017:31821/TCP 10m
service/mongodb00-cluster-rs-1 LoadBalancer 10.16.12.110 34.65.86.207 27017:30988/TCP 10m
service/mongodb00-cluster-rs-2 LoadBalancer 10.16.4.11 34.65.88.39 27017:31658/TCP 10m

After deletion and recreation

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/mongodb00-cluster-rs ClusterIP None 27017/TCP 24m
service/mongodb00-cluster-rs-0 LoadBalancer 10.16.1.199 34.65.55.114 27017:31842/TCP 24m
service/mongodb00-cluster-rs-1 LoadBalancer 10.16.3.199 34.65.23.217 27017:32569/TCP 22m
service/mongodb00-cluster-rs-2 LoadBalancer 10.16.11.209 34.65.188.191 27017:31526/TCP 17m

As you can see the external IP are different after recreation - which is not a problem per se … but why is the cluster still looking for the old IPs

{“t”:{“$date”:“2022-12-21T16:45:54.788+00:00”},“s”:“I”, “c”:“CONNPOOL”, “id”:22576, “ctx”:“ReplicaSetMonitor-TaskExecutor”,“msg”:“Connecting”,“attr”:{“hostAndPort”:“34.65.88.39:27017”}}

{“t”:{“$date”:“2022-12-21T16:45:54.788+00:00”},“s”:“I”, “c”:“CONNPOOL”, “id”:22576, “ctx”:“ReplicaSetMonitor-TaskExecutor”,“msg”:“Connecting”,“attr”:{“hostAndPort”:“34.65.86.207:27017”}}

=> How can this be fixed ?

1 Like

It looks like these external IPs are persisted somewhere in the “Pecona env”. The whole issue could be solved, if that information could be resetted/deleted.

1 Like

Hey @jamoser ,

the problem is that these old IPs are still present on PVCs and new IPs are not used.

I’m not sure what is the solution to this problem yet and how we can keep the IPs intact. We will analyze what can be done.

At the same time we had the same issue when pausing the cluster: [K8SPSMDB-423] Unpause of a cluster doen't work with enabled replsets expose set to LB - Percona JIRA

So the recommended workaround for you is to pause the cluster, not delete. Will it work for you?

1 Like

Hi @Sergey_Pronin

The solution would be to remove the old IPs - it would be sufficient if this could be done manually.

The issue is, if there is whatever reason, that the cluster needs to be recreated, that basically “kills” the whole thing.

Re Pause - yes already noticed that but as I said above, the cluster must be able to be recovered after a “delete”.

Thanks & Regards
John

1 Like

Tried to fix it with rs.reconfig()

> rs.reconfig(cfg);

{

	"topologyVersion" : {

		"processId" : ObjectId("63ac6f8eca8b6bf258024f65"),

		"counter" : NumberLong(1)

	},

	"ok" : 0,

	"errmsg" : "New config is rejected :: caused by :: replSetReconfig should only be run on a writable PRIMARY. Current state REMOVED;",

	"code" : 10107,

	"codeName" : "NotWritablePrimary"

}

But seems another guy using Percona distro had the same problem

I mean we are going in circles … ?!

1 Like

Pls close this ticket - it’s really not going to work (soon), unless MongoDB Team takes some action.

1 Like