jamoser
December 21, 2022, 4:48pm
1
Hello
It looks like there is a severe bug with a cluster in a non-sharded setup:
expose:
  enabled: true
  exposeType: LoadBalancer
NAME                             TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)           AGE
service/mongodb00-cluster-rs     ClusterIP      None           <none>         27017/TCP         14m
service/mongodb00-cluster-rs-0   LoadBalancer   10.16.9.235    34.65.55.114   27017:31821/TCP   10m
service/mongodb00-cluster-rs-1   LoadBalancer   10.16.12.110   34.65.86.207   27017:30988/TCP   10m
service/mongodb00-cluster-rs-2   LoadBalancer   10.16.4.11     34.65.88.39    27017:31658/TCP   10m
After deletion and recreation:
NAME                             TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)           AGE
service/mongodb00-cluster-rs     ClusterIP      None           <none>          27017/TCP         24m
service/mongodb00-cluster-rs-0   LoadBalancer   10.16.1.199    34.65.55.114    27017:31842/TCP   24m
service/mongodb00-cluster-rs-1   LoadBalancer   10.16.3.199    34.65.23.217    27017:32569/TCP   22m
service/mongodb00-cluster-rs-2   LoadBalancer   10.16.11.209   34.65.188.191   27017:31526/TCP   17m
As you can see, the external IPs of rs-1 and rs-2 are different after recreation, which is not a problem per se. But why is the cluster still looking for the old IPs?
{"t":{"$date":"2022-12-21T16:45:54.788+00:00"},"s":"I", "c":"CONNPOOL", "id":22576, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"34.65.88.39:27017"}}
{"t":{"$date":"2022-12-21T16:45:54.788+00:00"},"s":"I", "c":"CONNPOOL", "id":22576, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"34.65.86.207:27017"}}
=> How can this be fixed?
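To make the mismatch explicit, here is a minimal sketch that compares the hosts mongod is dialing with the external IPs the Services currently expose. The function name `findStaleHosts` and its inputs are hypothetical and not part of the operator or of MongoDB; the sketch only assumes structured JSON log lines like the two above.

```javascript
// Hypothetical helper: returns the hosts found in "hostAndPort" log attributes
// that are not among the external IPs currently assigned to the Services.
function findStaleHosts(logLines, currentIps) {
  const current = new Set(currentIps);
  const stale = new Set();
  for (const line of logLines) {
    let entry;
    try {
      entry = JSON.parse(line);
    } catch (err) {
      continue; // skip anything that is not a structured JSON log line
    }
    const hostAndPort = entry && entry.attr && entry.attr.hostAndPort;
    if (typeof hostAndPort !== "string") continue;
    // Drop the port and any stray whitespace around the host.
    const host = hostAndPort.split(":")[0].trim();
    if (!current.has(host)) stale.add(host);
  }
  return [...stale];
}
```

Fed with the two log lines above and the three external IPs from the second listing, it should report 34.65.88.39 and 34.65.86.207, the pre-recreation addresses of rs-2 and rs-1.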
jamoser
December 22, 2022, 8:04am
2
It looks like these external IPs are persisted somewhere in the “Percona env”. The whole issue could be solved if that information could be reset or deleted.
Hey @jamoser,
The problem is that the old IPs are still present on the PVCs, so the new IPs are not used.
I’m not sure yet what the solution to this problem is or how we can keep the IPs intact. We will analyze what can be done.
At the same time we had the same issue when pausing the cluster: [K8SPSMDB-423] Unpause of a cluster doen't work with enabled replsets expose set to LB - Percona JIRA
So the recommended workaround for you is to pause the cluster rather than delete it. Will that work for you?
jamoser
December 27, 2022, 8:47am
4
Hi @Sergey_Pronin
The solution would be to remove the old IPs. It would be sufficient if this could be done manually.
The issue is that if the cluster needs to be recreated for whatever reason, that basically “kills” the whole thing.
Re pause: yes, I already noticed that, but as I said above, the cluster must be recoverable after a “delete”.
Thanks & Regards
John
jamoser
December 28, 2022, 5:11pm
5
I tried to fix it with rs.reconfig():
> rs.reconfig(cfg);
{
  "topologyVersion" : {
    "processId" : ObjectId("63ac6f8eca8b6bf258024f65"),
    "counter" : NumberLong(1)
  },
  "ok" : 0,
  "errmsg" : "New config is rejected :: caused by :: replSetReconfig should only be run on a writable PRIMARY. Current state REMOVED;",
  "code" : 10107,
  "codeName" : "NotWritablePrimary"
}
But it seems another user of the Percona distribution had the same problem:
Hi there,
When I am trying to add the 2nd node to the replica set, the error below is reported… and also, how do I reset the replica set?
When adding the secondary node
> rs.add("psmdb2")
{
  "topologyVersion" : {
    "processId" : ObjectId("61d5abf2404700b3e321b7b7"),
    "counter" : NumberLong(1)
  },
  "ok" : 0,
  "errmsg" : "New config is rejected :: caused by :: replSetReconfig should only be run on a writable PRIMARY. Current state REMOVED;",
…
I mean we are going in circles … ?!
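For reference, the NotWritablePrimary rejection above is what a plain rs.reconfig() returns when no member is PRIMARY. MongoDB also accepts a forced reconfig, rs.reconfig(cfg, { force: true }), on a non-primary member. Whether that recovers this particular cluster is an assumption that has not been verified here: the operator may rewrite the config afterwards, and a forced reconfig can cause committed writes to be rolled back. The helper `remapMemberHosts` below is hypothetical; the IP mapping is taken from the two service listings earlier in the thread.

```javascript
// Hypothetical helper: returns a copy of a replica set config in which each
// member host found in hostMap (old IP -> new IP) is replaced; ports are kept.
function remapMemberHosts(cfg, hostMap) {
  const members = cfg.members.map(function (member) {
    const idx = member.host.lastIndexOf(":");
    const host = idx === -1 ? member.host : member.host.slice(0, idx);
    const port = idx === -1 ? "" : member.host.slice(idx);
    const known = Object.prototype.hasOwnProperty.call(hostMap, host);
    const replacement = known ? hostMap[host] : host;
    return Object.assign({}, member, { host: replacement + port });
  });
  return Object.assign({}, cfg, { members: members });
}

// Old -> new external IPs of rs-1 and rs-2; rs-0 kept 34.65.55.114.
const hostMap = {
  "34.65.86.207": "34.65.23.217",
  "34.65.88.39": "34.65.188.191",
};

// Assumed usage in the mongo shell on one member (untested against this cluster):
//   var cfg = remapMemberHosts(rs.conf(), hostMap);
//   rs.reconfig(cfg, { force: true });
```

The helper returns a new object instead of mutating its input, so the original output of rs.conf() stays available for comparison before anything is applied.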
jamoser
December 29, 2022, 1:53pm
6
Please close this ticket. It’s really not going to work (soon) unless the MongoDB team takes some action.