Description:
We are trying cross-cluster replication in our environment with two different clusters. Because of internal restrictions we used the Istio mesh instead of a load balancer to expose the replica set members.
Since the replication is cross-cluster, we enabled TLS and provisioned the CA certificates manually. We configured a ServiceEntry and the transport-layer settings so the replica set members can reach each other; a rough sketch of what we mean is below.
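For context, the kind of ServiceEntry we configured looks roughly like this. The name, namespace, and hostname below are placeholders rather than our real values, and the sketch assumes the sidecar passes the TLS connection through so that mongod itself terminates TLS with the manually provisioned CA:

apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: mongodb-remote-member          # placeholder name
  namespace: mongodb                   # placeholder namespace
spec:
  hosts:
    - mongodb-0.example.internal       # placeholder FQDN of the remote replica set member
  location: MESH_EXTERNAL              # the remote member lives outside this cluster
  resolution: DNS
  ports:
    - number: 27015                    # port from the replica set configuration
      name: tls-mongodb
      protocol: TLS                    # passthrough; mongod terminates TLS with the manual CA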
After applying all the changes step by step, the replica set pod in the secondary cluster keeps crashing (only one replica set member is configured in the secondary cluster, while the primary cluster has three).
Steps to Reproduce:
The issue can be reproduced by deploying the operator again.
Version:
mongo: 1.18.0
Logs: kubectl logs mongopod -n --tail=50
{"t":{"$date":"2025-03-18T06:28:26.514+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":7400}}
{"t":{"$date":"2025-03-18T06:28:26.815+00:00"},"s":"I", "c":"NETWORK", "id":51800, "ctx":"conn92","msg":"client metadata","attr":{"remote":"127.0.0.6:33885","client":"conn92","negotiatedCompressors":[],"doc":{"driver":{"name":"mongo-go-driver","version":"1.17.1"},"os":{"type":"linux","architecture":"amd64"},"platform":"go1.22.8","env":{"container":{"orchestrator":"kubernetes"}}}}}
{"t":{"$date":"2025-03-18T06:28:26.826+00:00"},"s":"I", "c":"NETWORK", "id":51800, "ctx":"conn93","msg":"client metadata","attr":{"remote":"127.0.0.6:49163","client":"conn93","negotiatedCompressors":[],"doc":{"driver":{"name":"mongo-go-driver","version":"1.17.1"},"os":{"type":"linux","architecture":"amd64"},"platform":"go1.22.8","env":{"container":{"orchestrator":"kubernetes"}}}}}
{"t":{"$date":"2025-03-18T06:28:27.003+00:00"},"s":"W", "c":"QUERY", "id":23799, "ctx":"ftdc","msg":"Aggregate command executor error","attr":{"error":{"code":26,"codeName":"NamespaceNotFound","errmsg":"Unable to retrieve storageStats in $collStats stage :: caused by :: Collection [local.oplog.rs] not found."},"stats":{},"cmd":{"aggregate":"oplog.rs","cursor":{},"pipeline":[{"$collStats":{"storageStats":{"waitForLock":false,"numericOnly":true}}}],"$db":"local"}}}
Expected Result:
The primary and secondary clusters should be in sync. In the primary's replica set status, the replica pod from the secondary cluster should appear as SECONDARY and healthy.
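For reference, the entry for that member in rs.status() on the primary should look roughly like this (values are illustrative, not taken from our cluster):

  {
    _id: 10,
    name: 'fqdn:27015',
    health: 1,
    state: 2,
    stateStr: 'SECONDARY',
    ...
  }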
Actual Result:
rs.status() on the primary (output truncated):
    {
      _id: 10,
      name: 'fqdn:27015',
      health: 0,
      state: 8,
      stateStr: '(not reachable/healthy)',
      uptime: 0,
      optime: { ts: Timestamp({ t: 0, i: 0 }), t: Long('-1') },
      optimeDurable: { ts: Timestamp({ t: 0, i: 0 }), t: Long('-1') },
      optimeDate: ISODate('1970-01-01T00:00:00.000Z'),
      optimeDurableDate: ISODate('1970-01-01T00:00:00.000Z'),
      lastAppliedWallTime: ISODate('1970-01-01T00:00:00.000Z'),
      lastDurableWallTime: ISODate('1970-01-01T00:00:00.000Z'),
      lastHeartbeat: ISODate('2025-03-18T06:32:07.834Z'),
      lastHeartbeatRecv: ISODate('1970-01-01T00:00:00.000Z'),
      pingMs: Long('0'),
      lastHeartbeatMessage: "Couldn't get a connection within the time limit",
      syncSourceHost: '',
      syncSourceId: -1,
      infoMessage: '',
      configVersion: -1,
      configTerm: -1
    }
  ],
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1742279523, i: 2 }),
    signature: {
      hash: Binary.createFromBase64('tWV/cYSENh+gC788OopzcvEkcQc=', 0),
      keyId: Long('7480054805797273607')