PSMDB 7.0.4 SOMETIMES starts without a replica set when a Liveness Probe is configured in OpenShift 3 and 4

Iliterallyneedhelp · January 19, 2024, 7:48am

Hi,

I’m trying to add a liveness probe to my MongoDB deployment in OpenShift. Without the probe parameters, Mongo deploys perfectly every single time. I had to create a single-node replica set because pbm-agent requires one, so I need to run Mongo as a replica set.
I create the replica set like this:

"rs.initiate(
 {
 _id: 'rs0',
 members: [
 { _id: 0, host: '(openshift-service-name):27017'},
 ]
 });"
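
Since the pod can restart at any time, an initiation step like the one above is easier to reason about when it is idempotent. Below is a minimal sketch, assuming it is run with mongosh against the local node. `ensureReplicaSet` and `rsConfig` are hypothetical names, and the host placeholder is kept from the post. It relies on an uninitiated node answering `replSetGetStatus` with the `NotYetInitialized` error (code 94), which mongosh raises as an exception.

```javascript
// Hypothetical helper (not part of PSMDB or mongosh): initiate the replica
// set only when the node reports that replication is not configured yet.
function ensureReplicaSet(adminDb, rsConfig) {
  let status;
  try {
    status = adminDb.runCommand({ replSetGetStatus: 1 });
  } catch (err) {
    status = err; // mongosh throws on failure; the error carries code/codeName
  }
  if (status.ok === 1) return "already-initiated";
  if (status.code !== 94 && status.codeName !== "NotYetInitialized") {
    // Any other failure is surfaced instead of being answered with a re-initiate.
    throw new Error("unexpected replSetGetStatus failure: " + status.codeName);
  }
  adminDb.runCommand({ replSetInitiate: rsConfig });
  return "initiated";
}

// Runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  print(ensureReplicaSet(db.getSiblingDB("admin"), {
    _id: "rs0",
    members: [{ _id: 0, host: "(openshift-service-name):27017" }],
  }));
}
```

The helper deliberately does not re-initiate on other errors, so a node that already holds a config but cannot find itself in it is reported rather than overwritten.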

Then I add the liveness probe in the Helm chart:

           livenessProbe:
             exec:
               command: [ "mongosh", "--host",  "localhost", "--port", "27017", "-u", "admin", "-p", "'admin'", "--authenticationDatabase", "'admin'", "--eval", "'db.getSiblingDB(\"admin\").runCommand({ replSetGetStatus: 1 }).ok ? 0 : 2'" ]
             initialDelaySeconds: 70         
             periodSeconds: 60
             failureThreshold: 3
             timeoutSeconds: 20
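
Two details of the probe above are worth checking. A Kubernetes `exec` probe runs the command array directly, without a shell, so the inner single quotes in values such as `"'admin'"` reach mongosh as literal characters. Also, an `--eval` expression that only evaluates to `0` or `2` is not expected to set the process exit status. The sketch below is one possible alternative, not a confirmed fix: a hypothetical script file (here called `probe.js`) that checks `ping` and fails by throwing, on the assumption that mongosh exits non-zero on an uncaught error.

```javascript
// Hypothetical liveness script; the file name and invocation are assumptions,
// e.g.: mongosh --quiet <auth flags> --file probe.js
function isAlive(adminDb) {
  try {
    // `ping` only checks that mongod answers; it does not depend on the
    // replica set being initiated.
    return adminDb.runCommand({ ping: 1 }).ok === 1;
  } catch (err) {
    return false;
  }
}

// Runs only inside mongosh, where the global `db` exists.
if (typeof db !== "undefined") {
  if (!isAlive(db.getSiblingDB("admin"))) {
    throw new Error("liveness check failed"); // uncaught error, non-zero exit
  }
}
```

Using `ping` rather than `replSetGetStatus` for liveness is a design choice: a node that is up but not yet initiated would then not be restarted by the probe. Replica set health would fit better in a readiness check.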

Interestingly, the liveness check with the configuration above also “corrupts” the database, although the first run always works normally. If I kill the pod, however, the replica set is gone. It is also confusing that when the pod is killed by the StatefulSet because the liveness probe reaches its failure threshold, it starts normally almost every time.

On the other hand, it is impossible to run a readiness probe: the container stays permanently in the not-ready state no matter what command I provide. Even a simple echo does not work. I’ve tried removing the initial delay, setting it to 10 seconds, using a wider timeout, etc.

I believe these are the most important logs:

{"t":{"$date":"2024-01-18T06:15:34.306+00:00"},"s":"I",  "c":"CONTROL",  "id":20711,   "ctx":"LogicalSessionCacheReap","msg":"Failed to reap transaction table","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2024-01-18T06:15:34.306+00:00"},"s":"I",  "c":"SHARDING", "id":7012500, "ctx":"QueryAnalysisConfigurationsRefresher","msg":"Failed to refresh query analysis configurations, will try again at the next interval","attr":{"error":"PrimarySteppedDown: No primary exists currently"}}
{"t":{"$date":"2024-01-18T06:15:34.307+00:00"},"s":"I",  "c":"NETWORK",  "id":5693100, "ctx":"ReplCoord-0","msg":"Asio socket.set_option failed with std::system_error","attr":{"note":"connect (sync) TCP fast open","option":{"level":6,"name":30,"data":"01 00 00 00"},"error":{"what":"set_option: Protocol not available","message":"Protocol not available","category":"asio.system","value":92}}}
{"t":{"$date":"2024-01-18T06:15:34.466+00:00"},"s":"I",  "c":"-",        "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":400}}
{"t":{"$date":"2024-01-18T06:15:36.800+00:00"},"s":"I",  "c":"REPL",     "id":21394,   "ctx":"ReplCoord-0","msg":"This node is not a member of the config"}
{"t":{"$date":"2024-01-18T06:15:36.800+00:00"},"s":"I",  "c":"REPL",     "id":21358,   "ctx":"ReplCoord-0","msg":"Replica set state transition","attr":{"newState":"REMOVED","oldState":"STARTUP"}}
{"t":{"$date":"2024-01-18T06:16:14.306+00:00"},"s":"I",  "c":"SHARDING", "id":7012500, "ctx":"QueryAnalysisConfigurationsRefresher","msg":"Failed to refresh query analysis configurations, will try again at the next interval","attr":{"error":"PrimarySteppedDown: No primary exists currently"}}
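
The two REPL entries above carry the symptom: the node reports that it is not a member of the config and moves from STARTUP to REMOVED. A small sketch for pulling such entries out of mongod’s structured JSON log follows. `replTransitions` is a hypothetical name, and the log ids 21358 and 21394 are simply taken from the lines above.

```javascript
// Hypothetical helper: extract replica set state changes from mongod JSON logs.
function replTransitions(logText) {
  const events = [];
  for (const line of logText.split("\n")) {
    if (!line.trim().startsWith("{")) continue; // skip non-JSON lines
    let entry;
    try {
      entry = JSON.parse(line);
    } catch (err) {
      continue; // skip truncated or malformed lines
    }
    if (entry.c !== "REPL") continue;
    if (entry.id === 21358) {
      // "Replica set state transition"
      events.push({ at: entry.t.$date, from: entry.attr.oldState, to: entry.attr.newState });
    } else if (entry.id === 21394) {
      // "This node is not a member of the config"
      events.push({ at: entry.t.$date, note: entry.msg });
    }
  }
  return events;
}
```

Under Node.js this could be fed with the contents of a saved log file, for example `replTransitions(require("fs").readFileSync("mongod.log", "utf8"))`, where the file name is an assumption. A frequent reason for REMOVED is that no `host` entry in the stored config resolves back to the node itself, so the Service name used in `rs.initiate` is worth checking. A Service normally routes only to Ready pods, so the name may not reach a pod that is not Ready yet. Whether that applies here is an assumption.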

I consider this a bug, because whether Mongo starts properly is essentially random.

Slava_Sarzhan · January 23, 2024, 7:37am

@Iliterallyneedhelp, the PSMDB operator does not support PSMDB 7 at all. We plan to add it in the next PSMDB operator release.

Iliterallyneedhelp · January 24, 2024, 6:58am

Hi,
Thanks for your response. So deploying PSMDB on OpenShift/K8s by hand is not supported at all? Do I need to use the PSMDB operator instead?

Anyway, that still does not solve my issue. Could you please advise how I should troubleshoot it?

Slava_Sarzhan · January 24, 2024, 8:29am

We support OpenShift, but you can’t use PSMDB v7 with our operator. You can use v6, v5 and v4.4, but not v7. We have official documentation on how to deploy the operator on OpenShift: Install on OpenShift - Percona Operator for MongoDB. Try to use a supported version of PSMDB and let us know the results.

Iliterallyneedhelp · January 24, 2024, 10:16am

I have some requirements that need to be fulfilled:

MongoDB 7
LDAP connected with AD
Incremental backups (as far as I checked, they are not available in the operator)

I achieved that with two separate containers (PSMDB and PBM agent), modified in Docker and tested locally. Everything works perfectly in Docker except incremental backup restore: it does not work because mongod is not reachable for the PBM agent, which needs to stop the daemon to perform the backup. If you have an idea how I can fix that, I’d appreciate it. Unfortunately, after deployment in OpenShift the server eventually ends up corrupted until the next reboot.

I do not think I can use the PSMDB operator instead of my modified containers in OpenShift at the moment.

Sumeet_Chaudhari · February 14, 2024, 9:35pm

Do we have a timeline for when the new PSMDB operator release with support for version 7 will be out?

Slava_Sarzhan · March 12, 2024, 6:58pm

Hi @Sumeet_Chaudhari, we will include PSMDB 7 support in the next operator release. I hope we will have it within a month.

Sumeet_Chaudhari · May 17, 2024, 2:38pm

Do we know when this will support version 7?

Tomislav_Plavcic · May 20, 2024, 7:27am

Hi @Sumeet_Chaudhari !
I think the release should happen this or next week, so pay attention to release notes and announcements.

Sergey_Pronin · May 29, 2024, 7:17am

@Sumeet_Chaudhari please try Operator 1.16, it now supports MongoDB version 7.

Release notes: Percona Operator for MongoDB 1.16.0 (2024-05-24) - Percona Operator for MongoDB

Sumeet_Chaudhari · May 29, 2024, 1:35pm

Nice, is there an upgrade guide for both the operator and MongoDB?

Sergey_Pronin · May 29, 2024, 1:37pm

Have a look here: Upgrade MongoDB and the Operator - Percona Operator for MongoDB
