Hi,
I’m trying to add a liveness probe to my MongoDB deployment in OpenShift. Without the probe, Mongo deploys perfectly every single time. I had to create a single-node replica set because pbm-agent requires one, so I need to run Mongo with a replica set.
I create the replica set like this:
"rs.initiate(
  {
    _id: 'rs0',
    members: [
      { _id: 0, host: '(openshift-service-name):27017' },
    ]
  });"
Then I add the liveness probe in the Helm chart:
livenessProbe:
  exec:
    command: [ "mongosh", "--host", "localhost", "--port", "27017", "-u", "admin", "-p", "'admin'", "--authenticationDatabase", "'admin'", "--eval", "'db.getSiblingDB(\"admin\").runCommand({ replSetGetStatus: 1 }).ok ? 0 : 2'" ]
  initialDelaySeconds: 70
  periodSeconds: 60
  failureThreshold: 3
  timeoutSeconds: 20
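One thing I am not sure about: an exec probe runs the command directly, without a shell, so the inner single quotes in my command (for example "'admin'") are passed to mongosh literally, and the `? 0 : 2` expression only produces a value rather than setting an exit code. Below is a minimal sketch of what I think the probe was meant to look like. It assumes that `quit(<code>)` sets the mongosh exit status and keeps the placeholder credentials from above; I have not verified that it changes the behaviour described below.

```yaml
# Sketch only: same probe with the literal inner quotes removed.
livenessProbe:
  exec:
    command:
      - "mongosh"
      - "--quiet"
      - "--host"
      - "localhost"
      - "--port"
      - "27017"
      - "-u"
      - "admin"
      - "-p"
      - "admin"                      # placeholder password, no extra quotes
      - "--authenticationDatabase"
      - "admin"
      - "--eval"
      # Assumption: quit(2) makes mongosh exit with status 2 when the
      # replSetGetStatus command does not report ok: 1.
      - "if (db.getSiblingDB('admin').runCommand({ replSetGetStatus: 1 }).ok !== 1) quit(2)"
  initialDelaySeconds: 70
  periodSeconds: 60
  failureThreshold: 3
  timeoutSeconds: 20
```

The timing values are unchanged from my original configuration; only the quoting and the exit-code handling differ.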
Interestingly, the liveness check with the configuration above also “corrupts” the database, although the first run always works normally. If I kill the pod myself, however, the database is gone. What confuses me is that when the pod is killed by the StatefulSet because the liveness probe reached its failure threshold, it starts normally almost every time.
On the other hand, I cannot use a readiness probe at all: the container stays permanently in the not-ready state, no matter what command I provide. Even a plain echo does not work. I’ve tried removing the initial delay, setting it to 10 seconds, widening the timeout, etc.
I believe these are the most important logs:
{"t":{"$date":"2024-01-18T06:15:34.306+00:00"},"s":"I", "c":"CONTROL", "id":20711, "ctx":"LogicalSessionCacheReap","msg":"Failed to reap transaction table","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2024-01-18T06:15:34.306+00:00"},"s":"I", "c":"SHARDING", "id":7012500, "ctx":"QueryAnalysisConfigurationsRefresher","msg":"Failed to refresh query analysis configurations, will try again at the next interval","attr":{"error":"PrimarySteppedDown: No primary exists currently"}}
{"t":{"$date":"2024-01-18T06:15:34.307+00:00"},"s":"I", "c":"NETWORK", "id":5693100, "ctx":"ReplCoord-0","msg":"Asio socket.set_option failed with std::system_error","attr":{"note":"connect (sync) TCP fast open","option":{"level":6,"name":30,"data":"01 00 00 00"},"error":{"what":"set_option: Protocol not available","message":"Protocol not available","category":"asio.system","value":92}}}
{"t":{"$date":"2024-01-18T06:15:34.466+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":400}}
{"t":{"$date":"2024-01-18T06:15:36.800+00:00"},"s":"I", "c":"REPL", "id":21394, "ctx":"ReplCoord-0","msg":"This node is not a member of the config"}
{"t":{"$date":"2024-01-18T06:15:36.800+00:00"},"s":"I", "c":"REPL", "id":21358, "ctx":"ReplCoord-0","msg":"Replica set state transition","attr":{"newState":"REMOVED","oldState":"STARTUP"}}
{"t":{"$date":"2024-01-18T06:16:14.306+00:00"},"s":"I", "c":"SHARDING", "id":7012500, "ctx":"QueryAnalysisConfigurationsRefresher","msg":"Failed to refresh query analysis configurations, will try again at the next interval","attr":{"error":"PrimarySteppedDown: No primary exists currently"}}
I consider this a bug, because whether Mongo starts properly seems to be completely random.