No replicaset available on and off

Hi dears,

I’ve a problem that I’m facing for about 2 months. The problem is that my customized mongodb server SOMETIMES starts without replicaset while being hosted in Openshift( both 3.11 and 4). As I checked logs it looks like server is looking for replicaset so it attempts to create new one but it fails. There are most important logs

{"t":{"$date":"2024-01-30T08:52:25.330+00:00"},"s":"I",  "c":"REPL",     "id":6005300, "ctx":"initandlisten","msg":"Starting up replica set aware services"}
{"t":{"$date":"2024-01-30T08:52:25.331+00:00"},"s":"I",  "c":"REPL",     "id":4280500, "ctx":"initandlisten","msg":"Attempting to create internal replication collections"}
{"t":{"$date":"2024-01-30T08:52:25.331+00:00"},"s":"I",  "c":"-",        "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":200}}
{"t":{"$date":"2024-01-30T08:52:25.334+00:00"},"s":"I",  "c":"REPL",     "id":4280501, "ctx":"initandlisten","msg":"Attempting to load local voted for document"}
{"t":{"$date":"2024-01-30T08:52:25.335+00:00"},"s":"I",  "c":"REPL",     "id":4280502, "ctx":"initandlisten","msg":"Searching for local Rollback ID document"}
{"t":{"$date":"2024-01-30T08:52:25.336+00:00"},"s":"I",  "c":"REPL",     "id":21529,   "ctx":"initandlisten","msg":"Initializing rollback ID","attr":{"rbid":14}}
{"t":{"$date":"2024-01-30T08:52:25.336+00:00"},"s":"I",  "c":"REPL",     "id":4280504, "ctx":"initandlisten","msg":"Cleaning up any partially applied oplog batches & reading last op from oplog"}
{"t":{"$date":"2024-01-30T08:52:25.337+00:00"},"s":"I",  "c":"CONTROL",  "id":6608200, "ctx":"initandlisten","msg":"Initializing cluster server parameters from disk"}
{"t":{"$date":"2024-01-30T08:52:25.338+00:00"},"s":"I",  "c":"REPL",     "id":21544,   "ctx":"initandlisten","msg":"Recovering from stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1706604728,"i":2}},"topOfOplog":{"ts":{"$timestamp":{"t":1706604728,"i":2}},"t":41},"appliedThrough":{"ts":{"$timestamp":{"t":0,"i":0}},"t":-1}}}
{"t":{"$date":"2024-01-30T08:52:25.338+00:00"},"s":"I",  "c":"REPL",     "id":21545,   "ctx":"initandlisten","msg":"Starting recovery oplog application at the stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1706604728,"i":2}}}}
{"t":{"$date":"2024-01-30T08:52:25.338+00:00"},"s":"I",  "c":"REPL",     "id":5466604, "ctx":"initandlisten","msg":"Start point for recovery oplog application exists in oplog. No adjustment necessary","attr":{"startPoint":{"$timestamp":{"t":1706604728,"i":2}}}}
{"t":{"$date":"2024-01-30T08:52:25.338+00:00"},"s":"I",  "c":"REPL",     "id":21549,   "ctx":"initandlisten","msg":"No oplog entries to apply for recovery. Start point is at the top of the oplog"}
{"t":{"$date":"2024-01-30T08:52:25.338+00:00"},"s":"I",  "c":"REPL",     "id":4280505, "ctx":"initandlisten","msg":"Creating any necessary TenantMigrationAccessBlockers for unfinished migrations"}
{"t":{"$date":"2024-01-30T08:52:25.339+00:00"},"s":"I",  "c":"TENANT_M", "id":6531504, "ctx":"initandlisten","msg":"Clearing serverless operation lock registry on shutdown","attr":{"ns":null}}
{"t":{"$date":"2024-01-30T08:52:25.340+00:00"},"s":"I",  "c":"REPL",     "id":4280506, "ctx":"initandlisten","msg":"Reconstructing prepared transactions"}
{"t":{"$date":"2024-01-30T08:52:25.341+00:00"},"s":"I",  "c":"REPL",     "id":4280507, "ctx":"initandlisten","msg":"Loaded replica set config, scheduled callback to set local config"}
{"t":{"$date":"2024-01-30T08:52:25.341+00:00"},"s":"I",  "c":"REPL",     "id":4280508, "ctx":"ReplCoord-0","msg":"Attempting to set local replica set config; validating config for startup"}
{"t":{"$date":"2024-01-30T08:52:25.341+00:00"},"s":"I",  "c":"STORAGE",  "id":22262,   "ctx":"initandlisten","msg":"Timestamp monitor starting"}
{"t":{"$date":"2024-01-30T08:52:25.342+00:00"},"s":"I",  "c":"QUERY",    "id":7080100, "ctx":"ChangeStreamExpiredPreImagesRemover","msg":"Starting Change Stream Expired Pre-images Remover thread"}
{"t":{"$date":"2024-01-30T08:52:25.343+00:00"},"s":"I",  "c":"CONTROL",  "id":20710,   "ctx":"LogicalSessionCacheRefresh","msg":"Failed to refresh session cache, will try again at the next refresh interval","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2024-01-30T08:52:25.343+00:00"},"s":"I",  "c":"CONTROL",  "id":20711,   "ctx":"LogicalSessionCacheReap","msg":"Failed to reap transaction table","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2024-01-30T08:52:25.343+00:00"},"s":"I",  "c":"SHARDING", "id":7012500, "ctx":"QueryAnalysisConfigurationsRefresher","msg":"Failed to refresh query analysis configurations, will try again at the next interval","attr":{"error":"PrimarySteppedDown: No primary exists currently"}}
"t":{"$date":"2024-01-30T08:52:25.532+00:00"},"s":"I",  "c":"-",        "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":400}}

It’s hilarious because if I kill pod a few times it’s going to run properly eventually, I thought it may be caused by difference in signals sent to pod by openshift during killing process but with my observations it does not matter if I send SIGKILL or SIGTERM but as for me it’s dead end.

Please see my full configuration:


EXPOSE 27017

USER root

RUN yum update && yum upgrade -y
RUN yum install -y cyrus-sasl cyrus-sasl-plain
RUN yum install openldap-clients -y 

RUN rm -f etc/mongod.conf

COPY mongodb_keyfile /etc/mongodb/keys/
COPY file.js . 
COPY admessage.js . 
COPY saslauthd.conf /etc 
COPY mongod.conf /etc 
COPY mongodb.conf /etc/sasl2
COPY custommongod /usr/local/bin

RUN chmod +rx  / && chown -R mongodb:root /
RUN chmod +x / && mkdir -p /etc/mongodb/keys/

RUN chmod 0400 /etc/mongodb/keys/mongodb_keyfile  && chown -R mongodb:root /etc/mongodb/keys/mongodb_keyfile

RUN chmod +x \
    && chown -R mongodb:root data 

RUN chmod +x file.js
RUN chmod +x admessage.js

RUN sed -i -e s/^MECH=pam/MECH=ldap/g /etc/sysconfig/saslauthd 

RUN chmod 644 /etc/saslauthd.conf && chown mongodb:root /etc/saslauthd.conf

RUN chmod 755 run && chown mongodb:root run \
&& chmod 755 etc && chown mongodb:root etc

RUN touch /var/log/audit.json \
    && chmod 777 /var/log/audit.json && chown mongodb:root /var/log/audit.json

RUN chmod +x
RUN chmod +x

RUN chmod 755 /usr/local/bin/custommongod
RUN chown mongodb:root /usr/local/bin/custommongod

RUN touch /var/log/forklog \
    && chmod 777 /var/log/forklog && chown mongodb:root /var/log/forklog

USER mongodb

CMD ["custommongod"]



echo "🔥"

if [ "$MONGO_INIT_ENABLED" = "false" ] ||  [ -f "$FILE" ]; then
  echo "🔥🔥🔥init enabled, initializing"
  mongod --wiredTigerCacheSizeGB 0.25 --replSet rs0 --bind_ip  "$MONGO_BIND_IP" &
  #--config /etc/mongod.conf &
  bash / 
  mongosh admin file.js
  mongod --shutdown;
  echo "Second run🔥🔥🔥🔥🔥🔥🔥🔥"
  mongod --wiredTigerCacheSizeGB 0.25 --replSet rs0 --bind_ip  "$MONGO_BIND_IP" --auth --keyFile /etc/mongodb/keys/mongodb_keyfile &
  sleep 10
  mongosh percona admessage.js -u admin -p admin --authenticationDatabase admin
  mongod --shutdown;
  echo "Third run🔥🔥"
  mongod --wiredTigerCacheSizeGB 0.25 --replSet rs0 --bind_ip  "$MONGO_BIND_IP" --setParameter authenticationMechanisms=PLAIN,SCRAM-SHA-1 --auth --keyFile /etc/mongodb/keys/mongodb_keyfile &
  sleep 10
  mongod --shutdown;
  export MONGO_INIT_ENABLED=false
  touch /data/db/inittialized
  echo "🔥🔥🔥🔥Setup completed"

echo "🔥🔥Running MongoDB"
saslauthd -a ldap 
# mongod --wiredTigerCacheSizeGB 0.25 --replSet rs0 --bind_ip  "$MONGO_BIND_IP" --setParameter authenticationMechanisms=PLAIN,SCRAM-SHA-1 --auth --keyFile /etc/mongodb/keys/mongodb_keyfile --dbpath data/db --auditDestination file --auditFormat JSON --auditPath /var/log/audit.json

exec "$@"


echo "EXEC: mu script. 🚀🚀🚀🚀🚀 "
check_connection() {
    mongosh --eval "print('Connected')" 2>/dev/null
    return $?

while true; do
    echo "Awaiting 5s to ensure mongod is up"
    sleep 5
    if check_connection; then
        echo "Connected🔥🔥🔥🔥"
        mongosh --eval "rs.initiate(
 _id: 'rs0',
 members: [
 { _id: 0, host: 'mongodb-server:27017'},
        mongosh --eval "db.getSiblingDB('admin').createRole({ 'role': 'pbmAnyAction',
      'privileges': [
         { 'resource': { 'anyResource': true },
           'actions': [ 'anyAction' ]
      'roles': []
        mongosh --eval "db.getSiblingDB('admin').createUser({user: 'pbmuser',
       'pwd': 'secretpwd',
       'roles' : [
          { 'db' : 'admin', 'role' : 'readWrite', 'collection': '' },
          { 'db' : 'admin', 'role' : 'backup' },
          { 'db' : 'admin', 'role' : 'clusterMonitor' },
          { 'db' : 'admin', 'role' : 'restore' },
          { 'db' : 'admin', 'role' : 'pbmAnyAction' }
        echo "Awaiting 5s"


db.createUser({'user': 'admin', 'pwd': 'admin', 'roles': ['root']})


db.test.insert({"message": "Active Directory user success!"})

I read multiple threads in this matter and the only possible solution I found is disabling the fork in processManagement but it’s disabled by default.

The custom custommongod command is

exec mongod --wiredTigerCacheSizeGB 0.25 --replSet rs0 --bind_ip  "$MONGO_BIND_IP" --setParameter authenticationMechanisms=PLAIN,SCRAM-SHA-1 --auth --keyFile /etc/mongodb/keys/mongodb_keyfile --dbpath data/db --auditDestination file --auditFormat JSON --auditPath /var/log/audit.json

I think it’s important to mention that whenever I try to execute replicaset while it’s corrupted and replicaset shall not be available, I’m getting such message

test> rs.initiate()
MongoServerError: already initialized