Hi all,
I've been struggling with a problem for about two months: my customized MongoDB server SOMETIMES starts without its replica set when hosted in OpenShift (both 3.11 and 4). Judging by the logs, the server looks for the replica set and attempts to create a new one, but fails. Here are the most important log lines:
{"t":{"$date":"2024-01-30T08:52:25.330+00:00"},"s":"I", "c":"REPL", "id":6005300, "ctx":"initandlisten","msg":"Starting up replica set aware services"}
{"t":{"$date":"2024-01-30T08:52:25.331+00:00"},"s":"I", "c":"REPL", "id":4280500, "ctx":"initandlisten","msg":"Attempting to create internal replication collections"}
{"t":{"$date":"2024-01-30T08:52:25.331+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":200}}
{"t":{"$date":"2024-01-30T08:52:25.334+00:00"},"s":"I", "c":"REPL", "id":4280501, "ctx":"initandlisten","msg":"Attempting to load local voted for document"}
{"t":{"$date":"2024-01-30T08:52:25.335+00:00"},"s":"I", "c":"REPL", "id":4280502, "ctx":"initandlisten","msg":"Searching for local Rollback ID document"}
{"t":{"$date":"2024-01-30T08:52:25.336+00:00"},"s":"I", "c":"REPL", "id":21529, "ctx":"initandlisten","msg":"Initializing rollback ID","attr":{"rbid":14}}
{"t":{"$date":"2024-01-30T08:52:25.336+00:00"},"s":"I", "c":"REPL", "id":4280504, "ctx":"initandlisten","msg":"Cleaning up any partially applied oplog batches & reading last op from oplog"}
{"t":{"$date":"2024-01-30T08:52:25.337+00:00"},"s":"I", "c":"CONTROL", "id":6608200, "ctx":"initandlisten","msg":"Initializing cluster server parameters from disk"}
{"t":{"$date":"2024-01-30T08:52:25.338+00:00"},"s":"I", "c":"REPL", "id":21544, "ctx":"initandlisten","msg":"Recovering from stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1706604728,"i":2}},"topOfOplog":{"ts":{"$timestamp":{"t":1706604728,"i":2}},"t":41},"appliedThrough":{"ts":{"$timestamp":{"t":0,"i":0}},"t":-1}}}
{"t":{"$date":"2024-01-30T08:52:25.338+00:00"},"s":"I", "c":"REPL", "id":21545, "ctx":"initandlisten","msg":"Starting recovery oplog application at the stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1706604728,"i":2}}}}
{"t":{"$date":"2024-01-30T08:52:25.338+00:00"},"s":"I", "c":"REPL", "id":5466604, "ctx":"initandlisten","msg":"Start point for recovery oplog application exists in oplog. No adjustment necessary","attr":{"startPoint":{"$timestamp":{"t":1706604728,"i":2}}}}
{"t":{"$date":"2024-01-30T08:52:25.338+00:00"},"s":"I", "c":"REPL", "id":21549, "ctx":"initandlisten","msg":"No oplog entries to apply for recovery. Start point is at the top of the oplog"}
{"t":{"$date":"2024-01-30T08:52:25.338+00:00"},"s":"I", "c":"REPL", "id":4280505, "ctx":"initandlisten","msg":"Creating any necessary TenantMigrationAccessBlockers for unfinished migrations"}
{"t":{"$date":"2024-01-30T08:52:25.339+00:00"},"s":"I", "c":"TENANT_M", "id":6531504, "ctx":"initandlisten","msg":"Clearing serverless operation lock registry on shutdown","attr":{"ns":null}}
{"t":{"$date":"2024-01-30T08:52:25.340+00:00"},"s":"I", "c":"REPL", "id":4280506, "ctx":"initandlisten","msg":"Reconstructing prepared transactions"}
{"t":{"$date":"2024-01-30T08:52:25.341+00:00"},"s":"I", "c":"REPL", "id":4280507, "ctx":"initandlisten","msg":"Loaded replica set config, scheduled callback to set local config"}
{"t":{"$date":"2024-01-30T08:52:25.341+00:00"},"s":"I", "c":"REPL", "id":4280508, "ctx":"ReplCoord-0","msg":"Attempting to set local replica set config; validating config for startup"}
{"t":{"$date":"2024-01-30T08:52:25.341+00:00"},"s":"I", "c":"STORAGE", "id":22262, "ctx":"initandlisten","msg":"Timestamp monitor starting"}
{"t":{"$date":"2024-01-30T08:52:25.342+00:00"},"s":"I", "c":"QUERY", "id":7080100, "ctx":"ChangeStreamExpiredPreImagesRemover","msg":"Starting Change Stream Expired Pre-images Remover thread"}
{"t":{"$date":"2024-01-30T08:52:25.343+00:00"},"s":"I", "c":"CONTROL", "id":20710, "ctx":"LogicalSessionCacheRefresh","msg":"Failed to refresh session cache, will try again at the next refresh interval","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2024-01-30T08:52:25.343+00:00"},"s":"I", "c":"CONTROL", "id":20711, "ctx":"LogicalSessionCacheReap","msg":"Failed to reap transaction table","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2024-01-30T08:52:25.343+00:00"},"s":"I", "c":"SHARDING", "id":7012500, "ctx":"QueryAnalysisConfigurationsRefresher","msg":"Failed to refresh query analysis configurations, will try again at the next interval","attr":{"error":"PrimarySteppedDown: No primary exists currently"}}
"t":{"$date":"2024-01-30T08:52:25.532+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"ReadConcernMajorityNotAvailableYet: Read concern majority reads are currently not possible.","nextWakeupMillis":400}}
The odd part is that if I kill the pod a few times, it eventually starts properly. I thought it might be caused by the signal OpenShift sends to the pod during the kill, but from my observations it makes no difference whether I send SIGKILL or SIGTERM, so for me that's a dead end.
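For completeness: SIGTERM lets mongod shut down cleanly, while SIGKILL forces the recovery path visible in the logs above ("Recovering from stable timestamp"). One way to confirm which path the previous container took (the pod name below is just an example) is:
# Check the previous container's last shutdown messages (pod name assumed).
oc logs mongodb-server-0 --previous | grep -i 'shutting down' | tail -n 5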
Please see my full configuration:
Dockerfile:
FROM docker.io/percona/percona-server-mongodb:7.0.4
EXPOSE 27017
USER root
RUN yum update && yum upgrade -y
RUN yum install -y cyrus-sasl cyrus-sasl-plain
RUN yum install openldap-clients -y
RUN rm -f /etc/mongod.conf
ENV MONGO_BIND_IP="0.0.0.0"
ENV MONGO_INIT_ENABLED="true"
COPY entrypoint_rs.sh /
COPY mongodb_keyfile /etc/mongodb/keys/
COPY script.sh .
COPY file.js .
COPY admessage.js .
COPY saslauthd.conf /etc
COPY mongod.conf /etc
COPY mongodb.conf /etc/sasl2
COPY custommongod /usr/local/bin
RUN chmod +rx /entrypoint_rs.sh && chown mongodb:root /entrypoint_rs.sh
RUN chmod 0400 /etc/mongodb/keys/mongodb_keyfile && chown -R mongodb:root /etc/mongodb/keys/mongodb_keyfile
RUN chmod +x /script.sh \
&& chown -R mongodb:root /data
RUN chmod +x file.js
RUN chmod +x admessage.js
RUN sed -i -e s/^MECH=pam/MECH=ldap/g /etc/sysconfig/saslauthd
RUN chmod 644 /etc/saslauthd.conf && chown mongodb:root /etc/saslauthd.conf
RUN chmod 755 /run && chown mongodb:root /run \
&& chmod 755 /etc && chown mongodb:root /etc
RUN touch /var/log/audit.json \
&& chmod 777 /var/log/audit.json && chown mongodb:root /var/log/audit.json
RUN chmod +x addusers.sh
RUN chmod +x newjoiner.sh
RUN chmod 755 /usr/local/bin/custommongod
RUN chown mongodb:root /usr/local/bin/custommongod
RUN touch /var/log/forklog \
&& chmod 777 /var/log/forklog && chown mongodb:root /var/log/forklog
USER mongodb
ENTRYPOINT ["/entrypoint_rs.sh"]
CMD ["custommongod"]
Entrypoint:
#!/bin/bash
echo "🔥entrypoint_rs.sh"
FILE=/data/db/initialized
if [ "$MONGO_INIT_ENABLED" = "false" ] || [ -f "$FILE" ]; then
  echo "$MONGO_INIT_ENABLED"
else
  echo "🔥🔥🔥init enabled, initializing"
  mongod --wiredTigerCacheSizeGB 0.25 --replSet rs0 --bind_ip "$MONGO_BIND_IP" &
  #--config /etc/mongod.conf &
  bash /script.sh
  mongosh admin file.js
  mongod --shutdown
  echo "Second run🔥🔥🔥🔥🔥🔥🔥🔥"
  mongod --wiredTigerCacheSizeGB 0.25 --replSet rs0 --bind_ip "$MONGO_BIND_IP" --auth --keyFile /etc/mongodb/keys/mongodb_keyfile &
  sleep 10
  mongosh percona admessage.js -u admin -p admin --authenticationDatabase admin
  mongod --shutdown
  echo "Third run🔥🔥"
  mongod --wiredTigerCacheSizeGB 0.25 --replSet rs0 --bind_ip "$MONGO_BIND_IP" --setParameter authenticationMechanisms=PLAIN,SCRAM-SHA-1 --auth --keyFile /etc/mongodb/keys/mongodb_keyfile &
  sleep 10
  mongod --shutdown
  export MONGO_INIT_ENABLED=false
  touch /data/db/initialized
  echo "🔥🔥🔥🔥Setup completed"
fi
echo "🔥🔥Running MongoDB"
saslauthd -a ldap
# mongod --wiredTigerCacheSizeGB 0.25 --replSet rs0 --bind_ip "$MONGO_BIND_IP" --setParameter authenticationMechanisms=PLAIN,SCRAM-SHA-1 --auth --keyFile /etc/mongodb/keys/mongodb_keyfile --dbpath data/db --auditDestination file --auditFormat JSON --auditPath /var/log/audit.json
exec "$@"
script.sh
#!/bin/bash
echo "EXEC: mu script. 🚀🚀🚀🚀🚀 "
check_connection() {
  mongosh --eval "print('Connected')" 2>/dev/null
  return $?
}

while true; do
  echo "Awaiting 5s to ensure mongod is up"
  sleep 5
  if check_connection; then
    echo "Connected🔥🔥🔥🔥"
    mongosh --eval "rs.initiate(
      {
        _id: 'rs0',
        members: [
          { _id: 0, host: 'mongodb-server:27017' }
        ]
      });"
    mongosh --eval "db.getSiblingDB('admin').createRole({ 'role': 'pbmAnyAction',
      'privileges': [
        { 'resource': { 'anyResource': true },
          'actions': [ 'anyAction' ] }
      ],
      'roles': []
    });"
    mongosh --eval "db.getSiblingDB('admin').createUser({ 'user': 'pbmuser',
      'pwd': 'secretpwd',
      'roles': [
        { 'db': 'admin', 'role': 'readWrite', 'collection': '' },
        { 'db': 'admin', 'role': 'backup' },
        { 'db': 'admin', 'role': 'clusterMonitor' },
        { 'db': 'admin', 'role': 'restore' },
        { 'db': 'admin', 'role': 'pbmAnyAction' }
      ]
    });"
    break
  else
    echo "Awaiting 5s"
  fi
done
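One caveat in this script: rs.initiate() returns before the node actually wins its election, so the createRole/createUser calls can hit a node that is not yet PRIMARY. A guard that waits for the node to become writable primary first might look like this (a sketch; the retry count and sleep are arbitrary):
# Sketch: wait until the node reports itself writable primary before
# creating roles and users.
for i in $(seq 1 30); do
  if mongosh --quiet --eval 'db.hello().isWritablePrimary' | grep -q true; then
    break
  fi
  sleep 2
done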
file.js
db.createUser({'user': 'admin', 'pwd': 'admin', 'roles': ['root']})
admessage.js
db.test.insertOne({"message": "Active Directory user success!"})
I've read multiple threads on this issue, and the only possible solution I found was disabling fork in processManagement, but fork is already disabled by default.
The custom custommongod command is:
exec mongod --wiredTigerCacheSizeGB 0.25 --replSet rs0 --bind_ip "$MONGO_BIND_IP" --setParameter authenticationMechanisms=PLAIN,SCRAM-SHA-1 --auth --keyFile /etc/mongodb/keys/mongodb_keyfile --dbpath data/db --auditDestination file --auditFormat JSON --auditPath /var/log/audit.json
EDIT1
I think it's important to mention that whenever I try to initiate the replica set while the node is in this broken state (and the replica set is not available), I get this message:
test> rs.initiate()
MongoServerError: already initialized
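"already initialized" means a replica set config document already exists in the local database, even though the node never assumes it. Inspecting that stored document shows which host name the config expects, which can be compared against the name the pod currently resolves to (run via mongosh on the affected node):
# Dump the stored replica set config and the node's live view of it.
mongosh --quiet --eval "printjson(db.getSiblingDB('local').system.replset.find().toArray())"
mongosh --quiet --eval "printjson(rs.status())"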