MongoDB error code 127

Description:

Dear Community,
I have a 6-node Kubernetes cluster and deployed the operator following the official documentation,

and then tried to create a Percona Server for MongoDB cluster based on cr.yaml, but the pods are not running.

Version:

Kubernetes version: 1.26.3
Operator: 1.14.0

Logs:

Image:         percona/percona-server-mongodb-operator:1.14.0
Image ID:      docker.io/percona/percona-server-mongodb-operator@sha256:b5db0eae838e338f43633163a43579f57468b05144bde1fa161825a132b29bd2
Port:          <none>
Host Port:     <none>
Command:
  /init-entrypoint.sh
State:          Waiting
  Reason:       CrashLoopBackOff
Last State:     Terminated
  Reason:       Error
  Exit Code:    127

Can you please help me find what the problem is?

Sorry, I am a new user and cannot provide more information because of the two-link limitation.

Hi Mateus,
welcome to the Percona forum community.

It's definitely possible to run the MongoDB operator with k8s 1.26.3; it works in other environments.
Let's debug your issue:
kubectl describe pod <pod-name>
kubectl logs <pod-name> --all-containers --prefix
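For example, for one of the failing pods (assuming the psmdb namespace and pod names that appear later in this thread):

kubectl -n psmdb describe pod my-cluster-name-rs0-0
kubectl -n psmdb logs my-cluster-name-rs0-0 --all-containers --prefix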

In addition, the operator pod logs could contain important messages.
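They can be fetched with something like this (assuming the default deployment name from the install manifests; adjust the namespace to wherever you deployed the operator):

kubectl -n psmdb logs deploy/percona-server-mongodb-operator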

Exit code 127 usually means "file not found". Please check your storage classes: kubectl describe sc
Sometimes k8s admins do not configure the default storage class properly and the /data/db directory ends up owned by root. If you need to use a non-default storage class, set it in cr.yaml with the storageClassName option for each component.
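As a minimal sketch (assuming the standard cr.yaml layout shipped with operator 1.14.0; the class name and size below are placeholders), the per-replica-set storage section looks like this:

replsets:
  - name: rs0
    volumeSpec:
      persistentVolumeClaim:
        storageClassName: <your-storage-class>
        resources:
          requests:
            storage: 3Gi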

Best regards,
Nickolay

Hi Nickolay,

Thank you for investigating my issue, I really appreciate it.
Pod describe:

kubectl describe pod/my-cluster-name-rs0-0 -n psmdb
Name: my-cluster-name-rs0-0
Namespace: psmdb
Priority: 0
Service Account: default
Node: node2/178.0.3.102
Start Time: Mon, 31 Jul 2023 09:43:08 +0200
Labels: app.kubernetes.io/component=mongod
app.kubernetes.io/instance=my-cluster-name
app.kubernetes.io/managed-by=percona-server-mongodb-operator
app.kubernetes.io/name=percona-server-mongodb
app.kubernetes.io/part-of=percona-server-mongodb
app.kubernetes.io/replset=rs0
controller-revision-hash=my-cluster-name-rs0-57f5bcdbdf
statefulset.kubernetes.io/pod-name=my-cluster-name-rs0-0
Annotations: cni.projectcalico.org/containerID: 64c6e2ddefa4b22add3b3df35194adf20d85c3f5afb8b9a0dd7b75e3d7959cbe
cni.projectcalico.org/podIP: 10.233.96.76/32
cni.projectcalico.org/podIPs: 10.233.96.76/32
Status: Pending
IP: 10.233.96.76
IPs:
IP: 10.233.96.76
Controlled By: StatefulSet/my-cluster-name-rs0
Init Containers:
mongo-init:
Container ID: containerd://533d56b5aedf8658d6329f0c9636601f1fa9c9c561f2566bbfc9a51fa3c22b5f
Image: percona/percona-server-mongodb-operator:1.14.0
Image ID: docker.io/percona/percona-server-mongodb-operator@sha256:b5db0eae838e338f43633163a43579f57468b05144bde1fa161825a132b29bd2
Port:
Host Port:
Command:
/init-entrypoint.sh
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 127
Started: Mon, 31 Jul 2023 12:28:49 +0200
Finished: Mon, 31 Jul 2023 12:28:49 +0200
Ready: False
Restart Count: 37
Limits:
cpu: 300m
memory: 500M
Requests:
cpu: 300m
memory: 500M
Environment:
Mounts:
/data/db from mongod-data (rw)
/opt/percona from bin (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kh74v (ro)
Containers:
mongod:
Container ID:
Image: percona/percona-server-mongodb:4.4
Image ID:
Port: 27017/TCP
Host Port: 0/TCP
Command:
/opt/percona/ps-entry.sh
Args:
--bind_ip_all
--auth
--dbpath=/data/db
--port=27017
--replSet=rs0
--storageEngine=wiredTiger
--relaxPermChecks
--sslAllowInvalidCertificates
--clusterAuthMode=keyFile
--keyFile=/etc/mongodb-secrets/mongodb-key
--shardsvr
--enableEncryption
--encryptionKeyFile=/etc/mongodb-encryption/encryption-key
--wiredTigerCacheSizeGB=0.25
--wiredTigerIndexPrefixCompression=true
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 300m
memory: 500M
Requests:
cpu: 300m
memory: 500M
Liveness: exec [/opt/percona/mongodb-healthcheck k8s liveness --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem --startupDelaySeconds 7200] delay=60s timeout=10s period=30s #success=1 #failure=4
Readiness: tcp-socket :27017 delay=10s timeout=2s period=3s #success=1 #failure=8
Environment Variables from:
internal-my-cluster-name-users Secret Optional: false
Environment:
SERVICE_NAME: my-cluster-name
NAMESPACE: psmdb
MONGODB_PORT: 27017
MONGODB_REPLSET: rs0
Mounts:
/data/db from mongod-data (rw)
/etc/mongodb-encryption from my-cluster-name-mongodb-encryption-key (ro)
/etc/mongodb-secrets from my-cluster-name-mongodb-keyfile (ro)
/etc/mongodb-ssl from ssl (ro)
/etc/mongodb-ssl-internal from ssl-internal (ro)
/etc/users-secret from users-secret-file (rw)
/opt/percona from bin (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kh74v (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
mongod-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: mongod-data-my-cluster-name-rs0-0
ReadOnly: false
my-cluster-name-mongodb-keyfile:
Type: Secret (a volume populated by a Secret)
SecretName: my-cluster-name-mongodb-keyfile
Optional: false
bin:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
my-cluster-name-mongodb-encryption-key:
Type: Secret (a volume populated by a Secret)
SecretName: my-cluster-name-mongodb-encryption-key
Optional: false
ssl:
Type: Secret (a volume populated by a Secret)
SecretName: my-cluster-name-ssl
Optional: true
ssl-internal:
Type: Secret (a volume populated by a Secret)
SecretName: my-cluster-name-ssl-internal
Optional: true
users-secret-file:
Type: Secret (a volume populated by a Secret)
SecretName: internal-my-cluster-name-users
Optional: false
kube-api-access-kh74v:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message


Warning BackOff 14m (x716 over 169m) kubelet Back-off restarting failed container mongo-init in pod my-cluster-name-rs0-0_psmdb(bab2e95a-e1c1-4591-bb0b-67fb369b68d8)
Normal Pulling 4m17s (x38 over 169m) kubelet Pulling image “percona/percona-server-mongodb-operator:1.14.0”
PS C:\Users\mkapas> kubectl describe pod/my-cluster-name-mongos-0 -n psmdb
Name: my-cluster-name-mongos-0
Namespace: psmdb
Priority: 0
Service Account: default
Node: node3/178.0.3.103
Start Time: Fri, 28 Jul 2023 14:45:54 +0200
Labels: app.kubernetes.io/component=mongos
app.kubernetes.io/instance=my-cluster-name
app.kubernetes.io/managed-by=percona-server-mongodb-operator
app.kubernetes.io/name=percona-server-mongodb
app.kubernetes.io/part-of=percona-server-mongodb
controller-revision-hash=my-cluster-name-mongos-56d7865c99
statefulset.kubernetes.io/pod-name=my-cluster-name-mongos-0
Annotations: cni.projectcalico.org/containerID: 80d529ad482895cd5a12d8327bed8d278466bb05dd25a6308ade0e4574b7a16a
cni.projectcalico.org/podIP: 10.233.92.74/32
cni.projectcalico.org/podIPs: 10.233.92.74/32
Status: Pending
IP: 10.233.92.74
IPs:
IP: 10.233.92.74
Controlled By: StatefulSet/my-cluster-name-mongos
Init Containers:
mongo-init:
Container ID: containerd://22962b646e111dfdf24d9a860f075e8514e2a1cafb08fcb7c37405ef86051028
Image: percona/percona-server-mongodb-operator:1.14.0
Image ID: docker.io/percona/percona-server-mongodb-operator@sha256:b5db0eae838e338f43633163a43579f57468b05144bde1fa161825a132b29bd2
Port:
Host Port:
Command:
/init-entrypoint.sh
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 127
Started: Mon, 31 Jul 2023 12:29:01 +0200
Finished: Mon, 31 Jul 2023 12:29:01 +0200
Ready: False
Restart Count: 819
Limits:
cpu: 300m
memory: 500M
Requests:
cpu: 300m
memory: 500M
Environment:
Mounts:
/data/db from mongod-data (rw)
/opt/percona from bin (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vghbt (ro)
Containers:
mongos:
Container ID:
Image: percona/percona-server-mongodb:4.4
Image ID:
Port: 27017/TCP
Host Port: 0/TCP
Command:
/opt/percona/ps-entry.sh
Args:
mongos
--bind_ip_all
--port=27017
--sslAllowInvalidCertificates
--configdb
cfg/my-cluster-name-cfg-0.my-cluster-name-cfg.psmdb.svc.cluster.local:27017
--relaxPermChecks
--clusterAuthMode=keyFile
--keyFile=/etc/mongodb-secrets/mongodb-key
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 300m
memory: 500M
Requests:
cpu: 300m
memory: 500M
Liveness: exec [/opt/percona/mongodb-healthcheck k8s liveness --component mongos --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem --startupDelaySeconds 10] delay=60s timeout=10s period=30s #success=1 #failure=4
Readiness: exec [/opt/percona/mongodb-healthcheck k8s readiness --component mongos --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem] delay=10s timeout=1s period=1s #success=1 #failure=3
Environment Variables from:
my-cluster-name-secrets Secret Optional: false
internal-my-cluster-name-users Secret Optional: false
Environment:
MONGODB_PORT: 27017
Mounts:
/data/db from mongod-data (rw)
/etc/mongodb-secrets from my-cluster-name-mongodb-keyfile (ro)
/etc/mongodb-ssl from ssl (ro)
/etc/mongodb-ssl-internal from ssl-internal (ro)
/etc/users-secret from users-secret-file (ro)
/opt/percona from bin (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vghbt (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
my-cluster-name-mongodb-keyfile:
Type: Secret (a volume populated by a Secret)
SecretName: my-cluster-name-mongodb-keyfile
Optional: false
ssl:
Type: Secret (a volume populated by a Secret)
SecretName: my-cluster-name-ssl
Optional: true
ssl-internal:
Type: Secret (a volume populated by a Secret)
SecretName: my-cluster-name-ssl-internal
Optional: true
mongod-data:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
users-secret-file:
Type: Secret (a volume populated by a Secret)
SecretName: internal-my-cluster-name-users
Optional: false
bin:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
kube-api-access-vghbt:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message


Warning BackOff 2m24s (x19170 over 2d21h) kubelet Back-off restarting failed container mongo-init in pod my-cluster-name-mongos-0_psmdb(1453b7e3-73e4-4770-a94a-98e7d53b3c8d)
PS C:\Users\mkapas> kubectl describe pod/my-cluster-name-cfg-0 -n psmdb
Name: my-cluster-name-cfg-0
Namespace: psmdb
Priority: 0
Service Account: default
Node: node3/178.0.3.103
Start Time: Fri, 28 Jul 2023 14:45:44 +0200
Labels: app.kubernetes.io/component=cfg
app.kubernetes.io/instance=my-cluster-name
app.kubernetes.io/managed-by=percona-server-mongodb-operator
app.kubernetes.io/name=percona-server-mongodb
app.kubernetes.io/part-of=percona-server-mongodb
app.kubernetes.io/replset=cfg
controller-revision-hash=my-cluster-name-cfg-65c78c4446
statefulset.kubernetes.io/pod-name=my-cluster-name-cfg-0
Annotations: cni.projectcalico.org/containerID: 923767083f612cfb3be715a8ca42f9fc1cdcff3801f850a9955cef18934194db
cni.projectcalico.org/podIP: 10.233.92.75/32
cni.projectcalico.org/podIPs: 10.233.92.75/32
Status: Pending
IP: 10.233.92.75
IPs:
IP: 10.233.92.75
Controlled By: StatefulSet/my-cluster-name-cfg
Init Containers:
mongo-init:
Container ID: containerd://8c9214ab5462a3731c5d3151f741912ec29586cd6e2b0535671e79b104f8d6eb
Image: percona/percona-server-mongodb-operator:1.14.0
Image ID: docker.io/percona/percona-server-mongodb-operator@sha256:b5db0eae838e338f43633163a43579f57468b05144bde1fa161825a132b29bd2
Port:
Host Port:
Command:
/init-entrypoint.sh
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 127
Started: Mon, 31 Jul 2023 12:30:58 +0200
Finished: Mon, 31 Jul 2023 12:30:58 +0200
Ready: False
Restart Count: 820
Limits:
cpu: 300m
memory: 500M
Requests:
cpu: 300m
memory: 500M
Environment:
Mounts:
/data/db from mongod-data (rw)
/opt/percona from bin (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gl892 (ro)
Containers:
mongod:
Container ID:
Image: percona/percona-server-mongodb:4.4
Image ID:
Port: 27017/TCP
Host Port: 0/TCP
Command:
/opt/percona/ps-entry.sh
Args:
--bind_ip_all
--auth
--dbpath=/data/db
--port=27017
--replSet=cfg
--storageEngine=wiredTiger
--relaxPermChecks
--sslAllowInvalidCertificates
--clusterAuthMode=keyFile
--keyFile=/etc/mongodb-secrets/mongodb-key
--configsvr
--enableEncryption
--encryptionKeyFile=/etc/mongodb-encryption/encryption-key
--wiredTigerCacheSizeGB=0.25
--wiredTigerIndexPrefixCompression=true
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 300m
memory: 500M
Requests:
cpu: 300m
memory: 500M
Liveness: exec [/opt/percona/mongodb-healthcheck k8s liveness --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem --startupDelaySeconds 7200] delay=60s timeout=10s period=30s #success=1 #failure=4
Readiness: tcp-socket :27017 delay=10s timeout=2s period=3s #success=1 #failure=3
Environment Variables from:
internal-my-cluster-name-users Secret Optional: false
Environment:
SERVICE_NAME: my-cluster-name
NAMESPACE: psmdb
MONGODB_PORT: 27017
MONGODB_REPLSET: cfg
Mounts:
/data/db from mongod-data (rw)
/etc/mongodb-encryption from my-cluster-name-mongodb-encryption-key (ro)
/etc/mongodb-secrets from my-cluster-name-mongodb-keyfile (ro)
/etc/mongodb-ssl from ssl (ro)
/etc/mongodb-ssl-internal from ssl-internal (ro)
/etc/users-secret from users-secret-file (rw)
/opt/percona from bin (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gl892 (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
mongod-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: mongod-data-my-cluster-name-cfg-0
ReadOnly: false
my-cluster-name-mongodb-keyfile:
Type: Secret (a volume populated by a Secret)
SecretName: my-cluster-name-mongodb-keyfile
Optional: false
bin:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit:
my-cluster-name-mongodb-encryption-key:
Type: Secret (a volume populated by a Secret)
SecretName: my-cluster-name-mongodb-encryption-key
Optional: false
ssl:
Type: Secret (a volume populated by a Secret)
SecretName: my-cluster-name-ssl
Optional: true
ssl-internal:
Type: Secret (a volume populated by a Secret)
SecretName: my-cluster-name-ssl-internal
Optional: true
users-secret-file:
Type: Secret (a volume populated by a Secret)
SecretName: internal-my-cluster-name-users
Optional: false
kube-api-access-gl892:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message


Warning BackOff 2m22s (x19223 over 2d21h) kubelet Back-off restarting failed container mongo-init in pod my-cluster-name-cfg-0_psmdb(fe7ebf01-93a3-4c61-a543-6cca33a1432d)

Currently I am using the Longhorn storage class; here is the describe output for the SC:
kubectl describe sc
Name: longhorn
IsDefaultClass: Yes
Annotations: longhorn.io/last-applied-configmap=kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: longhorn
annotations:
storageclass.kubernetes.io/is-default-class: “true”
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: “Delete”
volumeBindingMode: Immediate
parameters:
numberOfReplicas: “3”
staleReplicaTimeout: “30”
fromBackup: “”
fsType: “ext4”
dataLocality: “disabled”
,storageclass.kubernetes.io/is-default-class=true
Provisioner: driver.longhorn.io
Parameters: dataLocality=disabled,fromBackup=,fsType=ext4,numberOfReplicas=3,staleReplicaTimeout=30
AllowVolumeExpansion: True
MountOptions:
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events:

In cr.yaml I set storageClassName to longhorn.

In the operator log I can see only one error message:
2023-07-31T10:40:18.156Z ERROR failed to reconcile cluster {“controller”: “psmdb-controller”, “object”: {“name”:“my-cluster-name”,“namespace”:“psmdb”}, “namespace”: “psmdb”, “name”: “my-cluster-name”, “reconcileID”: “3332bb6b-4515-4200-bef1-edb24ea329c4”, “replset”: “cfg”, “error”: “handleReplsetInit: no mongod containers in running state”, “errorVerbose”: “no mongod containers in running state\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.init\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:432\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6329\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6306\nruntime.doInit\n\t/usr/local/go/src/runtime/proc.go:6306\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:233\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594\nhandleReplsetInit\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).reconcileCluster\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/mgo.go:99\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:487\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:235\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594”}

github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile

/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:489

sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile

/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:122

sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler

/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:323

sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem

/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:274

sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2

/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:235

Please kindly let me know if anything else is needed from my side.

Best regards,
Mate

The problem happens here. Let’s see what’s inside the script:

docker run --rm -it --entrypoint /bin/sh percona/percona-server-mongodb-operator:1.14.0
sh-5.1$ cat /init-entrypoint.sh
#!/bin/bash

set -o errexit
set -o xtrace

install -o "$(id -u)" -g "$(id -g)" -m 0755 -D /ps-entry.sh /opt/percona/ps-entry.sh
install -o "$(id -u)" -g "$(id -g)" -m 0755 -D /physical-restore-ps-entry.sh /opt/percona/physical-restore-ps-entry.sh
install -o "$(id -u)" -g "$(id -g)" -m 0755 -D /mongodb-healthcheck /opt/percona/mongodb-healthcheck
install -o "$(id -u)" -g "$(id -g)" -m 0755 -D /pbm-entry.sh /opt/percona/pbm-entry.sh

Normally the container produces output like:

$ kubectl -n psmdb logs cluster1-cfg-0 -c mongo-init
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /ps-entry.sh /opt/percona/ps-entry.sh
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /physical-restore-ps-entry.sh /opt/percona/physical-restore-ps-entry.sh
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /mongodb-healthcheck /opt/percona/mongodb-healthcheck
++ id -u
++ id -g
+ install -o 2 -g 2 -m 0755 -D /pbm-entry.sh /opt/percona/pbm-entry.sh

Please check what you have on your side.

As I can see from your output, you are using the percona/percona-server-mongodb-operator:1.14.0 image.
This image contains the /init-entrypoint.sh script and the /ps-entry.sh, /physical-restore-ps-entry.sh, /mongodb-healthcheck, and /pbm-entry.sh files. A "no such file or directory" problem could only happen with /opt/percona, which is mounted from an EmptyDir volume.

Let’s create a simple test case:

apiVersion: v1
kind: Pod
metadata:
  name: emptydirtest
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
  - name: tst
    image: busybox
    command:
    - /bin/sh  
    - -c  
    - |  
      echo test > /opt/test.txt
      cat /opt/test.txt
      sleep 600
    volumeMounts:
    - name: tmpdir
      mountPath: /opt
  volumes:
    - name: tmpdir
      emptyDir: {}
  restartPolicy: Never

Apply the manifest and check the emptyDir ownership:

$ kubectl exec -it emptydirtest -- /bin/sh
~ $ 
~ $ ls /opt
test.txt
~ $ ls -l /opt
total 4
-rw-r--r--    1 1000     2000             5 Jul 31 12:42 test.txt
~ $ ls -la /opt
total 12
drwxrwsrwx    2 root     2000          4096 Jul 31 12:42 .
drwxr-xr-x    1 root     root          4096 Jul 31 12:42 ..
-rw-r--r--    1 1000     2000             5 Jul 31 12:42 test.txt

Hello Nickolay,

when I try to run the command, I get the following error:

kubectl -n psmdb logs my-cluster-name-cfg-0 -c mongo-init
Fatal glibc error: CPU does not support x86-64-v2

I checked the physical node CPU:
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Silver 4208 CPU @ 2.10GHz
stepping : 7
microcode : 0xffffffff
cpu MHz : 2095.079
cache size : 11264 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 21
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nopl xtopology cpuid pni cx16 hypervisor lahf_lm pti ssbd ibrs ibpb md_clear flush_l1d arch_capabilities
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_stale_data retbleed
bogomips : 4190.15
clflush size : 64
cache_alignment : 64
address sizes : 44 bits physical, 48 bits virtual
power management:

I think this CPU should support x86-64-v2.

There is a similar issue reported for our other project:
https://jira.percona.com/browse/PS-8757

Are you using a virtualization engine (e.g. QEMU) and limiting CPU capabilities?

The usual level-detection script checks the CPU flags for each level; the higher levels, for example, are detected like this:
if (level == 2 && /avx/&&/avx2/&&/bmi1/&&/bmi2/&&/f16c/&&/fma/&&/abm/&&/movbe/&&/xsave/) level = 3
if (level == 3 && /avx512f/&&/avx512bw/&&/avx512cd/&&/avx512dq/&&/avx512vl/) level = 4
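For reference, here is the complete detection script those two lines come from, as commonly shared on Unix Q&A sites (a hedged reconstruction, not part of the operator; the level 1 and 2 flag lists are the ones relevant to the x86-64-v2 error):

#!/usr/bin/awk -f
# Reads /proc/cpuinfo and reports the highest supported x86-64 microarchitecture level
BEGIN {
    while (!/flags/) if (getline < "/proc/cpuinfo" != 1) no_level()
    if (/lm/&&/cmov/&&/cx8/&&/fpu/&&/fxsr/&&/mmx/&&/syscall/&&/sse2/) level = 1
    if (level == 1 && /cx16/&&/lahf/&&/popcnt/&&/sse4_1/&&/sse4_2/&&/ssse3/) level = 2
    if (level == 2 && /avx/&&/avx2/&&/bmi1/&&/bmi2/&&/f16c/&&/fma/&&/abm/&&/movbe/&&/xsave/) level = 3
    if (level == 3 && /avx512f/&&/avx512bw/&&/avx512cd/&&/avx512dq/&&/avx512vl/) level = 4
    if (level > 0) { print "CPU supports x86-64-v" level; exit level + 1 }
    no_level()
}
function no_level() { print "x86-64 level 1 not supported"; exit 1 }

If you run it on the node (e.g. awk -f check-x86-64-level.awk, using any file name you like), it will most likely stop at level 1 on your VM, because the flags level 2 needs (cx16, popcnt, sse4_1, sse4_2, ssse3) are missing from the list you posted.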

But there is no AVX in your flags list, even though it is present for the same processor model in other sources:

Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
      pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp
      lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid
      aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16
      xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand
      lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single ssbd mba ibrs
      ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust
      bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx
      smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1
      xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku
      ospke avx512_vnni flush_l1d arch_capabilities

Hi Nickolay,

Thank you for your help, I was able to change the cluster configuration. It was a Hyper-V CPU compatibility limitation.

I re-deployed cr.yaml and it looks much better: the pods are running, but they keep restarting.

psmdb is stuck in the initializing state:
kubectl get psmdb -n psmdb
NAME ENDPOINT STATUS AGE
mongodb-percona mongodb-percona-mongos.psmdb.svc.cluster.local initializing 17m

mongodb-percona-mongos-0:
Readiness probe failed: command “/opt/percona/mongodb-healthcheck k8s readiness --component mongos --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem” timed out

Events:
Type Reason Age From Message


Normal Scheduled 9m49s default-scheduler Successfully assigned psmdb/mongodb-percona-mongos-0 to node6
Normal Pulling 9m48s kubelet Pulling image “percona/percona-server-mongodb-operator:1.14.0”
Normal Pulled 9m47s kubelet Successfully pulled image “percona/percona-server-mongodb-operator:1.14.0” in 942.8629ms (942.876ms including waiting)
Normal Created 9m47s kubelet Created container mongo-init
Normal Started 9m47s kubelet Started container mongo-init
Normal Pulling 9m46s kubelet Pulling image “percona/percona-server-mongodb:6.0”
Normal Pulled 9m45s kubelet Successfully pulled image “percona/percona-server-mongodb:6.0” in 889.5536ms (889.5783ms including waiting)
Normal Created 9m45s kubelet Created container mongos
Normal Started 9m45s kubelet Started container mongos
Warning Unhealthy 4m47s (x259 over 9m34s) kubelet Readiness probe failed: command “/opt/percona/mongodb-healthcheck k8s readiness --component mongos --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem” timed out

percona-rs0-0:
Warning Unhealthy 13m (x3 over 16m) kubelet Readiness probe failed: dial tcp 10.233.108.117:27017: connect: connection refused
Normal Created 13m (x2 over 16m) kubelet Created container mongod
Normal Started 13m (x2 over 16m) kubelet Started container mongod
Normal Pulled 13m kubelet Successfully pulled image “percona/percona-server-mongodb:6.0” in 901.3136ms (901.3254ms including waiting)
Warning Unhealthy 2m16s (x19 over 15m) kubelet Liveness probe failed: command “/opt/percona/mongodb-healthcheck k8s liveness --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem --startupDelaySeconds 7200” timed out

[mongodb@mongodb-percona-rs0-0 db]$ mongosh
Current Mongosh Log ID: 64c8c929da4eb0e95cd06e0f
Connecting to: mongodb://127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+1.8.1
MongoServerSelectionError: Server selection timed out after 2000 ms
[mongodb@mongodb-percona-rs0-0 db]$ command terminated with exit code 137

percona-cfg-0:
Events:
Type Reason Age From Message


Warning FailedScheduling 22m (x2 over 22m) default-scheduler 0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/6 nodes are available: 6 No preemption victims found for incoming pod…
Normal Scheduled 22m default-scheduler Successfully assigned psmdb/mongodb-percona-cfg-0 to node6
Normal SuccessfulAttachVolume 22m attachdetach-controller AttachVolume.Attach succeeded for volume “pvc-0dc8cd90-28dd-4716-abac-2d832160437e”
Warning FailedMount 20m kubelet Unable to attach or mount volumes: unmounted volumes=[mongod-data], unattached volumes=[users-secret-file mongod-data bin kube-api-access-fzkfr mongodb-percona-mongodb-keyfile ssl ssl-internal my-cluster-name-mongodb-encryption-key]: timed out waiting for the condition
Warning FailedMount 20m kubelet MountVolume.MountDevice failed for volume “pvc-0dc8cd90-28dd-4716-abac-2d832160437e” : rpc error: code = DeadlineExceeded desc = context deadline exceeded
Normal Pulling 19m kubelet Pulling image “percona/percona-server-mongodb-operator:1.14.0”
Normal Started 19m kubelet Started container mongo-init
Normal Created 19m kubelet Created container mongo-init
Normal Pulled 19m kubelet Successfully pulled image “percona/percona-server-mongodb-operator:1.14.0” in 914.7034ms (914.7121ms including waiting)
Normal Pulled 19m kubelet Successfully pulled image “percona/percona-server-mongodb:6.0” in 1.1235468s (1.1235692s including waiting)
Normal Created 16m (x2 over 19m) kubelet Created container mongod
Normal Started 16m (x2 over 19m) kubelet Started container mongod
Normal Pulled 16m kubelet Successfully pulled image “percona/percona-server-mongodb:6.0” in 888.1978ms (888.2044ms including waiting)
Normal Killing 13m (x2 over 16m) kubelet Container mongod failed liveness probe, will be restarted
Normal Pulling 13m (x3 over 19m) kubelet Pulling image “percona/percona-server-mongodb:6.0”
Warning Unhealthy 13m kubelet Readiness probe failed: dial tcp 10.233.108.91:27017: connect: connection refused
Warning Unhealthy 14s (x25 over 18m) kubelet Liveness probe failed: command “/opt/percona/mongodb-healthcheck k8s liveness --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem --startupDelaySeconds 7200” timed out

but I am able to log in with mongosh:
[mongodb@mongodb-percona-cfg-0 db]$ mongosh
Current Mongosh Log ID: 64c8c9586a7ed3af5c59ec27
Connecting to: mongodb://127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+1.8.1
Using MongoDB: 6.0.6-5
Using Mongosh: 1.8.1

For mongosh info see: https://docs.mongodb.com/mongodb-shell/

To help improve our products, anonymous usage data is collected and sent to MongoDB periodically (Privacy Policy | MongoDB).
You can opt-out by running the disableTelemetry() command.

test>

How can I fix this problem?

Best regards,
Mate

Hi Mate,
It's great to see the progress. Now let's see what is happening with mongod. The cfg and rs servers are similar, so we can check what's going on with the cfg-0 server.

Warning FailedMount 20m kubelet MountVolume.MountDevice failed for volume “pvc-0dc8cd90-28dd-4716-abac-2d832160437e” : rpc error: code = DeadlineExceeded desc = context deadline exceeded
You need fast storage (directly attached for the best performance), allocated automatically by the corresponding storage provisioner (select the required one with a storage class, or set a proper default SC).

For simple setups I like the local path provisioner from Rancher (https://github.com/rancher/local-path-provisioner), which dynamically provisions persistent local storage with Kubernetes.
Sophisticated on-premise clusters with many disks and storage types can be built with OpenEBS (https://openebs.io/).
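If you want to try the local path provisioner, it installs from a single manifest (URL taken from the project README at the time of writing; please verify against the repository before applying):

kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml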

kubectl -n psmdb get pvc
If the PVCs are in a Pending state, check kubectl describe for the PVC and the corresponding PersistentVolumes.
Finally, if everything is good with the volumes, check the mongod logs:
kubectl -n psmdb logs -c mongod my-cluster-name-cfg-0

Thanks,
Nickolay

Hi Nickolay,

The PVCs look fine:
kubectl -n psmdb get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
mongod-data-minimal-cluster-cfg-0 Bound pvc-99d7da54-254d-4e9f-a9a7-44b5bc3a03ca 3Gi RWO longhorn 174m
mongod-data-minimal-cluster-rs0-0 Bound pvc-7c3b948d-fad6-4686-bda6-f4ca01fcb42d 3Gi RWO longhorn 174m
mongod-data-mongodb-percona-cfg-0 Bound pvc-0dc8cd90-28dd-4716-abac-2d832160437e 3Gi RWO longhorn 62m
mongod-data-mongodb-percona-cfg-1 Bound pvc-8ff346ae-c742-4501-819f-5ef15181c27e 3Gi RWO longhorn 59m
mongod-data-mongodb-percona-cfg-2 Bound pvc-2c9a71d8-9529-4ef5-8bf1-87f54a1fe050 3Gi RWO longhorn 55m
mongod-data-mongodb-percona-rs0-0 Bound pvc-5afb482f-82dd-4e73-98c3-8c20b324f438 3Gi RWO longhorn 62m
mongod-data-mongodb-percona-rs0-1 Bound pvc-1c8456ca-3695-4bc6-9532-49368452343b 3Gi RWO longhorn 59m
mongod-data-mongodb-percona-rs0-2 Bound pvc-1019c210-38e5-49a9-80e7-bb3609121750 3Gi RWO longhorn 55m
PS C:\Work\Grepton\mongodb\percona-server-mongodb-operator\deploy> kubectl -n psmdb get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-0d096c26-022c-41a2-bc8f-849dc709d935 9538Mi RWO Delete Bound inf-dev/data-volume-inf-dev-mongodb112-0 longhorn 6d21h
pvc-0dc8cd90-28dd-4716-abac-2d832160437e 3Gi RWO Delete Bound psmdb/mongod-data-mongodb-percona-cfg-0 longhorn 62m
pvc-0ef25856-f3d1-49cc-b6e9-5da854cad648 1908Mi RWO Delete Bound inf-dev/logs-volume-inf-dev-mongodb112-0 longhorn 6d21h
pvc-1019c210-38e5-49a9-80e7-bb3609121750 3Gi RWO Delete Bound psmdb/mongod-data-mongodb-percona-rs0-2 longhorn 56m
pvc-1c8456ca-3695-4bc6-9532-49368452343b 3Gi RWO Delete Bound psmdb/mongod-data-mongodb-percona-rs0-1 longhorn 59m
pvc-2c9a71d8-9529-4ef5-8bf1-87f54a1fe050 3Gi RWO Delete Bound psmdb/mongod-data-mongodb-percona-cfg-2 longhorn 56m
pvc-3460303c-9a0c-499f-854a-801f9c36a47f 9538Mi RWO Delete Bound inf-dev/data-volume-inf-dev-mongodb608-0 longhorn 6d21h
pvc-5afb482f-82dd-4e73-98c3-8c20b324f438 3Gi RWO Delete Bound psmdb/mongod-data-mongodb-percona-rs0-0 longhorn 62m
pvc-7c3b948d-fad6-4686-bda6-f4ca01fcb42d 3Gi RWO Delete Bound psmdb/mongod-data-minimal-cluster-rs0-0 longhorn 174m
pvc-8ff346ae-c742-4501-819f-5ef15181c27e 3Gi RWO Delete Bound psmdb/mongod-data-mongodb-percona-cfg-1 longhorn 59m
pvc-99d7da54-254d-4e9f-a9a7-44b5bc3a03ca 3Gi RWO Delete Bound psmdb/mongod-data-minimal-cluster-cfg-0 longhorn 174m
pvc-a0d5a3a5-5023-4663-ad98-e0bed5ebb743 1908Mi RWO Delete Bound inf-dev/logs-volume-inf-dev-mongodb608-0 longhorn 6d21h

Please find the log in attachment.
log.txt (40.5 KB)

Thank you for your tip about the local path provisioner :slight_smile:

Best Regards,
Mate

A MongoDB 6.0 replica set requires a majority to start:

{“t”:{“$date”:“2023-08-01T09:45:35.698+00:00”},“s”:“I”, “c”:“SHARDING”, “id”:22727, “ctx”:“ShardRegistryUpdater”,“msg”:“Error running periodic reload of shard registry”,“attr”:{“error”:“ReadConcernMajorityNotAvailableYet: could not get updated shard list from config server :: caused by :: Read concern majority reads are currently not possible.”,“shardRegistryReloadIntervalSeconds”:30}}

cfg-0 waits for cfg-1 and cfg-2; please check the logs from the other two pods.
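For example (assuming the same namespace and cluster name as in your outputs above):

kubectl -n psmdb logs mongodb-percona-cfg-1 -c mongod
kubectl -n psmdb logs mongodb-percona-cfg-2 -c mongod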

Here are the cfg-1 and cfg-2 log files:
cfg-1-log.txt (106.0 KB)
cfg-2-log.txt (25.5 KB)

Thanks,
Mate

{“t”:{“$date”:“2023-08-01T10:55:20.250+00:00”},“s”:“W”, “c”:“NETWORK”, “id”:21207, “ctx”:“conn9”,“msg”:“getaddrinfo() failed”,“attr”:{“host”:“mongodb-percona-cfg-1.mongodb-percona-cfg.psmdb.svc.cluster.local”,“error”:“Name or service not known”}}

There is an option to customize your Kubernetes cluster domain name:

clusterServiceDNSSuffix: svc.cluster.local

MongoDB requires fully qualified domain names saved in the certificates and in the replica set config.
For clusters powered by CoreDNS you can use this command:

kubectl get cm coredns -n kube-system -o jsonpath="{.data.Corefile}"

Alternatively, check resolv.conf from any Kubernetes pod.
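For example (the pod name is just an illustration; any running pod in the namespace will do):

kubectl -n psmdb exec -it mongodb-percona-rs0-0 -- cat /etc/resolv.conf

The search line of the output contains the cluster domain, e.g. search psmdb.svc.<cluster-domain> svc.<cluster-domain> <cluster-domain>.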

Hi Nickolay,

This is our DNS config:
.:53 {
errors
health {
lameduck 5s
}
ready
kubernetes grk8s01.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
}
prometheus :9153
forward . /etc/resolv.conf {
prefer_udp
max_concurrent 1000
}
cache 30

loop
reload
loadbalance

}

I re-deployed cr.yaml with clusterServiceDNSSuffix: grk8s01.local

Then it looks better, but the pods are still restarting.

I attached the new log files from cfg-0, cfg-1, cfg-2
cfg-0-log.txt (170.0 KB)
cfg-1-log.txt (84.9 KB)
cfg-2-log.txt (15.7 KB)

I can also see "connection refused" errors.

Best regards,
Mate

Hi Mate,

grk8s01.local is your k8s cluster domain name, but all Services in k8s have the following fully qualified DNS form:

service-name.namespace.svc.cluster.domain.

In your case it should be:

clusterServiceDNSSuffix: svc.grk8s01.local
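With that value the pods get fully qualified names of the expected form, for example (using the names from your cluster):

mongodb-percona-cfg-0.mongodb-percona-cfg.psmdb.svc.grk8s01.local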

Hi Nickolay,

Thank you so much for your help :slight_smile:
Now every pod is running, psmdb is also fine, and I can connect to MongoDB.

I really appreciate your help, and I am happy I chose Percona :slight_smile:

Best Regards,
Mate