Liveness probe failed: replSetGetStatus command requires authentication

I am using the operator 1.8.0 to deploy a 3-node MongoDB cluster with local persistent volumes (LPV) on a 3-node Kubernetes cluster.
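
For context, I deploy it roughly like this (a sketch of my procedure, not an exact transcript; the file names follow the operator repository layout):

# deploy the operator (CRDs, RBAC, deployment), then the cluster CR
kubectl apply -f deploy/bundle.yaml -n percona
kubectl apply -f deploy/cr.yaml -n percona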

All the pods are running when I first deploy.

When I look at the logs of one pod, I can see an error showing that clusterMonitor authentication failed:

{"t":{"$date":"2021-05-20T01:17:44.934+00:00"},"s":"I",  "c":"ACCESS",   "id":20249,   "ctx":"conn33","msg":"Authentication failed","attr":{"mechanism":"SCRAM-SHA-1","principalName":"clusterMonitor","authenticationDatabase":"admin","client":"127.0.0.1:36118","result":"UserNotFound: Could not find user \"clusterMonitor\" for db \"admin\""}}

and describing the pod shows the error I put in the title:

Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  12m (x7 over 17m)  default-scheduler  0/3 nodes are available: 1 node(s) had taint {node.kubernetes.io/disk-pressure: }, that the pod didn't tolerate, 2 node(s) had volume node affinity conflict.
  Normal   Scheduled         12m                default-scheduler  Successfully assigned percona/my-cluster-name-rs0-2 to k8s-workernode-1
  Normal   Pulling           12m                kubelet            Pulling image "percona/percona-server-mongodb-operator:1.8.0"
  Normal   Pulled            10m                kubelet            Successfully pulled image "percona/percona-server-mongodb-operator:1.8.0" in 1m13.862005939s
  Normal   Created           10m                kubelet            Created container mongo-init
  Normal   Started           10m                kubelet            Started container mongo-init
  Warning  Unhealthy         9m43s              kubelet            Liveness probe failed: {"level":"info","msg":"Running Kubernetes liveness check for mongod","time":"2021-05-20T01:15:54Z"}
{"level":"error","msg":"replSetGetStatus returned error command replSetGetStatus requires authentication","time":"2021-05-20T01:15:54Z"}
  Warning  Unhealthy  9m14s  kubelet  Liveness probe failed: {"level":"info","msg":"Running Kubernetes liveness check for mongod","time":"2021-05-20T01:16:23Z"}
{"level":"error","msg":"replSetGetStatus returned error command replSetGetStatus requires authentication","time":"2021-05-20T01:16:23Z"}
  Warning  Unhealthy  8m44s  kubelet  Liveness probe failed: {"level":"info","msg":"Running Kubernetes liveness check for mongod","time":"2021-05-20T01:16:53Z"}
{"level":"error","msg":"replSetGetStatus returned error command replSetGetStatus requires authentication","time":"2021-05-20T01:16:53Z"}
  Warning  Unhealthy  8m14s  kubelet  Liveness probe failed: {"level":"info","msg":"Running Kubernetes liveness check for mongod","time":"2021-05-20T01:17:23Z"}
{"level":"error","msg":"replSetGetStatus returned error command replSetGetStatus requires authentication","time":"2021-05-20T01:17:23Z"}
  Normal   Created    8m13s (x2 over 10m)  kubelet  Created container mongod
  Normal   Pulled     8m13s (x2 over 10m)  kubelet  Container image "percona/percona-server-mongodb:4.4.5-7" already present on machine
  Normal   Started    8m12s (x2 over 10m)  kubelet  Started container mongod
  Warning  Unhealthy  6m44s                kubelet  Liveness probe failed: {"level":"info","msg":"Running Kubernetes liveness check for mongod","time":"2021-05-20T01:18:53Z"}
{"level":"error","msg":"replSetGetStatus returned error command replSetGetStatus requires authentication","time":"2021-05-20T01:18:53Z"}
  Warning  Unhealthy  6m14s  kubelet  Liveness probe failed: {"level":"info","msg":"Running Kubernetes liveness check for mongod","time":"2021-05-20T01:19:23Z"}
{"level":"error","msg":"replSetGetStatus returned error command replSetGetStatus requires authentication","time":"2021-05-20T01:19:23Z"}

When I attach to the pod and run the mongo shell against the local MongoDB instance, I can only see one user in the admin database: ‘userAdmin’.
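
For reference, this is roughly what I did inside the pod (the container name and the userAdmin password come from my own setup, so treat them as placeholders):

kubectl exec -it my-cluster-name-rs0-2 -c mongod -n percona -- bash
# inside the container; the password comes from the system users secret
mongo -u userAdmin -p '<userAdmin password>' admin
db.getUsers()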

When I try to connect to another pod using the StatefulSet's stable network ID, for example by executing the command below in pod my-cluster-name-rs0-0:

mongo "mongodb://my-cluster-name-rs0-1.my-cluster-name-rs0.percona.svc.cluster.local" --verbose

(percona is my namespace)

I got a "Host not found" exception.
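
For what it's worth, the headless-service record can be checked from inside the pod like this (assuming getent is available in the image):

getent hosts my-cluster-name-rs0-1.my-cluster-name-rs0.percona.svc.cluster.local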

Could someone please help with this issue?

Thanks in advance.

If there is any additional info I need to attach, please let me know.

I redeployed again, same issue.

Hello @wenjian,

Could you please share the cr.yaml that you use to deploy the cluster?
Are you changing the secrets when deploying?
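
If it is easier, you can also dump the custom resource as it is currently applied in the cluster (psmdb is the short name of the PerconaServerMongoDB CRD):

kubectl get psmdb my-cluster-name -n percona -o yaml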

@Sergey_Pronin Does using existing secrets cause any issues for the cluster?

@SpoorthiPalakshaiah What do you mean by “existing”?

But changing the system user secrets is a process that might trigger a pod restart or container reload.
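
You can check which system users secret the cluster actually points to and inspect it, for example (my-cluster-name-secrets is the default name from the sample cr.yaml, yours may differ):

# which secret the CR references
kubectl get psmdb my-cluster-name -n percona -o jsonpath='{.spec.secrets.users}'
# the secret itself (values are base64-encoded)
kubectl get secret my-cluster-name-secrets -n percona -o yaml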

I’m using the same secrets as in the secrets.yaml from the documentation, but the cluster is still unstable.
The liveness and readiness probes fail a few times, but the pod stays in the Running state. What could be the reason for this behavior in the cluster?

The failures of the liveness and readiness probes are not triggered by secrets.
The most likely reason would be resource saturation.
What does your monitoring tell you? What is the current utilization (CPU/RAM) of the Mongo-related containers?
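
If you do not have full monitoring in place, a quick check is possible with metrics-server and the node description, for example:

# per-container CPU/RAM usage (requires metrics-server)
kubectl top pod -n percona --containers
# requests/limits already allocated on the node from your events
kubectl describe node k8s-workernode-1 | grep -A 10 'Allocated resources'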

I changed to using hostPath, and now everything is fine.

I assume the issue was caused by a misconfigured PV / PVC.
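
For anyone hitting the same thing, the PV / PVC binding can be inspected with something like this (the PVC name follows the mongod-data-<pod> pattern in my cluster, adjust to yours):

kubectl get pvc -n percona
kubectl get pv
kubectl describe pvc mongod-data-my-cluster-name-rs0-2 -n percona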

To add to the conversation, I sometimes hit the same issue. It turns out it’s because I redeploy the CR many times without cleaning up the PVCs.
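
Cleaning them up between redeploys is basically the following (the PVC names follow the mongod-data-<pod> pattern in my setup; be careful, this deletes the data):

kubectl get pvc -n percona
kubectl delete pvc mongod-data-my-cluster-name-rs0-0 mongod-data-my-cluster-name-rs0-1 mongod-data-my-cluster-name-rs0-2 -n percona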
