CrashLoopBackOff after deploying MongoDB Operator psmdb-operator-1.16.3

Description:

Hi All,
I deployed the Percona Operator for MongoDB using the Helm chart, and I noticed that the operator pod restarts intermittently and ends up in CrashLoopBackOff, with the leader election errors shown in the logs below.

Steps to Reproduce:

Here are the installation steps I followed:
helm install psmdb-db psmdb-db-1.16.3.tgz -n psmdb --create-namespace || true
Values.yaml:

replicaCount: 1
image:
  repository: percona/percona-server-mongodb-operator
  tag: 1.16.2
  pullPolicy: IfNotPresent
disableTelemetry: true
watchAllNamespaces: true
rbac:
  create: true
serviceAccount:
  create: true
imagePullSecrets:
nameOverride: ""
fullnameOverride: ""
env:
  resyncPeriod: 5s
tolerations:
logStructured: true
logLevel: "DEBUG"
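
For context, a values file like this is typically applied to the operator chart at install time. A minimal sketch, assuming the values above are saved as values.yaml and using placeholder chart/release/namespace names (not my exact command):

# Sketch: install the operator chart with the values shown above.
# Chart tarball, release name, and namespace are placeholders/assumptions.
helm install psmdb-operator psmdb-operator-1.16.3.tgz \
  -n psmdb-operator --create-namespace \
  -f values.yaml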

Version:

psmdb-operator Helm chart version: 1.16.3
psmdb-operator version: 1.16.2
psmdb-db Helm chart version: 1.16.3
psmdb-db version: 1.16.2
Kubernetes version: 1.29.5
kubectl version: v1.31.0

Logs:

{"level":"info","ts":1723890615.2566066,"logger":"setup","msg":"Manager starting up","gitCommit":"13627d423321257e18b77d270af922c6cd17c8f0","gitBranch":"release-1-16-2","goVersion":"go1.22.5","os":"linux","arch":"amd64"}
{"level":"info","ts":1723890615.2936742,"msg":"server version","platform":"kubernetes","version":"v1.29.5"}
{"level":"info","ts":1723890615.3044705,"logger":"controller-runtime.metrics","msg":"Starting metrics server"}
{"level":"info","ts":1723890615.3045084,"msg":"starting server","name":"health probe","addr":"[::]:8081"}
{"level":"info","ts":1723890615.3045788,"logger":"controller-runtime.metrics","msg":"Serving metrics server","bindAddress":":8080","secure":false}
I0817 10:30:15.305233 1 leaderelection.go:250] attempting to acquire leader lease psmdb-operator/08db0feb.percona.com...
I0817 10:30:31.245952 1 leaderelection.go:260] successfully acquired lease psmdb-operator/08db0feb.percona.com
{"level":"debug","ts":1723890631.2460227,"logger":"events","msg":"psmdb-operator-7dcdd78d99-f7n86_aca193f9-5ee0-484d-b50e-3ee0d499819b became leader","type":"Normal","object":{"kind":"Lease","namespace":"psmdb-operator","name":"08db0feb.percona.com","uid":"ae92da5a-afd1-4749-a980-258e93120b61","apiVersion":"coordination.k8s.io/v1","resourceVersion":"470766"},"reason":"LeaderElection"}
{"level":"info","ts":1723890631.2465332,"msg":"Starting EventSource","controller":"psmdbbackup-controller","source":"kind source: *v1.PerconaServerMongoDBBackup"}
{"level":"info","ts":1723890631.2465675,"msg":"Starting EventSource","controller":"psmdbbackup-controller","source":"kind source: *v1.Pod"}
{"level":"info","ts":1723890631.2465749,"msg":"Starting Controller","controller":"psmdbbackup-controller"}
{"level":"info","ts":1723890631.246808,"msg":"Starting EventSource","controller":"psmdbrestore-controller","source":"kind source: *v1.PerconaServerMongoDBRestore"}
{"level":"info","ts":1723890631.2468803,"msg":"Starting EventSource","controller":"psmdbrestore-controller","source":"kind source: *v1.Pod"}
{"level":"info","ts":1723890631.2468925,"msg":"Starting Controller","controller":"psmdbrestore-controller"}
{"level":"info","ts":1723890631.2470984,"msg":"Starting EventSource","controller":"psmdb-controller","source":"kind source: *v1.PerconaServerMongoDB"}
{"level":"info","ts":1723890631.2471287,"msg":"Starting Controller","controller":"psmdb-controller"}
{"level":"info","ts":1723890631.3539019,"msg":"Starting workers","controller":"psmdb-controller","worker count":1}
{"level":"info","ts":1723890631.35402,"msg":"Starting workers","controller":"psmdbrestore-controller","worker count":1}
{"level":"info","ts":1723890631.3540661,"msg":"Starting workers","controller":"psmdbbackup-controller","worker count":1}
E0817 10:42:41.055838 1 leaderelection.go:340] Failed to update lock optimitically: Put "https://10.0.0.1:443/apis/coordination.k8s.io/v1/namespaces/psmdb-operator/leases/08db0feb.percona.com": context deadline exceeded, falling back to slow path
E0817 10:42:41.055947 1 leaderelection.go:347] error retrieving resource lock psmdb-operator/08db0feb.percona.com: client rate limiter Wait returned an error: context deadline exceeded
I0817 10:42:41.055963 1 leaderelection.go:285] failed to renew lease psmdb-operator/08db0feb.percona.com: timed out waiting for the condition
{"level":"debug","ts":1723891361.056032,"logger":"events","msg":"psmdb-operator-7dcdd78d99-f7n86_aca193f9-5ee0-484d-b50e-3ee0d499819b stopped leading","type":"Normal","object":{"kind":"Lease","namespace":"psmdb-operator","name":"08db0feb.percona.com","uid":"ae92da5a-afd1-4749-a980-258e93120b61","apiVersion":"coordination.k8s.io/v1","resourceVersion":"474711"},"reason":"LeaderElection"}
{"level":"error","ts":1723891361.0560184,"logger":"setup","msg":"problem running manager","error":"leader election lost","stacktrace":"main.main\n\t/go/src/github.com/percona/percona-server-mongodb-operator/cmd/manager/main.go:161\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:271"}

Actual Result:

The operator pod goes into CrashLoopBackOff.
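
To narrow this down, these are the kinds of checks that can be run against the cluster (a sketch; <operator-pod> is a placeholder, and the psmdb-operator namespace is taken from the log output above):

# Sketch: inspect the restarting operator pod and the leader-election Lease.
kubectl -n psmdb-operator get pods
kubectl -n psmdb-operator logs <operator-pod> --previous
kubectl -n psmdb-operator describe pod <operator-pod>
kubectl -n psmdb-operator get lease 08db0feb.percona.com -o yaml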

Additional Information:

Any idea how to fix this issue?
Is this operator version "1.16.3" production-ready?

Hi, at this time the latest supported release is 1.16.2, so please try that one. You can check this page for info about supported releases: Release notes index - Percona Operator for MongoDB
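
If it helps, here is a minimal sketch of pinning an older release from the Percona Helm repository (the repo URL is the public percona-helm-charts repo; confirm the exact chart version that ships operator 1.16.2 with helm search before installing):

# Sketch: install a pinned chart version; <chart-version> is a placeholder.
helm repo add percona https://percona.github.io/percona-helm-charts/
helm repo update
helm search repo percona/psmdb-operator --versions
helm install psmdb-operator percona/psmdb-operator --version <chart-version> \
  -n psmdb-operator --create-namespace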


Hi all, I managed to resolve this issue by downgrading the Kubernetes version to 1.29.0. It seems the Percona operator doesn't like Kubernetes 1.29.5 🙂