Percona PostgreSQL operator frequently crashing due to leader election failure

Description:

The Percona PostgreSQL operator is crashing due to leader election failure: it is unable to communicate with the Kubernetes API server to maintain its leadership lease. There is no way to change the operator's lease duration, renewal deadline, or retry period. I am also running a single replica of the operator, so I do not need leader election at all, but there is no way to disable it either. Crunchy Data PGO supports this by setting PGO_CONTROLLER_LEASE_NAME to an empty value.

Steps to Reproduce:

During cluster node autoscaling, when the Kubernetes API server throttles requests, the operator cannot renew its lease. It then panics with the error below and restarts continuously.

Version:

Percona PostgreSQL Operator 2.8.0

Logs:

E1122 19:58:05.112391 Failed to update lock optimistically: Put "https://10.43.0.1:443/apis/coordination.k8s.io/v1/namespaces/default/leases/08db3feb.percona.com?timeout=5s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)

E1122 19:58:10.112084 error retrieving resource lock default/08db3feb.percona.com: Get "https://10.43.0.1:443/apis/coordination.k8s.io/v1/namespaces/default/leases/08db3feb.percona.com?timeout=5s": context deadline exceeded

panic: leader election lost.

Expected Result:

The operator should not restart; it should keep retrying, or offer a way to change the lease retry logic and renewal durations. It would be even more helpful to have a way to turn off leader election and the lease entirely, as Crunchy Data PGO does.

Actual Result:

Frequent crashing

Additional Information:

Kubernetes version: 1.34.1-k3s1

Hi @vasanthnataraj, I agree that we need a more flexible configuration for the leader election process in our operators. We should make these options configurable by users:

LeaseDuration: 60 * time.Second,
RenewDeadline: 40 * time.Second,
RetryPeriod: 10 * time.Second,
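For context on why these three values must be tuned together: to the best of my understanding, the client-go leader election code (which the operator relies on) requires LeaseDuration to exceed RenewDeadline, and RenewDeadline to exceed the jittered retry period (a jitter factor of 1.2 is an assumption here). The sketch below checks those invariants against the defaults quoted above and shows how many renewal attempts fit inside the deadline.

```go
package main

import (
	"fmt"
	"time"
)

// Current defaults, as quoted in the comment above.
const (
	leaseDuration = 60 * time.Second
	renewDeadline = 40 * time.Second
	retryPeriod   = 10 * time.Second
)

// jitterFactor is assumed to match the constant client-go applies when
// validating RenewDeadline against RetryPeriod.
const jitterFactor = 1.2

func main() {
	// Any user-supplied values would have to satisfy the same ordering,
	// or the leader elector refuses to start.
	fmt.Println(leaseDuration > renewDeadline)                              // LeaseDuration > RenewDeadline
	fmt.Println(float64(renewDeadline) > jitterFactor*float64(retryPeriod)) // RenewDeadline > jitter * RetryPeriod

	// Roughly how many renewal attempts the operator gets before it gives
	// up the lease and panics ("leader election lost").
	fmt.Println(int(renewDeadline / retryPeriod))
}
```

With the defaults above the operator gets about four attempts; raising RenewDeadline or lowering RetryPeriod buys more retries during API server throttling, which is exactly what the reporter is asking for.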

I will create a task to add the improvement. Thanks.

The task number is Jira