Description:
The Percona PostgreSQL operator crashes due to a leader-election failure: it is unable to reach the Kubernetes API server to renew its leadership lease. There is no way to change the operator's lease duration, renew deadline, or retry period. I also run only a single replica of the operator, so I do not need leader election at all, but there is no way to disable it either. Crunchy Data PGO supports this by setting PGO_CONTROLLER_LEASE_NAME to an empty value.
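For comparison, a minimal sketch of how the Crunchy Data PGO setting mentioned above is applied on the operator Deployment (the container name `operator` here is illustrative, not taken from either operator's actual manifests):

```yaml
# Sketch: disabling leader election in Crunchy Data PGO by clearing the lease name.
# Container name and surrounding structure are illustrative.
spec:
  template:
    spec:
      containers:
        - name: operator
          env:
            - name: PGO_CONTROLLER_LEASE_NAME
              value: ""   # empty value disables leader election
```

An equivalent switch on the Percona operator would cover the single-replica case described above.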
Steps to Reproduce:
During cluster node autoscaling, the Kubernetes API server throttles requests and the operator's lease is not renewed; the operator then panics with the error below and restarts continuously.
Version:
Percona PostgreSQL Operator 2.8.0
Logs:
E1122 19:58:05.112391 Failed to update lock optimistically: Put "https://10.43.0.1:443/apis/coordination.k8s.io/v1/namespaces/default/leases/08db3feb.percona.com?timeout=5s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
E1122 19:58:10.112084 error retrieving resource lock default/08db3feb.percona.com: Get "https://10.43.0.1:443/apis/coordination.k8s.io/v1/namespaces/default/leases/08db3feb.percona.com?timeout=5s": context deadline exceeded
panic: leader election lost.
Expected Result:
The operator should not crash; it should keep retrying lease renewal, or expose a way to tune the lease retry logic (lease duration, renew deadline, retry period). Ideally there would also be a way to turn off leader election and the lease entirely, as Crunchy Data PGO allows.
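Under the hood the operator is built on controller-runtime, whose `manager.Options` already exposes the requested knobs (`LeaderElection`, `LeaseDuration`, `RenewDeadline`, `RetryPeriod`). A hedged sketch of what surfacing them could look like; the `DISABLE_LEADER_ELECTION` env var is hypothetical, not an existing operator flag, and the timing values are examples only (controller-runtime's defaults are 15s/10s/2s):

```go
package main

// Sketch only: shows the existing controller-runtime fields a fix could
// expose; it is not the Percona operator's actual startup code.

import (
	"os"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Example values, larger than the 15s/10s/2s defaults to ride out
	// API-server throttling during node autoscaling.
	lease := 60 * time.Second
	renew := 45 * time.Second
	retry := 5 * time.Second

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		// Hypothetical opt-out for single-replica deployments.
		LeaderElection:   os.Getenv("DISABLE_LEADER_ELECTION") == "",
		LeaderElectionID: "08db3feb.percona.com", // lease name from the logs above
		LeaseDuration:    &lease,
		RenewDeadline:    &renew,
		RetryPeriod:      &retry,
	})
	_ = mgr
	_ = err
}
```

Making these fields configurable (or the election disableable) would avoid the `panic: leader election lost` crash loop shown in the logs.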
Actual Result:
Frequent crashing
Additional Information:
Kubernetes version: 1.34.1-k3s1