High memory usage of PostgreSQL leader pod

an-toine · April 29, 2026, 3:04pm

Description:

Hello !

We are using the Percona PostgreSQL operator to instanciate databases in a Kubernetes cluster.

Lately, we have received alerts about the leader pod being evicted because of memory pressure on a node.

After some investigations, we have found the memory usage of the pod was steadily increasing (and conversely, the node available memory was decreasing), leading to this eviction :

How can we investigate this issue ?

Steps to Reproduce:

Create a 3 nodes cluster using CRD `perconapgclusters.pgv2.percona.com` :

apiVersion: pgv2.percona.com/v2
kind: PerconaPGCluster
metadata:
  annotations:
    argocd.argoproj.io/tracking-id: artifactory:pgv2.percona.com/PerconaPGCluster:artifactory/artifactory-pg-db
    current-primary: artifactory-pg-db
    freelens.app/resource-version: v2
    pgv2.percona.com/patroni-version: 4.1.0
    postgres-operator.crunchydata.com/trigger-switchover: Tue Mar 31 10:19:18 AM CEST
      2026
  labels:
    app.kubernetes.io/instance: artifactory
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: pg-db
    app.kubernetes.io/version: 2.8.2
    argocd.argoproj.io/instance: artifactory
    crunchy-pgha-scope: artifactory-pg-db
    deployment-name: artifactory-pg-db
    helm.sh/chart: pg-db-2.8.2
    name: artifactory-pg-db
    pg-cluster: artifactory-pg-db
    pgo-version: 2.8.2
    pgouser: admin
  name: artifactory-pg-db
  namespace: artifactory
spec:
  backups:
    enabled: true
    pgbackrest:
      global:
        archive-push-queue-max: 5G
        repo1-retention-full: "1"
        repo1-retention-full-type: count
      image: docker.io/percona/percona-pgbackrest:2.57.0-1
      manual:
        options:
        - --type=full
        - --annotation="percona.com/backup-name"="artifactory-pg-db-repo1-full-pknd9"
        repoName: repo1
      metadata:
        labels:
          pgv2.percona.com/version: 2.5.0
      repoHost:
        affinity:
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/data: pgbackrest
                topologyKey: kubernetes.io/hostname
              weight: 1
        priorityClassName: low
      repos:
      - name: repo1
        schedules:
          full: 0 0 * * *
        volume:
          volumeClaimSpec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 500Gi
            storageClassName: portworx-pso-fb-v3
    trackLatestRestorableTime: true
  crVersion: 2.8.2
  extensions:
    builtin:
      pg_audit: true
      pg_stat_monitor: true
  image: docker.io/percona/percona-distribution-postgresql:16.11-2
  imagePullPolicy: Always
  instances:
  - affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - podAffinityTerm:
            labelSelector:
              matchLabels:
                postgres-operator.crunchydata.com/data: postgres
                postgres-operator.crunchydata.com/instance-set: jfrog-platform
            topologyKey: kubernetes.io/hostname
          weight: 1
    dataVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 500Gi
      storageClassName: portworx-pso-fb-v3
    metadata:
      labels:
        pgv2.percona.com/version: 2.5.0
    name: jfrog-platform
    replicas: 3
    walVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 800Gi
      storageClassName: portworx-pso-fb-v3
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          max_connections: 200
        pg_hba:
        - local   all all trust
        - host    all all 10.10.0.0/16 md5
    leaderLeaseDurationSeconds: 30
    port: 8008
    switchover:
      enabled: true
      targetInstance: artifactory-pg-db-jfrog-platform-sgtv
      type: Switchover
    syncPeriodSeconds: 10
  pause: false
  pmm:
    enabled: false
    image: docker.io/percona/pmm-client:3.4.1
    querySource: pgstatmonitor
    secret: artifactory-pg-db-pmm-secret
    serverHost: monitoring-service
  port: 5432
  postgresVersion: 16
  proxy:
    pgBouncer:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  postgres-operator.crunchydata.com/cluster: artifactory-pg-db
                  postgres-operator.crunchydata.com/role: pgbouncer
              topologyKey: kubernetes.io/hostname
            weight: 1
      exposeSuperusers: true
      image: docker.io/percona/percona-pgbouncer:1.25.0-1
      metadata:
        labels:
          pgv2.percona.com/version: 2.5.0
      port: 5432
      replicas: 3
  standby:
    enabled: false
  unmanaged: false
  users:
  - databases:
    - artifactory
    name: artifactory
    options: SUPERUSER
    secretName: artifactory-db-secret
  - databases:
    - xray
    name: xray
    options: SUPERUSER
    secretName: xray-db-secret

Version:

Kubernetes 1.34.3

Percona PostgreSQL operator 2.8.2

Percona distribution PostgreSQL 16.11-2

Logs:

n/a

Expected Result:

The database memory usage should not increase over time

Actual Result:

The database memory usage increased to the point the pod got evicted

Additional Information:

n/a

Ege_Gunes · April 30, 2026, 7:01am

@an-toine I assume your workload patterns didn’t change much? I’ll deploy a cluster with a similar configuration to yours to see if I observe the same pattern.

Ege_Gunes · April 30, 2026, 7:08am

Just remembered that pg_stat_monitor might create a significant memory overhead. Have you tried disabling pg_stat_monitor to see if it helps?

an-toine · April 30, 2026, 7:34am

There was no recent workload change that could explain this new behavior.
We have not updated the client application and the usage remained stable across time.

I confirm extension `pg_stat_monitor` is enabled, we could try disabling it to check if situation improves :

➜  ~ k get perconapgclusters.pgv2.percona.com -o yaml artifactory-pg-db | grep -B 5 pg_stat_monitor
    trackLatestRestorableTime: true
  crVersion: 2.8.2
  extensions:
    builtin:
      pg_audit: true
      pg_stat_monitor: true

Antoine

Ege_Gunes · May 6, 2026, 2:17pm

@an-toine did you have the chance to disable it? any updates?

an-toine · May 7, 2026, 6:36am

To investigate further, we chose to work on enabling PMM for this instance : will disabling this module impact PMM collected data ?

As an aside, I saw we were not setting limits memory and cpu requests/limits for this instance, and I was wondering if it could impact the memory usage in any way ?

an-toine · May 12, 2026, 6:47am

For your information, we have disabled module pg_stat_monitor and set a 8GB memory limit/2GB memory request.

We are closely monitoring the instance to evaluate the effectiveness of the change.

Ege_Gunes · May 21, 2026, 6:57am

@an-toine do you see any difference?

an-toine · May 21, 2026, 7:04am

We think we have solved our issue by setting memory limit and requests on the instance :

We now observe a ceiling in memory usage, there is no constant increase over time.

For us, this issue is solved

Thank you for your help !

Ege_Gunes · May 21, 2026, 7:06am

@an-toine I’m happy to hear that. Is this with pg_stat_monitor disabled or enabled?

an-toine · May 21, 2026, 7:10am

This behavior is observed with pg_stat_monitor plugin disabled.

Ege_Gunes · May 21, 2026, 7:16am

Thanks @an-toine!

This issue is solved by disabling pg_stat_monitor and setting memory limits.

Topic		Replies	Views
Percona Operator for PostgresSQL takes too much CPU and RAM usage while running on OpenShift Cluster Percona Operator for PostgreSQL	7	1172	August 26, 2022
Percona Postgres takes too much CPU and RAM usage while running on OpenShift Cluster Percona Distribution for PostgreSQL postgres	2	1173	August 16, 2022
Error in percona logs - pg_stat_monitor: pg: out of memory PostgreSQL	2	327	November 21, 2025
Percona postgresql operator frequent crashing due to leader election failure Percona Operator for PostgreSQL percona , postgresql	2	227	December 16, 2025
PostgreSQL + PMM - memory usage problem PostgreSQL	1	1102	July 28, 2022