So, we solved the pod-restart problem: in bundle.yaml,
image: docker.io/percona/percona-postgresql-operator:2.8.2 had somehow been changed to 2.9, and since no 2.9 tag exists, the image pull failed.
But now the instance pods are crash looping.
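For the record, here is a quick way to confirm which image the operator Deployment is actually running after editing bundle.yaml (namespace from the output below; the Deployment name is the stock bundle default, so adjust if yours differs):

```shell
NS=2percona
# Print the image the operator Deployment currently points at,
# to verify the tag really is back to 2.8.2.
kubectl -n "$NS" get deploy percona-postgresql-operator \
  -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'
```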
% kubectl get po -n 2percona
NAME                                           READY   STATUS             RESTARTS          AGE
cluster1-backup-n7m4-pc7fx                     0/1     Completed          0                 22h
cluster1-instance1-2lmb-0                      4/5     CrashLoopBackOff   349 (4m59s ago)   22h
cluster1-instance1-8xbb-0                      4/5     CrashLoopBackOff   351 (57s ago)     22h
cluster1-instance1-mvj8-0                      4/5     CrashLoopBackOff   349 (3m47s ago)   22h
cluster1-pgbouncer-fd5d99bcc-6g8f7             2/2     Running            0                 22h
cluster1-pgbouncer-fd5d99bcc-pfclc             2/2     Running            0                 22h
cluster1-pgbouncer-fd5d99bcc-shq5g             2/2     Running            0                 22h
cluster1-repo-host-0                           2/2     Running            0                 22h
percona-postgresql-operator-6b887b756d-lzqn4   1/1     Running            0                 24h
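Since the pods sit at 4/5, the pod-level restart counter doesn't say which of the five containers is the one failing; per-container restart counts narrow it down. A sketch, using one of the pod names from the listing above:

```shell
POD=cluster1-instance1-8xbb-0
# Print name and restartCount for each container in the pod;
# the crash-looping container is the one whose counter keeps climbing.
kubectl -n 2percona get po "$POD" -o \
  jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\n"}{end}'
```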
I see the following containers:
% kubectl describe po -n 2percona cluster1-instance1-8xbb-0 | fgrep -B1 containerd | fgrep -v containerd | sort
database-init:
database:
nss-wrapper-init:
pgbackrest-config:
pgbackrest:
pmm-client:
postgres-startup:
replication-cert-copy:
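To collect evidence from all of them at once, something like the loop below may help (the two *-init containers run once at startup, so only the four long-running ones are included; `--previous` is tried first to capture the last crashed instance's log):

```shell
NS=2percona
POD=cluster1-instance1-8xbb-0
# Dump the tail of each long-running container's log; prefer the
# previous (crashed) instance's log when one exists.
for c in database pgbackrest pmm-client replication-cert-copy; do
  echo "=== $c ==="
  kubectl -n "$NS" logs "$POD" -c "$c" --tail=20 --previous 2>/dev/null \
    || kubectl -n "$NS" logs "$POD" -c "$c" --tail=20
done
```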
On the leader, 2lmb, the database container log shows (repeating leader messages removed):
2026-02-24 00:30:21,734 ERROR: ObjectCache.run ProtocolError("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer'))
2026-02-24 05:03:34,674 ERROR: ObjectCache.run ProtocolError("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))
2026-02-24 05:04:11,992 ERROR: ObjectCache.run ProtocolError("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))
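Those ObjectCache errors come from Patroni's watch connections to the Kubernetes API being dropped, which is often transient on its own, so it may be worth asking Patroni directly whether it still considers the cluster healthy. A sketch, assuming patronictl is on the PATH inside the database container (it is in stock Patroni images):

```shell
LEADER=cluster1-instance1-2lmb-0
# Patroni's own view of the cluster: roles, states, and replication lag.
kubectl -n 2percona exec "$LEADER" -c database -- patronictl list
```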
In the replication-cert-copy log, I see:
error: container replicationh-cert-copy is not valid for pod cluster1-instance1-2lmb-0
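That particular error comes from kubectl, not from the pod: the container name as typed has a stray "h" ("replicationh-cert-copy"). With the name spelled as in the container list above, the command would be:

```shell
CONTAINER=replication-cert-copy   # note: no stray 'h'
kubectl -n 2percona logs cluster1-instance1-2lmb-0 -c "$CONTAINER"
```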
In the pmm-agent log, I see:
Checking local pmm-agent status...
pmm-agent is not running.
Config file /usr/local/percona/pmm/config/pmm-agent.yaml is not writable: no such file or directory.
time="2026-02-24T18:10:49.950+00:00" level=info msg="'pmm-agent setup' exited with 1" component=entrypoint
time="2026-02-24T18:10:49.950+00:00" level=info msg="Restarting pmm-agent setup in 5 seconds because PMM_AGENT_SIDECAR is enabled..." component=entrypoint
time="2026-02-24T18:10:54.951+00:00" level=info msg="Starting 'pmm-agent setup'..." component=entrypoint
time="2026-02-24T18:10:54.965+00:00" level=info msg="Loading configuration file /usr/local/percona/pmm/config/pmm-agent.yaml." component=setup
time="2026-02-24T18:10:54.965+00:00" level=info msg="Using /usr/local/percona/pmm2/exporters/node_exporter" component=setup
time="2026-02-24T18:10:54.965+00:00" level=info msg="Using /usr/local/percona/pmm2/exporters/mysqld_exporter" component=setup
time="2026-02-24T18:10:54.965+00:00" level=info msg="Using /usr/local/percona/pmm2/exporters/mongodb_exporter" component=setup
time="2026-02-24T18:10:54.965+00:00" level=info msg="Using /usr/local/percona/pmm2/exporters/postgres_exporter" component=setup
time="2026-02-24T18:10:54.965+00:00" level=info msg="Using /usr/local/percona/pmm2/exporters/proxysql_exporter" component=setup
time="2026-02-24T18:10:54.965+00:00" level=info msg="Using /usr/local/percona/pmm2/exporters/rds_exporter" component=setup
time="2026-02-24T18:10:54.965+00:00" level=info msg="Using /usr/local/percona/pmm2/exporters/azure_exporter" component=setup
time="2026-02-24T18:10:54.965+00:00" level=info msg="Using /usr/local/percona/pmm2/exporters/vmagent" component=setup
time="2026-02-24T18:10:54.965+00:00" level=info msg="Updating PMM Server address from \"10.20.3.20\" to \"10.20.3.20:443\"." component=setup
Checking local pmm-agent status…
And of course, nothing shows up in the PMM admin page.
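The "not writable: no such file or directory" line suggests the config directory itself is missing inside the pmm-client container (and, possibly telling, the same log looks for exporters under /usr/local/percona/pmm2 but for the config under /usr/local/percona/pmm, which could hint at a pmm-client image whose version doesn't match the paths the operator configures). A sketch to check, using the paths from the log above; since the sidecar entrypoint keeps retrying rather than exiting, the container should stay up long enough to exec into:

```shell
NS=2percona
POD=cluster1-instance1-8xbb-0
# Does the config directory exist inside the pmm-client container?
kubectl -n "$NS" exec "$POD" -c pmm-client -- \
  ls -ld /usr/local/percona/pmm /usr/local/percona/pmm/config
# Cross-check which volume, if any, is mounted at that path.
kubectl -n "$NS" get po "$POD" -o \
  jsonpath='{range .spec.containers[?(@.name=="pmm-client")].volumeMounts[*]}{.mountPath}{"\t"}{.name}{"\n"}{end}'
```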