pmm-client is not starting up

Description:

I would like to monitor the PostgreSQL cluster with PMM, but the client is not starting. I am using a bare-bones Kubernetes cluster administered with Rancher.

Steps to Reproduce:

I followed the steps explained here:

I installed the PMM Server in a separate namespace, pmm, and the service comes up fine:

$ kubectl -n pmm get svc
NAME                 TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
monitoring-service   NodePort   10.43.226.58   <none>        443:30975/TCP,80:31827/TCP   47h

I create an API key in the PMM web UI (no expiry date) and base64-encode the string. I then create a secret with the base64-encoded string:

apiVersion: v1
kind: Secret
metadata:
  name: cluster1-pmm-secret
type: Opaque
stringData:
  PMM_SERVER_KEY: "base64_encoded_string"

I apply the secret to the namespace where the PostgreSQL Operator and DB is running:

$ kubectl -n pgsql apply -f secret.yaml
$ kubectl -n pgsql get secret | grep cluster1
cluster1-pmm-secret                               Opaque                                1      47h
$ kubectl -n pgsql get svc | grep pg
pgsql-db-pg-db-ha                        ClusterIP   10.43.204.246   <none>        5432/TCP             37d
pgsql-db-pg-db-ha-config                 ClusterIP   None            <none>        <none>               37d
pgsql-db-pg-db-pgbouncer                 ClusterIP   10.43.29.204    <none>        5432/TCP             37d
pgsql-db-pg-db-pods                      ClusterIP   None            <none>        <none>               37d
pgsql-db-pg-db-primary                   ClusterIP   None            <none>        5432/TCP             37d
pgsql-db-pg-db-replicas                  ClusterIP   10.43.58.9      <none>        5432/TCP             37d

I then change the deploy/cr.yaml of pgsql-db and add the following:

pmm:
  enabled: true
  image:
    repository: percona/pmm-client
    tag: 2.44.0
  secret: cluster1-pmm-secret
  serverHost: monitoring-service.pmm.svc.cluster.local

I deploy the application. Nothing happens:

$ kubectl -n pgsql get pod | grep pg
pgsql-db-pg-db-instance1-zsjw-0                           4/4     Running     0              46h
pgsql-db-pg-db-pgbouncer-557765ffdf-6rtgv                 2/2     Running     2 (34d ago)    37d
pgsql-db-pg-db-repo-host-0                                2/2     Running     2 (34d ago)    37d
pgsql-db-pg-db-repo1-full-28988520-gkzhs                  0/1     Completed   0              29h
pgsql-pg-operator-7d55db47c-9449v                         1/1     Running     0              47h
$ kubectl -n pgsql logs -f pgsql-db-pg-db-instance1-zsjw-0 -c 
database               pgbackrest             postgres-startup       
nss-wrapper-init       pgbackrest-config      replication-cert-copy
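
The tab completion above offers six container names, and none of them looks like a PMM sidecar. A minimal sketch of the same check in script form follows; the sidecar name pmm-client is an assumption based on the image name, and the list is copied from the completion output rather than read from the cluster:

```shell
#!/bin/sh
# Container names exactly as offered by the tab completion above.
containers="database pgbackrest postgres-startup nss-wrapper-init pgbackrest-config replication-cert-copy"

# On a live cluster the same list could be read with:
#   kubectl -n pgsql get pod pgsql-db-pg-db-instance1-zsjw-0 \
#     -o jsonpath='{.spec.initContainers[*].name} {.spec.containers[*].name}'

has_pmm=no
for c in $containers; do
  # "pmm-client" is an assumed sidecar name, not confirmed by the output above.
  if [ "$c" = "pmm-client" ]; then
    has_pmm=yes
  fi
done
echo "pmm-client container present: $has_pmm"
```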

Version:

PMM Server: v2.44.0, Helm Chart v1.3.21
PostgreSQL Operator: v2.3.1
PostgreSQL DB: v2.3.6
Kubernetes: v1.24.17
Rancher: v2.7.10

Expected Result:

A new pod for pmm-client (or a separate container within pgsql-db-pg-db-instance1-zsjw-0) should start in the namespace pgsql.

Actual Result:

Nothing happens: no client is started, so the PMM Server cannot monitor the database.

Hi @coaler, I need to see the log from the operator pod pgsql-pg-operator-7d55db47c-9449v to help you.

Here is an extract of the log:

2025-03-11T08:45:38.825Z	INFO	Superusers are exposed through PGBouncer	{"controller": "postgrescluster", "controllerGroup": "postgres-operator.crunchydata.com", "controllerKind": "PostgresCluster", "PostgresCluster": {"name":"pgsql-db-pg-db","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db", "reconcileID": "25d533aa-7992-4d25-90bf-d1a012c20ae2"}
2025-03-11T08:45:51.112Z	INFO	Can't enable PMM: pgsql-db-pg-db-pmm-secret secret doesn't exist	{"controller": "perconapgcluster", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGCluster", "PerconaPGCluster": {"name":"pgsql-db-pg-db","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db", "reconcileID": "bde5af07-5c9d-45a9-be91-7491a8c2b442"}
2025-03-11T08:45:51.758Z	INFO	Superusers are exposed through PGBouncer	{"controller": "postgrescluster", "controllerGroup": "postgres-operator.crunchydata.com", "controllerKind": "PostgresCluster", "PostgresCluster": {"name":"pgsql-db-pg-db","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db", "reconcileID": "9e1ac0b5-556c-4d1e-a60b-9cfc6b011759"}
2025-03-11T08:52:55.382Z	INFO	Superusers are exposed through PGBouncer	{"controller": "postgrescluster", "controllerGroup": "postgres-operator.crunchydata.com", "controllerKind": "PostgresCluster", "PostgresCluster": {"name":"pgsql-db-pg-db","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db", "reconcileID": "11b9df07-8010-4cde-b3d4-139966ad01d0"}
2025-03-11T08:52:56.046Z	INFO	Superusers are exposed through PGBouncer	{"controller": "postgrescluster", "controllerGroup": "postgres-operator.crunchydata.com", "controllerKind": "PostgresCluster", "PostgresCluster": {"name":"pgsql-db-pg-db","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db", "reconcileID": "69a608b9-61dc-40c2-a4f9-0238f5db2e53"}
2025-03-11T08:52:56.391Z	INFO	Can't enable PMM: pgsql-db-pg-db-pmm-secret secret doesn't exist	{"controller": "perconapgcluster", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGCluster", "PerconaPGCluster": {"name":"pgsql-db-pg-db","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db", "reconcileID": "ae44978b-9b10-4314-93d7-2f7b29fdb36c"}
2025-03-11T08:52:57.103Z	INFO	Superusers are exposed through PGBouncer	{"controller": "postgrescluster", "controllerGroup": "postgres-operator.crunchydata.com", "controllerKind": "PostgresCluster", "PostgresCluster": {"name":"pgsql-db-pg-db","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db", "reconcileID": "bea57d54-9ee1-4635-b439-cfbb68e26f31"}
2025-03-11T08:52:58.084Z	INFO	Superusers are exposed through PGBouncer	{"controller": "postgrescluster", "controllerGroup": "postgres-operator.crunchydata.com", "controllerKind": "PostgresCluster", "PostgresCluster": {"name":"pgsql-db-pg-db","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db", "reconcileID": "d9fe62a0-acf8-46fd-a4c7-78872a3063b1"}
2025-03-11T08:53:19.592Z	ERROR	Reconciler error	{"controller": "perconapgbackup", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGBackup", "PerconaPGBackup": {"name":"pgsql-db-pg-db-repo1-full-fxjqf","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db-repo1-full-fxjqf", "reconcileID": "6ecd5e39-f560-4d23-8d86-722c5fb9f16c", "error": "get backup job: Job.batch \"pgsql-db-pg-db-backup-smq7\" not found", "errorVerbose": "Job.batch \"pgsql-db-pg-db-backup-smq7\" not found\nget backup job\ngithub.com/percona/percona-postgresql-operator/percona/controller/pgbackup.(*PGBackupReconciler).Reconcile\n\t/go/src/github.com/percona/percona-postgresql-operator/percona/controller/pgbackup/controller.go:138\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227
2025-03-11T08:56:32.155Z	INFO	Superusers are exposed through PGBouncer	{"controller": "postgrescluster", "controllerGroup": "postgres-operator.crunchydata.com", "controllerKind": "PostgresCluster", "PostgresCluster": {"name":"pgsql-db-pg-db","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db", "reconcileID": "eda3cd48-e21e-400e-bccb-4436e462c5ba"}
2025-03-11T08:57:27.489Z	INFO	Superusers are exposed through PGBouncer	{"controller": "postgrescluster", "controllerGroup": "postgres-operator.crunchydata.com", "controllerKind": "PostgresCluster", "PostgresCluster": {"name":"pgsql-db-pg-db","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db", "reconcileID": "3ea70f53-28cd-4d9d-9ad3-8a306726ce8b"}
2025-03-11T08:57:28.235Z	INFO	Superusers are exposed through PGBouncer	{"controller": "postgrescluster", "controllerGroup": "postgres-operator.crunchydata.com", "controllerKind": "PostgresCluster", "PostgresCluster": {"name":"pgsql-db-pg-db","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db", "reconcileID": "059fb95e-377a-4564-9acf-2d6dd4513875"}
2025-03-11T09:09:59.593Z	ERROR	Reconciler error	{"controller": "perconapgbackup", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGBackup", "PerconaPGBackup": {"name":"pgsql-db-pg-db-repo1-full-fxjqf","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db-repo1-full-fxjqf", "reconcileID": "50ae6af6-5239-4f03-8d64-551c8e9aabf3", "error": "get backup job: Job.batch \"pgsql-db-pg-db-backup-smq7\" not found", "errorVerbose": "Job.batch \"pgsql-db-pg-db-backup-smq7\" not found\nget backup job\ngithub.com/percona/percona-postgresql-operator/percona/controller/pgbackup.(*PGBackupReconciler).Reconcile\n\t/go/src/github.com/percona/percona-postgresql-operator/percona/controller/pgbackup/controller.go:138\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227

In addition, several (unrelated) messages of this kind appear every second; I filtered them out of the extract above:

2025-03-11T10:32:20.898Z	INFO	Waiting for backup to start	{"controller": "perconapgbackup", "controllerGroup": "pgv2.percona.com", "controllerKind": "PerconaPGBackup", "PerconaPGBackup": {"name":"pgsql-db-pg-db-repo1-full-4qnxp","namespace":"pgsql"}, "namespace": "pgsql", "name": "pgsql-db-pg-db-repo1-full-4qnxp", "reconcileID": "330fe6f2-d8fb-414a-b7f6-89b63cbadbd2", "request": {"name":"pgsql-db-pg-db-repo1-full-4qnxp","namespace":"pgsql"}}

I then created a secret named pgsql-db-pg-db-pmm-secret with the same content as cluster1-pmm-secret and applied it to the namespace pgsql, but still nothing happened.
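
The secret name used here comes from the operator's own "Can't enable PMM" message in the log extract. A small sketch of pulling that name out of the log automatically is shown below; the helper name extract_pmm_secret is hypothetical, and the sample line is the message part of the log line quoted earlier with the JSON fields left out:

```shell
#!/bin/sh
# Print each distinct secret name that the operator reported as missing.
extract_pmm_secret() {
  sed -n "s/.*Can't enable PMM: \([^ ]*\) secret doesn't exist.*/\1/p" | sort -u
}

# Sample input taken from the log extract (timestamp, level, message only).
sample="2025-03-11T08:45:51.112Z INFO Can't enable PMM: pgsql-db-pg-db-pmm-secret secret doesn't exist"
wanted=$(printf '%s\n' "$sample" | extract_pmm_secret)
echo "operator looks for: $wanted"

# On a live cluster the input would come from the operator pod instead:
#   kubectl -n pgsql logs pgsql-pg-operator-7d55db47c-9449v | extract_pmm_secret
```

Comparing the printed name with the secret value in the pmm section of the config shows whether the operator is looking for the secret that was actually created.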

This is the config of the operator:

backups:
  pgbackrest:
    image: percona/percona-postgresql-operator:2.3.1-ppg15-pgbackrest
    manual:
      options:
        - '--type=full'
      repoName: repo2
    repoHost:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    postgres-operator.crunchydata.com/data: pgbackrest
                topologyKey: kubernetes.io/hostname
              weight: 1
    repos:
      - name: repo1
        schedules:
          full: "0 22 * * *"
        volume:
          volumeClaimSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
crVersion: 2.3.1
customReplicationTLSSecret:
  name: ''
customTLSSecret:
  name: ''
finalizers: null
image: percona/percona-postgresql-operator:2.3.1-ppg15-postgres
imagePullPolicy: Always
instances:
  - affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  postgres-operator.crunchydata.com/data: postgres
              topologyKey: kubernetes.io/hostname
            weight: 1
    dataVolumeClaimSpec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
          limits:
            cpu: 1
            memory: 1Gi
    name: instance1
    replicas: 1
#patroni:
#  dynamicConfiguration:
#    postgresql:
#      parameters:
#        max_connections: 50
#        shared_buffers: 80MB
pause: false
pmm:
  enabled: true
  image:
    repository: percona/pmm-client
    tag: 2.44.0
  secret: cluster1-pmm-secret
  serverHost: monitoring-service.pmm.svc.cluster.local
postgresVersion: 15
proxy:
  pgBouncer:
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  postgres-operator.crunchydata.com/role: pgbouncer
              topologyKey: kubernetes.io/hostname
            weight: 1
    image: percona/percona-postgresql-operator:2.3.1-ppg15-pgbouncer
    replicas: 1
    config:
      global:
#        pool_mode: "session"
        default_pool_size: "100"
    resources:
      limits:
        cpu: 200m
        memory: 128Mi
repository: percona/percona-postgresql-operator
secrets:
  name: null
  pgbouncer: null
  pguser: null
  postgres: null
  primaryuser: null
standby:
  enabled: false
unmanaged: false
users:
  - name: postgres

I have to correct myself: the pmm-client has now started as a container inside pgsql-db-pg-db-instance1-zsjw-0.

However, it cannot authenticate against the PMM Server:

time="2025-03-11T10:48:54.050+00:00" level=info msg="Using /usr/local/percona/pmm2/exporters/azure_exporter" component=setup
Checking local pmm-agent status...
time="2025-03-11T10:48:54.050+00:00" level=info msg="Using /usr/local/percona/pmm2/exporters/vmagent" component=setup
time="2025-03-11T10:48:54.050+00:00" level=info msg="Updating PMM Server address from \"monitoring-service.pmm.svc.cluster.local\" to \"monitoring-service.pmm.svc.cluster.local:443\"." component=setup
2025-03-11T10:48:54.051734354Z pmm-agent is running.
Registering pmm-agent on PMM Server...
Failed to register pmm-agent on PMM Server: invalid API key
Please check username and password.
time="2025-03-11T10:48:54.067+00:00" level=info msg="'pmm-agent setup' exited with 1" component=entrypoint

In the secret, I have specified the variable PMM_SERVER_KEY, which is the base64-encoded API key that I retrieved from the PMM server frontend. Is this correct?
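
For context on this question: in a Kubernetes Secret, values under stringData are taken as plain text and base64-encoded by the API server on write, whereas values under data must already be base64-encoded. A string that was encoded by hand and then placed under stringData therefore reaches the container still encoded. Whether that is what triggers the "invalid API key" message here is an assumption, not something the logs confirm. The sketch below reproduces the round trip locally with a hypothetical placeholder key:

```shell
#!/bin/sh
# Hypothetical placeholder; a real PMM API key would go here.
api_key='example-api-key'

# Step 1: the manual encoding described in the post.
encoded=$(printf '%s' "$api_key" | base64 | tr -d '\n')

# Step 2: what the API server stores under .data when $encoded is given via stringData.
stored=$(printf '%s' "$encoded" | base64 | tr -d '\n')

# Step 3: what the container receives (Kubernetes decodes .data exactly once).
seen=$(printf '%s' "$stored" | base64 -d)

echo "original key:   $api_key"
echo "container sees: $seen"
```

If the two printed values differ, one option is to put the raw key under stringData, another is to keep the encoded value but move it under data. On a live cluster, `kubectl -n pgsql get secret cluster1-pmm-secret -o jsonpath='{.data.PMM_SERVER_KEY}' | base64 -d` prints the value the container would receive.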