Supervisord process `pmm-update-perform-init` causing postgresql within PMM to break

The log lines within pmm-update-perform-init.log indicate that the ansible playbook at /usr/share/pmm-update/ansible/playbook/tasks/init.yml is being executed.

These tasks, which run as part of that Ansible playbook,

TASK [postgres : Stop Postgres 14 database without supervisor] *****************
changed: [localhost]

TASK [postgres : Rename old Postgres directory] ********************************
changed: [localhost]

TASK [postgres : Remove old Postgres direcroty] ********************************
changed: [localhost]

TASK [postgres : Reread supervisord configuration] *****************************
changed: [localhost]

TASK [postgres : Restart Postgres] *********************************************
changed: [localhost] => (item=stop)
changed: [localhost] => (item=remove)
changed: [localhost] => (item=add)

result in the postgres directory /srv/postgres being removed.

The supervisord config for the postgresql process in /etc/supervisord.d/pmm.ini looks like this:

[program:postgresql]
priority = 1
command =
    /usr/pgsql-11/bin/postgres
        -D /srv/postgres
        -c shared_preload_libraries=pg_stat_statements
        -c pg_stat_statements.max=10000
        -c pg_stat_statements.track=all
        -c pg_stat_statements.save=off
user = postgres
autorestart = true
autostart = true
startretries = 10
startsecs = 1
stopsignal = INT  ; Fast Shutdown mode
stopwaitsecs = 300
; postgresql.conf contains settings to log to stdout,
; so we delegate logfile management to supervisord
stdout_logfile = /srv/logs/postgresql.log
stdout_logfile_maxbytes = 30MB
stdout_logfile_backups = 2
redirect_stderr = true

postgresql looks for its data directory at /srv/postgres (as indicated by the -D flag passed to /usr/pgsql-11/bin/postgres), but those Ansible tasks remove that directory. Consequently, postgresql fails to start and enters the BACKOFF state.
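To check for this mismatch without reading the config by hand, something like the following might help. It is only a rough sketch: it assumes Python 3 is available inside the pmm-server container, and `postgres_data_dir` is a made-up helper name, not part of PMM. It pulls the -D argument out of the [program:postgresql] command and reports whether that directory exists.

```python
import configparser
import os
import shlex


def postgres_data_dir(ini_text):
    """Return the -D argument of the [program:postgresql] command, or None."""
    parser = configparser.ConfigParser(
        inline_comment_prefixes=(";",),  # e.g. "stopsignal = INT  ; Fast Shutdown mode"
        interpolation=None,              # leave any %(...)s expansions untouched
        strict=False,                    # tolerate duplicate keys or sections
    )
    parser.read_string(ini_text)
    command = parser.get("program:postgresql", "command", fallback="")
    args = shlex.split(command)  # the multi-line command is split on whitespace
    for i, arg in enumerate(args):
        if arg == "-D" and i + 1 < len(args):
            return args[i + 1]
    return None


if __name__ == "__main__":
    ini_path = "/etc/supervisord.d/pmm.ini"  # path as given in the post above
    if os.path.isfile(ini_path):
        with open(ini_path) as fh:
            data_dir = postgres_data_dir(fh.read())
        if data_dir is None:
            print("no -D argument found for program:postgresql")
        elif os.path.isdir(data_dir):
            print(data_dir, "exists")
        else:
            print(data_dir, "is MISSING")
    else:
        print(ini_path, "not found; this is meant to run inside the container")
```

If it reports the directory as missing, that matches the BACKOFF behaviour described above: supervisord keeps retrying a command that points at a data directory that is no longer there.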

This, in turn, causes PMM to break, and results in nodes not being monitored.

We are running the pmm-server 2.25 Docker image, and have been running it for the past few months.

I am facing the exact same issue. Did you happen to find a temporary fix for this?

I’m not exactly sure which version it was, but postgres was updated from 11 to 14 at some point. Here’s the postgresql section of my pmm.ini:

priority = 1
command =
    /usr/pgsql-14/bin/postgres
        -D /srv/postgres14
        -c shared_preload_libraries=pg_stat_statements
        -c pg_stat_statements.max=10000
        -c pg_stat_statements.track=all
        -c pg_stat_statements.save=off
user = postgres
autorestart = true
autostart = true
startretries = 10
startsecs = 1
stopsignal = INT  ; Fast Shutdown mode
stopwaitsecs = 300
; postgresql.conf contains settings to log to stdout,
; so we delegate logfile management to supervisord
stdout_logfile = /srv/logs/postgresql14.log
stdout_logfile_maxbytes = 30MB
stdout_logfile_backups = 2
redirect_stderr = true

In my /srv directory I have both /srv/postgres11 and /srv/postgres14, but no /srv/postgres. I recall some issues with 2.25.0, but I'm not sure if this was one of them.
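For anyone comparing their own setup, here is a rough Python sketch that lists the postgres* directories under /srv together with the major version recorded in each one's PG_VERSION file (a file PostgreSQL writes into every data directory). The function name `list_postgres_data_dirs` is just something I made up for illustration, and the sketch assumes it runs as a user that can read those directories (for example root inside the container).

```python
import glob
import os


def list_postgres_data_dirs(base="/srv"):
    """Map each <base>/postgres* directory to its PG_VERSION contents (None if unreadable)."""
    found = {}
    for path in sorted(glob.glob(os.path.join(base, "postgres*"))):
        if not os.path.isdir(path):
            continue  # skip plain files such as stray logs
        version = None
        try:
            with open(os.path.join(path, "PG_VERSION")) as fh:
                version = fh.read().strip()
        except OSError:
            pass  # missing file or no permission: leave the version as None
        found[path] = version
    return found


if __name__ == "__main__":
    for path, version in list_postgres_data_dirs().items():
        print(path, "PG_VERSION =", version)
```

Comparing that listing against the binary path and -D value in pmm.ini should show whether the config and the data directories on disk belong to the same major version.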
