Rootless Podman-based PMM Server 3.4.1 remains unhealthy

Hi

When I try to start up the PMM Server 3.4.1 container on OL9, the container status remains unhealthy. There are no further errors: the container starts up successfully and pmm-admin status is also clean. I'd appreciate any help.

podman inspect --format='{{json .State.Health.Log}}' pmm-server | jq .
[
  {
    "Start": "2025-11-10T09:37:52.957726152Z",
    "End": "2025-11-10T09:37:53.086056494Z",
    "ExitCode": 1,
    "Output": ""
  },
  {
    "Start": "2025-11-10T09:37:56.959437327Z",
    "End": "2025-11-10T09:37:57.076073872Z",
    "ExitCode": 1,
    "Output": ""
  },
  {
    "Start": "2025-11-10T09:38:00.977739671Z",
    "End": "2025-11-10T09:38:01.068718548Z",
    "ExitCode": 1,
    "Output": ""
  },
  {
    "Start": "2025-11-10T09:38:04.957728267Z",
    "End": "2025-11-10T09:38:05.083186322Z",
    "ExitCode": 1,
    "Output": ""
  },
  {
    "Start": "2025-11-10T09:38:08.955157394Z",
    "End": "2025-11-10T09:38:09.053908439Z",
    "ExitCode": 1,
    "Output": ""
  }
]

podman exec -it pmm-server pmm-admin status
Agent ID : pmm-server
Node ID  : pmm-server
Node name: pmm-server

PMM Server:
  URL    : https://127.0.0.1:8443/
  Version: 3.4.1

PMM Client:
  Connected        : true
  Time drift       : 59.857µs
  Latency          : 160.431µs
  Connection uptime: 100
  pmm-admin version: 3.4.1
  pmm-agent version: 3.4.1

Agents:
  94535680-d7fb-46c0-be56-f16c6970853e node_exporter                 Running 42000
  f4349f6e-5aed-40ae-b4ff-ed4fdb8b5b12 postgresql_pgstatements_agent Running 0
  fed375f8-61b0-4f9d-a150-1ba062b023cd postgres_exporter             Running 42001

podman logs pmm-server
Running as UID 1000
Checking /usr/share/pmm-server directory structure…
Creating nginx temp directories…
Generating self-signed certificates for nginx…
Checking nginx configuration…
nginx: [alert] could not open error log file: open() “/var/log/nginx/error.log” failed (2: No such file or directory)
2025/11/10 09:49:22 [warn] 10#10: “ssl_stapling” ignored, issuer certificate not found for certificate “/srv/nginx/certificate.crt”
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
2025-11-10 09:49:22,894 INFO Included extra file “/etc/supervisord.d/grafana.ini” during parsing
2025-11-10 09:49:22,894 INFO Included extra file “/etc/supervisord.d/nomad-server.ini” during parsing
2025-11-10 09:49:22,894 INFO Included extra file “/etc/supervisord.d/pmm.ini” during parsing
2025-11-10 09:49:22,894 INFO Included extra file “/etc/supervisord.d/qan-api2.ini” during parsing
2025-11-10 09:49:22,894 INFO Included extra file “/etc/supervisord.d/supervisord.ini” during parsing
2025-11-10 09:49:22,894 INFO Included extra file “/etc/supervisord.d/victoriametrics.ini” during parsing
2025-11-10 09:49:22,894 INFO Included extra file “/etc/supervisord.d/vmalert.ini” during parsing
2025-11-10 09:49:22,894 INFO Included extra file “/etc/supervisord.d/vmproxy.ini” during parsing
2025-11-10 09:49:22,901 INFO RPC interface ‘supervisor’ initialized
2025-11-10 09:49:22,902 INFO supervisord started with pid 1
2025-11-10 09:49:23,905 INFO spawned: ‘pmm-init’ with pid 17
2025-11-10 09:49:23,908 INFO spawned: ‘postgresql’ with pid 18
2025-11-10 09:49:23,910 INFO spawned: ‘clickhouse’ with pid 19
2025-11-10 09:49:23,917 INFO spawned: ‘grafana’ with pid 20
2025-11-10 09:49:23,924 INFO spawned: ‘nginx’ with pid 21
2025-11-10 09:49:23,933 INFO spawned: ‘victoriametrics’ with pid 22
2025-11-10 09:49:23,943 INFO spawned: ‘vmalert’ with pid 26
2025-11-10 09:49:23,955 INFO spawned: ‘vmproxy’ with pid 31
2025-11-10 09:49:23,959 INFO spawned: ‘qan-api2’ with pid 32
2025-11-10 09:49:23,980 INFO spawned: ‘pmm-managed’ with pid 44
2025-11-10 09:49:24,055 INFO exited: qan-api2 (exit status 1; not expected)
2025-11-10 09:49:24,904 INFO success: pmm-init entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-10 09:49:24,918 INFO success: postgresql entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-10 09:49:24,918 INFO success: clickhouse entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-10 09:49:24,918 INFO success: grafana entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-10 09:49:24,923 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-10 09:49:24,962 INFO success: victoriametrics entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-10 09:49:24,962 INFO success: vmalert entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-10 09:49:24,962 INFO success: vmproxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-10 09:49:25,021 INFO success: pmm-managed entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-10 09:49:25,132 INFO spawned: ‘qan-api2’ with pid 364
2025-11-10 09:49:25,298 INFO exited: qan-api2 (exit status 1; not expected)
2025-11-10 09:49:27,399 INFO spawned: ‘qan-api2’ with pid 443
2025-11-10 09:49:28,571 INFO success: qan-api2 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-10 09:49:29,992 INFO exited: pmm-init (exit status 0; expected)
2025-11-10 09:49:54,352 INFO spawned: ‘pmm-agent’ with pid 532
2025-11-10 09:49:55,533 INFO success: pmm-agent entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-10 09:51:39,006 WARN received SIGTERM indicating exit request
2025-11-10 09:51:39,007 INFO waiting for postgresql, clickhouse, grafana, nginx, victoriametrics, vmalert, vmproxy, qan-api2, pmm-managed, pmm-agent to die
2025-11-10 09:51:39,013 INFO stopped: pmm-agent (exit status 0)
2025-11-10 09:51:39,168 INFO stopped: pmm-managed (exit status 0)
2025-11-10 09:51:39,171 INFO stopped: qan-api2 (exit status 0)
2025-11-10 09:51:39,172 INFO stopped: vmproxy (terminated by SIGINT)
2025-11-10 09:51:39,173 INFO stopped: vmalert (exit status 0)
2025-11-10 09:51:39,220 INFO stopped: victoriametrics (exit status 0)
2025-11-10 09:51:41,237 INFO stopped: nginx (exit status 0)
2025-11-10 09:51:41,255 INFO stopped: grafana (exit status 0)
2025-11-10 09:51:42,388 INFO waiting for postgresql, clickhouse to die
2025-11-10 09:51:43,462 INFO stopped: clickhouse (exit status 0)
2025-11-10 09:51:43,481 INFO stopped: postgresql (exit status 0)

Hi,

What is the output of the `podman exec -t pmm-server supervisorctl status` command after your container has been up for a couple of minutes?

Also, is there something that sends the SIGTERM signal to your container, as in this log line:

"2025-11-10 09:51:39,006 WARN received SIGTERM indicating exit request"?
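
If it helps, these are the commands I usually use to gather that information (adjust the container/unit name if yours differs):

# Supervised services inside the container
podman exec -t pmm-server supervisorctl status

# For a systemd --user unit, the user journal usually shows who stopped the container
journalctl --user -u pmm-server.service --no-pager | tail -n 50

# Health history recorded by Podman
podman inspect --format '{{json .State.Health.Log}}' pmm-server | jq .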

[pmm@77e2056ce32e opt] # supervisorctl status
clickhouse RUNNING pid 26, uptime 0:26:26
grafana RUNNING pid 27, uptime 0:26:26
nginx RUNNING pid 28, uptime 0:26:26
nomad-server STOPPED Not started
pmm-agent RUNNING pid 481, uptime 0:26:22
pmm-init EXITED Nov 10 10:49 AM
pmm-managed RUNNING pid 33, uptime 0:26:26
postgresql RUNNING pid 25, uptime 0:26:26
qan-api2 RUNNING pid 462, uptime 0:26:23
victoriametrics RUNNING pid 29, uptime 0:26:26
vmalert RUNNING pid 30, uptime 0:26:26
vmproxy RUNNING pid 31, uptime 0:26:26
[pmm@77e2056ce32e opt] #

Yes, there is a terminate signal. Is this coming from the entrypoint script? I can see no error in the console log before it.

It looks like this healthcheck command is being run and is failing. Note that the request to port 8080 fails, but the one to port 80 succeeds.

podman inspect pmm-server | jq -r '.[0].Config.Healthcheck'
{
  "Test": [
    "CMD-SHELL",
    "curl -sf http://127.0.0.1:8080/v1/server/readyz"
  ],
  "StartPeriod": 10000000000,
  "Interval": 3000000000,
  "Timeout": 2000000000,
  "Retries": 3
}
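
Podman can also trigger the configured check directly, which gives the same result as running the curl by hand (shown below):

# Run the healthcheck defined in the image and print its exit code
podman healthcheck run pmm-server; echo "exit=$?"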

podman exec -it pmm-server bash
[pmm@77e2056ce32e opt] # curl -sf http://127.0.0.1:8080/v1/server/readyz
[pmm@77e2056ce32e opt] # echo $?
22
[pmm@77e2056ce32e opt] # curl -sf http://127.0.0.1:80/v1/server/readyz
Redirection

Redirect

[pmm@77e2056ce32e opt] # echo $?
0

The healthcheck may fail a few times when the container starts, but after that you should see a sequence of successful (HTTP 200) healthchecks.

As far as I can see, your container is OK and all services are running. Are you able to access the UI?
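
One more thing worth noting: exit code 22 from `curl -f` simply means the endpoint returned an HTTP error (4xx/5xx), so the verbose output should show what is actually coming back. Something along these lines (adjust the container name if needed) can help narrow it down:

# Show the full request/response for the readyz endpoint
podman exec -t pmm-server curl -v http://127.0.0.1:8080/v1/server/readyz

# curl honours http_proxy/https_proxy from the environment, which can break loopback requests;
# this forces the request to bypass any proxy
podman exec -t pmm-server curl -v --noproxy '*' http://127.0.0.1:8080/v1/server/readyz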

My container is not exiting or crashing, but it stays up with status unhealthy. The healthchecks never succeed, even after waiting for many minutes.

I can't connect to the web UI either; it errors out.

[pmm@blt21943001 user]$ podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
25ea8088adf6 docker.io/percona/pmm-server:3.4.1 /opt/entrypoint.s… 5 minutes ago Up 5 minutes (unhealthy) 0.0.0.0:443->443/tcp, 8080/tcp, 8443/tcp pmm-server
[pmm@blt21943001 user]$

[pmm@blt21943001 user]$ podman logs pmm-server
Running as UID 1000
Checking /usr/share/pmm-server directory structure…
Creating nginx temp directories…
Generating self-signed certificates for nginx…
Checking nginx configuration…
nginx: [alert] could not open error log file: open() “/var/log/nginx/error.log” failed (2: No such file or directory)
2025/11/11 04:18:05 [warn] 11#11: “ssl_stapling” ignored, issuer certificate not found for certificate “/srv/nginx/certificate.crt”
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
2025-11-11 04:18:06,070 INFO Included extra file “/etc/supervisord.d/grafana.ini” during parsing
2025-11-11 04:18:06,071 INFO Included extra file “/etc/supervisord.d/nomad-server.ini” during parsing
2025-11-11 04:18:06,071 INFO Included extra file “/etc/supervisord.d/pmm.ini” during parsing
2025-11-11 04:18:06,071 INFO Included extra file “/etc/supervisord.d/qan-api2.ini” during parsing
2025-11-11 04:18:06,071 INFO Included extra file “/etc/supervisord.d/supervisord.ini” during parsing
2025-11-11 04:18:06,071 INFO Included extra file “/etc/supervisord.d/victoriametrics.ini” during parsing
2025-11-11 04:18:06,071 INFO Included extra file “/etc/supervisord.d/vmalert.ini” during parsing
2025-11-11 04:18:06,071 INFO Included extra file “/etc/supervisord.d/vmproxy.ini” during parsing
2025-11-11 04:18:06,077 INFO RPC interface ‘supervisor’ initialized
2025-11-11 04:18:06,077 INFO supervisord started with pid 1
2025-11-11 04:18:07,080 INFO spawned: ‘pmm-init’ with pid 18
2025-11-11 04:18:07,081 INFO spawned: ‘postgresql’ with pid 19
2025-11-11 04:18:07,083 INFO spawned: ‘clickhouse’ with pid 20
2025-11-11 04:18:07,084 INFO spawned: ‘grafana’ with pid 21
2025-11-11 04:18:07,086 INFO spawned: ‘nginx’ with pid 22
2025-11-11 04:18:07,088 INFO spawned: ‘victoriametrics’ with pid 23
2025-11-11 04:18:07,090 INFO spawned: ‘vmalert’ with pid 24
2025-11-11 04:18:07,091 INFO spawned: ‘vmproxy’ with pid 25
2025-11-11 04:18:07,093 INFO spawned: ‘qan-api2’ with pid 26
2025-11-11 04:18:07,094 INFO spawned: ‘pmm-managed’ with pid 27
2025-11-11 04:18:07,355 INFO exited: qan-api2 (exit status 1; not expected)
2025-11-11 04:18:08,153 INFO success: pmm-init entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-11 04:18:08,153 INFO success: postgresql entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-11 04:18:08,154 INFO success: clickhouse entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-11 04:18:08,154 INFO success: grafana entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-11 04:18:08,154 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-11 04:18:08,154 INFO success: victoriametrics entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-11 04:18:08,154 INFO success: vmalert entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-11 04:18:08,154 INFO success: vmproxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-11 04:18:08,154 INFO success: pmm-managed entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-11 04:18:08,360 INFO spawned: ‘qan-api2’ with pid 92
2025-11-11 04:18:08,408 INFO exited: qan-api2 (exit status 1; not expected)
2025-11-11 04:18:08,645 INFO spawned: ‘pmm-agent’ with pid 106
2025-11-11 04:18:09,689 INFO success: pmm-agent entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-11-11 04:18:10,459 INFO spawned: ‘qan-api2’ with pid 483
2025-11-11 04:18:10,575 INFO exited: qan-api2 (exit status 1; not expected)
2025-11-11 04:18:12,785 INFO exited: pmm-init (exit status 0; expected)
2025-11-11 04:18:13,687 INFO spawned: ‘qan-api2’ with pid 604
2025-11-11 04:18:15,085 INFO success: qan-api2 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
[pmm@blt21943001 user]$

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
25ea8088adf6 docker.io/percona/pmm-server:3.4.1 /opt/entrypoint.s… About a minute ago Up About a minute (unhealthy) 0.0.0.0:443->443/tcp, 8080/tcp, 8443/tcp pmm-server
[pmm@blt21943001 user]$

I hope you are following the proper way of installing PMM on Podman. Have you followed each step mentioned in the documentation?

Especially step 5:

# Allow non-root users to bind to privileged ports (required for port 443)
# Make the setting persistent
echo "net.ipv4.ip_unprivileged_port_start=443" | sudo tee /etc/sysctl.d/99-pmm.conf
# Apply it immediately
sudo sysctl -p /etc/sysctl.d/99-pmm.conf

Thanks for the reply. Yes, indeed I have run that step too.

[root@user]# ls -l /etc/sysctl.d/99-pmm.conf
-rw-r-----. 1 root root 40 Nov 7 10:53 /etc/sysctl.d/99-pmm.conf
[root@user]# cat /etc/sysctl.d/99-pmm.conf
net.ipv4.ip_unprivileged_port_start=443

I can also see it in effect at runtime.

[root@user]# sysctl net.ipv4.ip_unprivileged_port_start
net.ipv4.ip_unprivileged_port_start = 443

I have posted one healthcheck above ("curl -sf http://127.0.0.1:8080/v1/server/readyz"), which is currently failing. I think that is the reason the container remains unhealthy.

However, I don't see any documentation on how to enable that internal 8080 port.

By the way, these are the steps I have taken to configure PMM with rootless Podman.

# As root, I ran the following:

echo "net.ipv4.ip_unprivileged_port_start=443" | tee /etc/sysctl.d/99-pmm.conf

sysctl -p /etc/sysctl.d/99-pmm.conf

useradd pmm

#Changed the password of the new user pmm

passwd pmm

usermod --add-subuids 100000-165535 --add-subgids 100000-165535 pmm

chown -R 1007:100 /podman/percona

chcon -Rt container_file_t /podman/percona

yum install containernetworking-plugins

loginctl enable-linger pmm

#Now logged into the user pmm

su - pmm

echo 'export DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1007/bus' >> ~/.bash_profile

. ~/.bash_profile

systemctl --user list-units

echo '
[storage]
driver = "overlay"
graphroot = "/podman/percona/"
runroot = "/podman/percona/podman-run"' >> ~/.config/containers/storage.conf

podman volume create pmm-data

podman network create pmm_default
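
# (For reference, checks along the following lines can confirm the setup so far; the names match the steps above.)

# Subordinate UID/GID ranges for the pmm user
grep pmm /etc/subuid /etc/subgid

# Volume and network created above
podman volume inspect pmm-data
podman network inspect pmm_default

# Custom graphroot from storage.conf is in effect
podman info --format '{{.Store.GraphRoot}}'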

# Created the following systemd files:

[root@user]# cd /home/pmm/.config/systemd/user/
[root@user]# pwd
/home/pmm/.config/systemd/user
[root@user]# ls -l
total 8
drwxr-xr-x. 2 pmm users 32 Nov 11 05:05 default.target.wants
-rw-r-----. 1 pmm users 195 Nov 10 15:04 env
-rw-r-----. 1 pmm users 627 Nov 11 04:48 pmm-server.service

cat env
PMM_IMAGE=docker.io/percona/pmm-server
PMM_IMAGE_TAG=3.4.1
PMM_VOLUME_NAME=pmm-data
PMM_PUBLIC_PORT=443

[root@user]# cat pmm-server.service
[Unit]
Description=pmm-server
Wants=network-online.target
After=network-online.target
After=nss-user-lookup.target nss-lookup.target
After=time-sync.target
[Service]
Type=simple
TimeoutStartSec=480
Restart=on-failure
RestartSec=20
# Environment file for this unit
EnvironmentFile=%h/.config/systemd/user/env
ExecStart=/bin/bash -c "/usr/bin/podman run \
  --volume=${PMM_VOLUME_NAME}:/srv:Z \
  --replace --name %N \
  --net pmm_default \
  --cap-add=net_admin,net_raw \
  -p 443:8443/tcp --ulimit=host ${PMM_IMAGE}:${PMM_IMAGE_TAG}"

ExecStop=/usr/bin/podman stop -t 10 %N
[Install]
WantedBy=default.target
[root@user]#

[pmm@user]$ systemctl --user enable --now pmm-server

# After a few seconds, the pmm-server container starts up in an unhealthy state, as shown in the post above. The container init logs have also been posted above.
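
In case it is useful, this is roughly how I have been watching the state after starting the unit (same unit/container names as above):

# Follow the unit's logs
journalctl --user -u pmm-server.service -f

# Show the health status Podman reports for the container
podman inspect --format '{{.State.Health.Status}}' pmm-server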

I think this is the wrong port mapping. You need to map the container port 8443 to port 443 on the host (or any other port of your choice), i.e. `podman run -d -p 443:8443 …`.

I hope that will resolve the issue.
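
For example, something along these lines (adjust the volume name and image tag to your setup):

podman run -d --name pmm-server \
  -p 443:8443/tcp \
  --volume pmm-data:/srv \
  docker.io/percona/pmm-server:3.4.1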

Yes, I tried that earlier today. As you can see, the latest unit file I pasted above has the same mapping. However, that too resulted in the same issue.

[pmm@~]$ podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
51ad3283b943 docker.io/percona/pmm-server:3.4.1 /opt/entrypoint.s… 3 minutes ago Up 3 minutes (unhealthy) 0.0.0.0:443->8443/tcp, 8080/tcp pmm-server

Can you try the following:

Let’s see if this gets your PMM up and running.

Hi,

I tried a Docker-based install using the easy-install script, and that gave the exact same issue: supervisorctl shows all daemons in the RUNNING state, but the Docker container remains unhealthy.

See the captured output below:

[root@CommonCode]# docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
dd5b7419b809 docker.io/percona/pmm-server:latest /opt/entrypoint.s… 45 seconds ago Up 43 seconds (unhealthy) 0.0.0.0:443->444/tcp, 8080/tcp, 8443/tcp pmm-server
[root@CommonCode]# docker container logs pmm-server
Checking nginx configuration…
nginx: [alert] could not open error log file: open() “/var/log/nginx/error.log” failed (2: No such file or directory)
2025/11/07 05:04:21 [warn] 17#17: “ssl_stapling” ignored, issuer certificate not found for certificate “/srv/nginx/certificate.crt”

[root@CommonCode]# docker exec -it pmm-server bash

[pmm@dd5b7419b809 conf.d] # supervisorctl status
clickhouse RUNNING pid 25, uptime 0:12:56
grafana RUNNING pid 26, uptime 0:12:56
nginx RUNNING pid 27, uptime 0:12:56
nomad-server STOPPED Not started
pmm-agent RUNNING pid 441, uptime 0:12:52
pmm-init EXITED Nov 07 05:04 AM
pmm-managed RUNNING pid 32, uptime 0:12:56
postgresql RUNNING pid 24, uptime 0:12:56
qan-api2 RUNNING pid 347, uptime 0:12:55
victoriametrics RUNNING pid 28, uptime 0:12:56
vmalert RUNNING pid 29, uptime 0:12:56
vmproxy RUNNING pid 30, uptime 0:12:56
[pmm@dd5b7419b809 conf.d] #
[pmm@dd5b7419b809 conf.d] # cd /srv/logs
[pmm@dd5b7419b809 logs] # cat pmm-init.log

Why are you mapping port 444, which is not used by PMM?

I will check that point for the Docker-based container creation.

However, regarding my earlier issue of the pmm-server starting up in an "unhealthy" state under rootless Podman: I found the reason for it.

I ran the following inspect command, which showed me which healthcheck was failing:

podman inspect pmm-server | jq '.[0].Config.Healthcheck.Test'
[
  "CMD-SHELL",
  "curl -sf http://127.0.0.1:8080/v1/server/readyz"
]

When I tested this HTTP request with curl, the output gave me a clue that it was failing because of the proxy set in my env file.

[pmm@51ad3283b943 conf.d] # curl -v http://127.0.0.1:8080/v1/server/readyz
> GET http://127.0.0.1:8080/v1/server/readyz HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/7.76.1
> Accept: */*
> Proxy-Connection: Keep-Alive

Note the absolute-form request line and the Proxy-Connection header, which show the request is going through a proxy.

I removed the proxy setting from my env file, restarted the container, and it reached a healthy state after around 15 seconds.
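
If the proxy is genuinely needed inside the container for other traffic, a possible alternative (untested here) would be to keep it but exclude loopback addresses, since curl honours the no_proxy variable. A rough sketch of what that could look like when passing the proxy explicitly at run time:

# Hypothetical: keep the proxy but exempt loopback so the internal healthcheck curl bypasses it
podman run -d --name pmm-server \
  -p 443:8443/tcp \
  --volume pmm-data:/srv \
  -e http_proxy="$http_proxy" \
  -e https_proxy="$https_proxy" \
  -e no_proxy=127.0.0.1,localhost \
  docker.io/percona/pmm-server:3.4.1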

So the problem has been resolved and it is working fine. Many thanks for your support and help in troubleshooting. Truly appreciated.

Regards

Amod.