PostgreSQL + PMM - memory usage problem

coldycoldy1 · July 14, 2022, 9:55am

PostgreSQL 14.2 + PMM 2.26.0-6.el8

Hi guys,
I have a problem with the PMM application and memory usage.

I have two servers, each running several PostgreSQL instances, and the servers replicate to each other. The PMM server runs on a separate machine; only pmm-client is installed on the two PostgreSQL servers. The first server works fine, but the second one has a problem.

For some time I have noticed the following entries in the system logs:

*Jul 13 01:23:00 XXXXXX pmm-agent[3280]: #033[36mINFO#033[0m[2022-07-13T01:23:00.027+02:00] Sending 14 buckets.                           #033[36magentID#033[0m=/agent_id/XXXXXX #033[36mcomponent#033[0m=agent-builtin #033[36mtype#033[0m=qan_postgresql_pgstatements_agent*

*Jul 13 01:23:12 XXXXXX pmm-agent[3280]: #033[36mINFO#033[0m[2022-07-13T01:23:12.514+02:00] time="2022-07-13T01:23:12+02:00" level=error msg="error retrieving settings: error running query on database \"XXXXXX:5436\": pg read tcp XXXXXX:37866->XXXXXX:5436: i/o timeout" source="postgres_exporter.go:1612"  #033[36magentID#033[0m=/agent_id/XXXXXX #033[36mcomponent#033[0m=agent-process #033[36mtype#033[0m=postgres_exporter*

*Jul 13 01:23:12 XXXXXX pmm-agent[3280]: #033[36mINFO#033[0m[2022-07-13T01:23:12.641+02:00] time="2022-07-13T01:23:12+02:00" level=info msg="Error running query on database \"XXXXXX:5434\": pg_postmaster_uptime read tcp XXXXXX:48400->XXXXXX:5434: i/o timeout" source="postgres_exporter.go:1433"  #033[36magentID#033[0m=/agent_id/XXXXXX #033[36mcomponent#033[0m=agent-process #033[36mtype#033[0m=postgres_exporter*

*Jul 13 01:23:12 XXXXXX pmm-agent[3280]: #033[36mINFO#033[0m[2022-07-13T01:23:12.687+02:00] time="2022-07-13T01:23:12+02:00" level=error msg="queryNamespaceMappings returned 1 errors" source="postgres_exporter.go:1612"  #033[36magentID#033[0m=/agent_id/XXXXXX #033[36mcomponent#033[0m=agent-process #033[36mtype#033[0m=postgres_exporter*

*Jul 13 01:23:12 XXXXXX pmm-agent[3280]: #033[36mINFO#033[0m[2022-07-13T01:23:12.969+02:00] time="2022-07-13T01:23:12+02:00" level=error msg="Error opening connection to database (postgres://pgpool:PASSWORD_REMOVED@XXXXXX:5432/XXXXXX?connect_timeout=1&sslmode=disable): \"read tcp XXXXXX:52026->XXXXXX:5432: i/o timeout\": too many connection retries" source="postgres_exporter.go:1612"  #033[36magentID#033[0m=/agent_id/XXXXXX #033[36mcomponent#033[0m=agent-process #033[36mtype#033[0m=postgres_exporter*
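The `#033[36m` and `#033[0m` fragments in these entries look like ANSI color codes that syslog has escaped into plain text (octal 033 is the ESC byte). As a reading aid, here is a minimal Python sketch that strips them. The helper name `strip_escaped_ansi` is made up for illustration, and the sample string is shortened from the first entry above.

```python
import re

# Assumption: the syslog daemon rewrote the ESC byte (octal 033) as the literal
# text "#033", so a color code such as ESC[36m appears as "#033[36m".
ESCAPED_ANSI = re.compile(r"#033\[[0-9;]*m")


def strip_escaped_ansi(line: str) -> str:
    """Remove syslog-escaped ANSI color sequences from one log line."""
    return ESCAPED_ANSI.sub("", line)


# Shortened sample based on the first log entry in this post.
sample = (
    "pmm-agent[3280]: #033[36mINFO#033[0m[2022-07-13T01:23:00.027+02:00] "
    "Sending 14 buckets. #033[36mcomponent#033[0m=agent-builtin"
)
print(strip_escaped_ansi(sample))
```

This only removes the color markers; the nested `time=... level=... msg=...` part emitted by postgres_exporter is left as it is.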

Log entries such as “Error opening connection to database”, “i/o timeout: too many connection retries”, “error when scraping”, and “Proceeding with outdated query maps, as the Postgres version could not be determined: error scanning version string on” repeat for a few minutes. Then the OOM killer is invoked and kills processes one after another, including postgres. The machine then dies. This has happened 4 times in the last month. A week ago I disabled the PMM service on this server, and it has worked fine since.

This machine has 512 GB of RAM. HugePages take up 354 GB; the rest is left for the operating system. Do you have any idea what is causing the failure?
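To put numbers on that split: memory reserved for HugePages is not available to ordinary process allocations, so everything else on the box (the OS, per-backend PostgreSQL memory, page cache, pmm-agent and its exporters) shares what is left. A rough Python sketch of the arithmetic follows. The function name is hypothetical, the 2 MB page size is an assumption (the usual x86_64 default, not stated in the post), and GB is treated as GiB.

```python
def hugepages_budget(total_gb: int, hugepages_gb: int, page_size_mb: int = 2):
    """Return (number of huge pages, GB left outside the HugePages pool).

    Assumes a 2 MB huge page size unless told otherwise; the post does not
    say which page size is configured.
    """
    pages = hugepages_gb * 1024 // page_size_mb
    remaining_gb = total_gb - hugepages_gb
    return pages, remaining_gb


# Figures from the post: 512 GB total, 354 GB reserved for HugePages.
pages, remaining_gb = hugepages_budget(512, 354)
print(f"{pages} huge pages, {remaining_gb} GB left for everything else")
```

So roughly 158 GB is what non-HugePages consumers have to fit into before the OOM killer gets involved.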

Agustin_G · July 28, 2022, 12:17am

Hi Coldy, welcome to the forums!

I wonder if this is linked to the following already-fixed bug in any way:
https://jira.percona.com/browse/PMM-8646

It would be helpful if you could create a new bug report (https://jira.percona.com/projects/PMM/issues) and upload the full PostgreSQL and PMM client logs, along with the configurations used, so we can check. If needed, you can ask for a way to upload the logs privately, or redact them as you did here.
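If you go the redaction route, a small script can mask passwords in connection URLs before you attach the logs. This is only a sketch: the function name is hypothetical, the sample line uses made-up credentials and host, and it handles URL-style DSNs only (`scheme://user:password@host`). Hostnames and other identifying details would still need masking by hand.

```python
import re

# Matches "scheme://user:" followed by a password, up to the "@" before the host.
DSN_PASSWORD = re.compile(r"(\w+://[^:/@\s]+:)[^@\s]+(@)")


def redact_dsn_passwords(text: str, placeholder: str = "PASSWORD_REMOVED") -> str:
    """Replace the password in URL-style DSNs with a placeholder."""
    return DSN_PASSWORD.sub(lambda m: m.group(1) + placeholder + m.group(2), text)


# Made-up example line; credentials and host are illustrative only.
line = (
    "Error opening connection to database "
    "(postgres://pgpool:s3cret@dbhost:5432/app?sslmode=disable)"
)
print(redact_dsn_passwords(line))
```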
