PMM linux:metrics service causing abnormal load on client machine


PMM version: 1.9.0
The PMM linux:metrics service is causing very high CPU load on a client machine; however, the SAR report doesn’t show a corresponding load.
The load goes up to 500-1000 (as reported by the "w" command).

At the same time, the “Load Average” graph on the PMM “System Overview” dashboard shows the same load (900+), after which the “System Overview” graphs stopped populating.
We have multiple machines configured for monitoring with PMM, and the problem occurs only on this one client.

The moment the PMM linux:metrics service is stopped, the load comes back down to normal.
We started and stopped PMM linux:metrics multiple times to confirm the issue.

Attached are the pmm-server.log and pmm-client.log files.

pmm-server.txt (2.01 KB)

pmm-client.txt (5.07 KB)

Hi vaibhav_upadhyay40

I didn’t see anything that jumps out from the command line output you shared, however I see that linux:metrics was down when you performed the data collection, so we don’t have a full capture of the poor performance behaviour.

Can I suggest you start linux:metrics on one server and then, after a period of time, share a snapshot of the Prometheus Exporter Status dashboard for the host where linux:metrics is running? It should then tell us which submodule is consuming the bulk of the CPU time and causing the high load.

Snapshots are a new feature as of PMM 1.9; please see the documentation here:

Hi Michael, the issue is resolved. It was caused by the NFS service. PMM has been working fine since we fixed the NFS service issue. Thanks once again.
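
For anyone who lands here with the same symptom (a huge load average while SAR shows little CPU usage): on Linux the load average also counts tasks in uninterruptible sleep (state D), and processes blocked on an unresponsive NFS mount typically sit in that state. The thread does not confirm that this was the exact mechanism here, so treat the following as a minimal sketch for checking it, not as an official PMM diagnostic. The function names are my own, and it only inspects the main thread of each process.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages as floats."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for every process whose main thread is in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable while scanning
        if state == "D":
            found.append((pid, comm))
    return found


def main():
    if not os.path.exists("/proc/loadavg"):
        print("no /proc/loadavg here; this sketch only works on Linux")
        return
    with open("/proc/loadavg") as fh:
        print("load average (1m, 5m, 15m):", parse_loadavg(fh.read()))
    for pid, comm in uninterruptible_tasks():
        print(pid, comm, "is in uninterruptible sleep (D)")


if __name__ == "__main__":
    main()
```

If the list of D-state processes grows while linux:metrics is running and the names point at the exporter or at anything touching a network mount, checking the health of the NFS mounts on that client is a reasonable next step.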