PMM Performance

Hi,

I’ve been running PMM for several months on a Hyper-V guest, with data hosted on a RAID 10 array of four 10k disks. It struggled a bit with IO, but usually kept up and didn’t lose data. In recent weeks it has been struggling more and holes have been appearing in the data, so I took the opportunity to build a new server on dedicated tin.

The new server has a RAID 10 array of six disks. It is still a VM, but it has a direct LUN attachment to avoid disk alignment issues. It has 28GB of RAM assigned and all CPU cores, yet Prometheus only seems to be using a couple of gigabytes.

PMM is the latest build from Docker, installed on CentOS 7.

There are six Percona servers reporting in, from two PXC clusters, all running the latest PMM Client.

Spec below:

[Screenshot: new server spec]

The new PMM ran OK for a couple of days, hitting write IO hard as before. After that, read IO went high as well, and now we are seeing missing data.

[Screenshot]
Prometheus stats below

[Screenshot: Prometheus stats]

I have removed mysql:queries, but this doesn’t seem to have helped (the slow log was off anyway, so possibly it wasn’t a factor). Here’s an example of the client config of one cluster.
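
For reference, the mysql:queries removal itself is just a single pmm-admin call on each client node (a sketch, assuming the same PMM 1.x pmm-admin syntax as the commands further down):

    sudo pmm-admin remove mysql:queries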

Any suggestions on what I can do to get some performance back? Looking at some other threads, 157k may be a lot of “Time Series” for six servers. Is there a way to just make it collect less data?

I’ve potentially fixed this myself, based on what I saw in two other articles:

  1. Increased the memory allocated to Prometheus (256MB by default, now raised to 4GB); see the sketch after this list.
  2. Turned off table stats by removing and re-adding mysql:metrics:
    sudo pmm-admin remove mysql:metrics
    sudo pmm-admin add mysql:metrics --user pmm-mysql --password whateveryourpasswordis --disable-tablestats
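
For step 1, here is a minimal sketch of how that memory bump is typically applied on a Docker-based PMM Server 1.x; the pmm-server/pmm-data container names assume the standard layout from the install docs, and METRICS_MEMORY is specified in kilobytes (4194304 KB = 4GB):

    # Remove the old container; metrics data lives in the pmm-data volume container
    sudo docker stop pmm-server
    sudo docker rm pmm-server
    # Re-create it with more memory assigned to Prometheus
    sudo docker run -d -p 80:80 \
      --volumes-from pmm-data \
      --name pmm-server \
      -e METRICS_MEMORY=4194304 \
      --restart always \
      percona/pmm-server:latest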

The time series count is now a tenth of what it was, and disk IO has dropped to something much more manageable.

I think the installation instructions for PMM Client need to include --disable-tablestats, as I’m sure everyone will end up doing the same thing, or at least mention that it is recommended for large deployments (which I’m sure is largely the case with PXC users).

PS: do we really need a CAPTCHA for every post and edit, given we’re all authenticated (and the first post is apparently vetted)?

Hi RichardGriffiths,

  1. Your tuning approach is correct; increasing the memory for the PMM Server container is the first step to ensure Prometheus can cache incoming scrapes in memory.
  2. Disabling table stats is an acceptable step to throttle the volume of data going to disk. Another approach would be to set the scrape_interval in /etc/prometheus.yml to something like 5s, so that you still collect all of the metrics, just at a lower resolution; see the sketch after this list.
  3. I’ve filed a request with our web team about the CAPTCHA requirement. As a moderator I’m also required to complete it for each post, and yes, it is a little excessive. Watch for some improvement shortly. :slight_smile:
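
For item 2, a rough sketch of what that change could look like in /etc/prometheus.yml inside the PMM Server container; the job name and exact layout vary between PMM releases, so treat this as illustrative only:

    global:
      scrape_interval: 5s      # scrape less often than the stock config to cut sample volume
      scrape_timeout: 4s
      evaluation_interval: 5s

    scrape_configs:
      - job_name: mysql        # per-job intervals, where present, override the global value
        scrape_interval: 5s

Prometheus then needs a restart (or a SIGHUP to reload its config) for the new intervals to take effect.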

Hello Richard Griffiths, Michael.
Captcha is no longer required, keep posting! :wink: