Hi,
I’ve been running PMM for several months on a Hyper-V guest. Data was hosted on a RAID10 array of four 10k disks. It struggled a bit with IO, but usually kept up and didn’t lose data. In recent weeks it struggled more and holes started appearing in the data, so I took the opportunity to create a new server on dedicated tin.
The new server has a RAID10 array of 6 disks. It is still a VM, however it has a direct LUN attachment to avoid disk alignment issues. It has 28GB of RAM assigned and all CPU cores. Prometheus only seems to be using a couple of gig.
PMM is the latest build from Docker, installed on CentOS 7.
There are six Percona servers reporting in, from two PXC clusters, and all are on the latest PMM-Client.
Spec below:
[IMG2=JSON]{“data-align”:“none”,“data-size”:“full”,“src”:“https://www.percona.com/forums/filedata/fetch?filedataid=687&type=thumb”}[/IMG2]
The new PMM ran OK for a couple of days, with heavy write IO as before. Since then, however, read IO has also gone high, and we are now seeing data missing.
[IMG2=JSON]{“data-align”:“none”,“data-size”:“full”,“src”:“https://www.percona.com/forums/filedata/fetch?filedataid=688&type=thumb”}[/IMG2]
Prometheus stats below:
[IMG2=JSON]{“data-align”:“none”,“data-size”:“full”,“src”:“https://www.percona.com/forums/filedata/fetch?filedataid=689&type=thumb”}[/IMG2]
I have removed mysql:queries, but this doesn’t seem to have helped (slowlog was off anyway, so possibly this wasn’t a factor). Here’s an example of the client config for one cluster.
Any suggestions on what I can do to get some performance back? Looking at some other threads, 157k may be a lot of “Time Series” for six servers. Is there a way to just make it collect less data?
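In case it helps with diagnosis, here is a rough sketch of how the 157k figure could be broken down per scrape job, to see which exporter contributes most of the series. It is not taken from PMM itself. It assumes Prometheus is reachable under /prometheus on the PMM server without extra authentication; the host name is a placeholder, and the function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: placeholder host; Prometheus is assumed to sit under /prometheus.
PROMETHEUS_URL = "http://pmm-server.example/prometheus"

# PromQL: count every current series, grouped by scrape job.
QUERY = 'count by (job) ({__name__=~".+"})'


def series_per_job(payload):
    """Turn an /api/v1/query instant-vector response into {job: series_count}."""
    counts = {}
    for sample in payload["data"]["result"]:
        job = sample["metric"].get("job", "<none>")
        # Each value is [timestamp, "number-as-string"].
        counts[job] = int(float(sample["value"][1]))
    return counts


def fetch(base_url=PROMETHEUS_URL, query=QUERY):
    """Run the query against the Prometheus HTTP API and return parsed JSON."""
    url = base_url + "/api/v1/query?" + urllib.parse.urlencode({"query": query})
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def report(counts):
    """Format the counts as lines, largest job first."""
    ordered = sorted(counts.items(), key=lambda kv: -kv[1])
    return [f"{n:>8}  {job}" for job, n in ordered]
```

Running `print("\n".join(report(series_per_job(fetch()))))` against the server should list the jobs by series count. Note that a match-everything query like this is itself fairly heavy, so it is probably best run once rather than on a schedule.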