Problems after upgrade from 1.1.5 to 1.2.0

Aleksey,

It must be luck… :slight_smile:

In any case I appreciate all the time and effort you put into helping us to make PMM better.

Let me ask a basic question: you did a basic install of PMM on an m4.xlarge instance (4 vCPU and 16 GB of memory), and you did not do any special configuration, right?
How do you have your EBS configured?

When you add one node, what is it? A MySQL server? Did you disable any special options?

Once you enable it, can you upload a screenshot of how your Prometheus dashboard looks, and describe what you are seeing that is wrong? Note that, among other things, the Prometheus dashboard should show CPU and memory usage by the Prometheus process.

Hi Aleksey and Peter !

I had the same problem after upgrading from 1.1.5 to 1.2.0 (on both the server and a few clients) yesterday. I started to get gaps in the graphs immediately. I reverted the server to 1.1.5 this morning, and the gaps have disappeared on most of the graphs (at least I think so). I am not sure about InnoDB Log Buffer Performance, for example.
The difference between my setup and Aleksey’s is that we run on physical hardware (lots of RAM and SSD RAID) in our own datacentre.
We haven’t reinstalled any clients and haven’t noticed anything special with CPU usage on the server.

So yes, something seems to be weird with version 1.2.0

BR
Johan

1. “so you do basic install of PMM on m4.xlarge instance (4vCPU and 16GB of memory)”

Yes

2. “You do not do any special configuration right ?”

  • the instance was upgraded to the latest CentOS 7

  • docker was moved from /var/lib/docker to /data/docker using a symlink:

ls -l /data/docker/

drwx------ 5 root root 222 Aug 6 16:54 containers
drwx------ 3 root root 21 Aug 5 18:16 image
drwxr-x--- 3 root root 19 Aug 5 18:16 network
drwx------ 27 root root 4096 Aug 6 16:54 overlay
drwx------ 4 root root 32 Aug 5 18:16 plugins
drwx------ 2 root root 6 Aug 5 18:16 swarm
drwx------ 2 root root 6 Aug 6 16:54 tmp
drwx------ 2 root root 6 Aug 5 18:16 trust
drwx------ 6 root root 313 Aug 5 18:28 volumes
[root@MySQL-PMMC ~]# ls -l /var/lib/ | grep docker
lrwxrwxrwx 1 root root 12 Aug 5 18:25 docker -> /data/docker

3. “How do you have your EBS configured ?”

Volume 1 - Volume type “gp2”, IOPS 100/3000, Mountpoint “/”, Size 20 GB, Filesystem XFS, Not Encrypted
Volume 2 - Volume type “gp2”, IOPS 330/3000, Mountpoint “/data”, Size 110 GB, Filesystem XFS, Encrypted
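
As a rough illustration of where the 100 and 330 baseline figures come from: gp2 volumes get a baseline of 3 IOPS per GiB with a floor of 100, and smaller volumes can burst to 3000. A minimal sketch of that arithmetic, assuming those rules and ignoring the upper cap for very large volumes (the function name is hypothetical):

```shell
# Hypothetical helper: gp2 baseline IOPS for a volume size given in GiB.
# Assumes 3 IOPS per GiB with a floor of 100; the cap for very large
# volumes is deliberately ignored in this sketch.
gp2_baseline_iops() {
  size_gib=$1
  iops=$((size_gib * 3))
  if [ "$iops" -lt 100 ]; then
    iops=100
  fi
  echo "$iops"
}

gp2_baseline_iops 20    # root volume: prints 100
gp2_baseline_iops 110   # /data volume: prints 330
```

So the Prometheus data on /data has a baseline of 330 IOPS once the burst credits are used up.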

4. “When you’re adding one node - what is it ? MySQL Server ?”

Yes, it is Percona Server for MySQL 5.7.18, with 60 databases (this changes all the time) and 3500 tables for now.

5. “Any special options you disable”

Some disabled, some enabled.

Docker pmm setup:

docker run -d \
  -p 80:80 \
  --volumes-from pmm-data \
  --name pmm-server \
  --restart always \
  --env TZ="Europe/Kiev" \
  --env METRICS_RESOLUTION=5s \
  percona/pmm-server:latest

Client setup:

yum install -y pmm-client
pmm-admin config --server pmm.srv --client-name mysql.db
pmm-admin add linux:metrics
pmm-admin add mysql:metrics --disable-tablestats-limit 3000
pmm-admin add mysql:queries --query-source slowlog
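
Worth noting for this setup: with about 3500 tables and --disable-tablestats-limit 3000, the table count is above the limit. A minimal sketch of that comparison, assuming the limit means that table statistics are switched off once the table count exceeds it (the function name is hypothetical and nothing here calls pmm-admin):

```shell
# Hypothetical helper: would table statistics stay enabled?
# Assumes they are disabled when the table count is greater than the limit.
tablestats_state() {
  table_count=$1
  limit=$2
  if [ "$table_count" -gt "$limit" ]; then
    echo disabled
  else
    echo enabled
  fi
}

tablestats_state 3500 3000   # table count and limit from this thread
```

Under that assumption, per-table metrics would not be collected for this server until the limit is raised above the table count.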

6. “And what wrong you’re seeing ? Note among other things Prometheus dashboard should show CPU and memory usage by prometheus process.”

I will answer this later.

I tried to find what you asked for in question 6 in Prometheus on PMM 1.1.5, without success :). I upgraded to 1.2.0 (the problems came back) and was unsuccessful again. Can you show me a screenshot or instructions for what exactly I have to find in Prometheus and share? A screenshot would be better.

FWIW, I am also seeing gaps and no value for “Current QPS” after upgrading from 1.1.5 to 1.2.

Wow, I’m not alone :slight_smile:
https://www.percona.com/forums/questions-discussions/percona-monitoring-and-management/48687-prometheus-high-cpu
https://www.percona.com/forums/questions-discussions/percona-monitoring-and-management/49047-pmm-1-2-0-a-lot-of-data-is-not-shown

I will try the advice from the related post: “METRICS_MEMORY=786432”.

Thank you. Yes, please try increasing Metrics Memory. I’m still puzzled why it can’t handle even a single MySQL server, especially considering you’re using a 5-second resolution.
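
To put the resolution in numbers: a coarser resolution means fewer scrapes for Prometheus to ingest. A rough sketch, assuming the PMM default METRICS_RESOLUTION is 1s (the helper name is hypothetical):

```shell
# Hypothetical helper: scrapes per hour per target for a resolution
# given in whole seconds.
scrapes_per_hour() {
  resolution_s=$1
  echo $((3600 / resolution_s))
}

scrapes_per_hour 1   # assumed default resolution: prints 3600
scrapes_per_hour 5   # resolution used in this thread: prints 720
```

At 5s the server ingests one fifth of the samples it would at 1s, which is why a single MySQL node should not be a heavy load.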

Increasing “Metrics Memory” didn’t help in my case.

Hm,

OK, can you upload an image of what you’re seeing on your Prometheus dashboard, such as this one:
https://pmmdemo.percona.com/graph/dashboard/db/prometheus?refresh=1m&orgId=1

docker run -d -p 80:80 --volumes-from pmm-data --name pmm-server -e SERVER_USER=pmm -e SERVER_PASSWORD=123456 -e METRICS_MEMORY=786432 --restart always --init percona/pmm-server:1.2.0

Shame on me. I should have read the previous topics more carefully.
Yes, METRICS_MEMORY resolves the problem with gaps. I think it would be a good idea to fix this in the next release, or to mention it in the documentation.

docker run -d \
  -p 80:80 \
  --volumes-from pmm-data \
  --name pmm-server \
  --restart always \
  --env TZ="Europe/Kiev" \
  --env METRICS_RESOLUTION=5s \
  --env METRICS_MEMORY=7864320 \
  percona/pmm-server:latest
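
For anyone sizing this value: a small sketch of the unit conversion, assuming METRICS_MEMORY is expressed in kilobytes (the helper names are hypothetical). Note that the value in the command above is ten times the 786432 suggested in the related post.

```shell
# Hypothetical helpers; both assume METRICS_MEMORY is given in kilobytes.
kb_to_mb() {
  echo $(($1 / 1024))
}

mb_to_kb() {
  echo $(($1 * 1024))
}

kb_to_mb 786432    # value from the related post: prints 768
kb_to_mb 7864320   # value in the command above: prints 7680
mb_to_kb 4096      # a 4 GB limit expressed in kilobytes: prints 4194304
```

On a 16 GB instance, 7680 MB leaves roughly half of the memory for everything else in the container.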

So, the biggest problem is resolved.
liuqian, roma.novikov, Mykola, Peter - thanks guys!
Only one problem remains unresolved: “no value” in “Current QPS” on “MySQL Overview”.

Hi all,

Setting METRICS_MEMORY seems to have fixed the gaps for me. I am happily running 1.2.0 now and have also updated my docs.

BR
Johan

aleksey.filippov, I think the 5-second resolution is the core of the “QPS problem”. I created “[PMM-1275] QPS SingleStat broken with scrape_interval: 5s” in Percona JIRA, and we’ll take a look at what we can do about it.

Thanks, Roman. Waiting for result :slight_smile:

fwiw, I also have resolution set to 5 seconds

Hi,

Edit the graph

replace:

rate(mysql_global_status_queries{instance="$host"}[1s]) or irate(mysql_global_status_queries{instance="$host"}[2s])

with

rate(mysql_global_status_queries{instance="$host"}[1s]) or irate(mysql_global_status_queries{instance="$host"}[5m])
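
A likely reason the [2s] window shows “no value” at a 5-second resolution: irate() needs at least two samples inside its range window, and a 2s window cannot hold two samples that are 5s apart. A minimal sketch of that arithmetic, assuming evenly spaced scrapes (the helper names are hypothetical, not part of PMM or Prometheus):

```shell
# Hypothetical helper: number of evenly spaced samples guaranteed to fall
# inside a range window (integer division, both arguments in whole seconds).
samples_in_window() {
  window_s=$1
  interval_s=$2
  echo $((window_s / interval_s))
}

# irate() needs at least two samples in the window to return a value.
irate_has_data() {
  if [ "$(samples_in_window "$1" "$2")" -ge 2 ]; then
    echo yes
  else
    echo no
  fi
}

irate_has_data 2 5     # [2s] window, 5s scrape interval: prints no
irate_has_data 300 5   # [5m] window, 5s scrape interval: prints yes
```

The same reasoning applies to any window shorter than twice the scrape interval, so a fixed [5m] fallback keeps working whatever resolution is configured.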

that works, thx

Great! Thank you for your contributions and please continue reporting if you find stuff which is broken!

Thanks Peter, it really helped. One question: will these problems (both the gaps and QPS) be fixed in the next release?

Hi aleksey.filippov, we expect to have this bug resolved in our 1.2.2 release, which should arrive in either late August or early September. You can follow the ticket here: “[PMM-1277] Current QPS Graph has inappropriate prometheus Query” in Percona JIRA.