Empty data on home-dashboard

Myles · March 5, 2018, 5:17am

Hi, I’m new to PMM. I just set it up to monitor two MySQL servers, following the same installation process for both. One machine works well, but the other has no data at all for its whole row on the home-dashboard page. On other pages such as “mysql-overview”, data is complete for both MySQL instances.

I ran “check-network” and everything is OK, with latencies <10ms. I also checked the “pmm-*.log” files on the client machine, as well as the log files inside the PMM server Docker container. The only suspicious thing is in the “prometheus.log” file in the server container:

time="2018-03-05T10:41:45Z" level=warning msg="Error on ingesting out-of-order samples" numDropped=42 source="scrape.go:534"
time="2018-03-05T10:41:45Z" level=warning msg="Scrape health sample discarded" error="sample timestamp out of order" sample=up{instance="db-dedicated.ff.digital", job="mysql"} => 1 @[1520246504.445] source="scrape.go:587"
time="2018-03-05T10:41:45Z" level=warning msg="Scrape duration sample discarded" error="sample timestamp out of order" sample=scrape_duration_seconds{instance="db-dedicated.ff.digital", job="mysql"} => 1.441987055 @[1520246504.445] source="scrape.go:590"
time="2018-03-05T10:41:45Z" level=warning msg="Scrape sample count sample discarded" error="sample timestamp out of order" sample=scrape_duration_seconds{instance="db-dedicated.ff.digital", job="mysql"} => 1.441987055 @[1520246504.445] source="scrape.go:593"
time="2018-03-05T10:41:45Z" level=warning msg="Scrape sample count post-relabeling sample discarded" error="sample timestamp out of order" sample=scrape_duration_seconds{instance="db-dedicated.ff.digital", job="mysql"} => 1.441987055 @[1520246504.445] source="scrape.go:596"
time="2018-03-05T10:41:46Z" level=warning msg="Error on ingesting out-of-order samples" numDropped=42 source="scrape.go:534"
time="2018-03-05T10:41:46Z" level=warning msg="Scrape health sample discarded" error="sample timestamp out of order" sample=up{instance="db-dedicated.ff.digital", job="mysql"} => 1 @[1520246506.364] source="scrape.go:587"
time="2018-03-05T10:41:46Z" level=warning msg="Scrape duration sample discarded" error="sample timestamp out of order" sample=scrape_duration_seconds{instance="db-dedicated.ff.digital", job="mysql"} => 0.625042525 @[1520246506.364] source="scrape.go:590"
time="2018-03-05T10:41:46Z" level=warning msg="Scrape sample count sample discarded" error="sample timestamp out of order" sample=scrape_duration_seconds{instance="db-dedicated.ff.digital", job="mysql"} => 0.625042525 @[1520246506.364] source="scrape.go:593"
time="2018-03-05T10:41:46Z" level=warning msg="Scrape sample count post-relabeling sample discarded" error="sample timestamp out of order" sample=scrape_duration_seconds{instance="db-dedicated.ff.digital", job="mysql"} => 0.625042525 @[1520246506.364] source="scrape.go:596"
time="2018-03-05T10:41:51Z" level=warning msg="Error on ingesting out-of-order samples" numDropped=42 source="scrape.go:534"
time="2018-03-05T10:41:51Z" level=warning msg="Scrape health sample discarded" error="sample timestamp out of order" sample=up{instance="db-dedicated.ff.digital", job="mysql"} => 1 @[1520246511.373] source="scrape.go:587"
time="2018-03-05T10:41:51Z" level=warning msg="Scrape duration sample discarded" error="sample timestamp out of order" sample=scrape_duration_seconds{instance="db-dedicated.ff.digital", job="mysql"} => 0.45708211 @[1520246511.373] source="scrape.go:590"
time="2018-03-05T10:41:51Z" level=warning msg="Scrape sample count sample discarded" error="sample timestamp out of order" sample=scrape_duration_seconds{instance="db-dedicated.ff.digital", job="mysql"} => 0.45708211 @[1520246511.373] source="scrape.go:593"
time="2018-03-05T10:41:51Z" level=warning msg="Scrape sample count post-relabeling sample discarded" error="sample timestamp out of order" sample=scrape_duration_seconds{instance="db-dedicated.ff.digital", job="mysql"} => 0.45708211 @[1520246511.373] source="scrape.go:596"

But when I checked the system time on all MySQL servers, the PMM server container, and the Docker host machine, it was correct everywhere. Does anyone have an idea?

AFAIK, the only difference between the two MySQL servers is the network. One is on the same internal network as the PMM server, while the other (the one with no data) is not, and it has a slightly different firewall config. But I’m sure all required ports are open.
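One way to double-check the clocks from the log itself is to compare each line's `time=` field with the epoch timestamp in square brackets after the sample value. The following is only a minimal sketch, not part of PMM: the helper name `scrape_skew_seconds` is hypothetical, and the line layout is assumed to match the excerpt quoted in this post.

```python
import html
import re
from datetime import datetime, timezone

# Captures the log time and the sample's epoch timestamp, e.g.
# time="2018-03-05T10:41:45Z" ... @[1520246504.445]
LINE_RE = re.compile(r'time="([^"]+)".*@\[(\d+(?:\.\d+)?)\]')


def scrape_skew_seconds(line: str) -> float:
    """Return (log time - sample timestamp) in seconds for one warning line.

    Hypothetical helper for illustration; assumes the log layout shown above.
    """
    # A forum paste may encode "@" as "&#64;", so unescape before matching.
    match = LINE_RE.search(html.unescape(line))
    if match is None:
        raise ValueError("no time/sample timestamp pair found in line")
    logged_at = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    sample_at = datetime.fromtimestamp(float(match.group(2)), tz=timezone.utc)
    return (logged_at - sample_at).total_seconds()


line = (
    'time="2018-03-05T10:41:45Z" level=warning msg="Scrape health sample discarded" '
    'error="sample timestamp out of order" '
    'sample=up{instance="db-dedicated.ff.digital", job="mysql"} => 1 '
    '&#64;[1520246504.445] source="scrape.go:587"'
)
print(f"{scrape_skew_seconds(line):.3f} s")
```

For the quoted line the difference comes out at about half a second, which is consistent with the clocks being in sync and suggests the warnings are not caused by clock drift.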

lorraine.pocklington · March 15, 2018, 3:48am

Hi there, sorry for the delay. Are you still having this problem?

Can I just check that you ran through the troubleshooting advice in “Troubleshooting Percona Monitoring and Management (PMM) Metrics - Percona Database Performance Blog” (https://www.percona.com/blog/2018/01...t-pmm-metrics/)?
There’s also a section on checking the network in the Percona Monitoring and Management documentation.

Meanwhile I will see if I can find any other information for you.

Roma_Novikov · March 15, 2018, 5:11am

Myles, it looks like a date/time problem, so it is worth checking the time and time synchronization on your server and client.
Also, please check the https://YOURSERVER/prometheus/targets page: are there any errors?
The https://YOURSERVER/graph/dashboard/db/prometheus dashboard can also provide some information about the problem.

Myles · March 16, 2018, 12:31am

Thank you but I’m afraid it does not solve the problem. I followed the troubleshooting and everything seems good.

Myles · March 16, 2018, 12:45am

Hi Roma,

I re-checked the servers and they both have time in sync.
All Prometheus targets are UP, and the “last scrape” times for the two “linux” targets are within 3 seconds.
The Prometheus dashboard shows it is using a very low percentage of resources (<10% CPU and memory, etc.); everything looks good.

I can also see the following when I visit https://<PMM_client_ip>:42000/metrics, which means the client is providing Linux related metrics correctly.

…

# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="guest"} 0
node_cpu{cpu="cpu0",mode="idle"} 5.589322065e+07
node_cpu{cpu="cpu0",mode="iowait"} 2.85612548e+06
node_cpu{cpu="cpu0",mode="irq"} 1112.9
node_cpu{cpu="cpu0",mode="nice"} 1129.16
node_cpu{cpu="cpu0",mode="softirq"} 30628.39
node_cpu{cpu="cpu0",mode="steal"} 0
node_cpu{cpu="cpu0",mode="system"} 149943.6
node_cpu{cpu="cpu0",mode="user"} 2.1839946e+06
node_cpu{cpu="cpu1",mode="guest"} 0
node_cpu{cpu="cpu1",mode="idle"} 5.618002032e+07
node_cpu{cpu="cpu1",mode="iowait"} 2.57305446e+06
node_cpu{cpu="cpu1",mode="irq"} 1115.82
node_cpu{cpu="cpu1",mode="nice"} 1070.64
node_cpu{cpu="cpu1",mode="softirq"} 28261.28
node_cpu{cpu="cpu1",mode="steal"} 0
node_cpu{cpu="cpu1",mode="system"} 150987.33
node_cpu{cpu="cpu1",mode="user"} 2.18997028e+06
node_cpu{cpu="cpu2",mode="guest"} 0
node_cpu{cpu="cpu2",mode="idle"} 5.854144019e+07
node_cpu{cpu="cpu2",mode="iowait"} 459594.04
node_cpu{cpu="cpu2",mode="irq"} 1125.94
node_cpu{cpu="cpu2",mode="nice"} 475.36
node_cpu{cpu="cpu2",mode="softirq"} 21815.05
node_cpu{cpu="cpu2",mode="steal"} 0
node_cpu{cpu="cpu2",mode="system"} 130786.26
node_cpu{cpu="cpu2",mode="user"} 2.00093037e+06
node_cpu{cpu="cpu3",mode="guest"} 0
node_cpu{cpu="cpu3",mode="idle"} 5.861535097e+07
node_cpu{cpu="cpu3",mode="iowait"} 403150.54
node_cpu{cpu="cpu3",mode="irq"} 1122.46
node_cpu{cpu="cpu3",mode="nice"} 587.5
node_cpu{cpu="cpu3",mode="softirq"} 21959.08
node_cpu{cpu="cpu3",mode="steal"} 0
node_cpu{cpu="cpu3",mode="system"} 124979.33
…
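To confirm that output like this is well-formed and not just present, the text can be parsed into samples and summarised. The following is a rough sketch only: it handles labelled sample lines of the shape shown above (one sample per line, straight quotes), the function names are hypothetical, and the sample text is a short excerpt of the values in this post, not a full scrape.

```python
import re
from collections import defaultdict

# metric{label="value",...} number  (only labelled samples are handled here)
SAMPLE_RE = re.compile(r'^(\w+)\{([^}]*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Yield (metric, labels, value) for each labelled sample line."""
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines such as "# TYPE ...".
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match:
            labels = dict(LABEL_RE.findall(match.group(2)))
            yield match.group(1), labels, float(match.group(3))


def node_cpu_totals_by_mode(text):
    """Sum node_cpu counters across CPUs, grouped by the "mode" label."""
    totals = defaultdict(float)
    for metric, labels, value in parse_samples(text):
        if metric == "node_cpu":
            totals[labels["mode"]] += value
    return dict(totals)


# Excerpt of the values quoted in this post.
sample = """\
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="idle"} 5.589322065e+07
node_cpu{cpu="cpu0",mode="user"} 2.1839946e+06
node_cpu{cpu="cpu1",mode="idle"} 5.618002032e+07
node_cpu{cpu="cpu1",mode="user"} 2.18997028e+06
"""
totals = node_cpu_totals_by_mode(sample)
for mode, total in sorted(totals.items()):
    print(mode, total)
```

If every expected mode shows up with a growing value between two fetches, the exporter side is healthy and the problem is more likely on the server or dashboard side.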

Myles · March 16, 2018, 1:12am

I think I found my own answer, and it looks like a bug. The problematic part is on the page at http:///graph/dashboard/db/home-dashboard?orgId=1, which is the only page I found with a problem; nothing else is wrong. When I click on the big “no value” box, it takes me to http:///graph/dashboard/db/system-overview?from=now-12h&to=now&var-interval=$__auto_interval&var-host=All&orgId=1, which also displays no value.
What I noticed is “var-host=All” in the second URL, which means no specific host is selected, so I believe the second page is behaving as intended.
Finally, I solved my problem by changing the “client_name” value in pmm.yml on the client machine, as in the attached screenshot: “db-0.fake.domain” is the old name, and I changed it to “db-dedicated”.

Conclusion: I believe it is a bug. When client_name (the host) contains a dot (.), the dashboard landing page does not recognise it as the “var-host” value when fetching data from the backend, so it displays “no value” and provides a wrong link.
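A quick way to spot affected clients before they hit this is to check the configured name for dots. The following is a minimal sketch under stated assumptions: it assumes pmm.yml holds a plain `client_name: value` line, the helper names are hypothetical, and the example text is made up for illustration, not a real config file.

```python
import re

# Assumed layout: a top-level "client_name: value" line, optionally quoted.
CLIENT_NAME_RE = re.compile(
    r'^\s*client_name\s*:\s*["\']?([^"\'#\s]+)', re.MULTILINE
)


def find_client_name(config_text):
    """Return the client_name value from pmm.yml-style text, or None."""
    match = CLIENT_NAME_RE.search(config_text)
    return match.group(1) if match else None


def has_dot(name):
    """True if the name contains a dot, the case reported as broken here."""
    return "." in name


# Illustrative content only; the old name is the one quoted in this post.
example = "client_name: db-0.fake.domain\n"
name = find_client_name(example)
if name is not None and has_dot(name):
    print(f"client_name {name!r} contains a dot; the home dashboard may show no value")
```

The check only reports the condition; choosing the replacement name (here “db-dedicated”) is left to the operator.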

jpage · March 16, 2018, 2:58am

Same problem here with a client with a dot (.) in its name.

lorraine.pocklington · March 16, 2018, 5:33am

Hi there, thanks for this input, I am just going to bring it to the attention of the PMM team so they can check it out.
Appreciated!

vvsaxena · March 16, 2018, 8:34am

Yeah, it’s a bug. I also tested it, and it works fine as long as the name used as “client_name” in the pmm.yml file on the PMM client machine (for example an FQDN) does not contain a dot “.”. Re-install everything with this fix and it should work.

lorraine.pocklington · March 16, 2018, 8:53am

Hi all, thanks for all your input. This has been registered as a bug, and I’ll update as soon as there’s a status change.
