Grafana graph works only sporadically

shoman · October 26, 2016, 5:40am

I have installed PMM 1.0.5 and added some hosts with linux:metrics and mysql:metrics, but Grafana shows the graphs only sporadically.

At the times when I have no graphs, the client status is DOWN.


SERVICE TYPE   NAME    REMOTE ENDPOINT        STATUS
-------------- ------- ---------------------- -------
linux:metrics  db-1-1  144.76.15.145:42000    DOWN
mysql:metrics  db-1-1  144.76.15.145:42002    DOWN

I run pmm-server as follows to prevent sync issues.

docker run -d \
-p 44444:443 \
-e METRICS_RESOLUTION=5s \
--volumes-from pmm-data \
--name pmm-server \
-v /usr/share/pmm:/etc/nginx/ssl \
--restart always \
percona/pmm-server:1.0.5

But I don’t think this is the problem:

Connection duration | 323.078µs
Request duration | 4.35313ms
Full round trip | 4.676208ms

PING db-1-1 (144.76.15.145) 56(84) bytes of data.
64 bytes from db-1-1 (x.x.x.x): icmp_seq=1 ttl=60 time=0.311 ms
64 bytes from db-1-1 (x.x.x.x): icmp_seq=2 ttl=60 time=0.333 ms
64 bytes from db-1-1 (x.x.x.x): icmp_seq=3 ttl=60 time=0.684 ms

Server side: Wed Oct 26 10:56:51 UTC 2016
Client Side: Wed Oct 26 12:56:51 CEST 2016

If I look in /var/log/prometheus.log, there are no error entries. Sometimes there are absolutely no entries for a few hours.

Even at times when I get no data, I can still run:

root@756c3afeceba:/opt# curl http://db-1-1:42002
<html>
<head><title>MySQL 3-in-1 exporter</title></head>
<body>
<h1>MySQL 3-in-1 exporter</h1>
<li><a href="/metrics-hr">high-res metrics</a></li>
<li><a href="/metrics-mr">medium-res metrics</a></li>
<li><a href="/metrics-lr">low-res metrics</a></li>
</body>
</html>

I have no idea why data is sometimes available and often not.
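The manual curl check above could be repeated for each of the three resolution paths the exporter lists. This is only a minimal sketch: the function name probe_exporter is made up for illustration, and it assumes curl is available wherever it runs (for example inside the pmm-server container).

```shell
# probe_exporter HOST PORT: request each resolution path listed on the
# exporter's index page and print the path with the HTTP status code.
# probe_exporter is a hypothetical helper name, not part of PMM.
probe_exporter() {
    host=$1
    port=$2
    for path in metrics-hr metrics-mr metrics-lr; do
        # -s hides progress output, -o /dev/null discards the body,
        # -w prints only the status code, --max-time bounds each request.
        code=$(curl -s -o /dev/null --max-time 5 \
            -w '%{http_code}' "http://$host:$port/$path") || true
        echo "$path $code"
    done
}
```

Called as `probe_exporter db-1-1 42002`, a status of 200 means the exporter answered on that path, while 000 means curl could not complete the request at all.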

weber · October 26, 2016, 6:06am

You can watch how the container performs with docker stats pmm-server. Maybe the monitoring server does not have enough resources.
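To line resource usage up with the gaps in the graphs, the stats could be sampled with timestamps instead of watched live. A rough sketch, assuming the container is named pmm-server as in this thread; log_stats and the DOCKER override are hypothetical names introduced here for illustration.

```shell
# log_stats CONTAINER INTERVAL COUNT: print COUNT timestamped samples of
# `docker stats`, one every INTERVAL seconds.
# DOCKER is a hypothetical override for the docker binary (useful for dry runs).
log_stats() {
    container=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # --no-stream prints a single sample instead of refreshing forever;
        # tail keeps only the data row and drops the header line.
        sample=$("${DOCKER:-docker}" stats --no-stream "$container" | tail -n 1)
        echo "$(date -u '+%Y-%m-%d %H:%M:%S') $sample"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, `log_stats pmm-server 10 360 >> pmm-stats.log` would collect 360 samples roughly 10 seconds apart; the UTC timestamps can then be compared with the periods where Grafana shows no data.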

shoman · October 26, 2016, 6:14am

Hi,

thanks for the fast answer.

On the physical machine there are more than enough resources, and here is the output from docker:

CONTAINER   CPU %  MEM USAGE / LIMIT  MEM %  NET I/O              BLOCK I/O            PIDS
pmm-server  3.13%  0 B / 0 B          0.00%  172.2 MB / 224.5 MB  116.3 MB / 294.9 GB  0

I think it’s another problem.

geoiii · October 26, 2016, 2:44pm

I have found that I needed to bump up the memory for the Docker container with:

METRICS_MEMORY=2097152

geoiii · October 26, 2016, 3:57pm

I have been having this (or a similar) issue, and it seems that Prometheus runs out of memory to ingest the incoming metrics.
And usually when it dies it really dies, and I have to restart the Docker container.

I have been bumping up the memory with this setting:

METRICS_MEMORY=2097152

in the docker run command line. It has a fairly small default, which was working fine until I added one too many servers or had a peak in events.
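For reference, a small sketch of how a value like 2097152 can be derived. It assumes METRICS_MEMORY is interpreted in kilobytes, which is how I understand the PMM 1.x documentation; the helper name gib_to_kb is made up for illustration.

```shell
# gib_to_kb N: convert N GiB to kilobytes (1 GiB = 1024 * 1024 KB).
# Assumption: METRICS_MEMORY is read as a number of kilobytes.
gib_to_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

gib_to_kb 2    # prints 2097152, the value used above
```

Under that assumption the setting above corresponds to 2 GiB, and it could be passed as `-e METRICS_MEMORY=$(gib_to_kb 2)` in the docker run command line.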

You might take a look at the Prometheus graph in Grafana.

My running server looks like this when I run docker stats pmm-server

CONTAINER   CPU %   MEM USAGE / LIMIT   MEM %   NET I/O              BLOCK I/O
pmm-server  71.24%  6.77 GB / 7.934 GB  85.33%  6.932 GB / 379.2 MB  3.766 GB / 18.15 GB

shoman · October 27, 2016, 3:21am

I tested it again, slowly adding monitored hosts to the PMM server.

After doing this for the first 9 servers, the monitoring stopped completely only 3 minutes after I added number 10.

Again there are no log entries in /var/log/prometheus.log, and ALL metrics services on ALL servers are in a running state, but without connectivity to the server:

SERVICE TYPE   NAME    REMOTE ENDPOINT        STATUS
-------------- ------- ---------------------- -------
linux:metrics  db-1-2  x.x.x.x:42000          DOWN
mysql:metrics  db-1-2  x.x.x.x:42002          DOWN

PMM is a nice and helpful tool, but it does not seem very robust, and we are not able to work with it at the moment.

What can I do?

weber · October 27, 2016, 3:57am

What are the OS and Docker versions?

When the connectivity is lost, does curl http://db-1-2:42000/metrics work from inside the container?

shoman · October 27, 2016, 5:56am

Distributor ID: Debian
Description: Debian GNU/Linux 8.6 (jessie)
Release: 8.6
Codename: jessie

Docker version 1.12.2, build bb80604

If I run curl http://db-1-2:42000/metrics from inside the container:

process_cpu_seconds_total 23.42
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 9
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.370112e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.47749268343e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 2.7041792e+07
...

Since my last post I got data 3 times without changing anything in the infrastructure.

It’s strange.

weber · October 27, 2016, 6:06am

When pmm-admin check-network shows DOWN for an endpoint, does curl from inside the container work for that same endpoint? I don’t think that’s possible.

shoman · October 27, 2016, 6:57am

Thx weber and geoiii,

I now run the container as follows:

docker run -d \
-p 44444:443 \
-e METRICS_RESOLUTION=5s \
-e METRICS_MEMORY=4124672 \
--volumes-from pmm-data \
--name pmm-server \
-v /usr/share/pmm:/etc/nginx/ssl \
-v /etc/localtime:/etc/localtime:ro \
--restart always \
percona/pmm-server:1.0.5

Sometimes the time in the Docker container changes, so I sync the time zone with the local one and start the container with more memory.
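As far as I know, mounting /etc/localtime only changes the time zone the container displays; a Linux container shares the host's kernel clock. One way to check for a real clock difference is to compare epoch seconds, which do not depend on the time zone. A minimal sketch, assuming the container is named pmm-server and has a date command; abs_diff and clock_skew are hypothetical helper names.

```shell
# abs_diff A B: absolute difference between two integers.
abs_diff() {
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))
    fi
    echo "$d"
}

# clock_skew CONTAINER: seconds between the host clock and the clock
# seen inside the container, both read as Unix epoch seconds.
clock_skew() {
    host_epoch=$(date +%s)
    container_epoch=$(docker exec "$1" date +%s)
    abs_diff "$host_epoch" "$container_epoch"
}
```

Running `clock_skew pmm-server` on the Docker host should print 0 or 1 if only the displayed time zone differs; a larger number would point to an actual clock problem.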

At the moment all is working fine.

The stats are now more plausible, but the memory usage looks a little strange: there are no values, but it works.

CONTAINER   CPU %   MEM USAGE / LIMIT  MEM %  NET I/O              BLOCK I/O            PIDS
pmm-server  10.72%  0 B / 0 B          0.00%  77.36 MB / 3.013 MB  105.8 MB / 1.107 GB  0

weber · October 27, 2016, 8:36am

Oh, I misread. I thought it was your post about METRICS_MEMORY as well.
