PMM Grafana does not show all data

Are the /metrics-lr targets still in the “unknown” state?
Do I understand correctly?

When I wrote #38, the result was the same as in #36: about 1h after the pmm restart, all servers were in the UNKNOWN state.
Now:

And there are still gaps in the graphs (gaps in pmm-server's own statistics):

Let's revert the config from the backup and restart:


docker exec -it pmm-server cp /opt/prometheus/prometheus.yml_BAK /opt/prometheus/prometheus.yml
docker restart pmm-server

Good: all good servers have all metrics.
Bad: we still have gaps like in the pictures before, and the bad server does not have the metrics we are talking about.
Question: is it OK that, after the docker container is restarted, PMM starts showing data only after 1h? Can startup really take that long?

No, that is strange. Can you synchronize the time on your docker server?

Hm. The time in the pmm-server container lags behind the db hosts by 3 sec and behind my host by 2 sec. The docker server and the docker container have the same time.

Is only Grafana affected, or are the graphs also missing in Prometheus?

http://PMM-SERVER-IP/prometheus/graph?g0.range_input=1h&g0.expr=node_load1&g0.tab=0

Screenshots from Grafana and Prometheus:

But these are Linux metrics; they always exist for both the bad and the good servers.

Some other news: we have 3 graphs, all different. The bad server is missing some metrics, some servers have all metrics, and others have gaps in some metrics:

Oh, can you check the Prometheus graph for mysql_global_variables_max_connections?

The bad server is missing. And I have a feeling we are going in circles.

How many tables do you have?

I think we are getting a very slow or wrong response from mysqld_exporter.

Can you measure the response time with the following command?

wget https://192.168.200.206:42002/metrics-lr --no-check-certificate

The wget output should show the start time in the first line and the end time in the last.
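
A minimal sketch of timing the scrape directly instead of reading the timestamps off the wget log. `measure_seconds` is a hypothetical helper name, not part of PMM or wget; it assumes a `date` that supports `+%s` and only gives whole-second resolution.

```shell
# Hypothetical helper: run any command, discard its output,
# and print how many whole seconds it took.
measure_seconds() {
    ms_start=$(date +%s)
    "$@" >/dev/null 2>&1
    ms_end=$(date +%s)
    echo $(( ms_end - ms_start ))
}

# Usage with the exporter URL from this thread (commented out here,
# since it needs a reachable exporter):
# measure_seconds wget -O /dev/null --no-check-certificate \
#     https://192.168.200.206:42002/metrics-lr
```

Writing to /dev/null keeps the large response off the disk while still measuring the full download.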

After some internal discussion, my colleague created a feature request for mysqld_exporter: "Gather statistics on mysqld_exporter collection performance" (Issue #188, prometheus/mysqld_exporter, GitHub): https://github.com/prometheus/mysqld...ter/issues/188
Feel free to +1 it.

We also created two internal tickets:
[PMM-663] scrape_duration_seconds is broken with current prometheus configuration file - Percona JIRA
[PMM-664] Gather statistics on mysqld_exporter collection performance - Percona JIRA

Done :)

And one more thing: I checked the memory usage of mysqld_exporter with top.
Bad server:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18853 root 20 0 2441360 2,054g 4680 S 280,4 13,2 19:02.85 mysqld_exporter

Good servers:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26945 root 20 0 766688 117948 3560 S 7,0 0,7 2550:20 mysqld_exporter

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15257 root 20 0 316880 144584 3168 S 23,6 7,7 2103:36 mysqld_exporter

On the bad server RES=2,054g (10 min after pmm-admin restart).
On the good servers RES=117948 and RES=144584 (a few days after restart).
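
To put these figures on one scale: top reports RES in KiB by default and switches to a suffixed value such as `g` (GiB) for large numbers, and the comma here is the decimal separator of the poster's locale. A rough sketch of normalizing the values to MiB follows; `res_to_mib` is a hypothetical helper, not a feature of top or PMM.

```shell
# Hypothetical helper: convert a top RES value to whole MiB.
# Plain numbers are treated as KiB; "g"/"m" suffixes as GiB/MiB.
# A decimal comma (as in "2,054g") is accepted.
res_to_mib() {
    printf '%s\n' "$1" | tr ',' '.' | LC_ALL=C awk '
        /[gG]$/ { sub(/[gG]$/, ""); printf "%.0f\n", $0 * 1024; next }
        /[mM]$/ { sub(/[mM]$/, ""); printf "%.0f\n", $0; next }
                { printf "%.0f\n", $0 / 1024 }'
}

res_to_mib 2,054g    # bad server
res_to_mib 117948    # good server 1
res_to_mib 144584    # good server 2
```

For the values quoted above this comes to roughly 2103 MiB on the bad server against 115 and 141 MiB on the good ones.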

How many tables do you have?

Can you show the output of the following command?


wget https://192.168.200.206:42002/metrics-lr --no-check-certificate 

Table count in the bad db = 3639; the good dbs have 1736 each.
About the output: the command downloaded a 250M file. If you want, I can share it somewhere.

OUTPUT FROM BAD SERVER:

--2017-03-16 17:04:10-- https://bad_ip:42002/metrics-lr
Connecting to bad_ip:42002... connected.
WARNING: cannot verify bad_ip's certificate, issued by ‘O=PMM Client’:
Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 261433316 (249M) [text/plain]
Saving to: ‘metrics-lr’

metrics-lr 100%[============================================================================================>] 249,32M 5,15MB/s in 57s

2017-03-16 17:06:04 (4,34 MB/s) - ‘metrics-lr’ saved [261433316/261433316]

OUTPUT FROM GOOD SERVER:

--2017-03-16 17:48:43-- https://good_ip:42002/metrics-lr
Connecting to good_ip:42002... connected.
WARNING: cannot verify good_ip's certificate, issued by ‘O=PMM Client’:
Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 8876739 (8,5M) [text/plain]
Saving to: ‘metrics-lr.1’

metrics-lr.1 100%[============================================================================================>] 8,46M 3,30MB/s in 2,6s

2017-03-16 17:48:50 (3,30 MB/s) - ‘metrics-lr.1’ saved [8876739/8876739]
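
The first and last timestamps in each wget log give the total wall-clock time, which can be compared with the transfer time wget prints after the progress bar. A small sketch of that subtraction, assuming GNU `date` (the `-d` option used here is not available in BSD `date`); `elapsed_between` is a hypothetical helper name.

```shell
# Hypothetical helper: seconds between two "YYYY-MM-DD HH:MM:SS" timestamps.
# Both are parsed as UTC so local DST rules cannot skew the difference.
elapsed_between() {
    eb_start=$(date -u -d "$1" +%s)
    eb_end=$(date -u -d "$2" +%s)
    echo $(( eb_end - eb_start ))
}

elapsed_between '2017-03-16 17:04:10' '2017-03-16 17:06:04'   # bad server
elapsed_between '2017-03-16 17:48:43' '2017-03-16 17:48:50'   # good server
```

With the timestamps above this gives 114 s for the bad server and 7 s for the good one, against reported transfer times of 57s and 2,6s. That suggests a large part of the bad server's time passes outside the download itself, possibly while the exporter builds the response, though the log alone does not show where.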

Wow!
Can you share this archive?

DepositFiles
or
http://rg.to/file/79f11a1fb036e734bfe207a525dd46ec/metrics-lr.tar.gz.html