Memory Usage graph reports error when >68 hosts are selected

Could you please try to download the HAR files for the “fail” and “success” conditions using the following URLs?

[url]https://drive.google.com/file/d/0B5FL9q-xRjLJZy1WNnRXRkV5Rms/view?usp=sharing[/url]
[url]https://drive.google.com/file/d/0B5FL9q-xRjLJVUxJRVlDU2lPcEU/view?usp=sharing[/url]

Thanks

Can you try the latest beta version of pmm-server?
It updates Grafana to the latest version.

I analyzed the HAR files and the result looks very strange.
The failed response is exactly 200000 bytes…
The successful response is much bigger: 6376716 bytes.
It looks like some firewall or proxy cuts responses that are bigger than 6.5 or 7 megabytes.
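For anyone who wants to repeat the size comparison, here is a minimal sketch that lists the response size of every request recorded in a HAR file. It assumes the standard HAR layout (`log.entries[].response.content.size`, with `bodySize` as a fallback); the function name `response_sizes` is only illustrative.

```python
import json


def response_sizes(har):
    """Return (url, status, body_bytes) for every entry of a parsed HAR document."""
    rows = []
    for entry in har["log"]["entries"]:
        response = entry["response"]
        # content.size is the decoded body length; bodySize is the length on
        # the wire and is -1 when the browser did not record it.
        size = response.get("content", {}).get("size", -1)
        if size < 0:
            size = response.get("bodySize", -1)
        rows.append((entry["request"]["url"], response["status"], size))
    return rows


def load_har(path):
    """Parse a HAR file (it is plain JSON) from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling `response_sizes(load_har("fail.har"))` and the same for the "success" file (both file names are placeholders) makes it easy to spot a response that stops at a suspiciously round number of bytes.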

I can try with the Beta version later today.

As for the network issue, the MySQL servers and the PMM server are on the same network segment and no firewall is configured at any level.

Thanks for your help

Just tried with the Docker image for version 1.1.0beta (containing Grafana 4.1.1) and the behavior is exactly the same as before.

Please disregard my comment on the network, I misread your message. The client (browser) and the PMM server are not on the same network segment.

Can you try to fetch exactly the same URL (the one you dumped to the HAR file) both locally on the PMM Server and remotely from your computer?
If the result is the same (200000 bytes) in both cases, it is definitely not a network issue.

Chrome Developer Tools has a nice feature: “Copy as cURL”.
It gives you a simple curl command that you just need to run in a shell.
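If curl is not handy, the same check can be sketched in Python with only the standard library. This is a rough equivalent, not what “Copy as cURL” produces: `fetch_size` is an illustrative name, and any cookies or authorization headers the request needs would have to be copied from the browser into `headers`.

```python
import urllib.request


def fetch_size(url, headers=None, timeout=60):
    """Fetch url and return (http_status, number_of_body_bytes_received)."""
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        total = 0
        # Read in chunks so a multi-megabyte response is not held in memory.
        while True:
            chunk = response.read(65536)
            if not chunk:
                break
            total += len(chunk)
        return response.status, total
```

Run it once on the PMM Server and once on the remote machine with the same URL; if both report the same byte count, the network in between is not cutting the response.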

I have done more than that. I was able to temporarily deploy Firefox on the Docker host, and the Memory Usage graph fails there as well, same as in any other browser on any other machine.
Unless there is something weird on the container network stack, I would exclude a truncation of the TCP stream…

Is the “Load Average” graph also affected?
Is “Memory Usage” still broken if you open it alone in “fullscreen” mode (left click on the title, “View” button)?
Can you dump the failed request one more time, please?

Neither Load Average nor any other graph I have been able to check has problems; only Memory Usage is affected.

The problem is present even in fullscreen mode.

Could you dump HAR file again?
now in fullscreen mode

Here is the link: [url]https://drive.google.com/file/d/0B5FL9q-xRjLJVUxJRVlDU2lPcEU/view?usp=sharing[/url]

I will be traveling most of next week, so I don’t know if I will be able to answer your message.

Thanks a lot

It is the old one :frowning:
Can you make a completely new dump of the failure?

Sorry, I sent the link in a rush. Here is the correct one:

[url]https://drive.google.com/file/d/0B5FL9q-xRjLJVkN5c3NtTWhfa0k/view?usp=sharing[/url]

Hi ac2,

It is completely empty :frowning: The size is 0 bytes.

Sorry, the file was created on a remote connection while traveling…

I am back from holidays and have generated a new file: [url]https://drive.google.com/file/d/0B5FL9q-xRjLJWFc4WEZfQTI5RzA/view?usp=sharing[/url]

Could you please check this?

Is this issue tied to one specific host, or does it happen with any random 68 hosts?

Can you select the first 67 hosts in the list without issues?
Can you select the last 67 hosts in the list without issues?
Can you select 67 hosts from the middle of the list without issues?

Hi Mykola,
Actually, I made a mistake while counting: the critical threshold is 68, sorry for that. Basically, the memory graph works correctly with 68 hosts selected (top, bottom, middle or punched card style, it doesn’t matter) and fails from 69 onwards.
I have tested version 1.1.1 without any luck, too.

Thanks

Are the 59 MySQL instances running on 19 servers, or do you have 78 independent servers?

Let's check whether we can fetch the graph data directly from Prometheus:
http://PMM-SERVER-IP/prometheus/graph?g0.range_input=12h&g0.stacked=0&g0.expr=node_memory_MemTotal&g0.tab=0
http://PMM-SERVER-IP/prometheus/graph?g0.range_input=12h&g0.stacked=0&g0.expr=node_memory_MemFree&g0.tab=0
http://PMM-SERVER-IP/prometheus/graph?g0.range_input=12h&g0.stacked=0&g0.expr=node_memory_Buffers&g0.tab=0
http://PMM-SERVER-IP/prometheus/graph?g0.range_input=12h&g0.stacked=0&g0.expr=node_memory_Cached&g0.tab=0
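The same four metrics can also be pulled through the Prometheus HTTP API (`/api/v1/query_range`) to see how many series and samples come back. A rough sketch follows; it assumes the API is reachable under the same `/prometheus` prefix as the graph pages above and needs no authentication, and the 60 second step is an arbitrary choice, not what the dashboard uses.

```python
import json
import time
import urllib.parse
import urllib.request

METRICS = ["node_memory_MemTotal", "node_memory_MemFree",
           "node_memory_Buffers", "node_memory_Cached"]


def query_range_url(base, expr, start, end, step):
    """Build a query_range URL; base is e.g. http://PMM-SERVER-IP/prometheus."""
    params = urllib.parse.urlencode(
        {"query": expr, "start": start, "end": end, "step": step})
    return base.rstrip("/") + "/api/v1/query_range?" + params


def summarize(payload):
    """Return (series_count, sample_count) for a query_range JSON payload."""
    result = payload["data"]["result"]
    return len(result), sum(len(series["values"]) for series in result)


def check(base, hours=12, step=60):
    """Print the response size and series/sample counts for each metric."""
    end = int(time.time())
    start = end - hours * 3600
    for expr in METRICS:
        url = query_range_url(base, expr, start, end, step)
        with urllib.request.urlopen(url, timeout=60) as response:
            raw = response.read()
        series, samples = summarize(json.loads(raw))
        print(expr, len(raw), "bytes,", series, "series,", samples, "samples")
```

Calling `check("http://PMM-SERVER-IP/prometheus")` prints one line per metric; a missing host would show up as a lower series count.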

We have 58 MySQL instances running on 16 MySQL VMs, plus the PMM server itself is monitored, of course.

I have tried to access data directly in Prometheus and I see all the metrics for all nodes without any trouble, hence the issue seems related to Grafana.

Playing around with selecting and deselecting hosts from the list, the problem appears “around” 68 hosts. Depending on the set, sometimes 67 is enough to trigger the problem, other times you need 69. So it seems to be related to the volume of data to be accessed rather than to the number of hosts.

Thanks