Memory Usage graph reports error when >68 hosts are selected

Our PMM server monitors 16 hosts and 59 MySQL instances. After adding all the linux:metrics and mysql:metrics source types, we noticed that the Memory Usage graph in the Cross Server Graphs dashboard reports an error when all the hosts are selected.
By adding hosts one at a time, we narrowed the problem down to the number of selected hosts (the order doesn’t matter): whenever more than 68 are selected, the graph fails consistently with the following error:

Message:
a.data.data is undefined

Stack trace:
b/this.query/</<@http://mysql-hxvm7-monitor-005:8000/graph/public/app/plugins/datasource/prometheus/datasource.js?bust=1486481238956:4:1767
f@http://mysql-hxvm7-monitor-005:8000/graph/public/app/boot.85c49108.js:48:28473
Pg@http://mysql-hxvm7-monitor-005:8000/graph/public/app/boot.85c49108.js:49:29427
b/this.query/<@http://mysql-hxvm7-monitor-005:8000/graph/public/app/plugins/datasource/prometheus/datasource.js?bust=1486481238956:4:1644
g@http://mysql-hxvm7-monitor-005:8000/graph/public/app/boot.85c49108.js:38:29545
h/<@http://mysql-hxvm7-monitor-005:8000/graph/public/app/boot.85c49108.js:38:29717
xc/this.$get</o.prototype.$eval@http://mysql-hxvm7-monitor-005:8000/graph/public/app/boot.85c49108.js:39:5353
xc/this.$get</o.prototype.$digest@http://mysql-hxvm7-monitor-005:8000/graph/public/app/boot.85c49108.js:39:3835
xc/this.$get</o.prototype.$apply@http://mysql-hxvm7-monitor-005:8000/graph/public/app/boot.85c49108.js:39:5638
x/i<@http://mysql-hxvm7-monitor-005:8000/graph/public/app/boot.85c49108.js:39:1804
f@http://mysql-hxvm7-monitor-005:8000/graph/public/app/boot.85c49108.js:37:15602
jb/k.defer/c<@http://mysql-hxvm7-monitor-005:8000/graph/public/app/boot.85c49108.js:37:17066

All the other graphs seem to work as expected.

Is there any limitation in place we are not aware of?
We tried to dig into the Docker container logs, but we didn’t find any clue.

Could you please help?

Thanks
Regards,
Alessio

I should add, we label the hosts with the short hostname string, i.e.:

pmm-admin add linux:metrics mysql-vm-0NN

while the instances on the hosts are labelled as follows:

pmm-admin add mysql --user --password --query-source perfschema --socket mysql-vm-0NN_

The Host drop-down list in Grafana displays all the labels. When more than 68 labels are selected, the error is triggered.

What is the pmm-server version?
What is your browser and its version?
Is the issue reproducible in another browser (Firefox)?

Both server and clients are 1.0.7.

I have tried with Google Chrome and Firefox (latest versions) on OS X, RHEL6 and Fedora 24 without any noticeable change. The Memory Usage graph always reports the same error if >68 hosts are selected.

Is there any other check I can perform?

Thanks

Can you open “Developer Tools” in Chrome (Network tab) and try to capture the response for 67 hosts and for 68?
And share them in the same way (pastebin, dropbox, etc.)?
I’ll try to compare them.

Just to clarify: are you looking for the HAR file that can be generated, or a screen snapshot of the Network tab?

Thanks

HAR please :)

Could you please try to download the HAR files for the “fail” and “success” conditions using the following URLs?

https://drive.google.com/file/d/0B5FL9q-xRjLJZy1WNnRXRkV5Rms/view?usp=sharing
https://drive.google.com/file/d/0B5FL9q-xRjLJVUxJRVlDU2lPcEU/view?usp=sharing

Thanks

Can you try the latest beta version of pmm-server?
It updates Grafana to the latest version.

I analyzed the HAR files and the result looks very strange.
The failed response is exactly 200000 bytes…
The successful response is much bigger: 6376716 bytes.
It looks like some firewall or proxy cuts packets off if they are bigger than 6.5 or 7 megabytes.
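
For anyone repeating this comparison, here is a minimal sketch of how the response sizes could be pulled out of the two HAR files with Python. The function names are made up for this example; the field names (log.entries, response.content.size, response.content.text) come from the HAR 1.2 format. It also tries to parse each JSON body, on the assumption that a body cut off at a fixed length would no longer be valid JSON. That would only hint at truncation, not prove where it happens.

```python
import json


def load_har(path):
    """Parse a HAR file exported from the browser's Network tab."""
    # utf-8-sig tolerates the byte-order mark some exporters prepend.
    with open(path, encoding="utf-8-sig") as handle:
        return json.load(handle)


def is_complete_json(text):
    """Return True if text parses as JSON, False if it is cut off or malformed."""
    try:
        json.loads(text)
    except ValueError:
        return False
    return True


def summarize_responses(har):
    """Return one (url, status, size, parses) tuple per recorded request.

    size is response.content.size, the decoded body length in bytes.
    parses is None when there is no plain-text JSON body to check.
    """
    rows = []
    for entry in har["log"]["entries"]:
        content = entry["response"].get("content", {})
        text = content.get("text")
        checkable = (
            text is not None
            and content.get("encoding") != "base64"
            and "json" in content.get("mimeType", "")
        )
        rows.append((
            entry["request"]["url"],
            entry["response"]["status"],
            content.get("size", -1),
            is_complete_json(text) if checkable else None,
        ))
    return rows
```

Running summarize_responses(load_har(path)) on the “fail” and “success” files and sorting the rows by size should make an odd round number such as 200000 stand out.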

I can try with the Beta version later today.

As for the network issue, the MySQL and the PMM servers are on the same network segment and no firewall is configured at any level.

Thanks for your help

I just tried the Docker image for version 1.1.0beta (containing Grafana 4.1.1) and the behavior is exactly the same as before.

Please disregard my comment on the network; I misread your message. The client (browser) and the PMM server are not on the same network segment.

Can you try to fetch exactly the same URL (the one you dumped to the HAR file) locally on the PMM Server and remotely from your computer?
If the result is the same (200000 bytes), it is definitely not a network issue.

Chrome Developer Tools has a nice feature: “Copy as cURL”.
It gives you a simple curl command that you just need to run in a shell.
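
If curl is not handy, the same check could be sketched in Python with only the standard library. This is an illustration, not a PMM tool: body_size is a name invented here, and the headers argument is only needed if the request copied from the browser carries cookies or other headers the server requires.

```python
import urllib.request


def body_size(url, headers=None, timeout=60):
    """Fetch url and return the number of body bytes actually received."""
    request = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return len(response.read())


# Usage: run the same call once on the PMM Server host and once on the
# remote computer, passing the URL recorded in the HAR file, e.g.
#   print(body_size(url))
```

If both runs print the same number, the network between browser and server is unlikely to be what shortens the response.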

I have done more than that. I was able to temporarily deploy Firefox on the Docker host, and the Memory Usage graph fails there as well, just as in any other browser on any other machine.
Unless there is something weird in the container network stack, I would rule out a truncation of the TCP stream…

Is the “Load Average” graph also affected?
Is “Memory Usage” still broken if you open it alone in “fullscreen” mode (left click on the title, then the “View” button)?
Can you dump the failed request one more time, please?

Apart from Memory Usage, neither Load Average nor any other graph I have been able to check has problems.

The problem is present even in fullscreen mode.

Could you dump the HAR file again,
this time in fullscreen mode?

Here is the link: https://drive.google.com/file/d/0B5FL9q-xRjLJVUxJRVlDU2lPcEU/view?usp=sharing

I will be traveling most of next week, so I don’t know if I will be able to answer your message.

Thanks a lot

That is the old one :(
Can you make a completely new dump of the failure?