PMM Grafana Dashboard doesn't show monitored host

I installed PMM Server on one AWS EC2 instance and PMM Client on another. I configured HTTP, HTTPS and SSH on both servers. However, after I started data collection, I only got QAN information and no system metrics, because the client host is not listed under the PMM server. I checked the network status, and it shows the following:

[ec2-user@ip-172-31-26-150 ~]$ sudo pmm-admin check-network
PMM Network Status

Server Address | 172.31.16.149
Client Address | 172.31.26.150

  • System Time
    NTP Server (0.pool.ntp.org) | 2017-10-03 20:54:18 +0000 UTC
    PMM Server | 2017-10-03 20:54:18 +0000 GMT
    PMM Client | 2017-10-03 20:54:18 +0000 UTC
    PMM Server Time Drift | OK
    PMM Client Time Drift | OK
    PMM Client to PMM Server Time Drift | OK

  • Connection: Client --> Server


SERVER SERVICE       STATUS
Consul API           OK
Prometheus API       OK
Query Analytics API  OK

Connection duration | 313.172µs
Request duration | 574.062µs
Full round trip | 887.234µs

  • Connection: Client <-- Server

SERVICE TYPE   NAME              REMOTE ENDPOINT      STATUS  HTTPS/TLS  PASSWORD
linux:metrics  ip-172-31-26-150  172.31.26.150:42000  DOWN    YES        YES
mysql:metrics  ip-172-31-26-150  172.31.26.150:42002  DOWN    YES        YES

[ec2-user@ip-172-31-26-150 ~]$ sudo pmm-admin list
pmm-admin 1.3.1

PMM Server | 172.31.16.149 (password-protected)
Client Name | ip-172-31-26-150
Client Address | 172.31.26.150
Service Manager | unix-systemv


SERVICE TYPE   NAME              LOCAL PORT  RUNNING  DATA SOURCE                            OPTIONS
mysql:queries  ip-172-31-26-150  -           YES      root:@unix(/var/lib/mysql/mysql.sock)  query_source=perfschema, query_examples=true
linux:metrics  ip-172-31-26-150  42000       YES      -
mysql:metrics  ip-172-31-26-150  42002       YES      root:@unix(/var/lib/mysql/mysql.sock)

How can I fix this?

Thanks,
Jie

In the log file, I saw the following errors. I don’t know what they mean. Any help is appreciated.

time="2017-10-03T20:48:46Z" level=info msg=" - uname" source="node_exporter.go:164"
time="2017-10-03T20:48:46Z" level=info msg=" - loadavg" source="node_exporter.go:164"
time="2017-10-03T20:48:46Z" level=info msg="HTTP Basic authentication is enabled." source="basic_auth.go:105"
time="2017-10-03T20:48:46Z" level=info msg="Starting HTTPS server of 172.31.26.150:42000 …" source="server.go:106"
2017/10/03 20:52:29 http: TLS handshake error from 172.31.26.150:51172: tls: first record does not look like a TLS handshake
2017/10/03 20:52:29 http: TLS handshake error from 172.31.26.150:51174: tls: first record does not look like a TLS handshake
2017/10/03 20:54:18 http: TLS handshake error from 172.31.26.150:51202: tls: first record does not look like a TLS handshake
2017/10/03 20:54:18 http: TLS handshake error from 172.31.26.150:51204: tls: first record does not look like a TLS handshake

Hi Jie,

  1. Please use an Elastic IP for every node, so that nodes don’t change IPs after a reboot.
  2. We recommend running all instances in the same Availability Zone.
  3. Please use the following configuration command on the client side:
sudo pmm-admin config \
--client-name <DB_SERVER_NAME> \
--server <PMM_SERVER_PUBLIC_IP> \
--bind-address <PMM_CLIENT_PRIVATE_IP> \
--client-address <PMM_CLIENT_PUBLIC_IP>

I tried using the public IPs as you suggested, but I still get the same error. Here is my configuration:

[ec2-user@ip-172-31-26-150 ~]$ sudo pmm-admin list
pmm-admin 1.3.1

PMM Server | 18.221.252.203 (password-protected)
Client Name | ip-172-31-26-150
Client Address | 18.220.227.38 (172.31.26.150)
Service Manager | unix-systemv


SERVICE TYPE   NAME              LOCAL PORT  RUNNING  DATA SOURCE                            OPTIONS
mysql:queries  ip-172-31-26-150  -           YES      root:@unix(/var/lib/mysql/mysql.sock)  query_source=slowlog, query_examples=true
linux:metrics  ip-172-31-26-150  42000       YES      -
mysql:metrics  ip-172-31-26-150  42002       YES      root:@unix(/var/lib/mysql/mysql.sock)

[ec2-user@ip-172-31-26-150 ~]$ sudo pmm-admin check-network
PMM Network Status

Server Address | 18.221.252.203
Client Address | 18.220.227.38 (172.31.26.150)

  • System Time
    NTP Server (0.pool.ntp.org) | 2017-10-04 14:00:43 +0000 UTC
    PMM Server | 2017-10-04 14:00:43 +0000 GMT
    PMM Client | 2017-10-04 14:00:43 +0000 UTC
    PMM Server Time Drift | OK
    PMM Client Time Drift | OK
    PMM Client to PMM Server Time Drift | OK

  • Connection: Client --> Server


SERVER SERVICE       STATUS
Consul API           OK
Prometheus API       OK
Query Analytics API  OK

Connection duration | 355.84µs
Request duration | 637.266µs
Full round trip | 993.106µs

  • Connection: Client <-- Server

SERVICE TYPE   NAME              REMOTE ENDPOINT                      STATUS  HTTPS/TLS  PASSWORD
linux:metrics  ip-172-31-26-150  18.220.227.38-->172.31.26.150:42000  DOWN    YES        YES
mysql:metrics  ip-172-31-26-150  18.220.227.38-->172.31.26.150:42002  DOWN    YES        YES

When an endpoint is down it may indicate that the corresponding service is stopped (run 'pmm-admin list' to verify).
If it's running, check out the logs /var/log/pmm-*.log

When all endpoints are down but 'pmm-admin list' shows they are up and no errors in the logs,
check the firewall settings whether this system allows incoming connections from server to address:port in question.

Also you can check the endpoint status by the URL: http://18.221.252.203/prometheus/targets

IMPORTANT: client and bind addresses are not the same which means you need to configure NAT/port forwarding to map them.
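
The firewall hint in this output can be checked directly from the PMM server host with a plain TCP connect test against the two exporter ports. Below is a minimal sketch, assuming Python 3 is available on the server. `port_open` is a hypothetical helper, not part of PMM, and the address and ports in the usage comment are the ones reported by check-network. A successful connect only shows that the port is reachable at the TCP level; it says nothing about TLS or the password.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (typical for filtered ports)
        # and unreachable hosts.
        return False


# Example usage, to be run on the PMM server (address and ports taken from
# the check-network output; adjust them to your own setup):
#   for port in (42000, 42002):
#       print(port, port_open("172.31.26.150", port))
```

If both ports report `False` while `pmm-admin list` shows the services running, the block is most likely somewhere between the two hosts rather than in the exporters themselves.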

The log file still shows:
[ec2-user@ip-172-31-26-150 log]$ tail -100f pmm-linux-metrics-42000.log
time="2017-10-04T13:57:40Z" level=info msg="Starting node_exporter (version=0.14.0+percona.2, branch=PMM-1.3, revision=8ea8a4521f8f42d581847ee3d271dbb2a1fe8146)" source="node_exporter.go:142"
time="2017-10-04T13:57:40Z" level=info msg="Build context (go=go1.8, user=jenkins@jenkins-centos-6.x64-65464c7d-d90f-4ccb-a218-85453e261bb1, date=20170927-15:36:05)" source="node_exporter.go:143"
time="2017-10-04T13:57:40Z" level=info msg="Enabled collectors:" source="node_exporter.go:162"
time="2017-10-04T13:57:40Z" level=info msg=" - stat" source="node_exporter.go:164"
time="2017-10-04T13:57:40Z" level=info msg=" - uname" source="node_exporter.go:164"
time="2017-10-04T13:57:40Z" level=info msg=" - vmstat" source="node_exporter.go:164"
time="2017-10-04T13:57:40Z" level=info msg=" - filefd" source="node_exporter.go:164"
time="2017-10-04T13:57:40Z" level=info msg=" - loadavg" source="node_exporter.go:164"
time="2017-10-04T13:57:40Z" level=info msg=" - netdev" source="node_exporter.go:164"
time="2017-10-04T13:57:40Z" level=info msg=" - netstat" source="node_exporter.go:164"
time="2017-10-04T13:57:40Z" level=info msg=" - time" source="node_exporter.go:164"
time="2017-10-04T13:57:40Z" level=info msg=" - diskstats" source="node_exporter.go:164"
time="2017-10-04T13:57:40Z" level=info msg=" - filesystem" source="node_exporter.go:164"
time="2017-10-04T13:57:40Z" level=info msg=" - meminfo" source="node_exporter.go:164"
time="2017-10-04T13:57:40Z" level=info msg="HTTP Basic authentication is enabled." source="basic_auth.go:105"
time="2017-10-04T13:57:40Z" level=info msg="Starting HTTPS server of 172.31.26.150:42000 …" source="server.go:106"
2017/10/04 14:00:43 http: TLS handshake error from 172.31.26.150:53574: tls: first record does not look like a TLS handshake
2017/10/04 14:00:43 http: TLS handshake error from 172.31.26.150:53576: tls: first record does not look like a TLS handshake

Any idea?

Both servers are in the same AZ: us-east-2b

After I enabled all TCP traffic, it works! Thanks.
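
Opening all TCP traffic confirms that a firewall rule (presumably the EC2 security group) was blocking the PMM server, but a narrower rule is usually enough: the server only needs to reach the exporter ports shown by `pmm-admin list`, 42000 and 42002 in this thread. The sketch below builds such rules in the `IpPermissions` shape accepted by the EC2 `authorize_security_group_ingress` call. `exporter_ingress_rules` is a hypothetical helper, and the source address to allow is an assumption: it should be whichever address (private or public) the PMM server actually connects from.

```python
from typing import Dict, List


def exporter_ingress_rules(server_ip: str, ports: List[int]) -> List[Dict]:
    """Build one inbound TCP rule per exporter port, limited to one source IP."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": f"{server_ip}/32"}],
        }
        for port in ports
    ]


# Server address and ports as they appear earlier in this thread.
rules = exporter_ingress_rules("172.31.16.149", [42000, 42002])

# Applying the rules needs boto3 and AWS credentials, so it is only shown
# as a comment here. The security group ID is a placeholder:
#   import boto3
#   boto3.client("ec2").authorize_security_group_ingress(
#       GroupId="sg-0123456789abcdef0", IpPermissions=rules
#   )
```

One rule per port avoids opening 42001, which is not used by either service listed here.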