PMM client crashes

  1. The PMM client's mysql:metrics service crashes every now and then. I have already created a superuser for it on the MySQL host; it even has the GRANT privilege.
    How can I check the logs on the client side to find out what is going wrong?
    [root@XYZ sbin]# pmm-admin list
    pmm-admin 1.0.6
    PMM Server | XX.YY.ZZ.AA
    Client Name | ABC
    Client Address | AA.BB.CC.DD
    Service manager | unix-systemv

SERVICE TYPE   NAME  CLIENT PORT  RUNNING  DATA SOURCE                                OPTIONS
linux:metrics  ABC   42000        NO       -
mysql:queries  ABC   42001        YES      root:@unix(/export/home/mysql/mysql.sock)  query_source=slowlog
mysql:metrics  ABC   42002        NO       pmm:@unix(/export/home/mysql/mysql.sock)

The error log says:
time="2016-10-18T14:29:31Z" level=info msg="Starting mysqld_exporter (version=1.0.5, branch=master, revision=3c95bb2c647443430db2337627d5f677665f41c7)" source="mysqld_exporter.go:631"
time="2016-10-18T14:29:31Z" level=info msg="Build context (go=go1.7.1, user=, date=)" source="mysqld_exporter.go:632"
time="2016-10-18T14:29:31Z" level=info msg="Listening on :42002" source="mysqld_exporter.go:666"
time="2016-10-18T14:29:31Z" level=fatal msg="listen tcp :42002: bind: cannot assign requested address" source="mysqld_exporter.go:667"

Is SELinux disabled?
According to the documentation, it should be disabled: https://www.percona.com/doc/percona-monitoring-and-management/install.html#installing-pmm-client

Yes, SELinux is disabled. I think the problem is with address binding on the client.
How do I use the --bind-address option in the new 1.0.7 client?

Here is the exact current status:

CLIENT
[root@XYZ bin]# ./pmm-admin list
pmm-admin 1.0.7

PMM Server | Z.Z.Z.Z
Client Name | XYZ.COMPANY.com
Client Address | X.X.X.X
Service Manager | unix-systemv


SERVICE TYPE   NAME             LOCAL PORT  RUNNING  DATA SOURCE                                OPTIONS
mysql:queries  XYZ.COMPANY.com  -           YES      root:@unix(/export/home/mysql/mysql.sock)  query_source=slowlog
linux:metrics  XYZ.COMPANY.com  42000       NO       -
mysql:metrics  XYZ.COMPANY.com  42002       NO       root:@unix(/export/home/mysql/mysql.sock)

[root@XYZ bin]# ./pmm-admin check-network
PMM Network Status

Server Address | Z.Z.Z.Z
Client Address | X.X.X.X

  • System Time
    Server | 2016-12-27 02:39:08 -0800 PST
    Client | 2016-12-27 02:39:08 -0800 PST
    Time Drift | OK

  • Connection: Client → Server


SERVER SERVICE       STATUS
Consul API           OK
Prometheus API       OK
Query Analytics API  OK

Connection duration | 71.493201ms
Request duration | 75.272449ms
Full round trip | 146.76565ms

  • Connection: Client ← Server

SERVICE TYPE   NAME             REMOTE ENDPOINT  STATUS  HTTPS/TLS  PASSWORD
linux:metrics  XYZ.COMPANY.com  X.X.X.X:42000    DOWN    -          -
mysql:metrics  XYZ.COMPANY.com  X.X.X.X:42002    DOWN    -          -

ERROR IN THE LOG
time="2016-12-27T02:38:15-08:00" level=info msg="Build context (go=go1.7.4, user=, date=)" source="mysqld_exporter.go:680"
time="2016-12-27T02:38:15-08:00" level=info msg="HTTPS/TLS is enabled" source="mysqld_exporter.go:724"
time="2016-12-27T02:38:15-08:00" level=info msg="Listening on X.X.X.X:42002" source="mysqld_exporter.go:727"
time="2016-12-27T02:38:37-08:00" level=fatal msg="listen tcp X.X.X.X:42002: bind: cannot assign requested address" source="mysqld_exporter.go:777"

We need to understand why the port cannot be opened on this machine.
Possible reasons:
  • the port is already in use by another application
  • SELinux does not allow this port to be opened
  • the host has no IPv4 addresses
  • binding to an IP address that does not exist on this host (for example, binding to the public IP address of an AWS instance instead of its private IP)
  • etc.
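
Two of the reasons above (address not present on the host, port already taken) can be checked by hand on the client. Here is a minimal sketch, assuming a Linux host with the iproute2 tools (`ip`, `ss`) installed. The function names are hypothetical and are not part of PMM; they only filter the text that those tools print.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are made up for this example.

# Succeeds if the IPv4 address in $1 appears in `ip -4 -o addr show`
# output read from stdin (field 4 looks like "10.0.0.5/24").
addr_in_list() {
    awk '{print $4}' | cut -d/ -f1 | grep -Fxq -- "$1"
}

# Succeeds if a listener on port $1 appears in `ss -ltn` output read
# from stdin (field 4 is "Local Address:Port"; line 1 is the header).
port_in_list() {
    awk 'NR > 1 {print $4}' | grep -Eq -- ":$1\$"
}

# Example usage (replace X.X.X.X with the address you pass to --bind-address):
#   ip -4 -o addr show | addr_in_list X.X.X.X || echo "address is not configured on this host"
#   ss -ltn | port_in_list 42002 && echo "port 42002 is already in use"
```

If the first check fails, "cannot assign requested address" is the expected result, because a process can only bind to an address that is configured on one of the host's interfaces.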

The PMM client and server must be exactly the same version.
So please try updating both the PMM client and the server to version 1.0.7.

Here is the exact command I used to add the server to the client. Both are running 1.0.7.
pmm-admin config --server ELASTIC-IP-OF-AWS-MACHINE --bind-address PUBLIC-IP-OF-CLIENT --client-address INTERNAL-IP-OF-CLIENT

If the server and client are in the same "Subnet ID", it is enough to configure access via the internal IP only:

pmm-admin config --server INTERNAL-IP-OF-SERVER

If the server and client are in different AWS Subnet IDs:

pmm-admin config \
--server ELASTIC-IP-OF-SERVER \
--bind-address INTERNAL-IP-OF-CLIENT \
--client-address PUBLIC-IP-OF-CLIENT

According to the help output:

# pmm-admin config --help
This command configures pmm-admin to communicate with PMM server.
...
Flags:
--bind-address string bind address, also local/private address that is mapped from client address via NAT/port forwarding (defaults to the client address)
--client-address string client address, also remote/public address for this system (if omitted it will be automatically detected by asking server)

Well, the --bind-address option worked with client 1.0.7. Both client and server are running 1.0.7.
The client is in our company DC, while the server is in an AWS VPC.

This is the config command that I used:
pmm-admin config --server PUBLIC-IP-OF-SERVER --bind-address INTERNAL-IP-OF-CLIENT --client-address PUBLIC-IP-OF-CLIENT

pmm-admin list shows that all metrics are working fine.
pmm-admin list
pmm-admin 1.0.7

PMM Server | PUBLIC-IP of Server
Client Name | XYZ.COMPANY.com
Client Address | Internal ip of client
Service Manager | unix-systemv


SERVICE TYPE   NAME             LOCAL PORT  RUNNING  DATA SOURCE                                OPTIONS
mysql:queries  XYZ.COMPANY.com  -           YES      root:@unix(/export/home/mysql/mysql.sock)  query_source=slowlog
linux:metrics  XYZ.COMPANY.com  42000       YES      -
mysql:metrics  XYZ.COMPANY.com  42002       YES      root:@unix(/export/home/mysql/mysql.sock)

But check-network shows that the server-to-client connection has a problem.
pmm-admin check-network
PMM Network Status

Server Address | PUBLIC-IP of Server
Client Address | Public ip of client (Internal ip of client )

  • System Time
    Server | 2017-01-02 23:28:12 -0800 PST
    Client | 2017-01-02 23:28:12 -0800 PST
    Time Drift | OK

  • Connection: Client → Server


SERVER SERVICE       STATUS
Consul API           OK
Prometheus API       OK
Query Analytics API  OK

Connection duration | 72.795236ms
Request duration | 83.901484ms
Full round trip | 156.69672ms

  • Connection: Client ← Server

SERVICE TYPE   NAME             REMOTE ENDPOINT                                    STATUS  HTTPS/TLS  PASSWORD
linux:metrics  XYZ.COMPANY.com  Public ip of client-->Internal ip of client:42000  DOWN    YES        -
mysql:metrics  XYZ.COMPANY.com  Public ip of client-->Internal ip of client:42002  DOWN    YES        -

Last lines of the following logs on the client machine:
pmm-mysql-metrics-42002.log
time="2017-01-02T23:29:17-08:00" level=info msg="Starting mysqld_exporter (version=1.0.7, branch=master, revision=4a4c53fb313fb1883bcbd464f53c83f73b336100)" source="mysqld_exporter.go:679"
time="2017-01-02T23:29:17-08:00" level=info msg="Build context (go=go1.7.4, user=, date=)" source="mysqld_exporter.go:680"
time="2017-01-02T23:29:17-08:00" level=info msg="HTTPS/TLS is enabled" source="mysqld_exporter.go:724"
time="2017-01-02T23:29:17-08:00" level=info msg="Listening on Internal ip of client:42002" source="mysqld_exporter.go:727"
2017/01/02 23:29:44 http: TLS handshake error from Internal ip of client:47658: tls: first record does not look like a TLS handshake

pmm-linux-metrics-42000.log
time="2017-01-02T23:29:16-08:00" level=info msg="HTTPS/TLS is enabled" source="node_exporter.go:235"
time="2017-01-02T23:29:16-08:00" level=info msg="Listening on Internal ip of client:42000" source="node_exporter.go:238"
2017/01/02 23:29:44 http: TLS handshake error from Internal ip of client:50902: tls: first record does not look like a TLS handshake
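
To pull only the interesting lines out of exporter logs like the two above, a small filter can help. This is a rough sketch: the function name is hypothetical, and `LOG_DIR` in the usage comment is a placeholder for wherever the pmm-*.log files live on your client, which this thread does not state.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read exporter log text on stdin and print only
# fatal entries and TLS handshake errors, the two kinds of problem
# lines that appear in the logs in this thread.
exporter_errors() {
    grep -E 'level=fatal|TLS handshake error'
}

# Example usage (LOG_DIR is a placeholder, not a known path):
#   exporter_errors < "$LOG_DIR/pmm-mysql-metrics-42002.log"
#   exporter_errors < "$LOG_DIR/pmm-linux-metrics-42000.log"
```

Note that grep exits with a non-zero status when nothing matches, so an empty result with exit status 1 simply means no such lines were found.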

Could you please let me know what is happening here? The client still doesn't show up in Grafana.

According to the output shown, the client has been added correctly.
Now we need to understand why the server cannot connect to the client.
Possible reasons:
  • internal client firewall
  • AWS security group rules
Ports 42000 (for linux:metrics) and 42002 (for mysql:metrics) must be open from the PMM Server to the PMM Client.
The following commands should connect to the port without errors.

# on PMM Client
telnet PRIVATE-IP-OF-CLIENT 42000
telnet PUBLIC-IP-OF-CLIENT 42000

# on PMM Server
telnet PUBLIC-IP-OF-CLIENT 42000
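
The same connectivity test can be scripted so that it does not hang the way telnet does when a firewall silently drops packets. The following is a sketch under assumptions: it relies on bash's built-in /dev/tcp redirection and the coreutils `timeout` command, and the function name `port_open` is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: exit status 0 if a TCP connection to host $1 on
# port $2 can be opened within 5 seconds, non-zero otherwise.
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example usage (PUBLIC-IP-OF-CLIENT is a placeholder, as in the commands above):
#   for port in 42000 42002; do
#       if port_open PUBLIC-IP-OF-CLIENT "$port"; then
#           echo "port $port reachable"
#       else
#           echo "port $port NOT reachable"
#       fi
#   done
```

This only proves that a TCP connection can be established; it does not check TLS or the HTTP response from the exporter.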

OK, so I think it has something to do with the firewall of the client inside our company datacenter. Can you please confirm?

ON CLIENT
[root@HOSTNAME pmm-client-1.0.7]# telnet INTERNAL_IP_OF_CLIENT 42000
Trying INTERNAL_IP_OF_CLIENT...
Connected to XYZ.COMPANY.com (INTERNAL_IP_OF_CLIENT).
Escape character is '^]'.

[root@HOSTNAME pmm-client-1.0.7]# telnet PUBLIC_IP_OF_CLIENT 42000
Trying PUBLIC_IP_OF_CLIENT...
^C

[root@HOSTNAME pmm-client-1.0.7]# netstat -tanpu | grep 420
tcp 0 0 INTERNAL_IP_OF_CLIENT :42000 0.0.0.0:* LISTEN 15416/node_exporter
tcp 0 0 INTERNAL_IP_OF_CLIENT :42002 0.0.0.0:* LISTEN 15438/mysqld_export

ON SERVER
[root@HOSTNAME-user]# telnet PUBLIC_IP_OF_CLIENT 42000
Trying PUBLIC_IP_OF_CLIENT...
^C

Yes, it is definitely a firewall issue:
neither the client nor the server can connect to PUBLIC_IP_OF_CLIENT on port 42000.

Thanks for confirming. I will try to get that fixed. Is there any other test, besides telnet, that we can run from the server to check its connectivity with the client?

If telnet is not installed, you can use curl or the openssl tool:

curl https://PUBLIC_IP_OF_CLIENT:42000/metrics --insecure
# or
(echo -e "GET /metrics HTTP/1.0\r\n"; sleep 1) | openssl s_client -connect PUBLIC_IP_OF_CLIENT:42000