Problems with QAN in 1.8.0

After updating from 1.7.0 to 1.8.0, some DB instances couldn’t send data to QAN.

:~# pmm-admin list
pmm-admin 1.8.0

PMM Server | pmm.qa.com
Client Name | master.db.com
Client Address | 172.*.*.*
Service Manager | linux-systemd

-------------- ------------------------------------ ----------- -------- ------------------------------------------ ------------------------------------------
SERVICE TYPE NAME LOCAL PORT RUNNING DATA SOURCE OPTIONS
-------------- ------------------------------------ ----------- -------- ------------------------------------------ ------------------------------------------
mysql:queries master.db.com - YES pmm:***@unix(/var/run/mysqld/mysqld.sock) query_source=slowlog, query_examples=true
linux:metrics master.db.com 42000 YES -
mysql:metrics master.db.com 42002 YES pmm:***@unix(/var/run/mysqld/mysqld.sock)
~# pmm-admin check-network
PMM Network Status

Server Address | pmm.qa.com
Client Address | 172.*.*.*

* System Time
NTP Server (0.pool.ntp.org) | 2018-03-03 13:58:34 +0000 UTC
PMM Server | 2018-03-03 13:58:34 +0000 GMT
PMM Client | 2018-03-03 13:58:34 +0000 UTC
PMM Server Time Drift | OK
PMM Client Time Drift | OK
PMM Client to PMM Server Time Drift | OK

* Connection: Client --> Server
-------------------- -------
SERVER SERVICE STATUS
-------------------- -------
Consul API OK
Prometheus API OK
Query Analytics API OK

Connection duration | 800.78µs
Request duration | 1.69178ms
Full round trip | 2.49256ms


* Connection: Client <-- Server
-------------- ------------------------------------ --------------------- ------- ---------- ---------
SERVICE TYPE NAME REMOTE ENDPOINT STATUS HTTPS/TLS PASSWORD
-------------- ------------------------------------ --------------------- ------- ---------- ---------
linux:metrics master.db.com 172.*.*.*:42000 OK YES -
mysql:metrics master.db.com 172.*.*.*:42002 OK YES -

But in Grafana I see an error (screenshot not included in this transcript).

Also, when I try to remove mysql:queries in order to add it again, I get an error:

~# pmm-admin rm mysql:queries
Error removing MySQL queries master.db.com: timeout 10s waiting on agent to connect to API.


The second issue is not connected with 1.8.0.
In the query fingerprint I see wrong schema detection. For example:
db_server 1: has a user schema with an order table.
db_server 2: has a message schema with a mail table.

In QAN I am checking the user server. In the fingerprint I see

select from message.order ...

but there is no message schema on the user server.

Everything else works OK; Explain and table structure all look correct.

Hi Stateros, were you able to determine why some hosts stopped sending metrics after the upgrade to 1.8.0?
Can you please share the contents from /var/log/pmm-mysql-metrics* for an affected host so we can see why it is failing?

Regarding the schema: yes, we have identified this as a bug, but haven’t yet identified a release for the fix. Please follow this ticket; we welcome your commentary! Thanks:
https://jira.percona.com/browse/PMM-2266

Michael Coburn, I don’t have log files like the ones you mentioned. And I had updated pmm-server and all clients to 1.8.1 on release day.

less /var/log/
anaconda/ lastlog prometheus.log.2
btmp mysql.log purge-qan-data.log
consul.log nginx/ qan-api.log
createdb.log nginx.log rhsm/
createdb2.log node_exporter.log supervisor/
createdb3.log orchestrator.log tallylog
cron.log pmm-manage.log wtmp
dashboard-upgrade.log pmm-managed.log yum.log
grafana/ prometheus.log
grubby_prune_debug prometheus.log.1

Michael Coburn
Hi, I encountered the same error message in Grafana. Let me describe what I did here.

I pulled the latest PMM and installed it according to the documentation, and everything worked fine,

but I wanted to move the template storage from sqlite3 to MySQL, so I modified grafana.ini like this:

# Either "mysql", "postgres" or "sqlite3", it's your choice
type = mysql
host = 127.0.0.1:3306
name = grafana
user = root
# If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
password =

and then restarted the pmm-server container in Docker.

Next, I chose the data source and imported the dashboards in the Grafana web UI.

Oh no! Query Analytics did not work!

Please help me, thanks.

Thank you!!!

Hi Stateros, sorry for not being clear: the files in /var/log/pmm-* will exist on any node that has the pmm-client package installed and is running exporters.

Hi Ezail,

I suspect your issue is different from what Stateros is experiencing, and I suggest you open a separate thread for your request.

Can you share in that new thread the actual error message you are receiving? Thanks!


Thanks for your support, I have solved my issue.
When changing the template storage from sqlite3 to MySQL, I had to add a new data source named api-qan(mysql), which is used to store the template; I had not done this step before.

But I have hit another issue; I will open a separate thread to discuss it.

Thanks!!!

Hi Michael Coburn, here are 2 log files with MySQL metrics and queries covering 2 days:

pmm-mysql-queries.txt (17.5 KB)

pmm-mysql-metrics.txt (36 KB)

After updating to 1.9.0 the problem still exists.

Hi Stateros

Looking at mysql-metrics, it appears the exporter is starting up repeatedly, but I don’t see where the exporter is being killed:

time="2018-03-27T14:00:05Z" level=info msg="Listening on db_server_ip:42002" source="mysqld_exporter.go:393"
time="2018-03-27T14:30:05Z" level=info msg="Starting mysqld_exporter (version=1.8.1, branch=master, revision=74d5373dceed55bf9cb15a932fa0bedd8996e251)" source="mysqld_exporter.go:286"

Can you check if you have more than one exporter running?

ps -ef | grep mysqld_exporter

And in mysql-queries, we can see the agent starting, then immediately getting terminated:

2018/03/27 11:30:04.822084 main.go:194: API is ready
2018/03/27 11:30:04.887335 main.go:349: Caught terminated signal, shutting down
2018/03/27 11:30:04.887374 main.go:375: Stopping QAN...

I would suggest you remove the exporter/QAN agent and re-add it:

pmm-admin remove mysql
pmm-admin add mysql
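A slightly more explicit form of that remove/re-add cycle, as a sketch: the instance name and socket path below are taken from the `pmm-admin list` output earlier in the thread, and the user/socket values are examples to adjust for your own setup.

```shell
# Remove only the queries service for this instance, then re-add it.
# "master.db.com" is the instance name shown by `pmm-admin list`.
pmm-admin rm mysql:queries master.db.com
pmm-admin add mysql:queries --user pmm --socket /var/run/mysqld/mysqld.sock

# Verify the service came back and is marked as running.
pmm-admin list
```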

Michael Coburn

~# ps -ef | grep mysqld_exporter
root 23777 1 0 13:30 ? 00:00:00 /bin/sh -c /usr/local/percona/pmm-client/mysqld_exporter -collect.auto_increment.columns=true -collect.binlog_size=true -collect.global_status=true -collect.global_variables=true -collect.info_schema.innodb_metrics=true -collect.info_schema.processlist=true -collect.info_schema.query_response_time=true -collect.info_schema.tables=true -collect.info_schema.tablestats=true -collect.info_schema.userstats=true -collect.perf_schema.eventswaits=true -collect.perf_schema.file_events=true -collect.perf_schema.indexiowaits=true -collect.perf_schema.tableiowaits=true -collect.perf_schema.tablelocks=true -collect.slave_status=true -web.listen-address=172.25.127.186:42002 -web.auth-file=/usr/local/percona/pmm-client/pmm.yml -web.ssl-cert-file=/usr/local/percona/pmm-client/server.crt -web.ssl-key-file=/usr/local/percona/pmm-client/server.key >> /var/log/pmm-mysql-metrics-42002.log 2>&1
root 23778 23777 3 13:30 ? 00:04:21 /usr/local/percona/pmm-client/mysqld_exporter -collect.auto_increment.columns=true -collect.binlog_size=true -collect.global_status=true -collect.global_variables=true -collect.info_schema.innodb_metrics=true -collect.info_schema.processlist=true -collect.info_schema.query_response_time=true -collect.info_schema.tables=true -collect.info_schema.tablestats=true -collect.info_schema.userstats=true -collect.perf_schema.eventswaits=true -collect.perf_schema.file_events=true -collect.perf_schema.indexiowaits=true -collect.perf_schema.tableiowaits=true -collect.perf_schema.tablelocks=true -collect.slave_status=true -web.listen-address=172.25.127.186:42002 -web.auth-file=/usr/local/percona/pmm-client/pmm.yml -web.ssl-cert-file=/usr/local/percona/pmm-client/server.crt -web.ssl-key-file=/usr/local/percona/pmm-client/server.key
root 27437 27409 0 15:47 pts/0 00:00:00 grep --color=auto mysqld_exporter

What should I do with it? I can’t kill either of the processes: if I kill one of them, the second one also dies.
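That pair of PIDs is expected: the first process is just a `/bin/sh -c` wrapper that redirects the exporter’s output to the log file, and the second is the exporter itself, so killing either one takes both down. Rather than killing PIDs directly, the service can be bounced through pmm-admin; a sketch, assuming PMM 1.x command names:

```shell
# Restart the metrics exporter via pmm-admin rather than kill(1);
# this stops the /bin/sh wrapper and the exporter together and
# starts a fresh pair.
pmm-admin restart mysql:metrics

# Or stop and start explicitly:
pmm-admin stop mysql:metrics
pmm-admin start mysql:metrics
```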

It didn’t help. I still see this in the log:

# Version: percona-qan-agent 1.0.5
# Basedir: /usr/local/percona/qan-agent
# PID: 12490
# API: pmm.qa.com/qan-api
# UUID: 640466ec83f743e262efd49a6b6ec7f2
2018/04/16 06:00:05.393009 main.go:153: Starting agent...
2018/04/16 06:00:05.396324 main.go:321: Agent is ready
2018/04/16 06:00:05.423233 main.go:194: API is ready
2018/04/16 06:00:05.446720 main.go:349: Caught terminated signal, shutting down
2018/04/16 06:00:05.446758 main.go:375: Stopping QAN...
2018/04/16 06:00:05.448388 main.go:382: Waiting 2 seconds to flush agent log to API...
2018/04/16 06:00:07.448587 main.go:157: Agent has stopped
# Version: percona-qan-agent 1.0.5
# Basedir: /usr/local/percona/qan-agent
# PID: 12535
# API: pmm.qa.com/qan-api
# UUID: 640466ec83f743e262efd49a6b6ec7f2
2018/04/16 06:00:07.483234 main.go:153: Starting agent...
2018/04/16 06:00:07.492553 main.go:321: Agent is ready
2018/04/16 06:00:07.493524 main.go:194: API is ready

Hi Stateros

Are all your systems using systemd? Is there anything in the systemd logs that indicates why the binaries are being shut down?
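If systemd is in use, the unit journals are the place to look. A sketch, assuming PMM 1.x client unit names of the form pmm-<service>-<port>; check the actual names and ports on your host first, and note the timestamps below are just the ones from the agent log excerpt above:

```shell
# List the PMM units systemd knows about on this client.
systemctl list-units 'pmm-*'

# Pull the journal around one of the restart timestamps seen in the
# agent log, to catch whatever issued the stop.
journalctl -u pmm-mysql-queries-0 --since "2018-04-16 05:55" --until "2018-04-16 06:05"
journalctl -u pmm-mysql-metrics-42002 -n 100
```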

Hi Stateros
Did you try to re-install pmm-client? Is it CentOS?

Hi nailya, no, I use Ubuntu 16.04 on all PMM and DB servers. Sure, I tried updating and re-installing. I even rebuilt the pmm-server.

Hi Michael Coburn, checked with our OPS guys. Nothing suspicious.

It looks like after updating PMM to 1.10.0 there are no issues anymore.

Hi,

I have the same issue with PMM 1.10.0. Everything is configured, but it doesn’t work. Please help me.