ERROR qan-analyzer driver: bad connection

aleksey.filippov · January 30, 2017, 1:10pm

Hi!
After running “pmm-admin start --all” everything works fine for a while. After approximately 2-3 hours, Query Analytics stops showing data. We have 2 servers for now, and both have the same problem.
In /var/log/pmm-mysql-queries-0.log we can see:

[mysql] 2017/01/30 20:29:09 packets.go:59: unexpected EOF
[mysql] 2017/01/30 20:29:09 packets.go:386: busy buffer
2017/01/30 20:29:09.538585 ERROR qan-analyzer-1342087d driver: bad connection
2017/01/30 20:30:00.001315 WARNING qan-analyzer-1342087d-worker Interval out of sequence: got 186, expected 182
2017/01/30 20:30:08.733680 ERROR qan-analyzer-1342087d-worker Got class twice: registry_dev2 e1fecd37897b7375d76acae411ed0f5b
2017/01/30 20:31:00.011321 WARNING qan-analyzer-1342087d Skipping interval ‘187 2017-01-30 18:30:00 UTC to 2017-01-30 18:31:00 UTC (0-0)’ because interval ‘186 2017-01-30 18:29:00 UTC to 2017-01-30 18:30:00 UTC (0-0)’ is still being parsed
2017/01/30 20:32:00.005720 WARNING qan-analyzer-1342087d Skipping interval ‘188 2017-01-30 18:31:00 UTC to 2017-01-30 18:32:00 UTC (0-0)’ because interval ‘186 2017-01-30 18:29:00 UTC to 2017-01-30 18:30:00 UTC (0-0)’ is still being parsed
2017/01/30 20:33:00.001729 WARNING qan-analyzer-1342087d Skipping interval ‘189 2017-01-30 18:32:00 UTC to 2017-01-30 18:33:00 UTC (0-0)’ because interval ‘186 2017-01-30 18:29:00 UTC to 2017-01-30 18:30:00 UTC (0-0)’ is still being parsed
[mysql] 2017/01/30 20:33:28 packets.go:59: unexpected EOF
[mysql] 2017/01/30 20:33:28 packets.go:386: busy buffer
[mysql] 2017/01/30 20:33:28 connection.go:307: invalid connection
2017/01/30 20:33:28.778027 ERROR qan-analyzer-1342087d driver: bad connection
2017/01/30 20:34:00.003532 WARNING qan-analyzer-1342087d-worker Interval out of sequence: got 190, expected 187
2017/01/30 20:34:04.971836 ERROR qan-analyzer-1342087d-worker Got class twice: registry_dev2 e1fecd37897b7375d76acae411ed0f5b

After “pmm-admin restart --all” there are no errors for some time and Query Analytics works fine, but not for long. Then the same errors come back.

The server and clients were reinstalled following the documentation 3 times, with no effect.
How can I fix this problem?

Mykola · January 31, 2017, 10:22am

Hi Aleksey,

Did you have MySQL restarts in this period? See https://github.com/go-sql-driver/mysql/issues/449

aleksey.filippov · January 31, 2017, 4:33pm

No. The PMM client now loses its connection approximately every hour. All servers have been running without a restart for a few weeks.

Mykola · February 1, 2017, 12:54am

That looks strange.
Is pmm-admin run locally on the database host?
Can you share its output? (Just replace IPs with x.x.x.x.)

aleksey.filippov · February 1, 2017, 7:40am

[root@mysql ~]# pmm-admin list
pmm-admin 1.0.7

PMM Server | 1.1.1.1:82
Client Name | mysql.db
Client Address | 1.1.1.9
Service Manager | linux-systemd

SERVICE TYPE   NAME      LOCAL PORT  RUNNING  DATA SOURCE                             OPTIONS

mysql:queries  mysql.db  -           YES      root:@unix(/var/lib/mysql/mysql.sock)   query_source=perfschema
linux:metrics  mysql.db  42000       YES      -
mysql:metrics  mysql.db  42002       YES      root:@unix(/var/lib/mysql/mysql.sock)

aleksey.filippov · February 1, 2017, 7:41am

And yes, pmm-admin runs locally on the database host.

Mykola · February 1, 2017, 9:18am

Do you use MariaDB+Galera? See MDEV-10812, “WSREP causes responses being sent to protocol commands that must not send a response”.

aleksey.filippov · February 2, 2017, 2:21am

No, only Percona Server:
Percona-Server-server-57-5.7.16-10.1.el7.x86_64
Percona-Server-shared-compat-57-5.7.16-10.1.el7.x86_64
Percona-Server-client-57-5.7.16-10.1.el7.x86_64
percona-zabbix-templates-1.1.7-2.noarch
percona-toolkit-2.2.20-1.noarch
percona-release-0.1-4.noarch
Percona-Server-shared-57-5.7.16-10.1.el7.x86_64

Mykola · February 2, 2017, 2:13pm

What is your OS?
Is SELinux disabled?
Do you have any periodic jobs?
Does your system have a high enough max connection count?

aleksey.filippov · February 3, 2017, 2:24am

CentOS 7, fully updated.
SELinux is disabled:

[root@mysql ~]# sestatus
SELinux status: disabled

Periodic jobs:

Restarting QAN (until the “driver: bad connection” problem is fixed):

[root@mysql ~]# crontab -l
*/30 * * * * /usr/sbin/pmm-admin restart mysql:queries > /dev/null

/etc/cron.hourly/logrotate.mysql

#!/bin/sh

/usr/sbin/logrotate /etc/logrotate.d/mysql
EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
    /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
exit 0

/etc/logrotate.d/mysql

/var/log/mysqld.log {
    notifempty
    size 100M
    rotate 100
    missingok
    compress
    olddir /var/log/mysql_old_logs
    postrotate
        touch /var/log/mysqld.log
        chown mysql:mysql /var/log/mysqld.log
        chmod 600 /var/log/mysqld.log
        mysqladmin flush-logs
    endscript
}

All other CentOS 7 system jobs are unchanged.

As for whether the system has a high enough max connection count:

net.ipv4.ip_local_port_range = 32768 60999
net.core.somaxconn = 128

Or how can I find that out?
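
If the question is about MySQL's own connection limit rather than kernel settings, the values to compare are `max_connections` and the `Max_used_connections` status counter. A minimal sketch; the function name is made up for illustration, and the commented-out lines assume a `mysql` client that can log in without prompting:

```shell
# Print peak connection usage as a whole-number percentage of the limit.
# $1: Max_used_connections (status counter), $2: max_connections (variable)
connection_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# On the database host the two inputs could be read like this. It needs a
# working mysql client login, so it is left commented out:
#   used=$(mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'Max_used_connections'" | awk '{print $2}')
#   max=$(mysql -N -B -e "SELECT @@GLOBAL.max_connections")
#   echo "peak usage: $(connection_usage_pct "$used" "$max")%"
```

A peak close to 100% would suggest that new connections, including the one QAN uses, may be refused at busy times.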

Mykola · February 7, 2017, 3:18am

Maybe you have a periodic task on your application server that uses MySQL and runs every 3 hours?
Can you share the “MySQL Connections” graph?
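
One way to look for a periodic pattern is to bucket the “driver: bad connection” errors by hour. A minimal sketch, assuming the log line format shown at the top of this thread; the function name is made up for illustration:

```shell
# Count "driver: bad connection" errors per hour in a PMM QAN log.
# Assumes lines such as:
#   2017/01/30 20:29:09.538585 ERROR qan-analyzer-1342087d driver: bad connection
# $1: path to the log file
bad_connections_per_hour() {
    awk '/driver: bad connection/ {
             split($2, t, ":")              # t[1] is the hour
             n[$1 " " t[1] ":00"]++
         }
         END { for (h in n) print n[h], h }' "$1" | sort -k2,3
}

# Usage:
#   bad_connections_per_hour /var/log/pmm-mysql-queries-0.log
```

Each output line is a count followed by the date and hour, which can be compared against cron schedules and application job times.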

aleksey.filippov · February 8, 2017, 1:39am

Well, we have a large number of tasks that use the server, but they all come from other servers; there are no local tasks. Database creation and dropping, DDL, DML, etc. A lot.
Images are at this URL: https://www.dropbox.com/sh/geb4muklv...iKF4jVRva?dl=0
Sorry, I can't get image uploading to work on this forum.

Mykola · February 8, 2017, 8:04am

Can you try switching the QAN collect source from “Performance Schema” to “Slow Log”?
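
On a PMM 1.x client, one way to make this switch from the command line is to remove the `mysql:queries` service and add it again with `--query-source slowlog`. A minimal sketch; the `run` wrapper and `DRY_RUN` variable are made up here so the commands can be previewed first, and it assumes pmm-admin can log in to MySQL with its default credentials:

```shell
# Re-add the QAN service with the slow log as its query source.
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to actually execute them on the database host.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

switch_qan_to_slowlog() {
    run pmm-admin rm mysql:queries
    run pmm-admin add mysql:queries --query-source slowlog
    run pmm-admin list   # OPTIONS column should now show the slow log source
}

switch_qan_to_slowlog
```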

aleksey.filippov · February 8, 2017, 11:32am

OK, I will.
By the way, 3 hours ago I configured a new server with the same settings. And the result? The same problem.

aleksey.filippov · February 10, 2017, 1:32am

So, with “--query-source slowlog” everything is fine. No problems since yesterday morning, which is about 20 hours.
Another problem: slowquery.log grows very fast, 890M since switching from performance_schema. Can I rotate it without losing collected data in PMM?

Mykola · February 10, 2017, 3:47am

Yes, you can. QAN should work fine; if you have any issues, please let me know.

Don’t use https://www.percona.com/doc/percona-…_rotation.html

aleksey.filippov · February 10, 2017, 5:20am

“This feature is currently considered BETA quality.”
Can I use it in a production environment, or is it better to use logrotate for now?

Mykola · February 10, 2017, 5:32am

It’s beta, so it’s up to you.
Logrotate should also work OK. Example configuration: https://www.percona.com/blog/2013/04/18/rotating-mysql-slow-logs-safely/
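
A minimal logrotate sketch for the slow log, in the same style as the mysqld.log config earlier in the thread. The log path, size threshold, and retention count are assumptions for illustration, not values taken from the linked post, and `mysqladmin flush-logs` assumes root can log in to MySQL without a prompt:

```shell
# Write a logrotate config for the MySQL slow log.
# $1: slow log path (hypothetical here; use the server's slow_query_log_file)
# $2: where to write the config, e.g. /etc/logrotate.d/mysql-slow
# The postrotate step makes mysqld reopen the log so it writes to the new file.
write_slowlog_logrotate() {
    cat > "$2" <<EOF
$1 {
    size 1G
    rotate 10
    missingok
    notifempty
    nocompress
    create 660 mysql mysql
    postrotate
        mysqladmin flush-logs
    endscript
}
EOF
}

# Usage (as root):
#   write_slowlog_logrotate /var/lib/mysql/slowquery.log /etc/logrotate.d/mysql-slow
```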

aleksey.filippov · February 10, 2017, 5:40am

Thanks. But you have to fix the problem with performance_schema in QAN.

Mykola · February 10, 2017, 10:13am

After some research:
Please don’t use slowlog_rotation in PS. It is BETA, and it looks like PMM handles it in some strange way.
Please write immediately if you have any issues with logrotate.
