
ERROR qan-analyzer driver: bad connection

Hi!
After running "pmm-admin start --all", everything works fine for a while. After approximately 2-3 hours, Query Analytics stops showing data. We have 2 servers for now, and both have the same problem.
In /var/log/pmm-mysql-queries-0.log we can see:

[mysql] 2017/01/30 20:29:09 packets.go:59: unexpected EOF
[mysql] 2017/01/30 20:29:09 packets.go:386: busy buffer
2017/01/30 20:29:09.538585 ERROR qan-analyzer-1342087d driver: bad connection
2017/01/30 20:30:00.001315 WARNING qan-analyzer-1342087d-worker Interval out of sequence: got 186, expected 182
2017/01/30 20:30:08.733680 ERROR qan-analyzer-1342087d-worker Got class twice: registry_dev2 e1fecd37897b7375d76acae411ed0f5b
2017/01/30 20:31:00.011321 WARNING qan-analyzer-1342087d Skipping interval '187 2017-01-30 18:30:00 UTC to 2017-01-30 18:31:00 UTC (0-0)' because interval '186 2017-01-30 18:29:00 UTC to 2017-01-30 18:30:00 UTC (0-0)' is still being parsed
2017/01/30 20:32:00.005720 WARNING qan-analyzer-1342087d Skipping interval '188 2017-01-30 18:31:00 UTC to 2017-01-30 18:32:00 UTC (0-0)' because interval '186 2017-01-30 18:29:00 UTC to 2017-01-30 18:30:00 UTC (0-0)' is still being parsed
2017/01/30 20:33:00.001729 WARNING qan-analyzer-1342087d Skipping interval '189 2017-01-30 18:32:00 UTC to 2017-01-30 18:33:00 UTC (0-0)' because interval '186 2017-01-30 18:29:00 UTC to 2017-01-30 18:30:00 UTC (0-0)' is still being parsed
[mysql] 2017/01/30 20:33:28 packets.go:59: unexpected EOF
[mysql] 2017/01/30 20:33:28 packets.go:386: busy buffer
[mysql] 2017/01/30 20:33:28 connection.go:307: invalid connection
2017/01/30 20:33:28.778027 ERROR qan-analyzer-1342087d driver: bad connection
2017/01/30 20:34:00.003532 WARNING qan-analyzer-1342087d-worker Interval out of sequence: got 190, expected 187
2017/01/30 20:34:04.971836 ERROR qan-analyzer-1342087d-worker Got class twice: registry_dev2 e1fecd37897b7375d76acae411ed0f5b
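
A quick way to see how often the analyzer loses its connection is to count the errors per hour. The following is a minimal sketch, assuming the log format shown above; the function name `bad_connections_per_hour` is made up for illustration and is not part of PMM.

```shell
#!/bin/sh
# Count "driver: bad connection" errors per hour in a QAN agent log.
# Assumes lines of the form:
#   2017/01/30 20:29:09.538585 ERROR qan-analyzer-... driver: bad connection
bad_connections_per_hour() {
    awk '/ERROR/ && /driver: bad connection/ {
             # $1 is the date, $2 the time; keep only the hour part.
             hour = $1 " " substr($2, 1, 2) ":00"
             count[hour]++
         }
         END { for (h in count) print h, count[h] }' "$1" | sort
}
```

For example, `bad_connections_per_hour /var/log/pmm-mysql-queries-0.log` prints one line per hour: the hour, followed by the number of errors logged in it.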

After "pmm-admin restart --all" there are no errors for some time and Query Analytics works fine, but not for long. Then the same errors return.

The server and clients were reinstalled according to the documentation 3 times, with no effect.
How can I fix this problem?

Comments

  • Mykola (Percona Staff)
    Hi Aleksey,

    Did you have MySQL restarts in this period? https://github.com/go-sql-driver/mysql/issues/449
  • aleksey.filippov (Contributor)
    No. The PMM client now loses its connection approximately every hour. All servers have been running without a restart for a few weeks.
  • Mykola (Percona Staff)
    That looks strange.
    Is pmm-admin run locally on the database host?
    Can you share the output? (Just replace the IPs with x.x.x.x.)
  • aleksey.filippov (Contributor)
    # pmm-admin list
    pmm-admin 1.0.7

    PMM Server      | 1.1.1.1:82
    Client Name     | mysql.db
    Client Address  | 1.1.1.9
    Service Manager | linux-systemd

    SERVICE TYPE   NAME      LOCAL PORT  RUNNING  DATA SOURCE                               OPTIONS
    mysql:queries  mysql.db  -           YES      root:***@unix(/var/lib/mysql/mysql.sock)  query_source=perfschema
    linux:metrics  mysql.db  42000       YES      -
    mysql:metrics  mysql.db  42002       YES      root:***@unix(/var/lib/mysql/mysql.sock)
  • aleksey.filippov (Contributor)
    And yes, pmm-admin runs locally on the database host.
  • Mykola (Percona Staff)
    Do you use MariaDB+Galera? https://jira.mariadb.org/browse/MDEV-10812
  • aleksey.filippov (Contributor)
    No, only Percona Server:
    Percona-Server-server-57-5.7.16-10.1.el7.x86_64
    Percona-Server-shared-compat-57-5.7.16-10.1.el7.x86_64
    Percona-Server-client-57-5.7.16-10.1.el7.x86_64
    percona-zabbix-templates-1.1.7-2.noarch
    percona-toolkit-2.2.20-1.noarch
    percona-release-0.1-4.noarch
    Percona-Server-shared-57-5.7.16-10.1.el7.x86_64
  • Mykola (Percona Staff)
    What is your OS?
    Is SELinux disabled?
    Do you have any periodic jobs?
    Does your system have a high enough max connection count?
  • aleksey.filippov (Contributor)
    CentOS 7, fully updated.
    SELinux is disabled:

    # sestatus
    SELinux status: disabled

    Periodic jobs:

    1. Restarting QAN (until the "driver: bad connection" problem is fixed)

    # crontab -l
    */30 * * * * /usr/sbin/pmm-admin restart mysql:queries > /dev/null

    2. /etc/cron.hourly/logrotate.mysql

    #!/bin/sh

    /usr/sbin/logrotate /etc/logrotate.d/mysql
    EXITVALUE=$?
    if [ $EXITVALUE != 0 ]; then
        /usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
    fi
    exit 0

    3. /etc/logrotate.d/mysql

    /var/log/mysqld.log {
        notifempty
        size 100M
        rotate 100
        missingok
        compress
        olddir /var/log/mysql_old_logs
        postrotate
            touch /var/log/mysqld.log
            chown mysql:mysql /var/log/mysqld.log
            chmod 600 /var/log/mysqld.log
            mysqladmin flush-logs
        endscript
    }

    4. All other CentOS 7 system jobs are unchanged.

    "Does your system have a high enough max connection count?"

    net.ipv4.ip_local_port_range = 32768 60999
    net.core.somaxconn = 128

    Or how can I find that out?
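
The two sysctl values above are kernel network settings (the local port range and the listen backlog), not MySQL's own connection limit. A minimal sketch for checking the MySQL side follows. It assumes the `mysql` client can log in without extra arguments (for example through `~/.my.cnf`); the function name `connection_usage` is hypothetical.

```shell
#!/bin/sh
# Print the configured connection limit next to the highest number of
# simultaneous connections seen since the server started.
# Assumption: "mysql" can authenticate without arguments on this host.
connection_usage() {
    # -N suppresses column names, -B gives tab-separated batch output.
    max=$(mysql -N -B -e "SELECT @@GLOBAL.max_connections")
    used=$(mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'Max_used_connections'" |
           awk '{ print $2 }')
    echo "max_connections=$max max_used_connections=$used"
}
```

If `max_used_connections` is close to `max_connections`, the server has been at or near its limit at some point since startup.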
  • Mykola (Percona Staff)
    1. Maybe you have a periodic task on your application server that uses MySQL and runs every 3 hours?
    2. Can you share the "MySQL Connections" graph?
  • aleksey.filippov (Contributor)
    1. Well, we have a large number of tasks that use the server, but all of them run from other servers; there are no local tasks. Database creation, dropping, DDL, DML, etc. A lot.
    2. Images by URL: https://www.dropbox.com/sh/geb4muklv...iKF4jVRva?dl=0
    Sorry, I couldn't get image uploading to work on this forum.
  • Mykola (Percona Staff)
    Can you try switching the QAN "Collect from" source from "Performance Schema" to "Slow Log"?
  • aleksey.filippov (Contributor)
    OK, I will.
    By the way, 3 hours ago I set up a new server with the same settings. The result? The same problem.
  • aleksey.filippov (Contributor)
    So, with "--query-source slowlog" everything is fine. No problems since yesterday morning, which is about 20 hours.
    Another problem: slowquery.log grows very fast, 890M since switching from performance_schema. Can I rotate it without losing collected data in PMM?
  • Mykola (Percona Staff)
    Yes, you can. QAN should work fine; if you hit any issues, please let me know.

    don't use https://www.percona.com/doc/percona-..._rotation.html
  • aleksey.filippov (Contributor)
    "This feature is currently considered BETA quality."
    Can I use it in a production environment, or is it better to use logrotate for now?
  • Mykola (Percona Staff)
    It's beta, so it is up to you.
    Logrotate should also work OK. Example configuration: https://www.percona.com/blog/2013/04/18/rotating-mysql-slow-logs-safely/
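
As a rough illustration of what a manual rotation involves, the sketch below renames the slow log and then asks the server to reopen it. It assumes the `mysql` client can log in without extra arguments and that the caller is allowed to rename the file. `rotate_slow_log` is a hypothetical helper name, and this is not the configuration from the linked blog post.

```shell
#!/bin/sh
# Rotate a MySQL slow query log by renaming it and flushing.
# Usage: rotate_slow_log /path/to/slow.log
rotate_slow_log() {
    log_file=$1
    rotated="$log_file.$(date +%Y%m%d%H%M%S)"

    # After the rename, mysqld keeps writing to the renamed file
    # because it still holds the old file handle open.
    mv "$log_file" "$rotated" || return 1

    # FLUSH SLOW LOGS closes that handle and reopens the original
    # path, which creates a fresh, empty slow log.
    mysql -e 'FLUSH SLOW LOGS' || return 1

    echo "$rotated"
}
```

The current log path can be read with `SELECT @@GLOBAL.slow_query_log_file`. Whether QAN follows the new file without gaps is worth verifying on a test host first.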
  • aleksey.filippov (Contributor)
    Thanks. But you have to fix the problem with performance_schema in QAN :)
  • Mykola (Percona Staff)
    After some research:
    Please don't use slowlog_rotation in PS; it is BETA and it looks like PMM handles it strangely.
    Please write immediately if you have any issues with logrotate.
  • Mykola (Percona Staff)
    Aleksey,

    Can you switch back to perfschema and try disabling query examples?
    We have a theory that the issue is related to query examples.
  • aleksey.filippov (Contributor)
    Mykola, how can I "disable query examples"?
  • Mykola (Percona Staff)
    - Open the "Query Analytics" interface
    - Choose the database host
    - Click the SETTINGS button
    - Disable (uncheck) the "Send real query examples" flag
    - Set the "Collect from" field to "Performance Schema"
    - Click the Apply button

    Please let me know if any bugs still exist.
  • aleksey.filippov (Contributor)
    You were right, it works fine with "Send real query examples" disabled.
  • aleksey.filippov (Contributor)
    Is there a way to disable "Send real query examples" from the command line? I use Ansible to add hosts to PMM, so I need to do it without the GUI.
  • Mykola (Percona Staff)
    Currently it is impossible to add a node with the "Send real query examples" option disabled.

    However, I added this functionality as an experimental feature (it is not reviewed and not tested): https://github.com/percona/pmm-client/pull/24
    If you want, you can compile and test it:
    sudo yum install golang git
    mkdir /tmp/pmm-admin
    cd /tmp/pmm-admin
    git clone -b query-examples https://github.com/percona/pmm-client src/github.com/percona/pmm-client
    export GOPATH=$(pwd)
    go build ./src/github.com/percona/pmm-client/
    sudo mv pmm-client /usr/sbin/pmm-admin
    sudo pmm-admin remove mysql
    sudo pmm-admin add mysql --disable-queryexamples
    

    Anyway, this is a workaround; we are still searching for the root cause of the issue.
  • Mykola (Percona Staff)
    Hi Aleksey,

    In the end, I created the issue https://jira.percona.com/browse/PMM-589
    Feel free to add any additional information to the Jira ticket.
  • aleksey.filippov (Contributor)
    OK, thanks.
  • aleksey.filippov (Contributor)
    The adventure is not over. Everything works fine until slow_query_log is disabled, even though performance_schema is the query source. Then the same problem appears: "driver: bad connection".
  • Mykola (Percona Staff)
    Sorry, can you clarify one more time?
    With slow_query_log=OFF, performance_schema=ON, query_examples=OFF, do you still have any issues?
  • aleksey.filippov (Contributor)
    Right now it is "slow_query_log=OFF, performance_schema=ON, query_examples=OFF".
    The problem still exists.
    There are a lot of WARNING messages in pmm-mysql-queries-0.log:

    2017/02/17 08:51:00.001570 WARNING qan-analyzer-f122d8f8-worker Interval out of sequence: got 85, expected 81
    2017/02/17 08:52:00.001470 WARNING qan-analyzer-f122d8f8 Skipping interval '86 2017-02-17 06:51:00 UTC to 2017-02-17 06:52:00 UTC (0-0)' because interval '85 2017-02-17 06:50:00 UTC to 2017-02-17 06:51:00 UTC (0-0)' is still being parsed
    2017/02/17 08:53:00.001461 WARNING qan-analyzer-f122d8f8 Skipping interval '87 2017-02-17 06:52:00 UTC to 2017-02-17 06:53:00 UTC (0-0)' because interval '85 2017-02-17 06:50:00 UTC to 2017-02-17 06:51:00 UTC (0-0)' is still being parsed
    2017/02/17 08:54:00.001591 WARNING qan-analyzer-f122d8f8 Skipping interval '88 2017-02-17 06:53:00 UTC to 2017-02-17 06:54:00 UTC (0-0)' because interval '85 2017-02-17 06:50:00 UTC to 2017-02-17 06:51:00 UTC (0-0)' is still being parsed
    2017/02/17 08:55:00.001422 WARNING qan-analyzer-f122d8f8 Skipping interval '89 2017-02-17 06:54:00 UTC to 2017-02-17 06:55:00 UTC (0-0)' because interval '85 2017-02-17 06:50:00 UTC to 2017-02-17 06:51:00 UTC (0-0)' is still being parsed
    2017/02/17 08:56:00.001660 WARNING qan-analyzer-f122d8f8-worker Interval out of sequence: got 90, expected 86
    2017/02/17 08:57:00.001457 WARNING qan-analyzer-f122d8f8 Skipping interval '91 2017-02-17 06:56:00 UTC to 2017-02-17 06:57:00 UTC (0-0)' because interval '90 2017-02-17 06:55:00 UTC to 2017-02-17 06:56:00 UTC (0-0)' is still being parsed
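
To see which interval the analyzer is stuck on, the "Skipping interval" warnings can be grouped by the interval that blocks them. A minimal sketch, assuming the message format shown above; `blocked_intervals` is a hypothetical name and not a PMM tool.

```shell
#!/bin/sh
# For each interval that is "still being parsed", print its number and
# how many later intervals were skipped because of it.
blocked_intervals() {
    awk '/Skipping interval/ {
             blocker = $0
             # Drop everything up to and including the quote that opens
             # the blocking interval, then keep only its leading number.
             sub(/.*because interval ./, "", blocker)
             sub(/ .*/, "", blocker)
             skipped[blocker]++
         }
         END { for (b in skipped) print b, skipped[b] }' "$1" | sort -n
}
```

Run against the excerpt above, this reports that interval 85 blocked four later intervals and interval 90 blocked one.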
This discussion has been closed.
