We use PMM 1.17.0 to monitor MySQL DB hosts. Two MySQL hosts have been monitored for a few months.
Today, the QAN presentation for one of the hosts became unavailable. Namely, the QAN information for the period of about 09:50-10:50 AM is not shown.
When I try to display it, a message appears: “There is no query data because the MySQL Server is not configured for monitoring. For details about the required configuration, see Configuring MySQL for Percona Monitoring and Management in PMM documentation.” If I try to view information before or after the 09:50-10:50 AM period, everything is displayed fine.
(That happens only for one of the two monitored MySQL hosts. The second host’s QAN information can be presented OK, including the 09:50-10:50 AM period.)
The monitored MySQL DB was up and available all the time.
Information other than QAN is presented properly for the 09:50-10:50 AM period, for example “MySQL Overview” or “MySQL InnoDB Metrics”.
What could be the reason for this behavior, and is there any solution or workaround for it?
Please find attached a few screenshots.
I’ve noticed that the /var/log/pmm-mysql-queries-0.log file on the problematic MySQL host includes many error messages, like:
2019/05/13 10:43:03.319750 analyzer.go:411: qan-analyzer-mysql-31762327-worker crashed: ‘4304 2019-05-13 07:42:03 UTC to 2019-05-13 07:43:03 UTC (0-0)’: runtime error: invalid memory address or nil pointer dereference
goroutine 3174513 [running]:
runtime/debug.Stack(0x4e5c3c, 0xc42009c0f0, 0x2)
github.com/percona/qan-agent/qan/analyzer/mysql/worker/perfschema.(*Worker).Cleanup(0xc420196a00, 0x0, 0x0)
created by github.com/percona/qan-agent/qan/analyzer/mysql.(*RealAnalyzer).run
2019/05/13 10:43:03.319841 ERROR qan-analyzer-mysql-31762327 qan-analyzer-mysql-31762327-worker crashed: ‘4304 2019-05-13 07:42:03 UTC to 2019-05-13 07:43:03 UTC (0-0)’: runtime error: invalid memory address or nil pointer dereference
However, similar messages appear not only during the 09:50-10:50 AM period, but also before and after it.
(The corresponding log file on the second, non-problematic MySQL host does not include such messages at all.)
Could those messages be somehow related to the described behavior? How can I get rid of them?
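To check whether the crashes cluster around the missing QAN window, the crash lines can be counted per hour. The following is only a rough sketch: the regular expression is derived from the log excerpt above, and the function names `crashes_per_hour` and `report` are made up for illustration, not part of PMM or qan-agent.

```python
import re
from collections import Counter

# Pattern based on the "analyzer.go" crash lines in the excerpt above.
# The matching "ERROR ..." line for the same crash is deliberately not
# matched, so each crash is counted once.
CRASH_RE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<hour>\d{2}):\d{2}:\d{2}\.\d+ "
    r"analyzer\.go:\d+: \S+-worker crashed:"
)


def crashes_per_hour(lines):
    """Return a Counter mapping 'YYYY/MM/DD HH:00' to the number of crashes."""
    counts = Counter()
    for line in lines:
        m = CRASH_RE.match(line)
        if m:
            counts[f"{m['date']} {m['hour']}:00"] += 1
    return counts


def report(path="/var/log/pmm-mysql-queries-0.log"):
    """Print one line per hour with the crash count (hypothetical helper)."""
    with open(path, errors="replace") as fh:
        for hour, n in sorted(crashes_per_hour(fh).items()):
            print(hour, n)
```

Calling `report()` on the client host prints the hourly totals. Note that the timestamp at the start of each log line appears to be local time (10:43:03), while the interval inside the message is in UTC (07:42:03 to 07:43:03), so the hourly buckets here follow local time.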
hi avi vainshtein
Thanks for reporting this. I don’t have any initial conclusions for you, but can you supply the following:
1. Attach the logs.zip output from PMM Server (see the Percona Monitoring and Management documentation).
2. Attach the output of pmm-admin summary from the Client (see the Percona Monitoring and Management documentation).
Could you also share with us a view of the System Overview for the host where the outage occurred? We want to see how loaded it was during this period.
Hello Michael Coburn
Please find attached screenshots of System Overview for the mentioned time period.
Please also find attached the logs.zip from PMM Server
pmm-server_2019-05-14-06-53.zip (68.1 KB)
Please also find attached the output of pmm-admin summary from the Client host:
[root@vm-tcmmydbr ~]# pmm-admin summary
Collecting information for system diagnostic
Collect pmm-admin check-network output … Done
Collect pmm-admin list output … Done
Collect ps output … Done
Collect pt-summary output … Done
Collect list of open ports … Done
Collect service output … Done
Collect pt-mysql-summary output … ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)
2019_05_14_10_21_03 Cannot connect to MySQL. Check that MySQL is running and that the options after -- are correct.
Data collection complete. Please attach file summary_vm-tcmmydbr_2019-05-14T10_20_42.tar.gz to the issue as requested by Percona Support.
summary_vm-tcmmydbr_2019-05-14T10_20_42.tar.zip (155 KB)
Hello, in this case I think it would be best to raise this in our JIRA system so that our bug analyst can take a proper look at it in detail. Are you OK to do this?
I would help with it but if you own the ticket you will be kept up to date with progress. https://jira.percona.com
Let me know if you’d like help with setting that up though!
Hello @lorraine.pocklington, I’ve opened an issue in the JIRA system, as you recommended.