PMM 2.40 Query Analytics doesn't work for MySQL services

Greetings

Recently installed a fresh PMM 2.40 Server.

Added two MySQL nodes (clients) to be monitored, using 2.20 RPM, Centos 7.9 hosts.

Command to MySQL host/service to Add & Register

pmm-admin config --server-insecure-tls --server-url=http://admin:XXXXXX@<PMM_Server>:80

pmm-admin add mysql --query-source=slowlog --size-slow-logs=-1GiB --username=pmm --password=XXXXX vm-tngmmydbt12 vm-tngmmydbt12:3306

As a result - the MySQL node&service is presented in PMM dashboards, but no MySQL-related information at the Query Analytics panel.

pmm-admin inventory list services

Services list.
Service type Service name Address and Port Service ID
MySQL MYSQL_NODE vm-tngmmydbt11:3306 /service_id/19feefc2-d67a-495d-8593-0d0003d20ee2
MySQL vm-tngmmydbt12 vm-tngmmydbt12:3306 /service_id/7e32c412-f4bd-42aa-ba82-106a6ae45343
PostgreSQL pmm-server-postgresql 127.0.0.1:5432 /service_id/0cad6b88-983c-4034-a1e9-2564002baae2

Slow log parameters are ON at the MySQL targets.

What could be wrong or missing in QAN MySQL configuration ?

Best regards,
Avi Vainshtein

pmm-admin list

Service type Service name Address and port Service ID
MySQL vm-tngmmydbt12 vm-tngmmydbt12:3306 /service_id/7e32c412-f4bd-42aa-ba82-106a6ae45343

Agent type Status Metrics Mode Agent ID Service ID Port
pmm_agent Connected /agent_id/b161194f-4ddc-4cd3-ae99-53204cb3ae98 0
node_exporter Running push /agent_id/aa8c5c5a-bb01-4075-b070-9d77ba122ae8 42001
mysqld_exporter Running push /agent_id/b78b9d53-2317-4491-a6c3-883918d55a8d /service_id/7e32c412-f4bd-42aa-ba82-106a6ae45343 42002
mysql_slowlog_agent Running /agent_id/857ed073-baae-48e1-81e7-2fd60faf1e07 /service_id/7e32c412-f4bd-42aa-ba82-106a6ae45343 0
vmagent Running push /agent_id/8256cd69-be04-45b7-90cd-65ea6fe9809f 42000

The monitored MySQL clients are 8.0.33 Community Edition.

Hi @avi_vainshtein , could you run pmm-admin summary and share logs from mysql_slowlog_agent?

Hello @nurlan

Thanks for the fast reply.
As a new user i am not allowed to upload files here.
Is there any other way to provide you with the requested information ?

regards,
Avi

Hello @avi_vainshtein,

–size-slow-logs=-1GiB

Did you set your slow log size to a negative number? This should be a positive number representing the max size of the log file.

Also, what is the MySQL parameter SELECT @@long_query_time; If this is > 1, and all of your queries execute faster than 1s, you won’t see anything in PMM. What is the value of SELECT @@slow_query_log;

Hello @matthewb

Many thanks for your reply.
Indeed there was a typo mistake with slow log size.
So i removed that service and added it again as
pmm-admin add mysql --query-source=slowlog --size-slow-logs=1GiB --username=pmm --password=XXXXXX vm-tngmmydbt12 vm-tngmmydbt12:3306
MySQL Service added.
Service ID : /service_id/5124813a-b3f4-4c7c-8db4-4f3770fe4d1a
Service name: vm-tngmmydbt12
Table statistics collection enabled (the limit is 1000, the actual table count is 558).

The other results are
mysql> SELECT @@slow_query_log;
±-----------------+
| @@slow_query_log |
±-----------------+
| 1 |
±-----------------+

| slow_query_log | ON |
| long_query_time | 2.000000 |

I’ve reproduced a slow query - appears in slow log as

Time: 2023-12-13T09:05:34.388248+02:00

User@Host: root[root] @ localhost Id: 320929

Query_time: 82.302393 Lock_time: 0.000029 Rows_sent: 0 Rows_examined: 227509

use test;
SET timestamp=1702451052;
select
min(a.ip)
from
avi_alarms_history a,
avi_alarms_history b;

But still, no information about that node/service in Query Analytics.
" No queries available for this combination of filters in the selected time frame"

The other node, the service of which i’ve set with “–query-source=perfschema” - does present QAN results.

regards,
Avi

By the way, may be i am wrong trying to implement slow-log data source at all, and maybe
perf-schema provides more detailed and accurate information ?

Actually, the opposite is true. You get more detailed info from slow query log.

Can you check the docker pmm-server logs and see if there are any errors in accessing the slow log file?

by the “docker pmm-server logs” - you mean the contents of /srv/logs/ directory inside the pmm-server docker ?

which of the files could be useful for further investigation ?

[root@edf5728f2ad2 logs]# pwd
/srv/logs
[root@edf5728f2ad2 logs]# ls -lrt
total 364688
-rw-r–r-- 1 root root 2308 Dec 7 09:02 postgresql14.log
-rw-r–r-- 1 root root 3032 Dec 7 09:02 alertmanager.log
-rw-r–r-- 1 root root 8956 Dec 7 09:02 vmalert.log
-rw-r–r-- 1 root root 9080 Dec 7 09:02 pmm-update-perform-init.log
-rw-r–r-- 1 root root 8412 Dec 7 09:02 supervisord.log
-rw-r–r-- 1 root root 10485787 Dec 11 14:14 qan-api2.log.1
-rw-r–r-- 1 root root 52428871 Dec 12 06:28 nginx.log.2
-rw-r–r-- 1 root root 52428801 Dec 12 12:32 grafana.log.2
-rw-r–r-- 1 root root 50401 Dec 13 10:25 victoriametrics.log
-rw-r–r-- 1 root root 52428829 Dec 13 11:19 pmm-managed.log.2
-rw-r–r-- 1 root root 52428816 Dec 13 19:48 grafana.log.1
-rw-r–r-- 1 root root 52428893 Dec 13 21:16 nginx.log.1
-rw-r–r-- 1 root root 52428915 Dec 14 06:28 pmm-managed.log.1
-rw-r–r-- 1 root root 2609586 Dec 14 06:55 vmproxy.log
-rw-r–r-- 1 root root 892565 Dec 14 06:55 clickhouse-server.log
-rw-r–r-- 1 root root 4438631 Dec 14 06:58 pmm-agent.log
-rw-r–r-- 1 root root 1386184 Dec 14 06:58 pmm-managed.log
-rw-r–r-- 1 root root 12316381 Dec 14 06:58 nginx.log
-rw-r–r-- 1 root root 18719462 Dec 14 06:58 grafana.log
-rw-r–r-- 1 root root 6289269 Dec 14 06:58 qan-api2.log

@matthewb

as per PMM documentation - for usage of slow log source
--query-source=slowlog
Source of SQL queries, one of: slowlog, perfschema, none (default: slowlog). For slowlog query source, you need change permissions for specific files. Root permissions are needed for this.

What are those “specific files” ? Which permissions shall be changed ?

It’s likely to be in one of qan-api2.log or pmm-managed.log if the issue is on the server side. you can also check your client side logs (pmm-agent uses syslog).

I think what @matthewb is suggesting is to check the variable @@slow_query_log_file and go to your filesystem and see if there’s a permissions issue preventing pmm-client from reading it. (I think default is something like /var/lib/mysql/host-name.slow.log).

given the issues you’re seeing, I would probably just try to configure one client to use perf-schema just to see if you can get metrics that way which would help isolate to client vs server issue with PMM.

@avi_vainshtein,
We are basically at “check everything”. Without knowing where the issue is, we need to check everywhere. We know PMM 2.40 QAN works with MySQL + slow log, you can see it on our public demo. The challenge is figuring out what is different in your setup. Check the logs on the pmm-agent, check if SELinux is enabled, check logs on PMM server. You have to check everywhere to see what is denying access to the slow log file.

@steve.hoffman @matthewb

I have defined pmm client at two mysql hosts : T11 and T12. (Actually T11 is a Master and T12 is a Slave in a replication).

At T11 it is set perfschema, and Query Analytics work fine.
When i tried to use (before perfschema) at that host the slow log - QAN didn’t provide information.

At T12 it set slow log, and Query Analytics don’t work.

The slow log collection is set ON at both hosts.

The permissions of the slow log location at both hosts are the same
drwxr-xr-x 21 mysql mysql 4096 Dec 15 23:30 data
-rw-r–r-- 1 mysql mysql 170 Dec 15 23:30 /data/mysql-slow.log

At the T12 host (problems with QAN) - there are numerous messages in /var/log/messages
(examples)

Dec 17 04:55:56 vm-tngmmydbt12 pmm-agent: time=“2023-12-17T04:55:56.869+02:00” level=error msg=“Cannot scan db user privileges: dial tcp: lookup vm-tngmmydbt12: i/o timeout” agentID=/agent_id/2669c627-66ea-49ef-95e0-cecfd440c3ed component=agent-builtin type=qan_mysql_slowlog_agent

Dec 17 06:18:36 vm-tngmmydbt12 pmm-agent: time=“2023-12-17T06:18:36.875+02:00” level=error msg=“cannot select @@slow_query_log_file: dial tcp: lookup vm-tngmmydbt12: i/o timeout” agentID=/agent_id/2669c627-66ea-49ef-95e0-cecfd440c3ed component=agent-builtin type=qan_mysql_slowlog_agent

SELinux is disabled at both hosts
$ sestatus
SELinux status: disabled

@avi_vainshtein, A slow log that is only 170 bytes in size is a clear misconfiguration of your MySQL server. Additionally, PMM logs are telling you that the user connecting from PMM to MySQL does not have proper privileges. You need to go back to our docs and follow all the instructions for setting up permissions for slow log.

@matthewb Both hosts are from our Lab/Test environment, with no heavy traffic, therefore usually no slow queries (with long_query_time = 2).

It seems that the problem is with the “pmm” db account which i’ve set for MySQL PMM monitoring.
As soon as i replaced the “pmm” account by MySQL “root” account - PMM Query Analytics started to present QAN information.

So some permissions are still missing in “pmm” db account.
Following are the settings of pmm which i’ve set (for MySQL 8.0.33 DB)

CREATE USER ‘pmm’@‘127.0.0.1’ IDENTIFIED WITH mysql_native_password BY ‘XXXXXX’ WITH MAX_USER_CONNECTIONS 10;
GRANT SELECT, PROCESS, REPLICATION CLIENT, RELOAD, BACKUP_ADMIN ON . TO ‘pmm’@‘127.0.0.1’;
flush privileges;
CREATE USER ‘pmm’@‘localhost’ IDENTIFIED WITH mysql_native_password BY ‘XXXXXX’ WITH MAX_USER_CONNECTIONS 10;
GRANT SELECT, PROCESS, REPLICATION CLIENT, RELOAD, BACKUP_ADMIN ON . TO ‘pmm’@‘localhost’;
CREATE USER ‘pmm’@‘%’ IDENTIFIED WITH mysql_native_password BY ‘XXXXXX’ WITH MAX_USER_CONNECTIONS 10;
GRANT SELECT, PROCESS, REPLICATION CLIENT, RELOAD, BACKUP_ADMIN ON . TO ‘pmm’@‘%’;
flush privileges;

Which permissions are still missing here - for slow log monitoring ?

Hello @avi_vainshtein,
There are no permissions needed for slow log monitoring. Slow log monitoring reads the file on disk. It does not interact with MySQL permissions. Since it worked with “root”, I suspect whatever permissions you changed on the ‘pmm’ user had not taken effect. I would revert back to the ‘pmm’ user and check again. You should look at the pmm agent error logs on the host. They will show you which, if any, permissions are missing.

in pmm-admin summary generated files, do you see any errors or warnings in pmm-agent log or in slowlog log?

Hi @avi_vainshtein , I noticed that you are currently using an outdated PMM client version 2.20, while the PMM server is up-to-date. Although this is unlikely to be the cause of the issue with QAN, I suggest that you update the client to the latest version if possible. This could help improve the overall performance and stability of the system.