Questionable QAN Statistics in PMM2.30

glynastill · October 4, 2022, 3:59pm

Hi,

I’m running PMM 2.30 and struggling to make sense of what QAN is collecting; to be more precise QAN “Query count” seems to be collecting some completely incoherent stats along with valid ones.

In a test to make sense of the stats I chose a particular host, then a specific query digest and selected the count_star figure from the events_statements_summary_by_digest table, wait 10 minutes then select the figure again, by my reckoning the difference should at least in some way correlate to the Query count in QAN for that time period. However here is the gist of what I see in QAN for the duration between the same timestamps for 3 sample queries:

Query 1: count_star increment = 29826 VS QAN 49.71 QPS / “sum 29.83k”
Query 2: count_star increment = 5963 VS QAN 9.91 QPS / “sum 5.95k”
Query 3: count_star increment = 3368 VS QAN 14.79k QPS / “sum 8.87m”

The first two sums tally, but the 3rd is complete nonsense. By my calculation this 3rd digest is about 5.61 qps.

In my past contact with grafana I recall having sudden jumps due to counter stats having data from “old incarnations” of the same host, and I wondered if this could be happening here; indeed this host has been destroyed and re-incarnated a few times over the past few months, but not in recent weeks.

Following that theory I tried setting DATA_RETENTION on the pmm-server to 24h, which clears out the old QAN stats, but the issue persists.

Can anyone explain what might be going off, or where I can dig down to see some raw data within pmm? I recall writing promql to debug grafana issues in the past but my memory is alas a blank on how - could anyone humour me and tell me how to dig into this issue?

Many thanks
Glyn

glynastill · October 5, 2022, 9:53am

So it looks like I’ve got a situation where QAN is getting an inflated figure. Here’s what I did to compare.

Ran the following on the minute on the database host until I saw the uplifted figure in QAN:
SELECT current_timestamp, digest_text, count_star FROM performance_schema.events_statements_summary_by_digest WHERE digest = 'bdf11382a52f1c39d113a784524af465';

This logged:

|    2022-10-05 09:13:57|SHOW SLAVE STATUS |41379212|
|    2022-10-05 09:14:57|SHOW SLAVE STATUS |41380265|
|    2022-10-05 09:15:57|SHOW SLAVE STATUS |41381319|
|    2022-10-05 09:16:57|SHOW SLAVE STATUS |41382372|

Then on the pmm server within clickhouse:

select queryid, node_name, period_start, num_queries from metrics where period_start>'2022-10-05 09:14:00' and period_start<'2022-10-05 09:16:00' and queryid = 'bdf11382a52f1c39d113a784524af465' and node_name = '<database hostname here>';

┌─queryid──────────────────────────┬─node_name───────────┬────────period_start─┬─num_queries─┐
│ bdf11382a52f1c39d113a784524af465 │ <database hostname> │ 2022-10-05 09:15:00 │    41381320 │
└──────────────────────────────────┴─────────────────────┴─────────────────────┴─────────────┘

Topic		Replies	Views
PMM QAN - Query Analytics API Percona Monitoring and Management (PMM)	1	528	December 9, 2021
QAN display too more qps than the real qps PMM 2.x pmm	1	693	September 9, 2022
QAN not showing any data in PMM2 x Percona MySQL 8.0.34 PMM 2.x	5	127	November 8, 2024
Error in QAN page PMM 2.x	15	463	May 23, 2024
Incomprehensible amount! PMM 2.x mysql , percona	8	1025	August 29, 2022

Questionable QAN Statistics in PMM2.30

Related topics