PMM perfschema Query Analytics on high load instances - missing queries

Matias_Sanchez · February 7, 2021, 3:13am

Hi Guys!
I have a big concern about PMM Query Analytics (QAN). We're currently using PMM2 QAN on a MariaDB instance with a really demanding throughput of many different queries.
We are on PMM 2.12.0 server and client.
We were able to configure pmm-client to collect QAN data with --query-source=perfschema, as we would love to get full stats on all queries, and the slow log with long_query_time=0 would have a big performance impact on the server side.
The issue now is that the Performance Schema is constantly losing digest info, as follows:

mysqladmin ext -ri1 -udba -p | grep -i digest
| Performance_schema_digest_lost | 120 |
| Performance_schema_digest_lost | 900 |
| Performance_schema_digest_lost | 977 |
| Performance_schema_digest_lost | 422 |
| Performance_schema_digest_lost | 928 |
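
To put a number on samples like these, here is a minimal Python sketch that parses the rows above and summarizes them. It is not part of PMM or mysqladmin, and the function names `digest_lost_samples` and `summarize` are made up for illustration. It assumes every value passed in is a per-interval delta; with `-r`, mysqladmin normally prints the first sample as an absolute counter, so you may want to discard that one.

```python
import re

# Matches rows such as "| Performance_schema_digest_lost | 120 |"
ROW = re.compile(r"^\|\s*Performance_schema_digest_lost\s*\|\s*(\d+)\s*\|")


def digest_lost_samples(text):
    """Return the Performance_schema_digest_lost values found in the text."""
    samples = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            samples.append(int(match.group(1)))
    return samples


def summarize(samples):
    """Return total, mean and peak of the samples (zeros if there are none)."""
    if not samples:
        return {"total": 0, "mean": 0.0, "peak": 0}
    total = sum(samples)
    return {"total": total, "mean": total / len(samples), "peak": max(samples)}
```

Feeding it the five rows above gives a rough per-second loss rate that can be compared before and after changing table sizes or truncating the digest table.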

We also configured performance_schema_digests_size=-1 and performance_schema_events_stages_history_long_size=-1, and even at their highest possible values digest_lost is always present.
The pseudo-solution we've found is to periodically truncate performance_schema.events_statements_summary_by_digest and performance_schema.events_statements_history_long, so that pmm2-client re-fetches the newest digests.
Is there any pmm2-client configuration recommended for these high-load scenarios to reduce data loss? We constantly get only a small, partial set of queries on the QAN side, not all queries.
Is there any configuration to tell pmm2-client to “clean” already-fetched data from perfschema, for example by truncating the perfschema tables after fetching the latest digests?

Thank you in advance for any help.

matthewb · February 7, 2021, 2:31pm

Hello,
As stated in the manual, you should not set a value of -1 for either parameter. Instead, you should set them to the max value of 1048576. If you are still losing queries even with this max value, then I will try to have a member of our PMM team comment.

Matias_Sanchez · February 7, 2021, 11:00pm

Hello Matthewb, thank you so much for your reply!
We actually did run all servers with the highest possible perfschema values of 1048576, regardless of the extra RAM used, but we are still losing queries most of the time. We also found that pmm-client's queries on the perfschema tables are quite demanding when these tables are at max size (these perfschema statement queries ended up at the top of the load).
That's why we're looking for different strategies. We found that if the perfschema digest tables are emptied periodically, a more complete sample is taken, although I'm sure that, since this is not normal usage, the full stats are not accurate.
Does pmm2-client have a cache-like feature to extract the perfschema digests and periodically delete from MySQL itself the ones it has already fetched?
Maybe our MySQL usage is beyond the normal query throughput, and neither MySQL nor pmm2-client is ready to manage this quantity of different digests.

Thank you so much for the help!!

Peter · February 8, 2021, 5:24pm

Let's face it: if you have, say, 1M different queries every minute, it would be expensive to process and store…

Generally we would not recommend going much over the defaults for performance_schema table sizes, as it can make PMM queries expensive. Yes, you will lose some of the queries, but hopefully not the most important ones.

One thing you can also do is clean those performance_schema tables periodically to see if it improves results.

Also, if you’re running Percona Server, consider using the slow query log with sampling; it typically lets you balance the overhead while ensuring all types of queries are represented.

Matias_Sanchez · February 9, 2021, 1:54pm

Hi Peter, thanks so much for your reply.
Totally agree, it would be very expensive to try to gather and process all the different queries.
I found that the best solution is to clean the performance_schema tables periodically, as I get a bigger sample this way, but I have the feeling that the best-case scenario is to clean them right after pmm2-client fetches data. As for the best frequency, the “tempo” of my custom cleaning of the perfschema tables might be neither precise nor deterministic. I mean, pmm-client gathers data once per 30 seconds, and if I run a perfschema clean once per 30 seconds, it will not run immediately after each pmm-client run.
Is there any way to add these clean operations as part of pmm-client activity, to sync them with gathering? Maybe as a fake “custom metric”? I tried this, but as far as I tested, a truncate operation does not fit the custom metrics specs.
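
For reference, the unsynchronized cleanup described above could look roughly like the Python sketch below. It is only an illustration under stated assumptions: `conn` is any DB-API 2.0 connection (for example one opened with a MySQL driver of your choice), the helper names `truncate_digest_tables` and `run_periodically` are hypothetical, and the 30-second default simply mirrors the collection interval mentioned above. It has no knowledge of when pmm-client actually reads the tables, so statements that arrive between the client's last read and the truncate are still lost.

```python
import time

# Tables named in this thread; truncating them discards the collected rows.
TABLES = (
    "performance_schema.events_statements_summary_by_digest",
    "performance_schema.events_statements_history_long",
)


def truncate_digest_tables(conn):
    """Run TRUNCATE TABLE on each table through a DB-API 2.0 connection."""
    cursor = conn.cursor()
    try:
        for table in TABLES:
            cursor.execute(f"TRUNCATE TABLE {table}")
    finally:
        cursor.close()


def run_periodically(conn, interval_s=30.0, iterations=None, sleep=time.sleep):
    """Truncate, then wait interval_s seconds; iterations=None loops forever."""
    done = 0
    while iterations is None or done < iterations:
        truncate_digest_tables(conn)
        done += 1
        sleep(interval_s)
```

The `sleep` parameter is injectable only so the loop can be exercised without waiting; in real use the default `time.sleep` applies.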

Thanks a lot for your help with any ideas on this.
Regards

Peter · February 9, 2021, 2:08pm

Hi,

Well, it is not there right now. I’d submit a feature request on JIRA or, even better, a pull request adding such an option:

https://jira.percona.com/projects/PMM/issues/PMM-7361?filter=allopenissues

Matias_Sanchez · February 9, 2021, 3:41pm

Hi Peter, wow, that would be awesome as a specific flagged option on the pmm client side, mostly in these cases of heavy SQL throughput. Following it up.

Thank you so much for your kindness and help.

Best regards.
