High CPU Load after updating from Percona 8.0.39 to 8.0.41 on 2 out of 7

alexanderdeprez · March 4, 2025, 10:01am

After upgrading all my Percona servers from 8.0.39 to 8.0.41, two of them show very high CPU load. At night we sync data from external sources, which means a lot of deletes, inserts and some analytical queries. A peak in CPU usage was normal, but since the update it spikes to 100% for hours. The other servers, where the issue is not present, have the same hardware, the same Percona version, and a similar type of data and load, but don’t have the hours-long spikes.

System:
Debian bookworm - kernel 6.1.0-31-amd64 with percona 8.0.41-32

Using pt-config-diff, I verified that the config on the working servers and the troubled servers is the same.

I’m a bit stuck on how to troubleshoot any further.
Additional info: I have PMM 2.44 to help me troubleshoot (although this has not really helped so far).

Thanks in advance for any help.

anil.joshi · March 4, 2025, 1:48pm

@alexanderdeprez

What is the role of the 2 servers where you observed high CPU load? Are they part of a cross-region (DR) setup, are they the source (read/write), or do they only serve reads (replica)?

If everything (OS, DB) looks exactly the same, then reviewing the workload might be a good idea. You can check the slow query logs or a pt-query-digest report to get down to the bottleneck queries.

Since you have PMM, you can also check the QAN section to measure query performance and differences.

You can also check the various MySQL/InnoDB-specific dashboards for performance or other differences to find the hotspot.

High CPU usage is mainly related to an unoptimized workload or queries. Fixing such DMLs would surely reduce the consumption.

This blog post, A Simple Approach to Troubleshooting High CPU in MySQL, might come in handy for getting down to the DMLs that consume the most CPU.

alexanderdeprez · March 11, 2025, 7:41am

Hi, thank you for your reply. The servers are actually just standalone. It seems that some bad queries could be the cause, but I find it hard to troubleshoot using PMM, since the load in QAN is influenced by the fact that the CPU is at 100% for hours. Even basic queries show a high load in QAN because they now take a long time, not because they are heavy on the system, but because the system is overloaded. So at this point load, duration,… are not an accurate representation.

I just updated my PMM 2.44 to PMM 3 with the associated agents; maybe this will give me better insight.

The blog post you refer to is one I have read, and I applied its techniques, but as with my statement above, lots of queries seem to have a high load, again because the system is overloaded.

Any additional tips to help figure out which queries cause the issue would be helpful.
To give you an idea of the scale: the total size of all databases is around 1 TB, with about 150 databases on each server, and at night thousands of queries are run. Until the update we would only see minor spikes that maxed out at around 70% utilization, while now it goes up to 100% and stays like that for about 6 hours.

anil.joshi · March 28, 2025, 5:23pm

@alexanderdeprez Especially after an upgrade or restart, you might observe a performance drop for some time because of the cold buffer pool, but that shouldn’t last that long. It could be a culprit query impacting the rest of the workload, or a locking/contention problem due to high concurrency.

A quick look at show full processlist and show engine innodb status\G gives an idea of what’s going on. You should also check the slow query logs for any genuinely unoptimized workload that can be improved with indexes or further optimisation.

Other than this, if performance_schema instrumentation is enabled, you can check the tables below, which provide details of the running queries and their associated lock information and state.
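When the CPU is saturated, latency figures are inflated for every statement, so ranking by duration can be misleading. A rough sketch of an alternative, assuming performance_schema statement digests are enabled (the default in 8.0), is to rank digests by counters that do not depend on how busy the server is, such as rows examined. The column aliases, the one-day window and the LIMIT are arbitrary choices for illustration, and SUM_CPU_TIME is assumed to be available (it was added in 8.0.28).

```sql
-- Rank statement digests by rows examined rather than by latency.
-- Counters are cumulative since server start (or since the table was truncated).
SELECT SCHEMA_NAME,
       LEFT(DIGEST_TEXT, 120)                              AS digest_sample,
       COUNT_STAR                                          AS executions,
       SUM_ROWS_EXAMINED                                   AS rows_examined,
       ROUND(SUM_ROWS_EXAMINED / NULLIF(COUNT_STAR, 0))    AS rows_examined_per_exec,
       SUM_NO_INDEX_USED                                   AS execs_without_index,
       ROUND(SUM_CPU_TIME / 1e12, 1)                       AS cpu_seconds  -- stored in picoseconds
FROM performance_schema.events_statements_summary_by_digest
WHERE LAST_SEEN >= NOW() - INTERVAL 1 DAY   -- only digests seen in the last day
ORDER BY SUM_ROWS_EXAMINED DESC
LIMIT 20;
```

Running the same query on a healthy server and on an affected one, then comparing rows_examined_per_exec for the same digest, may show whether a statement started reading far more rows after the upgrade.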

select * from performance_schema.data_locks;
select * from performance_schema.data_lock_waits;

Or you can directly check the view below, which combines them and shows the waiting and blocking statements, if any.

select * from sys.innodb_lock_waits;
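The view is quite wide, so as a minimal sketch it can be narrowed to the columns that usually matter first. The column selection and ordering here are only a suggestion.

```sql
-- Show who is waiting on whom, longest waits first.
SELECT wait_started,
       wait_age,
       locked_table,
       locked_index,
       waiting_pid,
       waiting_query,
       blocking_pid,
       blocking_query
FROM sys.innodb_lock_waits
ORDER BY wait_age DESC;
```

An empty result during the nightly spike would suggest that lock contention is not the main factor.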
