We have a Percona/MySQL server 5.7.25-28 running on 20 cores (40 with HT) with 256 GB of RAM.
The system handles a relatively high load, serving up to 18,000 queries per second at ~30% load average under normal conditions.
As the database has evolved and grown, we have noticed inexplicable behavior from the MySQL server.
Right after a restart, it serves requests very quickly: there are no queries in the slow log and the system load average is low.
Over time, it takes more and more CPU time to serve the same request rate, and slow requests gradually appear in the log.
Because the DB replies more slowly, clients establish more connections, which slows the DB down even further until it is almost unresponsive.
The only remedy is to restart MySQL. Right now we have to do this every day.
What could be causing this? How can we diagnose it? Which metrics should we analyze?
Thanks!