Slow logs getting written to beginning of two files simultaneously

resuni · September 15, 2025, 6:18pm

I first want to clarify that this is NOT the common logrotate problem where logrotate moves the file and MySQL continues writing to the old descriptor. We have our logrotate set up properly to flush logs after rotating. This is a different problem entirely.

Starting on February 8th, all of the rotated slow log files are extremely small, containing either only one query or just the heading you always see at the beginning of the file. This is happening on most of our production Percona servers.

To clarify what I mean, here are the rotated slow log files before and after February 8th (note the file size):

-rw-r-----. 1 mysql mysql  629011 Feb  6  2025 slow-log-20250206.gz
-rw-r-----. 1 mysql mysql  558975 Feb  7  2025 slow-log-20250207.gz
-rw-r-----. 1 mysql mysql     208 Feb  8  2025 slow-log-20250208.gz
-rw-r-----. 1 mysql mysql     221 Feb  9  2025 slow-log-20250209.gz

This is an example of the only content getting written to the small files:

/usr/sbin/mysqld, Version: 8.0.40-31 (Percona Server (GPL), Release 31, Revision 49317865), Time: 2025-02-06T07:17:10.778415Z. started with:

This has persisted ever since. If I scroll down to the latest files, I see two files getting written to at the same time:

-rw-r-----. 1 mysql mysql     241 Sep 15 13:47 slow-log.old
-rw-r-----. 1 mysql mysql     241 Sep 15 13:47 slow-log

If I monitor these files, they are constantly getting overwritten. In other words, MySQL seems to be writing to the beginning of both files instead of appending to one. I tried deleting slow-log.old and immediately flushing the logs. A short while later, the file gets re-created, containing just that header message, sometimes with an additional slow query.

I don’t know where this “slow-log.old” file is coming from. We have never had MySQL configured to write to a file by this name, and we have never had logrotate configured to rotate to this name. To try to answer the question of why these two files are getting written to simultaneously, I double-checked the inodes to see if they’re hard-linked to each other, but each file’s inode number is different.

We use IaC here, so I have a pretty good paper trail of the changes that get made to the production environment. The only thing on record that seems like it could be related to this is upgrading our PMM server to PMM 3. It doesn’t make a lot of sense that this would be related, though. We updated the PMM server to 3.x, but we kept the PMM clients on 2.x, and they’re still running 2.x to this day. However, I do see the occasional log entry in /var/log/messages complaining that the slow log doesn’t exist:

Sep 15 08:58:49 1224584-mc02 pmm-agent[3069]: time="2025-09-15T08:58:49.127-04:00" level=warning msg="Failed to stat new file: stat /var/log/mysql/slow-log: no such file or directory." agentID=cb1a0552-9059-4ed9-8296-4a81c9d0acc8 component=slowlog/reader file=/var/log/mysql/slow-log type=qan_mysql_slowlog_agent

I don’t see any pattern in the timestamps. They just seem to get written randomly.

To summarize the strangeness of this problem:

- Two slow log files are getting written to simultaneously - two distinct files, no hard links.
- MySQL seems to be writing each entry to the beginning of the file rather than appending to it, constantly overwriting what’s already there.
- The name “slow-log.old” is not a name we’ve ever configured MySQL or logrotate to use.
- Looking at our change history via DNF history and IaC (Ansible), no changes related to Percona had been made around the time this started happening, other than updating to PMM server 3, which runs on its own dedicated VM.
- This problem is occurring across most of our MySQL servers. Many of these servers are in a cluster, but some are standalone.

Has anyone ever seen anything like this?

matthewb · September 15, 2025, 6:50pm

github.com/percona/pmm, agent/agents/mysql/slowlog/slowlog.go (branch main):

// rotateSlowLog removes slowlog file and calls FLUSH LOGS.
func (s *SlowLog) rotateSlowLog(ctx context.Context, slowLogPath string) error {
	db, err := sql.Open("mysql", s.params.DSN)
	if err != nil {
		return errors.Wrap(err, "cannot open database connection")
	}
	defer db.Close() //nolint:errcheck

	old := slowLogPath + ".old"
	if err = os.Remove(old); err != nil && !os.IsNotExist(err) {
		s.l.Warnf("Cannot remove previous old slowlog file: %s.", err)
	}

	// We have to rename slowlog file, not remove it, before flushing logs:
	// https://www.percona.com/blog/2007/12/09/be-careful-rotating-mysql-logs/
	// This problem is especially bad with MySQL in Docker - it locks completely even on small files.
	//
	// Reader will continue to read old file from open file descriptor until EOF.
	if err = os.Rename(slowLogPath, old); err != nil {
		// ...

slow-log.old comes from the PMM agent when it is rotating the file.

The small log files with date stamps are probably generated by logrotate.

resuni · September 15, 2025, 7:55pm

If PMM is also trying to rotate logs, that could explain this weird behavior (although there would still be a lot of questions).

According to the docs, it looks like I can disable this by setting --size-slow-logs to a negative value. However, the docs don’t say how to check what the existing value is, or even what the default value is. I don’t see any such value in any of the configs in /usr/local/percona/pmm2, so, if it is enabled, I’m guessing it’s defaulting to something? What is the default value?

matthewb · September 15, 2025, 8:14pm

That’s because agent config is not stored locally; agent config is stored on the PMM server in the pgsql database.

Run ps -Af | grep pmm to see all the parameters passed to the agent.

resuni · September 15, 2025, 8:26pm

Assuming you mean on the agent side, the pmm-agent process isn’t specifying --size-slow-logs:

root        3114       1  0 Sep09 ?        00:02:57 /usr/sbin/pmm-agent --config-file=/usr/local/percona/pmm2/config/pmm-agent.yaml

There are also a number of percona/pmm2/exporters/vmagent processes, but those aren’t passing a --size-slow-logs parameter either.

resuni · September 15, 2025, 8:45pm

Does this mean that size-slow-logs is a setting I can change on the server in order to apply universally? Can I set it to -1 on the PMM server to disable slow log rotation for all connected agents? If so, where do I do this? I can’t find any such settings in the UI at all, nor have I found where this is stored in PMM’s pgsql.

resuni · September 23, 2025, 5:19pm

This problem seems to have been resolved by updating my PMM clients and server. I was still running PMM 2.x clients with a PMM 3.0 server. I updated the server and all the clients to 3.4 and now the slow logs seem to be getting written to & rotated by logrotate as expected.
