I have binary logging set up for master/slave replication from MySQL 5.5 to Percona 5.5. It was working fine for over a month, but a couple of days ago (April 12) the server ran out of drive space due to a sudden increase in binary log growth on the master: it went from a few 100 MB log files per day to several per minute. The ibdata file hasn’t shown the same growth, holding steady at around 20 GB. I reviewed the error logs on the master and saw that it started producing the error below on April 6th. After some research I changed innodb_log_file_size from its default value to 256M and restarted the server, but soon started seeing the same rapid log file growth. At this point I’ve shut down logging (and replication) for fear of running out of drive space again.
At this point I’m not sure how to proceed and am reaching out for suggestions. Googling the problem didn’t yield much. It seems odd that the log files would grow at such a high rate while the ibdata file wouldn’t. The log files were adding the equivalent of the total ibdata file size every hour.
Thank you,
Victorio
130406 17:56:39 InnoDB: ERROR: the age of the last checkpoint is 9435301,
InnoDB: which exceeds the log group capacity 9433498.
InnoDB: If you are using big BLOB or TEXT rows, you must set the
InnoDB: combined size of log files at least 10 times bigger than the
InnoDB: largest such row.
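For reference, the redo log change I made was roughly the following (paths and service names are from my setup and may differ; on 5.5 the old ib_logfile* files have to be moved aside after a clean shutdown or the server refuses to start with the new size):

sudo service mysql stop

# in /etc/my.cnf, under [mysqld]:
#   innodb_log_file_size = 256M

# move the old redo logs out of the way; mysqld recreates them at the new size on startup
sudo mv /var/lib/mysql/ib_logfile0 /var/lib/mysql/ib_logfile1 /tmp/

sudo service mysql start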
innodb_log_file_size and the binary logs are independent in this case: the checkpoint-age error refers to the InnoDB redo log files (the ib_logfile* files, which innodb_log_file_size controls), not to the binlogs. To check what is being written to the binlogs, decode a sample with mysqlbinlog and look at the queries in it.
Also, if you are using binlog_format ROW, keep in mind that if you change a range of rows on your master (UPDATE table SET status=0 WHERE id < 1000) it will generate up to 1000 row entries in your binary log file, one per affected row.
To read a binary log file when you are using binlog_format ROW, you have to use "mysqlbinlog -v --base64-output=DECODE-ROWS log_file".
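For example, to page through a sample and get a rough count of the data-changing statements in one binlog (log_file here is a placeholder for your actual file name):

mysqlbinlog -v --base64-output=DECODE-ROWS log_file | less

# rough count of data-changing statements in a STATEMENT-format binlog
mysqlbinlog log_file | grep -ciE '^(insert|update|delete|replace)'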
Thanks for the reply. I’m using STATEMENT format. As suggested, I used mysqlbinlog to look at some of the log files. Mostly it’s INSERTs interspersed with UPDATEs on individual rows. There do seem to be some largish UPDATEs, possibly on the same row, which I think could explain my problem, but I can’t tell easily because I’m just paging through the mysqlbinlog output with less. I thought pt-query-digest could give me a summary report on these logs, but when I ran it on my log file all I got was the output copied below; note that no profile data is reported. I’m very new to this tool, so perhaps I’m not using it properly? All I did was run pt-query-digest followed by the log file name.
# 8.4s user time, 320ms system time, 856.91M rss, 1002.16M vsz
# Current date: Sun Apr 14 22:12:01 2013
# Hostname: Sync_Server
# Files: mysql-bin.001286
# Overall: 0 total, 1 unique, 0 QPS, 0x concurrency ______________________
# Attribute total min max avg 95% stddev median
# ============ ======= ======= ======= ======= ======= ======= =======
# Query size 101.71M 101.71M 101.71M 101.71M 101.71M 0 101.71M
# Profile
# Rank Query ID Response time Calls R/Call V/M Item
# =========== =========== =========== =========== =========== ===== ======
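Maybe I need to tell pt-query-digest what kind of input it is getting; if I’m reading the docs right, it expects slow-log format by default, and a binary log has to be decoded with mysqlbinlog first and passed with --type binlog, something like this (I haven’t confirmed this fixes the empty profile):

mysqlbinlog mysql-bin.001286 > mysql-bin.001286.sql
pt-query-digest --type binlog mysql-bin.001286.sql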
I don’t think you need a summary from pt-query-digest. If the cause is a large number of statements, you should see that directly by going through your log files. Since the statement count used to be much lower, reading through the logs should also show why there is now an explosion in the number of statements.
Unfortunately I deleted the old logs (prior to the start of the problem) to free up disk space. I ran a line count on one of the log files during the problem period (I assume this is a good proxy for the statement count) and got back about 1000 lines. I’m not running Percona Monitoring.
It seems like there are multiple UPDATEs being made to the same set of 3 or 4 rows. Again, I can’t tell very easily because I’m just sending the output of mysqlbinlog to less; that’s where I was hoping pt-query-digest could help, but it doesn’t produce the output I expected. Regardless, these rows have a text column containing a JSON array which can be on the order of a few MB in size, and the UPDATEs are appending to that JSON array. The JSON arrays are messages for mobile devices, which are purged (deleted) when a device connects to the server and pulls its messages. For some reason these particular messages aren’t getting purged.
My hypothesis is that the log growth is caused by these UPDATEs. Since the binlog is in STATEMENT format and the entire JSON array for a given device is stored in a single column of a single row, every time the array is appended to, the logged statement contains the entire JSON array. At a few MB or more per JSON array, it would only take 20 or 30 UPDATEs (about what I’m seeing) to fill a 100 MB log file.
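To make that concrete, the statements in the decoded binlog look roughly like this (table and column names here are made up, the real ones differ):

-- illustrative only: the whole multi-MB JSON literal is re-logged
-- with every statement when binlog_format is STATEMENT
UPDATE device_messages
SET    messages = '[{"msg": ...}, {"msg": ...}, ... several MB of JSON ...]'
WHERE  device_id = 42;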
Does this make sense as an explanation for what I’m seeing? I’m mostly a software developer and not a DB expert, so I’m not too experienced with these types of issues.