How to speed up DELETE with innodb table


We have a logging table that contains a months worth of data (about 10Million records) and we run a clean up script everyday that deletes a days worth of records (about 250k) so that we only have the last 31 days worth of information retained. This is taking hours to run!!

OS: Ubuntu 32bit version
MySQL version: 5.0.51a-3ubuntu5.1-log

Some (hopefully) useful MySQL info has been included in the info.txt along with output from top, vmstat and iostat

We have tried various DELETE querys
DELETE FROM log WHERE DATEDIFF(current_date(),dt)>31;

and even added a AUTO INC index called myid to see if that speeds up things
SELECT @l_id:=MIN(myid) FROM log;
SELECT @m_id:=MAX(myid) FROM log WHERE DATEDIFF(current_date(),dt)>31;
DELETE FROM log WHERE myid between @l_id and @m_id;

A select works very fast
mysql> SELECT count() FROM cpelog WHERE myid between @l_id and @m_id;
| count(
) |
| 259200 |
1 row in set (0.19 sec)

What can we do to speed up the delete?

Do I have too many indexes as I see that the index size is is greater that the table data size and approaching the 4GB physcial limit to my RAM?I see that that my SWAP does not seems to be used though so sure if that is the problem.

Any help most appreciated.

Out of curiosity and as a half baked idea towards a solution I also tried the following.

Copy all the the data I want to keep into an identical table and once that is complete drop the old log table and rename the new one

mysql> INSERT INTO log_temp SELECT * from log where DATE_SUB(CURDATE(), INTERVAL 31 DAY) <= dt;
Query OK, 9740800 rows affected (1 hour 8 min 39.67 sec)
Records: 9740800 Duplicates: 0 Warnings: 0

Obviously this is not the solution either as it takes over a hour to run. But it is amazing that it takes nearly as long to copy 9,740,800 rows to a new table as it does to delete 250,000.

My brain is melting here…

Well I dropped 3 of the indexes on this table only leaving the index on the dt column and got the desired result I was looking for.

mysql> delete FROM log WHERE dt < DATE_SUB(CURDATE(), INTERVAL 31 DAY);
Query OK, 259200 rows affected (6.45 sec)

It would seem that I will have to revisit the usefulness of indexes when it comes to the various queries that are run on this table. As it would seem that 4 is 2 many for the table of this size.

Hope this info is useful to some other sucker

Have you tried deleting in “chunks”, not all at once, by repeating your delete statement as in :

DELETE FROM log WHERE DATEDIFF(current_date(),dt)>31 LIMIT 5000

Sometimes this gives better results. Besides, you ensure your application is somehow responsive all the time (as long as the number of records is not too big)…

One other thing, you might be better with a constant instead of a function call in your WHERE clause.

So something like (assuming we’re ‘2008-11-25’:

DELETE FROM log WHERE dt < ‘2008-10-25’

can use an index on column dt