I have an InnoDB table with approximately 300 rows, which is regenerated every 3 seconds. (This is a redundant statistics table for a website with a very heavy load.)
Right now, the generation of the table is done in a transaction by deleting all rows and then inserting all necessary new ones. My question is, would it be faster (does it make any difference at all) to not DELETE/INSERT the rows, but do an INSERT … ON DUPLICATE KEY UPDATE query and then delete the rows that were unaffected by this operation (and thus, are now obsolete)?
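For concreteness, the two approaches being compared look roughly like this (the `stats` table, its columns, and the `generation` counter are made-up names for illustration; `generation` is one way to mark which rows were touched in the current round):

```sql
-- Approach 1: wipe and rebuild inside a single transaction
START TRANSACTION;
DELETE FROM stats;
INSERT INTO stats (id, hits) VALUES (1, 42), (2, 17) /* , ... */;
COMMIT;

-- Approach 2: upsert the fresh rows, then prune anything not refreshed
START TRANSACTION;
INSERT INTO stats (id, hits, generation) VALUES (1, 42, 7), (2, 17, 7)
  ON DUPLICATE KEY UPDATE hits = VALUES(hits), generation = VALUES(generation);
DELETE FROM stats WHERE generation < 7;  -- rows not touched this round are obsolete
COMMIT;
```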
Also, with the TRUNCATE->INSERT alternative you avoid the index lookups that have to happen for each UPDATE/DELETE query.

So for a table with 300 rows, where you would also have to delete the rows that were not affected, I think you are best off truncating and inserting the new records.
I know this is not an answer to your question, but why not use HEAP instead of InnoDB? This would completely eliminate disk I/O, and since the content is regenerated every few seconds, you should not really care too much about losing data. As updates are relatively fast, table locks should not be an issue here.
As for DEL/INS vs. UPDATE, you could take a few thousand samples from the binary logs (to roughly match the average workload) and run a few benchmarks (probably by running two copies of SuperSmack - one for the sequential DEL/INS or UPDATE procedures, and the other one for a concurrent SELECT workload).
The reason I chose InnoDB and the DELETE/INSERT method is that it’s very important that we get consistent data from the database at all times. With InnoDB, I can do my updates (DEL/INS) in a transaction and then COMMIT the changes in one step, meaning that all reads from the table will result in “correct” data.
The problem with the MEMORY (or HEAP) storage engine is that it doesn’t support transactions, so in that case, the DEL/INS method is no longer usable. (Having people read an empty table, even for just a split second, is not an option here.) The same goes for TRUNCATE, as you can’t use it in a transaction, since it drops and recreates the table.
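To illustrate why the transaction matters here, the sketch below uses SQLite in WAL mode (which gives readers snapshot-consistent reads, loosely standing in for InnoDB's MVCC); the `stats` table and its contents are made up. A concurrent reader never observes the empty intermediate state between the DELETE and the INSERT:

```python
# Demo: a reader never sees the half-finished swap, only the old or
# the new snapshot. SQLite/WAL used as a stand-in for InnoDB MVCC.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "stats.db")

# Writer connection in autocommit mode; we manage BEGIN/COMMIT by hand.
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE stats (id INTEGER PRIMARY KEY, hits INTEGER)")
writer.execute("INSERT INTO stats VALUES (1, 42), (2, 17)")

reader = sqlite3.connect(path, isolation_level=None)

# Regenerate the table inside one transaction: delete all, insert fresh.
writer.execute("BEGIN")
writer.execute("DELETE FROM stats")
writer.execute("INSERT INTO stats VALUES (1, 43), (2, 20), (3, 5)")

# Mid-swap, the reader still sees the old, complete data (2 rows),
# never an empty table, because nothing has committed yet.
during = reader.execute("SELECT COUNT(*) FROM stats").fetchone()[0]

writer.execute("COMMIT")

# After COMMIT, the new snapshot (3 rows) appears atomically.
after = reader.execute("SELECT COUNT(*) FROM stats").fetchone()[0]
print(during, after)
```

With MEMORY tables or TRUNCATE there is no such transactional boundary, so the "during" read could land on an empty table.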
Anyway, thanks for your suggestions. I’ll try benchmarking both methods and see what I come up with that way.