I have a statistical database that contains a lot of data for all of the different hardware/software in our organization. The database is partitioned by week using MyISAM tables on the latest MySQL 5.1 release (installed it on Monday).
I have a new database that I’m migrating everything to, and I’m starting to see some behaviors I need to plan for so that everyone can use the data in a timely manner while bulk data is being inserted.
There are a lot of LOAD DATA INFILE queries that run throughout the day, and some users write very long-running queries that block those updates. I’ve thought about several options (short of smacking users) and need some opinions:
1. Create a BLACKHOLE database that handles all of the inserts and then replicates to the main database for reading.
2. Switch to InnoDB (not sure if this will help if LOAD DATA will lock the entire table anyway).
3. Insert into a staging table in the main database and create an event that moves that data into the main table (a rough sketch follows this list).
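For the third option, a minimal sketch of what I have in mind is below; the names (stats, stats_staging, ev_flush_stats) and the 5-minute interval are just placeholders, not our real schema:

    -- Staging table with the same layout (including partitioning) as the main table
    CREATE TABLE stats_staging LIKE stats;

    -- The bulk loads would target the staging table instead of the main one:
    -- LOAD DATA INFILE '/path/to/stats.csv' INTO TABLE stats_staging ...;

    -- Event that periodically moves staged rows into the main table
    SET GLOBAL event_scheduler = ON;

    DELIMITER //
    CREATE EVENT ev_flush_stats
        ON SCHEDULE EVERY 5 MINUTE
    DO
    BEGIN
        INSERT INTO stats SELECT * FROM stats_staging;
        -- Assumes no load runs between the INSERT and the DELETE;
        -- otherwise those rows would be lost, so the flush would need locking
        DELETE FROM stats_staging;
    END//
    DELIMITER ;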
We add about 7 million rows a day into a single table. This has been ramping up since late 2007, and we add new statistics to report on every few weeks. Every new statistic that is added increases the daily load by at least 24,000 rows (depending on the situation).
It is partitioned that way because, when we do reporting, we mostly pull data from the last week or two, so it’s more efficient to partition by week and remove some indexes than to partition by month.
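For context, the weekly partitioning looks roughly like this (table, column, and partition names here are made up for illustration; the real table has many more columns):

    CREATE TABLE stats (
        stat_date DATETIME NOT NULL,
        metric_id INT UNSIGNED NOT NULL,
        value     BIGINT NOT NULL,
        KEY idx_date (stat_date)
    ) ENGINE=MyISAM
    PARTITION BY RANGE (TO_DAYS(stat_date)) (
        PARTITION pw01 VALUES LESS THAN (TO_DAYS('2011-01-10')),
        PARTITION pw02 VALUES LESS THAN (TO_DAYS('2011-01-17')),
        PARTITION pw03 VALUES LESS THAN (TO_DAYS('2011-01-24')),
        PARTITION pmax VALUES LESS THAN MAXVALUE
    );

Since the reporting queries filter on the date column, the optimizer only has to touch the one or two most recent partitions.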
I’m not very experienced with this kind of data usage, but I would guess that switching to InnoDB is the best thing you could do: it is the easiest option and still the most promising. You could try InnoDB for new tables only as a test and see how well it performs, since converting all the old tables will take a while (~15 GB/hour, but very dependent on hardware). IIRC, LOAD DATA INFILE will not lock the whole table under InnoDB; that would defeat the purpose of row-level locking.
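The conversion itself is just a table rebuild, something like the sketch below (the table name is a placeholder, and you would want innodb_buffer_pool_size and innodb_file_per_table tuned before converting anything big):

    -- Rebuilds the whole table in InnoDB; on large tables this can take hours
    ALTER TABLE stats_test ENGINE=InnoDB;

    -- Verify the engine and row format after the rebuild
    SHOW TABLE STATUS LIKE 'stats_test';

Run a few of the problematic LOAD DATA INFILE jobs against the converted table while someone fires off a long SELECT, and you will see quickly whether the readers still block the loads.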