peter_zaitsev (peter_zaitsev) wrote,
@ 2005-01-10 16:43:00
MySQL: Loading large tables with Unique Keys
Normally MySQL is rather fast loading data in MyISAM table, but there is exception, which is when it can’t rebuild indexes by sort but builds them row by row instead. It can be happening due to wrong configuration (ie too small myisam_max_sort_file_size or myisam_max_extra_sort_file_size) or it could be just lack of optimization, if you’re having large (does not fit in memory) PRIMARY or UNIQUE indexes.
Below is the internal letter I’ve sent out on this subject which I guessed would be good to share.
Today on my play box I tried to load data into MyISAM table (which was previously dumped as mysqldump --tab)
The data was some 1.3G, 15.000.000 rows, 512MB memory one the box.
One could could call it trivial fast task, unfortunately I had
unique key on varchar(128) as part of the schema.
The load took some 3 hours before I aborted it finding out it was just about 20% done.
The problem is - unique keys are always rebuilt using key_cache, which means we’re down to some 100-200 rows/sec as soon as index becomes significantly larger than memory.
You can’t go away with ALTER TABLE DISABLE KEYS as it does not affect unique keys.
Actually even Innodb with all its transactional overhead and inability to repair keys by sort can do better if SET UNIQUE_CHECKS=0 is used.
What I would guess could be done for MyISAM ?
At least it could support UNIQUE_CHECKS=0 leaving on user
responsibility if it is really so. If I’m restoring from backup I’m pretty sure about that.
If we want neat solution check if key is really unique could be done after data is loaded. It is much faster, especially for physically sorted key we end up with.
The workaround which I found so far is really ugly, however I’ve seen users using it with good success.
- You can create table of the same structure without keys,
- load data into it to get correct .MYD,
- Create table with all keys defined and copy over .frm and .MYI files from it,
- followed by FLUSH TABLES.
- Now you can use REPAIR TABLE to rebuild all keys by sort, including UNIQUE keys.
2005-03-09 11:44 pm UTC (link)
I have started using your trick, to create a table and copy over the *.frm/MYI files and do a repair with myisamchk.
Let me tell you that this is the most impressive, slickest way of getting around mysql’s lack of speed. I don’t think mysql behaves large myisam_max_* settings. Although show variables will show them set very high (600G), it always defaults to repair by key cache.
I used to wait around 4-6 hours for my tables to convert. Today, I swear to god it was TOPS 15 minutes, and that included a --sort-records after.
Now the tables are blazing. Very slick Peter.
I hope you keep writing these good articles, i come by and read them… never post until now.