varchar index killing csv import

I have a large CSV file of about 2 million domain names from urlblacklist.com.

I have an app that queries the table a lot, and in order to look up the domain names fast they need to be indexed. The table is InnoDB, with columns for domain and categoryID; the combination of these two is the unique key.
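
Roughly, the table looks like this (a minimal sketch; the table name and column types are placeholders, only the unique key on domain + categoryID is as described):

    -- placeholder name and types; the real table may differ
    CREATE TABLE blacklist (
        domain     VARCHAR(255) NOT NULL,
        categoryID INT NOT NULL,
        UNIQUE KEY uq_domain_category (domain, categoryID)
    ) ENGINE=InnoDB;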

When I try to load the 2 million records into the table (LOAD DATA INFILE), it gets to about 1.5 million and then starts to have a real hard time. I let it run for 12 hours but it didn't finish, so I gave up.

My guess is that this is because it has to check the unique key on every insert, so as the number of records grows I am put in a worse and worse situation.

Is there something I can do to make this work? Is MySQL just not the right tool? The Oracle buffs out there are saying I am just using the wrong platform. Is there something I can do to reasonably use MySQL for this application?

You’re right.

During LOAD DATA INFILE, MySQL checks UNIQUE indexes as each row is inserted, while other indexes are built after the load is done.

To optimize it you can either:

  • make your key_buffer large enough for the index to fit into it, or
  • load the URLs sorted by URL; in that case the unique-key checks will be fast (see the sketch below).
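
A rough sketch of the second option (the table name, column names, and file path are placeholders; key_buffer_size is the MyISAM key cache, so it only applies if the table is, or is converted to, MyISAM):

    -- placeholder names and path; adjust to your schema
    SET GLOBAL key_buffer_size = 256 * 1024 * 1024;  -- MyISAM key cache, sized to hold the index

    -- sort the file by domain beforehand, e.g.:
    --   sort -t, -k1,1 domains.csv > domains_sorted.csv

    LOAD DATA INFILE '/tmp/domains_sorted.csv'
    INTO TABLE blacklist
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    (domain, categoryID);
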
[B]ievolve wrote on Sat, 24 February 2007 19:48[/B]
Is MySQL just not the right tool? The Oracle buffs out there are saying I am just using the wrong platform. Is there something I can do to reasonably use MySQL for this application?

I think InnoDB is the wrong tool for this. As soon as your active data set size outgrows the buffer pool, you start getting very serious performance problems.
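
One way to check how close you are to that point (a sketch; the schema and table names are placeholders):

    -- current InnoDB buffer pool size
    SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

    -- approximate data + index size of the table, to compare against it
    SELECT data_length + index_length AS total_bytes
    FROM information_schema.TABLES
    WHERE table_schema = 'mydb' AND table_name = 'blacklist';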

If your data set outgrows the caches you'll have problems in any case. InnoDB is similar to MyISAM in this respect; its data set just can be much larger because the indexes are not packed.

There is no silver bullet out there, and each choice has benefits and drawbacks.

In this case the concern was not performance but the fact that alerts are dropped because of MyISAM table locks.