I have done bulk-insertion in MySQL using multiple row insertions method by using python script. My testing CSV file consists of 100,000 lines only (real one consists of millions lines).
so I perform bulk-insertion.
and by using UNIX time command the results looks like
for 10 lines 100 lines 1000 lines
user 13.675 seconds user 11.948 seconds user 9.908 seconds
sys 0.192 seconds sys 0.076 seconds sys 0.080 seconds
for 10,000 lines 100,000 lines
user 11.672 seconds user 12.024 seconds
sys 0.072 seconds sys 0.079 seconds
to make it more simple I add user + sys and results are
10 rows 13.867 seconds
100 rows 12.024 seconds
1000 rows 9.988 seconds
10000 rows 11.744 seconds
100,000 rows 12.103 seconds
I am trying to figure out what is main reason that multiple rows insertion is better around 500-1000 lines and it becomes counter productive below 500 lines and above 1000 lines. I search over internet and found different answer.
some mention that it depends on
max_allowed_packet, bulk_insert_buffer_size, key_buffer_size .
I try these parameters but unable to see any effects.
My question is why bulk-insertion is optimum between 500-1000 rows per insert and its counter productive other than this range and what are the main factors.
I am already using this settings
max_allowed_packet = 16M
Can some body please point out a benchmark about bulk insertion efficiency or any idea how can I address why this is counter productive above certain point. I will really thankful for any small hint or idea. thanx