Parallel Import in MySQL not improving performance...

ErikEngerd · August 1, 2012, 12:56pm

Hi all,

As suggested in many posts, and for instance in the book ‘High Performance MySQL’, it is a good idea to restore the different tables of a database in parallel. However, this does not seem to give a significant improvement in restore time.

For instance, I first tried the following. Using an awk script, I split one sequential mysqldump file (SQL format) into separate SQL files, taking into account the statements at the start and the end of the dump. Then I imported those files in parallel using xargs with the -P option, making sure to import the biggest files first to minimize total import time.
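For readers who want to reproduce the splitting step, here is a minimal sketch in Python rather than the awk script used above (which is not shown in this post). It assumes the dump contains the usual mysqldump comment lines of the form "-- Table structure for table `name`" and treats everything before the first such line as the shared header. As a simplification, the footer statements simply stay attached to the last table's chunk. The function names are made up for illustration.

```python
import re

# Assumed marker: the comment line mysqldump writes before each table.
TABLE_MARKER = re.compile(r"^-- Table structure for table `(.+)`")


def split_dump(lines):
    """Split dump lines into (header, {table_name: lines}).

    The header is everything before the first table marker; it would
    be prepended to every per-table file before importing.
    """
    header = []
    tables = {}
    current = None
    for line in lines:
        match = TABLE_MARKER.match(line)
        if match:
            current = match.group(1)
            tables[current] = []
        if current is None:
            header.append(line)
        else:
            tables[current].append(line)
    return header, tables


def largest_first(tables):
    """Return table names ordered by chunk size, biggest first."""
    def size(name):
        return sum(len(line) for line in tables[name])
    return sorted(tables, key=size, reverse=True)
```

Each chunk would then be written to its own file with the header prepended, and the files handed to the parallel importer in the order returned by largest_first.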

On one system where I tried this (Windows, one quad-core CPU, MySQL 5.0.82-community-nt-log), using 4 parallel imports, I did not achieve any reduction in restore time at all. The total time was practically the same. Yet ‘show processlist’ showed that SQL statements for different tables were being executed in parallel.

Then I thought it could be related to Windows and/or this old MySQL version, so I tried it on one of the newer systems. This is a dual 6-core processor machine running Red Hat EL 6.2 and MySQL 5.1.61. This time, after googling around some more, I dumped the database in mysqlimport format with one txt file per table. Then I used mysqlimport with the --use-threads=4 option.
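The mysqlimport invocation described above can be sketched roughly as follows. The database name `mydb` and the directory `/tmp/dump` are hypothetical placeholders, and the helper function is made up for illustration; only the --use-threads option is taken from the post. Note that mysqlimport only loads data, so the tables must already exist in the target database.

```python
import glob


def build_mysqlimport_argv(database, txt_files, threads=4):
    """Build the argument list for a multi-threaded mysqlimport run.

    mysqlimport derives each table name from the file name, so
    /tmp/dump/orders.txt is loaded into the table `orders`.
    """
    return ["mysqlimport", f"--use-threads={threads}", database, *txt_files]


if __name__ == "__main__":
    # Hypothetical location of the per-table .txt files.
    files = sorted(glob.glob("/tmp/dump/*.txt"))
    argv = build_mysqlimport_argv("mydb", files)
    # Print the command instead of running it; to execute it, pass
    # argv to subprocess.run(argv, check=True).
    print(" ".join(argv))
```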

In the latter case, I am seeing a performance improvement, going from 2 hours 45 minutes with a single SQL file imported in a single thread to 2 hours with 4 threads using mysqlimport.
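To put those two timings in perspective, the short calculation below works out the observed speedup and, as a rough model only, the serial fraction that Amdahl's law would imply for 4 workers. Amdahl's law is an assumption brought in here for illustration; nothing in the measurements shows that the restore actually behaves that way.

```python
def speedup(serial_minutes, parallel_minutes):
    """Observed speedup of the parallel run over the serial run."""
    return serial_minutes / parallel_minutes


def amdahl_serial_fraction(observed_speedup, workers):
    """Serial fraction s implied by Amdahl's law.

    1/S = s + (1 - s)/N  =>  s = (1/S - 1/N) / (1 - 1/N)
    """
    return (1 / observed_speedup - 1 / workers) / (1 - 1 / workers)


serial = 2 * 60 + 45   # 2 h 45 min, single SQL file, one thread
parallel = 2 * 60      # 2 h, mysqlimport with 4 threads

s = speedup(serial, parallel)
fraction = amdahl_serial_fraction(s, workers=4)
ideal = serial / 4     # restore time if 4 threads scaled perfectly

print(f"observed speedup: {s:.3f}x")
print(f"ideal time with 4 threads: {ideal:.2f} min")
print(f"implied serial fraction: {fraction:.2f}")
```

Under that model, a speedup of about 1.4x with 4 threads corresponds to roughly 64% of the work not being parallelized, which fits the suspicion of a shared bottleneck rather than a lack of CPU cores.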

I am willing to ignore the results from the Windows system (as it will be replaced in the near future anyway), but on the Linux system I would have expected a much larger performance increase.

Any ideas as to what could be the problem? The MySQL configuration on the Windows and Linux systems is similar (the Linux my.cnf is based on the Windows my.ini file). It looks like there is some bottleneck in the database that is preventing truly parallel insertion.

Also, it is perhaps good to mention that we are using a single InnoDB data file and not file-per-table. I also experimented with file-per-table on my laptop but got much lower performance.

Any ideas as to what could be going on?

Cheers
Erik

ErikEngerd · August 1, 2012, 1:08pm

Some more info.
On both machines a Dell RAID card is used with write back caching and battery backup. Also the database on the second system (let’s focus only on the linux system) is about 80GB in size. The largest table is approx. 7 GB.

Also, I have attached a mysql config file with passwords and hostnames removed.

The database is a slave to another database, but during the restore the slave is (obviously) turned off. There is also another system which is a slave to this database. The database is not being used during backup, as the application connects only to the master.

The restored database uses InnoDB tables exclusively.
