load data overhead

I’m trying to restore data from text file (been generated as ‘select … into outfile …’) in a Percona server version: 5.5.28-29.3.

I launched query ‘load data infile ‘…’ ignore into table tbl_new’, but it takes already 5 days and I can’t understend what happening?

The ‘old’ table takes 38.6 GiB (information of phpmyadmin), data file size is 18.7 GiB, but the new table already takes 46.5 GiB and query still in progress!

In adddition I chenged table schema of the new one and drop primary key autoincrement (int(11)) and set to primary the new field hash_d - bigint(20), but the old table has both fields and id - autoincrement (int(11)) and hash_d - bigint(20). Table size about 220M rows.

CREATE TABLE tbl_old ( id_d int(11) unsigned NOT NULL AUTO_INCREMENT, hash_d bigint(20) NOT NULL DEFAULT ‘0’, name_d varchar(300) NOT NULL, id_z smallint(5) unsigned NOT NULL, hash_p bigint(20) NOT NULL DEFAULT ‘0’, ins_date timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, st_b smallint(5) unsigned NOT NULL, s_type int(11) NOT NULL DEFAULT ‘0’, id_st int(11) DEFAULT ‘0’, r_code smallint(3) NOT NULL DEFAULT ‘-1’, PRIMARY KEY (id_d), UNIQUE KEY idx_hd (hash_d), KEY id_st (id_st), KEY id_z (id_z), KEY idx_st_b (st_b), KEY idx_p (hash_p)) ENGINE=InnoDB DEFAULT CHARSET=utf8CREATE TABLE tbl_new ( hash_d bigint(20) NOT NULL DEFAULT ‘0’, name_d varchar(300) NOT NULL, id_z smallint(5) unsigned NOT NULL, hash_p bigint(20) NOT NULL DEFAULT ‘0’, ins_date timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, st_b smallint(5) unsigned NOT NULL, s_type int(11) NOT NULL DEFAULT ‘0’, id_st int(11) DEFAULT ‘0’, r_code smallint(3) NOT NULL DEFAULT ‘-1’, PRIMARY KEY (hash_d), KEY id_st (id_st), KEY id_z (id_z), KEY idx_st_b (st_b), KEY idx_p (hash_p)) ENGINE=InnoDB DEFAULT CHARSET=utf8

Can you show us the output of show engine innodb status and show full processlist?

If you have changed the PK from INT to BIGINT then the table will be bigger in size. INT is 4 bytes and BIGINT is 8 bytes. Also, take in account that in InnoDB every secondary key has the value of the PK attached to it. So now, every KEY will have a hidden second column with a BIGINT of 8 bytes.

Thanks alot!
This fact, about hidden column in index, I did not know. Unfortunatly I can’t show engine status because the query already finished. But when I tested different types of data and structures, I got contradictory results. The stuctures above are good example of your explanation of key’s structure, but when I tryed anothe structires, like

REATE TABLE tbl_int_key ( id int(10) unsigned NOT NULL AUTO_INCREMENT, field_1 bigint(20) NOT NULL DEFAULT ‘0’, field_2 bigint(20) NOT NULL DEFAULT ‘0’, PRIMARY KEY (id), UNIQUE KEY idex_fiealds (field_1,field_2), KEY idx_domainTo (field_2,field_1)) ENGINE=InnoDB DEFAULT CHARSET=utf8CREATE TABLE tbl_d_bigint_key ( field_1 bigint(20) NOT NULL DEFAULT ‘0’, field_2 bigint(20) NOT NULL DEFAULT ‘0’, PRIMARY KEY (field_1,field_2), KEY idx_domainTo (field_2,field_1)) ENGINE=InnoDB DEFAULT CHARSET=utf8

The first one (tbl_int_key) table was almost in 40% bigger then the second one (tbl_d_bigint_key), whenever the primary key must be bigger in the second table (tbl_d_bigint_key).

Is the answer lies in the quantity of keys in the tables?