I have a very large partitioned table that I need to add some new columns too. It will take days to migrate and I can’t have the system offline while I do it. Perfect job for pt-online-schema-change!
Normally this means I need double the space on my disk (old copy + new copy) to do the modification with pt-online-schema-change. We don’t have enough space currently and we could add it, but then if we add it to the file system (xfs) with the database tables on it we can’t give it back when we are done. You can’t reduce the size of an xfs file system. There are multiple TB of data that the table uses.
The big idea was to add the column(s) and modify the tablespace (new tablespace on new disk) at the same time thus writing the table on a new disk using pt-online-schema-change. Then in a second go just modify the tablespace back to the original tablespace/disk (‘TABLESPACE = innodb_file_per_table’). Leaving the temporary work disk empty again so we can release it back to the wild to be used for other things.
Here is my alter statement
alter table vmv_base
ADD COLUMN transcript_id VARCHAR(20) DEFAULT NULL AFTER gene_id,
REORGANIZE PARTITION pSynonymous,pNonSynonymous,pNoCall INTO (
PARTITION pSynonymous VALUES IN (0) TABLESPACE = tmp1,
PARTITION pNonSynonymous VALUES IN (1) TABLESPACE = tmp1,
PARTITION pNoCall VALUES IN (2) TABLESPACE = tmp1)
The Problem: You can’t combine other alter statements with a REORGANIZE PARTITION alter.
So I can move the table or alter the table but I can’t do both at the same time. In my case one w/o the other does not work I will blow out the disk space or just move the table without altering it.
I looked into the the ‘DATA DIRECTORY’ options on the partitions but that does not seem to be supported at all for alter table(just create), plus the alter statement also uses the problematic REORGANIZE PARTITION in the same was as above with the same limitation as my statement above.
Adding column(s) with mysql’s ALTER statement with algorithm=online results in the table being rebuilt in in the same directory (needing twice the disk space too) so that is a non starter.
I tried using two –alter clauses with pt-online-schema-change, but that only executes the second alter.
In my mind that would be the cleanest way to make this all work is allow multiple alter statements to pt-online-schema-change to get around the rule that says you can’t do both these alterations in one go.
Before I go off and basically emulate the pt-online-schema-change tool on my own, is there an easy solution I am overlooking? Some other tool maybe…
- create new table on new tablespace
- write my own script to loop over the data and insert it
- stop processing
- rename the tables.
- drop the old table.