I want to alter a table that holds just under 1 billion rows and 200GB of data. I’m confident that I can use pt-osc (pt-online-schema-change) for this, but are there any considerations to take into account? For example: staying away from backup windows, etc.
Here are some pre-flight checks:
- Make sure there are no long-running transactions at the start of the operation, and again before the RENAME stage of pt-osc.
- Make sure to check if there are any foreign keys referencing the table.
- Check for triggers. Recent MySQL versions (5.7+) allow multiple triggers per table, which pt-osc relies on, but it would be best to test first with the --dry-run option before running with --execute.
- Consider where you are going to run pt-osc, e.g. standalone server, master-slave, PXC/Galera cluster, etc.
- Generally we recommend testing in your staging environment with a similar setup to check that everything goes well before doing it in production.
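As a sketch, the checks above could look something like this on the master. The database name `mydb`, table name `mytable`, and the ALTER statement are placeholders, not details from this thread:

```shell
# Placeholders: mydb, mytable, and the ALTER clause are illustrative only.

# 1. Look for transactions that have been open for more than 60 seconds:
mysql -e "SELECT trx_id, trx_started, trx_mysql_thread_id
          FROM information_schema.INNODB_TRX
          WHERE trx_started < NOW() - INTERVAL 60 SECOND;"

# 2. Find foreign keys that reference the target table:
mysql -e "SELECT TABLE_NAME, CONSTRAINT_NAME
          FROM information_schema.KEY_COLUMN_USAGE
          WHERE REFERENCED_TABLE_NAME = 'mytable';" mydb

# 3. Check for triggers on the target table:
mysql -e "SHOW TRIGGERS LIKE 'mytable';" mydb

# 4. Rehearse the schema change without touching any data:
pt-online-schema-change --dry-run \
  --alter "ADD COLUMN new_col INT NULL" \
  D=mydb,t=mytable
```

These are command fragments that need a running MySQL server and pt-osc installed, so treat them as a checklist template rather than a copy-paste script.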
Thanks jrivera for the quick answer!

The update has already been applied to test and acceptance systems, in each case first with the --dry-run option. Both databases were updated successfully, but they are tiny in size compared to the production database.

Yes, the table has foreign keys pointing to other tables. I’m using the --alter-foreign-keys-method auto parameter for that.

The table is never part of long-running transactions; that is for sure.

The database is part of a master-slave setup. I’m using the --recursion-method hosts parameter for that.

Anything else that needs my attention prior to the update? And generally speaking, there is no size/row-count restriction on the usage of pt-osc, correct?
There is no size or row-count limitation, but the bigger the table (row count), the longer pt-osc will take. If you have reads going to the slaves and you don’t want the slaves to lag too much, you may want to add --check-slave-lag and --max-lag. Do not execute pt-osc from the slave side, or you’ll run into the problem described in "pt-online-schema-change (if misused) can't save the day" on the Percona Database Performance Blog.
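Putting the flags mentioned in this thread together, an execute run from the master might look like the sketch below. The names `mydb`, `mytable`, `slave1.example.com`, the ALTER clause, and the lag threshold are all illustrative assumptions, not values from this thread:

```shell
# Sketch of a pt-osc execute run on a master-slave setup.
# mydb, mytable, slave1.example.com, and the ALTER are placeholders.
pt-online-schema-change --execute \
  --alter "ADD COLUMN new_col INT NULL" \
  --alter-foreign-keys-method auto \
  --recursion-method hosts \
  --check-slave-lag h=slave1.example.com \
  --max-lag 10 \
  D=mydb,t=mytable
```

With --check-slave-lag and --max-lag, pt-osc pauses copying chunks whenever the named slave lags more than the threshold (here 10 seconds), which keeps read replicas usable during the long copy of a billion-row table.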
Thanks again. I will continue with confidence then.