Questions for pt-online-schema-change

Hello

We just started using Percona Toolkit and are in testing this tool before we use it for our production service.
The Online Schema Change feature of Percona is very intreating for us. So we’re currently testing pt-online-schema-change command

In the manual page (" www.percona.com/doc/percona-toolkit/2.1/pt-online-schema-cha nge.html "), there is an explanation:

“When the tool finishes copying data into the new table, it uses an atomic RENAME TABLE operation to simultaneously rename the original and new tables. After this is complete, the tool drops the original table.”

  • How does this tool determine if copying data is finished?
  • If too much new data is being inserted on the original table with very fast rate, when does this tool finish copying and renaming?

We have a MySQL table named “sessions” which has more than 200M records, and 3M records are being inserted per day on this table now. We like to use pt-onlince-schema-chage on this table.

We had made a test case on our development database.

  • We executed pt-online-schema-change to add new columns
  • While it’s being executed, we were inserting new data as production service will be doing the same
  • Database stays online successfully, but the remaining time of copying keeps increasing as we are inserting data.
    We assume that copying data will never stop in this case, so changing schema will be never finished.

Our company provides a mobile application analytics service, and our traffic is form 224 different countries. This means that inserting new data will be executed in 24 hours without break time.

*In our case, can we still this tool for online-schema-change? If not, is there any other solution for this issue?

Perconais very impressive tool for us. We hope we can use this tool for our production service.

Thank you!

pt-online-schema-change at the beginning creates triggers on a table that syncs all the new data being inserted, updated or deleted on the fly. This means that internal MySQL mechanisms are taking care of any new data incoming to a changed table. After the triggers are created, the tool copies all existed before data in chunks to a new table. It can additionally take care of current server load and replication lag on slaves and pauses it’s work if necessary.
Note that the triggers are never “paused” or destroyed, so during the whole process any changes to a table will have to be made twice - one for original table and second time for new destination table.
Final rename table is a very fast operation, so the writes will not be practically stopped.

Thank you przemek!
We had misunderstood some points of how MySQL triggers works. Thank you for clarifying this. We will create a new test, and then post the result here.

@sunku, since you’re original post was in early 2013, I was curious if you could share some of your experience with the pt-online-schema-change tool.

We’re also considering changes to a large table (~130 million rows) which is constantly updated. Did you run into any issues or have to employ workarounds for anything? Do you use replication? (if so, how do you implement changes across all servers? has row order, resulting from triggers, been an issue)

Thanks!