Pt-online-schema-change breaks table replication in AWS DMS

I’m not sure how far this will go, but figured I’d ask here.

To give context, we are using AWS DMS to replicate CDC binlog records from an Aurora MySQL instance to S3. However, because binlog timestamps are only precise to the second, we opted to add a column to every table that auto-updates with a millisecond-precision timestamp.

We have several very large tables that can only be modified with pt-online-schema-change. However, after running it, our AWS DMS replication server complains that the table's schema no longer matches the incoming CDC records (the column count now includes the new column) and simply skips replication for that table entirely.

I’m curious if anyone has seen this issue before, as normal ALTER commands function fine.

What command exactly?

pt-online-schema-change --host=<host> --user=<user> --ask-pass --execute --no-drop-old-table --no-check-alter --alter="ADD COLUMN db_row_update_stamp TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3)" D=<db>,t=<REPLACE_TABLE_NAME_HERE>

Here’s the command we ran to make our table changes.

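Without knowing the internals of AWS DMS, this is difficult to diagnose. pt-osc never runs ALTER TABLE against the original table. It creates a new, empty copy of the original, alters that copy, adds triggers on the original so ongoing writes reach the copy, then copies rows from the original to the new table. Lastly, it renames the original out of the way and the new table into its place in a single RENAME TABLE, then drops the old table (unless --no-drop-old-table is given, as in your command).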
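To make that sequence concrete, here is a rough shell sketch that only prints the kind of statements involved. The function name `osc_sketch` and the exact SQL text are my own illustration, not pt-osc's literal output; the `_<table>_new` and `_<table>_old` names follow the tool's usual naming as far as I know.

```shell
# Print an approximate outline of what pt-osc does for one table.
# Illustrative only: the real tool builds its SQL differently.
osc_sketch() {
  local db="$1" tbl="$2" alter="$3"
  # 1. Empty copy of the original table.
  printf 'CREATE TABLE %s._%s_new LIKE %s.%s;\n' "$db" "$tbl" "$db" "$tbl"
  # 2. The ALTER is applied to the copy, never to the original.
  printf 'ALTER TABLE %s._%s_new %s;\n' "$db" "$tbl" "$alter"
  # 3. Triggers keep the copy in sync while rows are being copied.
  printf -- '-- triggers on %s.%s mirror INSERT/UPDATE/DELETE into _%s_new\n' "$db" "$tbl" "$tbl"
  # 4. Existing rows are copied in chunks.
  printf -- '-- chunked INSERT ... SELECT from %s.%s into _%s_new\n' "$db" "$tbl" "$tbl"
  # 5. Both renames happen in one statement.
  printf 'RENAME TABLE %s.%s TO %s._%s_old, %s._%s_new TO %s.%s;\n' \
    "$db" "$tbl" "$db" "$tbl" "$db" "$tbl" "$db" "$tbl"
  # 6. Cleanup of the old table.
  printf -- '-- DROP TABLE %s._%s_old (skipped with --no-drop-old-table)\n' "$db" "$tbl"
}
```

For example, `osc_sketch mydb orders "ADD COLUMN c INT"` prints six lines. The point for CDC is that the binlog never contains an ALTER TABLE on `mydb.orders` itself, only a RENAME TABLE that swaps in a table with a different column count.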

Have you been able to identify where in the process of pt-osc the DMS starts to complain? Maybe the DMS is looking/watching for an actual ALTER TABLE command on the original table, which never happens, thus it starts complaining after the RENAME?

I think the RENAME is where this falls through. My initial thinking was that AWS DMS doesn’t catch the new table being renamed to the old table’s name. For whatever reason, renaming onto an existing, tracked table name causes the schema mismatch.

I figured it would just pick up the fact that both tables get renamed, since it does support tracking RENAME statements. But I know Percona swaps both tables in a single atomic rename to make the cutover effectively instant.

A workaround I was considering testing was waiting until just before the rename executes (i.e., at 90% or 95% of the copy) and pausing replication. Once the rename happens, resume at a binlog position after the rename and just live with a minor gap in replication. This is obviously not ideal, though.
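For what it's worth, the DMS side of that workaround might look roughly like the sketch below. I have not tested it. The task ARN is a placeholder, the helper names (`run`, `pause_task`, `resume_task_at`) are made up for illustration, and whether DMS accepts a new CDC start position for a task that has already run is an assumption to check against the AWS documentation. With `DRY_RUN=1` it only prints the commands.

```shell
# Placeholder ARN; substitute the real replication task ARN.
TASK_ARN="${TASK_ARN:-arn:aws:dms:REGION:ACCOUNT:task:EXAMPLE}"
# DRY_RUN=1 prints commands instead of executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Stop the task and block until DMS reports it as stopped.
pause_task() {
  run aws dms stop-replication-task --replication-task-arn "$TASK_ARN"
  run aws dms wait replication-task-stopped \
    --filters "Name=replication-task-arn,Values=$TASK_ARN"
}

# Restart CDC from a binlog position given as <binlog-file>:<offset>.
resume_task_at() {
  local position="$1"
  run aws dms start-replication-task \
    --replication-task-arn "$TASK_ARN" \
    --start-replication-task-type start-replication \
    --cdc-start-position "$position"
}
```

The idea would be to call `pause_task` shortly before the swap, read the binlog file and position on the source right after the rename (for example with `SHOW MASTER STATUS`), and pass that to `resume_task_at`. Any changes written between the pause and that position are the gap mentioned above.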

It appears this is an issue with DMS specifically: the RENAME statement fails even without Percona Toolkit being used. So I’m marking this issue as resolved here and opening it on the AWS side.

For posterity, DMS complains that RENAME is not supported on server version 3.4.3 w/ Aurora MySQL 5.6 as a source, and S3 as a target.
