Auto increment bug experienced using PT Online Schema Change on RDS

Heyo Percona and MySQL experts. I’m hoping someone has ideas on how I can reproduce an issue, or on how it may have happened in the first place.

We ran Percona Toolkit’s Online Schema Change (version 3.5.4) against our MySQL RDS instance (version 8.0.28). Shortly after running what looked like a successful collation change on a couple of columns, we noticed that the auto increment value was off, causing lookup joins to return invalid data. Somehow the auto increment value had reset to something a few thousand lower! That is roughly two or three days’ worth of inserts.

I thought maybe it was MySQL 8 caching on information_schema. It turns out someone had a similar problem a while back: https://bugs.mysql.com/bug.php?id=91038. However, we are unable to reproduce this issue. We took a snapshot, inserted a bunch of records, and saw that information_schema’s cached value of the auto-increment was too low. We then ran the migration again, but this time the new auto-increment value was correct and preserved. Beyond that, our value of information_schema_stats_expiry is one day, so that doesn’t exactly line up with losing two or three days’ worth of inserts.
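
A minimal sketch of that comparison, in case it helps anyone trying to reproduce this. It assumes a DB-API cursor from a driver that uses %s placeholders (PyMySQL, for example); the schema name mydb, the table name mytable, and the primary key column id are placeholders, not our real names. It reads AUTO_INCREMENT from information_schema once with the session's current settings, once more after setting information_schema_stats_expiry = 0 for the session, and then reads MAX(id) from the table itself.

```python
def quote_ident(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


# Reads the AUTO_INCREMENT column that information_schema reports for a table.
AUTO_INC_SQL = (
    "SELECT AUTO_INCREMENT FROM information_schema.TABLES "
    "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s"
)


def check_auto_increment(cursor, schema, table, pk="id"):
    """Return (cached, fresh, max_pk) for one table.

    cached: AUTO_INCREMENT as reported with the session's current settings
    fresh:  AUTO_INCREMENT after disabling the statistics cache for this session
    max_pk: highest value currently stored in the primary key column
    The pk column name defaults to "id", which is an assumption.
    """
    cursor.execute(AUTO_INC_SQL, (schema, table))
    cached = cursor.fetchone()[0]

    # With expiry 0, MySQL 8.0 fetches table statistics from the storage
    # engine instead of serving the cached copy.
    cursor.execute("SET SESSION information_schema_stats_expiry = 0")
    cursor.execute(AUTO_INC_SQL, (schema, table))
    fresh = cursor.fetchone()[0]

    # Identifiers cannot be bound as parameters, so they are quoted by hand.
    cursor.execute(
        f"SELECT MAX({quote_ident(pk)}) "
        f"FROM {quote_ident(schema)}.{quote_ident(table)}"
    )
    max_pk = cursor.fetchone()[0]
    return cached, fresh, max_pk


def is_too_low(auto_increment, max_pk):
    """True if the reported next value would collide with an existing row."""
    if auto_increment is None or max_pk is None:
        return False
    return auto_increment <= max_pk
```

If is_too_low is true for the cached value but false for the fresh one, only the information_schema cache is behind. If it is true for the fresh value as well, the table's real counter is too low.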

Another interesting point: unless Django was eating the errors, we should have seen inserts fail because the auto increment value was already in use, and we did not see those errors in our application logs.

Anyone have suggestions, thoughts, explanations, or anything else that I can use as a lifeline to understand what happened? We can require information_schema_stats_expiry = 0 on migrations, but since we can’t reproduce the problem, we can’t say that this is the fix.

Not sure where to go from here; would love any help. Thanks and cheers!

Could you please share the full pt-online-schema-change command you used and its output?