This is my first post and I am reaching out to the Percona community to provide me some guidance with a problem I am having.
We are our Master / Slave replication. Two of our Drupal tables have fallen out of synch. The tables are “users” and “users_roles”. Replication halts when an update event is made on a record within the users table that while exists on Master, it is not on Slave. By our count on Slave, the “users” table is missing 300 records while users_roles is missing 10. All other tables within the database appear to be in sync. There are no sync issues with other databases on this Master that are being replicated to this Slave. Below is the error from a custom monitoring script I created to detect this:
Slave_IO_Running: Yes Slave_SQL_Running: No Last_Error: Could not execute Update_rows event on table adid_prod_d7.users; Can't find record in 'users', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.017097, end_log_pos 165292698 Seconds_Behind_Master: NULL Last_IO_Error: Last_SQL_Error: Could not execute Update_rows event on table adid_prod_d7.users; Can't find record in 'users', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.017097, end_log_pos 165292698
To remediate this problem, I manually stop the slave process (STOP SLAVE), set the global skip counter to 1 and then start the slave. This works until the next time this happens.
This issue first happened back in October and to remediate the problem I followed the directions from this Percona article: https://www.percona.com/blog/2008/06/29/resyncing-table-on-mysql-slave/. This worked as the “users” and “users_roles” tables had an equal number of rows but this week the same problem occurred several times and much to our surprise the same two tables (“users” and “users_roles”) were out of sync again. When I followed the directions to this article, I made sure I took the websites offline so that nothing was being written or read from the Master.
I have downloaded the Percona toolkit and have started to look at the documentation. I will admit that I do not know what to look for when running the pt-mysql-summary command or how this toolkit will affect my production environment. Will I have to notify my customers that my sites will be offlinett to run this toolkit? Besides backing up the databases prior to invoking pt-table-sync, what do I need to do beforehand and what are the best options to use.
Thus far I am thinking the command will be:
pt-table-sync --dry-run --sync-to-master --replicate -h slavedb
I am looking for some guidance here.