Hello,
I am running into an issue after restoring a backup from our Percona cluster onto a slave node. The slave starts up and replication runs [1]; however, after the next few transactions, replication breaks and Slave_SQL_Running turns to No because of the HA_ERR_KEY_NOT_FOUND error [2] shown below. We use Percona XtraBackup to take and restore the backup. In our experience, this error usually occurs when the master node has updated a schema and the slave has not received the updated schema. We tried injecting an empty transaction while leaving GTID enabled. We have repeated the restore several times; the failing transaction is different each time and does not point to the same table or database.
We followed steps similar to those in a Percona blog post [3], using mysqlbinlog to verify that the transactions for the affected tables are present on both the master and the slave node.
Has anyone in the community hit this issue before?
[1]
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[2] Last_Error: Could not execute Update_rows event on table test_database.test_table; Can't find record in 'test_table', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.002551, end_log_pos 7245503
[3] [url]https://www.percona.com/blog/2010/05/06/debugging-problems-with-row-based-replication/[/url]
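
In case it helps anyone compare the failing events across restores, here is a minimal Python sketch that pulls the table, error code, and binlog coordinates out of the Last_Error text in [2] and builds a mysqlbinlog command line for inspecting that binlog file. The function names (parse_last_error, mysqlbinlog_args) are hypothetical helpers written for this post, and the regular expression assumes the error text has the same shape as the message in [2]; other event types or server versions may format it differently.

```python
import re

# Matches the row-based replication error text as it appears in [2].
# Assumption: the message keeps this field order (table, Error_code,
# handler error, master log, end_log_pos).
LAST_ERROR_RE = re.compile(
    r"Could not execute (?P<event>\w+) event on table "
    r"(?P<schema>[^.\s]+)\.(?P<table>[^;\s]+);"
    r".*?Error_code: (?P<code>\d+);"
    r".*?handler error (?P<handler>\w+);"
    r".*?master log (?P<log_file>[^,\s]+), end_log_pos (?P<end_pos>\d+)",
    re.DOTALL,
)


def parse_last_error(text):
    """Return a dict of fields from a Last_Error string, or None if no match."""
    match = LAST_ERROR_RE.search(text)
    if match is None:
        return None
    info = match.groupdict()
    info["code"] = int(info["code"])
    info["end_pos"] = int(info["end_pos"])
    return info


def mysqlbinlog_args(info):
    """Build an argument list that decodes row events up to the failing position.

    The binlog file name is used as-is, so the command is expected to run
    where that file is readable (typically on the master).
    """
    return [
        "mysqlbinlog",
        "--base64-output=decode-rows",  # show row events as pseudo-SQL, not base64
        "--verbose",
        f"--stop-position={info['end_pos']}",
        info["log_file"],
    ]
```

Calling parse_last_error on the message in [2] and passing the result to mysqlbinlog_args gives an argument list that can be handed to subprocess.run or joined into a shell command. Collecting the parsed schema, table, and end_log_pos from each restore attempt makes it easier to see whether the failures share anything in common.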