Pt-slave-repair is a supplement to the original pt-slave-restart tool

pt-slave-repair is a supplement to the original pt-slave-restart tool. It automatically repairs MySQL master-slave replication errors and restores the interrupted SQL replication thread.

Principle:

1. When replication error 1062 (primary key conflict, duplicate entry) or 1032 (data loss, the record is missing on the slave) is detected, the tool first checks the binlog environment. If binlog_format is not ROW and binlog_row_image is not FULL, the main program exits. If the error number is neither 1062 nor 1032, the main program also exits immediately.

2. Read the output of SHOW SLAVE STATUS to obtain the binlog file, position, and GTID information.

3. Connect to the master database and parse the binlog. If the failing event is a DELETE statement, it is skipped directly.

4. Disable multi-threaded parallel replication (slave_parallel_workers).

5. If GTID replication mode is enabled, skip the failing transaction with the SET gtid_next method; if position-based replication is used, use the SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1 method.

6. If the failing event is an UPDATE or INSERT statement, parse the binlog event into concrete SQL, reverse the SQL, and convert it into a REPLACE INTO statement.

7. Apply the resulting REPLACE INTO statement to the slave to keep the data consistent, and then perform step 5.

8. Set the slave to read-only mode.

9. The same process continues until SHOW SLAVE STATUS reports double YES (both the IO thread and the SQL thread running), i.e. normal synchronization.
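
The decision logic in steps 1, 5, and 6 can be sketched in Python as follows. This is only an illustrative sketch, not the tool's actual source. The function names `can_repair`, `skip_statements`, and `row_to_replace` are hypothetical. The environment check takes the stricter reading that both ROW and FULL are required, and the GTID branch assumes the common approach of committing an empty transaction under the failing GTID.

```python
# Errors the tool attempts to repair: duplicate key and missing row.
REPAIRABLE_ERRORS = {1062, 1032}


def can_repair(errno, binlog_format, binlog_row_image):
    """Step 1: accept only errors 1062/1032 on a ROW + FULL binlog setup.

    Assumption: both settings must match (stricter than the text may imply).
    """
    if errno not in REPAIRABLE_ERRORS:
        return False
    return binlog_format.upper() == "ROW" and binlog_row_image.upper() == "FULL"


def skip_statements(gtid_mode_on, failed_gtid=None):
    """Step 5: build the statements that skip the failing transaction."""
    if gtid_mode_on:
        if not failed_gtid:
            raise ValueError("failed_gtid is required when GTID mode is on")
        # Assumption: inject an empty transaction for the failing GTID.
        return [
            "STOP SLAVE",
            f"SET gtid_next = '{failed_gtid}'",
            "BEGIN",
            "COMMIT",
            "SET gtid_next = 'AUTOMATIC'",
            "START SLAVE",
        ]
    # Position-based replication: skip one event group.
    return [
        "STOP SLAVE",
        "SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1",
        "START SLAVE",
    ]


def row_to_replace(table, row):
    """Step 6: turn a row image (column -> value) into a REPLACE INTO.

    Returns a parameterized statement and its values; the %s placeholder
    style is an assumption about the client driver.
    """
    columns = ", ".join(f"`{name}`" for name in row)
    placeholders = ", ".join(["%s"] * len(row))
    sql = f"REPLACE INTO {table} ({columns}) VALUES ({placeholders})"
    return sql, tuple(row.values())
```

For example, `row_to_replace("test.t1", {"id": 1, "name": "a"})` yields the statement ``REPLACE INTO test.t1 (`id`, `name`) VALUES (%s, %s)`` together with the values `(1, "a")`; the table and column names here are made up for illustration.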

Test:

1. First, set up the master-slave replication environment and insert three records into the master database. Once the slave database has finished synchronizing these three records, truncate the table on the slave side. Then, perform a full-table update on the master database. This will simulate the occurrence of error 1032.

2. Run the pt-slave-repair tool for repair.
shell> ./pt-slave-repair -H 192.168.198.239 -P 3346 -u repl -p 123456 -d test
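
After the tool has run, the double YES state can be confirmed from a SHOW SLAVE STATUS row. The following is a minimal sketch, assuming the row has already been fetched into a dict by some client library. The helper name `replication_healthy` is hypothetical, while the keys are standard SHOW SLAVE STATUS columns.

```python
def replication_healthy(status):
    """Return True when both replication threads report 'Yes'.

    `status` is one SHOW SLAVE STATUS row as a dict (assumed to be
    fetched elsewhere, e.g. with a dict-style cursor).
    """
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    return io_ok and sql_ok


# Hypothetical rows, for illustration only.
broken = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "No", "Last_SQL_Errno": 1032}
repaired = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes", "Last_SQL_Errno": 0}

print(replication_healthy(broken))
print(replication_healthy(repaired))
```

In the test scenario above, the first row corresponds to the slave stopped on error 1032 and the second to the state expected once the repair has succeeded.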