How to run pt-table-sync on two replicas

I have one master environment and two slaves.

MASTER: 192.168.200.14
SLAVE: 192.168.200.74
SLAVE: 192.168.200.12

I ran pt-table-checksum and noticed a difference in slave 200.74.

According to the pt-table-sync manual, “changes are always made at the replication source, never directly on the replica.”

If the changes are made directly at the source (MASTER), I believe that through the binlog, the changes (replace) will also happen on 200.12, which has no differences. Am I right?

The command I intend to execute is this:

pt-table-sync --execute --replicate=percona.checksums --sync-to-master --databases=teste --tables=cadlogenviosmsespecifico h=192.168.200.74,u=usr_xxxxxx,p=xxxxxx --socket=/var/lib/mysql/mysql.sock

The question is:

Does it apply the correction to 200.74 without applying it to the master? Or does it apply it to the master and the data comes through the binlog, also executing on 200.12?

I’m a little confused about how the tool works when you have two replicas.

Correction: you have one source and two replicas.

Correct

--sync-to-master

The correct flag is --sync-to-source

Yes. As it states, changes NEVER take place directly on replicas. Changes always go through replication. If the data is correct on 200.12, then the UPDATE does nothing.

Perfect, understood! Thanks for your help.

And what about the case of missing data in 200.74, for example? pt-table-sync tends to perform REPLACE/INSERT commands, correct?

Assuming that:

  • Data is missing in replica 200.74;
  • The data is correct in Source 200.14;
  • The data is correct in replica 200.12.

How does pt-table-sync behave, since the data exists in the Source and in one of the replicas?

Usually when data is missing in the replica, it performs a replace, correct? So, if the data that is missing in one of the replicas already exists in the source, does it delete the data from the source and insert it again so that the data reaches the replicas?

Doesn’t this tend to cause a replication error because the data already exists in one of the replicas? Or is the delete that is executed at the source also sent to the slaves?

This flow is a little confusing to me.

Hello @andryosribeiro

This tool changes data, so for maximum safety, you should back up your data before using it. When synchronizing a server that is a replication replica with the --replicate or --sync-to-source methods, it always makes the changes on the replication source, never the replication replica directly. This is in general the only safe way to bring a replica back in sync with its source; changes to the replica are usually the source of the problems in the first place. However, the changes it makes on the source should be no-op changes that set the data to their current values, and actually affect only the replica.
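In other words, it rewrites the rows on the source with their current values, and those statements replicate to all the replicas. On 200.12, where the data already matches, they change nothing; on 200.74, they bring the rows back in sync.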
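To make the flow concrete, here is a toy model of that behavior. It is only a sketch under loud assumptions: SQLite in-memory databases stand in for the three MySQL servers, replaying a list of statements stands in for binlog replication, and the table `t` and its rows are hypothetical. The statements pt-table-sync really generates may differ, and rows that exist only on a replica (which would need a DELETE) are not modeled.

```python
import sqlite3

def make_server(rows):
    """Create an in-memory database with one table holding the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, msg TEXT)")
    conn.executemany("INSERT INTO t (id, msg) VALUES (?, ?)", rows)
    conn.commit()
    return conn

def dump(conn):
    """Return all rows of the table, ordered by primary key."""
    return conn.execute("SELECT id, msg FROM t ORDER BY id").fetchall()

good = [(1, "a"), (2, "b"), (3, "c")]
source = make_server(good)                          # plays 192.168.200.14
replica_12 = make_server(good)                      # plays 192.168.200.12, in sync
replica_74 = make_server([(1, "a"), (3, "stale")])  # plays 192.168.200.74: row 2 missing, row 3 differs

# For every source row that replica_74 lacks or has wrong, run a REPLACE
# on the SOURCE using the source's own current values, and record it in
# a list that plays the role of the binlog.
binlog = []
for row in dump(source):
    if row not in dump(replica_74):
        sql = "REPLACE INTO t (id, msg) VALUES (?, ?)"
        source.execute(sql, row)   # no-op in effect: same values as before
        binlog.append((sql, row))

# "Replication": every replica replays the same statements.
for replica in (replica_12, replica_74):
    for sql, params in binlog:
        replica.execute(sql, params)

print("source    :", dump(source))
print("replica_12:", dump(replica_12))
print("replica_74:", dump(replica_74))
```

In this model the source and 200.12 end up with exactly the rows they started with, because REPLACE overwrites each row with identical values, while 200.74 gains the missing row and has the stale one corrected. No duplicate-key error arises on 200.12, since REPLACE removes any existing row with the same key before inserting.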

You can use the --print option in place of --execute to see all the statements it is going to execute, without changing any data.
