A bug in pt-table-checksum

Hello, everyone!
I have tested pt-table-checksum in a multi-master, multi-slave environment and found a bug.
Environment:
Master 1: 192.168.10.30
Master 2: 192.168.10.31
Master 3: 192.168.10.210
Master 4: 192.168.10.211
Each server is a master of the other three and also a slave of the other three.
[Image 1]
The configuration of every slave is correct:




A bug has occurred: when the checksum runs, the replica is no longer connected to the master:

Please help me, thanks!

Why are you not using PXC or Group Replication? You have essentially re-invented the wheel with this non-standard configuration. Circular replication is never advised and can lead to all sorts of problems.

I suggest using --recursion-method=dsn and set the replicas individually. Are all servers running in STATEMENT mode? They have to be if you want to checksum “through” an intermediate source to another replica.
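
As a rough sketch of what I mean, assuming 192.168.10.30 is the server you run the tool on and that the DSN table lives in a hypothetical percona.dsns (both are my choices, not something from your setup), the script below only prints the SQL and the command so you can review them before running anything. The table layout is the one from the pt-table-checksum manual.

```shell
#!/bin/sh
# Dry run: prints SQL and a command for review, runs nothing against MySQL.
# Assumptions: source is 192.168.10.30, DSN table is percona.dsns.
SOURCE_HOST="192.168.10.30"
REPLICA_HOSTS="192.168.10.31 192.168.10.210 192.168.10.211"

# Emit the DSN table definition plus one row per replica to be checked.
dsn_table_sql() {
  cat <<'SQL'
CREATE DATABASE IF NOT EXISTS percona;
CREATE TABLE IF NOT EXISTS percona.dsns (
  id INT NOT NULL AUTO_INCREMENT,
  parent_id INT DEFAULT NULL,
  dsn VARCHAR(255) NOT NULL,
  PRIMARY KEY (id)
);
SQL
  for host in $REPLICA_HOSTS; do
    printf "INSERT INTO percona.dsns (dsn) VALUES ('h=%s');\n" "$host"
  done
}

# Emit the checksum command that reads the replica list from that table.
checksum_cmd() {
  printf 'pt-table-checksum --recursion-method=dsn=h=%s,D=percona,t=dsns h=%s\n' \
    "$SOURCE_HOST" "$SOURCE_HOST"
}

dsn_table_sql
checksum_cmd
```

To answer the STATEMENT question, running SELECT @@global.binlog_format; on each server shows the format in use.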

This configuration will be used for special scenarios, and it doesn't cause circular replication.
I have also tried --recursion-method=dsn, but the problem still exists. Does pt-table-checksum not support this kind of configuration? I think it should support this model.

Can you solve the problem in the next version?

This model is extremely non-standard and does not follow any MySQL Best Practices for replication. I am very curious as to why this topology is better for you than PXC or Group Replication? PXC and GR do exactly what you are trying to do but without all of the replication channels.

It is circular replication: INSERT on .210 replicates to .211, which replicates to .31 which replicates back to .210 (per your diagram). Because this INSERT is tagged with .210’s own server_id, .210 ignores the repeated message.

"Can you solve the problem in the next version?"

I cannot tell what the problem is from your screenshots. Why did replication stop? Can you investigate with SHOW REPLICA STATUS and update?
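
Something like the following small filter would make the output easier to post here. It is only a sketch: the function name replica_health is made up, and the host in the usage comment is just one of the addresses from this thread. It keeps the channel name, the two thread states, and the last error messages, and matches both the newer Replica_* and the older Slave_* field names.

```shell
# Keep only the fields that explain why a channel stopped.
# Reads SHOW REPLICA STATUS\G (or SHOW SLAVE STATUS\G) output on stdin.
replica_health() {
  grep -E '^[[:space:]]*(Channel_Name|(Replica|Slave)_(IO|SQL)_Running|Last_(IO|SQL)_Error):'
}

# Usage (host is an example):
#   mysql -h 192.168.10.31 -e 'SHOW REPLICA STATUS\G' | replica_health
```

With several channels on one replica, each channel gets its own block in the output, so a stopped channel should stand out.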

Let me describe the problem:
With 1 master and 3 slaves, passing "--channel=" for the slaves causes replication to stop.
With 1 master and 1 slave, the problem does not occur even if I pass "--channel=".

Try running with PTDEBUG=1 pt-table-checksum ... and see if you find any STOP REPLICA commands.
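
The debug output goes to STDERR and is long, so it helps to save it to a file and search that. A minimal sketch, assuming a log file named ptdebug.log (my choice of name) and a made-up helper called find_stop_commands:

```shell
# Print every debug line that mentions stopping replication, with line numbers.
# Matches STOP REPLICA and the older STOP SLAVE spelling, case-insensitively.
find_stop_commands() {
  grep -n -i -E 'STOP[[:space:]]+(REPLICA|SLAVE)' "$1"
}

# Usage:
#   PTDEBUG=1 pt-table-checksum ... > ptdebug.log 2>&1
#   find_stop_commands ptdebug.log
```

If the helper prints nothing, the tool most likely did not issue the stop itself and the cause is elsewhere.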

Slave configuration:


Output of SHOW SLAVE HOSTS and the checksum command:


Please take a look, thanks.


Is the parameter "--channel=13,23,43" written incorrectly?

The manual says you can only specify a single channel for the --channel parameter. The code confirms it is a single-value parameter.
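
If you wrap the tool in a script, a small guard can catch this before the run starts. This is only a sketch: check_channel is a hypothetical helper, and treating any comma as an attempted list is my assumption, not a rule taken from the tool.

```shell
# Return 0 when exactly one channel name is given, 1 otherwise.
# Assumption: a comma means the caller tried to pass a list of channels.
check_channel() {
  case "$1" in
    ''|*,*)
      echo "--channel takes exactly one channel name, got: '$1'" >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}

# Usage:
#   check_channel "13" && echo "ok to pass --channel=13"
```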

Oh, that's it! Why can't I configure multiple channels?

Are you going to change the rules?

Because that’s not how you are supposed to use the tool. You run pt-t-c on master1 and compare the checksums generated on master1 with master2/3/4. That’s it. One server has to be the actual source of truth. You pick. Run pt-t-c on that server and verify all the others match it.
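
To make that concrete, here is a sketch of the comparison step, assuming the default percona.checksums table and taking master1 (192.168.10.30) as the source of truth. The query is the one from the pt-table-checksum manual for listing tables that differ. The diff_query wrapper is just a made-up convenience, and the column names may differ in your version of the tool.

```shell
# Emit the query that lists tables whose chunks differ from the source.
# Assumes the default --replicate table, percona.checksums.
diff_query() {
  cat <<'SQL'
SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
FROM percona.checksums
WHERE (
  master_cnt <> this_cnt
  OR master_crc <> this_crc
  OR ISNULL(master_crc) <> ISNULL(this_crc))
GROUP BY db, tbl;
SQL
}

# Usage, after pt-table-checksum has finished on master1:
#   diff_query | mysql -h 192.168.10.31
#   diff_query | mysql -h 192.168.10.210
#   diff_query | mysql -h 192.168.10.211
```

An empty result on a server means its checksums match master1 for every chunk recorded there.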

OK, thanks all the same.
