I am going to use percona-toolkit for the first time. I want to check replication integrity and data differences between the servers. My replication setup is fairly complex: two masters (A and B) in an active-active multi-master topology replicate to each other, and each master has 3 slaves, one of which uses replication filters and so replicates only some of the tables (D and F in the picture). I want to check replication integrity and data differences from master B across the entire replication cluster. The replication topology looks like this:
Master A <-----> Master B
| \ …|
C D …E F
The problem is the replication filters: how do I checksum those slaves? I don't want to use --no-check-replication-filters because, according to the documentation, it might break replication, which I can't afford on a production setup.
The only way I can think of is to run the tool twice with --recursion-method=dsn. For the first run I would put A, B, C and E in the dsn table so that only those servers are checked. For the second run I would recreate the dsn table with only slaves D and F, so that only the slaves with replication filters are checked from master B. Please advise whether this is the correct approach for this replication topology.
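To make the two-run idea concrete, here is a rough sketch of how I would generate the SQL for the dsn table before each run. The table layout (id, parent_id, dsn) is the one I understand the pt-table-checksum documentation to describe for a dsn table. The host names (hostA ... hostF), the port 3306, and the function name dsn_table_sql are placeholders of mine, not my real servers.

```shell
#!/bin/sh
# Sketch only: prints SQL that (re)creates test.dsns and fills it with
# one row per host. Host names and port are placeholders (assumptions).
dsn_table_sql() {
  cat <<'SQL'
CREATE TABLE IF NOT EXISTS test.dsns (
  id INT NOT NULL AUTO_INCREMENT,
  parent_id INT DEFAULT NULL,
  dsn VARCHAR(255) NOT NULL,
  PRIMARY KEY (id)
);
DELETE FROM test.dsns;
SQL
  # One INSERT per host passed as an argument.
  for host in "$@"; do
    printf "INSERT INTO test.dsns (dsn) VALUES ('h=%s,P=3306');\n" "$host"
  done
}

# Run 1: the servers without replication filters, as listed in my plan.
dsn_table_sql hostA hostB hostC hostE

# Run 2: only the slaves with replication filters.
dsn_table_sql hostD hostF
```

The printed SQL would be executed on master B before the corresponding pt-table-checksum run, so each run sees only the hosts meant for it.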
Secondly, my database is quite large, a couple of hundred GB, and around 20% of the tables have 7-8 million rows. I am also using a mix of MyISAM and InnoDB tables. What --chunk-size and --chunk-size-limit should I use, or should I keep the defaults? I don't want to put any extra load on the servers. I have settled on the following two commands.
Command 1 is for the servers without replication filters, i.e. A, B, C and E. As I understand it, both commands (1 and 2) checksum the tables of the database given in --databases on master B, replicate the test.checksum table from master B to all slaves, then check each slave listed in the dsn table, compare its checksums with those of master B, and report any differences in the output.
It should also cope with data that keeps changing through replication while the tool runs (it might checksum on a slave only the data that was checksummed on master B earlier). Later, pt-table-sync can be used to fix the differences between master and slaves.
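For checking the result on each slave afterwards, this is the kind of query I have in mind, wrapped in a small shell function. The column names (db, tbl, this_cnt, master_cnt, this_crc, master_crc) are those of the replicate table as I understand it from the documentation. The function name diff_query and the host name in the comment are placeholders of mine.

```shell
#!/bin/sh
# Sketch only: prints a query that lists tables whose chunks differ
# between master and slave in the given replicate table.
diff_query() {
  cat <<SQL
SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
FROM $1
WHERE master_cnt <> this_cnt
   OR master_crc <> this_crc
   OR ISNULL(master_crc) <> ISNULL(this_crc)
GROUP BY db, tbl;
SQL
}

# Print the query for the replicate table used in my commands.
diff_query test.checksum

# To run it on a slave (host name and credentials are placeholders):
# diff_query test.checksum | mysql --host=hostC --user=XXXX -p
```

An empty result on a slave would mean that no checksummed chunk on that slave differs from master B.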
Command 1:
$ ./pt-table-checksum --empty-replicate-table --replicate=test.checksum --create-replicate-table --recursion-method=dsn=D=test,t=dsns --databases=mydatabase --host=masterB --user=XXXX --password=XXXX
Command 2 is only for the slaves with replication filters, checked from master B, i.e. slave D and slave F.
Command 2:
$ ./pt-table-checksum --empty-replicate-table --replicate=test.checksum --create-replicate-table --recursion-method=dsn=D=test,t=dsns --databases=employees --host=masterB --user=XXXX --password=XXXX
Please advise whether all of these steps are OK to run on a production system, or whether there is a better way to run pt-table-checksum against the replication topology I described.