pt-table-checksum with replicate_do_db

Hello All,

I believe my problem is mostly down to simply understanding the pt-table-checksum documentation, but I need a bit of help none the less.

My goal is to checksum in an environment where only two dbs of many are replicated to the slave. With this being the case, the percona.checksums table never replicates to the slave. Is there a way to do this without writing the checksums to the table? Is my only option to just include the percona db in my replication?

This means master has more database tables as compare to slave because of restricted set of databases replicated via --replicate-do-db. So, checksum on those databases will not fall to slave. Did you tried --no-check-replication-filters option of pt-table-checksum ? Can you show us my.cnf from master/slave and exact command used for pt-table-checksum ? Yes, restrict specific database tables in checksum process is good idea.

Based in what you explained with this sentence: my goal is to checksum in an environment where only two dbs of many are replicated to the slave

You will be able to checksum just those database schemas that exists on both master and slave and the table storing the checksums should be on both servers as the final results in terms of diffs are queried on the slave side.


SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks FROM percona.checksums WHERE ( master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) GROUP BY db, tbl; 

As you’re using replicate_do_db to get just two schemas replicated, have you tried to start the checksum targeting just those two schemas, having the checksums table created in one of them? This way you can:

if replicate_do_db is like:

[mysqld]
replicate_do_db=a,b

Create the checksums table on schema a on the master side:

CREATE TABLE IF NOT EXISTS a.checksums (
db CHAR(64) NOT NULL,
tbl CHAR(64) NOT NULL,
chunk INT NOT NULL,
chunk_time FLOAT NULL,
chunk_index VARCHAR(200) NULL,
lower_boundary TEXT NULL,
upper_boundary TEXT NULL,
this_crc CHAR(40) NOT NULL,
this_cnt INT NOT NULL,
master_crc CHAR(40) NULL,
master_cnt INT NULL,
ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (db, tbl, chunk),
INDEX ts_db_tbl (ts, db, tbl)
) ENGINE=InnoDB;

Start the checksum targeting master server:

pt-table-checksum -h <master_dbserver_ip_address> -P <master_dbserver_port> ignore-databases b,c,d,e,f,g,h,i --replicate a.checksums --chunk-size=100

Obviously, there are many other options you can use on pt-table-checksum and carry on following the documentation to avoid any disruption on your production environment.

Let us know please whether it make sense or not, cheers.