Table-checksum and table-sync

mark-gci · July 27, 2021, 6:34pm

pt-table-sync feels as though it could be used more efficiently if it were given --where parameters based on the results from pt-table-checksum. It doesn’t seem to do so when running with --sync-to-master, though. It feels inefficient to re-process an entire large table.

I can see which tables need attention from the CRC and CNT values in the checksums table. Does anyone know how to compose a WHERE clause based on the content of the checksums table, or perhaps have a different workflow suggestion?

matthewb · July 27, 2021, 6:58pm

When you run pt-table-sync, you need to pass the --replicate parameter and specify the database.table where the checksum data is stored. This will make pt-t-s read the checksum results first and only look at those tables with differences.

mark-gci · July 27, 2021, 7:27pm

Thank you, yes. I’m using --replicate and it is applying only to those tables, but it doesn’t seem to process only part of the table. It may be that some of our tables are in worse shape than I knew, but based upon the structure of the checksums table it seemed as though targeted processing might be available rather than going through the whole table a second time.

matthewb · July 27, 2021, 7:40pm

Hi @mark-gci,
If you have evidence that pt-t-s, while using --replicate, is checking chunks other than those where this_crc != master_crc, please open a bug report at https://jira.percona.com/, as that would indeed be a performance issue. Keep in mind that pt-t-s operates at the chunk level, so if you have a chunk of 1M rows, pt-t-s will check every row in that chunk. You can configure pt-t-c to use a static chunk size if you find this is the case.
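
To make the chunk-level comparison concrete, here is a rough Python sketch. It is only an illustration: the rows are made-up sample data shaped like a few of the checksums table columns (chunk, this_crc, this_cnt, master_crc, master_cnt), and `differing_chunks` is a hypothetical helper, not something Percona Toolkit provides.

```python
def differing_chunks(rows):
    """Return the rows whose replica CRC or row count differs from the master's."""
    return [
        r for r in rows
        if r["this_crc"] != r["master_crc"] or r["this_cnt"] != r["master_cnt"]
    ]


# Made-up sample rows, for illustration only (not real checksum output).
sample_rows = [
    {"chunk": 1, "this_crc": "a1b2", "this_cnt": 1000, "master_crc": "a1b2", "master_cnt": 1000},
    {"chunk": 2, "this_crc": "ffff", "this_cnt": 998, "master_crc": "c3d4", "master_cnt": 1000},
    {"chunk": 3, "this_crc": "e5f6", "this_cnt": 1000, "master_crc": "e5f6", "master_cnt": 1000},
]

bad_chunks = [r["chunk"] for r in differing_chunks(sample_rows)]
print(bad_chunks)
```

Only the chunks returned by a comparison like this should need re-checking, so the smaller the chunks, the less of the table gets re-read.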

mark-gci · July 27, 2021, 8:13pm

I’m not sure I could provide that evidence. It may not be happening. The whole point of this discussion is to understand more about how to determine which part of a table relates to a given checksum+crc.

matthewb · July 28, 2021, 3:43pm

That information can be found in the percona.checksums table, which holds the results of the pt-t-c process. You can see the upper and lower index boundaries.
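
As a rough sketch of how those boundaries might be turned into a WHERE clause, assuming the table was chunked on a single-column integer index: `chunk_where` and the column name `id` below are hypothetical, both boundaries are treated as inclusive (an assumption), and multi-column indexes would need more careful handling than this.

```python
def chunk_where(index_column, lower_boundary, upper_boundary):
    """Build a WHERE fragment from one checksums row's boundary values.

    Assumes a single-column integer index; a boundary of None is treated as open-ended.
    """
    conditions = []
    if lower_boundary is not None:
        # int() ensures only a numeric value ends up in the SQL text.
        conditions.append(f"{index_column} >= {int(lower_boundary)}")
    if upper_boundary is not None:
        conditions.append(f"{index_column} <= {int(upper_boundary)}")
    # With no boundaries at all, fall back to a condition that matches every row.
    return " AND ".join(conditions) or "1=1"


# Hypothetical boundary values, as they might appear in a checksums row.
where = chunk_where("id", "1001", "2000")
print(where)
```

The resulting string could then be supplied to pt-table-sync through its --where option to limit the sync to that range.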
