Not the answer you need?
Register and ask your own question!

pt-table-checksum not showing diffs and reporting success

jhbremerjhbremer EntrantCurrent User Role Beginner
Hello,

I've used pt-table-checksum in the past to great success. However I'm having some trouble using it with a Percona Cluster setup. Currently my setup is a 3 node cluster, and two of these nodes each have their own standard replication slave, for a total of 5 servers.

I'm using the DSN method to define connections to each cluster node as well as the slave instances. One of the slaves is physically located near the cluster so no SSH tunnel is necessary. When running the tool here, it successfully discovers diffs on the slave node.

However, the 2nd slave server is physically far away and can only be connected to via SSH tunnel. I have set up a tunnel, and entered the information into the DSN table. I've verified I can connect to the percona server on the slave over the tunnel. When running pt-table-checksum for this slave, it reports 0 diffs found when I am certain there is a difference on a specific table. There are no error message that I can see.

Any ideas?

Thanks!

Comments

  • carlos.salguerocarlos.salguero Percona Toolkit Developer Percona Staff Role
    Hello,
    Could you run pt-table-checksum enabling debug info?
    PTDEBUG=1 pt-table-checksum <params>

    Thanks
  • jhbremerjhbremer Entrant Current User Role Beginner
    Thanks for the reply. Here is the debug output (at least the last bit of it where it is checking the slave in question). I've replaced server names, etc.:

    # pt_table_checksum:10925 40168 Table on [brokenSlaveServer] has 242 rows
    # pt_table_checksum:12118 40168 use_repl_db
    # pt_table_checksum:12139 40168 USE `percona`
    # pt_table_checksum:10991 40168 DELETE FROM `percona`.`checksums` WHERE db = ? AND tbl = ?
    # pt_table_checksum:12118 40168 use_repl_db
    # pt_table_checksum:12139 40168 USE `ftwill`
    # NibbleIterator:6508 40168 init callback returned 1
    # NibbleIterator:6544 40168 Nibble: REPLACE INTO `percona`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT /*!99997*/ ?, ?, ?, ?, ?, ?, COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', CRC32(`description`), CRC32(`type`), CRC32(`sorder`), CRC32(`document`), CRC32(`other`), CRC32(`pages`), `aid`, CONCAT(ISNULL(`description`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `ftwill`.`5500Type` /*checksum table*/ params:
    # pt_table_checksum:12204 40168 EXPLAIN SELECT COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', CRC32(`description`), CRC32(`type`), CRC32(`sorder`), CRC32(`document`), CRC32(`other`), CRC32(`pages`), `aid`, CONCAT(ISNULL(`description`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `ftwill`.`5500Type` /*explain checksum table*/ params:
    # pt_table_checksum:12214 40168 EXPLAIN plan: $VAR1 = {
    # extra => '',
    # id => '1',
    # key => undef,
    # key_len => undef,
    # possible_keys => undef,
    # ref => undef,
    # rows => '242',
    # select_type => 'SIMPLE',
    # table => '5500Type',
    # type => 'ALL'
    # };
    #
    # Quoter:4123 40168 Serializing $VAR1 = [];
    #
    # Quoter:4123 40168 Serializing $VAR1 = [];
    #
    # Retry:7976 40168 Try 1 of 2
    # pt_table_checksum:11584 40168 REPLACE INTO `percona`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT /*!99997*/ ?, ?, ?, ?, ?, ?, COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', CRC32(`description`), CRC32(`type`), CRC32(`sorder`), CRC32(`document`), CRC32(`other`), CRC32(`pages`), `aid`, CONCAT(ISNULL(`description`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `ftwill`.`5500Type` /*checksum table*/ lower boundary: upper boundary:
    # pt_table_checksum:11605 40168 SHOW WARNINGS
    # Retry:7993 40168 Try code succeeded
    # pt_table_checksum:11106 40168 Nibble time: 0.00342392921447754
    # NibbleIterator:6552 40168 0 rows in nibble 1
    # NibbleIterator:6564 40168 No rows in nibble or nibble skipped
    # pt_table_checksum:11170 40168 Total avg rate: 70679
    # WeightedAvgRate:9500 40168 Master op time: 242 n / 0.00342392921447754 s
    # WeightedAvgRate:9512 40168 Initial avg rate: 70679.0312652322 n/s
    # WeightedAvgRate:9516 40168 Adjust n to 35339
    # ReplicaLagWaiter:8592 40168 Checking slave lag
    # MasterSlave:5435 40168 DBI::db=HASH(0x2f85210) SHOW SLAVE STATUS
    # ReplicaLagWaiter:8595 40168 [workingSlaveServer] slave lag: 0
    # MasterSlave:5435 40168 DBI::db=HASH(0x2f39dc0) SHOW SLAVE STATUS
    # ReplicaLagWaiter:8595 40168 [brokenSlaveServer] slave lag: 0
    # ReplicaLagWaiter:8628 40168 All slaves caught up
    # MySQLStatusWaiter:9389 40168 Checking status variables
    # pt_table_checksum:10675 40168 SHOW GLOBAL STATUS LIKE ? Threads_running
    # MySQLStatusWaiter:9392 40168 Threads_running = 2
    # MySQLStatusWaiter:9419 40168 All var vals are low enough
    # OobNibbleIterator:7175 40168 Done nibbling past boundaries
    # NibbleIterator:6573 40168 Done nibbling
    # pt_table_checksum:11229 40168 Checking slave diffs
    # pt_table_checksum:12298 40168 SELECT MAX(chunk) FROM `percona`.`checksums` WHERE db='ftwill' AND tbl='5500Type' AND master_crc IS NOT NULL
    # pt_table_checksum:12316 40168 Getting last checksum on [clusterNode1]
    # pt_table_checksum:12319 40168 [clusterNode1] max chunk: 1
    # pt_table_checksum:12316 40168 Getting last checksum on [workingSlaveServer]
    # pt_table_checksum:12319 40168 [workingSlaveServer] max chunk: 1
    # pt_table_checksum:12316 40168 Getting last checksum on [clusterNode3]
    # pt_table_checksum:12319 40168 [clusterNode3] max chunk: 1
    # pt_table_checksum:12316 40168 Getting last checksum on [brokenSlaveServer]
    # pt_table_checksum:12319 40168 [brokenSlaveServer] max chunk: 1
    # RowChecksum:6256 40168 SELECT CONCAT(db, '.', tbl) AS `table`, chunk, chunk_index, lower_boundary, upper_boundary, COALESCE(this_cnt-master_cnt, 0) AS cnt_diff, COALESCE(this_crc <> master_crc OR ISNULL(master_crc) <> ISNULL(this_crc), 0) AS crc_diff, this_cnt, master_cnt, this_crc, master_crc FROM `percona`.`checksums` WHERE (master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) AND (db='ftwill' AND tbl='5500Type')
    # pt_table_checksum:11261 40168 0 checksum diffs on [clusterNode1]
    # RowChecksum:6256 40168 SELECT CONCAT(db, '.', tbl) AS `table`, chunk, chunk_index, lower_boundary, upper_boundary, COALESCE(this_cnt-master_cnt, 0) AS cnt_diff, COALESCE(this_crc <> master_crc OR ISNULL(master_crc) <> ISNULL(this_crc), 0) AS crc_diff, this_cnt, master_cnt, this_crc, master_crc FROM `percona`.`checksums` WHERE (master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) AND (db='ftwill' AND tbl='5500Type')
    # pt_table_checksum:11261 40168 0 checksum diffs on [workingSlaveServer]
    # RowChecksum:6256 40168 SELECT CONCAT(db, '.', tbl) AS `table`, chunk, chunk_index, lower_boundary, upper_boundary, COALESCE(this_cnt-master_cnt, 0) AS cnt_diff, COALESCE(this_crc <> master_crc OR ISNULL(master_crc) <> ISNULL(this_crc), 0) AS crc_diff, this_cnt, master_cnt, this_crc, master_crc FROM `percona`.`checksums` WHERE (master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) AND (db='ftwill' AND tbl='5500Type')
    # pt_table_checksum:11261 40168 0 checksum diffs on [clusterNode3]
    # RowChecksum:6256 40168 SELECT CONCAT(db, '.', tbl) AS `table`, chunk, chunk_index, lower_boundary, upper_boundary, COALESCE(this_cnt-master_cnt, 0) AS cnt_diff, COALESCE(this_crc <> master_crc OR ISNULL(master_crc) <> ISNULL(this_crc), 0) AS crc_diff, this_cnt, master_cnt, this_crc, master_crc FROM `percona`.`checksums` WHERE (master_cnt <> this_cnt OR master_crc <> this_crc OR ISNULL(master_crc) <> ISNULL(this_crc)) AND (db='ftwill' AND tbl='5500Type')
    # pt_table_checksum:11261 40168 0 checksum diffs on [brokenSlaveServer]
    # OobNibbleIterator:7183 40168 Finish nibble_sth
    # OobNibbleIterator:7183 40168 Finish explain_nibble_sth
    # SchemaIterator:7680 40168 Getting next schema object from dbh DBI::db=HASH(0x2f2c9c0)
    # SchemaIterator:7751 40168 No more tables in database ftwill
    # SchemaIterator:7756 40168 No more databases
    # pt_table_checksum:11456 40168 Exit status 0 oktorun 1 have time 1
    # Cxn:3806 40168 Destroying cxn
    # Cxn:3815 40168 DBI::db=HASH(0x2f39dc0) Disconnecting dbh on [brokenSlaveServer] h=127.0.0.1,P=4306
    # Cxn:3806 40168 Destroying cxn
    # Cxn:3815 40168 DBI::db=HASH(0x2f3da10) Disconnecting dbh on [clusterNode3] h=10.*.*.*,P=3306
    # Cxn:3806 40168 Destroying cxn
    # Cxn:3815 40168 DBI::db=HASH(0x2f85210) Disconnecting dbh on [workingSlaveServer] h=10.*.*.*,P=3306
    # Cxn:3806 40168 Destroying cxn
    # Cxn:3815 40168 DBI::db=HASH(0x2f85258) Disconnecting dbh on [clusterNode1] h=10.*.*.*,P=3306
    # Cxn:3806 40168 Destroying cxn
    # Cxn:3815 40168 DBI::db=HASH(0x2f2c9c0) Disconnecting dbh on [clusterNode2] h=10.*.*.*,P=3306
  • carlos.salguerocarlos.salguero Percona Toolkit Developer Percona Staff Role
    Sadly, I cannot see any problem from that log. I've never tried using a MySQL connection from Perl, using and ssh tunnel. That might be the problem but I cannot tell you for sure.
    Could you share the data in the dsns table (stripping user/pass) ?
    Perl DBI should be able to connect over an ssh tunnel. The only thing I can think about at this moment is that in the dsns table, use 127.0.0.1 and not localhost to force a tcp connection.

    Regards
  • jhbremerjhbremer Entrant Current User Role Beginner
    Sure, screenshot of DSNs table attached. It is currently set to 127.0.0.1 for the tunnel connection.
  • carlos.salguerocarlos.salguero Percona Toolkit Developer Percona Staff Role
    If you are using an ssh tunnel, how can all the instances can be using the 3306 port?
    Shouldn't they be mapped to the tunneled port? I mean, remotely they are using the 3306 port but, locally, you has had to map the 3306 to a different local port, right?
  • jhbremerjhbremer Entrant Current User Role Beginner
    Hi - the 4 connections via 3306 from the DSN table are accessible directly without the need for a tunnel. Each server has its own IP as they are open to each other.

    The one connecting via 4306 is using the ssh tunnel. Local tunnel listening to 4306 and forwarding to the remote host at 3306.
Sign In or Register to comment.

MySQL, InnoDB, MariaDB and MongoDB are trademarks of their respective owners.
Copyright ©2005 - 2020 Percona LLC. All rights reserved.