pt-table-checksum connection errors after upgrade to 3.5.5

Hi,

After upgrading to Percona Toolkit 3.5.5, I’m getting connection errors from pt-table-checksum that do not occur with versions 3.3.1 or 3.4.0 (MySQL 5.7.42 on Debian 10).

Somewhere between creating and checking the checksum records, versions 3.5.4 and 3.5.5 seem to forget the connection information and try to connect as the system user (root@localhost) instead of using the connection data specified on the command line.

The call I’m using is basically this:

./pt-table-checksum.3.4.0 --recursion-method dsn=D=percona,t=dsns6 --replicate=percona.checksum_db01_02 --create-replicate-table --no-check-binlog-format --defaults-file=${CNFFILE} --tables=test_schema.user_data

That works perfectly with 3.3.1 and 3.4.0. With versions 3.5.4 or 3.5.5, the error is:

Error checksumming table test_schema.user_data: DBI connect(';;mysql_read_default_group=client','',…) failed: Access denied for user 'root'@'localhost' (using password: NO) at ./pt-table-checksum.3.5.4 line 1636.

In the master MySQL error.log I see:

2023-10-06T14:20:15.227046+02:00 668740 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. […]
2023-10-06T14:20:15.233841+02:00 668742 [Note] Access denied for user 'root'@'localhost' (using password: NO)
2023-10-06T14:20:15.235383+02:00 668743 [Note] Access denied for user 'root'@'localhost' (using password: NO)

I tried using both a DSN and the user/password/host parameters instead of the CNFFILE, but the error remains the same.

The checksum record is created and can be seen on both servers.

The error seems to occur when pt-table-checksum compares the checksum records: the results show “ERRORS 1” but “DIFFS 0”, although DIFFS should be 1 because I deliberately changed the table data on the slave.
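
Since the tool stops before reporting diffs, the checksum rows can still be compared by hand. The following is only a sketch of that comparison, not something taken from this thread: it uses an in-memory SQLite table named checksums as a stand-in for percona.checksum_db01_02, the columns master_cnt and master_crc are assumed from the documented replicate-table layout, and the sample rows are made up. The comparison rule itself follows the one given in the pt-table-checksum documentation, rewritten with IS NULL so that it runs on SQLite.

```python
import sqlite3

# Per-chunk comparison: a chunk differs if row counts differ, CRCs differ,
# or exactly one of the two CRCs is NULL. Table name "checksums" is a stand-in.
DIFF_QUERY = """
SELECT db, tbl, COUNT(*) AS chunks
FROM checksums
WHERE master_cnt <> this_cnt
   OR master_crc <> this_crc
   OR (master_crc IS NULL) <> (this_crc IS NULL)
GROUP BY db, tbl
ORDER BY db, tbl
"""


def find_diffs(rows):
    """Load checksum rows into an in-memory table and list tables with differing chunks."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE checksums (db TEXT, tbl TEXT, chunk INTEGER,"
        " this_cnt INTEGER, this_crc TEXT, master_cnt INTEGER, master_crc TEXT)"
    )
    con.executemany("INSERT INTO checksums VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    result = con.execute(DIFF_QUERY).fetchall()
    con.close()
    return result


# Made-up sample rows: user_data has a CRC mismatch, the second table matches.
sample = [
    ("test_schema", "user_data", 1, 5, "aaaa1111", 5, "bbbb2222"),
    ("test_schema", "other_table", 1, 3, "cccc3333", 3, "cccc3333"),
]
print(find_diffs(sample))
```

On a real setup the WHERE clause would be run on the replica against the actual replicate table; whether it works unchanged there is an assumption I have not tested.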

Excerpt of the output with PT_DEBUG=1:

# pt_table_checksum:11741 11062 REPLACE INTO percona.checksum_db01_02 (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT ?, ?, ?, ?, ?, ?, COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', id, convert(data using utf8mb4), convert(name using utf8mb4), CONCAT(ISNULL(data), ISNULL(name)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM test_schema.user_data /*checksum table*/ lower boundary: upper boundary:
# pt_table_checksum:11762 11062 SHOW WARNINGS
# pt_table_checksum:11768 11062 Ignoring warning: 1592 Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. REPLACE… SELECT is unsafe because the order in which rows are retrieved by the SELECT determines which (if any) rows are replaced. This order cannot be predicted and may differ on master and the slave.
# Retry:8142 11062 Try code succeeded
# pt_table_checksum:11224 11062 Nibble time: 0.000952959060668945
# NibbleIterator:6701 11062 0 rows in nibble 1
# NibbleIterator:6713 11062 No rows in nibble or nibble skipped
# pt_table_checksum:11288 11062 Total avg rate: 5246
# WeightedAvgRate:9570 11062 Master op time: 5 n / 0.000952959060668945 s
# WeightedAvgRate:9582 11062 Initial avg rate: 5246.8151113335 n/s
# WeightedAvgRate:9586 11062 Adjust n to 2623
# pt_table_checksum:11317 11062 Updated chunk size: 2623
# ReplicaLagWaiter:8765 11062 Checking slave lag
# MasterSlave:5509 11062 DBI::db=HASH(0x55b0a51ad748) SHOW SLAVE STATUS
# ReplicaLagWaiter:8782 11062 DB02 slave lag: 0
# ReplicaLagWaiter:8815 11062 All slaves caught up
# MySQLStatusWaiter:9459 11062 Checking status variables
# pt_table_checksum:10793 11062 SHOW GLOBAL STATUS LIKE ? Threads_running
# MySQLStatusWaiter:9462 11062 Threads_running = 2
# MySQLStatusWaiter:9489 11062 All var vals are low enough
# OobNibbleIterator:7324 11062 Done nibbling past boundaries
# NibbleIterator:6722 11062 Done nibbling
# pt_table_checksum:11348 11062 Checking slave diffs
# Cxn:3807 11062 DB01 SHOW VARIABLES LIKE 'wsrep_on'
# DSNParser:1585 11062 DBI:mysql:;;mysql_read_default_group=client
# DSNParser:1634 11062 DBI:mysql:;;mysql_read_default_group=client undef undef RaiseError=>1, PrintError=>0, ShowErrorStatement=>1, mysql_enable_utf8=>0, AutoCommit=>0
# DSNParser:1634 11062 DBI:mysql:;;mysql_read_default_group=client undef undef RaiseError=>1, PrintError=>0, ShowErrorStatement=>1, mysql_enable_utf8=>0, AutoCommit=>0
# OobNibbleIterator:7332 11062 Finish explain_nibble_sth
# OobNibbleIterator:7332 11062 Finish nibble_sth
10-06T14:59:44 Error checksumming table test_schema.user_data: DBI connect(';;mysql_read_default_group=client','',…) failed: Access denied for user 'root'@'localhost' (using password: NO) at ./pt-table-checksum.3.5.4 line 1636.
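
To make the DSNParser debug lines above easier to read, here is a small sketch that takes the connect string apart. It assumes the string follows the usual DBD::mysql layout (an optional database name followed by semicolon-separated key=value attributes); the function name parse_dbi_dsn and the list of attributes checked are my own choices for illustration.

```python
def parse_dbi_dsn(dsn):
    """Split a 'DBI:mysql:<db>;key=value;...' string into database name and attributes."""
    prefix = "DBI:mysql:"
    if not dsn.startswith(prefix):
        raise ValueError("not a DBI:mysql DSN: " + dsn)
    first, *rest = dsn[len(prefix):].split(";")
    attrs = {}
    for part in rest:
        if not part:
            continue  # empty segment, e.g. the ';;' seen in the debug output
        key, _, value = part.partition("=")
        attrs[key] = value
    return first, attrs


# The string logged by DSNParser right before the failed connect.
database, attrs = parse_dbi_dsn("DBI:mysql:;;mysql_read_default_group=client")

# DBD::mysql attributes that would normally carry the connection data.
expected = ("host", "port", "mysql_socket", "mysql_read_default_file")
missing = [name for name in expected if name not in attrs]

print("database:", repr(database))
print("attributes:", attrs)
print("missing:", missing)
```

The string carries no database, host, port, socket or defaults file, and the debug line shows undef for both user and password. With nothing supplied, the client library presumably falls back to its defaults, which would explain the attempt as root@localhost without a password.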

Hello @steffen,
This looks like a bug. Can you please open a report at https://jira.percona.com/ with your test case? Please also include a working test case from the earlier version of the toolkit, as that will help our developers narrow down where the issue might be.

Hi @matthewb, it looks like I’m not the first one with this issue: the Percona JIRA ticket “[PT-2250] pt-table-checksum reports error if recursion method is DSN” seems very similar to my problem. I added my notes to that ticket; I hope that helps.

Off topic, but perhaps something funny for the weekend: I asked ChatGPT about ways to speed up pt-table-checksum. One suggestion was:

  • Use a Replica:
  • If possible, run the checksum on a replica rather than the master to avoid impacting the master’s performance. Just be sure the replica is not lagging behind the master.

I never thought of that.

If you don’t want those unsafe statement warnings flooding your log files (it’s going to happen with pt-table-checksum) you can add:

log_error_suppression_list = "10908"

to your server configs. (As far as I know this variable only exists in MySQL 8.0.13 and later, so it won’t help on 5.7.)

@Matthew_Lenz Thanks for the hint.