percona-toolkit 2.2.13: damaged data on master after pt-table-sync

Hi!

OS: Ubuntu 12.04.5 LTS \n \l
DB server: mariadb-server-5.5 5.5.42+maria-1~precise
Percona-toolkit: 2.2.13

We have two mariadb servers (also tried mysql-server-5.5 ( 5.5.41-0ubuntu0.12.04.1 )) with master->slave replication. Recently the replication has got broken so I decided to use pt-table-sync to fix it.

First I used pt-table-checksum to identifiy which tables are out of sync:

pt-table-checksum --chunk-size 500000 --empty-replicate-table --replicate=tracker.checksum -d tracker --tables tracker.chatlogs h=mysql-master,u=root,p=passw0rd

And then I used pt-table-sync:

pt-table-sync --execute --replicate tracker.checksum -d tracker --tables tracker.chatlogs h=mysql-master.site,u=root,p=passw0rd

Everything ran without any problems except for the fact that some data that had cyrillic symbols got damaged on master server. I saw ‘?’ signs instead of normal cyrillic letters on both master and slave. That means that pt-table-sync somehow managed to damage the data on master DB.

MariaDB [tracker]> describe chatlogs;
±------------±----------------------±-----±----±--------±---------------+
| Field | Type | Null | Key | Default | Extra |
±------------±----------------------±-----±----±--------±---------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| user_id | mediumint(8) | NO | MUL | NULL | |
| username | varchar(20) | NO | MUL | | |
| username_to | varchar(20) | NO | MUL | | |
| IP | varchar(8) | NO | | | |
| time | int(11) | NO | MUL | NULL | |
| message | varchar(1000) | NO | | NULL | |
| blocked | varchar(20) | NO | MUL | | |
±------------±----------------------±-----±----±--------±---------------+
8 rows in set (0.00 sec)

MariaDB [tracker]> show create table chatlogs\G
*************************** 1. row ***************************
Table: chatlogs
Create Table: CREATE TABLE chatlogs (
id mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
user_id mediumint(8) NOT NULL,
username varchar(20) NOT NULL DEFAULT ‘’,
username_to varchar(20) NOT NULL DEFAULT ‘’,
IP varchar(8) NOT NULL DEFAULT ‘’,
time int(11) NOT NULL,
message varchar(1000) NOT NULL,
blocked varchar(20) NOT NULL DEFAULT ‘’,
PRIMARY KEY (id),
KEY user_id (user_id) USING BTREE,
KEY time (time) USING BTREE,
KEY username_to_name (username_to,username) USING BTREE,
KEY id_blocked (id,blocked) USING BTREE,
KEY username_to (username_to) USING BTREE,
KEY username (username) USING BTREE,
KEY blocked (blocked) USING BTREE,
KEY username_id (username,id) USING BTREE,
KEY id_username (id,username) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=156476 DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT
1 row in set (0.00 sec)

Before pt-table-sync (on master):

MariaDB [tracker]> SELECT id, FROM_UNIXTIME(time) FROM chatlogs WHERE message LIKE ‘%? ?%’ OR blocked LIKE ‘%? ?%’ ORDER BY time DESC;
Empty set (0.17 sec)

After pt-table-sync (on master):

MariaDB [tracker]> SELECT id, FROM_UNIXTIME(time) FROM chatlogs WHERE message LIKE ‘%? ?%’ OR blocked LIKE ‘%? ?%’ ORDER BY time DESC;
±-------±--------------------+
| id | FROM_UNIXTIME(time) |
±-------±--------------------+
| 156475 | 2015-02-23 12:48:15 |
| 156474 | 2015-02-23 12:47:47 |
| 156472 | 2015-02-23 10:55:57 |
| 156471 | 2015-02-23 08:04:15 |
| 156469 | 2015-02-23 07:34:57 |
| 156468 | 2015-02-23 07:34:29 |
| 156467 | 2015-02-23 05:57:14 |
| 156466 | 2015-02-23 05:51:29 |
| 156465 | 2015-02-23 05:45:44 |
| 156464 | 2015-02-23 05:43:37 |
| 156463 | 2015-02-23 05:21:37 |
| 156462 | 2015-02-23 05:15:51 |
| 156461 | 2015-02-23 05:15:30 |
| 156460 | 2015-02-22 21:18:18 |
| 156459 | 2015-02-22 21:16:30 |
±-------±--------------------+
15 rows in set (0.17 sec)

I can provide dumps for this table for both master and slave before sync so that it would be possible to reproduce the issue.

Could anybody please tell if I’m doing anything wrong in order to sync slave db with master and why this sort of thing could happen?

Thanks in advance.


Best regards,
Sergey Arlashin

I think I have the same issue: http://www.percona.com/forums/questions-discussions/percona-toolkit/33543-percorna-toolkit-2-2-13-%E2%80%8Bpt-online-schema-change-bus-error-core-dumped

were you able to solve it?

thanks

cirpo

A bug has been filed since Percona Toolkit version 2.2.12, https://bugs.launchpad.net/percona-toolkit/+bug/1400472

pt-table-sync fails to check table character set before syncing. Best practice is to use the --print option as a safety check to see if the queries that will be run on the master are correct.

Workaround is use --charset=utf8 option.

Thank you! --charset=utf8 solved the issue.
But I’m a bit surprised that a bug like this can cause data loss on master. Thanks god there was no financial data in that database.


Best regards,
Sergey Arlashin