pt-archiver scrambling a blob column

We are currently trying to use pt-archiver to archive and trim old data from our MySQL 5.7 database on AWS RDS. The archiver works as expected except for one problem: it seems to mangle the binary data in one column. A normal mysqldump of the same row restores correctly.

The usual way to handle binary data is to use --hex-blob (with mysqldump) or hex() and unhex() on the field, but I see no way to do that with pt-archiver. To get around this limitation we're currently taking a dump with mysqldump and then using pt-archiver to trim the rows afterwards. This works, but it is much slower and more cumbersome.

Is there a way to preserve binary data archived by pt-archiver? Or a way to hex() that field during the archive process? Here is the command we are using:

pt-archiver --file /data/test.dump.$(date +%s).sql --where "_created < '2014-12-31 23:59:59' AND _created > '2014-01-01 00:00:00'" --source D=client_db,t=messages --charset 'utf8' --statistics --progress 10000

The resulting archive restores without complaint, but the binary data field comes back as mojibake.
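
For anyone wondering why hex encoding matters here, the following is a minimal Python sketch of the general problem. It does not reproduce what pt-archiver does internally (that is not established in this thread); it only shows that pushing raw bytes through a UTF-8 text layer can be lossy, while a hex round trip, in the spirit of HEX()/UNHEX() or --hex-blob, is not. The sample bytes and function names are made up for illustration.

```python
# Hypothetical binary payload; several of these bytes are not valid UTF-8.
blob = bytes([0x00, 0x89, 0x50, 0x4E, 0x47, 0xFF, 0xFE, 0x0A])


def roundtrip_as_text(data: bytes) -> bytes:
    """Treat raw bytes as UTF-8 text, then write them back out.

    Invalid sequences are replaced with U+FFFD, so the original
    bytes cannot be recovered.
    """
    text = data.decode("utf-8", errors="replace")
    return text.encode("utf-8")


def roundtrip_as_hex(data: bytes) -> bytes:
    """Hex-encode before storing as text, decode after reading.

    Every byte becomes two ASCII characters, so any text layer
    can carry it unchanged.
    """
    hex_text = data.hex().upper()
    return bytes.fromhex(hex_text)


text_result = roundtrip_as_text(blob)
hex_result = roundtrip_as_hex(blob)

print("text round trip preserved data:", text_result == blob)
print("hex round trip preserved data: ", hex_result == blob)
```

The trade-off is size: the hex form is twice as long as the original binary value.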

Hello William.valadez

Thank you for reporting this to our Forum. I have just checked in with the development team, and they would like to investigate this in greater detail. Would you be willing to create an entry in JIRA, our bug database, please? It will then automatically enrol you as a follower of progress.

You can read more about how to do that here:

It would be much appreciated!

Sure thing, will do. Thanks Lorraine.

Filed as PT-1750

Thanks, William, appreciated.

This could easily be worked around if pt-archiver supported virtual columns, but I ran a quick test and it apparently fails to parse the CREATE TABLE statement.