Invalid INSERT statement syntax in binlog

I’m testing out some new hardware with the latest and greatest software versions prior to migrating a current Percona 5.1 environment across to it. However, I’m running into some real weirdness.

I have a master/slave replication setup, both on identical hardware, and I’m using the sysbench utility’s OLTP benchmark to test things out. I’m running the benchmark on the master for 5 - 10 minutes at a time and watching the slave to see how far out of sync it goes and how fast it catches up.

A few minutes into the test, the slave’s SQL thread dies (Slave_SQL_Running = No) and the Last_SQL_Error shows a clearly invalid SQL statement as follows:

Somehow, a colon has snuck into the primary key value (4:389).

So, thinking there was some corruption in the transfer to the slave, I login to the master, dump the binlog involved, and lo an behold the same invalid syntax is there:

$ mysqlbinlog us-db01-bin.000030|grep INSERT |grep 4:389

INSERT INTO sbtest values(4:389,0,’ ',‘aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy’)

OK, this has got me confused. How could an statement with invalid syntax get into the binlog? Just to be sure, I tried out the statement on the master, and as expected it failed with the same syntax error it did on the slave.

Searching through the same binlog there seem to be three or four more similar offending statements further down amongst the tens of thousands of valid ones so even if I skip this statement, it’s going to fail again very soon.

Ideas anyone?

Environment is:

  • CentOS 6.3 x86_64
  • Both master and slave running in KVM VMs on our own hardware
  • Percona 5.5.25a-rel27.1.277.rhel6 (from the Percona yum repository) as follows:
    Percona-Server-client-55.x86_64 5.5.25a-rel27.1.277.rhel6 @percona
    Percona-Server-server-55.x86_64 5.5.25a-rel27.1.277.rhel6 @percona
    Percona-Server-shared-55.x86_64 5.5.25a-rel27.1.277.rhel6 @percona
    Percona-Server-shared-compat.x86_64 5.5.25a-rel27.1.277.rhel6 @percona
  • SSD for all drives

Thanks,
Toby.

[Edit for more info]
I should probably also say that the table involved (sysbench is using just one) is on InnoDB. Following is the my.cnf - it’s identical on both machines except for the server-id:

port=3306
socket=/var/lib/mysql/mysql.sock

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql

We are on BBU RAID so this should provide high reliability with little

performance degredation

sync_binlog=1

Disabling symbolic-links is recommended to prevent assorted security risks

symbolic-links=0

max_connections=250

default-time-zone=UTC

sql_mode=“NO_AUTO_VALUE_ON_ZERO”

default-storage-engine=innodb
character-set-server=utf8
collation-server=utf8_bin

long_query_time=10
slow_query_log=true
slow_query_log_file=/var/log/mysql/log-slow-queries.log

max_allowed_packet=64M

skip-host-cache
skip-name-resolve

log-bin = /data1/mysql/logs/us-db01-bin
expire_logs_days = 30
binlog_format = MIXED

server-id = 11

innodb_data_home_dir = /var/lib/mysql/
innodb_log_group_home_dir = /data1/mysql/logs
innodb_file_per_table

innodb_buffer_pool_size = 8000M
innodb_additional_mem_pool_size = 20M
innodb_log_file_size = 150M
innodb_log_buffer_size = 32M
innodb_flush_log_at_trx_commit = 1
innodb_flush_method=O_DIRECT
innodb_lock_wait_timeout = 300

innodb_fast_shutdown = 0

max_heap_table_size = 500M

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

Update:

I’ve looked at this for a while now, and it seems to be a Percona-specific bug. I uninstalled Percona 5.5.25a-27.1 and installed MySQL 5.5.27 from the dev.mysql.com RPMs, re-ran the same test and it works just fine. I re-installed Percona 5.5.25a-27.1 from the Percona yum repository, re-ran and hit the exact same issue again. Between reinstalls, I completely blew away the contents of /var/lib/mysql and re-created the mysql and benchmark databases from scratch using each time using the newly installed MySQL version. Same my.cnf was used with MySQL and Percona.

My next step is to walk back the older Percona release and see if I can see where this begins.

Interested in any other thoughts on how to debug this in the meantime…

Toby.

OK. No one’s biting on this, so here’s a quick update. It seems that this issue is specific to the Percona binaries provided for RHEL6 in the Percona Yum repository.

If I run either the Percona 5.5 or 5.1 release from the repos, I get the issue. If I run the generic Linux x86_64 RPMs of the identical releases, then no problems. Just for fun, I also tried the MySQL RPMs and no problems there either. All tests are with identical database setups and identical my.cnf.

So in summary, the issue occurs on:

Percona-Server-server-55-5.5.25a-rel27.1.277.rhel6.x86_64 (from yum)
Percona-Server-server-51-5.1.63-rel13.4.443.rhel6.x86_64 (from yum)

The issue doesn’t occur on:

Percona-Server-5.5.25a-rel27.1-277.Linux.x86_64 (from rpm)
Percona-Server-5.5.11-rel20.2-116.Linux.x86_64 (from rpm)
Percona-Server-5.1.63-rel13.4-443.Linux.x86_64 (from rpm)

Could this be something to do with the linkages in the rhel6 packages?

The issue is really simple to repro. Just create a new database, install sysbench, run the OLTP benchmark and either dump the binary log afterwards looking for the above invalid syntax, or even easier, setup a replication slave and watch the slave fail when it hits the bad statement.

Here’s the sysbench config I’m running:

sysbench --test=oltp --db-driver=mysql --num-threads=64 --max-requests=1000000000 --max-time=600 --validate=off --oltp-test-mode=complex --oltp-read-only=off
–oltp-table-name=sbtest --oltp-table-size=100000 --oltp-dist-type=special --mysql-host=localhost --mysql-user=root --mysql-password=password --mysql-table-engine=innodb --mysql-db=sysbench run

Toby.