We have an enterprise customer running a three-node XtraDB Cluster 5.6.34 installation which has now been in service for several months. We have recently run into a bug with a simple update.
The customer’s DB includes a table ‘user’ which has around 120 columns. The primary key is an auto_increment INT(11) field. There are no foreign keys, and no foreign key constraints against the table. The table contains a column ‘google_admin’ VARCHAR(50) NOT NULL. Executing the following command, whether the key is found or not, causes a certification failure, reported by the MySQL client as a deadlock but InnoDB reports no deadlocks:
mysql> update user set google_admin=‘’ where google_admin=‘example@host.org’;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
After some experiment and testing, we determined that if we bring the cluster down to a single node and execute this transaction, it succeeds (if the key is found) or fails (if it is not found) without error:
mysql> update user set google_admin=‘’ where google_admin=‘example@host.org’;Query OK, 1 row affected (4.08 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> update user set google_admin=‘’ where google_admin=‘example@host.org’;Query OK, 0 rows affected (3.04 sec)
Rows matched: 0 Changed: 0 Warnings: 0
As soon as we rejoin a second node to the cluster, this exact transaction against this exact table begins failing certification again. Again, the certification failure occurs regardless of whether or not any table rows are even matched (let alone changed).
Is anyone able to explain this behavior? We do have a workaround, but it is cumbersome, involving a BEGIN TRANSACTION; SELECT … FOR UPDATE; UPDATE … ; COMMIT sequence. However, the very fact that this cumbersome workaround does work seems to suggest that there is not in fact any certification conflict preventing the update. The certification failure is a false alarm.
What we would of course like to determine is why this is happening, and what we can do to prevent it.
Further information:
In the course of attempting to troubleshoot this problem, we attempted to clone the table. This, too, failed certification:
mysql> create table user2 like user;
Query OK, 0 rows affected (0.07 sec)
mysql> insert into user2 select * from user;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
I suspect, but have not yet verified, that somehow the three copies of the table on the three nodes have discrepancies in their actual contents. I have no theory as to how this may have happened.