Deadlocks

Hi,
I have a cluster with 10 nodes balanced by HAProxy: 5 nodes for reads and 5 nodes for writes.

I noticed a lot of messages in the Apache log: Query failed by reason : Deadlock found when trying to get lock; try restarting transaction

[Thu Nov 05 16:12:36 2015] [error] [client xxx.xxx.xxx.xxx] Query failed by reason : Deadlock found when trying to get lock; try restarting transaction (UPDATE users SET keep_alive=NOW(), ip='xxx.xxx.xxx.xxx', now_playing_type=0, just_started=0, last_watchdog=NOW() WHERE (mac='00:00:00:00:00:00')), referer: [url]http://xxxx.xxx/c/index.html?referrer=file:///home/web/services.html[/url]

What I found:

If I disable all "write" nodes in HAProxy and leave just one server, everything is OK and there are no Deadlock messages in the Apache log.

If I enable just 2 "write" nodes in HAProxy, then immediately I can see Deadlock messages again.

I wanted to find LATEST DETECTED DEADLOCK in SHOW ENGINE INNODB STATUS,
but there is nothing about LATEST DETECTED DEADLOCK in the output.

Here too

mysql> SELECT VARIABLE_VALUE FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHERE VARIABLE_NAME = 'innodb_deadlocks';
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect…
Connection id: 117754
Current database: *** NONE ***

+----------------+
| VARIABLE_VALUE |
+----------------+
| 0              |
+----------------+
1 row in set (0.00 sec)

mysql> SHOW GLOBAL STATUS LIKE 'innodb_deadlocks';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| Innodb_deadlocks | 0     |
+------------------+-------+
1 row in set (0.00 sec)

All nodes run the same, latest version:

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 135138
Server version: 5.6.26-74.0-56-log Percona XtraDB Cluster (GPL), Release rel74.0, Revision 624ef81, WSREP version 25.12, wsrep_25.12

Copyright (c) 2009-2015 Percona LLC and/or its affiliates
Copyright (c) 2000, 2015, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Can you help me find the source of the problem?

Is it the cluster or HAProxy?

I’ll cut to the chase and share this link with you: [url]http://www.severalnines.com/blog/avoiding-deadlocks-galera-set-haproxy-single-node-writes-and-multi-node-reads[/url]. Basically, the reason behind what you’re seeing is Galera’s use of optimistic locking. The workaround is to send the queries that create deadlocks to a single node, while non-conflicting queries can still be distributed to the rest of the nodes.
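For reference, the usual shape of that setup is two HAProxy listeners: one for writes with a single active server (the others marked as backups, used only on failure), and one for reads balancing across all nodes. A rough sketch with placeholder hostnames and ports, not taken from the original poster’s configuration:

```haproxy
# writes: one active node; node2/node3 are only used if node1 fails
listen galera-writes
    bind *:3307
    mode tcp
    balance leastconn
    option tcpka
    server node1 10.0.0.1:3306 check
    server node2 10.0.0.2:3306 check backup
    server node3 10.0.0.3:3306 check backup

# reads: round-robin across all nodes
listen galera-reads
    bind *:3308
    mode tcp
    balance roundrobin
    option tcpka
    server node1 10.0.0.1:3306 check
    server node2 10.0.0.2:3306 check
    server node3 10.0.0.3:3306 check
```

The application then points its write connections at port 3307 and its read connections at port 3308.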

Hi Lelik,
Hi J.Rivera,

we investigated this issue in our cluster lab too.
To us, it seems to be a bug introduced in Percona XtraDB Cluster 5.6.26-25.12.1.
All prior versions work fine with no deadlocks. We upgraded to 5.6.26 two weeks ago and have seen those errors ever since.
After downgrading to 5.6.25-25.12.1 everything is fine again.
Upgrading again as a double check brought the errors right back.

The indicative figure is wsrep_local_cert_failures. This status variable counts up on each deadlock.
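To verify this on a node, the counter can be read like any other status variable (the value shown will of course vary):

```sql
-- counts transactions that failed cluster-wide certification;
-- each failure surfaces to the client as a deadlock error
SHOW GLOBAL STATUS LIKE 'wsrep_local_cert_failures';
```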

We strongly advise not to use 5.6.26.

Cheers

Chris

Ah, I thought I was going mad! I have the same problem, exactly as Lelik1985 describes. I cannot get any deadlock counters or information out of SHOW ENGINE INNODB STATUS, and deadlocks have hugely increased since upgrading to 5.6.26. Increasing wsrep_retry_autocommit even to large values makes no difference, because the failure happens at the local certification stage. Attached is a graph with the midpoint showing the time of the upgrade to 5.6.26 - edit: sorry, it won’t let me upload the graph…

[url]https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1516933[/url]
[url]https://www.percona.com/forums/questions-discussions/percona-xtradb-cluster/43228-deadlock-logging-not-working[/url]

The bug report I raised explains this well:
[url]https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1516933[/url]

Basically, there’s a difference between local and cluster deadlocks.
I do agree there’s a change in wsrep_local_cert_failures behaviour with 5.6.26; whether that’s an intended behaviour change in the underlying Galera or not, who knows…
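Assuming the usual Galera status variables, the two kinds of conflict show up in different counters, so a quick check on a node might look like:

```sql
-- local InnoDB lock-wait deadlocks on this node
SHOW GLOBAL STATUS LIKE 'Innodb_deadlocks';

-- cluster-level conflicts: certification failures and
-- brute-force aborts of local transactions by replicated ones
SHOW GLOBAL STATUS LIKE 'wsrep_local_cert_failures';
SHOW GLOBAL STATUS LIKE 'wsrep_local_bf_aborts';
```

If Innodb_deadlocks stays at zero while the wsrep counters climb, the "deadlocks" the application sees are cluster-level conflicts, not local lock waits.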

@jrivera, can you share how to configure HAProxy to send the updates that create deadlocks to one node only, while the rest of the transactions are still distributed to the other nodes? The article you posted sends all write traffic to a single node, effectively turning the cluster into one master and two slaves (we have a 3-node cluster with round robin and HAProxy).