Deadlocks hang MySQL Cluster Node

Hi

I have a 3-node master-master-master cluster running percona-xtradb-cluster-full-57. We have run the cluster for many months while building a platform. We have run many performance tests and have had great results. (We only target 1 node of the cluster.)

As is always the case, just before go-live you hit problems. I have deployed my last Java service, a simple exporter that grabs records waiting for export and sends them to an XML endpoint. The Java service has issues, and when we run two instances of it we see deadlocks. We are working to fix the Java app.

The issue that I would like help with is that after the deadlocks happen (maybe 5 or 10 reported), MySQL seems to hang on the node that we are targeting. MySQL still accepts new connections, but the queries just hang. This affects every query, even ones like show global status and use database.

When MySQL is in this state it can't be stopped, and my only option is to kill -9 the mysqld pid. Once restarted, it rejoins the cluster without issue. The exporter is only working against a 9000-record set and the server is not under any load.

I don’t know if this is a MySQL problem or related to the fact that this is a node in the cluster.

No errors are reported in the MySQL error log, and the slow query log is also no help.

Has anyone else seen this behaviour?

Regards

David

Hi David,

It depends on what really happened. I wonder if this new Java service is using any of the features that are unsupported or unsafe in PXC/Galera, like table locks, tables without a primary key, etc.
Do you have pxc_strict_mode set to ENFORCING? You can find details on this setting, and on what is considered unsafe, here: https://www.percona.com/doc/percona-...rict-mode.html
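
If it helps, something along these lines should show the current strict mode setting and list any tables that lack a primary key. Treat it as a rough sketch: the information_schema query is a generic MySQL idiom rather than anything PXC-specific, and the list of excluded system schemas is my assumption, so adjust it to your setup.

```sql
-- Show the current strict mode setting on this node
SHOW GLOBAL VARIABLES LIKE 'pxc_strict_mode';

-- List base tables that have no PRIMARY KEY constraint
-- (the excluded system schemas are an assumption; adjust as needed)
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'information_schema',
                             'performance_schema', 'sys')
  AND c.constraint_name IS NULL;
```

Run it on a healthy node, since queries hang on the stuck one.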

Did you manage to get more details while this node was stuck? For example, the output of commands such as:

show processlist;
show engine innodb status\G

What PXC version are you using? Can you post your configuration file?