Not able to desync and flush tables with read lock

Hi,

I have a cluster of 3 nodes, and when I try to desync a node and flush tables with read lock in order to take a backup of that node for a slave, I get a few errors.

When I run the following commands in the command line on one of the nodes:

set global wsrep_desync=ON; flush tables with read lock;

I get the following error from the command line:

ERROR 1213 (40001): wsrep desync failed for FTWRL

In my error log, I see these messages:

2017-03-09 16:33:57 18704 [Note] WSREP: Member 0.0 (loaddb-56-node1) desyncs itself from group
2017-03-09 16:33:57 18704 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 11504643)
2017-03-09 16:34:08 18704 [Warning] WSREP: Member 0.0 (loaddb-56-node1) requested state transfer from 'self-desync', but it is impossible to select State Transfer donor: Resource temporarily unavailable
2017-03-09 16:34:08 18704 [ERROR] WSREP: Node desync failed.: 11 (Resource temporarily unavailable) at galera/src/replicator_smm.cpp:desync():1680
2017-03-09 16:34:08 18704 [Warning] WSREP: FTWRL desync failed 3 for schema: (null), query: flush tables with read lock

Also, when I run the following commands after running the commands listed above:

unlock tables; set global wsrep_desync=OFF;

the error log shows the node shifting from Donor/Desynced to Synced, but when I run "show status like 'wsrep%'", it shows that the node is still Donor/Desynced.

2017-03-09 16:36:30 18704 [Note] WSREP: Member 0.0 (loaddb-56-node1) resyncs itself to group
2017-03-09 16:36:30 18704 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 11504643)
2017-03-09 16:36:30 18704 [Note] WSREP: Member 0.0 (loaddb-56-node1) synced with group.

The nodes in the cluster run Ubuntu 14.04.4 LTS and Percona XtraDB Cluster version 5.6.32-25.17-1.trusty.

Lastly, in order to sync the node back into the cluster, I need to force kill the mysql process and then let it SST.

Any help/feedback is appreciated.

Best,

Hi,

First of all, in the latest PXC versions, FTWRL implicitly turns on desync mode, so there is no need to set desync mode explicitly any more.
Secondly, I can see that apparently this buggy behavior is back:

Maybe you've hit some side effect of this problem. Anyway, running this sequence:
flush tables with read lock; set global wsrep_desync=1;
puts a node in permanent Donor/Desynced mode, where only killing mysql can unblock it.
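
For reference, here is a minimal sketch of the backup sequence on a PXC version where FTWRL desyncs the node implicitly, as described in the first point above. It assumes all statements run in one client session that stays open while the backup is taken. Checking wsrep_local_state_comment is only a suggested way to confirm the node state; the expected values in the comments have not been verified against 5.6.32.

```sql
-- Take the global read lock. On recent PXC versions this also desyncs the node,
-- so wsrep_desync is not set by hand.
FLUSH TABLES WITH READ LOCK;

-- Optional check: the node is expected to report Donor/Desynced while the lock is held.
SHOW STATUS LIKE 'wsrep_local_state_comment';

-- ... take the backup from another terminal while this session stays open ...

-- Release the lock; the node should then resync on its own.
UNLOCK TABLES;

-- Confirm the node reports Synced again before relying on it.
SHOW STATUS LIKE 'wsrep_local_state_comment';
```

Leaving out the explicit "set global wsrep_desync" avoids the FTWRL-then-desync combination that leaves the node stuck in Donor/Desynced.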