Hi,
It is no secret that the whole cluster becomes frozen if any node runs out of disk space.
This behavior is described here: [PXC-1871] LP #1525300: Whole cluster freezes if one node goes full - Percona JIRA
There was also a mention that this behavior is expected, so the issue “will not be fixed”.
But why is this behavior “expected”? If one node fails, it means the transaction can be applied on all nodes except the problematic one. This should trigger CEV voting or something similar, IMO.
Can anyone advise whether there is a workaround for this situation, or whether a fix is planned? Thanks.
Hi @Oleksandr_Bezpiatov,
Just because a node is out of disk space does not mean it is a failed node. That node can still serve SELECT queries, and if data is removed, the freed space can be used for new writes. Since a node without free space is still a valid member for answering queries, it must ack any new writes. Since it cannot ack writes, the cluster cannot move forward and stalls, but the node can still respond to normal heartbeats, which tells the other nodes that it is OK and is a functioning member.
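When the stall is caused by a node that cannot keep up, the healthy nodes usually show it in the Galera flow-control counters. Here is a minimal sketch of how you might watch those counters, assuming Python with the pymysql driver; the host and monitoring credentials are placeholders:

```python
# Minimal sketch: poll Galera flow-control status to spot a stalled cluster.
# Assumes pymysql is installed and the monitoring user may run SHOW STATUS;
# host, user and password below are placeholders.
import pymysql

def wsrep_status(host="127.0.0.1", user="monitor", password="secret"):
    conn = pymysql.connect(host=host, user=user, password=password)
    try:
        with conn.cursor() as cur:
            cur.execute(
                "SHOW GLOBAL STATUS WHERE Variable_name IN "
                "('wsrep_flow_control_paused', 'wsrep_local_recv_queue')"
            )
            return dict(cur.fetchall())
    finally:
        conn.close()

if __name__ == "__main__":
    # wsrep_flow_control_paused near 1.0 means the cluster spends almost all
    # of its time paused, i.e. some node cannot keep up with the write load.
    print(wsrep_status())
```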
The best workaround is to be proactive and use PMM to monitor free disk space so you are alerted before the issue arises.
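If PMM alerting is not in place yet, even a tiny scheduled check can give that early warning. A rough illustration only (not a Percona recommendation; the data directory path and the 10% threshold are assumptions):

```python
# Minimal sketch: warn when the MySQL data directory filesystem is nearly full.
# DATADIR and MIN_FREE_RATIO are illustrative; hook the warning into whatever
# alerting you already use.
import shutil
import sys

DATADIR = "/var/lib/mysql"   # assumed data directory
MIN_FREE_RATIO = 0.10        # alert below 10% free space

usage = shutil.disk_usage(DATADIR)
free_ratio = usage.free / usage.total
if free_ratio < MIN_FREE_RATIO:
    print(f"WARNING: only {free_ratio:.1%} free on {DATADIR}", file=sys.stderr)
    sys.exit(1)
```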
Yep, we are monitoring free disk space proactively, but issues can still occur and disk space can drop dramatically at any time (that is our case).
IMO, PMM + alerting is not that closely related to high availability (which is what Percona XtraDB Cluster was made for), since high availability should follow the rule of minimizing manual actions taken by humans and maximizing automation.
Stalling a read-write cluster because one node is out of disk space is a typical case where the cluster should act proactively to detect and resolve the problematic node. This is still just my opinion, but on a live, highly loaded system such issues can occur even if we have dedicated monitoring SRE people (we do). In that case, to unblock the stalled cluster we need to shut down mysql on the problematic node manually, and this also takes time to happen.
What I did on this is create a cluster-wide DB and a single table. Every minute or two, a local script connects via localhost or 127.0.0.1 and writes to this table; the table has a timestamp and a node name field. The script times the transaction, and if the insert takes more than XXX seconds it force-kills the mysql process (yes, I am aware this is generally not a good thing to do, and it will more than likely require an SST to recover). I run this on all of my systems except the very last server in my “failover” chain (you can run it on all the systems, but if I am down to one node I would rather still have SOME reads work). This general process seems to work pretty well and helps in other cases where a cluster stall happens for some weird reason. It is one of my “last defense” items that is pretty drastic, but it works well as long as you understand what you are doing.
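In case it helps anyone, below is a rough Python sketch of that kind of heartbeat watchdog. It is not the actual script from the post above; the table, credentials, timeout and kill command are all assumptions you would have to adapt, and (as noted) killing mysqld this way will most likely force an SST:

```python
# Rough sketch of the heartbeat watchdog described above. Assumptions:
# pymysql driver, a pre-created table watchdog.heartbeat(node VARCHAR, ts TIMESTAMP),
# and permission to kill mysqld. Run it from cron every minute or two.
import socket
import subprocess
import time

import pymysql

TIMEOUT_SECONDS = 30  # illustrative threshold; tune for your workload

def heartbeat_ok():
    """Insert a heartbeat row via localhost and report whether it finished in time."""
    start = time.monotonic()
    try:
        conn = pymysql.connect(host="127.0.0.1", user="watchdog",
                               password="secret", database="watchdog",
                               connect_timeout=TIMEOUT_SECONDS,
                               read_timeout=TIMEOUT_SECONDS,
                               write_timeout=TIMEOUT_SECONDS)
        try:
            with conn.cursor() as cur:
                cur.execute(
                    "INSERT INTO heartbeat (node, ts) VALUES (%s, NOW())",
                    (socket.gethostname(),),
                )
            conn.commit()  # on a stalled cluster this hangs until read_timeout fires
        finally:
            conn.close()
    except Exception:
        return False
    return (time.monotonic() - start) <= TIMEOUT_SECONDS

if __name__ == "__main__":
    if not heartbeat_ok():
        # Drastic last-resort step from the post above: force-kill the local
        # mysqld so the rest of the cluster can move forward. Expect an SST later.
        subprocess.run(["pkill", "-9", "mysqld"])
```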