When recovering from failed-node scenarios, is there a way to get back from a single node to multiple nodes without locking the cluster when the next node rejoins?
I have a 3-node cluster plus an arbitrator, and one node had to be taken out for maintenance. When it went to rejoin, it started syncing from one of the two remaining nodes, hung, and crashed, taking the donor node down with it. This has left me with a single node, and I'd like to be able to get back to multi-node without taking an outage.
I’m using xtrabackup and xbstream as my SST method, but I’ve noticed that when two nodes sync this way, the donor node is locked. Is there a way around this? Given enough nodes (even just two already in sync), adding a third still leaves one node active to serve requests. But what do you do when you’re down to one node and need to recover without locking the whole cluster for an hour while the data syncs?
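
For context, the SST settings involved would look roughly like the fragment below. This is only a sketch of the kind of configuration described above, not a copy of the real file: it assumes the `xtrabackup-v2` SST script and the `[sst]` section used by Percona XtraDB Cluster, and the exact option names and values may differ by version.

```ini
# Illustrative my.cnf fragment (assumed, not the actual config from this cluster)
[mysqld]
# Use the xtrabackup-based state snapshot transfer script
wsrep_sst_method = xtrabackup-v2

[sst]
# Stream the backup from donor to joiner with xbstream
streamfmt = xbstream
```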