We have a fairly large cluster that went down due to hardware problems, and we were able to recover only one node, which is currently running our production apps.
A few attempts to start the other two nodes each resulted in a full SST (using XtraBackup). At the end of that process XtraBackup simply hung, and connections on the master node kept creeping up. I assume that is because the app clients were trying to write to the master node, and some of the statements included DDL modifications that created a lock.
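To check that assumption on the next attempt, something like the rough Python sketch below could count how many sessions are stuck in lock-related states while the SST is running. It assumes the rows come from `SHOW FULL PROCESSLIST` fetched as dictionaries with a `State` key. The function name `count_lock_waits` and the sample rows are hypothetical and are not output from our cluster.

```python
from collections import Counter

# MySQL thread states that usually point to lock contention.
LOCK_STATES = (
    "Waiting for table metadata lock",
    "Waiting for global read lock",
    "Waiting for table flush",
)


def count_lock_waits(rows):
    """Count sessions per lock-related thread state.

    `rows` is an iterable of dicts with a "State" key, e.g. rows from
    SHOW FULL PROCESSLIST fetched with a dictionary cursor (assumption).
    """
    counts = Counter()
    for row in rows:
        state = row.get("State") or ""  # State can be NULL for idle threads
        if state.startswith(LOCK_STATES):
            counts[state] += 1
    return counts


# Hypothetical sample rows for illustration only.
sample = [
    {"Id": 11, "State": "Waiting for table metadata lock", "Info": "ALTER TABLE t1 ..."},
    {"Id": 12, "State": "Waiting for table metadata lock", "Info": "INSERT INTO t1 ..."},
    {"Id": 13, "State": "Waiting for global read lock", "Info": "UPDATE t2 ..."},
    {"Id": 14, "State": None, "Info": None},
    {"Id": 15, "State": "Sending data", "Info": "SELECT ..."},
]

for state, n in count_lock_waits(sample).items():
    print(f"{state}: {n}")
```

If most of the growing connections showed up under one of these states during the final stage of the backup, that would support the lock theory. If not, the cause is probably something else.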
We need advice on the least painful method of rebuilding the two remaining nodes without locking the master node.
Appreciate your time,