@Mulen The “self-healing” of Kubernetes pods comes from telling Kubernetes how many pods of a particular type it should keep running in a given cluster. You can also constrain where those pods run. So, in the case of a MySQL cluster, you can, for example, specify that Kubernetes must keep three DB nodes up and that each one must live on a different specified host. That way, you can (as we do) ensure that each node sits on a separate chassis, which means we can lose an entire chassis and still have MySQL Cluster running without interruption.
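To make that concrete, here’s a minimal sketch of the kind of spec involved. This is not our actual manifest; the names and label values are hypothetical. It’s a StatefulSet asking Kubernetes to keep three replicas up, with a hard pod anti-affinity rule so no two MySQL pods ever land on the same host:

```yaml
# Hypothetical sketch -- not our production manifest.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql-cluster          # made-up name
spec:
  serviceName: mysql-cluster
  replicas: 3                  # Kubernetes keeps three DB pods running
  selector:
    matchLabels:
      app: mysql-cluster
  template:
    metadata:
      labels:
        app: mysql-cluster
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never co-locate two MySQL pods on one host.
          # This is what lets us lose a whole chassis and keep running.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: mysql-cluster
              topologyKey: kubernetes.io/hostname
      containers:
        - name: mysql
          image: mysql/mysql-server:8.0
```

The “required” (as opposed to “preferred”) anti-affinity is the important design choice: the scheduler will leave a pod unscheduled rather than double up DB nodes on one chassis.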
If a specified host goes down, Kubernetes keeps trying to bring the missing MySQL node back up. Of course it fails as long as the host is down, because the pod is pinned to that host and MUST run there. But as soon as the host is up again, Kubernetes brings the MySQL pod back up, which “heals” the Kubernetes cluster. At that point, MySQL Cluster itself takes over the “healing” of the MySQL cluster.
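The pinning itself can be done with a nodeSelector (or a nodeAffinity rule); the hostname below is made up. While the pinned host is down, the pod simply sits in Pending, which is the “failing” phase I mentioned; the moment the node rejoins, the scheduler places it:

```yaml
# Hypothetical sketch: pin one DB pod to one specific host.
apiVersion: v1
kind: Pod
metadata:
  name: mysql-db-0
spec:
  nodeSelector:
    # Made-up hostname. With the host NotReady, this pod stays
    # Pending; when the host returns, Kubernetes schedules it there.
    kubernetes.io/hostname: chassis-a-node-1
  containers:
    - name: mysql
      image: mysql/mysql-server:8.0
```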
As soon as MySQL Cluster recognizes that the missing node has rejoined the group, it uses its “redo logs” (much like Oracle RAC) to replay the missed transactions onto the recovering node. The node is not fully “rejoined” until its dataset is in sync with the rest of the cluster, at which point it becomes a full member again.
You can have more than three nodes in a MySQL cluster, so you could, for example, lose two nodes out of five and keep running, because the three survivors still hold a majority (quorum). That’s overkill for our use-case, so we don’t bother. But it means that with our three nodes, if two are down, the cluster loses quorum and drops into read-only mode to guard against split-brain. Then you have to go through a manual process to recover the MySQL cluster (sketched below). It’s not odious, but it is manual. We only had such problems when we ran on bare metal, VMs, or Docker Swarm. Since moving to Kubernetes, we haven’t had to do that. Now we live in sighs of relief instead of curses and wasted hours!
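For anyone curious what that manual step looks like: assuming a Group Replication–based setup (which is what MySQL Router fronts; I’m describing the mechanism generically, not pasting our runbook), you first check what the surviving member can see, and then, if quorum really is lost, force a new membership by hand. The host:port value below is hypothetical:

```sql
-- Check which members the group can still see and their states.
SELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE
FROM performance_schema.replication_group_members;

-- If the majority is gone, this is the manual step: force the group
-- to continue with only the members you KNOW are alive. The value
-- here is a made-up host:port. Use with care -- forcing the wrong
-- member list is how you create a real split-brain.
SET GLOBAL group_replication_force_members = 'db-node-1:33061';
```

Not hard, but exactly the kind of thing you only want to do with coffee in hand and a backup confirmed.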
Your points about the performance difference between Router and HAProxy are irrelevant to our use-cases. Perhaps in some use-case a fine-grained performance difference might make “the difference,” but I’d have to dig into the details of how and why some particular metric would be the difference-maker, and for us it just doesn’t matter. We get fantastic performance out of our cluster! We do run only SSDs, and the cluster interconnect is 10Gb copper. But Router certainly isn’t any bottleneck.
Other degree audit systems that compete with ours either don’t even try to run live degree audits, or their performance is something like one audit per 30 seconds, or one per minute. By stark contrast, our degree audit instances run upwards of 30 audits per second. So we’re getting better than “adequate” performance out of our MySQL cluster, and we feel no motivation to “dig deeper” into comparisons between Router and HAProxy. Again, some other use-case might be different, but we find Router’s performance to be just fine. And we thereby have a vanilla MySQL cluster that does nothing exotic.
There is definitely a learning curve to this approach! But the end result is by FAR the most solid and highest-performance clustering approach (including for the DB layer) of the MANY approaches we have tried over the past two decades. Oracle RAC is the gold standard, if you can afford it. We have brought up PeopleSoft from bare metal on Oracle RAC, and it is impressive. But “for the rest of us” that can’t pass along such exorbitant costs to our customers, the approach we are now using has comparable reliability to Oracle RAC in an open-source context. And that’s impressive in its own right!
IMO, there just isn’t a motivation for the Percona fork anymore. We’ve been there, done that, got the blood/sweat-stained t-shirt, and have no interest in ever going back.