I’ve gone through the Galera options here: http://www.codership.com/wiki/doku.php?id=galera_parameters_0.8
But I was looking for a bit more detail on the timeout options suspect_timeout, inactive_timeout, and consensus_timeout, and how they interact.
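For context, here is a minimal sketch of how I understand these three options would be passed to the provider. The `evs.` prefixes, the semicolon-separated `wsrep_provider_options` string, and the `PT…S` duration values are my assumptions for illustration, not recommended settings, and `provider_options` is just a hypothetical helper:

```python
# Hypothetical helper: formats provider options for a my.cnf line.
# The evs.* option names and PT..S duration values below are assumed
# for illustration; check them against the parameters page for your version.
def provider_options(timeouts):
    """Join option=value pairs with '; ' into a single options string."""
    return "; ".join(f"{name}={value}" for name, value in timeouts.items())


timeouts = {
    "evs.suspect_timeout": "PT5S",     # when a peer starts being suspected
    "evs.inactive_timeout": "PT15S",   # hard limit on peer inactivity
    "evs.consensus_timeout": "PT30S",  # limit on reaching membership consensus
}

line = f'wsrep_provider_options="{provider_options(timeouts)}"'
print(line)
```

The values are ordered so that suspect_timeout is the shortest and consensus_timeout the longest, which is the relationship my questions below assume.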
Specifically, let’s say we have a three-node cluster, and one of the nodes (node A) starts experiencing increased latency. If node B doesn’t hear from node A for longer than suspect_timeout, it tries to remove node A from the cluster. My questions have to do with what happens if node C doesn’t agree that node A is unavailable.
First Question: If node B still hasn’t heard from node A once inactive_timeout has elapsed, what happens? Does node A get removed from the cluster regardless of what node C believes?
Second Question: If my interpretation of inactive_timeout above is correct, then what does consensus_timeout do? If one node can unilaterally decide on cluster membership once inactive_timeout is reached, why would there be a consensus issue?
Third Question: consensus_timeout probably has a purpose that I’m not seeing. If so, what happens when consensus_timeout is reached and the nodes can’t come to a consensus? Does node A get removed anyway, does it stay in the cluster, or does the whole cluster just lock up? (I’ve seen some cases where the cluster freezes up and every node goes into the Initialized state, and I’m wondering if this is the reason.)
I know there are a bunch of questions above, but if you have an answer to any of them individually, it would be much appreciated!