Percona XtraDB cluster: leanest possible configuration


I am trying to set up the leanest possible configuration of Percona XtraDB Cluster. I am planning to run multiple clusters of 2 real nodes each. According to the Percona documentation, it is strongly advised to have an odd number of nodes in each cluster to avoid split brain, so I will need to add garbd to each cluster. My question is: is it possible to run garbd on one of the two real nodes of the cluster? For example, a cluster has two real nodes (two servers): node1 and node2. What if I run garbd on node1? If node1 and node2 lose network connectivity between each other, node1 (which also runs garbd) will have the quorum and continue service, whereas node2 will have its service stopped… Am I missing something here?

If this is still not advised, what if I run multiple garbd daemons for my multiple clusters on one dedicated server? Is it possible to run multiple garbds on one server? Do I understand correctly that in this case the garbds will need to use different ports? Do I understand correctly that the bandwidth requirement for this server will have to equal the sum of the bandwidths of all clusters for which it runs garbds?

Thank you very much in advance!

Instead of running garbd on data nodes, I think the weighted quorum feature is what you are looking for.

Running multiple garbds on a single server using different TCP ports should just work, but indeed that server will need enough network bandwidth to handle the combined traffic from all clusters.
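
To illustrate the "different TCP ports" part, here is a minimal Python sketch that just builds one garbd command line per cluster, each with its own listen port. The cluster names, IP addresses, and the base port 4570 are made up for the example, and `garbd_command` is a hypothetical helper. The flags (`--address`, `--group`, `--options`, `--daemon`) and the `gmcast.listen_addr` option are as I remember them from garbd, so please check them against `garbd --help` for your version before relying on this.

```python
import shlex


def garbd_command(group, nodes, listen_port):
    """Build a garbd command line for one cluster (hypothetical helper).

    group       -- the cluster name (wsrep_cluster_name)
    nodes       -- list of "host:port" strings for the real nodes
    listen_port -- local TCP port this garbd instance should listen on
    """
    address = "gcomm://" + ",".join(nodes)
    # Each garbd on the same server gets its own listen address.
    options = f"gmcast.listen_addr=tcp://0.0.0.0:{listen_port}"
    args = [
        "garbd",
        "--address", address,
        "--group", group,
        "--options", options,
        "--daemon",
    ]
    return " ".join(shlex.quote(a) for a in args)


# Made-up clusters, two real nodes each.
clusters = {
    "cluster_a": ["10.0.1.1:4567", "10.0.1.2:4567"],
    "cluster_b": ["10.0.2.1:4567", "10.0.2.2:4567"],
}

BASE_PORT = 4570  # assumption: any free ports on the arbitrator host will do

commands = [
    garbd_command(name, nodes, BASE_PORT + i)
    for i, (name, nodes) in enumerate(clusters.items())
]

for command in commands:
    print(command)
```

The only point is that every garbd instance on the shared server gets a distinct listen port; the rest of the command is the same as for a single arbitrator.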

przemek, thank you so much for the fast response! I did not know about the weighted quorum option.
Can you please explain to me why garbd is still a recommended option if it is possible to use weighted quorum to prevent split brain?

One scenario is a broken connection between two datacenters that each hold an equal number of nodes. If you always know that DC A is more important than DC B, and so never want B to continue working when A is down, then just assign more weight to A. But what if the connection between the DCs is lost and only one of them still has a connection to the world? You would probably want that one to continue working. In this case, an independent garbd node in a 3rd location will help maintain quorum in the right DC. But remember to make sure the garbd node has sufficient bandwidth and a reliable link. Makes sense?
Also, you may end up with a split brain between two server racks; garbd in a 3rd rack will help :)
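
To make the arithmetic concrete, here is a rough Python sketch of the quorum rule as I understand it from the Galera documentation: a component stays primary if its total weight is strictly more than half of the weight of the last primary component (minus any nodes that left gracefully). The function `has_quorum` and the node names are just for illustration, not anything from the Galera code itself, and the weights stand in for `pc.weight`.

```python
def has_quorum(component, last_primary, weights, left_gracefully=()):
    """Return True if `component` keeps quorum after a partition.

    component       -- nodes that can still see each other
    last_primary    -- members of the last primary component
    weights         -- mapping node name -> weight (stand-in for pc.weight)
    left_gracefully -- nodes known to have left the cluster cleanly
    """
    total = sum(weights[n] for n in last_primary)
    total -= sum(weights[n] for n in left_gracefully)
    present = sum(weights[n] for n in component)
    # Strictly more than half is required; an exact tie means no quorum.
    return present > total / 2


# Two nodes, equal weight: after a split neither side has quorum.
equal = {"node1": 1, "node2": 1}
print(has_quorum({"node1"}, equal, equal))  # False
print(has_quorum({"node2"}, equal, equal))  # False

# Weighted quorum: node1 always wins, node2 never does.
weighted = {"node1": 2, "node2": 1}
print(has_quorum({"node1"}, weighted, weighted))  # True
print(has_quorum({"node2"}, weighted, weighted))  # False

# garbd in a 3rd location: whichever node still reaches garbd wins.
with_garbd = {"node1": 1, "node2": 1, "garbd": 1}
print(has_quorum({"node1", "garbd"}, with_garbd, with_garbd))  # True
print(has_quorum({"node2", "garbd"}, with_garbd, with_garbd))  # True
print(has_quorum({"node2"}, with_garbd, with_garbd))           # False
```

The difference shows in the last group: with static weights the winner is fixed in advance, while with an independent garbd the winner is whichever side can still reach the arbitrator.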

I see, thanks. I am planning to have additional monitoring on top of Percona XtraDB Cluster anyway, so I should be able to detect the situation when one of the nodes has lost its connection to the outer world.

Do I understand correctly that if I end up with only one node connected to the world and accepting queries (even if it is the node with the lower quorum weight), there will be no split-brain risk anyway and the two nodes will continue replicating? Or will the disconnected node with the higher weight become a replication donor and prevent the connected node with the lower weight from accepting queries from the world?