Cluster remains in Primary state after 2 of 4 nodes are stopped

Hi there,

I am running a Percona XtraDB Cluster with 4 nodes in my lab environment.

After stopping 2 nodes, I still see that the cluster status is reported as Primary.

My expectation was that with only 2 nodes left (and therefore no quorum), the cluster should switch to Non-Primary. Instead, it remains in Primary state.

Could someone please clarify why this happens?

Is my understanding of quorum incorrect in this scenario? Shouldn’t 2 out of 4 nodes mean the cluster loses quorum?

Are there configuration settings that affect this behavior?

Thank you in advance for your help and insights.

An even number of nodes is never recommended, even in lab environments.

It should. Check the wsrep_cluster_size status variable. Then look at wsrep_provider_options and check that no server has an increased weight, since additional weight adds to the quorum count.


Hi @matthewb , thank you for your response.

To be honest, I have been instructed by my boss to add the fourth node to the test cluster :open_mouth:

I would expect the cluster status to be Non-Primary with only two of the four nodes running, but it isn’t. There also do not seem to be any increased weights affecting the quorum count.

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

mysql> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 2     |
+--------------------+-------+
1 row in set (0.00 sec)

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

mysql> SHOW STATUS LIKE 'wsrep_cluster_status';
+----------------------+---------+
| Variable_name        | Value   |
+----------------------+---------+
| wsrep_cluster_status | Primary |
+----------------------+---------+
1 row in set (0.00 sec)

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

mysql> SHOW STATUS LIKE 'wsrep_provider_options';
Empty set (0.00 sec)

Is there any other way to explain or troubleshoot this behavior?

wsrep_provider_options is a variable, not a status counter, so query it with SHOW GLOBAL VARIABLES instead of SHOW STATUS. Look for pc.weight within its long list of parameters. The value should be 1 on all nodes.
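Because the wsrep_provider_options value is a single long string of `name = value` pairs separated by semicolons, it can be easier to split it up than to read it by eye. The following is a minimal sketch of that idea; the function name and the shortened sample string are hypothetical, and a real value contains many more options.

```python
def parse_provider_options(raw: str) -> dict:
    """Split a wsrep_provider_options string into a {name: value} dict."""
    options = {}
    for item in raw.split(";"):
        if "=" not in item:
            continue  # skip empty fragments, e.g. after the trailing ';'
        name, _, value = item.partition("=")
        options[name.strip()] = value.strip()
    return options


# Hypothetical, shortened sample value (assumption, not taken from this cluster).
sample = "base_port = 4567; gcache.size = 128M; pc.ignore_quorum = false; pc.weight = 1; "

opts = parse_provider_options(sample)
print(opts["pc.weight"])  # prints 1
```

Running this against the value copied from each node makes it quick to confirm that pc.weight is the same everywhere.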

Also, provide the my.cnf for the nodes, and the last 50 or so lines of the MySQL error log from when you stopped the 2 nodes. There will be information in the log about quorum votes/calculations/etc.
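To make the quorum arithmetic concrete, here is a small sketch of the weighted quorum rule as described in the Galera Cluster documentation: the current component stays Primary when the summed weight of its members is strictly greater than half of (the weight of the last Primary component minus the weight of members that left gracefully). The function and argument names are made up for illustration, and all weights are assumed to be 1. Whether any of these cases matches what happened in the cluster above is not confirmed in this thread; the error log is the place to check.

```python
def has_quorum(last_primary_weights, left_gracefully_weights, current_weights):
    """Return True if the current component would remain Primary.

    Rule: (sum(last primary) - sum(left gracefully)) / 2 < sum(current members)
    """
    required = (sum(last_primary_weights) - sum(left_gracefully_weights)) / 2
    return sum(current_weights) > required


# Case A: 2 of 4 nodes disappear at the same time without a clean shutdown.
print(has_quorum([1, 1, 1, 1], [], [1, 1]))      # False: 2 is not more than 4 / 2

# Case B: 2 of 4 nodes are shut down cleanly and announce that they are leaving.
print(has_quorum([1, 1, 1, 1], [1, 1], [1, 1]))  # True: 2 is more than (4 - 2) / 2

# Case C: nodes are lost one after another; membership shrinks to 3 in between.
print(has_quorum([1, 1, 1, 1], [], [1, 1, 1]))   # True: 3 is more than 4 / 2
print(has_quorum([1, 1, 1], [], [1, 1]))         # True: 2 is more than 3 / 2
```

Under this rule, only a simultaneous, unannounced loss of half the votes (case A) leaves the survivors without quorum, which is the split-brain risk behind the advice to avoid an even number of nodes.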


Send this forum post to your boss where Percona, the creators of PXC, say not to use an even number of nodes. :slight_smile:


Thank you for the honest response.

After another group discussion, I was told that “if it helps, you can add a fifth node”. Therefore, I will close this post with the conclusion that a cluster should have an odd number of nodes. To troubleshoot possible issues with quorum or the Primary/Non-Primary state, you should also review the configuration and logs: wsrep_provider_options (including pc.weight), the my.cnf file on each node, and the MySQL error log.

@SQLCaesar, if you have concerns about data size, or the resources of a 5th node, consider making the 5th node a simple arbitrator using garbd. This node stores no data, but still functions as a voting member regarding quorum counts.
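As a rough illustration of why the extra vote helps, the sketch below counts votes for a simultaneous, unannounced loss of two members, assuming every member (including the arbitrator) has weight 1. The helper name is hypothetical and the check is a simplified majority test, not Galera's actual implementation.

```python
def survives_loss(total_votes: int, lost_votes: int) -> bool:
    """True if the remaining members hold strictly more than half of all votes."""
    remaining = total_votes - lost_votes
    return remaining > total_votes / 2


# 4 data nodes, no arbitrator: losing 2 at once leaves exactly half the votes.
print(survives_loss(4, 2))  # False

# 4 data nodes plus 1 garbd arbitrator (5 votes): losing 2 leaves 3 of 5.
print(survives_loss(5, 2))  # True
```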
