Objective: Test the behaviour of the cluster after two consecutive pod failures
Cluster Detail: 3-node Group Replication cluster
Rejoin disabled (for testing): SET PERSIST group_replication_start_on_boot=OFF;
Platform: Kubernetes
Image used: percona/percona-server:8.0.34-aarch64
Steps:
- On a healthy 3-node cluster, we disabled automatic rejoin:
  SET PERSIST group_replication_start_on_boot=OFF;
- Issued the pod delete command:
  kubectl delete pod mysql-1 mysql-2
- Confirmed from the MySQL logs that both the primary (mysql-1) and a secondary (mysql-2) had shut down and left the replication group.
- The only remaining node, mysql-0, was PRIMARY and ONLINE and was accepting writes.
PFA screenshot:
- The same behaviour occurs when we delete mysql-0 and mysql-1: mysql-2 is elected PRIMARY within a few seconds and is ONLINE, accepting writes.
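The member state described in the steps above can be read from performance_schema.replication_group_members on the surviving pod. The following is a minimal sketch, not the exact commands we ran: it assumes the pods are in the current namespace, that the mysql client is available inside the container, and that it can log in without extra flags (for example via a client option file). show_group_members is a hypothetical helper name.

```shell
# Hypothetical helper: list Group Replication members as seen from one pod.
# Assumption: `mysql` inside the container can authenticate without extra
# flags (e.g. through a client option file); adjust credentials as needed.
show_group_members() {
  pod="$1"
  kubectl exec "$pod" -- mysql -N -e \
    "SELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE FROM performance_schema.replication_group_members"
}

# Usage (after deleting mysql-1 and mysql-2):
#   show_group_members mysql-0
```

Each row of the output gives one member's host, state (for example ONLINE) and role (PRIMARY or SECONDARY), so it shows how many members the surviving node still considers part of the group.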
Question:
- As we understand the expected behaviour, the cluster should switch to read-only (RO) mode when a majority of the MySQL nodes are down. Why is the cluster above still writable?
- Is there any MySQL configuration we need to set explicitly to prevent this behaviour?
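For reference, the expectation behind the first question rests on simple majority arithmetic, sketched below. The helper names majority and has_majority are hypothetical. The sketch only models the counting against the original group size of 3. It makes no claim about how Group Replication itself counts members after they leave the group.

```shell
# Smallest number of members that forms a majority of a group of size n.
majority() {
  echo $(( $1 / 2 + 1 ))
}

# Do `alive` members out of `total` still form a majority? Prints yes or no.
has_majority() {
  total="$1"
  alive="$2"
  if [ "$alive" -ge "$(majority "$total")" ]; then
    echo yes
  else
    echo no
  fi
}

majority 3          # prints 2
has_majority 3 1    # prints no
```

By this counting, one surviving member out of three is a minority, which is why we expected mysql-0 to stop accepting writes.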