Node not working normally after one node in a 2-node cluster goes down

Hi guys,

I just set up a 2-node cluster for testing. I know we should not use an even number of nodes.
However, everything worked after setup. Then node 1 in my cluster suddenly went down, so node 2 now shows wsrep_cluster_status=non-Primary and wsrep_cluster_size=0.
I tried the following actions to resolve this, but none of them worked:

  • SET GLOBAL wsrep_provider_options='pc.ignore_quorum=true'; → ERROR 1210 (HY000): Incorrect arguments to SET
  • SET GLOBAL wsrep_provider_options='pc.ignore_sb=true'; → ERROR 1210 (HY000): Incorrect arguments to SET
  • Commented out #wsrep_cluster_address=gcomm:// in /etc/mysql/mysql.conf.d/mysqld.cnf
  • Added wsrep_provider_options="pc.bootstrap=true;debug=yes;pc.ignore_quorum=true;pc.ignore_sb=true" in /etc/mysql/mysql.conf.d/mysqld.cnf (see the config snippet after this list)
  • Restarted node 2
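For reference, the relevant wsrep lines in /etc/mysql/mysql.conf.d/mysqld.cnf after those changes look roughly like this (only the lines I touched are shown; everything else in the file is unchanged):

    [mysqld]
    # commented out as described above
    #wsrep_cluster_address=gcomm://
    # added as described above
    wsrep_provider_options="pc.bootstrap=true;debug=yes;pc.ignore_quorum=true;pc.ignore_sb=true"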

None of these actions worked. Every SELECT query returns ERROR 1047 (08S01): WSREP has not yet prepared node for application use, and wsrep_cluster_status stays non-Primary with wsrep_cluster_size=0.
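For completeness, I am reading those values on node 2 with the standard wsrep status variables:

    SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';
    SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';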

Running systemctl status mysql@bootstrap.service shows:

× mysql@bootstrap.service - Percona XtraDB Cluster with config /etc/default/mysql.bootstrap
     Loaded: loaded (/lib/systemd/system/mysql@.service; disabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Tue 2023-10-10 10:59:58 +07; 6s ago
    Process: 29848 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
    Process: 29885 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 29887 ExecStartPre=/bin/sh -c VAR=`bash /usr/bin/mysql-systemd galera-recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR>
    Process: 29935 ExecStart=/usr/sbin/mysqld $EXTRA_ARGS $_WSREP_START_POSITION (code=exited, status=1/FAILURE)
    Process: 29938 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=0/SUCCESS)
   Main PID: 29935 (code=exited, status=1/FAILURE)
     Status: "Server shutdown complete"
        CPU: 1.326s

Oct 10 10:59:57 cloud20230615211.lanit.com.vn systemd[1]: Starting Percona XtraDB Cluster with config /etc/default/mysql.bootstrap...
Oct 10 10:59:58 cloud20230615211.lanit.com.vn systemd[1]: mysql@bootstrap.service: Main process exited, code=exited, status=1/FAILURE
Oct 10 10:59:58 cloud20230615211.lanit.com.vn mysql-systemd[29938]:  WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Oct 10 10:59:58 cloud20230615211.lanit.com.vn mysql-systemd[29938]:  WARNING: mysql may be already dead
Oct 10 10:59:58 cloud20230615211.lanit.com.vn systemd[1]: mysql@bootstrap.service: Failed with result 'exit-code'.
Oct 10 10:59:58 cloud20230615211.lanit.com.vn systemd[1]: Failed to start Percona XtraDB Cluster with config /etc/default/mysql.bootstrap.
Oct 10 10:59:58 cloud20230615211.lanit.com.vn systemd[1]: mysql@bootstrap.service: Consumed 1.326s CPU time.
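If more detail on why the bootstrap attempt fails would help, I can pull it from the journal or the MySQL error log (the error-log path below is a guess based on this Debian/Ubuntu layout):

    journalctl -u mysql@bootstrap.service -b
    tail -n 100 /var/log/mysql/error.log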

Node 1 is permanently shut down because I have stopped that VPS.

Reference is here

Can anyone recommend what I should do next? Thanks for any help

Hi @cuongpham ,

I tried the following actions to resolve this, but none of them worked

What would you consider as “resolve”?

SET GLOBAL wsrep_provider_options='pc.ignore_quorum=true'; will work, but you have to set it before node_1 leaves the cluster.
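For example (only to illustrate the ordering), you would run it while both nodes are still up and the view is still Primary, then confirm it was accepted:

    -- run while the cluster still has quorum
    SET GLOBAL wsrep_provider_options='pc.ignore_quorum=true';
    SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options';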

The node detected that it is partitioned and has no quorum, so it went to non-Primary. It is not possible to make it Primary again without restarting the node and bootstrapping a new cluster.
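On Percona XtraDB Cluster with the systemd units shown in your output, "restarting with new cluster creation" roughly means the following (a sketch only, assuming node_1 stays down and you accept node_2's current data as the new baseline):

    sudo systemctl stop mysql
    sudo systemctl start mysql@bootstrap.service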
If you just want to perform some actions on this node, try

set wsrep_on=0;

but beware of any changes you make. If node_1 comes back and both nodes form a Primary view again, they will be inconsistent if you have modified data on node_2 in the meantime.
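A minimal sketch of keeping that change scoped to a single session (the table name is just a placeholder):

    -- session-level only; wsrep is bypassed just for this connection
    SET SESSION wsrep_on=0;
    SELECT COUNT(*) FROM mydb.mytable;
    SET SESSION wsrep_on=1;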