Full SST after clean shutdown of all nodes in the cluster and bootstrapping one node

pxchxw · April 15, 2015, 1:13pm

Hello,

We are running a 3-node PXC cluster on 5.6.21. After gracefully shutting down all nodes in the cluster, we bootstrapped the node with the most advanced seqno in its grastate.dat file (/etc/init.d/mysql bootstrap-pxc). Starting the rest of the nodes (service mysql start) triggers a full SST, which takes a long time. We are wondering how we can avoid the full SST and whether IST is possible after bootstrapping a node in the cluster. Also, what is the best way to restart a cluster after gracefully shutting down all nodes?

Any help is appreciated.
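
For anyone following along, the "most advanced seqno" comparison can be sketched roughly as below. This is only a minimal Python sketch, not an official tool: it assumes each grastate.dat contains the usual `uuid:` and `seqno:` lines, and the function names, node names and sample values are made up for illustration.

```python
def parse_grastate(text):
    """Return (uuid, seqno) parsed from the contents of a grastate.dat file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the "# GALERA saved state" comment, and anything odd.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields.get("uuid"), int(fields.get("seqno", "-1"))


def pick_bootstrap_node(states):
    """Pick the node whose saved seqno is highest.

    `states` maps a (hypothetical) node name to the text of its grastate.dat.
    """
    parsed = {name: parse_grastate(text) for name, text in states.items()}
    if len({uuid for uuid, _ in parsed.values()}) != 1:
        raise ValueError("nodes do not agree on the cluster UUID")
    if any(seqno < 0 for _, seqno in parsed.values()):
        # seqno -1 means the position was not saved (e.g. unclean shutdown),
        # so a plain comparison of the files is not enough in that case.
        raise ValueError("at least one node has no saved seqno")
    return max(parsed, key=lambda name: parsed[name][1])


if __name__ == "__main__":
    template = "# GALERA saved state\nversion: 2.1\nuuid:    {}\nseqno:   {}\ncert_index:\n"
    sample = {
        "node1": template.format("made-up-uuid", 1200),
        "node2": template.format("made-up-uuid", 1234),
        "node3": template.format("made-up-uuid", 1234),
    }
    print(pick_bootstrap_node(sample))
```

After a clean shutdown of all nodes several of them may share the highest seqno; the sketch simply returns the first such node it encounters.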

Svolentin · April 15, 2015, 1:29pm

Using bootstrap-pxc means that all other cluster members MUST do an SST instead of an IST.
On the other hand, it looks like PXC doesn’t have a mechanism to detect how far behind a joining node is and to use IST when that node already has some data and only needs to read forward from the bootstrapped node.

wagnerbianchi · April 16, 2015, 4:31pm

Since PXC 5.6.19, you don’t need to bootstrap the cluster again after a graceful shutdown, if we base our assumptions on the documentation:

[QUOTE]
Percona XtraDB Cluster

Check the complete Percona XtraDB Cluster 5.6.19-25.6 changelog here:

[URL="http://www.percona.com/doc/percona-x...6.19-25.6.html"]Percona XtraDB Cluster 5.6.19-25.6[/URL]
[/QUOTE]

pxchxw · April 20, 2015, 2:19pm

Thanks. Is the new feature the one described in the blog post “Auto-bootstrapping an all-down cluster: Percona XtraDB Cluster”?

The blog mentions that gvwstate.dat will not exist on a node if it was shut down cleanly, only if mysqld was terminated uncleanly. This file should exist and be the same on all the nodes for the auto-recovery to work.

So it seems the auto-recovery will not work after a graceful shutdown.

In our case, we are running PXC 5.6.21 (> 5.6.19). After all nodes are cleanly shut down, service mysql start does not work; it hangs and eventually times out. One of the nodes has to be bootstrapped to get the cluster restarted. Once we do that, the remaining nodes recover by full SST, which takes hours for a large database.

Is it by design that a cluster cannot be cleanly restarted after all nodes are cleanly shut down?

Thanks.

wagnerbianchi · April 23, 2015, 1:09pm

It’s good to follow a node’s shutdown in the MySQL error log (tail -f it). The init script can report a timeout, but that is just the init script’s own timeout; most of the time the shutdown process is still running behind the scenes.
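
As a rough illustration of that check, here is a minimal Python sketch that looks through error log lines and reports whether the most recent shutdown actually finished. It assumes the log contains lines with the text "Normal shutdown" and "Shutdown complete", which is how I understand mysqld 5.6 to word them; treat both strings, the function name and the sample lines as assumptions and adjust them to what your own log shows.

```python
def shutdown_finished(log_lines):
    """Return True if the latest 'Normal shutdown' is followed by 'Shutdown complete'.

    Both marker strings are assumptions about the error log wording.
    """
    started = False
    finished = False
    for line in log_lines:
        if "Normal shutdown" in line:
            # A new shutdown began; forget any earlier completed one.
            started, finished = True, False
        elif started and "Shutdown complete" in line:
            finished = True
    return finished


if __name__ == "__main__":
    # Made-up sample lines, only to show how the function is called.
    sample = [
        "[Note] /usr/sbin/mysqld: Normal shutdown",
        "[Note] InnoDB: Starting shutdown...",
    ]
    print(shutdown_finished(sample))
```

If this returns False after the init script has given up, the server is most likely still shutting down and it is worth waiting before touching the node.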

BTW, let’s organize this thread and check what’s really going on. I’ve got a Galera Cluster with three nodes running on my side. The version information for what I’m running is below:


+------------------------+----------------+
| variable_name | variable_value |
+------------------------+----------------+
| WSREP_PROTOCOL_VERSION | 6 |
| WSREP_PROVIDER_VERSION | 3.8(rf6147dd) |
+------------------------+----------------+

mysqld Ver 5.6.21-70.1-56 for Linux on x86_64 (Percona XtraDB Cluster (GPL), Release rel70.1, Revision 938, WSREP version 25.8, wsrep_25.8.r4150)

After some tests, I’d say that, for the version I’m using, it’s clear that gvwstate.dat survives only in the case of a node/cluster crash, and even that is no guarantee that the cluster can be brought back online without bootstrapping again. After a clean shutdown the file does not survive and the cluster must be bootstrapped, which is a little weird if we recap the docs. All the cluster’s nodes have pc.recover at its default (true), and no other configuration was added to wsrep_provider_options in my.cnf.

I’m not sure whether I have inconsistencies among the cluster’s nodes, so I’m going to keep investigating this problem.
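
To double-check the pc.recover setting on each node, something like the following minimal Python sketch could be used on the value of wsrep_provider_options (for example, copied from SHOW VARIABLES). It assumes the value is a semicolon-separated list of `key = value` pairs; the function names, the set of values treated as true, and the sample string are my own assumptions.

```python
def parse_provider_options(value):
    """Split a wsrep_provider_options string into a dict of option -> value."""
    options = {}
    for item in value.split(";"):
        if "=" not in item:
            continue  # skip empty or malformed fragments
        key, _, val = item.partition("=")
        options[key.strip()] = val.strip()
    return options


def pc_recover_enabled(value):
    """Return True/False for pc.recover, or None if the option is not listed."""
    setting = parse_provider_options(value).get("pc.recover")
    if setting is None:
        return None
    # Which spellings count as "true" is an assumption made for this sketch.
    return setting.lower() in ("true", "yes", "on", "1")


if __name__ == "__main__":
    # Made-up, shortened sample value.
    sample = "base_port = 4567; pc.recover = true; pc.weight = 1; "
    print(pc_recover_enabled(sample))
```

Running the same check against every node is a quick way to rule out a configuration difference between them.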
