MySQL doesn't start automatically after a forced node reboot

Hi

I’m testing a 3-node Percona XtraDB Cluster, and when a node is forcibly rebooted, mysql doesn’t start automatically. systemctl status mysql gives me:

mysql-systemd[618]: WARNING: Node has been rebooted, /var/lib/mysql/grastate.dat: seqno = -1, mysql service has not been started automatically

Looking at the systemd unit file, I see it calls /usr/bin/mysql-systemd check-grastate

which does the following:

check_grastate_dat() {
    local seqno=-1
    local uptime=$(awk '{print int($1/60)}' /proc/uptime)
    if [ $uptime -lt 5 ]; then
        if [ -f $grastate_loc ]; then
            seqno=$(grep 'seqno:' $grastate_loc | cut -d: -f2 | tr -d ' ')
            if [ $seqno -eq -1 ]; then
                log_warning_msg "Node has been rebooted, $grastate_loc: seqno = $seqno, mysql service has not been started automatically"
                exit 1
            fi
        else

So, if a node is forcibly rebooted and its uptime is under 5 minutes, systemctl start mysql fails with an error. After 5 minutes, the same systemctl start mysql command starts the service with no problem (the warning is still logged).

Why? I assume the intention is that the operator can check everything is OK and then start it manually after 5 minutes. But if I want a completely automatic system (for example, one that recovers from a forced reboot after a power failure), is it safe to remove the “exit 1” line so it can start automatically?

Thanks

This is a safety gate to prevent unsafe automatic startup in an invalid Galera state. It was added in PXC-2985. Startup is only blocked after an unclean shutdown.

It is not advisable to remove exit 1 from the script.

I’ve read it and I have the same question. After a node is forcibly rebooted, “systemctl start mysql” fails during the first 5 minutes of uptime. After that, it starts with no errors. Nothing happens in these first 5 minutes, so why is it safe to start it after 5 minutes?

It’s not that it becomes “safe” to start mysqld once 5 minutes have passed. Rather, requiring a DBA to start mysqld manually avoids the following cases:

  • A node auto-starting and joining a cluster, which may cause cluster inconsistencies
  • Nodes auto-starting when all members of the cluster were rebooted, which may cause an endless restart loop

Since the DBA manually restarted mysqld, the expectation is that they are aware that it is “safe” to start the node and let it join the cluster.

Hi. Thanks for the reply. I understand, but it doesn’t cover a power failure when no DBA is around to do anything. The idea is to have a system that is available 24x7 with no manual intervention. In this case, when a power failure happens, the node will not start mysqld automatically and requires a DBA to start it manually. Most, if not almost all, high-availability services cover this scenario with no manual intervention.

Right, for cases like this, I believe the solution is to migrate to PXC Operator and let the Kubernetes controller and statefulset handle pod restarts.

For on-premises or self-managed clusters, automatic startup is not advisable. If all nodes restart abruptly at almost the same time while DMLs are active on any given node, the DBA would need to start each member with --wsrep_recover to repair grastate.dat, determine which node has the latest commit, and start that node as the bootstrapped node. Automatic service startup could start a member that is behind, resulting in either inconsistencies or data loss.
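As a rough sketch of the "determine which node has the latest commit" step: after a recovery run, the error log normally contains a line of the form "WSREP: Recovered position: <uuid>:<seqno>". The helper names below (recovered_seqno, most_advanced_node) are hypothetical, and the node names and seqno values in the usage comments are made up for illustration.

```shell
# Hypothetical helpers, not part of PXC.

# Read log text on stdin and print the seqno from the last
# "Recovered position: <uuid>:<seqno>" line found.
recovered_seqno() {
    sed -n 's/.*Recovered position: *[0-9a-fA-F-]*:\(-\{0,1\}[0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Read "name seqno" pairs on stdin (one per line) and print the
# name of the node with the highest seqno.
most_advanced_node() {
    sort -k2,2n | tail -n 1 | cut -d' ' -f1
}

# Usage (illustrative values only):
#   recovered_seqno < /path/to/error.log
#   printf 'node1 100\nnode2 250\nnode3 180\n' | most_advanced_node
```

The node printed by most_advanced_node would be the candidate to bootstrap; the others would then join it and receive a state transfer.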

Moreover, editing the mysql-systemd script would not scale and would require you to edit the file each time you upgrade. Your other option is to create a script that will start the service after an OS reboot if the other nodes are up and running.
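A minimal sketch of such a script, under stated assumptions: the peer addresses are placeholders, the function names are invented for this example, and "up and running" is approximated by a TCP connection to the Galera group communication port (4567) succeeding. It waits out the 5-minute window rather than bypassing it, and it uses bash's /dev/tcp feature, so bash and coreutils timeout are required.

```shell
#!/usr/bin/env bash
# Sketch only: start mysql after a reboot once the 5-minute gate has
# passed and at least one other cluster member is reachable.

PEERS="${PEERS:-10.0.0.2 10.0.0.3}"   # hypothetical peer addresses
GALERA_PORT=4567                       # Galera group communication port

# Whole minutes of uptime, read from a /proc/uptime-style file.
uptime_minutes() {
    awk '{print int($1/60)}' "${1:-/proc/uptime}"
}

# Succeeds if a TCP connection to host $1, port $2 opens within 3 seconds.
peer_reachable() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Succeeds if any host given as an argument answers on the Galera port.
any_peer_reachable() {
    local host
    for host in "$@"; do
        peer_reachable "$host" "$GALERA_PORT" && return 0
    done
    return 1
}

main() {
    # Wait out the window in which mysql-systemd refuses to start.
    while [ "$(uptime_minutes)" -lt 5 ]; do sleep 15; done
    # PEERS is intentionally unquoted so it splits into one host per word.
    if any_peer_reachable $PEERS; then
        systemctl start mysql
    else
        echo "no peer reachable, leaving mysql stopped" >&2
        return 1
    fi
}

# Runs only when invoked with --run, e.g. from a oneshot unit or cron @reboot.
if [ "${1:-}" = "--run" ]; then main; fi
```

If no peer answers, the script deliberately does nothing, which leaves the all-nodes-down case to a DBA as described above.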

Hey,

I also stumbled across that. In my case, I manually checked and was sure that it was safe to start MySQL again (because it was a one-node cluster for testing purposes only).
I still could not do it, because even a manual “systemctl start mysql” is prevented within the first 5 minutes.

Can you elaborate on how I would write a script to start the systemd service without removing your 5-minute limit?
Or are you suggesting that we start MySQL completely outside of systemd and implement our own process management?

Automatic service startup could start a member that is behind, resulting in either inconsistencies or data loss.

Would mysql actually start with other nodes configured in `wsrep_cluster_address`, when none of these nodes is reachable and the node itself experienced an unclean shutdown?

From my observation, a crashed node will have a grastate like:

# GALERA saved state
version: 2.1
uuid: e256c10a-d2ea-11ed-96e2-c7a28cc52bd1
seqno: -1
safe_to_bootstrap: 0

And because “safe_to_bootstrap” is 0, the node knows that it requires a state transfer from other nodes before starting.

If all nodes crashed around the same time, then no node will be reachable for a state transfer and startup will fail.

I can imagine how this can end up in a restart loop, but I don’t quite get how this would result in an unhealthy node ever entering the “PRIMARY” state.
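For anyone wanting to check these fields on their own nodes, here is a small sketch that reads one field from a grastate.dat file. The function name grastate_field is made up for this example, and the path in the usage comment is the one from the warning message at the top of the thread.

```shell
# Hypothetical helper: print the value of one field from a grastate.dat file.
#   $1 = field name (e.g. seqno, safe_to_bootstrap)
#   $2 = path to the grastate.dat file
grastate_field() {
    # Split each line on the colon plus any following spaces,
    # then print the value of the line whose key matches.
    awk -F': *' -v key="$1" '$1 == key {print $2}' "$2"
}

# Usage:
#   grastate_field seqno /var/lib/mysql/grastate.dat
#   grastate_field safe_to_bootstrap /var/lib/mysql/grastate.dat
```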

I also stumbled across that. In my case, I manually checked and was sure that it was safe to start MySQL again (because it was a one-node cluster for testing purposes only).

In a one-node cluster, you can force-bootstrap that node and it should start. Do this when starting the first node:

systemctl start mysql@bootstrap.service
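One caveat: with safe_to_bootstrap: 0, as in the grastate.dat shown earlier, Galera refuses to bootstrap until that flag is set to 1 in the file. The sketch below makes that edit. The function name is hypothetical, and it is only appropriate when you are certain this node holds the most recent data, as in a single-node test cluster.

```shell
# Hypothetical helper: set safe_to_bootstrap to 1 in the given grastate.dat.
# Writes back with cat so the file keeps its owner and permissions.
set_safe_to_bootstrap() {
    local file="$1"
    local tmp rc=0
    tmp=$(mktemp) || return 1
    sed 's/^safe_to_bootstrap:.*/safe_to_bootstrap: 1/' "$file" > "$tmp" &&
        cat "$tmp" > "$file" || rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Usage (as root, with mysql stopped):
#   set_safe_to_bootstrap /var/lib/mysql/grastate.dat
#   systemctl start mysql@bootstrap.service
```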

Can you elaborate on how I would write a script to start the systemd service without removing your 5-minute limit?
Or are you suggesting that we start MySQL completely outside of systemd and implement our own process management?

You should check this document, which covers all the cases in which nodes shut down.

The document should clear up all your doubts regarding crashes and restarts.