After power loss MySQL cluster can't start, but bootstrap works

hello,

I have a problem with my MySQL cluster after losing power.
I followed the instructions (all 3 servers failed):
I checked the grastate.dat file on each of them with the command:

mysqld_safe --wsrep-recover

Then, on the most advanced node (the one that crashed/shut down last), I changed the safe_to_bootstrap line in the grastate.dat file from 0 to 1.
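The seqno comparison and the safe_to_bootstrap edit can be sketched like this. This is a hypothetical example working on a copy under /tmp; the real file is /var/lib/mysql/grastate.dat, and mysqld must be stopped before you edit it:

```shell
# Sketch of the grastate.dat check/edit, run against a temp copy for safety.
# The real file is /var/lib/mysql/grastate.dat; stop mysqld before editing.
cat > /tmp/grastate.dat <<'EOF'
# GALERA saved state
version: 2.1
uuid:    d005ce72-f111-11eb-975e-a3aadeac8607
seqno:   271691169
safe_to_bootstrap: 0
EOF

# Compare this seqno across all three nodes to find the most advanced one
awk '/^seqno:/ {print $2}' /tmp/grastate.dat

# On the node with the highest seqno ONLY, flip safe_to_bootstrap to 1
sed -i 's/^safe_to_bootstrap: 0/safe_to_bootstrap: 1/' /tmp/grastate.dat
```

Note that `sed -i` as written assumes GNU sed (Linux).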

After that, I started the bootstrap with the command:

systemctl start mysql@bootstrap.service
systemctl status mysql@bootstrap.service (looks OK - green light ;-))

Then, on that server, I wanted to start the MySQL cluster with the command:

service mysql start << unfortunately the service does not start

I tried the same command again, also without success.

What else can I do?

I have only seen the auto-recovery file on one of the servers:
/var/lib/mysql/gvwstate.dat
Is this normal?

That is all you need to do. Nothing more. Bootstrapping “starts” the cluster, so you don’t need any further ‘start mysql’ commands. Use ps to verify that mysqld is running, then go start the other nodes.

root@server3:/home/root# ps -ef | grep mysqld
mysql       2991       1  4 16:45 ?        00:00:07 /usr/sbin/mysqld --wsrep-new-cluster --wsrep_start_position=d005ce72-f111-11eb-975e-a3aadeac8607:271691167
root        3133    2873  0 16:48 pts/1    00:00:00 grep --color=auto mysqld

How do I start the other nodes?
Do I have to set safe_to_bootstrap to 1 in grastate.dat and run the bootstrap with this command?
systemctl start mysql@bootstrap.service

Thx

root@server3:/home/root# systemctl status mysqld
● mysql.service - Percona XtraDB Cluster
     Loaded: loaded (/lib/systemd/system/mysql.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Tue 2023-12-19 16:10:27 CET; 44min ago
   Main PID: 2599 (code=exited, status=1/FAILURE)
     Status: "Server startup in progress"

Dec 19 16:09:53 server3 systemd[1]: Starting Percona XtraDB Cluster...
Dec 19 16:10:27 server3 systemd[1]: mysql.service: Main process exited, code=exited, status=1/FAILURE
Dec 19 16:10:27 server3 mysql-systemd[2605]:  WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 19 16:10:27 server3 mysql-systemd[2605]:  WARNING: mysql may be already dead
Dec 19 16:10:27 server3 systemd[1]: mysql.service: Failed with result 'exit-code'.
Dec 19 16:10:27 server3 systemd[1]: Failed to start Percona XtraDB Cluster.

P.S. I checked the grastate.dat file again on server3 (the server I started the bootstrap from). It has “1” in safe_to_bootstrap; the file contents are below:

# GALERA saved state
version: 2.1
uuid:    d005ce72-f111-11eb-975e-a3aadeac8607
seqno:   -1
safe_to_bootstrap: 1

When I stop the bootstrap service, the file shows different values:

# GALERA saved state
version: 2.1
uuid:    d005ce72-f111-11eb-975e-a3aadeac8607
seqno:   271691169
safe_to_bootstrap: 1


The other nodes can be started normally with systemctl start mysql.
No need to bootstrap or edit the files.

This is how it worked until today’s crash.
All I had to do was change 0 to 1 in grastate.dat and restart the server.
It was not necessary to use the bootstrap command.
After the restart, the cluster worked and all nodes reconnected one by one.

Currently, I can’t start the MySQL service.
Every server has the same problem - the service won’t start.

Dec 19 16:09:53 server3 systemd[1]: Starting Percona XtraDB Cluster...
Dec 19 16:10:27 server3 systemd[1]: mysql.service: Main process exited, code=exited, status=1/FAILURE
Dec 19 16:10:27 server3 mysql-systemd[2605]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 19 16:10:27 server3 mysql-systemd[2605]: WARNING: mysql may be already dead
Dec 19 16:10:27 server3 systemd[1]: mysql.service: Failed with result 'exit-code'.
Dec 19 16:10:27 server3 systemd[1]: Failed to start Percona XtraDB Cluster.

Now it’s a little better, but what does this mean:

mysql.service - Percona XtraDB Cluster
     Loaded: loaded (/lib/systemd/system/mysql.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-12-19 21:42:02 CET; 6min ago
    Process: 2654 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
    Process: 2707 ExecStartPre=/usr/bin/mysql-systemd check-grastate (code=exited, status=0/SUCCESS)
    Process: 2737 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 2739 ExecStartPre=/bin/sh -c VAR=`bash /usr/bin/mysql-systemd galera-recovery`; [ $? -eq 0 ] && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1 (c>
    Process: 4158 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 4161 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=0/SUCCESS)
   Main PID: 2865 (mysqld)
     Status: "Server is operational"
      Tasks: 64 (limit: 19050)
     Memory: 1.5G
     CGroup: /system.slice/mysql.service
             └─2865 /usr/sbin/mysqld --wsrep_start_position=d005ce72-f111-11eb-975e-a3aadeac8607:271691170

Dec 19 21:36:20 server3 systemd[1]: Starting Percona XtraDB Cluster...
Dec 19 21:41:56 server3 systemd[1]: mysql.service: Got notification message from PID 3932, but reception only permitted for main PID 2865
Dec 19 21:41:56 server3 systemd[1]: mysql.service: Got notification message from PID 3976, but reception only permitted for main PID 2865
Dec 19 21:41:58 server3 systemd[1]: mysql.service: Got notification message from PID 3976, but reception only permitted for main PID 2865
Dec 19 21:41:58 server3 systemd[1]: mysql.service: Got notification message from PID 3976, but reception only permitted for main PID 2865
Dec 19 21:42:00 server3 systemd[1]: mysql.service: Got notification message from PID 3976, but reception only permitted for main PID 2865
Dec 19 21:42:02 server3 mysql-systemd[4161]:  SUCCESS!
Dec 19 21:42:02 server3 systemd[1]: Started Percona XtraDB Cluster.

On the second server, where the bootstrap is running, the status is:

 mysql@bootstrap.service - Percona XtraDB Cluster with config /etc/default/mysql.bootstrap
     Loaded: loaded (/lib/systemd/system/mysql@.service; disabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-12-19 21:28:22 CET; 35min ago
    Process: 5724 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
    Process: 5772 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 5779 ExecStartPre=/bin/sh -c VAR=`bash /usr/bin/mysql-systemd galera-recovery`; [ $? -eq 0 ] && systemctl set-environment _WSR>
    Process: 5913 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 5915 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=0/SUCCESS)
   Main PID: 5828 (mysqld)
     Status: "Server is operational"
      Tasks: 70 (limit: 19050)
     Memory: 1.2G
     CGroup: /system.slice/system-mysql.slice/mysql@bootstrap.service
             └─5828 /usr/sbin/mysqld --wsrep-new-cluster --wsrep_start_position=d005ce72-f111-11eb-975e-a3aadeac8607:271691167

Dec 19 21:28:20 server2 systemd[1]: Starting Percona XtraDB Cluster with config /etc/default/mysql.bootstrap...
Dec 19 21:28:22 server2 mysql-systemd[5915]:  SUCCESS!
Dec 19 21:28:22 server2 systemd[1]: Started Percona XtraDB Cluster with config /etc/default/mysql.bootstrap.

And only on one server does everything look good:

mysql.service - Percona XtraDB Cluster
     Loaded: loaded (/lib/systemd/system/mysql.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2023-12-19 21:59:59 CET; 8min ago
    Process: 960 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
    Process: 1095 ExecStartPre=/usr/bin/mysql-systemd check-grastate (code=exited, status=0/SUCCESS)
    Process: 1135 ExecStartPre=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 1137 ExecStartPre=/bin/sh -c VAR=`bash /usr/bin/mysql-systemd galera-recovery`; [ $? -eq 0 ] && systemctl set-en>
    Process: 1995 ExecStartPost=/bin/sh -c systemctl unset-environment _WSREP_START_POSITION (code=exited, status=0/SUCCESS)
    Process: 1997 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=0/SUCCESS)
   Main PID: 1195 (mysqld)
     Status: "Server is operational"
      Tasks: 69 (limit: 19050)
     Memory: 1.2G
     CGroup: /system.slice/mysql.service
             └─1195 /usr/sbin/mysqld --wsrep_start_position=d005ce72-f111-11eb-975e-a3aadeac8607:271691205

Dec 19 21:59:53 server1 systemd[1]: Starting Percona XtraDB Cluster...
Dec 19 21:59:59 server1 mysql-systemd[1997]:  SUCCESS!
Dec 19 21:59:59 server1 systemd[1]: Started Percona XtraDB Cluster.

You only bootstrap the very first node!! Do not bootstrap any other nodes! Do not edit grastate.dat on other nodes.

While PXC is running, seqno: -1 is correct to see in grastate.dat. This value is only updated on clean shutdown.
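While the node is running, the live position can be read from the status variables instead of grastate.dat. A minimal sketch, where `last_committed` is a stub standing in for the real query `mysql -Nse "SHOW GLOBAL STATUS LIKE 'wsrep_last_committed'"` (the sample value is taken from the logs above):

```shell
# seqno in grastate.dat stays -1 while mysqld is running; the live committed
# seqno is exposed as the status variable wsrep_last_committed instead.
# The function below is a stub for:
#   mysql -Nse "SHOW GLOBAL STATUS LIKE 'wsrep_last_committed'" | awk '{print $2}'
last_committed() { echo "271691170"; }

echo "live committed seqno: $(last_committed)"
```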

Thanks.
Everything is OK now.

Summary:

  1. Find the most advanced node.
  2. Run the bootstrap on the most advanced node.
  3. Start MySQL on node2 and wait for it to sync.
  4. Start MySQL on node3 and wait for it to sync.
  5. Shut down the bootstrap on the most advanced node.
  6. Start MySQL on the most advanced node.
  7. Wait for all nodes to sync.

You can skip 5, 6, and 7. Those steps are not needed. “Bootstrap” simply means “start a cluster from this point and continue running normally”. You do not need to stop the bootstrap and re-start ‘normally’.
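The “wait for sync” in the steps above can be scripted; a minimal sketch, where `wsrep_state` is a stub standing in for the real query `mysql -Nse "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'"`:

```shell
# Minimal "wait until this node is synced" sketch.
# wsrep_state is a stub standing in for the real query:
#   mysql -Nse "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'" | awk '{print $2}'
wsrep_state() { echo "Synced"; }

# Poll until the node reports Synced, then it is safe to start the next one
until [ "$(wsrep_state)" = "Synced" ]; do
  sleep 5
done
echo "node is Synced; safe to start the next one"
```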

Hello mattheweb,
could you tell me what the warning below from one of the nodes means:

server1 pacemaker-controld[1969]:  warning: Another DC detected: server2 (op=noop)

server2 is the server the bootstrap was started from;
on server3 this warning does not appear.

How can I check the status of the cluster and its synchronization,
so I can be sure that everything is working 100%?

I use these commands:
pcs cluster status
service pacemaker status
service corosync status

Sorry, I don’t know anything about pacemaker/corosync. Those are external, 3rd party, tools.

Log in to any MySQL node and run SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size', and also check wsrep_cluster_status. If you see 3 members online in the Primary state, then your cluster is good.
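Those two checks can be combined into a small health-check script; a sketch, where the `status` variable is sample data standing in for the real query output:

```shell
# Sketch of a cluster health check. The sample below stands in for:
#   mysql -Nse "SHOW GLOBAL STATUS WHERE Variable_name IN
#               ('wsrep_cluster_size','wsrep_cluster_status')"
status="wsrep_cluster_size 3
wsrep_cluster_status Primary"

size=$(printf '%s\n' "$status" | awk '$1 == "wsrep_cluster_size" {print $2}')
state=$(printf '%s\n' "$status" | awk '$1 == "wsrep_cluster_status" {print $2}')

# A healthy 3-node cluster reports size 3 and Primary status
if [ "$size" = "3" ] && [ "$state" = "Primary" ]; then
  echo "cluster healthy: $size nodes, $state"
else
  echo "cluster degraded: size=$size status=$state"
fi
```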