It seems like one of my nodes on the cluster restarted.
2022-01-21T02:49:40.127771Z 0 [Note] [MY-000000] [Galera] Created page /db1/gcache.page.000006 of size 491375560 bytes
2022-01-21T02:49:48.274502Z 0 [Note] [MY-000000] [Galera] (87846d53-828c, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers: ssl://192.168.4.71:4567
2022-01-21T02:49:49.274802Z 0 [Note] [MY-000000] [Galera] (87846d53-828c, 'ssl://0.0.0.0:4567') reconnecting to 5a42bfda-83f1 (ssl://192.168.4.71:4567), attempt 0
2022-01-21T02:49:51.072488Z 0 [Note] [MY-000000] [Galera] SSL handshake successful, remote endpoint ssl://192.168.4.71:4567 local endpoint ssl://192.168.2.61:36868 cipher: TLS_AES_256_GCM_SHA384 compression: none
2022-01-21T02:49:51.132812Z 0 [Note] [MY-000000] [Galera] SSL handshake successful, remote endpoint ssl://192.168.4.71:52112 local endpoint ssl://192.168.2.61:4567 cipher: TLS_AES_256_GCM_SHA384 compression: none
2022-01-21T02:49:51.181165Z 0 [Note] [MY-000000] [Galera] (87846d53-828c, 'ssl://0.0.0.0:4567') connection established to 5a42bfda-83f1 ssl://192.168.4.71:4567
2022-01-21T02:49:51.181433Z 0 [Note] [MY-000000] [Galera] (87846d53-828c, 'ssl://0.0.0.0:4567') connection established to 5a42bfda-83f1 ssl://192.168.4.71:4567
At some point after syncing, node 1 tried to communicate with node 2:
2022-01-21T05:36:09.143488Z 17 [Note] [MY-000000] [Galera] Non-primary view
2022-01-21T05:36:09.143505Z 17 [Note] [MY-000000] [WSREP] Server status change connected -> connected
2022-01-21T05:36:09.143527Z 17 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2022-01-21T05:36:09.987315Z 0 [Note] [MY-000000] [Galera] (87846d53-828c, 'ssl://0.0.0.0:4567') connection to peer 00000000-0000 with addr ssl://192.168.4.71:4567 timed out, no messages seen in PT3S, socket stats: rtt: 229 rttvar: 124 rto: 204000 lost: 0 last_data_recv: 3004 cwnd: 11 last_queued_since: 4520675288595656 last_delivered_since: 4520675288595656 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
I can log into node 1, but when I run a query I get the following message:
select * from db.files;
ERROR 1047 (08S01): WSREP has not yet prepared node for application use
I then stopped node 2 (it wouldn’t let me query anyway; I got "Can’t connect to the server" on node 2), tried the query on node 1 again, and got the same error. Do I need to restart node 1? All of the bin files are in the datadir on node 1. Or is there a way to jump-start the node for application use?
SHOW STATUS LIKE '%wsrep_%';
wsrep_ready | OFF
wsrep_local_state_comment | Initialized
I’ve tried rebooting node 1, and I got a message telling me to:
edit the grastate.dat file manually and set safe_to_bootstrap
The grastate.dat file is empty on node 1.
On node 2, the node that is down and wouldn’t respond, the file does have content:
# GALERA saved state
version: 2.1
uuid: bc54efc2-52bc-11ec-9ba9-fbe9dc3b2f33
seqno: -1
safe_to_bootstrap: 0
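In case it helps anyone compare the files on the two nodes, here is a minimal sketch of reading these fields programmatically. It assumes grastate.dat is just "key: value" lines plus "#" comments, as in the content above. The function name parse_grastate and the SAMPLE string are my own, for illustration only; this is not part of any Galera tooling.

```python
def parse_grastate(text):
    """Parse the 'key: value' lines of a grastate.dat file into a dict.

    Assumes the simple layout shown above: '#' comment lines and
    one 'key: value' pair per line. An empty file yields an empty dict.
    """
    state = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if sep:  # ignore lines without a colon
            state[key.strip()] = value.strip()
    return state


# Contents copied from node 2's grastate.dat above.
SAMPLE = """# GALERA saved state
version: 2.1
uuid: bc54efc2-52bc-11ec-9ba9-fbe9dc3b2f33
seqno: -1
safe_to_bootstrap: 0
"""

state = parse_grastate(SAMPLE)
for key in ("uuid", "seqno", "safe_to_bootstrap"):
    # .get() returns None for a missing key, e.g. with node 1's empty file
    print(key, "=", state.get(key))
```

Run against node 1's empty file, the same function would return an empty dict, so every field would print as None.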
Since node 2 is a lost cause, can I just add safe_to_bootstrap: 0 to the empty grastate.dat file on node 1?