I have a 5-node cluster. I checked the status of one node and it still says joining, even though it has been over 10 hours. When I try querying the db on that node I get ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use
The other nodes report a cluster size of 5 as well, and I can see the bad node's IP address in the wsrep_incoming_addresses status. Do I just reboot the bad node again?
Looking at the error log, I am seeing messages like this:
2022-09-15T08:19:59.021032Z 0 [Note] [MY-011953] [InnoDB] Page cleaner took 7223ms to flush 267 pages
I also noticed my galera.cache file is at its maximum size:
wsrep_provider_options="gcache.size = 5G"
Size of file is 5.1G Sep 15 09:00 galera.cache
When I run
mysql -e "show global status like '%wsrep_local_state_comment%'\G"
from the command line, I get
Variable_name: wsrep_local_state_comment
Value: Joined
When I log in to mysql and run a query, I still get: WSREP has not yet prepared node for application use
The log file still has messages like this:
2022-09-15T10:24:46.252378Z 0 [Note] [MY-011953] [InnoDB] Page cleaner took 37403ms to flush 1493 pages
The wsrep_local_recv_queue is 500. I am guessing I should reboot?
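For reference, the status variables mentioned in this post can be pulled in a single query rather than one at a time. The following is only a sketch: the function names are made up for illustration, and it assumes the mysql client can log in without extra options (for example through ~/.my.cnf). The numeric codes are the documented Galera values for wsrep_local_state.

```shell
# Map the numeric wsrep_local_state to its documented Galera name.
wsrep_state_name() {
  case "$1" in
    1) echo "Joining" ;;
    2) echo "Donor/Desynced" ;;
    3) echo "Joined" ;;
    4) echo "Synced" ;;
    *) echo "Unknown" ;;
  esac
}

# Print the wsrep status variables discussed in this thread in one query.
# Assumption: the mysql client needs no extra login options on this host.
show_wsrep_status() {
  mysql -e "SHOW GLOBAL STATUS WHERE Variable_name IN
    ('wsrep_local_state', 'wsrep_local_state_comment', 'wsrep_ready',
     'wsrep_cluster_status', 'wsrep_cluster_size', 'wsrep_local_recv_queue')"
}
```

Running show_wsrep_status on both the bad node and a healthy node makes it easier to compare them side by side; a node normally accepts queries only once it reports Synced and wsrep_ready is ON.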
Hello, did you check for problems in the logs of the bad node? Is it possible the node is still going through SST?
The log has:
0.0 (DB404): State transfer from 3.0 (DB403) complete.
[Galera] Shifting JOINER -> JOINED (TO: 799642)
But when I looked at wsrep_local_state_comment, it said joining (even after 2 hours), and wsrep_ready said OFF, even though the SST was complete.
Look in the datadir of the joining node. Do you have an actual datadir or is it mostly empty? There might be an sst_in_progress file or some xtrabackup logs to inspect.
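A quick way to run that check is sketched below. The function name is hypothetical, and the datadir path in the usage note is only the common default, so substitute the datadir value from your own configuration.

```shell
# Report whether the SST marker file exists in the given datadir
# and how much data the directory currently holds.
check_sst_marker() {
  datadir="$1"
  if [ -e "${datadir}/sst_in_progress" ]; then
    echo "sst_in_progress present"
  else
    echo "sst_in_progress absent"
  fi
  # Total size of the datadir; a near-empty directory suggests the SST never landed.
  du -sh "${datadir}" 2>/dev/null | awk '{ print "datadir size: " $1 }'
}
```

For example, check_sst_marker /var/lib/mysql (assuming that is the datadir) prints whether the marker is there, followed by the directory size.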
Datadir was full. I can check the sst_in_progress log. I rebooted and the sst_in_progress file is empty.
If your datadir was full, then that’s your issue. You need more disk space.
After the reboot and sync, the node is still not joining the cluster. Status is joining, and there is no sst_in_progress file.
The joiner still shows wsrep_local_state_comment as joining, and the wsrep_local_state status is set to 1. I can see all of the nodes in the cluster, but this bad node refuses to join. Here is the error log of the joiner:
2022-09-15T21:55:41.141321Z 0 [Note] [MY-000000] [WSREP-SST] Running post-processing...........
2022-09-15T21:55:41.147995Z 0 [Note] [MY-000000] [WSREP-SST] Skipping mysql_upgrade (sst): local version (8.0.27) == donor version (8.0.27)
2022-09-15T21:55:41.241224Z 0 [Note] [MY-000000] [WSREP-SST] Waiting for server instance to start..... This may take some time
2022-09-15T21:55:48.582854Z 0 [Note] [MY-000000] [WSREP-SST] ...........post-processing done
2022-09-15T21:55:49.437058Z 0 [Note] [MY-011952] [InnoDB] If the mysqld execution user is authorized, page cleaner and LRU manager thread priority can be changed. See the man page of setpriority().
2022-09-15T21:55:49.437808Z 4 [Note] [MY-013532] [InnoDB] Using './#ib_16384_0.dblwr' for doublewrite
[Note] [MY-011089] [Server] Data dictionary restarting version '80023'.
[System] [MY-000000] [WSREP] PXC upgrade completed successfully
[Note] [MY-010006] [Server] Using data dictionary with version '80023'.
[Note] [MY-011025] [Repl] Failed to start slave threads for channel ''.
[System] [MY-000000] [WSREP] SST completed
[Note] [MY-000000] [Galera] Receiving IST... 0.0% ( 0/85 events) complete.
[Note] [MY-000000] [Galera] Receiving IST...100.0% (85/85 events) complete.
2022-09-15T21:55:54.972261Z 0 [Note] [MY-000000] [Galera] 3.0 (DB404): State transfer from 1.0 (DB405) complete.
2022-09-15T21:55:54.972301Z 0 [Note] [MY-000000] [Galera] SST leaving flow control
2022-09-15T21:55:54.972317Z 0 [Note] [MY-000000] [Galera] Shifting JOINER -> JOINED (TO: 800712)
2022-09-15T21:56:04.286806Z 0 [Note] [MY-011953] [InnoDB] Page cleaner took 6203ms to flush 100 pages
The joiner starts logging Page cleaner messages immediately after it finishes this process.
I have plenty of space left (total: 1.8T, used: 1.3T, available: 493G). By datadir being full I meant it had all of the expected data, not that the disk was actually full.
The donor node has:
2022-09-15T21:55:54.972452Z 0 [Note] [MY-000000] [Galera] 3.0 (DB404): State transfer from 1.0 (DB405) complete.
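To get a feel for how bad the Page cleaner stalls are, the messages can be summarized from the error log. This is a rough sketch that assumes the message format shown in the log lines above; the function name is made up, and the log path has to be whatever log_error points to on the node.

```shell
# Summarize "Page cleaner took Xms to flush Y pages" messages in an error log.
# $1: path to the MySQL error log (pass your own log_error location).
page_cleaner_stats() {
  grep 'Page cleaner took' "$1" |
    # Reduce each matching line to "<milliseconds> <pages>".
    sed -E 's/.*Page cleaner took ([0-9]+)ms to flush ([0-9]+) pages.*/\1 \2/' |
    # Count the events, track the slowest flush, and total the pages flushed.
    awk '{ n++; if ($1 > max) max = $1; pages += $2 }
         END { printf "events=%d max_ms=%d pages=%d\n", n, max, pages }'
}
```

Comparing the output before and after a configuration change shows whether the stalls are getting shorter or less frequent.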
Try increasing innodb_io_capacity by 2x to help with the page cleaner issue. Is there really nothing more in the log file?
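Since innodb_io_capacity is a dynamic variable, the doubling can be tried at runtime without a restart. The sketch below uses made-up function names and assumes the mysql client can log in without extra options with an account that is allowed to run SET GLOBAL. If the new value would exceed innodb_io_capacity_max, that limit may need raising as well.

```shell
# Print twice the given integer.
double_value() {
  echo $(( $1 * 2 ))
}

# Read the current innodb_io_capacity and set it to twice that value at runtime.
# Assumption: the mysql client needs no extra login options on this host.
double_io_capacity() {
  current=$(mysql -N -B -e "SELECT @@GLOBAL.innodb_io_capacity") || return 1
  new=$(double_value "$current")
  mysql -e "SET GLOBAL innodb_io_capacity = ${new}" || return 1
  echo "innodb_io_capacity: ${current} -> ${new}"
}
```

A value set with SET GLOBAL is lost on restart, so if it helps, the same value would also need to go into the [mysqld] section of the configuration file.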
I will add that and see how it goes. And no, there is no more info in the log file, only the page cleaner messages.