Can't reconnect to cluster after reboot

Ken_Scott · January 30, 2024, 8:17pm

This is now happening on 2 of my 5 nodes. We updating a mailer program on reboot the node can’t join back to the cluster. The only way i know this solve this is to blow away the machine and start from scratch.

Joiner node:

root@webnode2:/home/ken# /etc/init.d/mysql start
Starting mysql (via systemctl): mysql.serviceJob for mysql.service failed because the control process exited with error code.
See “systemctl status mysql.service” and “journalctl -xe” for details.
failed!
root@webnode2:/home/ken# systemctl status mysql.service
● mysql.service - LSB: Start and stop the mysql (Percona XtraDB Cluster) daemon
Loaded: loaded (/etc/init.d/mysql; generated)
Active: failed (Result: exit-code) since Tue 2024-01-30 15:12:55 EST; 40s ago
Docs: man:systemd-sysv-generator(8)
Process: 3979495 ExecStart=/etc/init.d/mysql start (code=exited, status=1/FAILURE)

Jan 30 15:12:35 webnode2.long-mcquade.com systemd[1]: Starting LSB: Start and stop the mysql (Percona XtraDB Cluster) daemon…
Jan 30 15:12:35 webnode2.long-mcquade.com mysql[3979495]: * Stale sst_in_progress file in datadir mysqld
Jan 30 15:12:35 webnode2.long-mcquade.com mysql[3979495]: * Starting MySQL (Percona XtraDB Cluster) database server mysqld
Jan 30 15:12:35 webnode2.long-mcquade.com mysql[3979495]: * State transfer in progress, setting sleep higher mysqld
Jan 30 15:12:55 webnode2.long-mcquade.com mysql[3979495]: * The server quit without updating PID file (/var/run/mysqld/mysqld.pid).
Jan 30 15:12:55 webnode2.long-mcquade.com mysql[3979495]: …fail!
Jan 30 15:12:55 webnode2.long-mcquade.com systemd[1]: mysql.service: Control process exited, code=exited, status=1/FAILURE
Jan 30 15:12:55 webnode2.long-mcquade.com systemd[1]: mysql.service: Failed with result ‘exit-code’.
Jan 30 15:12:55 webnode2.long-mcquade.com systemd[1]: Failed to start LSB: Start and stop the mysql (Percona XtraDB Cluster) daemon.

Doner Node:
2024-01-30T20:12:41.941906Z 0 [Note] WSREP: Member 2.0 (pxc2) requested state transfer from ‘any’. Selected 0.0 (pxc3)(SYNCED) as donor.
2024-01-30T20:12:43.332884Z 0 [Note] WSREP: (f23c4ba4, ‘tcp://0.0.0.0:4567’) turning message relay requesting off
2024-01-30T20:12:53.523896Z 0 [Warning] WSREP: 0.0 (pxc3): State transfer to 2.0 (pxc2) failed: -22 (Invalid argument)
2024-01-30T20:12:53.524856Z 0 [Note] WSREP: Member 0.0 (pxc3) synced with group.
2024-01-30T20:12:53.524909Z 0 [Note] WSREP: declaring 225a4946 at tcp://172.26.0.11:4567 stable
2024-01-30T20:12:53.524950Z 0 [Note] WSREP: declaring 38921ada at tcp://172.26.0.9:4567 stable
2024-01-30T20:12:53.524989Z 0 [Note] WSREP: forgetting eb189faa (tcp://172.26.0.12:4567)
2024-01-30T20:12:53.525615Z 0 [Note] WSREP: Node 225a4946 state primary
2024-01-30T20:12:53.526106Z 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(PRIM,225a4946,183)
memb {
225a4946,0
38921ada,0
f23c4ba4,0
}
joined {
}
left {
}
partitioned {
eb189faa,0
}
)
2024-01-30T20:12:53.526129Z 0 [Note] WSREP: Save the discovered primary-component to disk
2024-01-30T20:12:53.526690Z 0 [Note] WSREP: forgetting eb189faa (tcp://172.26.0.12:4567)
2024-01-30T20:12:53.526773Z 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
2024-01-30T20:12:53.526812Z 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2024-01-30T20:12:53.527188Z 0 [Note] WSREP: STATE EXCHANGE: sent state msg: f32dbad8-bfab-11ee-b9bd-bbf4466ecd50
2024-01-30T20:12:53.527437Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: f32dbad8-bfab-11ee-b9bd-bbf4466ecd50 from 0 (pxc3)
2024-01-30T20:12:53.527453Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: f32dbad8-bfab-11ee-b9bd-bbf4466ecd50 from 1 (pxc5)
2024-01-30T20:12:53.527466Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: f32dbad8-bfab-11ee-b9bd-bbf4466ecd50 from 2 (pxc6)
2024-01-30T20:12:53.527483Z 0 [Note] WSREP: Quorum results:
version = 6,
component = PRIMARY,
conf_id = 171,
members = 3/3 (primary/total),
act_id = 7130896577,
last_appl. = 7130896493,
protocols = 0/9/3 (gcs/repl/appl),
group UUID = 3a0118b7-6c7b-11eb-935d-5fcc5d08be87
2024-01-30T20:12:53.527498Z 0 [Note] WSREP: Flow-control interval: [173, 173]
2024-01-30T20:12:53.527687Z 6 [Note] WSREP: REPL Protocols: 9 (4, 2)
2024-01-30T20:12:53.527725Z 6 [Note] WSREP: REPL Protocols: 9 (4, 2)
2024-01-30T20:12:53.527745Z 6 [Note] WSREP: New cluster view: global state: 3a0118b7-6c7b-11eb-935d-5fcc5d08be87:7130896577, view# 172: Primary, number of nodes: 3, my index: 2, protocol version 3
2024-01-30T20:12:53.527756Z 6 [Note] WSREP: Setting wsrep_ready to true
2024-01-30T20:12:53.527767Z 6 [Note] WSREP: Auto Increment Offset/Increment re-align with cluster membership change (Offset: 4 → 3) (Increment: 4 → 3)
2024-01-30T20:12:53.528755Z 6 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2024-01-30T20:12:53.531022Z 6 [Note] WSREP: Assign initial position for certification: 7130896577, protocol version: 4
2024-01-30T20:12:53.531133Z 0 [Note] WSREP: Service thread queue flushed.
2024-01-30T20:12:58.835552Z 0 [Note] WSREP: cleaning up eb189faa (tcp://172.26.0.12:4567)

Any thoughts on what I can do to resolve this?

matthewb · January 30, 2024, 9:02pm

This is extremely old way of managing services in linux. You should learn the new systemctl commands.
systemctl start|stop|restart mysql
systemctl start|stop mysql@bootstrap

You need to erase the data directory and clean out whatever failed before and then start it again.

Ken_Scott · January 31, 2024, 7:33pm

Hello, I used “systemctl start mysql” with the same result.

Also where can I find the data directory on Ubuntu?

I was running:

sudo rm -rf /var/lib/mysql* /var/lib/mysql-files*

matthewb · January 31, 2024, 8:06pm

/var/lib/mysql is the default directory. Stop mysql. Verify stopped using ps -Af | grep mysqld. Erase the datadir then start it back up. Please provide a copy of your my.cnf file.

Ken_Scott · January 31, 2024, 8:12pm

I verified that MySQL was stopped and removed the data directory but still the same issue.

here is my config;

!includedir /etc/mysql/conf.d/
!includedir /etc/mysql/percona-xtradb-cluster.conf.d/

[mysqld]

interactive_timeout = 10
wait_timeout = 10

max_connections = 505 # Values < 1000 are typically good
max_user_connections = 500 # Limit one specific user/application
thread_cache_size = 505 # Up to max_connections makes sense

query_cache_type = 1
query_cache_size = 100M

innodb_buffer_pool_size = 5G

Session variables

sort_buffer_size = 10M # Could be too big for many small sorts
tmp_table_size = 32M # Make sure your temporary results do NOT contain BLOB/TEXT attributes

read_buffer_size = 20M # Resist to change this parameter if you do not know what you are doing
read_rnd_buffer_size = 10M # Resist to change this parameter if you do not know what you are doing
join_buffer_size = 50M # Resist to change this parameter if you do not know what you are doing

Other buffers and caches

table_definition_cache = 1400 # As big as many tables you have
table_open_cache = 5000 # connections x tables/connection (~2)
table_open_cache_instances = 16 # New default in 5.7

wsrep_provider=/usr/lib/libgalera_smm.so

wsrep_cluster_name=pxc-cluster
wsrep_cluster_address=gcomm://172.26.0.11,172.26.0.14,172.26.0.9

wsrep_node_name=pxc2
wsrep_node_address=172.26.0.12

wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:passw0rd
pxc_strict_mode=ENFORCING
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=3

matthewb · January 31, 2024, 9:05pm

Ok. Just to confirm, you have 2 other nodes that are online and in PRIMARY state, correct? Then, please do the following:

stop mysql on failed node
erase datadir
erase mysql error log file
start mysql

after it fails, provide error log contents please.

Ken_Scott · January 31, 2024, 10:03pm

Ok, I’ve done that;

root@webnode2:/var/log# systemctl status mysql.service
● mysql.service - LSB: Start and stop the mysql (Percona XtraDB Cluster) daemon
Loaded: loaded (/etc/init.d/mysql; generated)
Active: failed (Result: exit-code) since Wed 2024-01-31 17:01:25 EST; 5s ago
Docs: man:systemd-sysv-generator(8)
Process: 594265 ExecStart=/etc/init.d/mysql start (code=exited, status=1/FAILURE)

Jan 31 17:01:23 webnode2.test-.com systemd[1]: Starting LSB: Start and stop the mysql (Percona XtraDB Cluster) daemon…
Jan 31 17:01:23 webnode2.test.com mysql[594265]: * Starting MySQL (Percona XtraDB Cluster) database server mysqld
Jan 31 17:01:25 webnode2.test.com mysql[594265]: * The server quit without updating PID file (/var/run/mysqld/mysqld.pid).
Jan 31 17:01:25 webnode2.test.com mysql[594265]: …fail!
Jan 31 17:01:25 webnode2.test.com systemd[1]: mysql.service: Control process exited, code=exited, status=1/FAILURE
Jan 31 17:01:25 webnode2.test.com systemd[1]: mysql.service: Failed with result ‘exit-code’.
Jan 31 17:01:25 webnode2.test.com systemd[1]: Failed to start LSB: Start and stop the mysql (Percona XtraDB Cluster) daemon.

matthewb · January 31, 2024, 10:07pm

Hello @Ken_Scott,
that is not the mysql error log. that is systemctl’s messages which never (usually) contain any helpful information. You need to find mysql’s actual error log and zero it out (step 3 above) and then copy/paste it here.

Ken_Scott · February 1, 2024, 8:46pm

This is the MySQL error log:

2024-02-01T20:45:31.866610Z 2 [Note] WSREP: Prepared SST/IST request: xtrabackup-v2|172.26.0.12:4444/xtrabackup_sst//1
2024-02-01T20:45:31.866680Z 2 [Note] WSREP: Auto Increment Offset/Increment re-align with cluster membership change (Offset: 1 → 3) (Increment: 1 → 4)
2024-02-01T20:45:31.866796Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2024-02-01T20:45:31.866868Z 2 [Note] WSREP: Assign initial position for certification: 7149092259, protocol version: 4
2024-02-01T20:45:31.867012Z 0 [Note] WSREP: Service thread queue flushed.
2024-02-01T20:45:31.867109Z 2 [Note] WSREP: Check if state gap can be serviced using IST
2024-02-01T20:45:31.867158Z 2 [Note] WSREP: Local UUID: b64bc893-c142-11ee-a085-a68742f583c6 != Group UUID: 3a0118b7-6c7b-11eb-935d-5fcc5d08be87
2024-02-01T20:45:31.867277Z 2 [Note] WSREP: State gap can’t be serviced using IST. Switching to SST
2024-02-01T20:45:31.867316Z 2 [Note] WSREP: Failed to prepare for incremental state transfer: Local state UUID (b64bc893-c142-11ee-a085-a68742f583c6) does not match group state UUID (3a0118b7-6c7b-11eb-935d-5fcc5d08be87): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():518. IST will be unavailable.
2024-02-01T20:45:31.869147Z 0 [Note] WSREP: Member 2.0 (pxc2) requested state transfer from ‘any’. Selected 0.0 (pxc3)(SYNCED) as donor.
2024-02-01T20:45:31.869190Z 0 [Note] WSREP: Shifting PRIMARY → JOINER (TO: 7149092265)
2024-02-01T20:45:31.869249Z 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2024-02-01T20:45:31.869305Z 2 [Note] WSREP: GCache history reset: b64bc893-c142-11ee-a085-a68742f583c6:0 → 3a0118b7-6c7b-11eb-935d-5fcc5d08be87:7149092259
2024-02-01T20:45:32.498536Z WSREP_SST: [INFO] Streaming with xbstream
2024-02-01T20:45:32.510815Z WSREP_SST: [INFO] Proceeding with SST…
2024-02-01T20:45:32.559746Z WSREP_SST: [INFO] …Waiting for SST streaming to complete!
2024-02-01T20:45:33.417683Z 0 [Note] WSREP: (d61ae3cb, ‘tcp://0.0.0.0:4567’) turning message relay requesting off
2024-02-01T20:45:43.401239Z WSREP_SST: [ERROR] ******************* FATAL ERROR **********************
2024-02-01T20:45:43.403995Z WSREP_SST: [ERROR] xtrabackup_checkpoints missing. xtrabackup/SST failed on DONOR. Check DONOR log
2024-02-01T20:45:43.406751Z WSREP_SST: [ERROR] ******************************************************
2024-02-01T20:45:43.409748Z WSREP_SST: [ERROR] Cleanup after exit with status:2
2024-02-01T20:45:43.428575Z 0 [Warning] WSREP: 0.0 (pxc3): State transfer to 2.0 (pxc2) failed: -22 (Invalid argument)
2024-02-01T20:45:43.428622Z 0 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():811: Will never receive state. Need to abort.
2024-02-01T20:45:43.428677Z 0 [Note] WSREP: gcomm: terminating thread
2024-02-01T20:45:43.428706Z 0 [Note] WSREP: gcomm: joining thread
2024-02-01T20:45:43.428821Z 0 [Note] WSREP: gcomm: closing backend
2024-02-01T20:45:43.429438Z 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(NON_PRIM,225a4946,188)
memb {
d61ae3cb,0
}
joined {
}
left {
}
partitioned {
225a4946,0
38921ada,0
f23c4ba4,0
}
)
2024-02-01T20:45:43.429521Z 0 [Note] WSREP: Current view of cluster as seen by this node
view ((empty))
2024-02-01T20:45:43.429868Z 0 [Note] WSREP: gcomm: closed
2024-02-01T20:45:43.429903Z 0 [Note] WSREP: /usr/sbin/mysqld: Terminated.
Signal 15 (TERM) caught by ps (3.3.16).
ps:ps/display.c:66: please report this bug
2024-02-01T20:45:43.432109Z WSREP_SST: [ERROR] Removing /var/lib/mysql//.sst/xtrabackup_galera_info file due to signal
2024-02-01T20:45:43.436861Z WSREP_SST: [ERROR] Removing file due to signal

matthewb · February 1, 2024, 10:22pm

On both the donor and the joiner, there should be additional xtrabackup logs that were created as a result of the SST process. Please provide both of those. Since the error occurred within seconds of starting the SST process, I’m leaning towards this being a port issue. Have you confirmed that 4444, 4567, and 4566 are open between all nodes?

Ken_Scott · February 2, 2024, 2:11pm

Yes all ports are open. This issue starts on nodes that had been running for 3 years and only occurred after rebooting.

Ken_Scott · February 2, 2024, 2:13pm

I’m not find any addition logs, are they are stored in a different location?

Ken_Scott · February 2, 2024, 2:28pm

requesting node:

2024-02-02T14:27:06.968274Z 0 [Note] WSREP: Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role ‘joiner’ --address ‘172.26.0.12’ --datadir ‘/var/lib/mysql/’ --defaults-file ‘/etc/mysql/my.cnf’ --defaults-group-suffix ‘’ --parent ‘1158781’ --mysqld-version ‘5.7.44-48-57’ ‘’ )
2024-02-02T14:27:07.474380Z WSREP_SST: [INFO] Streaming with xbstream
2024-02-02T14:27:07.977430Z 2 [Note] WSREP: Prepared SST/IST request: xtrabackup-v2|172.26.0.12:4444/xtrabackup_sst//1
2024-02-02T14:27:07.977481Z 2 [Note] WSREP: Auto Increment Offset/Increment re-align with cluster membership change (Offset: 1 → 2) (Increment: 1 → 4)
2024-02-02T14:27:07.977545Z 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2024-02-02T14:27:07.977595Z 2 [Note] WSREP: Assign initial position for certification: 7155604377, protocol version: 4
2024-02-02T14:27:07.977722Z 0 [Note] WSREP: Service thread queue flushed.
2024-02-02T14:27:07.977783Z 2 [Note] WSREP: Check if state gap can be serviced using IST
2024-02-02T14:27:07.977827Z 2 [Note] WSREP: Local UUID: fa7ffb4b-c1d6-11ee-a672-fe91ed2bc9b3 != Group UUID: 3a0118b7-6c7b-11eb-935d-5fcc5d08be87
2024-02-02T14:27:07.977896Z 2 [Note] WSREP: State gap can’t be serviced using IST. Switching to SST
2024-02-02T14:27:07.977911Z 2 [Note] WSREP: Failed to prepare for incremental state transfer: Local state UUID (fa7ffb4b-c1d6-11ee-a672-fe91ed2bc9b3) does not match group state UUID (3a0118b7-6c7b-11eb-935d-5fcc5d08be87): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():518. IST will be unavailable.
2024-02-02T14:27:07.979327Z 0 [Note] WSREP: Member 1.0 (pxc2) requested state transfer from ‘any’. Selected 0.0 (pxc3)(SYNCED) as donor.
2024-02-02T14:27:07.979366Z 0 [Note] WSREP: Shifting PRIMARY → JOINER (TO: 7155604387)
2024-02-02T14:27:07.979412Z 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2024-02-02T14:27:07.979457Z 2 [Note] WSREP: GCache history reset: fa7ffb4b-c1d6-11ee-a672-fe91ed2bc9b3:0 → 3a0118b7-6c7b-11eb-935d-5fcc5d08be87:7155604377
2024-02-02T14:27:08.616596Z WSREP_SST: [INFO] Streaming with xbstream
2024-02-02T14:27:08.626501Z WSREP_SST: [INFO] Proceeding with SST…
2024-02-02T14:27:08.674417Z WSREP_SST: [INFO] …Waiting for SST streaming to complete!
2024-02-02T14:27:09.466921Z 0 [Note] WSREP: (23e88f89, ‘tcp://0.0.0.0:4567’) turning message relay requesting off
2024-02-02T14:27:19.530858Z WSREP_SST: [ERROR] ******************* FATAL ERROR **********************
2024-02-02T14:27:19.533523Z WSREP_SST: [ERROR] xtrabackup_checkpoints missing. xtrabackup/SST failed on DONOR. Check DONOR log
2024-02-02T14:27:19.536076Z WSREP_SST: [ERROR] ******************************************************
2024-02-02T14:27:19.538683Z 0 [Warning] WSREP: 0.0 (pxc3): State transfer to 1.0 (pxc2) failed: -22 (Invalid argument)
2024-02-02T14:27:19.538770Z 0 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():811: Will never receive state. Need to abort.
2024-02-02T14:27:19.538800Z 0 [Note] WSREP: gcomm: terminating thread
2024-02-02T14:27:19.538834Z 0 [Note] WSREP: gcomm: joining thread
2024-02-02T14:27:19.538995Z 0 [Note] WSREP: gcomm: closing backend
2024-02-02T14:27:19.539639Z 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(NON_PRIM,225a4946,192)
memb {
23e88f89,0
}
joined {
}
left {
}
partitioned {
225a4946,0
38921ada,0
f23c4ba4,0
}
)
2024-02-02T14:27:19.539728Z 0 [Note] WSREP: Current view of cluster as seen by this node
view ((empty))
2024-02-02T14:27:19.538984Z WSREP_SST: [ERROR] Cleanup after exit with status:2
2024-02-02T14:27:19.540030Z 0 [Note] WSREP: gcomm: closed
2024-02-02T14:27:19.540065Z 0 [Note] WSREP: /usr/sbin/mysqld: Terminated.
2024-02-02T14:27:19.544828Z WSREP_SST: [ERROR] Removing /var/lib/mysql//.sst/xtrabackup_galera_info file due to signal
2024-02-02T14:27:19.550057Z WSREP_SST: [ERROR] Removing file due to signal

Doner Node

2024-02-02T14:27:07.980161Z 0 [Note] WSREP: Member 1.0 (pxc2) requested state transfer from ‘any’. Selected 0.0 (pxc3)(SYNCED) as donor.
2024-02-02T14:27:09.138683Z 0 [Note] WSREP: (f23c4ba4, ‘tcp://0.0.0.0:4567’) turning message relay requesting off
2024-02-02T14:27:19.539501Z 0 [Warning] WSREP: 0.0 (pxc3): State transfer to 1.0 (pxc2) failed: -22 (Invalid argument)
2024-02-02T14:27:19.540970Z 0 [Note] WSREP: Member 0.0 (pxc3) synced with group.
2024-02-02T14:27:19.540997Z 0 [Note] WSREP: declaring 225a4946 at tcp://172.26.0.11:4567 stable
2024-02-02T14:27:19.541034Z 0 [Note] WSREP: declaring 38921ada at tcp://172.26.0.9:4567 stable
2024-02-02T14:27:19.541048Z 0 [Note] WSREP: forgetting 23e88f89 (tcp://172.26.0.12:4567)
2024-02-02T14:27:19.541602Z 0 [Note] WSREP: Node 225a4946 state primary
2024-02-02T14:27:19.541896Z 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(PRIM,225a4946,193)
memb {
225a4946,0
38921ada,0
f23c4ba4,0
}
joined {
}
left {
}
partitioned {
23e88f89,0
}
)
2024-02-02T14:27:19.541915Z 0 [Note] WSREP: Save the discovered primary-component to disk
2024-02-02T14:27:19.542449Z 0 [Note] WSREP: forgetting 23e88f89 (tcp://172.26.0.12:4567)
2024-02-02T14:27:19.542532Z 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 2, memb_num = 3
2024-02-02T14:27:19.542571Z 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2024-02-02T14:27:19.543136Z 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 2c00995b-c1d7-11ee-bda8-5aee0bbb8b41
2024-02-02T14:27:19.543379Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: 2c00995b-c1d7-11ee-bda8-5aee0bbb8b41 from 0 (pxc3)
2024-02-02T14:27:19.543394Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: 2c00995b-c1d7-11ee-bda8-5aee0bbb8b41 from 1 (pxc5)
2024-02-02T14:27:19.543406Z 0 [Note] WSREP: STATE EXCHANGE: got state msg: 2c00995b-c1d7-11ee-bda8-5aee0bbb8b41 from 2 (pxc6)
2024-02-02T14:27:19.543423Z 0 [Note] WSREP: Quorum results:
version = 6,
component = PRIMARY,
conf_id = 181,
members = 3/3 (primary/total),
act_id = 7155604420,
last_appl. = 7155604297,
protocols = 0/9/3 (gcs/repl/appl),
group UUID = 3a0118b7-6c7b-11eb-935d-5fcc5d08be87
2024-02-02T14:27:19.543437Z 0 [Note] WSREP: Flow-control interval: [173, 173]
2024-02-02T14:27:19.543610Z 6 [Note] WSREP: REPL Protocols: 9 (4, 2)
2024-02-02T14:27:19.543646Z 6 [Note] WSREP: REPL Protocols: 9 (4, 2)
2024-02-02T14:27:19.543665Z 6 [Note] WSREP: New cluster view: global state: 3a0118b7-6c7b-11eb-935d-5fcc5d08be87:7155604420, view# 182: Primary, number of nodes: 3, my index: 2, protocol version 3
2024-02-02T14:27:19.543674Z 6 [Note] WSREP: Setting wsrep_ready to true
2024-02-02T14:27:19.543684Z 6 [Note] WSREP: Auto Increment Offset/Increment re-align with cluster membership change (Offset: 4 → 3) (Increment: 4 → 3)
2024-02-02T14:27:19.543784Z 6 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2024-02-02T14:27:19.545839Z 6 [Note] WSREP: Assign initial position for certification: 7155604420, protocol version: 4
2024-02-02T14:27:19.545955Z 0 [Note] WSREP: Service thread queue flushed.
2024-02-02T14:27:24.641379Z 0 [Note] WSREP: cleaning up 23e88f89 (tcp://172.26.0.12:4567)

matthewb · February 3, 2024, 9:33pm

We need to see the error logs of the xtrabackup process. They are inside the $datadir of MySQL on both sides.

Ken_Scott · February 5, 2024, 2:40pm

Unfortunately there are no xtrabackup logs on the requesting server and on the doner the logs are 3 years old and include no updated info.

matthewb · February 5, 2024, 4:27pm

I suggest that you perform a manual SST by taking a xtrabackup backup of node1 (use --galera-info), transfer to node2, prepare as usual, recreate the grastat.dat, and start the node. It should connect and IST from node1.

Topic		Replies	Views
cluster not start after reboot Percona XtraDB Cluster 5.x	1	1805	May 17, 2019
Node failing to join cluster after downtime Percona XtraDB Cluster 8.x	1	810	January 18, 2022
Not able to join the node to the cluster Percona XtraDB Cluster 8.x mysql , percona	15	2672	February 25, 2021
HI EVERYONE! i have a problem wwhen i restart my system ( i have 2 node of Percona XtraDB Cluster 5.7, rehl 7.9 enterprise). i can't start mysql again Percona XtraDB Cluster 5.x community , troubleshooting , mysql , percona	2	1048	September 15, 2021
Percona Mysql cluster working but systemctl status : failed Percona XtraDB Cluster 8.x mysql , percona	2	2198	September 26, 2022

Can't reconnect to cluster after reboot

Session variables

Other buffers and caches

Related topics