MySQL cluster - Will never receive state. Need to abort

Hi all,
I’m setting up a test cluster and running into some strange trouble.

I set up this cluster on VirtualBox using the same Ubuntu 22.04 VMs that I’d use
on vSphere.

Below is the mysql_galera_cluster.cnf:

[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
port=3306
# Galera Cluster Config
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="mysql_cluster"
wsrep_cluster_address="gcomm://192.168.7.71,192.168.7.72,192.168.7.73"
wsrep_sst_method=rsync
wsrep_node_address="192.168.7.71"
wsrep_node_name="mysql-clusterdev1"

As it should be, the only differences between the “primary” node and the joiners are the
“wsrep_node_address” and “wsrep_node_name” vars.
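
One way to double-check that claim after copying files between hypervisors is to diff the option values of two nodes' config files. The following is only a minimal sketch: `parse_cnf` and `differing_keys` are hypothetical helper names, the parser handles nothing beyond plain `key=value` lines, and the two fragments are shortened stand-ins for the real files.

```python
def parse_cnf(text):
    """Parse simple key=value lines from a my.cnf-style fragment."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and section headers such as [mysqld].
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def differing_keys(cnf_a, cnf_b):
    """Return the sorted option names whose values differ between two fragments."""
    a, b = parse_cnf(cnf_a), parse_cnf(cnf_b)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


# Shortened stand-in for the first node's file (values taken from the post).
NODE1 = """
[mysqld]
wsrep_cluster_name="mysql_cluster"
wsrep_sst_method=rsync
wsrep_node_address="192.168.7.71"
wsrep_node_name="mysql-clusterdev1"
"""

# Stand-in for a joiner: same file with only the node-specific values changed.
NODE2 = NODE1.replace("192.168.7.71", "192.168.7.72").replace("clusterdev1", "clusterdev2")

print(differing_keys(NODE1, NODE2))
```

With real files you would read each node's mysql_galera_cluster.cnf into a string and pass both to `differing_keys`; anything beyond the two node-specific options in the result deserves a second look.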

On VirtualBox the cluster worked properly, so I moved to the company’s hypervisor,
but there something changed.

Although I used the same machines, scripts and conf file, changing some values accordingly, the primary came up correctly, but only with

mysqld -umysql --wsrep-new-cluster -D

When I tried to connect the other nodes I got immediate errors. Here are the logs from the joiner:

2024-05-17T14:09:18.587748Z 0 [Warning] [MY-011070] [Server] 'binlog_format' is deprecated and will be removed in a future release.
2024-05-17T14:09:18.590254Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.36-26.18) starting as process 16822
2024-05-17T14:09:18.606108Z 0 [System] [MY-000000] [WSREP] L: Loading provider /usr/lib/galera/libgalera_smm.so initial position: 00000000-0000-0000-0000-000000000000:-1
2024-05-17T14:09:18.606175Z 0 [System] [MY-000000] [WSREP] P: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
2024-05-17T14:09:18.607164Z 0 [System] [MY-000000] [WSREP] P: wsrep_load(): Galera 4.18(r0bc393fb) by Codership Oy <info@codership.com> loaded successfully.
2024-05-17T14:09:18.607249Z 0 [System] [MY-000000] [WSREP] L: Initializing event service v1
2024-05-17T14:09:18.607336Z 0 [System] [MY-000000] [WSREP] P: CRC-32C: using 64-bit x86 acceleration.
2024-05-17T14:09:18.607735Z 0 [System] [MY-000000] [WSREP] P: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
2024-05-17T14:09:18.607862Z 0 [System] [MY-000000] [WSREP] P: GCache DEBUG: opened preamble:
Version: 2
UUID: 316d9a20-144d-11ef-8434-92dad2756103
Seqno: -1 - -1
Offset: -1
Synced: 0
2024-05-17T14:09:18.607911Z 0 [System] [MY-000000] [WSREP] P: Recovering GCache ring buffer: version: 2, UUID: 316d9a20-144d-11ef-8434-92dad2756103, offset: -1
2024-05-17T14:09:18.608076Z 0 [System] [MY-000000] [WSREP] P: GCache::RingBuffer initial scan... 0.0% (0/134217752 bytes) complete.
2024-05-17T14:09:18.701668Z 0 [System] [MY-000000] [WSREP] P: GCache::RingBuffer initial scan... 100.0% (134217752/134217752 bytes) complete.
2024-05-17T14:09:18.701853Z 0 [System] [MY-000000] [WSREP] P: Recovering GCache ring buffer: Recovery failed, need to do full reset.
2024-05-17T14:09:18.704527Z 0 [System] [MY-000000] [WSREP] P: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.7.72; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.fc_single_primary = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = true; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 11; socket.checksum = 2; socket.recv_buf_size = auto; socket.send_buf_size = auto;
2024-05-17T14:09:18.722891Z 0 [System] [MY-000000] [WSREP] Start replication
2024-05-17T14:09:18.723069Z 0 [System] [MY-000000] [WSREP] L: Connecting with bootstrap option: 0
2024-05-17T14:09:18.723114Z 0 [System] [MY-000000] [WSREP] P: Setting GCS initial position to 00000000-0000-0000-0000-000000000000:-1
2024-05-17T14:09:18.723205Z 0 [System] [MY-000000] [WSREP] P: protonet asio version 0
2024-05-17T14:09:18.723292Z 0 [System] [MY-000000] [WSREP] P: Using CRC-32C for message checksums.
2024-05-17T14:09:18.723330Z 0 [System] [MY-000000] [WSREP] P: backend: asio
2024-05-17T14:09:18.723453Z 0 [System] [MY-000000] [WSREP] P: gcomm thread scheduling priority set to other:0
2024-05-17T14:09:18.723663Z 0 [System] [MY-000000] [WSREP] P: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2024-05-17T14:09:18.723737Z 0 [System] [MY-000000] [WSREP] P: restore pc from disk failed
2024-05-17T14:09:18.723889Z 0 [System] [MY-000000] [WSREP] P: GMCast version 0
2024-05-17T14:09:18.724084Z 0 [System] [MY-000000] [WSREP] P: (0d286b44-947b, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2024-05-17T14:09:18.724107Z 0 [System] [MY-000000] [WSREP] P: (0d286b44-947b, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2024-05-17T14:09:18.724346Z 0 [System] [MY-000000] [WSREP] P: EVS version 1
2024-05-17T14:09:18.724472Z 0 [System] [MY-000000] [WSREP] P: gcomm: connecting to group 'mysql_cluster', peer '192.168.7.71:,192.168.7.72:,192.168.7.73:'
2024-05-17T14:09:18.725321Z 0 [System] [MY-000000] [WSREP] P: (0d286b44-947b, 'tcp://0.0.0.0:4567') Found matching local endpoint for a connection, blacklisting address tcp://192.168.7.72:4567
2024-05-17T14:09:18.725863Z 0 [System] [MY-000000] [WSREP] P: (0d286b44-947b, 'tcp://0.0.0.0:4567') connection established to ba087e3c-84c4 tcp://192.168.7.71:4567
2024-05-17T14:09:19.225984Z 0 [System] [MY-000000] [WSREP] P: EVS version upgrade 0 -> 1
2024-05-17T14:09:19.226095Z 0 [System] [MY-000000] [WSREP] P: declaring ba087e3c-84c4 at tcp://192.168.7.71:4567 stable
2024-05-17T14:09:19.226115Z 0 [System] [MY-000000] [WSREP] P: PC protocol upgrade 0 -> 1
2024-05-17T14:09:19.226317Z 0 [System] [MY-000000] [WSREP] P: Node ba087e3c-84c4 state prim
2024-05-17T14:09:19.226619Z 0 [System] [MY-000000] [WSREP] P: view(view_id(PRIM,0d286b44-947b,2) memb {
        0d286b44-947b,0
        ba087e3c-84c4,0
} joined {
} left {
} partitioned {
})
2024-05-17T14:09:19.226677Z 0 [System] [MY-000000] [WSREP] P: save pc into disk
2024-05-17T14:09:19.229008Z 0 [System] [MY-000000] [WSREP] P: discarding pending addr without UUID: tcp://192.168.7.73:4567
2024-05-17T14:09:19.725301Z 0 [System] [MY-000000] [WSREP] P: gcomm: connected
2024-05-17T14:09:19.725666Z 0 [System] [MY-000000] [WSREP] P: Changing maximum packet size to 64500, resulting msg size: 32636
2024-05-17T14:09:19.725861Z 0 [System] [MY-000000] [WSREP] P: Shifting CLOSED -> OPEN (TO: 0)
2024-05-17T14:09:19.725883Z 0 [System] [MY-000000] [WSREP] P: Opened channel 'mysql_cluster'
2024-05-17T14:09:19.725989Z 0 [System] [MY-000000] [WSREP] P: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
2024-05-17T14:09:19.726145Z 0 [System] [MY-000000] [WSREP] P: STATE_EXCHANGE: sent state UUID: 0dc15a39-1457-11ef-8509-864a5ec4af57
2024-05-17T14:09:19.726474Z 0 [System] [MY-000000] [WSREP] P: STATE EXCHANGE: sent state msg: 0dc15a39-1457-11ef-8509-864a5ec4af57
2024-05-17T14:09:19.726636Z 0 [System] [MY-000000] [WSREP] P: STATE EXCHANGE: got state msg: 0dc15a39-1457-11ef-8509-864a5ec4af57 from 0 (mysql-clusterdev2)
2024-05-17T14:09:19.726792Z 2 [System] [MY-000000] [WSREP] Starting rollbacker thread 2
2024-05-17T14:09:19.726820Z 0 [System] [MY-000000] [WSREP] P: STATE EXCHANGE: got state msg: 0dc15a39-1457-11ef-8509-864a5ec4af57 from 1 (mysql-clusterdev1)
2024-05-17T14:09:19.726910Z 0 [System] [MY-000000] [WSREP] P: Quorum results:
        version    = 6,
        component  = PRIMARY,
        conf_id    = 1,
        members    = 1/2 (joined/total),
        act_id     = 62,
        last_appl. = 61,
        protocols  = 3/11/7 (gcs/repl/appl),
        vote policy= 0,
        group UUID = 316d9a20-144d-11ef-8434-92dad2756103
2024-05-17T14:09:19.726905Z 1 [System] [MY-000000] [WSREP] Starting applier thread 1
2024-05-17T14:09:19.726988Z 0 [System] [MY-000000] [WSREP] P: Flow-control interval: [23, 23]
2024-05-17T14:09:19.727008Z 0 [System] [MY-000000] [WSREP] P: Shifting OPEN -> PRIMARY (TO: 63)
2024-05-17T14:09:19.727072Z 1 [System] [MY-000000] [WSREP] P: ####### processing CC 63, local, ordered
2024-05-17T14:0 1 [System] [MY-000000] [WSREP] P: Process first view: 316d9a20-144d-11ef-8434-92dad2756103 my uuid: 0d286b44-1457-11ef-947b-4a931d67fa57
2024-05-17T14:09:19.727135Z 1 [System] [MY-000000] [WSREP] L: Server mysql-clusterdev2 connected to cluster at position 316d9a20-144d-11ef-8434-92dad2756103:63 with ID 0d286b44-1457-11ef-947b-4a931d67fa57
2024-05-17T14:09:19.727179Z 1 [System] [MY-000000] [WSREP] Server status change disconnected -> connected
2024-05-17T14:09:19.727242Z 1 [System] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-05-17T14:09:19.727273Z 1 [System] [MY-000000] [WSREP] P: ####### My UUID: 0d286b44-1457-11ef-947b-4a931d67fa57
2024-05-17T14:09:19.727285Z 1 [System] [MY-000000] [WSREP] P: Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 11), state transfer needed: yes
2024-05-17T14:09:19.727350Z 0 [System] [MY-000000] [WSREP] P: Service thread queue flushed.
2024-05-17T14:09:19.727434Z 1 [System] [MY-000000] [WSREP] P: ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: -1
2024-05-17T14:09:19.727472Z 1 [System] [MY-000000] [WSREP] P: State transfer required:
        Group state: 316d9a20-144d-11ef-8434-92dad2756103:63
        Local state: 00000000-0000-0000-0000-000000000000:-1
2024-05-17T14:09:19.727499Z 1 [System] [MY-000000] [WSREP] Server status change connected -> joiner
2024-05-17T14:09:19.727522Z 1 [System] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-05-17T14:09:19.727677Z 0 [System] [MY-000000] [WSREP] Running: 'wsrep_sst_rsync --role 'joiner' --address '192.168.7.72' --datadir '/var/lib/mysql/'   --parent '16824' --mysqld-version '8.0.36-26.18' --protocol '7' --plugin-dir '/usr/lib/mysql/plugin/'   ''  '''
2024-05-17T14:09:19.913407Z 1 [System] [MY-000000] [WSREP] P: ####### IST uuid:00000000-0000-0000-0000-000000000000 f: 0, l: 63, STRv: 3
2024-05-17T14:09:19.913576Z 1 [System] [MY-000000] [WSREP] P: IST receiver addr using tcp://192.168.7.72:4568
2024-05-17T14:09:19.913770Z 1 [System] [MY-000000] [WSREP] P: Prepared IST receiver for 0-63, listening at: tcp://192.168.7.72:4568
2024-05-17T14:09:19.914316Z 0 [System] [MY-000000] [WSREP] P: Member 0.0 (mysql-clusterdev2) requested state transfer from '*any*'. Selected 1.0 (mysql-clusterdev1)(SYNCED) as donor.
2024-05-17T14:09:19.914365Z 0 [System] [MY-000000] [WSREP] P: Shifting PRIMARY -> JOINER (TO: 63)
2024-05-17T14:09:19.914404Z 1 [System] [MY-000000] [WSREP] P: Requesting state transfer: success, donor: 1
2024-05-17T14:09:19.914430Z 1 [System] [MY-000000] [WSREP] P: Resetting GCache seqno map due to different histories.
2024-05-17T14:09:19.914442Z 1 [System] [MY-000000] [WSREP] P: GCache history reset: 316d9a20-144d-11ef-8434-92dad2756103:0 -> 316d9a20-144d-11ef-8434-92dad2756103:63
2024-05-17T14:09:19.917193Z 0 [Warning] [MY-000000] [WSREP] P: 1.0 (mysql-clusterdev1): State transfer to 0.0 (mysql-clusterdev2) failed: -78 (Remote address changed)
2024-05-17T14:09:19.917248Z 0 [ERROR] [MY-000000] [WSREP] P: ./gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():1178: Will never receive state. Need to abort.
2024-05-17T14:09:19.917266Z 0 [System] [MY-000000] [WSREP] P: gcomm: terminating thread
2024-05-17T14:09:19.917419Z 0 [System] [MY-000000] [WSREP] P: gcomm: joining thread
2024-05-17T14:09:19.917615Z 0 [System] [MY-000000] [WSREP] P: gcomm: closing backend
2024-05-17T14:09:20.922168Z 0 [System] [MY-000000] [WSREP] P: view(view_id(NON_PRIM,0d286b44-947b,2) memb {
        0d286b44-947b,0
} joined {
} left {
} partitioned {
        ba087e3c-84c4,0
})
2024-05-17T14:09:20.922287Z 0 [System] [MY-000000] [WSREP] P: PC protocol downgrade 1 -> 0
2024-05-17T14:09:20.922301Z 0 [System] [MY-000000] [WSREP] P: view((empty))
2024-05-17T14:09:20.922501Z 0 [System] [MY-000000] [WSREP] P: gcomm: closed
2024-05-17T14:09:20.922532Z 0 [System] [MY-000000] [WSREP] P: mysqld: Terminated.
WSREP_SST: [ERROR] Parent mysqld process (PID:16824) terminated unexpectedly. (20240517 14:09:20.970)
/usr//bin/wsrep_sst_rsync: line 637: kill: (-16824) - No such process
WSREP_SST: [INFO] Joiner cleanup. rsync PID: 16991 (20240517 14:09:20.973)
WSREP_SST: [INFO] Joiner cleanup done. (20240517 14:09:21.481)

And on the donor side I find:

2024-05-17T14:09:18.725618Z 0 [System] [MY-000000] [WSREP] P: (ba087e3c-84c4, 'tcp://0.0.0.0:4567') connection established to 0d286b44-947b tcp://192.168.7.72:4567
2024-05-17T14:09:19.225930Z 0 [System] [MY-000000] [WSREP] P: declaring 0d286b44-947b at tcp://192.168.7.72:4567 stable
2024-05-17T14:09:19.226220Z 0 [System] [MY-000000] [WSREP] P: Node ba087e3c-84c4 state prim
2024-05-17T14:09:19.229141Z 0 [System] [MY-000000] [WSREP] P: view(view_id(PRIM,0d286b44-947b,2) memb {
	0d286b44-947b,0
	ba087e3c-84c4,0
} joined {
} left {
} partitioned {
})
2024-05-17T14:09:19.229194Z 0 [System] [MY-000000] [WSREP] P: save pc into disk
2024-05-17T14:09:19.232848Z 0 [System] [MY-000000] [WSREP] P: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
2024-05-17T14:09:19.232881Z 0 [System] [MY-000000] [WSREP] P: STATE EXCHANGE: Waiting for state UUID.
2024-05-17T14:09:19.726564Z 0 [System] [MY-000000] [WSREP] P: STATE EXCHANGE: sent state msg: 0dc15a39-1457-11ef-8509-864a5ec4af57
2024-05-17T14:09:19.726649Z 0 [System] [MY-000000] [WSREP] P: STATE EXCHANGE: got state msg: 0dc15a39-1457-11ef-8509-864a5ec4af57 from 0 (mysql-clusterdev2)
2024-05-17T14:09:19.726683Z 0 [System] [MY-000000] [WSREP] P: STATE EXCHANGE: got state msg: 0dc15a39-1457-11ef-8509-864a5ec4af57 from 1 (mysql-clusterdev1)
2024-05-17T14:09:19.726707Z 0 [System] [MY-000000] [WSREP] P: Quorum results:
	version    = 6,
	component  = PRIMARY,
	conf_id    = 1,
	members    = 1/2 (joined/total),
	act_id     = 62,
	last_appl. = 61,
	protocols  = 3/11/7 (gcs/repl/appl),
	vote policy= 0,
	group UUID = 316d9a20-144d-11ef-8434-92dad2756103
2024-05-17T14:09:19.726826Z 0 [System] [MY-000000] [WSREP] P: Flow-control interval: [23, 23]
2024-05-17T14:09:19.726918Z 2 [System] [MY-000000] [WSREP] P: ####### processing CC 63, local, ordered
2024-05-17T14:09:19.726954Z 2 [System] [MY-000000] [WSREP] P: ####### My UUID: ba087e3c-1456-11ef-84c4-93f70e1cf1db
2024-05-17T14:09:19.726965Z 2 [System] [MY-000000] [WSREP] P: Skipping cert index reset
2024-05-17T14:09:19.726975Z 2 [System] [MY-000000] [WSREP] P: REPL Protocols: 11 (6)
2024-05-17T14:09:19.726984Z 2 [System] [MY-000000] [WSREP] P: ####### Adjusting cert position: 62 -> 63
2024-05-17T14:09:19.727063Z 0 [System] [MY-000000] [WSREP] P: Service thread queue flushed.
2024-05-17T14:09:19.735120Z 2 [System] [MY-000000] [WSREP] L: ================================================
View:
  id: 316d9a20-144d-11ef-8434-92dad2756103:63
  status: primary
  protocol_version: 7
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: no
  own_index: 1
  members(2):
	0: 0d286b44-1457-11ef-947b-4a931d67fa57, mysql-clusterdev2
	1: ba087e3c-1456-11ef-84c4-93f70e1cf1db, mysql-clusterdev1
=================================================
2024-05-17T14:09:19.735185Z 2 [System] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-05-17T14:09:19.743736Z 2 [System] [MY-000000] [WSREP] P: Lowest cert index boundary for CC from group: 63
2024-05-17T14:09:19.743784Z 2 [System] [MY-000000] [WSREP] P: Min available from gcache for CC from group: 1
2024-05-17T14:09:19.914334Z 0 [System] [MY-000000] [WSREP] P: Member 0.0 (mysql-clusterdev2) requested state transfer from '*any*'. Selected 1.0 (mysql-clusterdev1)(SYNCED) as donor.
2024-05-17T14:09:19.914395Z 0 [System] [MY-000000] [WSREP] P: Shifting SYNCED -> DONOR/DESYNCED (TO: 63)
2024-05-17T14:09:19.914535Z 2 [System] [MY-000000] [WSREP] P: Detected STR version: 1, req_len: 115, req: STRv1
2024-05-17T14:09:19.914576Z 2 [System] [MY-000000] [WSREP] P: Cert index preload: 63 -> 63
2024-05-17T14:09:19.915056Z 2 [System] [MY-000000] [WSREP] Server status change synced -> donor
2024-05-17T14:09:19.915089Z 2 [System] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-05-17T14:09:19.915145Z 0 [System] [MY-000000] [WSREP] P: async IST sender starting to serve tcp://192.168.7.72:4568 sending 63-63, preload starts from 63
2024-05-17T14:09:19.915236Z 0 [System] [MY-000000] [WSREP] Initiating SST/IST transfer on DONOR side (wsrep_sst_rsync --role 'donor' --address '192.168.7.72:4444/rsync_sst' --local-port '3306' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/'   --mysqld-version '8.0.36-26.18' --protocol '7' --plugin-dir '/usr/lib/mysql/plugin/'   ''  --binlog-index 'binlog.index' --gtid 316d9a20-144d-11ef-8434-92dad2756103:63 --local-gtid 316d9a20-144d-11ef-8434-92dad2756103:0 --server-id 1 --server-uuid 0cfbb087-11fb-11ef-8e7d-0050569e5917 )
2024-05-17T14:09:19.915503Z 0 [System] [MY-000000] [WSREP] P: IST sender 63 -> 63
2024-05-17T14:09:19.916100Z 0 [ERROR] [MY-000000] [WSREP] posix_spawnp(wsrep_sst_rsync --role 'donor' --address '192.168.7.72:4444/rsync_sst' --local-port '3306' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/'   --mysqld-version '8.0.36-26.18' --protocol '7' --plugin-dir '/usr/lib/mysql/plugin/'   ''  --binlog-index 'binlog.index' --gtid 316d9a20-144d-11ef-8434-92dad2756103:63 --local-gtid 316d9a20-144d-11ef-8434-92dad2756103:0 --server-id 1 --server-uuid 0cfbb087-11fb-11ef-8e7d-0050569e5917 ) failed: 13 (Permission denied)
2024-05-17T14:09:19.916171Z 0 [ERROR] [MY-000000] [WSREP] Failed to start SST process: -13
2024-05-17T14:09:19.916239Z 2 [System] [MY-000000] [WSREP] sst_donor_thread signaled with 13
2024-05-17T14:09:19.916374Z 12 [ERROR] [MY-000000] [WSREP] Failed to execute: wsrep_sst_rsync --role 'donor' --address '192.168.7.72:4444/rsync_sst' --local-port '3306' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/'   --mysqld-version '8.0.36-26.18' --protocol '7' --plugin-dir '/usr/lib/mysql/plugin/'   ''  --binlog-index 'binlog.index' --gtid 316d9a20-144d-11ef-8434-92dad2756103:63 --local-gtid 316d9a20-144d-11ef-8434-92dad2756103:0 --server-id 1 --server-uuid 0cfbb087-11fb-11ef-8e7d-0050569e5917  : 0 (Success)
2024-05-17T14:09:19.916455Z 12 [System] [MY-000000] [WSREP] L: SST sent: 00000000-0000-0000-0000-000000000000:-1
2024-05-17T14:09:19.916469Z 12 [System] [MY-000000] [WSREP] Server status change donor -> joined
2024-05-17T14:09:19.916552Z 12 [System] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-05-17T14:09:19.916658Z 12 [ERROR] [MY-000000] [WSREP] Command did not run: wsrep_sst_rsync --role 'donor' --address '192.168.7.72:4444/rsync_sst' --local-port '3306' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/'   --mysqld-version '8.0.36-26.18' --protocol '7' --plugin-dir '/usr/lib/mysql/plugin/'   ''  --binlog-index 'binlog.index' --gtid 316d9a20-144d-11ef-8434-92dad2756103:63 --local-gtid 316d9a20-144d-11ef-8434-92dad2756103:0 --server-id 1 --server-uuid 0cfbb087-11fb-11ef-8e7d-0050569e5917
2024-05-17T14:09:19.916672Z 12 [System] [MY-000000] [WSREP] Cleaning up SST user.
2024-05-17T14:09:19.917032Z 0 [Warning] [MY-000000] [WSREP] P: 1.0 (mysql-clusterdev1): State transfer to 0.0 (mysql-clusterdev2) failed: -78 (Remote address changed)
2024-05-17T14:09:19.917069Z 0 [System] [MY-000000] [WSREP] P: Shifting DONOR/DESYNCED -> JOINED (TO: 63)
2024-05-17T14:09:19.917160Z 0 [System] [MY-000000] [WSREP] P: Processing event queue:... -nan% (0/0 events) complete.
2024-05-17T14:09:20.917997Z 0 [System] [MY-000000] [WSREP] P: forgetting 0d286b44-947b (tcp://192.168.7.72:4567)
2024-05-17T14:09:20.918019Z 0 [System] [MY-000000] [WSREP] P: Member 1.0 (mysql-clusterdev1) synced with group.
2024-05-17T14:09:20.918063Z 0 [System] [MY-000000] [WSREP] P: Node ba087e3c-84c4 state prim
2024-05-17T14:09:20.918073Z 0 [System] [MY-000000] [WSREP] P: Processing event queue:... 100.0% (1/1 events) complete.
2024-05-17T14:09:20.918084Z 0 [System] [MY-000000] [WSREP] P: Shifting JOINED -> SYNCED (TO: 63)
2024-05-17T14:09:20.918085Z 0 [System] [MY-000000] [WSREP] P: view(view_id(PRIM,ba087e3c-84c4,3) memb {
	ba087e3c-84c4,0
} joined {
} left {
} partitioned {
	0d286b44-947b,0
})
2024-05-17T14:09:20.918115Z 0 [System] [MY-000000] [WSREP] P: save pc into disk
2024-05-17T14:09:20.918173Z 2 [System] [MY-000000] [WSREP] L: Server mysql-clusterdev1 synced with group
2024-05-17T14:09:20.918198Z 2 [System] [MY-000000] [WSREP] Server status change joined -> synced
2024-05-17T14:09:20.918205Z 2 [System] [MY-000000] [WSREP] Synchronized with group, ready for connections
2024-05-17T14:09:20.918211Z 2 [System] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-05-17T14:09:20.921787Z 0 [System] [MY-000000] [WSREP] P: forgetting 0d286b44-947b (tcp://192.168.7.72:4567)
2024-05-17T14:09:20.921825Z 0 [System] [MY-000000] [WSREP] P: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
2024-05-17T14:09:20.921901Z 0 [System] [MY-000000] [WSREP] P: STATE_EXCHANGE: sent state UUID: 0e77d18f-1457-11ef-b504-b2b30ba069be
2024-05-17T14:09:20.921920Z 0 [System] [MY-000000] [WSREP] P: STATE EXCHANGE: sent state msg: 0e77d18f-1457-11ef-b504-b2b30ba069be
2024-05-17T14:09:20.921977Z 0 [System] [MY-000000] [WSREP] P: STATE EXCHANGE: got state msg: 0e77d18f-1457-11ef-b504-b2b30ba069be from 0 (mysql-clusterdev1)
2024-05-17T14:09:20.922011Z 0 [System] [MY-000000] [WSREP] P: Quorum results:
	version    = 6,
	component  = PRIMARY,
	conf_id    = 2,
	members    = 1/1 (joined/total),
	act_id     = 63,
	last_appl. = 61,
	protocols  = 3/11/7 (gcs/repl/appl),
	vote policy= 0,
	group UUID = 316d9a20-144d-11ef-8434-92dad2756103
2024-05-17T14:09:20.922061Z 0 [System] [MY-000000] [WSREP] P: Flow-control interval: [16, 16]
2024-05-17T14:09:20.922113Z 2 [System] [MY-000000] [WSREP] P: ####### processing CC 64, local, ordered
2024-05-17T14:09:20.922134Z 2 [System] [MY-000000] [WSREP] P: ####### My UUID: ba087e3c-1456-11ef-84c4-93f70e1cf1db
2024-05-17T14:09:20.922142Z 2 [System] [MY-000000] [WSREP] P: Skipping cert index reset
2024-05-17T14:09:20.922153Z 2 [System] [MY-000000] [WSREP] P: REPL Protocols: 11 (6)
2024-05-17T14:09:20.922160Z 2 [System] [MY-000000] [WSREP] P: ####### Adjusting cert position: 63 -> 64
2024-05-17T14:09:20.922186Z 0 [System] [MY-000000] [WSREP] P: Service thread queue flushed.
2024-05-17T14:09:20.926879Z 2 [System] [MY-000000] [WSREP] L: ================================================
View:
  id: 316d9a20-144d-11ef-8434-92dad2756103:64
  status: primary
  protocol_version: 7
  capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
  final: no
  own_index: 0
  members(1):
	0: ba087e3c-1456-11ef-84c4-93f70e1cf1db, mysql-clusterdev1
=================================================
2024-05-17T14:09:20.926920Z 2 [System] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2024-05-17T14:09:20.934927Z 2 [System] [MY-000000] [WSREP] P: Lowest cert index boundary for CC from group: 64
2024-05-17T14:09:20.934964Z 2 [System] [MY-000000] [WSREP] P: Min available from gcache for CC from group: 1
2024-05-17T14:09:20.935243Z 0 [System] [MY-000000] [WSREP] P: async IST sender served
2024-05-17T14:09:21.791762Z 0 [System] [MY-000000] [WSREP] P: (ba087e3c-84c4, 'tcp://0.0.0.0:4567') turning message relay requesting off
2024-05-17T14:09:26.292383Z 0 [System] [MY-000000] [WSREP] P:  cleaning up 0d286b44-947b (tcp://192.168.7.72:4567)
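
With logs this long, it can help to filter on the severity field before reading everything else. The snippet below is a rough sketch, assuming the line layout seen above (timestamp, thread id, then `[Severity]`). `problem_lines` is a hypothetical helper, and `SAMPLE` reuses three lines from the donor log.

```python
import re

# Matches the "[Severity]" field that follows the timestamp and thread id.
SEVERITY = re.compile(r"^\S+\s+\d+\s+\[(ERROR|Warning)\]")


def problem_lines(log_text, max_len=160):
    """Return ERROR/Warning lines from a mysqld error log, truncated for display."""
    found = []
    for line in log_text.splitlines():
        if SEVERITY.match(line):
            found.append(line[:max_len])
    return found


# Three lines copied from the donor log above; only one is an error.
SAMPLE = """\
2024-05-17T14:09:19.915503Z 0 [System] [MY-000000] [WSREP] P: IST sender 63 -> 63
2024-05-17T14:09:19.916171Z 0 [ERROR] [MY-000000] [WSREP] Failed to start SST process: -13
2024-05-17T14:09:19.917069Z 0 [System] [MY-000000] [WSREP] P: Shifting DONOR/DESYNCED -> JOINED (TO: 63)
"""

for line in problem_lines(SAMPLE):
    print(line)
```

Run against the full donor log, a filter like this would surface the `posix_spawnp(...) failed: 13 (Permission denied)` entries, which are timestamped just before the joiner reports "Will never receive state".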

I’ve been banging my head against this for a couple of days, but I can’t find a solution.
Any help would be appreciated.
Thanks in advance,
Rob

You shouldn’t be using the rsync method. Switch to wsrep_sst_method=xtrabackup-v2. Ensure all ports are open between the hosts (3306, 4444, 4567, 4568). Check SELinux/AppArmor.
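
To check the ports quickly, something like the sketch below can be run from each node against the others. It is only an illustration: `probe` and `report` are hypothetical helper names, the host list is copied from the wsrep_cluster_address in the question, and the role labels are the usual Galera assignments (4567 group communication, 4568 IST, 4444 SST). It distinguishes "refused" (host reachable, nothing listening) from "unreachable" (timeout or routing problem, which is what a dropping firewall tends to look like).

```python
import socket


def probe(host, port, timeout=2.0):
    """Try a TCP connection and classify the result as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but no process is listening on this port.
        return "refused"
    except OSError:
        # Timeouts, no route to host, and similar network-level failures.
        return "unreachable"


# Addresses taken from the wsrep_cluster_address in the question.
HOSTS = ["192.168.7.71", "192.168.7.72", "192.168.7.73"]
PORTS = {
    3306: "MySQL client",
    4444: "SST",
    4567: "group communication",
    4568: "IST",
}


def report(hosts=HOSTS, ports=PORTS):
    """Print one line per host/port pair with the probe result."""
    for host in hosts:
        for port, role in ports.items():
            print(f"{host}:{port} ({role}): {probe(host, port)}")
```

Call `report()` on each node. Note that 4444 and 4568 are normally only listened on while a state transfer is in progress, so "refused" there on an idle node is expected, whereas "unreachable" on any of the four ports points to a firewall or network rule between the VMs.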