Node is alone in cluster

Hi all,

I have a 5-node XtraDB cluster, version 8.0.30-22, with all nodes running on VMware. Our application connects to the database only through node2. Veeam backup is configured only on node03 (10.10.10.42). During the Veeam backup, node03 loses its network connection to the other nodes for 5-10 seconds. During that time node03 is suspected and dropped from the cluster, and when the network connection is re-established the node rejoins the cluster. I know that this is normal behavior.
But when we checked the application logs, we saw “WSREP has not yet prepared node for application use” during that period. When I checked the error logs, I saw that node2 really was left alone.

As you can see in the following log, at 2023-03-23T12:33:18.907006Z the node is in a non-primary state.
Why is node2 alone in the cluster even though only node3 lost its connection to the cluster?

The logs below belong to node 2.

2023-03-23T12:33:10.815022Z 0 [Note] [MY-000000] [Galera] (fcf5432e-baf5, 'tcp://0.0.0.0:4567') connection to peer 2add3eae-9a80 with addr tcp://10.10.10.42:4567 timed out, no messages seen in PT3S, socket stats: rtt: 3098 rttvar: 5504 rto: 1632000 lost: 1 last_data_recv: 3048 cwnd: 1 last_queued_since: 87488 last_delivered_since: 3043794261 send_queue_length: 1 send_queue_bytes: 656 segment: 0 messages: 1 (gmcast.peer_timeout)
2023-03-23T12:33:10.815336Z 0 [Note] [MY-000000] [Galera] Deferred close timer started for socket with remote endpoint: tcp://10.10.10.42:40484
2023-03-23T12:33:10.815513Z 0 [Note] [MY-000000] [Galera] (fcf5432e-baf5, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.10.42:4567
2023-03-23T12:33:10.816183Z 0 [Note] [MY-000000] [Galera] Deferred close timer handle_wait Operation aborted. for 0x7f0c046371e0
2023-03-23T12:33:10.816324Z 0 [Note] [MY-000000] [Galera] Deferred close timer destruct
2023-03-23T12:33:12.149288Z 0 [Note] [MY-000000] [Galera] (fcf5432e-baf5, 'tcp://0.0.0.0:4567') reconnecting to 2add3eae-9a80 (tcp://10.10.10.42:4567), attempt 0
2023-03-23T12:33:12.152090Z 0 [Note] [MY-000000] [Galera] (fcf5432e-baf5, 'tcp://0.0.0.0:4567') connection established to 2add3eae-9a80 tcp://10.10.10.42:4567
2023-03-23T12:33:12.894483Z 0 [Note] [MY-000000] [Galera] declaring node with index 0 suspected, timeout PT5S (evs.suspect_timeout)
2023-03-23T12:33:12.894686Z 0 [Note] [MY-000000] [Galera] evs::proto(fcf5432e-baf5, OPERATIONAL, view_id(REG,2add3eae-9a80,5423)) suspecting node: 2add3eae-9a80
2023-03-23T12:33:12.894762Z 0 [Note] [MY-000000] [Galera] evs::proto(fcf5432e-baf5, OPERATIONAL, view_id(REG,2add3eae-9a80,5423)) suspected node without join message, declaring inactive
2023-03-23T12:33:15.650407Z 0 [Note] [MY-000000] [Galera] (fcf5432e-baf5, 'tcp://0.0.0.0:4567') turning message relay requesting off
2023-03-23T12:33:17.898379Z 0 [Note] [MY-000000] [Galera] declaring node with index 0 suspected, timeout PT5S (evs.suspect_timeout)
2023-03-23T12:33:17.898699Z 0 [Note] [MY-000000] [Galera] evs::proto(fcf5432e-baf5, GATHER, view_id(REG,2add3eae-9a80,5423)) suspecting node: 2add3eae-9a80
2023-03-23T12:33:18.399198Z 0 [Note] [MY-000000] [Galera] evs::proto(fcf5432e-baf5, GATHER, view_id(REG,2add3eae-9a80,5423)) suspecting node: 2add3eae-9a80
2023-03-23T12:33:18.899712Z 0 [Note] [MY-000000] [Galera] declaring node with index 0 inactive (evs.inactive_timeout)
2023-03-23T12:33:18.899875Z 0 [Note] [MY-000000] [Galera] declaring node with index 2 suspected, timeout PT5S (evs.suspect_timeout)
2023-03-23T12:33:18.899921Z 0 [Note] [MY-000000] [Galera] declaring node with index 2 inactive (evs.inactive_timeout)
2023-03-23T12:33:18.907006Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(NON_PRIM,2add3eae-9a80,5423)
memb {
fcf5432e-baf5,0
}
joined {
}
left {
}
partitioned {
2add3eae-9a80,0
3f728cd4-82e4,0
3fd90a25-897e,0
52ede836-beea,0
}
)
2023-03-23T12:33:18.907437Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(NON_PRIM,fcf5432e-baf5,5424)
memb {
fcf5432e-baf5,0
}
joined {
}
left {
}
partitioned {
2add3eae-9a80,0
3f728cd4-82e4,0
3fd90a25-897e,0
52ede836-beea,0
}
)
2023-03-23T12:33:18.907560Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2023-03-23T12:33:18.907777Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [100, 100]
2023-03-23T12:33:18.907832Z 0 [Note] [MY-000000] [Galera] Received NON-PRIMARY.
2023-03-23T12:33:18.907865Z 0 [Note] [MY-000000] [Galera] Shifting SYNCED -> OPEN (TO: 124608636)
2023-03-23T12:33:18.907929Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2023-03-23T12:33:18.908094Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [100, 100]
2023-03-23T12:33:18.908154Z 0 [Note] [MY-000000] [Galera] Received NON-PRIMARY.
2023-03-23T12:33:18.909181Z 10 [Note] [MY-000000] [Galera] Maybe drain monitors from 124608608 upto current CC event 124608636 upto:124608636
2023-03-23T12:33:18.909313Z 10 [Note] [MY-000000] [Galera] Drain monitors from 124608608 up to 124608636
2023-03-23T12:33:18.913261Z 18101881 [Warning] [MY-000000] [Galera] Send action {(nil), 504, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.913547Z 18077786 [Warning] [MY-000000] [Galera] Send action {(nil), 408, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.913743Z 0 [Warning] [MY-000000] [Galera] Failed to report last committed 82f7c7cf-0500-11ed-8bc0-7ac3d7767bf1:124608576, -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.913778Z 18100092 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.914008Z 18098647 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.914336Z 18097246 [Warning] [MY-000000] [Galera] Send action {(nil), 600, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.914486Z 17957234 [Warning] [MY-000000] [Galera] Send action {(nil), 448, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.914688Z 18103410 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.915040Z 18103411 [Warning] [MY-000000] [Galera] Send action {(nil), 712, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.915277Z 17957153 [Warning] [MY-000000] [Galera] Send action {(nil), 712, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.915411Z 10 [Note] [MY-000000] [Galera] ================================================
View:
id: 82f7c7cf-0500-11ed-8bc0-7ac3d7767bf1:124608636
status: non-primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: no
own_index: 0
members(1):
0: fcf5432e-a0ea-11ed-baf5-0217a0428028, dbnode02
=================================================
2023-03-23T12:33:18.915508Z 10 [Note] [MY-000000] [Galera] Non-primary view
2023-03-23T12:33:18.915566Z 10 [Note] [MY-000000] [WSREP] Server status change synced -> connected
2023-03-23T12:33:18.915602Z 10 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2023-03-23T12:33:18.915640Z 10 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2023-03-23T12:33:18.915695Z 18097760 [Warning] [MY-000000] [Galera] Send action {(nil), 816, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.915723Z 10 [Note] [MY-000000] [Galera] Maybe drain monitors from 124608636 upto current CC event 124608636 upto:124608636
2023-03-23T12:33:18.915893Z 10 [Note] [MY-000000] [Galera] Drain monitors from 124608636 up to 124608636
2023-03-23T12:33:18.915943Z 10 [Note] [MY-000000] [Galera] ================================================
View:
id: 82f7c7cf-0500-11ed-8bc0-7ac3d7767bf1:124608636
status: non-primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: no
own_index: 0
members(1):
0: fcf5432e-a0ea-11ed-baf5-0217a0428028, dbnode02
=================================================
2023-03-23T12:33:18.915976Z 10 [Note] [MY-000000] [Galera] Non-primary view
2023-03-23T12:33:18.916003Z 10 [Note] [MY-000000] [WSREP] Server status change connected -> connected
2023-03-23T12:33:18.916034Z 10 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2023-03-23T12:33:18.916102Z 17997105 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.916330Z 17884333 [Warning] [MY-000000] [Galera] Send action {(nil), 856, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.916511Z 18098087 [Warning] [MY-000000] [Galera] Send action {(nil), 632, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.916692Z 14511685 [Warning] [MY-000000] [Galera] Send action {(nil), 560, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.916869Z 18092012 [Warning] [MY-000000] [Galera] Send action {(nil), 480, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.917083Z 18100574 [Warning] [MY-000000] [Galera] Send action {(nil), 816, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.917296Z 18094566 [Warning] [MY-000000] [Galera] Send action {(nil), 712, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.917462Z 18097593 [Warning] [MY-000000] [Galera] Send action {(nil), 1896, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.917599Z 18095774 [Warning] [MY-000000] [Galera] Send action {(nil), 736, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.917803Z 7523155 [Warning] [MY-000000] [Galera] Send action {(nil), 504, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.917930Z 18101917 [Warning] [MY-000000] [Galera] Send action {(nil), 1464, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.918086Z 18094768 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.918256Z 18103416 [Warning] [MY-000000] [Galera] Send action {(nil), 992, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.918698Z 17884334 [Warning] [MY-000000] [Galera] Send action {(nil), 528, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.918871Z 17884332 [Warning] [MY-000000] [Galera] Send action {(nil), 528, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.919332Z 17957521 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.919495Z 17884328 [Warning] [MY-000000] [Galera] Send action {(nil), 528, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.919655Z 18095108 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.919783Z 4507395 [Warning] [MY-000000] [Galera] Send action {(nil), 584, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.919937Z 17884326 [Warning] [MY-000000] [Galera] Send action {(nil), 528, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.920091Z 18001396 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.920276Z 17861899 [Warning] [MY-000000] [Galera] Send action {(nil), 536, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.920557Z 18100982 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.920722Z 18096163 [Warning] [MY-000000] [Galera] Send action {(nil), 728, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.920936Z 18102786 [Warning] [MY-000000] [Galera] Send action {(nil), 816, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.921183Z 17884330 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.921394Z 17884327 [Warning] [MY-000000] [Galera] Send action {(nil), 696, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.921571Z 17884329 [Warning] [MY-000000] [Galera] Send action {(nil), 696, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.921793Z 17884331 [Warning] [MY-000000] [Galera] Send action {(nil), 728, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.921940Z 17884302 [Warning] [MY-000000] [Galera] Send action {(nil), 696, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.922237Z 18102787 [Warning] [MY-000000] [Galera] Send action {(nil), 504, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.922388Z 18096164 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.922524Z 18095217 [Warning] [MY-000000] [Galera] Send action {(nil), 728, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.922681Z 18097544 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.922989Z 18097716 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.923156Z 18095107 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.923301Z 18098976 [Warning] [MY-000000] [Galera] Send action {(nil), 728, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.923444Z 18095899 [Warning] [MY-000000] [Galera] Send action {(nil), 728, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.923736Z 18103422 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.923867Z 17952540 [Warning] [MY-000000] [Galera] Send action {(nil), 736, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.924049Z 18039475 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.924320Z 18098544 [Warning] [MY-000000] [Galera] Send action {(nil), 736, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.924497Z 18094541 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.924646Z 18103427 [Warning] [MY-000000] [Galera] Send action {(nil), 720, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:18.924781Z 18103387 [Warning] [MY-000000] [Galera] Send action {(nil), 2416, WRITESET} returned -107 (Transport endpoint is not connected)
2023-03-23T12:33:19.919593Z 0 [Note] [MY-000000] [Galera] declaring 2add3eae-9a80 at tcp://10.10.10.42:4567 stable
2023-03-23T12:33:19.919767Z 0 [Note] [MY-000000] [Galera] declaring 3f728cd4-82e4 at tcp://10.10.10.30:4567 stable
2023-03-23T12:33:19.919814Z 0 [Note] [MY-000000] [Galera] declaring 3fd90a25-897e at tcp://10.10.10.40:4567 stable
2023-03-23T12:33:19.919853Z 0 [Note] [MY-000000] [Galera] declaring 52ede836-beea at tcp://10.10.10.31:4567 stable
2023-03-23T12:33:19.926246Z 0 [Note] [MY-000000] [Galera] Node 3f728cd4-82e4 state primary
2023-03-23T12:33:19.932347Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(PRIM,2add3eae-9a80,5425)
memb {
2add3eae-9a80,0
3f728cd4-82e4,0
3fd90a25-897e,0
52ede836-beea,0
fcf5432e-baf5,0
}
joined {
}
left {
}
partitioned {
}
)
2023-03-23T12:33:19.932502Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk
2023-03-23T12:33:19.935191Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 4, memb_num = 5
2023-03-23T12:33:19.935339Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: Waiting for state UUID.
2023-03-23T12:33:19.944121Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: sent state msg: e4be834a-c976-11ed-9b41-5a0bdcfef9df
2023-03-23T12:33:19.949907Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: e4be834a-c976-11ed-9b41-5a0bdcfef9df from 0 (dbnode03)
2023-03-23T12:33:19.950024Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: e4be834a-c976-11ed-9b41-5a0bdcfef9df from 2 (dbnode01)
2023-03-23T12:33:19.950096Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: e4be834a-c976-11ed-9b41-5a0bdcfef9df from 3 (dbnode05)
2023-03-23T12:33:19.950162Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: e4be834a-c976-11ed-9b41-5a0bdcfef9df from 4 (dbnode02)
2023-03-23T12:33:19.954993Z 0 [Note] [MY-000000] [Galera] STATE EXCHANGE: got state msg: e4be834a-c976-11ed-9b41-5a0bdcfef9df from 1 (dbnode04)
2023-03-23T12:33:19.955090Z 0 [Note] [MY-000000] [Galera] Quorum results:
version = 6,
component = PRIMARY,
conf_id = 4612,
members = 5/5 (primary/total),
act_id = 124608636,
last_appl. = 124608460,
protocols = 2/10/4 (gcs/repl/appl),
vote policy= 0,
group UUID = 82f7c7cf-0500-11ed-8bc0-7ac3d7767bf1
2023-03-23T12:33:19.955319Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [224, 224]
2023-03-23T12:33:19.955388Z 0 [Note] [MY-000000] [Galera] Restored state OPEN -> SYNCED (124608637)
2023-03-23T12:33:19.955878Z 10 [Note] [MY-000000] [Galera] ####### processing CC 124608637, local, ordered
2023-03-23T12:33:19.956037Z 10 [Note] [MY-000000] [Galera] Maybe drain monitors from 124608636 upto current CC event 124608637 upto:124608636
2023-03-23T12:33:19.956084Z 10 [Note] [MY-000000] [Galera] Drain monitors from 124608636 up to 124608636
2023-03-23T12:33:19.956153Z 10 [Note] [MY-000000] [Galera] ####### My UUID: fcf5432e-a0ea-11ed-baf5-0217a0428028
2023-03-23T12:33:19.956193Z 10 [Note] [MY-000000] [Galera] Skipping cert index reset
2023-03-23T12:33:19.956225Z 10 [Note] [MY-000000] [Galera] REPL Protocols: 10 (5)
2023-03-23T12:33:19.956258Z 10 [Note] [MY-000000] [Galera] ####### Adjusting cert position: 124608636 -> 124608637
2023-03-23T12:33:19.956535Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2023-03-23T12:33:19.958068Z 10 [Note] [MY-000000] [Galera] Server dbnode02 synced with group
2023-03-23T12:33:19.958136Z 10 [Note] [MY-000000] [WSREP] Server status change connected -> synced
2023-03-23T12:33:19.958166Z 10 [Note] [MY-000000] [WSREP] Synchronized with group, ready for connections
2023-03-23T12:33:19.958196Z 10 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2023-03-23T12:33:19.958599Z 10 [Note] [MY-000000] [Galera] ================================================
View:
id: 82f7c7cf-0500-11ed-8bc0-7ac3d7767bf1:124608637
status: primary
protocol_version: 4
capabilities: MULTI-MASTER, CERTIFICATION, PARALLEL_APPLYING, REPLAY, ISOLATION, PAUSE, CAUSAL_READ, INCREMENTAL_WS, UNORDERED, PREORDERED, STREAMING, NBO
final: no
own_index: 4
members(5):
0: 2add3eae-a7ba-11ed-9a80-7eb619eed191, dbnode03
1: 3f728cd4-aef5-11ed-82e4-237a853d09be, dbnode04
2: 3fd90a25-a0e9-11ed-897e-73715feadb4d, dbnode01
3: 52ede836-aef5-11ed-beea-faa17c876327, dbnode05
4: fcf5432e-a0ea-11ed-baf5-0217a0428028, dbnode02
=================================================
2023-03-23T12:33:19.958777Z 10 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2023-03-23T12:33:19.965985Z 10 [Note] [MY-000000] [Galera] Recording CC from group: 124608637
2023-03-23T12:33:19.966102Z 10 [Note] [MY-000000] [Galera] Lowest cert index boundary for CC from group: 124608461
2023-03-23T12:33:19.966154Z 10 [Note] [MY-000000] [Galera] Min available from gcache for CC from group: 113514750
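
To make the sequence in a log like this easier to follow, the view changes can be pulled out on their own. The following is only a sketch: the helper name `view_timeline` is made up, and it assumes each view header starts a line exactly as it does in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: reads a Galera error log on stdin and prints
# "<component state> <view sequence number>" for each view header line.
view_timeline() {
    sed -n 's/^view (view_id(\([A-Z_]*\),[^,]*,\([0-9]*\)).*/\1 \2/p'
}

# Example with the three view headers copied from the log above.
view_timeline <<'EOF'
view (view_id(NON_PRIM,2add3eae-9a80,5423)
view (view_id(NON_PRIM,fcf5432e-baf5,5424)
view (view_id(PRIM,2add3eae-9a80,5425)
EOF
```

Running the same helper against the error log of every node for the same time window would show which nodes, if any, kept a PRIM view while node2 was non-primary.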

Did you verify that the connections from node2 to node1/3/4/5 are active and stable? And when node3 goes offline for backup (which it should not need to do, btw), this causes node2 to go offline as well?

Yes, I’m sure that the connections from node2 to node1/3/4/5 are active and stable. I have checked all the error logs on the cluster nodes, and only node3 goes down.

BTW, I tried to simulate this issue in the test environment. When the network connection goes down for a long time (for example, 3 minutes), everything looks normal: only one node disconnects from the cluster and the other nodes continue to operate.

But when I disconnect one node from the cluster for a very short period (for example, 3-5 seconds), I get NON-PRIM errors.
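
For context on what NON-PRIM means here: a group of nodes stays primary only while it holds a strict majority of the previous primary component. A minimal sketch of that arithmetic, assuming every node has equal weight (the function name `has_quorum` is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a component with $1 reachable members,
# out of $2 members in the last primary view, still holds a strict majority.
# Assumes all nodes carry the same weight.
has_quorum() {
    members=$1
    total=$2
    [ $((members * 2)) -gt "$total" ]
}

# A node that sees only itself out of 5 members:
if has_quorum 1 5; then echo "PRIMARY"; else echo "NON_PRIM"; fi
# Four nodes that can still see each other:
if has_quorum 4 5; then echo "PRIMARY"; else echo "NON_PRIM"; fi
```

This matches the "New COMPONENT: primary = no ... memb_num = 1" line in the node2 log: with a view of one member out of five, node2 cannot be primary on its own. It does not explain why node2 ended up with that view in the first place.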

I simulated the issue with the following commands. I think the result is unexpected.

iptables -I INPUT -m tcp -p tcp --dport 4567 -j REJECT
iptables -I INPUT -m tcp -p tcp --dport 3306 -j REJECT
iptables -I OUTPUT -m tcp -p tcp --dport 4567 -j REJECT
iptables -I OUTPUT -m tcp -p tcp --dport 3306 -j REJECT
sleep 3
iptables -F
iptables -I INPUT -m tcp -p tcp --dport 4567 -j REJECT
iptables -I INPUT -m tcp -p tcp --dport 3306 -j REJECT
iptables -I OUTPUT -m tcp -p tcp --dport 4567 -j REJECT
iptables -I OUTPUT -m tcp -p tcp --dport 3306 -j REJECT
sleep 4
iptables -F
iptables -I INPUT -m tcp -p tcp --dport 4567 -j REJECT
iptables -I INPUT -m tcp -p tcp --dport 3306 -j REJECT
iptables -I OUTPUT -m tcp -p tcp --dport 4567 -j REJECT
iptables -I OUTPUT -m tcp -p tcp --dport 3306 -j REJECT
sleep 5
iptables -F
iptables -I INPUT -m tcp -p tcp --dport 4567 -j REJECT
iptables -I INPUT -m tcp -p tcp --dport 3306 -j REJECT
iptables -I OUTPUT -m tcp -p tcp --dport 4567 -j REJECT
iptables -I OUTPUT -m tcp -p tcp --dport 3306 -j REJECT
sleep 6
iptables -F
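
Note that `iptables -F` flushes every rule in the filter table, not just the four added here, so on a host with other firewall rules it would remove those as well. A possible variant that deletes only the rules it inserted is sketched below. The function names and the `IPTABLES` override are made up for illustration, and the sketch has the same block/unblock timing as the commands above.

```shell
#!/bin/sh
# Assumption: IPTABLES can be overridden (for example IPTABLES=echo)
# to print the commands instead of running them.
IPTABLES=${IPTABLES:-iptables}

# Insert ($1 = -I) or delete ($1 = -D) the four REJECT rules used above.
galera_block() {
    action=$1
    for chain in INPUT OUTPUT; do
        for port in 4567 3306; do
            $IPTABLES "$action" "$chain" -m tcp -p tcp --dport "$port" -j REJECT
        done
    done
}

# Hypothetical wrapper: block traffic for each given number of seconds in turn.
flap() {
    for secs in "$@"; do
        galera_block -I
        sleep "$secs"
        galera_block -D
    done
}

# Usage (as root on the test node): flap 3 4 5 6
```

Setting `IPTABLES=echo` before calling `flap` gives a dry run that only prints the rule changes.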

In our scenario, the network outages are not regular. For example, I can’t say that the connection goes down and comes back after 1 minute; it goes down and up for very short periods.
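
Since the node2 log names gmcast.peer_timeout and evs.suspect_timeout, it may be worth comparing the outage lengths with the timeout values actually in effect on each node. The sketch below assumes the provider options arrive as a semicolon-separated list of "key = value" pairs; the helper name `show_timeouts` is made up, and the sample string uses only the two values visible in the log plus one filler entry.

```shell
#!/bin/sh
# Hypothetical helper: reads a wsrep_provider_options string on stdin
# and prints only the evs.* and gmcast.* timeout settings.
show_timeouts() {
    tr ';' '\n' | sed 's/^ *//' | grep -E '^(evs|gmcast)\.[a-z_]*timeout'
}

# Sample input: the two timeouts come from the log above; gcache.size is
# made-up filler to show that unrelated options are dropped.
# A real string could come from, for example:
#   mysql -N -e "SELECT @@wsrep_provider_options"
echo "gmcast.peer_timeout = PT3S; evs.suspect_timeout = PT5S; gcache.size = 128M" | show_timeouts
```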