Setup:
3-node Percona XtraDB Cluster (PXC) with the following Percona image versions:
Operator: 1.19.0
Cluster: 8.0.42-33.1
Xtrabackup: 8.0.35-34.1
I was trying to trigger an SST after deleting the PVCs for one of the nodes, but the SST fails after the donor has sent all the data, because of sst-idle-timeout. As a result, the joiner never reaches the Synced state and the SST is triggered again.
I'm confused about this: why does the SST connection need to stay open once all the data has already been sent?
Joiner logs:
joiner: => Rate:[ 245MiB/s] Avg:[ 242MiB/s] Elapsed:1:08:10 Bytes: 967GiB
joiner: => Rate:[ 242MiB/s] Avg:[ 242MiB/s] Elapsed:1:10:20 Bytes: 998GiB
2026-04-15T06:18:45.641528Z 0 [Note] [MY-000000] [Galera] 0.0 (mysqlcluster-pxc-1): State transfer to 2.0 (mysqlcluster-pxc-2) complete.
2026-04-15T06:18:45.641766Z 0 [Note] [MY-000000] [Galera] Member 0.0 (mysqlcluster-pxc-1) synced with group.
ERR:******************* FATAL ERROR ********************** 30 Bytes: 1011GiB 246MiB/s] Avg:[ 242MiB/s] Elapsed:1:10:30 Bytes: 1000GiB
2026-04-15T06:20:58.852365Z 0 [ERROR] [MY-000000] [WSREP-SST] Killing SST (889) with SIGKILL after stalling for 120 seconds.
2026-04-15T06:20:58.852370Z 0 [ERROR] [MY-000000] [WSREP-SST] Within the last 120 seconds (defined by the sst-idle-timeout variable),
2026-04-15T06:20:58.852384Z 0 [ERROR] [MY-000000] [WSREP-SST] the SST process on the joiner (this node) has not received any data from the donor.
2026-04-15T06:20:58.852399Z 0 [ERROR] [MY-000000] [WSREP-SST] This error could be caused by broken network connectivity between
2026-04-15T06:20:58.852413Z 0 [ERROR] [MY-000000] [WSREP-SST] the donor and the joiner (this node).
2026-04-15T06:20:58.852426Z 0 [ERROR] [MY-000000] [WSREP-SST] Check the network connection and restart the joiner node.
2026-04-15T06:20:58.852440Z 0 [ERROR] [MY-000000] [WSREP-SST] Line 324
2026-04-15T06:20:58.852453Z 0 [ERROR] [MY-000000] [WSREP-SST] ******************************************************
2026-04-15T06:20:58.856608Z 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 191: 891 Killed socat -u openssl-listen:4444,reuseaddr,cert=/etc/mysql/ssl-internal/tls.crt,key=/etc/mysql/ssl-internal/tls.key,cafile=/etc/mysql/ssl-internal/ca.crt,verify=1,retry=30 stdio
2026-04-15T06:20:58.856619Z 0 [Note] [MY-000000] [WSREP-SST] 892 | pv -f -i 10 -N joiner -F '%N => Rate:%r Avg:%a Elapsed:%t %e Bytes: %b %p'
2026-04-15T06:20:58.856623Z 0 [Note] [MY-000000] [WSREP-SST] 893 | /usr/bin/pxc_extra/pxb-8.0/bin/xbstream -x --decompress
2026-04-15T06:20:58.856808Z 0 [ERROR] [MY-000000] [WSREP-SST] ******************* FATAL ERROR **********************
2026-04-15T06:20:58.856827Z 0 [ERROR] [MY-000000] [WSREP-SST] SST transfer has been interrupted.
2026-04-15T06:20:58.856849Z 0 [ERROR] [MY-000000] [WSREP-SST] Check if the log above indicates the interruption
2026-04-15T06:20:58.856887Z 0 [ERROR] [MY-000000] [WSREP-SST] was related to sst-idle-timeout configuration variable.
2026-04-15T06:20:58.856920Z 0 [ERROR] [MY-000000] [WSREP-SST] exit codes: 137 137 137
2026-04-15T06:20:58.856926Z 0 [ERROR] [MY-000000] [WSREP-SST] Line 1377
2026-04-15T06:20:58.856931Z 0 [ERROR] [MY-000000] [WSREP-SST] ******************************************************
Donor logs:
donor: => Rate:[ 243MiB/s] Avg:[ 242MiB/s] Elapsed:1:10:30 ETA 0:00:00
donor: => Rate:[ 241MiB/s] Avg:[ 241MiB/s] Elapsed:1:11:25
2026-04-15T06:18:45.636340Z 0 [Note] [MY-000000] [Galera] SST sent: cbbf83fe-37ee-11f1-8643-76be332e1b50:103529
2026-04-15T06:18:45.636362Z 0 [Note] [MY-000000] [WSREP] Server status change donor -> joined
2026-04-15T06:18:45.644024Z 0 [Note] [MY-000000] [Galera] 0.0 (mysqlcluster-pxc-1): State transfer to 2.0 (mysqlcluster-pxc-2) complete.
2026-04-15T06:18:45.644047Z 0 [Note] [MY-000000] [Galera] Shifting DONOR/DESYNCED -> JOINED (TO: 103529)
2026-04-15T06:18:45.644088Z 0 [Note] [MY-000000] [Galera] Processing event queue:... -nan% (0/0 events) complete.
2026-04-15T06:18:45.644268Z 0 [Note] [MY-000000] [Galera] Member 0.0 (mysqlcluster-pxc-1) synced with group.
2026-04-15T06:18:45.644279Z 0 [Note] [MY-000000] [Galera] Processing event queue:... 100.0% (1/1 events) complete.
2026-04-15T06:18:45.644285Z 0 [Note] [MY-000000] [Galera] Shifting JOINED -> SYNCED (TO: 103529)
2026-04-15T06:18:45.644320Z 13 [Note] [MY-000000] [Galera] Server mysqlcluster-pxc-1 synced with group
2026-04-15T06:18:45.644346Z 13 [Note] [MY-000000] [WSREP] Server status change joined -> synced
2026-04-15T06:18:45.644352Z 13 [Note] [MY-000000] [WSREP] Synchronized with group, ready for connections
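
To put a number on the timing, here is a small sketch that computes the gap between the "State transfer ... complete" line and the "Killing SST" line in the joiner logs. The two timestamps are copied from the logs above. The helper names (parse_ts, gap_seconds) are just for illustration and are not part of any PXC tooling.

```python
from datetime import datetime

# Timestamps copied verbatim from the joiner logs above (UTC, "Z" suffix).
TRANSFER_COMPLETE = "2026-04-15T06:18:45.641528Z"  # "State transfer to 2.0 ... complete."
SST_KILLED = "2026-04-15T06:20:58.852365Z"         # "Killing SST (889) with SIGKILL ..."


def parse_ts(ts: str) -> datetime:
    """Parse a MySQL error-log timestamp such as 2026-04-15T06:18:45.641528Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def gap_seconds(start: str, end: str) -> float:
    """Return the number of seconds elapsed between two log timestamps."""
    return (parse_ts(end) - parse_ts(start)).total_seconds()


gap = gap_seconds(TRANSFER_COMPLETE, SST_KILLED)
print(f"{gap:.3f} s between 'State transfer complete' and the SIGKILL")
```

This comes out to roughly 133 seconds, slightly more than the 120 seconds that the error message attributes to sst-idle-timeout. So the joiner's SST process was killed a little over two minutes after the donor had already reported the transfer as complete.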