Percona XtraDB Cluster upgrade

Hi team,

I have 3 nodes running Percona XtraDB Cluster 5.7, and I want to upgrade them to version 8. To meet our downtime requirement, I want to do a rolling upgrade, one node at a time. Is a major-version upgrade possible in XtraDB Cluster? I read in the documentation that it is, so I upgraded node 3. When I started the mysqld service, the node joined the cluster in the joiner state, which is normal, but then it gets stuck and the logs show this:

joiner side:

[Note] [MY-000000] [WSREP-SST] …Waiting for SST streaming to complete!

On the donor side, the logs also show:

[Note] WSREP: Initiating SST/IST transfer on DONOR side (wsrep_sst_xtrabackup-v2 --role 'donor' --address 'xx.xx.xxx:4444/xtrabackup_sst//1' --socket '/var/lib/mysql/mysql.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --mysqld-version '5.7.42-46-57' --binlog 'binlog' --gtid 'c11ed0aa-0c3b-11ee-b9dd-731cc246b6e6:572053310' )

[Note] WSREP: DONOR thread signaled with 0

WSREP_SST: [INFO] Streaming with xbstream

[Note] WSREP: (078eaf51, 'ssl://0.0.0.0:4567') turning message relay requesting off

WSREP_SST: [INFO] Streaming the backup to joiner at xxxxxxx 4444

Version 5.7 uses Galera 3; version 8 uses Galera 4.

I updated the packages on the joiner, and the logs show that it upgraded the data directory, but at the end it gets stuck. Why is that? Can you help me, please? @matthewb @CTutte

joiner.log (2.5 KB)

donor.log (1.1 KB)

With this upgrade method, please ensure that the 8.0 node joins the cluster (which still has 2 nodes on 5.7) through IST rather than SST. Based on the logs, it is currently triggering SST. Please verify that the configured gcache size is large enough for the node to rejoin with IST.

2025-12-08T07:29:36.164484Z 2 [Note] [MY-000000] [Galera] State gap can't be serviced using IST. Switching to SST
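One rough way to judge whether the gcache is large enough is to sample the cluster's replication traffic and multiply it by the time the node will be out of the cluster. The sketch below assumes the traffic is measured as the sum of the wsrep_replicated_bytes and wsrep_received_bytes status counters, read twice on a running node. The function name, the safety factor of 2, and the sample numbers are illustrative assumptions, not values from this thread, and the result is only an approximation.

```python
def estimate_gcache_bytes(bytes_start, bytes_end, sample_seconds,
                          downtime_seconds, safety_factor=2.0):
    """Rough gcache size needed for a node to rejoin via IST.

    bytes_start / bytes_end: wsrep_replicated_bytes + wsrep_received_bytes,
    read at the start and end of the sampling window (assumed measurement).
    downtime_seconds: how long the node is expected to be out of the cluster.
    safety_factor: headroom for write bursts (hypothetical default).
    """
    if sample_seconds <= 0:
        raise ValueError("sample_seconds must be positive")
    if bytes_end < bytes_start:
        raise ValueError("counters went backwards; was the node restarted?")

    # Average replication traffic in bytes per second over the sample window.
    rate = (bytes_end - bytes_start) / sample_seconds

    # Traffic the donor must still hold in gcache when the node comes back.
    return int(rate * downtime_seconds * safety_factor)


if __name__ == "__main__":
    # Illustrative numbers only: 6 MB replicated in 60 s, 30 min of downtime.
    needed = estimate_gcache_bytes(1_000_000, 7_000_000, 60, 1800)
    print(f"suggested gcache.size >= {needed / 1024 ** 2:.0f} MiB")
```

Sampling during the busiest write period gives a more conservative figure than sampling during a quiet one.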

Make sure you also have gcache.recover=yes in your wsrep_provider_options.
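To confirm the setting before taking a node down, the value of wsrep_provider_options (for example as returned by SHOW VARIABLES) can be checked with a small script. This is a minimal sketch: the helper names are hypothetical, the sample strings are made up, and it assumes the options are a semicolon-separated list of key = value pairs and that gcache.recover is off when it is not listed.

```python
def parse_provider_options(raw):
    """Split a wsrep_provider_options string into a dict of key -> value."""
    options = {}
    for item in raw.split(";"):
        if "=" not in item:
            continue  # skip empty trailing segments
        key, value = item.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def gcache_recover_enabled(raw):
    """Return True if gcache.recover is switched on in the options string."""
    # Assumption: the option is treated as off when it is absent.
    value = parse_provider_options(raw).get("gcache.recover", "no")
    return value.lower() in ("yes", "true", "on", "1")


if __name__ == "__main__":
    # Illustrative values only; gcache.size here is not from this thread.
    configured = "gcache.size=2G;gcache.recover=yes"
    print(parse_provider_options(configured))
    print("gcache.recover enabled:", gcache_recover_enabled(configured))
```

The parser tolerates both the compact form used in my.cnf and the spaced form the server prints.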

I changed the parameters and upgraded one node today. It was successful. I had upgraded only one of the 3 nodes when the vendor said, "Let's switch the application to the upgraded node to see whether everything works with the application." If it did, I could continue upgrading the other 2 nodes. But we got errors. First I noticed the parameter pxc_strict_mode=Permissive (I had not changed this parameter after the upgrade), then I saw lots of "connection refused" errors, and all nodes went down. The cluster crashed. Could this parameter be the cause? I had to bootstrap one node again, and then the other nodes joined, but we had downtime. @matthewb @Abhinav_Gupta

No. That’s not how it works. You cannot write to the upgraded node during a rolling upgrade process. If the vendor wants to test, you should create a brand new PXC 8 cluster in a testing/QA/pre-prod environment and test the application there.