I’m using 5.8. I’m having a lot of trouble recycling a cluster in order to apply OS upgrades, and I’m wondering if there is a recommended way.
What I try to do:
- Stop my application accessing a node.
- Stop any automatic restarting software (e.g. monit).
- Gracefully stop the cluster node via /etc/init.d/mysql stop, or occasionally via SIGQUIT.
- Wait for the mysqld process to exit.
- apt-get update; apt-get upgrade
- Watch the node come back up and enter the cluster.
- Re-enable application access to the node.
- Move on to the next one.
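For the "watch the node come back up" step, it may help to check explicit status values before re-enabling traffic and moving on. The following is only a sketch: `is_node_ready` and `wsrep_status` are hypothetical helper names, the expected cluster size of 3 is an assumption, and it assumes the `mysql` client can log in without prompting (for example via `~/.my.cnf`).

```shell
#!/bin/sh
# Sketch only: helper names and the expected cluster size are assumptions.

# Succeeds only if the node reports a healthy, fully joined state.
#   $1 = wsrep_cluster_status       (expected "Primary")
#   $2 = wsrep_local_state_comment  (expected "Synced")
#   $3 = wsrep_cluster_size         (current number of nodes)
#   $4 = expected number of nodes
is_node_ready() {
    [ "$1" = "Primary" ] && [ "$2" = "Synced" ] && [ "$3" -eq "$4" ] 2>/dev/null
}

# Reads a single wsrep status value from the local node.
# -N drops the column header, -B gives tab-separated output.
wsrep_status() {
    mysql -N -B -e "SHOW GLOBAL STATUS LIKE '$1'" | awk '{print $2}'
}

# Example use after restarting a node (not run automatically):
#   until is_node_ready "$(wsrep_status wsrep_cluster_status)" \
#                       "$(wsrep_status wsrep_local_state_comment)" \
#                       "$(wsrep_status wsrep_cluster_size)" 3; do
#       sleep 5
#   done
```

Waiting for all three values before touching the next node means a node that is up but non-Primary, or still joining, holds up the rollout instead of silently reducing the quorum.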
Pretty much every time I’ve done this in the last few months, I end up with nodes that crash, or stay up but go non-Primary. My impression is that this is worse with 5.8 than it was with 5.7.
I end up in a nightmare of frantically bootstrapping to get my cluster back up, then nursing nodes through SST. I can’t really spend time debugging why the node is not functional because I’m trying to get the live system back up.
Is there a recommended process for this kind of thing? Maybe I’m doing it completely wrong?
There’s no such thing as MySQL 5.8. Can you please clarify?
Sorry, of course you’re right. I mean v8. Specifically:
mysql> select @@version;
| @@version |
| 8.0.28-19.1 |
root@db3:/var/www/iznik# dpkg --list | grep percona
ii percona-release 1.0-27.generic all Package to install Percona gpg key and APT repos
ii percona-xtradb-cluster 1:8.0.28-19-1.focal amd64 Percona XtraDB Cluster with Galera
ii percona-xtradb-cluster-client 1:8.0.28-19-1.focal amd64 Percona XtraDB Cluster database client binaries
ii percona-xtradb-cluster-common 1:8.0.28-19-1.focal amd64 Percona XtraDB Cluster database common files (e.g. /etc/mysql/my.cnf)
rc percona-xtradb-cluster-common-5.7 5.7.33-31.49-1.bionic amd64 Percona XtraDB Cluster database common files (e.g. /etc/mysql/my.cnf)
rc percona-xtradb-cluster-garbd-5.7 5.7.27-31.39-1.bionic amd64 Garbd components of Percona XtraDB Cluster
ii percona-xtradb-cluster-server 1:8.0.28-19-1.focal amd64 Percona XtraDB Cluster database server binaries
rc percona-xtradb-cluster-server-5.7 5.7.33-31.49-1.bionic amd64 Percona XtraDB Cluster database server binaries
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy <firstname.lastname@example.org> (modified by Percona <https://percona.com/>) |
| wsrep_provider_version | 4.11(a9008fc)
(I think the 5.7 packages are still installed from before I upgraded.)
Your steps are correct. Do some manual package cleanup; remove those 5.7 packages. Ensure 8.0 is the binary in place.
If you are only taking action on 1 node at a time, then the rest of the cluster should remain online and Primary. You have 3 nodes, right? The loss of 1 node should NOT cause any issues with the remaining nodes.
Needing a full SST every time you restart points to a misconfiguration. After a simple restart, the node should rejoin via IST. Make sure you have gcache.size=2G;gcache.recover=yes in your wsrep_provider_options.
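For reference, a minimal sketch of how that might look in the server config. The `[mysqld]` section placement is an assumption about your setup. If `wsrep_provider_options` already has a value, add these to the existing string separated by semicolons rather than adding a second line.

```ini
[mysqld]
# gcache.size: how much write-set history is kept on disk for IST
# gcache.recover: try to recover the gcache on startup so a restarted node can serve or receive IST
wsrep_provider_options="gcache.size=2G;gcache.recover=yes"
```

The 2G figure is the value suggested above. The right size depends on how much is written to the cluster while a node is down.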
Thanks, I don’t have those options. I’ll add those and see if it helps (when I next dare to upgrade).
Yes, I have 3 nodes.
Then you should never have any issues if you only operate on 1 node at a time. We do labs like this all the time in our training classes and never experience any crashes.