Remove node from cluster for backup reasons

danarashad · March 16, 2022, 1:26pm

I am having issues with backup locking the cluster. I have to continually use DDL statements in production. I have a 5 node cluster I would like to remove 1 node from the cluster run the backup then add the node back to the cluster.
Do I go to each node and update wsrep_cluster_address? If so how do I update wsrep_cluster_address?

set wsrep_cluster_address = gcomm://192.168.2.59,192.168.2.60,192.168.2.61,192.168.2.62 leaving out the node i want to remove?
On the node i want to remove do I set wsrep_cluste_address=gcomm://

CTutte · March 17, 2022, 2:57pm

Hi danarashad,

For this use case what I would suggest doing is:

Set up an async replica from the cluster and take the backup from there to avoid disrupting normal cluster operation
OR
Check how DDL creation needs to be executed so frequently. DDL locking is a limitation and characteristic of how PXC works and will be disruptive for operations.
As operations grows (and possibly amount of DDLs grows as well) amount of disruptions will take longer and longer, disrupting normal cluster operations .

I would suggest any of the above approaches, but if you still want to take a node out of the cluster for taking a backup.

shutdown the node from where the backup will be taken
backup grastate.dat file with current cluster status
start the node in standalone (remove wsrep lines from config file)
take backup
undo changes (add back wsrep lines, restore grastate.dat file)
restart the node to join the cluster

You might find this useful to avoid IST: Want IST Not SST for Node Rejoins? We Have a Solution! - Percona Database Performance Blog

Regards

CTutte · March 17, 2022, 2:57pm

Typo. Last line should have said “You might find this useful to avoid SST”

danarashad · March 17, 2022, 3:57pm

The ddl is on a temp table in a stored procedure and before I upgraded to xtradb cluster 8.0.26 i never had the issues. I was on 5.7. If SET GLOBAL wsrep_desync=ON; do the backup then SET GLOBAL wsrep_desync=OFF; would that exclude would that be the same as removing it from the cluster then adding it back to the cluster?
If I go with removing the node, do I have to update the other nodes gcomm line?

matthewb · March 17, 2022, 5:10pm

@danarashad,
There is no setting that can remove a node from the cluster. wsrep_desync only tells the node to NOT broadcast flow control messages if it becomes overwhelmed with incoming transactions. The node remains connected and continues to participate in general.

You NEVER need to do this. wsrep_cluster_address does NOT define membership of your cluster. Membership/quorum of your cluster is defined by the cluster itself. The wsrep_cluster_address parameter only requires at least 1 other member. Once connected, that node will inform of ever other node also connected.

If you want to remove a node completely, do as Carlos suggests which is to stop the node, comment out all wsrep_ params, and start mysql.

Since the temporary table engine was completely overhauled in MySQL 8, I can understand why you might now have such an issue with this DDL.

danarashad · March 17, 2022, 5:46pm

Ok thanks. I will go with that option stopping node altering gcomm and renaming wsrep_cluster_name. do backup, then undo changes.

Oleksandr_Bezpiatov · March 17, 2022, 9:56pm

Hi @danarashad,

But still: if there is an option to setup separate replica from cluster, this will look more organic than such a complex solution to exclude-reinclude node in cluster.

danarashad · March 18, 2022, 12:26am

I looked at setting up a replica but I am lost. I don’t need auto fail over and I have a multi master setup. I didn’t quite understand the directions listed Async

matthewb · March 18, 2022, 4:02am

@danarashad,
Well, for starters, that blog post you linked is for Group Replication, which you are most certainly NOT using. (ie: you’re using PXC/Galera)

You need to follow any of the hundreds of posts out there for setting up simple async replication between two mysql servers. The fact that the source is also a PXC node doesn’t matter that much.

Ivan_Groenewold · March 18, 2022, 2:41pm

If you need a quick fix, you might want to consider changing internal_tmp_disk_storage_engine to revert to the previous value which is to use myisam for temp tables, although be aware of the implications.

Topic		Replies	Views
Remove Node from cluster Percona XtraDB Cluster 8.x	7	4784	November 2, 2021
Removing a node Percona XtraDB Cluster 5.x	4	3981	August 10, 2018
The Effect of WSREP_ON Percona XtraDB Cluster 5.x	7	1336	June 12, 2020
Take out one node from sync Percona XtraDB Cluster 5.x percona	1	629	July 23, 2021
Joining second node without cluster lock Percona XtraDB Cluster 5.x	7	1443	June 15, 2016

Remove node from cluster for backup reasons

Related topics