Remove Node from cluster

I need to remove a node from the db cluster. Here is my cluster line: wsrep_cluster_address='gcomm://192.168.2.59,192.168.2.60,192.168.2.61,192.168.2.62,192.168.2.63'. I need to remove the 192.168.2.63 address. Do I need to stop mysql first, then remove the IP address from the cluster line and restart the services?

Hi @danarashad welcome to the Percona Forums!

If this will be a permanent removal of the host 192.168.2.63 you should gracefully stop mysqld on this node so that it removes itself from the cluster. Then on the other nodes, you can remove this IP address from the wsrep_cluster_address line.

You will not need to stop or restart the other members of the cluster. They will automatically pick up the departure of 192.168.2.63 when that node leaves the cluster. Updating the configuration file on disk is simply so that the remaining nodes don't try to connect to the downed node the next time they boot.
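As a rough illustration of the config edit described above, here is a minimal Python sketch that drops one address from a wsrep_cluster_address value. The helper name `remove_peer` is hypothetical, and the sketch assumes a plain comma-separated host list with no ports or `?option=value` suffix. It only produces the new string; it does not touch my.cnf or any running node.

```python
def remove_peer(cluster_address: str, peer: str) -> str:
    """Return the gcomm:// address string with `peer` removed.

    Hypothetical helper: assumes a plain comma-separated host list.
    """
    scheme = "gcomm://"
    if not cluster_address.startswith(scheme):
        raise ValueError("expected a gcomm:// address")
    # Split the host list and drop empty entries left by stray commas.
    hosts = [h.strip() for h in cluster_address[len(scheme):].split(",") if h.strip()]
    return scheme + ",".join(h for h in hosts if h != peer)


old = "gcomm://192.168.2.59,192.168.2.60,192.168.2.61,192.168.2.62,192.168.2.63"
new = remove_peer(old, "192.168.2.63")
print(new)
```

The resulting value is what would go on the wsrep_cluster_address line of each remaining node's config file.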

Keep in mind that the best practice is to have an odd number of members in the cluster so that quorum can always be achieved should the network be partitioned. Removing this node leaves you with 4, so I encourage you to add a 5th member (or reduce your cluster to 3 nodes) at some point in the near future.
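The odd-number advice comes down to simple arithmetic. The sketch below assumes every node has equal weight and that a partition keeps quorum only if it holds strictly more than half of the members. The function names are hypothetical, and this models the rule only, not Galera itself.

```python
def has_quorum(partition_size: int, cluster_size: int) -> bool:
    """A partition has quorum if it holds strictly more than half the nodes."""
    return 2 * partition_size > cluster_size


def survives_even_split(cluster_size: int) -> bool:
    """Check the most even two-way split: does either side keep quorum?"""
    small = cluster_size // 2
    large = cluster_size - small
    return has_quorum(small, cluster_size) or has_quorum(large, cluster_size)


for n in (3, 4, 5):
    print(n, "nodes:", "one side keeps quorum" if survives_even_split(n) else "no side has quorum")
```

With 4 nodes, a 2/2 split leaves neither half with a majority, which is why 3 or 5 members is the safer size.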

Best of luck!

Thank you for the information. I am very new to clustering, and the person before me left no instructions or documentation. So if I stop the service, do my upgrades, and then restart mysqld, will it rejoin the cluster?

If you need to upgrade all of your nodes to a newer version of PXC, yes: stop node1, upgrade it, and start it back up. It will automatically rejoin. Then move to node2, then node3, etc.

If you are trying to upgrade from 5.7 to 8.0, the best practice is to start a brand new 8.0 cluster and migrate your data rather than upgrade-in-place. It can be done, but it’s very tricky for a PXC beginner.

The parameter wsrep_cluster_address does NOT define your cluster. This parameter only serves as an initial list of some members in your cluster. You can absolutely have a node join that is not in this list. Cluster membership is determined by the nodes themselves. When they join, they look for that initial list and establish connections, then they receive a list of ALL members for further connections.

What defines the cluster? I assumed the wsrep_cluster_address on startup defines the cluster. I assumed that stopping mysql and removing IPs from wsrep_cluster_address would kick the node out of the cluster.

The cluster itself defines the cluster. Consider this layout and config values:

node1: wsrep_cluster_address=gcomm://node3
node2: wsrep_cluster_address=gcomm://node2,node4
node3: wsrep_cluster_address=gcomm://node2,node3

Notice that node1 does not appear in any of the config parameters, and that node4 doesn't exist. You would bootstrap node2, then start node3, which would connect to node2 based on its parameter. They would form a 2-node cluster. Then you'd start node1, which would connect to node3. Upon this connection, node3 would inform node1 about node2, and node1 would then connect to node2 even though node2 is not part of node1's config. You now have a 3-node cluster. node4 is only a possible member. Does listing the nonexistent node4 prevent the cluster from running? No, it does not.
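The walkthrough above can be modeled with a toy Python simulation. It is only a sketch of the idea that a joiner needs one reachable seed and then learns the full member list. The `bootstrap` and `join` functions are hypothetical names, and nothing here reflects the real Galera protocol or its API.

```python
# Seed lists copied from the example config values above.
config = {
    "node1": ["node3"],
    "node2": ["node2", "node4"],
    "node3": ["node2", "node3"],
}


def bootstrap(name):
    """Start a new cluster whose only member is `name`."""
    return {name}


def join(name, members):
    """Join `name` via any seed that is currently a member.

    Returns (new_members, learned), where `learned` is the member list
    the joiner receives from the seed it reached.
    """
    reachable = [s for s in config[name] if s in members and s != name]
    if not reachable:
        raise RuntimeError(f"{name}: no seed in its list is reachable")
    learned = set(members)  # the contacted seed shares ALL current members
    return members | {name}, learned


members = bootstrap("node2")
members, _ = join("node3", members)            # node3 reaches node2
members, learned = join("node1", members)      # node1 reaches node3 only
print(sorted(members))
print("node1 learned about:", sorted(learned))
```

In this model node1 ends up knowing about node2 even though node2 is not in node1's seed list, and the unreachable node4 entry in node2's list never matters.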

“Stopping mysql” is what removes a node from the cluster, and it stays out for as long as mysqld stays stopped. Again, as I said above, the wsrep_cluster_address is a list of potential members of the cluster.

So if I stop mysqld on node3, wouldn’t I have to remove all references to node3 from the other config files? I am guessing that if I were to reboot a node that references node3, that node would add node3 back to the cluster and propagate node3 to the other servers. So my assumption is that I would still need to remove all references to node3 from the config files, just in case a reboot needs to occur, as well as remove node1 and node2 from node3’s config file. However, after stopping mysqld, when I run SHOW VARIABLES LIKE 'wsrep_cluster_address'; I still get the IP of the stopped mysql server.

No.

No, that is all incorrect. Again, the config DOES NOT define the cluster. If you rebooted a node that references node3, that node would attempt to connect to node3 and would fail (since node3 is offline). The node would say “oh well, node3 isn’t responding” and it would move on. It would NOT error. It would NOT prevent the node from starting.

No, wrong assumption.

The cluster defines its own membership. The config does NOT. Please re-read my example above. Again, as I said above, the wsrep_cluster_address is a list of potential members of the cluster.
