We currently have a 3-node cluster set up and it is working great. I would like to add another 2 nodes to our setup, but the join consistently fails. The Percona setup uses 2 sites: locally we currently have 2 nodes, and at our colocation we have a single node (where I am trying to add more). The current node at the colo has been running for 3 months with no issues.
Just wondering if anyone has an idea: all nodes have been updated to the same software version, SELinux is off, firewalls are all off, and I can telnet to the correct ports. This one has me pulling my hair out. I am going to try adding another node at my location here and see if the same thing happens, but I really need these other nodes set up and functioning for the new website rollout.
Are you certain you are running exactly the same versions across all nodes?
5.5.30 and 5.5.33 use different versions of xtrabackup for SST and different network streaming pipes (nc and socat, respectively).
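For illustration, the two streaming styles differ roughly like this (a sketch of the idea, not the literal commands the SST scripts build; host and port are placeholders, 4444 being the usual SST port):

```
# tar-over-nc style (older xtrabackup SST):
#   donor : innobackupex --stream=tar . | nc <joiner_ip> 4444
#   joiner: nc -l 4444 | tar xfi -
# xbstream-over-socat style (xtrabackup-v2, 5.5.33+):
#   donor : innobackupex --stream=xbstream . | socat - TCP:<joiner_ip>:4444
#   joiner: socat TCP-LISTEN:4444 - | xbstream -x
```

If the donor streams one format while the joiner tries to read the other, the joiner bails out with stream-format errors like the one below.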
From what I’m seeing in the logs:
1. xb_stream_read_chunk(): wrong chunk magic at offset 0x0.
The issue is described here:
Scratch that. I took a snapshot of the VM, tried the previous steps, no go, still does the same thing. Does this mean I need to shut down and update the whole cluster? I was hoping I could do a rolling upgrade. Is this possible?
Rolling upgrade is certainly possible, but you need to gather some info first.
Try this:
1. Check the exact version of all cluster nodes.
2. Check the exact version on the node you are trying to upgrade (if it’s 5.5.34 you’re good to go with point 4; if not, try to upgrade to that version).
3. Check that socat is available on all systems.
4. On the node you are trying to upgrade, use wsrep_sst_method=xtrabackup-v2 if the old nodes run 5.5.33, and wsrep_sst_method=xtrabackup if they are below that version.
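The first three checks can be scripted and run on every node; something like this (a sketch; adjust if mysqld lives outside PATH, e.g. under /opt/mysql/bin):

```shell
# Quick inventory for each node: server version and streaming-tool availability.
report=""
if command -v mysqld >/dev/null 2>&1; then
    report="mysqld: $(mysqld --version 2>&1)"
else
    report="mysqld: not in PATH"
fi
# Check both streaming tools, since old and new SST methods differ.
for tool in socat nc; do
    if command -v "$tool" >/dev/null 2>&1; then
        report="$report
$tool: present"
    else
        report="$report
$tool: missing"
    fi
done
printf '%s\n' "$report"
```

Comparing the printed versions across all five nodes should tell you immediately whether you really are on matching releases.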
Do you use wsrep_sst_method=xtrabackup-v2 on the node wanting to join, on the donor node, or both? socat is already installed on all nodes.
I cannot get the new or updated nodes to join the cluster; errors keep coming up like “Doesn’t look like a tar file” and “no route to host”, even though I can definitely see and talk to the nodes with no problem. I also created symlinks with “ln -s /usr/bin/wsrep_sst_xtrabackup /usr/bin/wsrep_sst_xtrabackup-v2” on the production nodes running the older version. Any help would be appreciated. Most of the time the SST dies and doesn’t kill socat, so I have to killall -9 socat before giving it another go.
If that’s the case, set wsrep_sst_method = xtrabackup-v2 on JOINER, and wsrep_sst_method = xtrabackup on DONOR.
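Concretely, the relevant my.cnf fragments could look like this (a sketch; only the SST line is shown, all your existing wsrep settings stay as they are):

```ini
# my.cnf on the JOINER (new 5.5.34 node)
[mysqld]
wsrep_sst_method = xtrabackup-v2

# my.cnf on the DONOR (existing 5.5.33 node)
[mysqld]
wsrep_sst_method = xtrabackup
```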
Also, maybe someone else can pipe in about what exactly is in which RPM, since you have a 5.5.33 server and 5.5.34 client/shared-lib RPMs on the DONOR.
From the above, you don’t need the symlink for xtrabackup-v2 on the donor; use xtrabackup on the donor and xtrabackup-v2 on the joiner.
Now, the error you posted suggests that wsrep_sst_xtrabackup-v2 does not exist or is not in the PATH:
sh: wsrep_sst_xtrabackup-v2: command not found
If the node you are trying to bring up (joiner) is running server 5.5.34, it should have this script somewhere in its bin directory (again, I’m running everything from /opt/mysql/bin and the binary is there all right). Check that the binary exists on the JOINER and that my.cnf has the correct paths, or that it is in the system PATH.
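A quick check like this on the joiner might save some digging (a sketch; mysqld invokes the SST script via sh, so it has to be resolvable from PATH):

```shell
# Verify the SST scripts mysqld will shell out to are resolvable from PATH.
status=""
for script in wsrep_sst_xtrabackup wsrep_sst_xtrabackup-v2; do
    if command -v "$script" >/dev/null 2>&1; then
        status="$status$script -> $(command -v "$script")
"
    else
        status="$status$script: NOT FOUND in PATH
"
    fi
done
printf '%s' "$status"
```

If the v2 script shows as NOT FOUND even though the package is installed, it is probably in a bin directory that isn’t on the PATH mysqld sees.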
You can also try running yum update on percona-xtrabackup on the new nodes (again, dunno what goes in which package).
You are correct about the versions. I still cannot add another node running the new version to the cluster. In the repo I am seeing both “percona-xtrabackup-20” and “percona-xtrabackup”; do you have any idea which one I am supposed to have installed? I assume percona-xtrabackup-20 is 2.0.
I still get the same errors when I try to add it in. I CAN clone the older VMs and they join fine, just not the new, updated nodes.
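To pin down what is actually installed on a new node, something like this might help (a sketch for RPM-based systems; both package names are from the Percona repo as you listed them):

```shell
# Show which xtrabackup packages are installed and which one owns the binary.
if command -v rpm >/dev/null 2>&1; then
    pkginfo=$(rpm -q percona-xtrabackup percona-xtrabackup-20 2>&1)
    xb_path=$(command -v xtrabackup 2>/dev/null)
    [ -n "$xb_path" ] && pkginfo="$pkginfo
owner: $(rpm -qf "$xb_path" 2>&1)"
else
    pkginfo="rpm not available on this system"
fi
printf '%s\n' "$pkginfo"
```

For what it’s worth, I believe the xtrabackup-v2 SST needs Xtrabackup 2.1 or newer, so the plain percona-xtrabackup package rather than the 2.0-pinned percona-xtrabackup-20 would be the one to have on the new nodes, but maybe someone can confirm.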