Not possible for JOINER to join the cluster

Folks, how are you all? I need some help here; the story is as follows…

Some time ago I implemented an XtraDB Cluster using the packages for version 5.6.20-68.0-56-log. This cluster has four nodes and is in production, running very well with no problems at all. I've got another cluster running version 5.5.39 with the same configuration files on all four nodes, and I had no problems implementing it (leaving aside some 5.6-only variables shown below).

In both scenarios the setup sequence was the same: install the Percona repository on CentOS 6.5 machines, then install Percona-XtraDB-Cluster-56 and xtrabackup as well (just as a double check).

Configuration files were created with the following basic configs:

#: configuration file, node #1

[mysqld]
user=mysql
server_id=1
datadir=/var/lib/mysql
log_bin=node01-bin
binlog_format=ROW
log_slave_updates=true
enforce_gtid_consistency=true
default_storage_engine=innodb
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
innodb_log_group_home_dir=/var/lib/mysql
innodb_log_files_in_group=2
innodb_log_file_size=2G
#: wsrep variables
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_name=mycluster
wsrep_node_address=192.168.0.101
wsrep_node_name=node01
wsrep_cluster_address=gcomm://192.168.0.101:4567,192.168.0.102:4567,192.168.0.103:4567,192.168.0.104:4567
wsrep_slave_threads=2
wsrep_sst_method=xtrabackup
wsrep_sst_auth=sst:123

#: configuration file, node #2

[mysqld]
user=mysql
server_id=2
datadir=/var/lib/mysql
log_bin=node02-bin
binlog_format=ROW
log_slave_updates=true
enforce_gtid_consistency=true
default_storage_engine=innodb
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
innodb_log_group_home_dir=/var/lib/mysql
innodb_log_files_in_group=2
innodb_log_file_size=2G
#: wsrep variables
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_name=mycluster
wsrep_node_address=192.168.0.102
wsrep_node_name=node02
wsrep_cluster_address=gcomm://192.168.0.101:4567,192.168.0.102:4567,192.168.0.103:4567,192.168.0.104:4567
wsrep_slave_threads=2
wsrep_sst_method=xtrabackup
wsrep_sst_auth=sst:123

#: configuration file, node #3

[mysqld]
user=mysql
server_id=3
datadir=/var/lib/mysql
log_bin=node03-bin
binlog_format=ROW
default_storage_engine=innodb
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
innodb_log_group_home_dir=/var/lib/mysql
innodb_log_files_in_group=2
innodb_log_file_size=2G
#: wsrep variables
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_name=mycluster
wsrep_node_address=192.168.0.103
wsrep_node_name=node03
wsrep_cluster_address=gcomm://192.168.0.101:4567,192.168.0.102:4567,192.168.0.103:4567,192.168.0.104:4567
wsrep_slave_threads=2
wsrep_sst_method=xtrabackup
wsrep_sst_auth=sst:123

#: configuration file, node #4

[mysqld]
user=mysql
datadir=/var/lib/mysql
server_id=4
log_bin=node04-bin
binlog_format=ROW
log_slave_updates
default_storage_engine=innodb
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
innodb_log_group_home_dir=/var/lib/mysql
innodb_log_files_in_group=2
innodb_log_file_size=2G
#: wsrep variables
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_address=gcomm://192.168.0.101:4567,192.168.0.102:4567,192.168.0.103:4567,192.168.0.104:4567
wsrep_cluster_name=mycluster
wsrep_node_name=node04
wsrep_node_address=192.168.0.104
wsrep_sst_method=xtrabackup
wsrep_sst_auth=sst:123
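One pitfall when copying these configuration files between nodes is stray whitespace or line wraps sneaking into the wsrep_cluster_address list, which silently breaks node discovery. A quick sanity check could look like this (the file path and the check itself are my own sketch, not part of the original setup):

```shell
#!/bin/sh
# Sketch: verify that wsrep_cluster_address in a my.cnf contains no spaces.
# CNF path is an assumption; adjust to your environment.
CNF=/etc/my.cnf

addr=$(grep '^wsrep_cluster_address' "$CNF")
case "$addr" in
  *' '*) echo "WARNING: whitespace found in wsrep_cluster_address" ;;
  *)     echo "wsrep_cluster_address looks clean" ;;
esac
```

Running it against each node's my.cnf before starting mysqld is cheap insurance against a node quietly failing to find the rest of the gcomm list.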

The cluster is running as I said…

$ mysql -u root -pxxxxxxx -e "show status like 'wsrep_cluster_size'"
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 4     |
+--------------------+-------+

So, with those configuration files I started a new project, this time to test version 5.6.21-70.1-56. I simply copied all the configuration that is currently running very well, as shown previously. The new version I'm testing is:

$ mysqld --version
mysqld Ver 5.6.21-70.1-56 for Linux on x86_64 (Percona XtraDB Cluster (GPL), Release rel70.1, Revision 938, WSREP version 25.8, wsrep_25.8.r4150)

The bootstrap was OK, but the second node is not joining the cluster, even though the SST completes with an OK message at the end (taken from innobackup.backup.log).

$ sudo tail -f /var/lib/mysql/innobackup.backup.log
xtrabackup: Transaction log of lsn (1627264) to (1627264) was copied.
141212 16:48:55 innobackupex: Executing UNLOCK BINLOG
141212 16:48:55 innobackupex: Executing UNLOCK TABLES
141212 16:48:55 innobackupex: All tables unlocked

innobackupex: Backup created in directory '/tmp'
innobackupex: MySQL binlog position: filename 'node01-bin.000002', position 120
141212 16:48:55 innobackupex: Connection to database server closed
innobackupex: You must use -i (--ignore-zeros) option for extraction of the tar stream.
141212 16:48:55 innobackupex: completed OK!

The MySQL error log on the second node says that the SST completed with errors:

WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (d4022a87-820a-11e4-bcf4-5793c1c5841f): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():456. IST will be unavailable.
2014-12-12 16:49:47 5698 [Note] WSREP: Member 1.0 (node02) requested state transfer from 'any'. Selected 0.0 (node01)(SYNCED) as donor.
2014-12-12 16:49:47 5698 [Note] WSREP: Shifting PRIMARY → JOINER (TO: 5)
2014-12-12 16:49:47 5698 [Note] WSREP: Requesting state transfer: success, donor: 0
2014-12-12 16:49:47 5698 [Note] WSREP: (dff2f8f7, 'tcp://0.0.0.0:4567') turning message relay requesting off
2014-12-12 16:49:57 5698 [Note] WSREP: 0.0 (node01): State transfer to 1.0 (node02) complete.
WSREP_SST: [ERROR] xtrabackup process ended without creating '/var/lib/mysql//xtrabackup_galera_info' (20141212 16:49:57.174)
[…]
WSREP_SST: [ERROR] Cleanup after exit with status:32 (20141212 16:49:57.297)
WSREP_SST: [INFO] Removing the sst_in_progress file (20141212 16:49:57.326)
2014-12-12 16:49:57 5698 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '192.168.0.102' --auth 'sst:123' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '5698' '' : 32 (Broken pipe)
2014-12-12 16:49:57 5698 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2014-12-12 16:49:57 5698 [ERROR] WSREP: SST failed: 32 (Broken pipe)
2014-12-12 16:49:57 5698 [ERROR] Aborting

Just to make you guys aware, the sst user with 123 as its password is properly created in MySQL:

$ mysql -u root -e "show grants for sst@localhost\G"
*************************** 1. row ***************************
Grants for sst@localhost: GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sst'@'localhost' IDENTIFIED BY PASSWORD '*23AE809DDACAF96AF0FD78ED04B6A265E05AA257'
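For anyone reproducing this setup, the SST user matching wsrep_sst_auth=sst:123 can be created with a GRANT like the one shown above. A hedged sketch that just generates the statement (the user/password are the ones from this thread; substitute your own, and pipe the output into `mysql -u root -p` to apply it):

```shell
#!/bin/sh
# Sketch: emit the GRANT statement for the SST user referenced by
# wsrep_sst_auth. Nothing is executed against MySQL here; the script
# only prints the SQL so you can review it before applying.
SST_USER=sst
SST_PASS=123

cat <<EOF
GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT
  ON *.* TO '${SST_USER}'@'localhost' IDENTIFIED BY '${SST_PASS}';
FLUSH PRIVILEGES;
EOF
```

RELOAD, LOCK TABLES, and REPLICATION CLIENT are the privileges the xtrabackup-based SST needs on the donor.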

Testing the xtrabackup command manually, as per the documentation, to check whether it does the right thing, I got the following:

$ innobackupex --user=sst --password=123 /tmp/

InnoDB Backup Utility v1.5.1-xtrabackup; Copyright 2003, 2009 Innobase Oy
and Percona LLC and/or its affiliates 2009-2013. All Rights Reserved.

This software is published under
the GNU GENERAL PUBLIC LICENSE Version 2, June 1991.

Get the latest version of Percona XtraBackup, documentation, and help resources:
Percona XtraBackup - MySQL Database Backup Software

141212 17:17:36 innobackupex: Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup' as 'sst' (using password: YES).
141212 17:17:36 innobackupex: Connected to MySQL server
141212 17:17:36 innobackupex: Executing a version check against the server…
Can't use an undefined value as an ARRAY reference at /usr/bin/innobackupex line 1069.

The XtraBackup packages I've got installed on these new machines:

$ xtrabackup --version
xtrabackup version 2.2.7 based on MySQL server 5.6.21 Linux (x86_64) (revision id: )

Any light here? Any advice?

Welcome to the club - see the latest two entries in this forum. I have several installations of PXC in production use and can't join new nodes to existing clusters nor create new clusters. Exact same error logs as yours. It seems as if the latest version of XtraDB Cluster (or maybe XtraBackup, or some script) is broken. Unfortunately, no reaction from Percona so far. :(

I reverted to the 5.6.20 build - that worked.

Thanks for the welcome, SouthernBelle! Working with the Percona repo files via yum, I found that version 5.6.20-25.7, the release before the current one, can be installed from the repository using the command below:

$ yum --showduplicates list Percona-XtraDB-Cluster-server-56.x86_64
Loaded plugins: fastestmirror, versionlock
Loading mirror speeds from cached hostfile

 * base: centos.brnet.net.br
 * epel: epel.gtdinternet.com
 * extras: centos.ar.host-engine.com
 * updates: centos.ar.host-engine.com
Installed Packages
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.21-25.8.938.el6    @percona
Available Packages
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.14-25.1.570.rhel6  percona
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.14-25.1.571.rhel6  percona
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.15-25.2.645.rhel6  percona
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.15-25.2.692.rhel6  percona
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.15-25.3.706.rhel6  percona
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.15-25.4.731.rhel6  percona
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.15-25.5.759.rhel6  percona
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.19-25.6.824.el6    percona
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.20-25.7.886.el6    percona
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.20-25.7.888.el6    percona
Percona-XtraDB-Cluster-server-56.x86_64   1:5.6.21-25.8.938.el6    percona

$ yum install Percona-XtraDB-Cluster-*-5.6.20-*
Loaded plugins: fastestmirror, versionlock
Loading mirror speeds from cached hostfile

 * base: centos.ufms.br
 * epel: mirror.globo.com
 * extras: centos.ufms.br
 * updates: mirrors.dcarsat.com.ar
Setting up Install Process
[…]

Still, 5.6.20-25.7 is different from the build I've been running here, which is 5.6.20-68.0-56-log, but I'll give it a try for a while…
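If it helps to sanity-check which of the two repository builds is older before downgrading, GNU `sort -V` orders version strings naturally (this is just a local check, assuming GNU coreutils):

```shell
#!/bin/sh
# Order the two PXC builds discussed in this thread by version.
# sort -V does natural version ordering, so 5.6.20 sorts before 5.6.21.
printf '5.6.21-25.8\n5.6.20-25.7\n' | sort -V
# The older build (5.6.20-25.7) is printed first.
```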

I found your thread here in the Percona forum:
http://www.percona.com/forums/questions-discussions/percona-xtradb-cluster/27626-2nd-node-unable-to-join-cluster

Thanks man, keep in touch!!

I'm also working with yum repos on CentOS and Amazon Linux. Please keep me posted on how you're doing with the older build.

I reverted the binaries to the old version and it's working fine here, that is, with version "Ver 5.6.20-68.0-56 for Linux on x86_64". What I noticed was that, at the time I was setting things up, it installed 34 packages on my first node and 40 packages on my second node: lots of Perl packages and others I hadn't seen before. BTW, in the end all the nodes joined the cluster, and it seems there is a problem with the SST's restore phase in version 5.6.21-70.1-56. Maybe it has something to do with the new XtraBackup version; I'm not sure yet…

$ mysqld --version
mysqld Ver 5.6.20-68.0-56 for Linux on x86_64 (Percona XtraDB Cluster (GPL), Release rel68.0, Revision 888, WSREP version 25.7, wsrep_25.7.r4126)

[root@testenv-node01 ~]$ mysql -u root -e "show status like 'wsrep_cluster_size'\G"
*************************** 1. row ***************************
Variable_name: wsrep_cluster_size
Value: 3

It's me again, just to register that I've built new virtual servers with the same configs as the bare-metal ones I'm using at the company, and after running yum to install the Percona-XtraDB-Cluster packages I can confirm that the same number of packages was installed, a total of 34.

Have a good weekend, all.

wsrep_sst_method=xtrabackup

Don't use this. It is deprecated in 5.6 (and 5.5) and should only be used for compatibility with PXC 5.5.34 and earlier.

The default (even when wsrep_sst_method is unset) is now xtrabackup-v2.
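In other words, on PXC 5.6 the my.cnf line should read wsrep_sst_method=xtrabackup-v2, or simply be removed to take the default. A quick grep to flag the deprecated value across config files could look like this (a sketch of my own; the path is an assumption):

```shell
#!/bin/sh
# Sketch: flag a my.cnf that still uses the deprecated SST method name.
# The anchored regex matches the bare "xtrabackup" value only, not
# "xtrabackup-v2".
CNF=/etc/my.cnf

if grep -Eq '^wsrep_sst_method[[:space:]]*=[[:space:]]*xtrabackup$' "$CNF"; then
  echo "deprecated: use wsrep_sst_method=xtrabackup-v2 on PXC 5.6"
else
  echo "SST method setting looks fine"
fi
```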

Next, on the SST methods: PXC 5.6.21 does support backup locks, and so its SST 'version' is bumped.

However, this applies only to new donors and old joiners, not new joiners and old donors.

(Here, new refers to PXC 5.6.21 and higher, old refers to anything below that).

So, with new joiners and old donors (together with minimum PXB version requirements of PXC’s installed there), it should work fine.
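The donor/joiner rule above can be encoded as a tiny shell helper (my own sketch of the stated rule, not a Percona tool): call 5.6.21 and higher "new" and anything below that "old"; the pairing that trips over the SST version bump is a new donor feeding an old joiner, while the other combinations should be fine.

```shell
#!/bin/sh
# Sketch: encode the SST donor/joiner compatibility rule for PXC 5.6.21's
# backup-locks SST bump. "new" = PXC >= 5.6.21, "old" = anything below.
sst_pairing() {  # usage: sst_pairing <donor: new|old> <joiner: new|old>
  if [ "$1" = new ] && [ "$2" = old ]; then
    echo "problematic"   # new donor -> old joiner hits the bumped SST version
  else
    echo "ok"            # old donor -> new joiner (and same-version pairs) work
  fi
}

sst_pairing new old   # prints: problematic
sst_pairing old new   # prints: ok
```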

mysql> show status like 'wsrep_cluster%'\G
*************************** 1. row ***************************
Variable_name: wsrep_cluster_conf_id
Value: 5
*************************** 2. row ***************************
Variable_name: wsrep_cluster_size
Value: 3
*************************** 3. row ***************************
Variable_name: wsrep_cluster_state_uuid
Value: f9cdb70c-8483-11e4-86af-4b70fd8e7a88
*************************** 4. row ***************************
Variable_name: wsrep_cluster_status
Value: Primary
4 rows in set (0.00 sec)