SST Step is failing while 2nd node trying to join the Percona xtradb cluster 5.7

prince85 · December 21, 2018, 9:00am

Hi,

I really need help here. I have 3 nodes. The 1st node was started with bootstrap, and the cluster is up and running with 1 node. Now, when the 2nd node tries to join the cluster, the SST step fails. Below is the error on the joiner. I’ve also attached the log files from the joiner and the donor.

2018-12-21T13:51:14.606655Z 1 [Warning] WSREP: Gap in state sequence. Need state transfer.
2018-12-21T13:51:14.606665Z 1 [Note] WSREP: Setting wsrep_ready to false
2018-12-21T13:51:14.606900Z 0 [Note] WSREP: Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role ‘joiner’ --address ‘192.168.50.84’ --datadir ‘/var/lib/mysql/’ --defaults-file ‘/etc/my.cnf’ --defaults-group-suffix ‘’ --parent ‘3695’ --mysqld-version ‘5.7.23-23-57’ --binlog ‘/var/lib/mysql/mysql-bin’ )
2018-12-21T13:51:14.607720Z 0 [ERROR] WSREP: Failed to read ‘ready ’ from: wsrep_sst_xtrabackup-v2 --role ‘joiner’ --address ‘192.168.50.84’ --datadir ‘/var/lib/mysql/’ --defaults-file ‘/etc/my.cnf’ --defaults-group-suffix ‘’ --parent ‘3695’ --mysqld-version ‘5.7.23-23-57’ --binlog ‘/var/lib/mysql/mysql-bin’
Read: ‘(null)’
2018-12-21T13:51:14.607838Z 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role ‘joiner’ --address ‘192.168.50.84’ --datadir ‘/var/lib/mysql/’ --defaults-file ‘/etc/my.cnf’ --defaults-group-suffix ‘’ --parent ‘3695’ --mysqld-version ‘5.7.23-23-57’ --binlog ‘/var/lib/mysql/mysql-bin’ : 2 (No such file or directory)
2018-12-21T13:51:14.608109Z 1 [ERROR] WSREP: Failed to prepare for ‘xtrabackup-v2’ SST. Unrecoverable.
2018-12-21T13:51:14.608143Z 1 [ERROR] Aborting

Note: All the ports are open in the firewall. I’m using the latest version of Percona XtraDB Cluster.

Log.zip (84.8 KB)

lorraine.pocklington · December 21, 2018, 12:22pm

Could you post your backup logs from all nodes please.
Also the my.cnf from the PRIM node and the one that is failing.

prince85 · December 21, 2018, 12:38pm

There is no backup log generated at the data file location. I’ve attached the .cnf files from both donor and joiner nodes.

my.cnf.zip (5.85 KB)

vinicius.grippa · December 21, 2018, 1:46pm

Hi,

Please make sure that, if you are using SELinux, its settings are correct or it is disabled.

SELinux:

Check if your firewall allows communication between the nodes on these ports:
1. Regular MySQL port (default is 3306).
2. Port for group communication (default is 4567).
3. Port for State Snapshot Transfer (default is 4444).
4. Port for Incremental State Transfer (default is the group communication port + 1, i.e. 4568).

It is possible to identify that the .

Another point: check whether there are any files in the datadir on the joiner and remove them.

Another theory is that socat might have a problem:

Check if both are using the same versions.
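The port check described above can be sketched in shell. This is a minimal sketch, not part of the original reply: `check_port` is a hypothetical helper name, and it assumes bash (for the `/dev/tcp` pseudo-device) and the coreutils `timeout` command are available on the node you run it from.

```shell
# Probe a TCP port on a peer node from the local node.
# check_port is a hypothetical helper, not a Percona tool.
check_port() {
    local host=$1 port=$2
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "${host}:${port} open"
    else
        echo "${host}:${port} closed or filtered"
        return 1
    fi
}

# Default PXC ports from the list above: MySQL, group communication, SST, IST
pxc_ports="3306 4567 4444 4568"

# Usage, run on the donor against the joiner address from the log:
#   for p in $pxc_ports; do check_port 192.168.50.84 "$p"; done
```

Note that a port only shows as open while something is listening on it, so a "closed" result for the SST port outside of a running state transfer does not by itself prove a firewall problem.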

prince85 · December 21, 2018, 2:12pm

OS:
Donor:
[root@qtsxtradb03 ~]# cat /etc/os-release
NAME=“Red Hat Enterprise Linux Server”
VERSION=“7.5 (Maipo)”
ID=“rhel”
ID_LIKE=“fedora”
VARIANT=“Server”
VARIANT_ID=“server”
VERSION_ID=“7.5”
PRETTY_NAME=“Red Hat Enterprise Linux Server 7.5 (Maipo)”
ANSI_COLOR=“0;31”
CPE_NAME=“cpe:/o:redhat:enterprise_linux:7.5:GA:server”
HOME_URL=“https://www.redhat.com/”
BUG_REPORT_URL=“https://bugzilla.redhat.com/”

REDHAT_BUGZILLA_PRODUCT=“Red Hat Enterprise Linux 7”
REDHAT_BUGZILLA_PRODUCT_VERSION=7.5
REDHAT_SUPPORT_PRODUCT=“Red Hat Enterprise Linux”
REDHAT_SUPPORT_PRODUCT_VERSION=“7.5”

Joiner:

[root@qtsxtradb02 ~]# cat /etc/os-release
NAME=“Red Hat Enterprise Linux Server”
VERSION=“7.5 (Maipo)”
ID=“rhel”
ID_LIKE=“fedora”
VARIANT=“Server”
VARIANT_ID=“server”
VERSION_ID=“7.5”
PRETTY_NAME=“Red Hat Enterprise Linux Server 7.5 (Maipo)”
ANSI_COLOR=“0;31”
CPE_NAME=“cpe:/o:redhat:enterprise_linux:7.5:GA:server”
HOME_URL=“https://www.redhat.com/”
BUG_REPORT_URL=“https://bugzilla.redhat.com/”

REDHAT_BUGZILLA_PRODUCT=“Red Hat Enterprise Linux 7”
REDHAT_BUGZILLA_PRODUCT_VERSION=7.5
REDHAT_SUPPORT_PRODUCT=“Red Hat Enterprise Linux”
REDHAT_SUPPORT_PRODUCT_VERSION=“7.5”

As mentioned in my earlier post, all the ports are open. I have also tested listening on the mentioned ports.

As shown in the error logs I sent, it seems the PRIMARY node is receiving the request from the joiner node.

socat version:

Donor:
[root@qtsxtradb03 ~]# rpm -qa |grep -i socat
socat-1.7.3.2-2.el7.x86_64

Joiner:
[root@qtsxtradb02 ~]# rpm -qa |grep -i socat
socat-1.7.3.2-2.el7.x86_64

Yes, I have files in the datadir on the joiner node, as follows.

[root@qtsxtradb02 mysql]# ll
total 733540
-rw-rw----. 1 mysql mysql 56 Sep 28 05:36 auto.cnf
drwxr-x---. 2 mysql mysql 20 Nov 30 03:16 db1
-rw-r-----. 1 mysql mysql 134219048 Dec 21 05:51 galera.cache
-rw-r-----. 1 mysql mysql 0 Dec 21 05:13 grastate.dat
-rw-r-----. 1 mysql mysql 463 Nov 30 04:28 ib_buffer_pool
-rw-rw----. 1 mysql mysql 79691776 Dec 21 05:51 ibdata1
-rw-r-----. 1 mysql mysql 268435456 Dec 21 05:51 ib_logfile0
-rw-r-----. 1 mysql mysql 268435456 Nov 29 05:40 ib_logfile1
drwx------. 2 mysql mysql 4096 Sep 28 05:21 mysql
-rw-r-----. 1 mysql mysql 0 Dec 21 05:51 mysql-bin.index
-rw-rw----. 1 root root 5 Dec 21 05:51 mysqld_safe.pid
-rw-r-----. 1 mysql mysql 315228 Dec 21 05:51 mysql-error.log
-rw-rw----. 1 mysql mysql 3019 Dec 21 05:51 mysql-slow.log
drwx------. 2 mysql mysql 4096 Sep 28 05:21 performance_schema
-rw-rw----. 1 mysql mysql 167 Nov 13 04:03 relay-bin.000004
-rw-rw----. 1 mysql mysql 1257 Nov 16 03:09 relay-bin.000005
-rw-rw----. 1 mysql mysql 64 Nov 13 04:03 relay-bin.index
-rw-r--r--. 1 mysql mysql 117 Sep 28 05:21 RPM_UPGRADE_HISTORY
-rw-r--r--. 1 mysql mysql 117 Sep 28 05:21 RPM_UPGRADE_MARKER-LAST
drwx------. 2 mysql mysql 60 Nov 13 04:10 Test
[root@qtsxtradb02 mysql]# pwd
/var/lib/mysql

Do I have to delete everything?
In the demo video on Percona XtraDB Cluster, the files in the joiner node’s datadir were not removed.

Please specify which files I have to remove from the joiner’s datadir.

vinicius.grippa · December 21, 2018, 4:54pm

Hi,

Did you check for SELinux?

Check if SELinux is disabled:

It is necessary only to remove file.
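A minimal sketch of the SELinux check mentioned above, assuming the standard `getenforce` utility that RHEL ships with its SELinux tools. `selinux_mode` is a hypothetical wrapper name used only for this example.

```shell
# Print the current SELinux mode, or "unknown" when the SELinux tools are absent.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "unknown"
    fi
}

mode=$(selinux_mode)
echo "SELinux mode: ${mode}"

# If the mode is Enforcing, it can be switched to permissive until the next reboot:
#   setenforce 0
# A permanent change is made through the SELINUX= line in /etc/selinux/config.
```

Running this on both the donor and the joiner shows whether the two nodes are in the same mode before retrying the join.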

Kenn_Takara · December 21, 2018, 7:25pm

The donor node is not starting the SST script (there’s no SST logging output in the donor error logs).

In the wsrep.cnf for the donor node, wsrep_sst_method is commented out:

# SST method
#wsrep_sst_method=xtrabackup-v2

This keeps the donor node from starting the donor-side script (thus the joiner node fails).

For additional SST-only error logging, you can set wsrep_debug in the [sst] section.
[sst]
wsrep_debug=ON

Actually, PXC defaults to xtrabackup-v2, so it should still work. I would suggest setting the SST wsrep_debug to ON on both sides and seeing what happens. For some reason the SST process is not starting up on the donor side…
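To confirm on each node whether the option is active or commented out, something like the following can help. This is a sketch under assumptions: `sst_method_status` is a hypothetical helper, it only inspects the single file it is given (not included files), and it only matches the underscore spelling of the option.

```shell
# Report whether wsrep_sst_method is set, commented out, or absent in a config file.
sst_method_status() {
    local cnf=$1
    if grep -Eq '^[[:space:]]*wsrep_sst_method[[:space:]]*=' "$cnf"; then
        # Active setting: print the last occurrence, which is the one that wins
        grep -E '^[[:space:]]*wsrep_sst_method[[:space:]]*=' "$cnf" | tail -n 1
    elif grep -Eq '^[[:space:]]*#[[:space:]]*wsrep_sst_method' "$cnf"; then
        echo "commented out"
    else
        echo "not set"
    fi
}

# Usage (the path is an example; use the file that actually holds your wsrep settings):
#   sst_method_status /etc/my.cnf
```

For the donor configuration quoted above, this would report "commented out", in which case the server falls back to its default SST method.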

prince85 · January 3, 2019, 7:18am

Hi,

I have disabled SELinux on both servers.

Enabled wsrep_sst_method=xtrabackup-v2.

Added the lines below to my.cnf on both servers:
[sst]
wsrep_debug=ON

I have attached the logs from both the donor and joiner servers.

Now everything in the joiner datadir has been deleted.
[root@qtsxtradb02 mysql]# ll
total 348
-rw-r-----. 1 mysql mysql 346357 Jan 3 03:48 mysql-error.log
-rw-rw----. 1 mysql mysql 3275 Dec 24 00:41 mysql-slow.log
-rw-r--r--. 1 mysql mysql 117 Sep 28 05:21 RPM_UPGRADE_HISTORY
[root@qtsxtradb02 mysql]# pwd
/var/lib/mysql

The innobackup log contains the following:

xtrabackup: [ERROR] Could not open required defaults file: /etc/my.cnf
xtrabackup: [ERROR] Fatal error in defaults handling. Program aborted!

So, I gave the mysql user read permission on the file.
-rw-r-----. 1 root mysql 2043 Dec 21 04:57 my.cnf
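A quick way to verify a permission fix like this is to print the owner, group, and mode of the file and then test readability as the mysql user. This is a minimal sketch: `perm_summary` is a hypothetical helper, and it assumes GNU `stat` as shipped with RHEL.

```shell
# Print "owner:group mode" for a file (GNU stat format string).
perm_summary() {
    stat -c '%U:%G %a' "$1"
}

# Usage on the joiner:
#   perm_summary /etc/my.cnf
#   # the listing above (-rw-r----- root mysql) corresponds to "root:mysql 640"
#   sudo -u mysql test -r /etc/my.cnf && echo "readable by mysql"
```

The `sudo -u mysql test -r` line is the more direct check, since it exercises the same access the SST script needs when it opens the defaults file.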

After all this, starting mysql on the joiner does nothing, and nothing is written to the logs.

[root@qtsxtradb02 mysql]# systemctl start mysql
Job for mysql.service failed because the control process exited with error code. See “systemctl status mysql.service” and “journalctl -xe” for details.

Log.zip (49.9 KB)

prince85 · January 3, 2019, 7:33am

Now I have deleted all the files from the joiner’s datadir, and mysql has now started on the joiner.
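For anyone repeating this step, a more cautious variant is to move the datadir contents aside instead of deleting them, so they can be restored if the SST fails again. This is a sketch of an alternative, not what was done in this thread: `set_aside_datadir` is a hypothetical helper, it assumes GNU `find` and `mv`, and mysqld must be stopped before it is run.

```shell
# Move everything in a datadir into a timestamped sibling directory.
# set_aside_datadir is a hypothetical helper; stop mysqld before calling it.
set_aside_datadir() {
    local datadir=${1%/}
    local backup="${datadir}.bak.$(date +%Y%m%d%H%M%S)"
    mkdir -p "$backup" || return 1
    # -mindepth 1 -maxdepth 1 selects the direct children, including dotfiles
    find "$datadir" -mindepth 1 -maxdepth 1 -exec mv -t "$backup" {} + || return 1
    # Print where the old contents went
    echo "$backup"
}

# Usage on the joiner (as root, with mysql stopped):
#   set_aside_datadir /var/lib/mysql
#   systemctl start mysql
```

The datadir itself is left in place with its ownership and permissions unchanged, and the joiner then starts from an empty directory and requests a full SST.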

Thanks for all the help.
