
SST step fails when the 2nd node tries to join a Percona XtraDB Cluster 5.7

prince85 (Contributor)
Hi,

I really need help here. I have 3 nodes. The 1st node was started with bootstrap, and the cluster is up and running with 1 node. Now, when the 2nd node tries to join the cluster, the SST step fails. Below is the error on the joiner. I've also attached the log files from the joiner and the donor.


2018-12-21T13:51:14.606655Z 1 [Warning] WSREP: Gap in state sequence. Need state transfer.
2018-12-21T13:51:14.606665Z 1 [Note] WSREP: Setting wsrep_ready to false
2018-12-21T13:51:14.606900Z 0 [Note] WSREP: Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role 'joiner' --address '192.168.50.84' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '3695' --mysqld-version '5.7.23-23-57' --binlog '/var/lib/mysql/mysql-bin' )
2018-12-21T13:51:14.607720Z 0 [ERROR] WSREP: Failed to read 'ready <addr>' from: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '192.168.50.84' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '3695' --mysqld-version '5.7.23-23-57' --binlog '/var/lib/mysql/mysql-bin'
Read: '(null)'
2018-12-21T13:51:14.607838Z 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '192.168.50.84' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '3695' --mysqld-version '5.7.23-23-57' --binlog '/var/lib/mysql/mysql-bin' : 2 (No such file or directory)
2018-12-21T13:51:14.608109Z 1 [ERROR] WSREP: Failed to prepare for 'xtrabackup-v2' SST. Unrecoverable.
2018-12-21T13:51:14.608143Z 1 [ERROR] Aborting

Note: All the ports are open in the firewall. I'm using the latest version of Percona XtraDB Cluster.
Log.zip 84.8K

Comments

  • lorraine.pocklington (Percona Community Manager)
    Could you post your backup logs from all nodes, please?
    Also the my.cnf from the PRIM node and from the one that is failing.
  • prince85 (Contributor)
    No backup log was generated at the data file location. I've attached the .cnf files from both the donor and joiner nodes.
  • vinicius.grippa (Percona Staff)
    Hi,

    Please make sure that, if you are using SELinux, the settings are correct or it is disabled.

    SELinux:
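One way to check the current SELinux state on RHEL 7 is sketched below. It assumes the standard `getenforce` utility is installed; the helper name `selinux_mode` is only illustrative and not part of any Percona tooling.

```shell
# Print the current SELinux mode: Enforcing, Permissive or Disabled.
# Prints "unknown" if getenforce is missing or fails.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce 2>/dev/null || echo "unknown"
    else
        echo "unknown"
    fi
}

selinux_mode

# To rule SELinux out temporarily (as root, lasts until the next reboot):
#   setenforce 0
# To make the change permanent, set SELINUX=permissive (or disabled)
# in /etc/selinux/config and reboot.
```

If the mode is Enforcing, switching to Permissive for a single test run is usually enough to tell whether SELinux is involved.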



    Check if your firewall allows the communication between the nodes on the ports:
    1. Regular MySQL port (default is 3306).
    2. Port for group communication (default is 4567).
    3. Port for State Snapshot Transfer (default is 4444).
    4. Port for Incremental State Transfer (default is the group communication port + 1, i.e. 4568).
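The port list above can be checked from another node with a small helper. This is only a sketch: it assumes bash and the coreutils `timeout` command are available, the function names `pxc_ports` and `check_port` are illustrative, and `PEER_IP` in the usage comment is a placeholder for the address of the node being tested.

```shell
# Print the four PXC ports for a given group communication port
# (IST defaults to the group communication port + 1).
pxc_ports() {
    gcomm_port="${1:-4567}"
    echo "3306 $gcomm_port 4444 $((gcomm_port + 1))"
}

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
check_port() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Usage (replace PEER_IP with the address of the other node):
#   for p in $(pxc_ports); do
#       check_port PEER_IP "$p" && echo "$p open" || echo "$p closed"
#   done
```

Ports 4444 and 4568 are normally only listened on while a state transfer is in progress, so a "closed" result for them outside a transfer does not by itself point to a firewall problem.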
    It is possible to identify that the . Also, check whether there are any files in the datadir on the joiner and remove them. Another possibility is that socat has a problem:



    Check if both are using the same versions.
  • prince85 (Contributor)
    OS:
    Donor:
    [[email protected] ~]# cat /etc/os-release
    NAME="Red Hat Enterprise Linux Server"
    VERSION="7.5 (Maipo)"
    ID="rhel"
    ID_LIKE="fedora"
    VARIANT="Server"
    VARIANT_ID="server"
    VERSION_ID="7.5"
    PRETTY_NAME="Red Hat Enterprise Linux Server 7.5 (Maipo)"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:redhat:enterprise_linux:7.5:GA:server"
    HOME_URL="https://www.redhat.com/"
    BUG_REPORT_URL="https://bugzilla.redhat.com/"

    REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
    REDHAT_BUGZILLA_PRODUCT_VERSION=7.5
    REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
    REDHAT_SUPPORT_PRODUCT_VERSION="7.5"

    Joiner:

    [[email protected] ~]# cat /etc/os-release
    NAME="Red Hat Enterprise Linux Server"
    VERSION="7.5 (Maipo)"
    ID="rhel"
    ID_LIKE="fedora"
    VARIANT="Server"
    VARIANT_ID="server"
    VERSION_ID="7.5"
    PRETTY_NAME="Red Hat Enterprise Linux Server 7.5 (Maipo)"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:redhat:enterprise_linux:7.5:GA:server"
    HOME_URL="https://www.redhat.com/"
    BUG_REPORT_URL="https://bugzilla.redhat.com/"

    REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
    REDHAT_BUGZILLA_PRODUCT_VERSION=7.5
    REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
    REDHAT_SUPPORT_PRODUCT_VERSION="7.5"

    As mentioned in my earlier post, all the ports are open. I also tested listening on the ports mentioned.

    From the error logs I sent, it seems the PRIMARY node is receiving the request from the joiner node.


    socat version:

    Donor:
    [[email protected] ~]# rpm -qa |grep -i socat
    socat-1.7.3.2-2.el7.x86_64


    Joiner:
    [[email protected] ~]# rpm -qa |grep -i socat
    socat-1.7.3.2-2.el7.x86_64

    Yes, I have files in the datadir on the joiner node, as follows.

    [[email protected] mysql]# ll
    total 733540
    -rw-rw----. 1 mysql mysql 56 Sep 28 05:36 auto.cnf
    drwxr-x---. 2 mysql mysql 20 Nov 30 03:16 db1
    -rw-r-----. 1 mysql mysql 134219048 Dec 21 05:51 galera.cache
    -rw-r-----. 1 mysql mysql 0 Dec 21 05:13 grastate.dat
    -rw-r-----. 1 mysql mysql 463 Nov 30 04:28 ib_buffer_pool
    -rw-rw----. 1 mysql mysql 79691776 Dec 21 05:51 ibdata1
    -rw-r-----. 1 mysql mysql 268435456 Dec 21 05:51 ib_logfile0
    -rw-r-----. 1 mysql mysql 268435456 Nov 29 05:40 ib_logfile1
    drwx------. 2 mysql mysql 4096 Sep 28 05:21 mysql
    -rw-r-----. 1 mysql mysql 0 Dec 21 05:51 mysql-bin.index
    -rw-rw----. 1 root root 5 Dec 21 05:51 mysqld_safe.pid
    -rw-r-----. 1 mysql mysql 315228 Dec 21 05:51 mysql-error.log
    -rw-rw----. 1 mysql mysql 3019 Dec 21 05:51 mysql-slow.log
    drwx------. 2 mysql mysql 4096 Sep 28 05:21 performance_schema
    -rw-rw----. 1 mysql mysql 167 Nov 13 04:03 relay-bin.000004
    -rw-rw----. 1 mysql mysql 1257 Nov 16 03:09 relay-bin.000005
    -rw-rw----. 1 mysql mysql 64 Nov 13 04:03 relay-bin.index
    -rw-r--r--. 1 mysql mysql 117 Sep 28 05:21 RPM_UPGRADE_HISTORY
    -rw-r--r--. 1 mysql mysql 117 Sep 28 05:21 RPM_UPGRADE_MARKER-LAST
    drwx------. 2 mysql mysql 60 Nov 13 04:10 Test
    [[email protected] mysql]# pwd
    /var/lib/mysql


    Do I have to delete everything?
    In the demo video on Percona XtraDB Cluster, the files in the joiner node's datadir were not removed.

    Please specify which files I have to remove from the joiner's datadir.
  • vinicius.grippa (Percona Staff)
    Hi,

    Did you check for SELinux?

    Check if SELinux is disabled:




    Do I have to delete everything?

    It is necessary only to remove file.
  • Kenn Takara (Percona Staff)
    The donor node is not starting the SST script (there's no SST logging output in the donor error logs).

    In the wsrep.cnf for the donor node, wsrep_sst_method is commented out:

    # SST method
    #wsrep_sst_method=xtrabackup-v2

    This causes the donor node not to start the donor-side script (thus the joiner node fails).

    For additional SST-only error logging, you can set wsrep_debug in the [sst] section.
    [sst]
    wsrep_debug=ON

    ****
    Actually, PXC defaults to xtrabackup-v2, so it should still work. I would suggest setting the SST wsrep_debug to ON on both sides and seeing what happens. For some reason the SST process is not starting up on the donor side.
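To see quickly whether wsrep_sst_method is active or commented out in a given config file, something like the following can be used. It is a sketch: the function name `sst_method_status` is made up for illustration, and the config path passed to it is whatever file the node actually reads (the wsrep.cnf path in the usage comment is a placeholder).

```shell
# Report how wsrep_sst_method appears in a config file:
#   "active: <line>"     an uncommented setting was found
#   "commented: <line>"  only a commented-out setting was found
#   "missing"            the option is not mentioned at all
sst_method_status() {
    cnf="$1"
    opt='wsrep[-_]sst[-_]method[[:space:]]*='
    active="$(grep -E "^[[:space:]]*$opt" "$cnf" 2>/dev/null | tail -n 1)"
    commented="$(grep -E "^[[:space:]]*#[[:space:]]*$opt" "$cnf" 2>/dev/null | tail -n 1)"
    if [ -n "$active" ]; then
        echo "active: $active"
    elif [ -n "$commented" ]; then
        echo "commented: $commented"
    else
        echo "missing"
    fi
}

# Usage (run on both donor and joiner):
#   sst_method_status /path/to/wsrep.cnf
```

A "commented" or "missing" result means the server falls back to its built-in default for the SST method.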
  • prince85 (Contributor)
    Hi,

    I have disabled SELinux on both servers.

    I enabled wsrep_sst_method=xtrabackup-v2.

    I added the lines below to my.cnf on both servers:
    [sst]
    wsrep_debug=ON

    I have now attached the logs from both the donor and joiner servers.

    Now everything in the joiner datadir has been deleted.
    [[email protected] mysql]# ll
    total 348
    -rw-r-----. 1 mysql mysql 346357 Jan 3 03:48 mysql-error.log
    -rw-rw----. 1 mysql mysql 3275 Dec 24 00:41 mysql-slow.log
    -rw-r--r--. 1 mysql mysql 117 Sep 28 05:21 RPM_UPGRADE_HISTORY
    [[email protected] mysql]# pwd
    /var/lib/mysql

    The innobackup log contains the following:

    xtrabackup: [ERROR] Could not open required defaults file: /etc/my.cnf
    xtrabackup: [ERROR] Fatal error in defaults handling. Program aborted!



    So, I gave the mysql user read permission on the file.
    -rw-r-----. 1 root mysql 2043 Dec 21 04:57 my.cnf


    After all this, when I start mysql on the joiner it does not do anything, and nothing is written to the logs.
    [[email protected] mysql]# systemctl start mysql
    Job for mysql.service failed because the control process exited with error code. See "systemctl status mysql.service" and "journalctl -xe" for details.
    Log.zip 49.9K
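The xtrabackup error above means the SST script could not open /etc/my.cnf. A quick way to confirm whether a given OS user can read a file is sketched below; the helper name `can_read_as` is illustrative, and it assumes `sudo` is available when checking a user other than the current one.

```shell
# Return 0 if the given OS user can read the file, non-zero otherwise.
# Switching to another user requires root or suitable sudo rights.
can_read_as() {
    user="$1"
    file="$2"
    if [ "$(id -un)" = "$user" ]; then
        [ -r "$file" ]
    else
        sudo -u "$user" test -r "$file"
    fi
}

# Usage on the joiner (as root):
#   can_read_as mysql /etc/my.cnf && echo "readable" || echo "NOT readable"
# One possible fix, matching the permissions shown in the post:
#   chgrp mysql /etc/my.cnf && chmod 640 /etc/my.cnf
```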
  • prince85 (Contributor)
    I have now deleted all the files from the joiner's datadir, and mysql has started on the joiner.

    Thanks for all the help.
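For anyone repeating the final step, a slightly safer variant is to move the joiner's datadir contents aside rather than delete them, so that the joiner still starts from an empty datadir and requests a full SST. This is a sketch of that alternative, not what was done in the thread: the function name `move_datadir_aside` and the backup location are assumptions, and the backup directory must be outside the datadir.

```shell
# Move everything out of a datadir into a timestamped backup directory,
# leaving the datadir empty. Prints the backup path on success.
move_datadir_aside() {
    datadir="$1"
    backup="$2/datadir-backup-$(date +%Y%m%d%H%M%S)"
    mkdir -p "$backup" || return 1
    # Top-level entries only, dotfiles included.
    find "$datadir" -mindepth 1 -maxdepth 1 -exec mv {} "$backup"/ \;
    # Fail if anything was left behind.
    [ -z "$(ls -A "$datadir")" ] || return 1
    echo "$backup"
}

# Usage on the joiner (as root, with MySQL stopped):
#   systemctl stop mysql
#   move_datadir_aside /var/lib/mysql /var/lib/mysql-backups
#   systemctl start mysql
```

Once the node has joined and synced, the backup directory can be removed.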
