Adding second node gets [WSREP] posix_spawnp() failed: 13 (Permission denied)

Andrew, Novice
edited September 4 in Percona XtraDB Cluster 8.x

I'm setting up a cluster as a PoC to test some performance and resilience stuff as part of developing a roadmap.

I have bootstrapped the primary node and set up the second node in line with the documentation, but when I start the second node it fails with these errors in the log:

2020-09-04T12:08:33.195018Z 2 [Note] [MY-000000] [Galera] State transfer required:
    Group state: 0bfa5e83-ee9a-11ea-87c7-bef4ee9147f0:11
    Local state: 00000000-0000-0000-0000-000000000000:-1
2020-09-04T12:08:33.195032Z 2 [Note] [MY-000000] [WSREP] Server status change connected -> joiner
2020-09-04T12:08:33.195044Z 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notifi
2020-09-04T12:08:33.195139Z 0 [Note] [MY-000000] [WSREP] Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'x.x.x.x' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib64/mysql/plugin/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '11943' --mysqld-version '8.0.19-10'  '' )
2020-09-04T12:08:33.195431Z 0 [ERROR] [MY-000000] [WSREP] posix_spawnp() failed: 13 (Permission denied)
2020-09-04T12:08:33.195492Z 0 [ERROR] [MY-000000] [WSREP] Failed to execute: wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'x.x.x.x' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib64/mysql/plugin/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '11943' --mysqld-version '8.0.19-10'  '' : 13 (Permission denied)
2020-09-04T12:08:33.195592Z 2 [ERROR] [MY-000000] [WSREP] Failed to prepare for 'xtrabackup-v2' SST. Unrecoverable.
2020-09-04T12:08:33.195686Z 2 [ERROR] [MY-000000] [Galera] SST request callback failed. This is unrecoverable, restart required.
2020-09-04T12:08:33.195703Z 2 [Note] [MY-000000] [Galera] ReplicatorSMM::abort()
2020-09-04T12:08:33.195718Z 2 [Note] [MY-000000] [Galera] Closing send monitor...
2020-09-04T12:08:33.195732Z 2 [Note] [MY-000000] [Galera] Closed send monitor.
2020-09-04T12:08:33.195746Z 2 [Note] [MY-000000] [Galera] gcomm: terminating thread
2020-09-04T12:08:33.195762Z 2 [Note] [MY-000000] [Galera] gcomm: joining thread
2020-09-04T12:08:33.195838Z 2 [Note] [MY-000000] [Galera] gcomm: closing backend
2020-09-04T12:08:33.206299Z 0 [Note] [MY-000000] [WSREP] Initiating SST cancellation
12:08:33 UTC - mysqld got signal 11 ;

Error 13 (Permission denied) obviously suggests a permissions issue, but I'm unclear whether it's permissions on the local file system or something reported from the primary node. I can see entries in the primary node's log acknowledging the connection from the second node, but nothing there reports any errors.

Any pointers as to where the permissions issue lies? Or is it something I've missed?
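
Since posix_spawnp() is called by mysqld itself to launch the SST script, errno 13 (EACCES) is raised by the joiner's own operating system rather than reported by the primary node. The following is only a minimal sketch of local checks one might run on the joiner. `check_exec` and `selinux_hints` are hypothetical helper names, and SELinux is merely a guess at a Red Hat-specific cause; nothing in this thread confirms it.

```shell
# check_exec PATH: report whether the current user may execute PATH.
check_exec() {
    if [ ! -e "$1" ]; then
        echo "missing: $1"
        return 2
    elif [ -x "$1" ]; then
        echo "executable: $1"
        return 0
    else
        echo "not executable: $1"
        return 1
    fi
}

# selinux_hints: print the SELinux mode and recent denials, if the tools exist.
# Assumption: SELinux might be involved on RHEL; the thread does not confirm it.
selinux_hints() {
    if command -v getenforce >/dev/null 2>&1; then
        echo "SELinux mode: $(getenforce)"
    else
        echo "getenforce not found; SELinux tools do not appear to be installed"
    fi
    if command -v ausearch >/dev/null 2>&1; then
        # AVC records are the audit log entries for SELinux denials.
        # Reading them usually requires root.
        ausearch -m avc -ts recent 2>/dev/null ||
            echo "no recent AVC denials found (or no permission to read the audit log)"
    fi
}
```

A possible use, from a shell running as the same user as mysqld, would be `check_exec /usr/bin/wsrep_sst_xtrabackup-v2` followed by `selinux_hints`. The `/usr/bin` location is an assumption based on the `--basedir '/usr/'` shown in the log, so adjust it to wherever the script is actually installed.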

Best Answer

  • Andrew, Novice
    Accepted Answer

    I've since deployed on Ubuntu and did not hit the same issue. I can only surmise that it's a permissions issue specific to Red Hat that doesn't apply to Ubuntu.

    I now have a working cluster. Off to configure ProxySQL now.


  • jrivera, Percona Support Engineer (Percona Staff)

    This might be a permissions problem on the joiner node. Can you share your configuration file? Which exact version of PXC are you installing? Which specific OS are you running these nodes on?

  • Andrew, Novice

    Installing the latest full cluster from the YUM repo on RHEL 8:

    yum list installed | grep percona

    percona-release.noarch                       1.0-24           @@commandline
    percona-testing.noarch                       0.0-1            @System
    percona-xtrabackup-80.x86_64                 8.0.14-1.el8     @tools-release-x86_64
    percona-xtradb-cluster-client.x86_64         8.0.19-10.1.el8  @pxc-80-release-x86_64
    percona-xtradb-cluster-debuginfo.x86_64      8.0.19-10.1.el8  @pxc-80-release-x86_64
    percona-xtradb-cluster-devel.x86_64          8.0.19-10.1.el8  @pxc-80-release-x86_64
    percona-xtradb-cluster-full.x86_64           8.0.19-10.1.el8  @pxc-80-release-x86_64
    percona-xtradb-cluster-garbd.x86_64          8.0.19-10.1.el8  @pxc-80-release-x86_64
    percona-xtradb-cluster-server.x86_64         8.0.19-10.1.el8  @pxc-80-release-x86_64
    percona-xtradb-cluster-shared.x86_64         8.0.19-10.1.el8  @pxc-80-release-x86_64
    percona-xtradb-cluster-shared-compat.x86_64  8.0.19-10.1.el8  @pxc-80-release-x86_64
    percona-xtradb-cluster-test.x86_64           8.0.19-10.1.el8  @pxc-80-release-x86_64

    my.cnf :

    # Template my.cnf for PXC

    # Edit to your requirements.










    # Binary log expiration period is 604800 seconds, which equals 7 days


    ######## wsrep ###############

    # Path to Galera library


    # Cluster connection URL contains IPs of nodes

    #If no IP is found, this implies that a new cluster needs to be created,

    #in order to do that you need to bootstrap this node


    # In order for Galera to work correctly binlog format should be ROW


    # Slave thread to use



    # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera


    # Node IP address


    # Cluster name


    #If wsrep_node_name is not specified, then system hostname will be used


    #pxc_strict_mode allowed values: DISABLED,PERMISSIVE,ENFORCING,MASTER


    # SST method







  • Andrew, Novice
    edited September 8
    The plot thickens.

    When I run mysqld manually as the MySQL user instead of using 'systemctl', I don't get the posix_spawnp() error, but I do get the following instead. Does that suggest a permissions issue around the systemctl process?


    2020-09-08T09:07:40.902147Z 0 [ERROR] [MY-000000] [WSREP-SST] ******************* FATAL ERROR **********************
    2020-09-08T09:07:40.902193Z 0 [ERROR] [MY-000000] [WSREP-SST] Possible timeout in receving first data from donor in gtid/keyring stage
    2020-09-08T09:07:40.902209Z 0 [ERROR] [MY-000000] [WSREP-SST] Line 1108
    2020-09-08T09:07:40.902221Z 0 [ERROR] [MY-000000] [WSREP-SST] ******************************************************
    2020-09-08T09:07:40.902233Z 0 [ERROR] [MY-000000] [WSREP-SST] Cleanup after exit with status:32
    2020-09-08T09:07:40.910684Z 0 [ERROR] [MY-000000] [WSREP] Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'x.x.x.x' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib64/mysql/plugin/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '7807' --mysqld-version '8.0.19-10'   '' : 32 (Broken pipe)
    2020-09-08T09:07:40.910723Z 0 [ERROR] [MY-000000] [WSREP] Failed to read uuid:seqno from joiner script.
    2020-09-08T09:07:40.910768Z 0 [ERROR] [MY-000000] [WSREP] SST script aborted with error 32 (Broken pipe)
    2020-09-08T09:07:40.910865Z 3 [Note] [MY-000000] [Galera] Processing SST received
    2020-09-08T09:07:40.910895Z 3 [Note] [MY-000000] [Galera] SST received: 00000000-0000-0000-0000-000000000000:-1
    2020-09-08T09:07:40.910917Z 3 [System] [MY-000000] [WSREP] SST completed
    2020-09-08T09:07:40.911416Z 1 [Note] [MY-000000] [Galera]  str_proto_ver_: 3 sst_seqno_: -1 cc_seqno: 40 req->ist_len(): 0
    2020-09-08T09:07:40.911446Z 1 [ERROR] [MY-000000] [Galera] Application received wrong state:
            Received: 00000000-0000-0000-0000-000000000000
            Required: 0bfa5e83-ee9a-11ea-87c7-bef4ee9147f0
    2020-09-08T09:07:40.911462Z 1 [ERROR] [MY-000000] [Galera] Application state transfer failed. This is unrecoverable condition, restart required.
    2020-09-08T09:07:40.911479Z 1 [Note] [MY-000000] [Galera] ReplicatorSMM::abort()
    2020-09-08T09:07:40.911497Z 1 [Note] [MY-000000] [Galera] Closing send monitor...
    2020-09-08T09:07:40.911513Z 1 [Note] [MY-000000] [Galera] Closed send monitor.
    2020-09-08T09:07:40.911528Z 1 [Note] [MY-000000] [Galera] recv_thread() joined.
    2020-09-08T09:07:40.911542Z 1 [Note] [MY-000000] [Galera] Closing replication queue.
    2020-09-08T09:07:40.911556Z 1 [Note] [MY-000000] [Galera] Closing slave action queue.
    2020-09-08T09:07:40.911575Z 1 [Note] [MY-000000] [Galera] mysqld: Terminated.
    2020-09-08T09:07:40.911597Z 1 [Note] [MY-000000] [WSREP] Initiating SST cancellation

  • jrivera, Percona Support Engineer (Percona Staff)
    Can you also send us the contents of the mysqld log on the DONOR side, preferably covering the same timestamps as the log from the JOINER?
  • Andrew, Novice
    Files attached
    I've attached files for both starting using systemctl, and for a manual mysqld execution as the MySQL user.
  • Andrew, Novice
    edited September 16
    Ignore the issue from the manual start. That was a red herring.
    The issue remains the posix_spawnp() error.
    I've rebuilt the 3 servers to make sure there wasn't anything I'd missed in the build, but I still get the same error when I attempt to connect nodes 2 and 3 to node 1 for the first time.
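
For the request above to compare DONOR and JOINER logs over the same timestamps, a small filter can cut the same time window out of each log. This is only a sketch: `log_window` is a hypothetical helper name, and it assumes every line of interest starts with the ISO 8601 UTC timestamp seen in the logs in this thread.

```shell
# log_window FROM TO FILE: print lines whose first field (the timestamp that
# mysqld writes at the start of each entry) lies between FROM and TO inclusive.
# Timestamps in this fixed format sort lexicographically, so a plain string
# comparison is enough. Lines that do not start with a timestamp in the
# window, such as indented continuation lines, are dropped.
log_window() {
    awk -v from="$1" -v to="$2" '$1 >= from && $1 <= to' "$3"
}
```

For example, `log_window 2020-09-04T12:08:30 2020-09-04T12:08:40 /var/log/mysqld.log` run on each node would give two excerpts that can be read side by side. The log path here is an assumption; use whatever log file the server is actually configured to write.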
