PXC 8.0.32 JOINER node unable to SST "/usr/bin/ls: Operation not permitted"

Hi,

We’ve been working with a client who has a 3-node PXC 8.0.32 cluster on RHEL 8.10. The cluster had been running fine for several months and was restarted last week after some patching on one of the nodes.

The bootstrap node came up fine, but both joiner nodes failed during SST, with the following as the core error:

2025-02-19T08:42:49.925402-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted

Line 1248 of wsrep_sst_xtrabackup-v2 is doing this:

sockets=$(ls -l /proc/$pid/fd | grep socket | cut -d'[' -f2 | cut -d ']' -f1 | tr '\n' '|')
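
To illustrate what that line is meant to produce, here is a minimal sketch that runs the same filter chain over canned `ls -l` text instead of a live /proc/$pid/fd. The sample listing, the inode numbers, and the function names are made up for illustration. Note that the message format in the log ("line 1248: /usr/bin/ls: Operation not permitted") suggests the shell could not even execute ls, so in the failing case none of the later stages receive any input.

```shell
#!/bin/sh
# Same filter chain as line 1248, reading `ls -l` style text from stdin.
extract_socket_inodes() {
    grep socket | cut -d'[' -f2 | cut -d ']' -f1 | tr '\n' '|'
}

# Hypothetical sample of what `ls -l /proc/<pid>/fd` might print.
sample_fd_listing() {
    cat <<'EOF'
total 0
lr-x------ 1 mysql mysql 64 Feb 19 12:10 0 -> /dev/null
lrwx------ 1 mysql mysql 64 Feb 19 12:10 3 -> socket:[12345]
lrwx------ 1 mysql mysql 64 Feb 19 12:10 4 -> socket:[67890]
EOF
}

# Only the socket lines survive; the result is a '|'-separated inode list.
sockets=$(sample_fd_listing | extract_socket_inodes)
echo "$sockets"   # prints 12345|67890|
```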

The client rolled back the updates on the patched server, but the issue persisted, so we suspected that some security policy or hardening had been applied across all servers in recent weeks or months.

We’ve subsequently narrowed it down to the “capabilities” in the mysqld service script /etc/systemd/system/mysqld.service:

CapabilityBoundingSet=CAP_IPC_LOCK CAP_DAC_OVERRIDE CAP_AUDIT_WRITE

After some trial and error (with a test service) we found that adding two extra capabilities CAP_SYS_PTRACE and CAP_DAC_READ_SEARCH was sufficient to prevent the error and allow the joiner nodes to SST successfully:

CapabilityBoundingSet=CAP_IPC_LOCK CAP_DAC_OVERRIDE CAP_AUDIT_WRITE CAP_SYS_PTRACE CAP_DAC_READ_SEARCH
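
One way to confirm which capabilities a running process actually ended up with is to decode the CapBnd hex mask from /proc/<pid>/status. The sketch below is only an illustration and is not part of the SST script: the helper names has_cap and capbnd_of are hypothetical, and the bit numbers (CAP_DAC_OVERRIDE = 1, CAP_DAC_READ_SEARCH = 2, CAP_IPC_LOCK = 14, CAP_SYS_PTRACE = 19, CAP_AUDIT_WRITE = 29) are taken from linux/capability.h.

```shell
#!/bin/sh
# has_cap MASK_HEX BIT: succeeds if capability number BIT is set in MASK_HEX.
# (Hypothetical helper; MASK_HEX is the value after "CapBnd:", without "0x".)
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Print the bounding-set mask of a PID (defaults to this shell).
capbnd_of() {
    awk '/^CapBnd:/ {print $2}' "/proc/${1:-$$}/status"
}

# Masks corresponding to the two CapabilityBoundingSet lines above:
#   IPC_LOCK(14) | DAC_OVERRIDE(1) | AUDIT_WRITE(29)  = 20004002
#   ... plus SYS_PTRACE(19) | DAC_READ_SEARCH(2)      = 20084006
for mask in 0000000020004002 0000000020084006; do
    for bit in 2 19; do
        if has_cap "$mask" "$bit"; then
            echo "mask $mask: capability $bit present"
        else
            echo "mask $mask: capability $bit missing"
        fi
    done
done
```

Against a live node this could be used as `has_cap "$(capbnd_of "$(pidof mysqld)")" 19`; `capsh --decode=<mask>` gives the same information in named form where libcap's tools are installed.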

Good that we fixed it but the problem is that we don’t know why this is happening in this client’s environment - we haven’t encountered this internally or on any of our other client systems.

We’ve checked and ruled out pretty much everything we know of: selinux, fapolicyd, apparmor, /proc mount flags and a whole bunch of weird and wonderful systemd and kernel settings.

Anybody encountered something similar? What else can we check in terms of audit logging to trace why the ls on /proc is being denied?
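
On the audit-logging question, one possible approach is a temporary syscall rule for failed execve calls. This is a sketch, assuming auditd is running and that the failure is the execve of /usr/bin/ls returning EPERM, which is what the bash error format suggests. The key name sst_eperm is arbitrary, and the script only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Dry-run by default: print each command; run it only when APPLY=1 (as root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

KEY=sst_eperm   # arbitrary audit key, chosen for this example

# Log every 64-bit execve() that fails with EPERM.
run auditctl -a always,exit -F arch=b64 -S execve -F exit=-EPERM -k "$KEY"

# ... restart the joiner and let SST fail, then inspect the records:
run ausearch -k "$KEY" -i

# Remove the temporary rule afterwards.
run auditctl -d always,exit -F arch=b64 -S execve -F exit=-EPERM -k "$KEY"
```

If records show up, they should at least confirm which syscall is being refused and under which user and executable; they will not by themselves say which kernel check refused it.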

Any wisdom much appreciated.

thanks,

Neil

@Neil_Billett Indeed, it looks like some permission or security restriction. By any chance, is the setup deployed in a container-based environment such as Docker or k8s?

We’ve checked and ruled out pretty much everything we know of: selinux, fapolicyd, apparmor, /proc mount flags and a whole bunch of weird and wonderful systemd and kernel settings.

If these areas have already been checked, have you noticed any warnings or anything interesting in the kernel logs (dmesg -T, /var/log/messages, etc.)? I assume the user and permissions used to run the target service/DB are not a blocker? Are they the same across all cluster nodes?

Can you share a few more lines before the “Operation not permitted” message?

Hi Anil,

Thanks for the response.

No - no docker or k8s as far as I’m aware. This is baremetal with PXC installed by RPM and controlled via systemd.

We had a good scour of the obvious logs like /var/log/messages and dmesg but couldn’t find anything related to the error.

As far as I’m aware the users/permissions are the same across the servers running the 3 nodes.

Here’s a bit more around the error from one of the JOINER logs - I’ve hidden any sensitive info:

2025-02-19T12:10:58.487415-05:00 2 [Note] [MY-000000] [WSREP] Server status change connected -> joiner
2025-02-19T12:10:58.487424-05:00 2 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2025-02-19T12:10:58.487711-05:00 0 [Note] [MY-000000] [WSREP] Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'hidden' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib64/mysql/plugin/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '1861' --mysqld-version '8.0.32-24.2' '' )
2025-02-19T12:10:58.996220-05:00 0 [Warning] [MY-000000] [WSREP-SST] wsrep_node_address or wsrep_sst_receive_address not set. Consider setting them if SST fails.
2025-02-19T12:10:59.665925-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
2025-02-19T12:11:00.042427-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
2025-02-19T12:11:00.432187-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
2025-02-19T12:11:00.799872-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
2025-02-19T12:11:00.985898-05:00 0 [Note] [MY-000000] [Galera] (685a5b4b-9d55, 'tcp://0.0.0.0:4567') turning message relay requesting off
2025-02-19T12:11:01.169222-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
2025-02-19T12:11:01.564026-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
2025-02-19T12:11:01.973882-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
2025-02-19T12:11:02.370377-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
.
many repeating errors
.
2025-02-19T12:12:38.308964-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
2025-02-19T12:12:38.686194-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
2025-02-19T12:12:39.060653-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
2025-02-19T12:12:39.439926-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1248: /usr/bin/ls: Operation not permitted
2025-02-19T12:12:39.482672-05:00 0 [Note] [MY-000000] [WSREP-SST] Trying to terminate (2425) socat -u TCP-LISTEN:4444,reuseaddr,retry=30 stdio | /usr/bin/pxc_extra/pxb-8.0/bin/xbstream -x with SIGTERM
2025-02-19T12:12:39.499262-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 218: 2427 Exit 143 socat -u TCP-LISTEN:4444,reuseaddr,retry=30 stdio
2025-02-19T12:12:39.499346-05:00 0 [Note] [MY-000000] [WSREP-SST] 2428 Terminated | /usr/bin/pxc_extra/pxb-8.0/bin/xbstream -x
2025-02-19T12:12:40.502410-05:00 0 [ERROR] [MY-000000] [WSREP-SST] ******************* FATAL ERROR **********************
2025-02-19T12:12:40.502511-05:00 0 [ERROR] [MY-000000] [WSREP-SST] Possible timeout in receving first data from donor in gtid/keyring stage
2025-02-19T12:12:40.502531-05:00 0 [ERROR] [MY-000000] [WSREP-SST] Line 1381
2025-02-19T12:12:40.502544-05:00 0 [ERROR] [MY-000000] [WSREP-SST] ******************************************************
2025-02-19T12:12:40.502574-05:00 0 [ERROR] [MY-000000] [WSREP-SST] Cleanup after exit with status:32

…and a little later:

2025-02-19T12:12:51.577110-05:00 0 [ERROR] [MY-000000] [WSREP] Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'hidden' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib64/mysql/plugin/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '1861' --mysqld-version '8.0.32-24.2' '' : 32 (Broken pipe)
2025-02-19T12:12:51.577203-05:00 0 [ERROR] [MY-000000] [WSREP] Failed to read uuid:seqno from joiner script.
2025-02-19T12:12:51.577234-05:00 0 [ERROR] [MY-000000] [WSREP] SST script aborted with error 32 (Broken pipe)

…and a little later the node shuts down:

2025-02-19T12:12:51.582179-05:00 3 [Note] [MY-000000] [Galera] PC protocol downgrade 1 -> 0
2025-02-19T12:12:51.582195-05:00 3 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view ((empty))
2025-02-19T12:12:51.582491-05:00 3 [Note] [MY-000000] [Galera] gcomm: closed
2025-02-19T12:12:51.582541-05:00 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2025-02-19T12:12:51.582621-05:00 0 [Note] [MY-000000] [Galera] Flow-control interval: [100, 100]
2025-02-19T12:12:51.582640-05:00 0 [Note] [MY-000000] [Galera] Received NON-PRIMARY.
2025-02-19T12:12:51.582653-05:00 0 [Note] [MY-000000] [Galera] Shifting JOINER -> OPEN (TO: 111583020)
2025-02-19T12:12:51.582681-05:00 0 [Note] [MY-000000] [Galera] New SELF-LEAVE.
2025-02-19T12:12:51.582700-05:00 0 [Note] [MY-000000] [Galera] Flow-control interval: [0, 0]
2025-02-19T12:12:51.582713-05:00 0 [Note] [MY-000000] [Galera] Received SELF-LEAVE. Closing connection.
2025-02-19T12:12:51.582742-05:00 0 [Note] [MY-000000] [Galera] Shifting OPEN -> CLOSED (TO: 111583020)
2025-02-19T12:12:51.582772-05:00 0 [Note] [MY-000000] [Galera] RECV thread exiting 0: Success
2025-02-19T12:12:51.582880-05:00 3 [Note] [MY-000000] [Galera] recv_thread() joined.
2025-02-19T12:12:51.582903-05:00 3 [Note] [MY-000000] [Galera] Closing replication queue.
2025-02-19T12:12:51.582915-05:00 3 [Note] [MY-000000] [Galera] Closing slave action queue.
2025-02-19T12:12:51.582934-05:00 3 [Note] [MY-000000] [Galera] /usr/sbin/mysqld: Terminated.
2025-02-19T12:12:51.582949-05:00 3 [Note] [MY-000000] [WSREP] Initiating SST cancellation
2025-02-19T12:12:51.582961-05:00 3 [Note] [MY-000000] [WSREP] Terminating SST process

thanks,

Neil