PXC 8.0.32 - nodes not joining the cluster

Hello,
We have a 3-node Percona cluster. The first node is bootstrapped and we are trying to join nodes 2 and 3 to the cluster, but the mysql service fails after it attempts SST.

2025-02-19T08:44:39.527965-05:00 0 [ERROR] [MY-000000] [Galera] async IST sender failed to serve tcp://10.32.74.231:4568: ist send failed: ', asio error 'Failed to read: Connection reset by peer: 104 (Connection reset by peer)
at galerautils/src/gu_asio_stream_react.cpp:throw_sync_op_error():141': 104 (Connection reset by peer)
at galera/src/ist.cpp:send():862
2025-02-19T08:44:39.528075-05:00 0 [Note] [MY-000000] [Galera] async IST sender served
2025-02-19T08:44:39.671549-05:00 0 [Warning] [MY-000000] [WSREP-SST] wsrep_node_address or wsrep_sst_receive_address not set. Consider setting them if SST fails.
2025-02-19T08:44:39.887101-05:00 26515 [Warning] [MY-013712] [Server] No suitable 'keyring_component_metadata_query' service implementation found to fulfill the request.
2025-02-19T08:44:39.904310-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:39 socat[1610805] E write(7, 0x5614d46b5000, 168): Connection refused
2025-02-19T08:44:40.911086-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:40 socat[1610822] E write(7, 0x562461384000, 168): Connection refused
2025-02-19T08:44:41.917966-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:41 socat[1610825] E write(7, 0x56184b596000, 146): Connection refused
2025-02-19T08:44:42.924766-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:42 socat[1610828] E write(7, 0x55eb20845000, 168): Connection refused
2025-02-19T08:44:43.931327-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:43 socat[1610845] E write(7, 0x564595457000, 168): Connection refused
2025-02-19T08:44:44.531016-05:00 0 [Note] [MY-000000] [Galera] cleaning up 680836fa-8def (tcp://10.32.74.231:4567)
2025-02-19T08:44:44.937792-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:44 socat[1610848] E write(7, 0x5652f4b21000, 168): Connection refused
2025-02-19T08:44:45.944729-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:45 socat[1610851] E write(7, 0x564e9ff42000, 168): Connection refused
2025-02-19T08:44:46.951024-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:46 socat[1610868] E write(7, 0x55b211263000, 168): Connection refused
2025-02-19T08:44:47.958345-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:47 socat[1610871] E write(7, 0x5558906e3000, 168): Connection refused
2025-02-19T08:44:48.965889-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:48 socat[1610874] E write(7, 0x56348728d000, 168): Connection refused
2025-02-19T08:44:49.973197-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:49 socat[1610891] E write(7, 0x5613386a8000, 168): Connection refused

The donor is showing the errors below:

2025-02-19T08:44:39.527965-05:00 0 [ERROR] [MY-000000] [Galera] async IST sender failed to serve tcp://10.32.74.231:4568: ist send failed: ', asio error 'Failed to read: Connection reset by peer: 104 (Connection reset by peer)
at galerautils/src/gu_asio_stream_react.cpp:throw_sync_op_error():141': 104 (Connection reset by peer)
at galera/src/ist.cpp:send():862
2025-02-19T08:44:39.528075-05:00 0 [Note] [MY-000000] [Galera] async IST sender served
2025-02-19T08:44:39.671549-05:00 0 [Warning] [MY-000000] [WSREP-SST] wsrep_node_address or wsrep_sst_receive_address not set. Consider setting them if SST fails.
2025-02-19T08:44:39.887101-05:00 26515 [Warning] [MY-013712] [Server] No suitable 'keyring_component_metadata_query' service implementation found to fulfill the request.
2025-02-19T08:44:39.904310-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:39 socat[1610805] E write(7, 0x5614d46b5000, 168): Connection refused
2025-02-19T08:44:40.911086-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:40 socat[1610822] E write(7, 0x562461384000, 168): Connection refused
2025-02-19T08:44:41.917966-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:41 socat[1610825] E write(7, 0x56184b596000, 146): Connection refused
2025-02-19T08:44:42.924766-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:42 socat[1610828] E write(7, 0x55eb20845000, 168): Connection refused
2025-02-19T08:44:43.931327-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:43 socat[1610845] E write(7, 0x564595457000, 168): Connection refused
2025-02-19T08:44:44.531016-05:00 0 [Note] [MY-000000] [Galera] cleaning up 680836fa-8def (tcp://10.32.74.231:4567)
2025-02-19T08:44:44.937792-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:44 socat[1610848] E write(7, 0x5652f4b21000, 168): Connection refused
2025-02-19T08:44:45.944729-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:45 socat[1610851] E write(7, 0x564e9ff42000, 168): Connection refused
2025-02-19T08:44:46.951024-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:46 socat[1610868] E write(7, 0x55b211263000, 168): Connection refused
2025-02-19T08:44:47.958345-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:47 socat[1610871] E write(7, 0x5558906e3000, 168): Connection refused
2025-02-19T08:44:48.965889-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:48 socat[1610874] E write(7, 0x56348728d000, 168): Connection refused
2025-02-19T08:44:49.973197-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:49 socat[1610891] E write(7, 0x5613386a8000, 168): Connection refused

There are no permission issues with /usr, /usr/bin, /usr/bin/ls, /proc or /usr/bin/wsrep_sst_xtrabackup-v2. There is no connectivity issue between the servers on ports 4444, 4567 and 4568. SELinux and the local firewall are disabled, and there are no iptables entries.

Can someone please help us fix this issue?

-Abhi

2025-02-19T08:44:39.904310-05:00 0 [Note] [MY-000000] [WSREP-SST] 2025/02/19 08:44:39 socat[1610805] E write(7, 0x5614d46b5000, 168): Connection refused

This looks like a network issue; the log says so quite directly. You can test this yourself by running socat in listen mode on node1, then using socat on node2 to send some text to node1. If that fails, then you know 100% there are network issues.

Hi @matthewb, we tested with the commands below on ports 4444, 4567 and 4568 and didn't see any issues.

node1: socat - TCP-LISTEN:4444
node2: echo "hello" | socat - TCP:ip.adr.of.node1:4444
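
To repeat that check across all three Galera ports in one go, a small helper along these lines may be handy. It is only a sketch: check_port is a made-up name, it relies on bash's /dev/tcp pseudo-device and coreutils timeout, and it can only report a port as open while something is listening on it. Port 4444 is normally open only during an SST, so the socat listener on node1 has to be running for that port.

```shell
# check_port HOST PORT: prints "HOST:PORT open" or "HOST:PORT closed".
# Hypothetical helper; needs bash (for /dev/tcp) and coreutils timeout.
check_port() {
    # Try to open a TCP connection, giving up after 3 seconds.
    if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
        echo "$1:$2 open"
    else
        echo "$1:$2 closed"
    fi
}

# Usage from node2, with the real address of node1 in place of the placeholder:
#   for p in 4444 4567 4568; do check_port ip.adr.of.node1 "$p"; done
```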

If we switch to the mysql user and run mysqld &, it works and the node syncs with its donor. Strangely, the service startup/SST fails when started through systemd (service mysqld restart). Maybe it is some permission issue, since this is run as the root user, but we can't seem to figure it out.

-Abhi

The systemd unit should have user and group set to mysql.
What exact command did you execute to start MySQL?
And can you share the output of:

systemctl cat mysqld
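
Since a manual mysqld & works and only the systemd start fails, the interesting part of that output is probably the handful of directives that change the identity or privileges of the service. A rough filter for them, assuming the usual systemd directive names (the function name and the sample unit text are invented for illustration and are not taken from the real unit file):

```shell
# list_sandbox_directives: read unit-file text on stdin and print only the
# lines that set the run-as user/group or restrict privileges.
list_sandbox_directives() {
    grep -E '^(User|Group|CapabilityBoundingSet|AmbientCapabilities|NoNewPrivileges|SystemCallFilter|Protect[A-Za-z]+|Private[A-Za-z]+)='
}

# Demo on made-up unit text (NOT the real mysql.service):
printf '%s\n' '[Service]' 'User=mysql' 'Group=mysql' \
    'ExecStart=/usr/sbin/mysqld' 'CapabilityBoundingSet=CAP_SYS_NICE' \
    | list_sandbox_directives
```

On a real node the input would come from systemd instead, for example: systemctl cat mysqld | list_sandbox_directives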

Hi Yunus, these are the permissions we have, if this is what you are asking. I think the permissions on the files/folders are correct. We validated them against a working environment.

ls -la /etc/ |grep systemd
drwxr-xr-x. 4 root root 195 May 18 2023 systemd

ls -la /etc/systemd/system/mysqld.service
lrwxrwxrwx 1 root root 37 Jul 25 2023 /etc/systemd/system/mysqld.service -> /usr/lib/systemd/system/mysql.service

ls -la /usr/lib/systemd/system/mysql.service
-rw-r--r-- 1 root root 4080 Feb 20 15:38 /usr/lib/systemd/system/mysql.service

I am executing service mysqld start / systemctl start mysql. Here is the error we are currently trying to solve on the joiner node:

2025-02-20T10:57:13.824732-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1254: /usr/bin/ls: Operation not permitted
2025-02-20T10:57:13.828019-05:00 0 [Note] [MY-000000] [WSREP-SST] (debug) 1258: sockets:
2025-02-20T10:57:14.029559-05:00 0 [Note] [MY-000000] [WSREP-SST] (debug) 1228: Entering loop body : 300
2025-02-20T10:57:14.049399-05:00 0 [Note] [MY-000000] [WSREP-SST] (debug) 1237: Examining pid: 41914
2025-02-20T10:57:14.100138-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1250: /usr/bin/ls: Operation not permitted
2025-02-20T10:57:14.100391-05:00 0 [Note] [MY-000000] [WSREP-SST] (debug) 1250: Testing ls :
2025-02-20T10:57:14.101000-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1251: /usr/bin/ls: Operation not permitted
2025-02-20T10:57:14.101283-05:00 0 [Note] [MY-000000] [WSREP-SST] (debug) 1251: Testing ls :
2025-02-20T10:57:14.101770-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1252: /usr/bin/ls: Operation not permitted
2025-02-20T10:57:14.102034-05:00 0 [Note] [MY-000000] [WSREP-SST] (debug) 1252: Testing ls :
2025-02-20T10:57:14.102819-05:00 0 [Note] [MY-000000] [WSREP-SST] /usr/bin/wsrep_sst_xtrabackup-v2: line 1254: /usr/bin/ls: Operation not permitted

Here is systemctl cat mysqld output.

systemctl.txt (4.2 KB)

-Abhi

Hi!

wsrep_node_address or wsrep_sst_receive_address not set.
Did you check that this is configured correctly on all nodes?

Also, did you check that all ports are open on all 3 servers and not just on 1?

The owner/group of all MySQL files should be mysql.

Are all nodes on the same version and Galera protocol?

You can also try enabling "wsrep_debug" for extra verbosity.

Regards

Hi All,

Commenting out CapabilityBoundingSet in mysqld.service worked, and SST succeeded. There may have been some security hardening on these 3 DB instances that somehow prevents the ls command from executing under that setting. We were able to reproduce the same issue with a test service script.
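
For anyone who would rather not edit the packaged unit file (a package upgrade may restore it), the same effect can probably be had with a systemd drop-in. The sketch below only prints the drop-in text; the function name is invented, and the unit name mysql.service is taken from the symlink target shown earlier in the thread. Per systemd.exec(5), CapabilityBoundingSet=~ resets the bounding set to the full set of capabilities, whereas an empty CapabilityBoundingSet= would do the opposite and drop all of them.

```shell
# render_override: print a drop-in that lifts the capability bounding set.
# Hypothetical helper; it writes nothing to disk by itself.
render_override() {
    # "~" with no capability names means "all capabilities", which undoes
    # any CapabilityBoundingSet= line in the packaged unit.
    printf '[Service]\nCapabilityBoundingSet=~\n'
}

render_override
```

To apply it, the output would go into /etc/systemd/system/mysql.service.d/override.conf (create the directory first), followed by systemctl daemon-reload and a service restart. Treat the path as an assumption and confirm that the drop-in was picked up with systemctl cat mysql.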

Thanks for all your valuable inputs.

-Abhi