Node failure in Cluster

An existing 3-node cluster stopped accepting updates to the database. I traced the problem to one offending node and stopped it. That node is now not synced, and the 2 remaining nodes are fine. It appears to be a communication issue. I have tried running socat on ports 4444, 4567, and 4568 from both the healthy nodes and the offending node, with no valid results.
Node3_log_snapshot.docx (194.1 KB)

Hi @Mikem, thanks for posting to the Percona forums!

Have you checked for host / network firewalls blocking these ports? You are correct that you need to ensure 4444, 4567, 4568 are open between all three members of your cluster.
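If it helps, here is a minimal Python sketch for checking those three ports from one node against another. It is only an illustration, not an official Percona tool: the names `probe`, `probe_node`, and `PXC_PORTS` are made up for this example, and the 2-second timeout is an arbitrary choice. The point is to tell apart "refused" (the host answered, but nothing is listening on that port) from "timeout" (packets are likely being dropped, for example by a firewall).

```python
import socket

# Ports named in this thread, with their usual Percona XtraDB Cluster roles.
PXC_PORTS = {4444: "SST", 4567: "group communication", 4568: "IST"}


def probe(host, port, timeout=2.0):
    """Try one TCP connection and return 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host is reachable, but no process is listening on this port.
        return "refused"
    except socket.timeout:
        # No answer at all: often a firewall silently dropping packets.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or no route to host.
        return f"error: {exc}"


def probe_node(host, ports=PXC_PORTS, timeout=2.0):
    """Probe every port in `ports` on `host` and return {port: state}."""
    return {port: probe(host, port, timeout) for port in ports}
```

Calling `probe_node("<bad-node>")` from each healthy node (and the reverse) gives a quick picture of which direction is blocked. Note that 4444 and 4568 are typically only listened on while a state transfer is in progress, so "refused" on those two is not necessarily a fault.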

It should not be an issue. I tried socat, but all three nodes produced the same result: basically nothing.

A good node shows this with ‘netstat -tunlp’

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      525/systemd-resolve
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      839/sshd
tcp        0      0 0.0.0.0:4567            0.0.0.0:*               LISTEN      28115/mysqld
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      12674/sendmail: MTA
tcp        0      0 127.0.0.1:587           0.0.0.0:*               LISTEN      12674/sendmail: MTA
tcp        0      0 0.0.0.0:42000           0.0.0.0:*               LISTEN      17346/docker-proxy
tcp        0      0 0.0.0.0:42001           0.0.0.0:*               LISTEN      17324/docker-proxy
tcp        0      0 127.0.0.1:41843         0.0.0.0:*               LISTEN      10565/containerd
tcp6       0      0 :::22                   :::*                    LISTEN      839/sshd
tcp6       0      0 :::33060                :::*                    LISTEN      28115/mysqld
tcp6       0      0 :::3306                 :::*                    LISTEN      28115/mysqld
tcp6       0      0 :::80                   :::*                    LISTEN      13158/apache2
tcp6       0      0 :::42000                :::*                    LISTEN      17352/docker-proxy
tcp6       0      0 :::42001                :::*                    LISTEN      17331/docker-proxy
udp        0      0 127.0.0.53:53           0.0.0.0:*                           525/systemd-resolve

The offending node shows the same output, minus the 4567, 3306, and 33060 entries, because mysqld is not currently running there.

Is there a better way to test traffic between nodes?

Hi @Mikem, were you able to resolve this issue?

Running this from a healthy node:

nc -z -w2 <bad-node> 4567

Should confirm whether network access is working correctly or not
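One caveat: since mysqld is stopped on the offending node, nothing is listening on 4567 there, so the check above would be expected to fail even over a working network. A possible workaround is to run a throwaway listener on the offending node while testing. The sketch below is only an illustration under that assumption. `listen_once` is a hypothetical helper name, and it must not be run while mysqld is up, since the port would already be in use.

```python
import socket


def listen_once(port, bind_addr="0.0.0.0", wait=60.0):
    """Accept a single TCP connection on `port`.

    Returns the peer's IP address, or None if nobody connects within `wait` seconds.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        # Allow quick re-runs without waiting for the old socket to expire.
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((bind_addr, port))
        srv.listen(1)
        srv.settimeout(wait)
        try:
            conn, peer = srv.accept()
        except socket.timeout:
            return None
        conn.close()
        return peer[0]
```

Run `print(listen_once(4567))` on the offending node, then run the nc command from a healthy node within the wait window. If the listener prints the healthy node's address, traffic on that port gets through in that direction. If it prints `None`, something between the nodes is likely blocking it.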
