I have a 3-node database cluster.
One node went down unexpectedly.
We couldn’t work out why.
It may be related to the table_open_files value (currently 2000).
The database size is 7.2 TB.
When I try to add this node back, the connected software services are delayed because the surviving node 1 is looking at node 3.
The services are currently distributed over the 2 remaining nodes. However, the backup also ran on the crashed node 3, so I need to bring it back soon.
cluster-node-3 donor is cluster-node-2
cluster-node-2 donor is cluster-node-1
cluster-node-1 donor is cluster-node-3
Node 1 and node 2 are available, but MySQL on node 3 is not running. Can you help?
We tried bootstrap=true, but we still have 2 nodes running, so it doesn’t work.
Using MySQL 5.7.
Thanks for your time.
Hi @aycelen,
We tried bootstrap=true, but we still have 2 nodes running, so it doesn’t work.
You haven’t mentioned the error that took node3 down. Also, why are you not able to join node3 to the cluster? What error are you getting? You don’t need to bootstrap to join the third node to the PXC cluster unless you want to sync all other nodes from the source-of-truth node. We need more info here to assist.
Regards,
Vinodh Guruji
Hi Vinodh,
Thanks for your interest.
I can’t give the error because we don’t know it yet. I am very new to MySQL and know almost nothing about it. When I check the logs I can’t find anything, and we didn’t have an error log file.
Here are the node 3 logs:
[Note] WSREP: New cluster view: global state: 6a***********--:81, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3
[Note] WSREP: Setting wsrep_ready to false
[Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
[Note] WSREP: New cluster view: global state: 6a--:81, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version 3
[Note] WSREP: Setting wsrep_ready to false
[Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
[Note] WSREP: New cluster view: global state: 6a--:81, view# -1: non-Primary, number of nodes: 2, my index: 0, protocol version 3
[Note] WSREP: Setting wsrep_ready to false
[Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
[Note] WSREP: State transfer required:
Group state: 6a--:58
Local state: 6a--:81
[Note] WSREP: REPL Protocols: 9 (4, 2)
[Note] WSREP: REPL Protocols: 9 (4, 2)
[Note] WSREP: New cluster view: global state: 6a--:58, view# 100: Primary, number of nodes: 3, my index: 1, protocol version 3
[Note] WSREP: Setting wsrep_ready to true
[Warning] WSREP: Gap in state sequence. Need state transfer.
[Note] WSREP: Setting wsrep_ready to false
[Note] WSREP: You have configured ‘xtrabackup-v2’ state snapshot transfer method which cannot be performed on a running server.
Wsrep provider won’t be able to fall back to it if other means
of state transfer are unavailable. In that case you will need to restart the server.
[Note] WSREP: Auto Increment Offset/Increment re-align with cluster membership change (Offset: 2 → 2) (Increment: 3 → 3)
[Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
[Note] WSREP: Assign initial position for certification: 58, protocol version: 4
[Note] WSREP: Service thread queue flushed.
[Note] WSREP: Check if state gap can be serviced using IST
[Note] WSREP: IST receiver addr using tcp://IP2:4568
[Note] WSREP: Prepared IST receiver, listening at: tcp://IP2:4568
[Note] WSREP: State gap can be likely serviced using IST. SST request though present would be void.
[Note] WSREP: Member 1.0 (cluster-node-3) requested state transfer from ‘cluster-node-2’. Selected 2.0 (cluster-node-1)(SYNCED) as donor.
[Note] WSREP: Shifting PRIMARY → JOINER (TO: 56350)
[Note] WSREP: Requesting state transfer: success, donor: 2
[Note] WSREP: GCache history reset: 6a--:81 → 6a--:********58
31253 [Note] Aborted connection 2931253 to db: ‘unconnected’ user: ‘obss’ host: ‘ip’ (Got an error reading communication packets)
[Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): discarded 26842852480 bytes
[Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): found 1/493 locked buffers
[Note] WSREP: Receiving IST: 1577 writesets, seqnos 5638081881-********58
[Warning] WSREP: 2.0 (cluster-node-1): State transfer to 1.0 (cluster-node-3) failed: -110 (Connection timed out)
[ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():811: Will never receive state. Need to abort.
[Note] WSREP: gcomm: terminating thread
[Note] WSREP: gcomm: joining thread
[Note] WSREP: gcomm: closing backend
[Note] WSREP: Current view of cluster as seen by this node
view (view_id(NON_PRIM,4666,107)
memb {
96728195,0
}
joined {
}
left {
}
partitioned {
4666,0
ad*e74,0
}
)
[Note] WSREP: Current view of cluster as seen by this node
view ((empty))
[Note] Aborted connection 2931241 to db: ‘unconnected’ user: ‘pmm’ host: ‘127.0.0.1’ (Got an error writing communication packets)
2025-01-29T08:42:11.731132Z 0 [Note] WSREP: gcomm: closed
2025-01-29T08:42:11.731248Z 0 [Note] WSREP: /usr/sbin/mysqld: Terminated.
Can you please provide all logs from node3, so we can help determine why it crashed?
No! No! When you bootstrap you will create a BRAND NEW CLUSTER! You do not want this! You already have a running cluster. The ONLY time you bootstrap is when all nodes are down and you need to start the cluster.
Remove this donor config. You should allow any node to receive from any node.
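If it helps, here is a rough Python sketch for spotting a pinned donor in a node's config file. It assumes the donor was pinned with the `wsrep_sst_donor` option, which is the usual way but is not confirmed anywhere in this thread. The sample config text is made up for illustration; on a real node you would feed in the contents of your actual my.cnf.

```python
def find_donor_settings(config_text):
    """Return (line_number, value) for every active wsrep_sst_donor line."""
    hits = []
    for lineno, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comments (my.cnf accepts both '#' and ';').
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        # MySQL treats '-' and '_' in option names as interchangeable.
        if sep and key.strip().lower().replace("-", "_") == "wsrep_sst_donor":
            hits.append((lineno, value.strip()))
    return hits


# Hypothetical config fragment, for illustration only.
sample = """\
[mysqld]
wsrep_sst_method = xtrabackup-v2
wsrep_sst_donor = cluster-node-2
# wsrep_sst_donor = cluster-node-1
"""

for lineno, value in find_donor_settings(sample):
    print(f"line {lineno}: wsrep_sst_donor = {value!r} (consider removing)")
```

Any line it reports is a candidate for removal, after which the cluster is free to pick whichever synced node is available as donor.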
Check your network. Node3 and node1 are having issues talking to each other. Make sure ports 3306, 4444, 4567, and 4568 are open.
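A quick way to test reachability between nodes is a small TCP connect check. This is only a sketch using the Python standard library; the function names are mine, and it tests TCP only. Keep in mind that 4444 (SST) and 4568 (IST) are normally only listened on by a joiner while a transfer is being prepared, so "not open" on those two can simply mean no transfer is in progress. A timeout, as opposed to a quick refusal, is the more telling sign of a firewall or routing problem.

```python
import socket

# Ports named above: MySQL client (3306), SST (4444),
# Galera group communication (4567), IST (4568).
PXC_PORTS = (3306, 4444, 4567, 4568)


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host, ports=PXC_PORTS, timeout=3.0):
    """Map each port to True (reachable) or False (refused, filtered, timed out)."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

Run it from each node against the other two, for example `check_ports("cluster-node-1")` from node 3 (substitute the real hostname or IP), and compare the results in both directions.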