Node 2 and 3 not joining cluster

Sajid_Ali_Sundrani · March 29, 2022, 1:00pm

master node 1 working fine with systemctl start mysql@bootstrap

but when i am starting mysql on node 2 and 3 its failed logs are are pasted here

INFO: Skipping wsrep-recover for 647330dd-af48-11ec-ae4c-6ae8229bb299:1 pair
INFO: Assigning 647330dd-af48-11ec-ae4c-6ae8229bb299:1 to wsrep_start_position
2022-03-29T12:56:58.778701Z 0 [Warning] [MY-011068] [Server] The syntax ‘wsrep_slave_threads’ is deprecated and will be removed in a future release. Please use wsrep_applier_threads instead.
2022-03-29T12:56:58.805397Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.26-16.1) starting as process 31419
2022-03-29T12:56:58.825131Z 0 [Note] [MY-000000] [Galera] Loading provider /usr/lib/galera4/libgalera_smm.so initial position: 647330dd-af48-11ec-ae4c-6ae8229bb299:1
2022-03-29T12:56:58.825272Z 0 [Note] [MY-000000] [Galera] wsrep_load(): loading provider library ‘/usr/lib/galera4/libgalera_smm.so’
2022-03-29T12:56:58.840812Z 0 [Note] [MY-000000] [Galera] wsrep_load(): Galera 4.7(7ea7225) by Codership Oy info@codership.com (modified by Percona https://percona.com/) loaded successfully.
2022-03-29T12:56:58.840891Z 0 [Note] [MY-000000] [Galera] CRC-32C: using 64-bit x86 acceleration.
2022-03-29T12:56:58.844002Z 0 [Note] [MY-000000] [Galera] Found saved state: 647330dd-af48-11ec-ae4c-6ae8229bb299:1, safe_to_bootstrap: 1
2022-03-29T12:56:58.844201Z 0 [Note] [MY-000000] [Galera] GCache DEBUG: opened preamble:
Version: 2
UUID: 647330dd-af48-11ec-ae4c-6ae8229bb299
Seqno: 1 - 1
Offset: 1280
Synced: 1
2022-03-29T12:56:58.844252Z 0 [Note] [MY-000000] [Galera] Recovering GCache ring buffer: version: 2, UUID: 647330dd-af48-11ec-ae4c-6ae8229bb299, offset: 1280
2022-03-29T12:56:58.845151Z 0 [Note] [MY-000000] [Galera] GCache::RingBuffer initial scan… 0.0% ( 0/134217752 bytes) complete.
2022-03-29T12:56:58.845871Z 0 [Note] [MY-000000] [Galera] GCache::RingBuffer initial scan…100.0% (134217752/134217752 bytes) complete.
2022-03-29T12:56:58.845965Z 0 [Note] [MY-000000] [Galera] Recovering GCache ring buffer: found gapless sequence 1-1
2022-03-29T12:56:58.846172Z 0 [Note] [MY-000000] [Galera] GCache::RingBuffer unused buffers scan… 0.0% ( 0/176 bytes) complete.
2022-03-29T12:56:58.846224Z 0 [Note] [MY-000000] [Galera] GCache::RingBuffer unused buffers scan…100.0% (176/176 bytes) complete.
2022-03-29T12:56:58.846268Z 0 [Note] [MY-000000] [Galera] GCache DEBUG: RingBuffer::recover(): found 0/1 locked buffers
2022-03-29T12:56:58.846425Z 0 [Note] [MY-000000] [Galera] GCache DEBUG: RingBuffer::recover(): free space: 134217552/134217728
2022-03-29T12:56:58.854763Z 0 [Note] [MY-000000] [Galera] Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.18.136; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 10; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 4; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.freeze_purge_at_seqno = -1; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 100; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = true; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 10; socket.checksum = 2; socket.recv_buf_size = auto; socket.send_buf_size = auto;
2022-03-29T12:56:59.042566Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2022-03-29T12:56:59.044483Z 0 [Note] [MY-000000] [Galera] ####### Assign initial position for certification: 647330dd-af48-11ec-ae4c-6ae8229bb299:1, protocol version: -1
2022-03-29T12:56:59.044987Z 0 [Note] [MY-000000] [WSREP] Starting replication
2022-03-29T12:56:59.045078Z 0 [Note] [MY-000000] [Galera] Connecting with bootstrap option: 0
2022-03-29T12:56:59.046661Z 0 [Note] [MY-000000] [Galera] Setting GCS initial position to 647330dd-af48-11ec-ae4c-6ae8229bb299:1
2022-03-29T12:56:59.047793Z 0 [Note] [MY-000000] [Galera] protonet asio version 0
2022-03-29T12:56:59.049561Z 0 [Note] [MY-000000] [Galera] Using CRC-32C for message checksums.
2022-03-29T12:56:59.050493Z 0 [Note] [MY-000000] [Galera] backend: asio
2022-03-29T12:56:59.051253Z 0 [Note] [MY-000000] [Galera] gcomm thread scheduling priority set to other:0
2022-03-29T12:56:59.053333Z 0 [Warning] [MY-000000] [Galera] Fail to access the file (/var/lib/mysql//gvwstate.dat) error (No such file or directory). It is possible if node is booting for first time or re-booting after a graceful shutdown
2022-03-29T12:56:59.053582Z 0 [Note] [MY-000000] [Galera] Restoring primary-component from disk failed. Either node is booting for first time or re-booting after a graceful shutdown
2022-03-29T12:56:59.054690Z 0 [Note] [MY-000000] [Galera] GMCast version 0
2022-03-29T12:56:59.055754Z 0 [Note] [MY-000000] [Galera] (b84e689b-83ac, ‘tcp://0.0.0.0:4567’) listening at tcp://0.0.0.0:4567
2022-03-29T12:56:59.055859Z 0 [Note] [MY-000000] [Galera] (b84e689b-83ac, ‘tcp://0.0.0.0:4567’) multicast: , ttl: 1
2022-03-29T12:56:59.056094Z 0 [Note] [MY-000000] [Galera] EVS version 1
2022-03-29T12:56:59.056734Z 0 [Note] [MY-000000] [Galera] gcomm: connecting to group ‘pxc-cluster’, peer ‘192.168.18.134:,192.168.18.135:,192.168.18.136:’
2022-03-29T12:56:59.065018Z 0 [Note] [MY-000000] [Galera] (b84e689b-83ac, ‘tcp://0.0.0.0:4567’) Found matching local endpoint for a connection, blacklisting address tcp://192.168.18.136:4567
2022-03-29T12:57:02.066774Z 0 [Note] [MY-000000] [Galera] (b84e689b-83ac, ‘tcp://0.0.0.0:4567’) connection to peer 00000000-0000 with addr tcp://192.168.18.134:4567 timed out, no messages seen in PT3S, socket stats: rtt: 1851 rttvar: 925 rto: 204000 lost: 0 last_data_recv: 3004 cwnd: 10 last_queued_since: 3003098361 last_delivered_since: 3003098361 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-03-29T12:57:02.068360Z 0 [Note] [MY-000000] [Galera] announce period timed out (pc.announce_timeout)
2022-03-29T12:57:02.071541Z 0 [Note] [MY-000000] [Galera] EVS version upgrade 0 → 1
2022-03-29T12:57:02.071862Z 0 [Note] [MY-000000] [Galera] PC protocol upgrade 0 → 1
2022-03-29T12:57:02.072417Z 0 [Warning] [MY-000000] [Galera] no nodes coming from prim view, prim not possible
2022-03-29T12:57:02.072745Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(NON_PRIM,b84e689b-83ac,1)
memb {
b84e689b-83ac,0
}
joined {
}
left {
}
partitioned {
}
)
2022-03-29T12:57:02.574394Z 0 [Warning] [MY-000000] [Galera] last inactive check more than PT1.5S (3*evs.inactive_check_period) ago (PT3.51823S), skipping check
2022-03-29T12:57:06.073508Z 0 [Note] [MY-000000] [Galera] (b84e689b-83ac, ‘tcp://0.0.0.0:4567’) connection to peer 00000000-0000 with addr tcp://192.168.18.134:4567 timed out, no messages seen in PT3S, socket stats: rtt: 785 rttvar: 392 rto: 200000 lost: 0 last_data_recv: 3004 cwnd: 10 last_queued_since: 3004044391 last_delivered_since: 3004044391 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-03-29T12:57:10.085545Z 0 [Note] [MY-000000] [Galera] (b84e689b-83ac, ‘tcp://0.0.0.0:4567’) connection to peer 00000000-0000 with addr tcp://192.168.18.134:4567 timed out, no messages seen in PT3S, socket stats: rtt: 1774 rttvar: 887 rto: 204000 lost: 0 last_data_recv: 3008 cwnd: 10 last_queued_since: 3006169848 last_delivered_since: 3006169848 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-03-29T12:57:14.098041Z 0 [Note] [MY-000000] [Galera] (b84e689b-83ac, ‘tcp://0.0.0.0:4567’) connection to peer 00000000-0000 with addr tcp://192.168.18.134:4567 timed out, no messages seen in PT3S, socket stats: rtt: 415 rttvar: 207 rto: 200000 lost: 0 last_data_recv: 3008 cwnd: 10 last_queued_since: 3008547444 last_delivered_since: 3008547444 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-03-29T12:57:18.110178Z 0 [Note] [MY-000000] [Galera] (b84e689b-83ac, ‘tcp://0.0.0.0:4567’) connection to peer 00000000-0000 with addr tcp://192.168.18.134:4567 timed out, no messages seen in PT3S, socket stats: rtt: 738 rttvar: 369 rto: 200000 lost: 0 last_data_recv: 3008 cwnd: 10 last_queued_since: 3006611371 last_delivered_since: 3006611371 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-03-29T12:57:22.126373Z 0 [Note] [MY-000000] [Galera] (b84e689b-83ac, ‘tcp://0.0.0.0:4567’) connection to peer 00000000-0000 with addr tcp://192.168.18.134:4567 timed out, no messages seen in PT3S, socket stats: rtt: 871 rttvar: 435 rto: 200000 lost: 0 last_data_recv: 3008 cwnd: 10 last_queued_since: 3009188548 last_delivered_since: 3009188548 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-03-29T12:57:26.142624Z 0 [Note] [MY-000000] [Galera] (b84e689b-83ac, ‘tcp://0.0.0.0:4567’) connection to peer 00000000-0000 with addr tcp://192.168.18.134:4567 timed out, no messages seen in PT3S, socket stats: rtt: 820 rttvar: 410 rto: 200000 lost: 0 last_data_recv: 3012 cwnd: 10 last_queued_since: 3011893431 last_delivered_since: 3011893431 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-03-29T12:57:30.158935Z 0 [Note] [MY-000000] [Galera] (b84e689b-83ac, ‘tcp://0.0.0.0:4567’) connection to peer 00000000-0000 with addr tcp://192.168.18.134:4567 timed out, no messages seen in PT3S, socket stats: rtt: 466 rttvar: 233 rto: 200000 lost: 0 last_data_recv: 3012 cwnd: 10 last_queued_since: 3014044521 last_delivered_since: 3014044521 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-03-29T12:57:32.194398Z 0 [Note] [MY-000000] [Galera] PC protocol downgrade 1 → 0
2022-03-29T12:57:32.194901Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view ((empty))
2022-03-29T12:57:32.196686Z 0 [ERROR] [MY-000000] [Galera] failed to open gcomm backend connection: 110: failed to reach primary view (pc.wait_prim_timeout): 110 (Connection timed out)
at gcomm/src/pc.cpp:connect():161
2022-03-29T12:57:32.196736Z 0 [ERROR] [MY-000000] [Galera] gcs/src/gcs_core.cpp:gcs_core_open():219: Failed to open backend connection: -110 (Connection timed out)
2022-03-29T12:57:33.198684Z 0 [Note] [MY-000000] [Galera] gcomm: terminating thread
2022-03-29T12:57:33.198803Z 0 [Note] [MY-000000] [Galera] gcomm: joining thread
2022-03-29T12:57:33.199149Z 0 [ERROR] [MY-000000] [Galera] gcs/src/gcs.cpp:gcs_open():1757: Failed to open channel ‘pxc-cluster’ at ‘gcomm://192.168.18.134,192.168.18.135,192.168.18.136’: -110 (Connection timed out)
2022-03-29T12:57:33.199226Z 0 [ERROR] [MY-000000] [Galera] gcs connect failed: Connection timed out
2022-03-29T12:57:33.199320Z 0 [ERROR] [MY-000000] [WSREP] Provider/Node (gcomm://192.168.18.134,192.168.18.135,192.168.18.136) failed to establish connection with cluster (reason: 7)
2022-03-29T12:57:33.200773Z 0 [ERROR] [MY-010119] [Server] Aborting
2022-03-29T12:57:33.211424Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.26-16.1) Percona XtraDB Cluster (GPL), Release rel16, Revision b141904, WSREP version 26.4.3.
2022-03-29T12:57:33.215551Z 0 [Note] [MY-000000] [Galera] dtor state: CLOSED
2022-03-29T12:57:33.216160Z 0 [Note] [MY-000000] [Galera] MemPool(TrxHandleSlave): hit ratio: 0, misses: 0, in use: 0, in pool: 0
2022-03-29T12:57:33.221436Z 0 [Note] [MY-000000] [Galera] apply mon: entered 0
2022-03-29T12:57:33.226059Z 0 [Note] [MY-000000] [Galera] apply mon: entered 0
2022-03-29T12:57:33.232207Z 0 [Note] [MY-000000] [Galera] apply mon: entered 0
2022-03-29T12:57:33.232265Z 0 [Note] [MY-000000] [Galera] cert index usage at exit 0
2022-03-29T12:57:33.232275Z 0 [Note] [MY-000000] [Galera] cert trx map usage at exit 0
2022-03-29T12:57:33.232295Z 0 [Note] [MY-000000] [Galera] deps set usage at exit 0
2022-03-29T12:57:33.232308Z 0 [Note] [MY-000000] [Galera] avg deps dist 0
2022-03-29T12:57:33.232315Z 0 [Note] [MY-000000] [Galera] avg cert interval 0
2022-03-29T12:57:33.232320Z 0 [Note] [MY-000000] [Galera] cert index size 0
2022-03-29T12:57:33.232435Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2022-03-29T12:57:33.232622Z 0 [Note] [MY-000000] [Galera] wsdb trx map usage 0 conn query map usage 0
2022-03-29T12:57:33.232676Z 0 [Note] [MY-000000] [Galera] MemPool(LocalTrxHandle): hit ratio: 0, misses: 0, in use: 0, in pool: 0
2022-03-29T12:57:33.233007Z 0 [Note] [MY-000000] [Galera] Shifting CLOSED → DESTROYED (TO: 0)
2022-03-29T12:57:33.233850Z 0 [Note] [MY-000000] [Galera] Flushing memory map to disk…

Sajid_Ali_Sundrani · March 29, 2022, 1:03pm

can anyone help me here please,
when i am going to start mysql on node 2 & 3 its failed with above logs, all thre nodes are pingable each other no firewall is running all ports are open, but facing issue again and again.

matthewb · March 29, 2022, 3:33pm

The above is repeated over and over in your logs. You do have some sort of networking issue. From node2, can you telnet node1 4567 and get a response?

Sajid_Ali_Sundrani · March 30, 2022, 5:26am

Dear Matthewb,
Thanks for your kind response, i have tried to telnet my all three local machines reply is pasted below

root@node3:~# telnet 192.168.18.134 4567
Trying 192.168.18.134…
Connected to 192.168.18.134.
Escape character is ‘^]’.

Connection closed by foreign host.

Sajid_Ali_Sundrani · March 30, 2022, 5:27am

there is no issue found on my machines even i have disabled the ufw firewalld on all nodes still i am not able to telnet,

Michael_Coburn · March 30, 2022, 3:13pm

I assume this is PXC 8? By default we ship with pxc_encrypt_cluster_traffic=ON so we expect that JOINER nodes will have copies of the ssl certificates in place. Please ensure you have followed the deployment instructions in our documentation:
https://www.percona.com/doc/percona-xtradb-cluster/LATEST/security/encrypt-traffic.html

Topic		Replies	Views
The node does not join back into the cluster Percona XtraDB Cluster 8.x	3	1119	February 12, 2024
Can't start node 2 and 3 on ubuntu 20.04 Percona XtraDB Cluster 8.x percona	11	1704	August 26, 2022
First node fails to restart after bootstrap Percona XtraDB Cluster 5.x	13	20485	April 5, 2017
Not able to join the node to the cluster Percona XtraDB Cluster 8.x mysql , percona	15	2729	February 25, 2021
Status of the cluster = > VAR=bash /usr/bin/mysql-systemd galera-recovery Percona XtraDB Cluster 8.x mysql , percona	3	707	June 27, 2022

Node 2 and 3 not joining cluster

Related topics