Percona Galera Arbitrator cannot join cluster

Hello, I have set up a cluster with 2 data nodes (node 1 in datacenter 1 and node 2 in datacenter 2). I am trying to add an arbitrator in datacenter 2, but it is not working. Here is the config file /etc/default/garb:

# Copyright (C) 2012 Codership Oy
# This config file is to be sourced by garb service script.

# A comma-separated list of node addresses (address[:port]) in the cluster
GALERA_NODES="172.17.123.107:4567,192.168.100.145:4567"

# Galera cluster name, should be the same as on the rest of the nodes.
GALERA_GROUP="pxc-cluster"

# Optional Galera internal options string (e.g. SSL settings)
# see http://galeracluster.com/documentation-webpages/galeraparameters.html
GALERA_OPTIONS="socket.ssl_cipher=AES128-SHA256;socket.ssl_key=/etc/ssl/mysql/server-key.pem;socket.ssl_cert=/etc/ssl/mysql/server-cert.pem;socket.ssl_ca=/etc/ssl/mysql/ca.pem"

# Log file for garbd. Optional, by default logs to syslog
# Deprecated for CentOS7, use journalctl to query the log for garbd
LOG_FILE="/var/log/garbd.log"

Log file generated:

2022-07-13 12:16:01.607  INFO: CRC-32C: using 64-bit x86 acceleration.
2022-07-13 12:16:01.607  INFO: Read config:
	daemon:      0
	name:        garb
	address:     gcomm://172.17.123.107:4567,192.168.100.145:4567
	group:       pxc-cluster
	sst:         trivial
	donor:
	options:     socket.ssl_cipher=AES128-SHA256;socket.ssl_key=/etc/ssl/mysql/server-key.pem;socket.ssl_cert=/etc/ssl/mysql/server-cert.pem;socket.ssl_ca=/etc/ssl/mysql/ca.pem; gcs.fc_limit=9999999; gcs.fc_factor=1.0; gcs.fc_master_slave=yes
	cfg:
	log:         /var/log/garbd.log
	recv_script:

2022-07-13 12:16:01.608  WARN: Option 'gcs.fc_master_slave' is deprecated and will be removed in the future versions, please use 'gcs.fc_single_primary' instead.
2022-07-13 12:16:01.609  INFO: protonet asio version 0
2022-07-13 12:16:01.609  INFO: Using CRC-32C for message checksums.
2022-07-13 12:16:01.609  INFO: backend: asio
2022-07-13 12:16:01.609  INFO: gcomm thread scheduling priority set to other:0
2022-07-13 12:16:01.609  WARN: Fail to access the file (./gvwstate.dat) error (No such file or directory). It is possible if node is booting for first time or re-booting after a graceful shutdown
2022-07-13 12:16:01.609  INFO: Restoring primary-component from disk failed. Either node is booting for first time or re-booting after a graceful shutdown
2022-07-13 12:16:01.609  INFO: GMCast version 0
2022-07-13 12:16:01.609  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2022-07-13 12:16:01.609  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2022-07-13 12:16:01.610  INFO: EVS version 1
2022-07-13 12:16:01.610  INFO: gcomm: connecting to group 'pxc-cluster', peer '172.17.123.107:4567,192.168.100.145:4567'
2022-07-13 12:16:04.610  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://172.17.123.107:4567 timed out, no messages seen in PT3S, socket stats: rtt: 283 rttvar: 141 rto: 200000 lost: 0 last_data_recv: 3000 cwnd: 10 last_queued_since: 3000113753 last_delivered_since: 3000113753 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-07-13 12:16:04.610  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://192.168.100.145:4567 timed out, no messages seen in PT3S, socket stats: rtt: 196984 rttvar: 98492 rto: 588000 lost: 0 last_data_recv: 2804 cwnd: 10 last_queued_since: 2803444419 last_delivered_since: 2803444419 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-07-13 12:16:04.611  INFO: announce period timed out (pc.announce_timeout)
2022-07-13 12:16:04.611  INFO: EVS version upgrade 0 -> 1
2022-07-13 12:16:04.611  INFO: PC protocol upgrade 0 -> 1
2022-07-13 12:16:04.611  WARN: no nodes coming from prim view, prim not possible
2022-07-13 12:16:04.611  INFO: Current view of cluster as seen by this node
view (view_id(NON_PRIM,813458ad-80a7,1)
memb {
	813458ad-80a7,0
	}
joined {
	}
left {
	}
partitioned {
	}
)
2022-07-13 12:16:05.111  WARN: last inactive check more than PT1.5S (3*evs.inactive_check_period) ago (PT3.50116S), skipping check
2022-07-13 12:16:09.611  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://172.17.123.107:4567 timed out, no messages seen in PT3S, socket stats: rtt: 296 rttvar: 148 rto: 200000 lost: 0 last_data_recv: 3500 cwnd: 10 last_queued_since: 3499767536 last_delivered_since: 3499767536 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-07-13 12:16:13.111  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://192.168.100.145:4567 timed out, no messages seen in PT3S, socket stats: rtt: 194271 rttvar: 97135 rto: 588000 lost: 0 last_data_recv: 3304 cwnd: 10 last_queued_since: 3305593214 last_delivered_since: 3305593214 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-07-13 12:16:16.611  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://172.17.123.107:4567 timed out, no messages seen in PT3S, socket stats: rtt: 326 rttvar: 163 rto: 200000 lost: 0 last_data_recv: 3500 cwnd: 10 last_queued_since: 3499618112 last_delivered_since: 3499618112 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-07-13 12:16:20.112  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://192.168.100.145:4567 timed out, no messages seen in PT3S, socket stats: rtt: 195190 rttvar: 97595 rto: 588000 lost: 0 last_data_recv: 3304 cwnd: 10 last_queued_since: 3304705385 last_delivered_since: 3304705385 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-07-13 12:16:23.612  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://172.17.123.107:4567 timed out, no messages seen in PT3S, socket stats: rtt: 266 rttvar: 133 rto: 200000 lost: 0 last_data_recv: 3500 cwnd: 10 last_queued_since: 3499640067 last_delivered_since: 3499640067 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-07-13 12:16:27.112  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://192.168.100.145:4567 timed out, no messages seen in PT3S, socket stats: rtt: 194915 rttvar: 97457 rto: 588000 lost: 0 last_data_recv: 3308 cwnd: 10 last_queued_since: 3304965121 last_delivered_since: 3304965121 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-07-13 12:16:30.613  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://172.17.123.107:4567 timed out, no messages seen in PT3S, socket stats: rtt: 271 rttvar: 135 rto: 200000 lost: 0 last_data_recv: 3500 cwnd: 10 last_queued_since: 3499627038 last_delivered_since: 3499627038 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-07-13 12:16:34.113  INFO: (813458ad-80a7, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://192.168.100.145:4567 timed out, no messages seen in PT3S, socket stats: rtt: 194231 rttvar: 97115 rto: 588000 lost: 0 last_data_recv: 3308 cwnd: 10 last_queued_since: 3305667131 last_delivered_since: 3305667131 send_queue_length: 0 send_queue_bytes: 0 (gmcast.peer_timeout)
2022-07-13 12:16:34.616  INFO: PC protocol downgrade 1 -> 0
2022-07-13 12:16:34.617  INFO: Current view of cluster as seen by this node
view ((empty))
2022-07-13 12:16:34.617 ERROR: failed to open gcomm backend connection: 110: failed to reach primary view (pc.wait_prim_timeout): 110 (Connection timed out)
	 at gcomm/src/pc.cpp:connect():161
2022-07-13 12:16:34.617 ERROR: gcs/src/gcs_core.cpp:gcs_core_open():219: Failed to open backend connection: -110 (Connection timed out)
2022-07-13 12:16:35.617  INFO: gcomm: terminating thread
2022-07-13 12:16:35.617  INFO: gcomm: joining thread
2022-07-13 12:16:35.617 ERROR: gcs/src/gcs.cpp:gcs_open():1758: Failed to open channel 'pxc-cluster' at 'gcomm://172.17.123.107:4567,192.168.100.145:4567': -110 (Connection timed out)
2022-07-13 12:16:35.617  INFO: Shifting CLOSED -> DESTROYED (TO: 0)
2022-07-13 12:16:35.617 FATAL: Garbd exiting with error: Failed to open connection to group: 110 (Connection timed out)
	 at garb/garb_gcs.cpp:Gcs():35

Command to start garbd:

service garbd start

Many thanks.

You need to add socket.ssl=YES to GALERA_OPTIONS.
Please search the forums before posting. Others have posted on this same exact issue.
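As a quick sanity check before restarting garbd, the options string can be tested for the explicit SSL switch. This is only a sketch: `has_socket_ssl` is a hypothetical helper, not part of garbd or its service script, and it matches only the literal `socket.ssl=YES` spelling used in this thread.

```shell
# Hypothetical helper (assumption, not shipped with garbd): succeed if a
# semicolon-separated Galera options string contains socket.ssl=YES.
# Spaces are stripped first, so "a=1; socket.ssl=YES" also matches.
has_socket_ssl() {
  case ";$(printf '%s' "$1" | tr -d ' ');" in
    *";socket.ssl=YES;"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, after sourcing /etc/default/garb, `has_socket_ssl "$GALERA_OPTIONS" || echo "socket.ssl=YES is missing"` would flag the original config posted above, which sets the key, cert, and CA but never enables SSL itself.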

This should be clearer once we merge:

-option="socket.ssl=YES;  ..."

Hello Matthewb,

When I add that setting to /etc/default/garb, garbd fails with this error:

2022-07-13 16:12:58.351  WARN: Option 'gcs.fc_master_slave' is deprecated and will be removed in the future versions, please use 'gcs.fc_single_primary' instead.
2022-07-13 16:12:58.352  INFO: protonet asio version 0
2022-07-13 16:12:58.352 ERROR: failed to create gcomm backend connection: 22: Bad value '/etc/ssl/mysql/server-cert.pem' for SSL parameter 'socket.ssl_cert': 33558541: 'error:0200100D:system library:fopen:Permission denied': 22 (Invalid argument)
	 at galerautils/src/gu_asio.cpp:ssl_prepare_context():444
2022-07-13 16:12:58.352 ERROR: gcs/src/gcs_core.cpp:gcs_core_open():226: Failed to initialize backend using 'gcomm://172.17.123.107:4567,192.168.100.145:4567': -22 (Invalid argument)
2022-07-13 16:12:58.352 ERROR: gcs/src/gcs.cpp:gcs_open():1758: Failed to open channel 'pxc-cluster' at 'gcomm://172.17.123.107:4567,192.168.100.145:4567': -22 (Invalid argument)
2022-07-13 16:12:58.352  INFO: Shifting CLOSED -> DESTROYED (TO: 0)
2022-07-13 16:12:58.352 FATAL: Garbd exiting with error: Failed to open connection to group: 22 (Invalid argument)
	 at garb/garb_gcs.cpp:Gcs():35

Content of /etc/default/garb:

# Copyright (C) 2012 Codership Oy
# This config file is to be sourced by garb service script.

# A comma-separated list of node addresses (address[:port]) in the cluster
GALERA_NODES="172.17.123.107:4567,192.168.100.145:4567"

# Galera cluster name, should be the same as on the rest of the nodes.
GALERA_GROUP="pxc-cluster"

# Optional Galera internal options string (e.g. SSL settings)
# see http://galeracluster.com/documentation-webpages/galeraparameters.html
GALERA_OPTIONS="socket.ssl=YES;socket.ssl_cipher=AES128-SHA256;socket.ssl=YES;socket.ssl_key=/etc/ssl/mysql/server-key.pem;socket.ssl_cert=/etc/ssl/mysql/server-cert.pem;socket.ssl_ca=/etc/ssl/mysql/ca.pem"

# Log file for garbd. Optional, by default logs to syslog
# Deprecated for CentOS7, use journalctl to query the log for garbd
LOG_FILE="/var/log/garbd.log"

Yep. You need to read that error and fix the issue: garbd gets "Permission denied" when opening /etc/ssl/mysql/server-cert.pem.
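The log reports `fopen:Permission denied` for the file given in socket.ssl_cert, so the account that garbd runs under cannot read it. One way to narrow this down is sketched below. `check_readable` is a hypothetical helper, and which account the garbd service uses is an assumption you would need to confirm from its service script or unit file.

```shell
# Hypothetical helper (not part of garbd): report which of the given files
# the *current* user cannot read. A file also counts as unreadable when a
# parent directory cannot be traversed.
check_readable() {
  rc=0
  for f in "$@"; do
    if [ -r "$f" ]; then
      echo "ok: $f"
    else
      echo "DENIED: $f"
      rc=1
    fi
  done
  return "$rc"
}
```

To use it, save the function in a script followed by a call such as `check_readable /etc/ssl/mysql/server-key.pem /etc/ssl/mysql/server-cert.pem /etc/ssl/mysql/ca.pem`, then run that script as the garbd service account (for instance with `sudo -u`). Any line starting with `DENIED` points at a file or directory whose ownership or mode needs adjusting.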