I started the first node with the bootstrap option, and it is running well and serving queries. However, when I try to add the 2nd and 3rd nodes, both fail with the same error. This is odd because all three nodes were running perfectly shortly before; I only stopped the machines for some maintenance and rebooted them. This is the MySQL error log on the 2nd machine as it tries to join:
2023-09-07T22:30:55.822334Z 0 [Note] [MY-000000] [Galera] Flow-control interval: [141, 141]
2023-09-07T22:30:55.822343Z 0 [Note] [MY-000000] [Galera] Shifting OPEN -> PRIMARY (TO: 12374121)
2023-09-07T22:30:55.822439Z 1 [Note] [MY-000000] [Galera] ####### processing CC 12374121, local, ordered
2023-09-07T22:30:55.822471Z 1 [Note] [MY-000000] [Galera] Maybe drain monitors from -1 upto current CC event 12374121 upto:-1
2023-09-07T22:30:55.822481Z 1 [Note] [MY-000000] [Galera] Drain monitors from -1 up to -1
2023-09-07T22:30:55.822493Z 1 [Note] [MY-000000] [Galera] Process first view: 329f6151-d406-11ed-8dc7-77e2709c9139 my uuid: 359d6040-4dce-11ee-a6c0-4e911c923161
2023-09-07T22:30:55.822508Z 1 [Note] [MY-000000] [Galera] Server pxc-cluster-node-2 connected to cluster at position 329f6151-d406-11ed-8dc7-77e2709c9139:12374121 with ID 359d6040-4dce-11ee-a6c0-4e911c923161
2023-09-07T22:30:55.822521Z 1 [Note] [MY-000000] [WSREP] Server status change disconnected -> connected
2023-09-07T22:30:55.822566Z 1 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2023-09-07T22:30:55.822597Z 1 [Note] [MY-000000] [Galera] ####### My UUID: 359d6040-4dce-11ee-a6c0-4e911c923161
2023-09-07T22:30:55.822609Z 1 [Note] [MY-000000] [Galera] Cert index reset to 00000000-0000-0000-0000-000000000000:-1 (proto: 10), state transfer needed: yes
2023-09-07T22:30:55.822781Z 0 [Note] [MY-000000] [Galera] Service thread queue flushed.
2023-09-07T22:30:55.822876Z 1 [Note] [MY-000000] [Galera] ####### Assign initial position for certification: 00000000-0000-0000-0000-000000000000:-1, protocol version: -1
2023-09-07T22:30:55.822900Z 1 [Note] [MY-000000] [Galera] State transfer required:
Group state: 329f6151-d406-11ed-8dc7-77e2709c9139:12374121
Local state: 00000000-0000-0000-0000-000000000000:-1
2023-09-07T22:30:55.822909Z 1 [Note] [MY-000000] [WSREP] Server status change connected -> joiner
2023-09-07T22:30:55.822917Z 1 [Note] [MY-000000] [WSREP] wsrep_notify_cmd is not defined, skipping notification.
2023-09-07T22:30:55.823125Z 0 [Note] [MY-000000] [WSREP] Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.128.0.50' --datadir '/var/lib/mysql/' --basedir '/usr/' --plugindir '/usr/lib/mysql/plugin/' --defaults-file '/etc/mysql/my.cnf' --defaults-group-suffix '' --parent '68329' --mysqld-version '8.0.31-23.2' '' )
2023-09-07T22:30:56.729249Z 1 [Note] [MY-000000] [WSREP] Prepared SST request: xtrabackup-v2|10.128.0.50:4444/xtrabackup_sst//1
2023-09-07T22:30:56.729353Z 1 [Note] [MY-000000] [Galera] Check if state gap can be serviced using IST
2023-09-07T22:30:56.729382Z 1 [Note] [MY-000000] [Galera] Local UUID: 00000000-0000-0000-0000-000000000000 != Group UUID: 329f6151-d406-11ed-8dc7-77e2709c9139
2023-09-07T22:30:56.729403Z 1 [Note] [MY-000000] [Galera] ####### IST uuid:00000000-0000-0000-0000-000000000000 f: 0, l: 12374121, STRv: 3
2023-09-07T22:30:56.729500Z 1 [Note] [MY-000000] [Galera] IST receiver addr using tcp://10.128.0.50:4568
2023-09-07T22:30:56.729708Z 1 [Note] [MY-000000] [Galera] Prepared IST receiver for 0-12374121, listening at: tcp://10.128.0.50:4568
2023-09-07T22:30:56.730269Z 0 [Note] [MY-000000] [Galera] Member 0.0 (pxc-cluster-node-2) requested state transfer from '*any*'. Selected 1.0 (pxc-cluster-node-1)(SYNCED) as donor.
2023-09-07T22:30:56.730303Z 0 [Note] [MY-000000] [Galera] Shifting PRIMARY -> JOINER (TO: 12374123)
2023-09-07T22:30:56.730343Z 1 [Note] [MY-000000] [Galera] Requesting state transfer: success, donor: 1
2023-09-07T22:30:56.730363Z 1 [Note] [MY-000000] [Galera] Resetting GCache seqno map due to different histories.
2023-09-07T22:30:56.730375Z 1 [Note] [MY-000000] [Galera] GCache history reset: 329f6151-d406-11ed-8dc7-77e2709c9139:0 -> 329f6151-d406-11ed-8dc7-77e2709c9139:12374121
2023-09-07T22:30:57.432489Z 0 [Warning] [MY-000000] [Galera] 1.0 (pxc-cluster-node-1): State transfer to 0.0 (pxc-cluster-node-2) failed: -13 (Permission denied)
2023-09-07T22:30:57.432538Z 0 [ERROR] [MY-000000] [Galera] gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():1216: Will never receive state. Need to abort.
2023-09-07T22:30:57.432550Z 0 [Note] [MY-000000] [Galera] gcomm: terminating thread
2023-09-07T22:30:57.432571Z 0 [Note] [MY-000000] [Galera] gcomm: joining thread
2023-09-07T22:30:57.432703Z 0 [Note] [MY-000000] [Galera] gcomm: closing backend
2023-09-07T22:30:58.436334Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(NON_PRIM,359d6040-a6c0,8)
memb {
359d6040-a6c0,0
}
joined {
}
left {
}
partitioned {
abb830ee-9751,0
}
)
2023-09-07T22:30:58.436409Z 0 [Note] [MY-000000] [Galera] (359d6040-a6c0, 'tcp://0.0.0.0:4567') turning message relay requesting off
2023-09-07T22:30:58.436429Z 0 [Note] [MY-000000] [Galera] PC protocol downgrade 1 -> 0
2023-09-07T22:30:58.436440Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view ((empty))
2023-09-07T22:30:58.436606Z 0 [Note] [MY-000000] [Galera] gcomm: closed
2023-09-07T22:30:58.436628Z 0 [Note] [MY-000000] [Galera] /usr/sbin/mysqld: Terminated.
2023-09-07T22:30:58.436639Z 0 [Note] [MY-000000] [WSREP] Initiating SST cancellation
2023-09-07T22:30:58.436646Z 0 [Note] [MY-000000] [WSREP] Terminating SST process
This is the MySQL conf for the joining node:
[client]
socket=/var/run/mysqld/mysqld.sock
[mysqld]
server-id=1
datadir=/var/lib/mysql
socket=/var/run/mysqld/mysqld.sock
log_error=/var/log/mysql/error.log
pid_file=/var/run/mysqld/mysqld.pid
max_connections=1000
wsrep_auto_increment_control=0
auto_increment_increment=1
binlog_expire_logs_seconds=604800
wsrep_provider=/usr/lib/galera4/libgalera_smm.so
wsrep_cluster_address=gcomm://10.128.0.51,10.128.0.50,10.128.0.49
binlog_format=ROW
wsrep_slave_threads=8
wsrep_log_conflicts
innodb_autoinc_lock_mode=2
wsrep_node_address=10.128.0.50
wsrep_cluster_name=pxc-cluster
wsrep_node_name=pxc-cluster-node-2
pxc_strict_mode=ENFORCING
wsrep_sst_method=xtrabackup-v2
The failure, of course, is in these lines:
2023-09-07T22:30:57.432489Z 0 [Warning] [MY-000000] [Galera] 1.0 (pxc-cluster-node-1): State transfer to 0.0 (pxc-cluster-node-2) failed: -13 (Permission denied)
2023-09-07T22:30:57.432538Z 0 [ERROR] [MY-000000] [Galera] gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():1216: Will never receive state. Need to abort.
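To pull lines like these out of a long error log, a small filter can help. The following is a minimal sketch, assuming the log keeps the `timestamp thread [Level] [code] [Subsystem] message` layout shown above; the function names `problem_lines` and `scan_file` are made up for illustration. Since the warning is attributed to pxc-cluster-node-1, running the same filter against that node's error log may also be worthwhile, although that is an assumption about where the detail ends up.

```python
import re

# Matches lines such as:
# 2023-09-07T22:30:57.432538Z 0 [ERROR] [MY-000000] [Galera] message...
LOG_LINE = re.compile(
    r"^(?P<ts>\S+Z)\s+(?P<thread>\d+)\s+\[(?P<level>[^\]]+)\]\s+"
    r"\[(?P<code>[^\]]+)\]\s+\[(?P<subsystem>[^\]]+)\]\s+(?P<message>.*)$"
)


def problem_lines(lines, levels=("warning", "error")):
    """Return (timestamp, level, subsystem, message) tuples for lines
    whose level is in `levels` (compared case-insensitively).
    Lines that do not match the layout (e.g. view dumps) are skipped."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level").lower() in levels:
            found.append(
                (
                    match.group("ts"),
                    match.group("level"),
                    match.group("subsystem"),
                    match.group("message"),
                )
            )
    return found


def scan_file(path):
    """Read a log file and return its warning/error entries."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return problem_lines(handle)
```

For example, `scan_file("/var/log/mysql/error.log")` (the `log_error` path from the config above) returns only the warning and error entries, which makes it easier to compare the joiner's and the donor's logs around the same timestamp.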
It says permission denied, but it doesn’t specify which permission is missing or for which operation.
The datadir on both nodes is owned by the mysql user, and as far as I understand, in PXC 8 we don’t need to create an SST user in MySQL because that is done transparently and automatically. How can I figure out which permission is failing?
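The ownership claim can be double-checked recursively rather than only at the top-level directory. The sketch below is one way to do that, assuming it is run with enough privileges to read the whole tree; `uid_of` and `paths_not_owned_by` are hypothetical helper names. It checks ownership only, so mode bits and any other access-control layer on the host would still need to be inspected separately.

```python
import os
import pwd


def uid_of(user):
    """Resolve a user name (e.g. "mysql") to its numeric uid."""
    return pwd.getpwnam(user).pw_uid


def paths_not_owned_by(root, expected_uid):
    """Return every path under root (including root itself) whose
    owner uid differs from expected_uid. Symlinks are not followed."""
    mismatches = []
    if os.lstat(root).st_uid != expected_uid:
        mismatches.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid != expected_uid:
                mismatches.append(path)
    return mismatches
```

On each node, `paths_not_owned_by("/var/lib/mysql", uid_of("mysql"))` should return an empty list if the datadir tree really is owned by the mysql user throughout; any path it does return is a candidate to look at.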