Hello,
My donor node crashed, so I want to make one of the other machines in the cluster the donor.
The configuration file for the node that should become the donor is below:
# The MySQL server
[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
#skip-external-locking
key_buffer_size = 384M
max_allowed_packet = 1M
table_open_cache = 512
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 32M
# Try number of CPU's*2 for thread_concurrency
thread_concurrency = 8
max_connections=10000
max_connect_errors=10000
#######################################################Configuration Percona
wsrep_provider_options=gmcast.listen_addr=tcp://0.0.0.0:4567
wsrep_cluster_address=gcomm://node1,node2,node3
datadir=/var/lib/mysql
user=mysql
# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB
# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog=1
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node #3 address
wsrep_node_address=@ip_node3
# SST method
wsrep_sst_method=xtrabackup
# Cluster name
wsrep_cluster_name=cluster_name
# Authentication for SST method
wsrep_sst_auth="sstuser:PWD"
wsrep_sst_donor=node3
server-id = 3
When I restart the new donor node with an empty cluster address (/etc/init.d/mysql restart --wsrep-cluster-address=gcomm://), MySQL starts without any problem.
Case 1: Now, when I try to add the other machines to the cluster, I get the following error:
140123 11:26:27 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140123 11:26:27 mysqld_safe WSREP: Running position recovery with --log_error=/tmp/tmp.cOxHIxHCdo
140123 11:26:33 mysqld_safe WSREP: Recovered position c9a557c5-7eb7-11e3-0800-b30c642d54fb:631798
140123 11:26:33 [Note] WSREP: wsrep_start_position var submitted: ‘c9a557c5-7eb7-11e3-0800-b30c642d54fb:631798’
140123 11:26:33 [Note] WSREP: Read nil XID from storage engines, skipping position init
140123 11:26:33 [Note] WSREP: wsrep_load(): loading provider library ‘/usr/lib64/libgalera_smm.so’
140123 11:26:33 [Note] WSREP: wsrep_load(): Galera 2.5(r150) by Codership Oy <info@codership.com> loaded succesfully.
140123 11:26:33 [Note] WSREP: Found saved state: c9a557c5-7eb7-11e3-0800-b30c642d54fb:-1
140123 11:26:33 [Note] WSREP: Reusing existing ‘/var/lib/mysql//galera.cache’.
140123 11:26:33 [Note] WSREP: Passing config to GCS: base_host = 10.xx.xx.xx; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; gmcast.listen_addr = tcp://0.0.0.0:4567; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
140123 11:26:33 [Note] WSREP: Assign initial position for certification: 631798, protocol version: -1
140123 11:26:33 [Note] WSREP: wsrep_sst_grab()
140123 11:26:33 [Note] WSREP: Start replication
140123 11:26:33 [Note] WSREP: Setting initial position to c9a557c5-7eb7-11e3-0800-b30c642d54fb:631798
140123 11:26:33 [Note] WSREP: protonet asio version 0
140123 11:26:33 [Note] WSREP: backend: asio
140123 11:26:33 [Note] WSREP: GMCast version 0
140123 11:26:33 [Note] WSREP: (d4fd9a88-8418-11e3-0800-e6447370e946, ‘tcp://0.0.0.0:4567’) listening at tcp://0.0.0.0:4567
140123 11:26:33 [Note] WSREP: (d4fd9a88-8418-11e3-0800-e6447370e946, ‘tcp://0.0.0.0:4567’) multicast: , ttl: 1
140123 11:26:33 [Note] WSREP: EVS version 0
140123 11:26:33 [Note] WSREP: PC version 0
140123 11:26:33 [Note] WSREP: gcomm: connecting to group ‘cluster_name’, peer ‘node3:4567’
140123 11:26:33 [Warning] WSREP: (d4fd9a88-8418-11e3-0800-e6447370e946, ‘tcp://0.0.0.0:4567’) address ‘tcp://node3:4567’ points to own listening address, blacklisting
140123 11:26:36 [Warning] WSREP: no nodes coming from prim view, prim not possible
140123 11:26:36 [Note] WSREP: view(view_id(NON_PRIM,d4fd9a88-8418-11e3-0800-e6447370e946,1) memb {
d4fd9a88-8418-11e3-0800-e6447370e946,
} joined {
} left {
} partitioned {
})
140123 11:26:37 [Warning] WSREP: last inactive check more than PT1.5S ago, skipping check
140123 11:27:06 [Note] WSREP: view((empty))
140123 11:27:06 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
at gcomm/src/pc.cpp:connect():139
140123 11:27:06 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():195: Failed to open backend connection: -110 (Connection timed out)
140123 11:27:06 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1290: Failed to open channel ‘cluster_name’ at ‘gcomm://node3:4567’: -110 (Connection timed out)
140123 11:27:06 [ERROR] WSREP: gcs connect failed: Connection timed out
140123 11:27:06 [ERROR] WSREP: wsrep::connect() failed: 6
140123 11:27:06 [ERROR] Aborting
140123 11:27:06 [Note] WSREP: Service disconnected.
140123 11:27:07 [Note] WSREP: Some threads may fail to exit.
140123 11:27:07 [Note] /usr/sbin/mysqld: Shutdown complete
140123 11:27:07 mysqld_safe mysqld from pid file /var/lib/mysql/node3.pid ended
Case 2: I start the donor machine and the 2 other machines, with wsrep-cluster-address=gcomm://donor_node set on all machines, but I get the following error:
140123 14:40:49 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140123 14:40:49 mysqld_safe WSREP: Running position recovery with --log_error=/tmp/tmp.IxqyTq9fv5
140123 14:40:55 mysqld_safe WSREP: Recovered position 693ba945-81de-11e3-0800-277c5cbcb35b:0
140123 14:40:55 [Note] WSREP: wsrep_start_position var submitted: ‘693ba945-81de-11e3-0800-277c5cbcb35b:0’
140123 14:40:55 [Note] WSREP: Read nil XID from storage engines, skipping position init
140123 14:40:55 [Note] WSREP: wsrep_load(): loading provider library ‘/usr/lib64/libgalera_smm.so’
140123 14:40:55 [Note] WSREP: wsrep_load(): Galera 2.5(r150) by Codership Oy <info@codership.com> loaded succesfully.
140123 14:40:55 [Note] WSREP: Found saved state: 693ba945-81de-11e3-0800-277c5cbcb35b:-1
140123 14:40:55 [Note] WSREP: Reusing existing ‘/var/lib/mysql//galera.cache’.
140123 14:40:55 [Note] WSREP: Passing config to GCS: base_host = node1; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; gmcast.listen_addr = tcp://0.0.0.0:4567; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
140123 14:40:55 [Note] WSREP: Assign initial position for certification: 0, protocol version: -1
140123 14:40:55 [Note] WSREP: wsrep_sst_grab()
140123 14:40:55 [Note] WSREP: Start replication
140123 14:40:55 [Note] WSREP: Setting initial position to 693ba945-81de-11e3-0800-277c5cbcb35b:0
140123 14:40:55 [Note] WSREP: protonet asio version 0
140123 14:40:55 [Note] WSREP: backend: asio
140123 14:40:55 [Note] WSREP: GMCast version 0
140123 14:40:55 [Note] WSREP: (fbbe9aa8-8433-11e3-0800-cbe33796b957, ‘tcp://0.0.0.0:4567’) listening at tcp://0.0.0.0:4567
140123 14:40:55 [Note] WSREP: (fbbe9aa8-8433-11e3-0800-cbe33796b957, ‘tcp://0.0.0.0:4567’) multicast: , ttl: 1
140123 14:40:55 [Note] WSREP: EVS version 0
140123 14:40:55 [Note] WSREP: PC version 0
140123 14:40:55 [Note] WSREP: gcomm: connecting to group ‘cluster_name’, peer ‘10.128.26.154:4567’
140123 14:40:55 [Note] WSREP: declaring 3c8938b5-841d-11e3-0800-fac3799a7d7e stable
140123 14:40:55 [Note] WSREP: Node 3c8938b5-841d-11e3-0800-fac3799a7d7e state prim
140123 14:40:55 [Note] WSREP: view(view_id(PRIM,3c8938b5-841d-11e3-0800-fac3799a7d7e,2) memb {
3c8938b5-841d-11e3-0800-fac3799a7d7e,
fbbe9aa8-8433-11e3-0800-cbe33796b957,
} joined {
} left {
} partitioned {
})
140123 14:40:56 [Note] WSREP: gcomm: connected
140123 14:40:56 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
140123 14:40:56 [Note] WSREP: Shifting CLOSED → OPEN (TO: 0)
140123 14:40:56 [Note] WSREP: Opened channel ‘cluster_name’
140123 14:40:56 [Note] WSREP: Waiting for SST to complete.
140123 14:40:56 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
140123 14:40:56 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
140123 14:40:56 [Note] WSREP: STATE EXCHANGE: sent state msg: fc0c0787-8433-11e3-0800-6c8d004124e1
140123 14:40:56 [Note] WSREP: STATE EXCHANGE: got state msg: fc0c0787-8433-11e3-0800-6c8d004124e1 from 0 (xts-priv-xtsinf-pp-percona3)
140123 14:40:56 [Note] WSREP: STATE EXCHANGE: got state msg: fc0c0787-8433-11e3-0800-6c8d004124e1 from 1 (node1)
140123 14:40:56 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 1,
members = 1/2 (joined/total),
act_id = 669294,
last_appl. = -1,
protocols = 0/4/2 (gcs/repl/appl),
group UUID = c9a557c5-7eb7-11e3-0800-b30c642d54fb
140123 14:40:56 [Note] WSREP: Flow-control interval: [23, 23]
140123 14:40:56 [Note] WSREP: Shifting OPEN → PRIMARY (TO: 669294)
140123 14:40:56 [Note] WSREP: State transfer required:
Group state: c9a557c5-7eb7-11e3-0800-b30c642d54fb:669294
Local state: 693ba945-81de-11e3-0800-277c5cbcb35b:0
140123 14:40:56 [Note] WSREP: New cluster view: global state: c9a557c5-7eb7-11e3-0800-b30c642d54fb:669294, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 2
140123 14:40:56 [Warning] WSREP: Gap in state sequence. Need state transfer.
140123 14:40:58 [Note] WSREP: Running: ‘wsrep_sst_xtrabackup --role ‘joiner’ --address ‘node1’ --auth ‘sstuser:PWD’ --datadir ‘/var/lib/mysql/’ --defaults-file ‘/etc/my.cnf’ --parent ‘30106’’
nc: Address already in use
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors
140123 14:40:58 [Note] WSREP: Prepared SST request: xtrabackup|node1:4444/xtrabackup_sst
140123 14:40:58 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
140123 14:40:58 [Note] WSREP: Assign initial position for certification: 669294, protocol version: 2
140123 14:40:58 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (693ba945-81de-11e3-0800-277c5cbcb35b) does not match group state UUID (c9a557c5-7eb7-11e3-0800-b30c642d54fb): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():436. IST will be unavailable.
140123 14:40:58 [ERROR] WSREP: Requesting state transfer failed: -113(No route to host)
140123 14:40:58 [ERROR] WSREP: State transfer request failed unrecoverably: 113 (No route to host). Most likely it is due to inability to communicate with the cluster primary component. Restart required.
140123 14:40:58 [Note] WSREP: Closing send monitor…
140123 14:40:58 [Note] WSREP: Closed send monitor.
140123 14:40:58 [Note] WSREP: gcomm: terminating thread
140123 14:40:58 [Note] WSREP: gcomm: joining thread
140123 14:40:58 [Note] WSREP: gcomm: closing backend
WSREP_SST: [ERROR] Error while getting st data from donor node: 1, 2 (20140123 14:40:58.423)
140123 14:40:58 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role ‘joiner’ --address ‘node1’ --auth ‘sstuser:PWD’ --datadir ‘/var/lib/mysql/’ --defaults-file ‘/etc/my.cnf’ --parent ‘30106’: 32 (Broken pipe)
140123 14:40:58 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
140123 14:40:58 [ERROR] WSREP: SST failed: 32 (Broken pipe)
140123 14:40:58 [ERROR] Aborting
140123 14:40:59 [Note] WSREP: view(view_id(NON_PRIM,3c8938b5-841d-11e3-0800-fac3799a7d7e,2) memb {
fbbe9aa8-8433-11e3-0800-cbe33796b957,
} joined {
} left {
} partitioned {
3c8938b5-841d-11e3-0800-fac3799a7d7e,
})
140123 14:40:59 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
140123 14:40:59 [Note] WSREP: view((empty))
140123 14:40:59 [Note] WSREP: gcomm: closed
140123 14:40:59 [Note] WSREP: Flow-control interval: [16, 16]
140123 14:40:59 [Note] WSREP: Received NON-PRIMARY.
140123 14:40:59 [Note] WSREP: Shifting PRIMARY → OPEN (TO: 669298)
140123 14:40:59 [Note] WSREP: Received self-leave message.
140123 14:40:59 [Note] WSREP: Flow-control interval: [0, 0]
140123 14:40:59 [Note] WSREP: Received SELF-LEAVE. Closing connection.
140123 14:40:59 [Note] WSREP: Shifting OPEN → CLOSED (TO: 669298)
140123 14:40:59 [Note] WSREP: RECV thread exiting 0: Success
140123 14:40:59 [Note] WSREP: recv_thread() joined.
140123 14:40:59 [Note] WSREP: Closing slave action queue.
140123 14:40:59 [Note] WSREP: /usr/sbin/mysqld: Terminated.
140123 14:40:59 mysqld_safe mysqld from pid file /var/lib/mysql/node1.pid ended
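Two of the messages in the Case 2 log concern TCP ports: "nc: Address already in use" on the joiner, and "No route to host" when requesting the state transfer. A small check like the one below, run on the joiner, might help narrow this down. It is only a sketch: the function names are hypothetical, port 4444 is taken from the "Prepared SST request" line, and port 4567 and the host name node3 are taken from the configuration in this post.

```python
import socket


def local_port_in_use(port, host="0.0.0.0"):
    """Return True if (host, port) cannot be bound on this machine,
    which typically means another process is already listening there."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            return True
        return False


def remote_port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within
    the timeout; DNS failures, refusals and routing errors give False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, with mysqld stopped on the joiner, local_port_in_use(4444) returning True would suggest that something (perhaps a leftover nc from an earlier SST attempt) still holds the SST port, and remote_port_reachable("node3", 4567) returning False would point to a firewall or routing problem between the joiner and the donor.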
The configurations of the other two machines are as follows:
wsrep_provider_options="gmcast.listen_addr=tcp://0.0.0.0:4567"
wsrep_cluster_address=gcomm://node3:4567
datadir=/var/lib/mysql
user=mysql
# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB
# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog=1
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node #1 address
wsrep_node_address=@ip_node
# SST method
wsrep_sst_method=xtrabackup
# Cluster name
wsrep_cluster_name=cluster_name
# Authentication for SST method
wsrep_sst_auth="sstuser:MDP"
wsrep_sst_donor=node3
# Don't listen on a TCP/IP port at all. This can be a security enhancement,
# if all processes that need to connect to mysqld run on the same host.
# All interaction with mysqld must be made via Unix sockets or named pipes.
# Note that using this option without enabling named pipes on Windows
# (via the "enable-named-pipe" option) will render mysqld useless!
#skip-networking
# Replication Master Server (default)
# binary logging is required for replication
log-bin=mysql-bin