rsync or xtrabackup in wsrep_sst_method?

I just installed XtraDB Cluster on 3 nodes, but I have some problems:

- If I use xtrabackup, nodes 2 and 3 refuse to start.
- If I use rsync, these 2 nodes start, but when I create a database on one of the nodes, it is not replicated to the others.

Can someone please help me find out what is wrong with my configuration?

Thanks a lot.

Here is the configuration I have in my.cnf

node 1

wsrep_provider_options="gmcast.listen_addr=tcp://0.0.0.0:4567"
wsrep_cluster_address=gcomm://

datadir=/var/lib/mysql
user=mysql

# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so

# Cluster connection URL contains the IPs of node#1, node#2 and node#3

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog=1

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2

# Node #1 address
wsrep_node_address=@IP node 1
wsrep_node_name=node1

# SST method
wsrep_sst_method=rsync

# Cluster name
wsrep_cluster_name=my_cluster

# Authentication for SST method
wsrep_sst_auth="sstuser:mdp"
server-id = 1

node 2

wsrep_provider_options="gmcast.listen_addr=tcp://0.0.0.0:4567"
wsrep_cluster_address=gcomm://@IP node1

datadir=/var/lib/mysql
user=mysql

# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog=1

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2

# Node #1 address
wsrep_node_address=@IP node2

# SST method
wsrep_sst_method=rsync

# Cluster name
wsrep_cluster_name=my_cluster

# Authentication for SST method
wsrep_sst_auth="sstuser:mdp"
server-id = 2

node 3

wsrep_provider_options="gmcast.listen_addr=tcp://0.0.0.0:4567"
wsrep_cluster_address=gcomm://@IP node1

datadir=/var/lib/mysql
user=mysql

# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog=1

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2

# Node #1 address
wsrep_node_address=@IP node3

# SST method
wsrep_sst_method=rsync

# Cluster name
wsrep_cluster_name=my_cluster

# Authentication for SST method
wsrep_sst_auth="sstuser:mdp"
server-id = 3
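
Because each node has its own my.cnf, it can help to compare the settings that are usually kept identical on every node. The following is only a minimal Python sketch: the helper names `parse_cnf` and `find_mismatches` and the `MUST_MATCH` list are assumptions made for illustration and are not part of Percona XtraDB Cluster. The sample values are shortened from the configuration above, with the SST method changed on one node on purpose to show a mismatch.

```python
# Settings this sketch assumes should be identical on every node.
MUST_MATCH = ("wsrep_cluster_name", "wsrep_sst_method")


def parse_cnf(text):
    """Parse key=value lines from my.cnf text, skipping comments and section headers."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # MySQL treats '-' and '_' in option names as equivalent.
        settings[key.strip().replace("-", "_")] = value.strip().strip('"')
    return settings


def find_mismatches(configs):
    """Return {key: {node: value}} for every MUST_MATCH key whose values differ."""
    mismatches = {}
    for key in MUST_MATCH:
        values = {node: cfg.get(key) for node, cfg in configs.items()}
        if len(set(values.values())) > 1:
            mismatches[key] = values
    return mismatches


# Shortened sample configs; node2's SST method is altered to demonstrate a mismatch.
node1 = parse_cnf("wsrep_cluster_name=my_cluster\nwsrep_sst_method=rsync\n")
node2 = parse_cnf("wsrep_cluster_name=my_cluster\nwsrep_sst_method=xtrabackup\n")
print(find_mismatches({"node1": node1, "node2": node2}))
```

In practice the text passed to `parse_cnf` would be read from each node's /etc/my.cnf.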

When I try to start nodes 2 and 3, this is the log I get:

130710 16:30:02 mysqld_safe mysqld from pid file /var/lib/mysql/node2.pid ended
130710 16:39:08 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130710 16:39:08 mysqld_safe WSREP: Running position recovery with --log_error=/tmp/tmp.RtwzeB83XD
130710 16:39:13 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
130710 16:39:13 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'
130710 16:39:13 [Note] WSREP: Read nil XID from storage engines, skipping position init
130710 16:39:13 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'
130710 16:39:13 [Note] WSREP: wsrep_load(): Galera 2.5(r150) by Codership Oy <info@codership.com> loaded succesfully.
130710 16:39:13 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
130710 16:39:13 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
130710 16:39:13 [Note] WSREP: Passing config to GCS: base_host = @IPnode2; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
130710 16:39:13 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
130710 16:39:13 [Note] WSREP: wsrep_sst_grab()
130710 16:39:13 [Note] WSREP: Start replication
130710 16:39:13 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
130710 16:39:13 [Note] WSREP: protonet asio version 0
130710 16:39:13 [Note] WSREP: backend: asio
130710 16:39:13 [Note] WSREP: GMCast version 0
130710 16:39:13 [Note] WSREP: (7d9463fd-e96e-11e2-0800-2d863130a295, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
130710 16:39:13 [Note] WSREP: (7d9463fd-e96e-11e2-0800-2d863130a295, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
130710 16:39:13 [Note] WSREP: EVS version 0
130710 16:39:13 [Note] WSREP: PC version 0
130710 16:39:13 [Note] WSREP: gcomm: connecting to group 'my_cluster', peer '@IPnode1:'
130710 16:39:14 [Note] WSREP: declaring 43070462-e96e-11e2-0800-a8d5b7bb6865 stable
130710 16:39:14 [Note] WSREP: Node 43070462-e96e-11e2-0800-a8d5b7bb6865 state prim
130710 16:39:14 [Note] WSREP: view(view_id(PRIM,43070462-e96e-11e2-0800-a8d5b7bb6865,2) memb {
43070462-e96e-11e2-0800-a8d5b7bb6865,
7d9463fd-e96e-11e2-0800-2d863130a295,
} joined {
} left {
} partitioned {
})
130710 16:39:14 [Note] WSREP: gcomm: connected
130710 16:39:14 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
130710 16:39:14 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
130710 16:39:14 [Note] WSREP: Opened channel 'my_cluster'
130710 16:39:14 [Note] WSREP: Waiting for SST to complete.
130710 16:39:14 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
130710 16:39:14 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
130710 16:39:14 [Note] WSREP: STATE EXCHANGE: sent state msg: 7de25592-e96e-11e2-0800-6c7412230421
130710 16:39:14 [Note] WSREP: STATE EXCHANGE: got state msg: 7de25592-e96e-11e2-0800-6c7412230421 from 0 (node1)
130710 16:39:14 [Note] WSREP: STATE EXCHANGE: got state msg: 7de25592-e96e-11e2-0800-6c7412230421 from 1 (node2)
130710 16:39:14 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 1,
members = 1/2 (joined/total),
act_id = 0,
last_appl. = -1,
protocols = 0/4/2 (gcs/repl/appl),
group UUID = bce764fb-e884-11e2-0800-f7e913b7323c
130710 16:39:14 [Note] WSREP: Flow-control interval: [23, 23]
130710 16:39:14 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 0)
130710 16:39:14 [Note] WSREP: State transfer required:
Group state: bce764fb-e884-11e2-0800-f7e913b7323c:0
Local state: 00000000-0000-0000-0000-000000000000:-1
130710 16:39:14 [Note] WSREP: New cluster view: global state: bce764fb-e884-11e2-0800-f7e913b7323c:0, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 2
130710 16:39:14 [Warning] WSREP: Gap in state sequence. Need state transfer.
130710 16:39:16 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner' --address '@IPnode2' --auth 'sstuser:mdp' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '4209''
130710 16:39:16 [Note] WSREP: Prepared SST request: xtrabackup|@IPnode2:4444/xtrabackup_sst
130710 16:39:16 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
130710 16:39:16 [Note] WSREP: Assign initial position for certification: 0, protocol version: 2
130710 16:39:16 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (bce764fb-e884-11e2-0800-f7e913b7323c): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():436. IST will be unavailable.
130710 16:39:16 [Note] WSREP: Node 1 (node2) requested state transfer from 'any'. Selected 0 (node1)(SYNCED) as donor.
130710 16:39:16 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 0)
130710 16:39:16 [Note] WSREP: Requesting state transfer: success, donor: 0
WSREP_SST: [ERROR] xtrabackup process ended without creating '/var/lib/mysql//xtrabackup_galera_info' (20130710 16:39:25.555)
130710 16:39:25 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '@IPnode2' --auth 'sstuser:mdp' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '4209': 32 (Broken pipe)
130710 16:39:25 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
130710 16:39:25 [ERROR] WSREP: SST failed: 32 (Broken pipe)
130710 16:39:25 [ERROR] Aborting

130710 16:39:25 [Warning] WSREP: 0 (node1): State transfer to 1 (node2) failed: -1 (Operation not permitted)
130710 16:39:25 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():719: Will never receive state. Need to abort.
130710 16:39:25 [Note] WSREP: gcomm: terminating thread
130710 16:39:25 [Note] WSREP: gcomm: joining thread
130710 16:39:25 [Note] WSREP: gcomm: closing backend
130710 16:39:26 [Note] WSREP: view(view_id(NON_PRIM,43070462-e96e-11e2-0800-a8d5b7bb6865,2) memb {
7d9463fd-e96e-11e2-0800-2d863130a295,
} joined {
} left {
} partitioned {
43070462-e96e-11e2-0800-a8d5b7bb6865,
})
130710 16:39:26 [Note] WSREP: view((empty))
130710 16:39:26 [Note] WSREP: gcomm: closed
130710 16:39:26 [Note] WSREP: gcomm: closed
130710 16:39:26 [Note] WSREP: /usr/sbin/mysqld: Terminated.
130710 16:39:26 mysqld_safe mysqld from pid file /var/lib/mysql/node2.pid ended

Check innobackupex.backup.log in the datadir on the first node after the second node fails with xtrabackup; it will tell you what error xtrabackup is hitting. Did you execute the grants necessary for XtraBackup?
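
When a log is as long as the one above, it can be easier to pull out just the error lines and the SST script that was actually launched. The following is a minimal Python sketch of that idea, not an official tool: the function names `sst_errors` and `sst_script` are made up for illustration, and the sample lines are abbreviated from the joiner log above.

```python
import re

# Matches the leading "130710 16:39:25 " timestamp of mysqld error-log lines.
TIMESTAMP = re.compile(r"^\d{6}\s+\d{1,2}:\d{2}:\d{2}\s+")

# Matches the SST script name in a "WSREP: Running: 'wsrep_sst_...'" line.
RUNNING = re.compile(r"Running:\W*(wsrep_sst_\w+)")


def sst_errors(log_text):
    """Return every line flagged [ERROR], with the timestamp prefix removed."""
    errors = []
    for line in log_text.splitlines():
        if "[ERROR]" in line:
            errors.append(TIMESTAMP.sub("", line.strip()))
    return errors


def sst_script(log_text):
    """Return the SST script named in the first 'Running:' line, or None."""
    match = RUNNING.search(log_text)
    return match.group(1) if match else None


# Abbreviated sample taken from the joiner log in this thread.
sample = """\
130710 16:39:13 [Note] WSREP: wsrep_sst_grab()
130710 16:39:16 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner''
WSREP_SST: [ERROR] xtrabackup process ended without creating '/var/lib/mysql//xtrabackup_galera_info'
130710 16:39:25 [ERROR] WSREP: SST failed: 32 (Broken pipe)
"""

print(sst_script(sample))
for line in sst_errors(sample):
    print(line)
```

The same functions could be pointed at the donor's innobackupex.backup.log, although that file's format may differ from the mysqld error log assumed here.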

As for rsync, I'm not sure what's happening. Can you confirm that all nodes have joined the cluster using SHOW GLOBAL STATUS LIKE 'wsrep%'?

Hi,

After updating the sstuser privileges in MySQL and adding the option wsrep_sst_donor=node1 to my.cnf, everything went well.

All the nodes joined the cluster. Here is the result of SHOW GLOBAL STATUS LIKE 'wsrep%':

| wsrep_local_state_comment | Synced |
| wsrep_cert_index_size | 134145 |
| wsrep_causal_reads | 0 |
| wsrep_incoming_addresses | node1:3306,node2:3306,node3:3306 |
| wsrep_cluster_conf_id | 17 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_state_uuid | bce354fb-e884-11e2-0800-f7e913b7323c |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_local_index | 2 |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy <info@codership.com> |
| wsrep_provider_version | 2.5(r150) |
| wsrep_ready | ON |

Thanks a lot for your answer.
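
For anyone who wants to check the same thing mechanically, the status rows posted above can be parsed and compared against the values seen on a healthy node. This is a rough Python sketch under stated assumptions: the `EXPECTED` values simply mirror the output in this thread, and the helper names `parse_status` and `cluster_problems` are hypothetical.

```python
# Values a healthy, fully joined node reported in this thread.
EXPECTED = {
    "wsrep_cluster_status": "Primary",
    "wsrep_connected": "ON",
    "wsrep_ready": "ON",
    "wsrep_local_state_comment": "Synced",
}


def parse_status(table_text):
    """Parse '| name | value |' rows from mysql client output into a dict."""
    status = {}
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[0].startswith("wsrep_"):
            status[cells[0]] = cells[1]
    return status


def cluster_problems(status, expected_size):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for name, want in EXPECTED.items():
        got = status.get(name)
        if got != want:
            problems.append(f"{name} is {got!r}, expected {want!r}")
    size = status.get("wsrep_cluster_size")
    if size != str(expected_size):
        problems.append(f"wsrep_cluster_size is {size!r}, expected {str(expected_size)!r}")
    return problems


# A few rows copied from the status output in this thread.
sample = """\
| wsrep_local_state_comment | Synced |
| wsrep_cluster_size | 3 |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_ready | ON |
"""

print(cluster_problems(parse_status(sample), expected_size=3))
```

Running the check on every node, not just one, is what confirms that all three have joined the same cluster.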