Cluster doesn't accept wsrep_cluster_address in my.cnf

hi experts,

I'm new to Percona XtraDB Cluster.
I just installed it:

sudo apt-get install percona-xtradb-cluster-client-5.5
percona-xtradb-cluster-server-5.5 percona-xtrabackup

and am now trying to get it running, but without success.

I created /etc/mysql/my.cnf on all 3 nodes; they look like this one from the first node:

[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0

[mysqld]
datadir=/var/lib/mysql/

# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so

# Cluster connection URL contains the IPs of node#1, node#2 and node#3
#wsrep_cluster_address=gcomm://xxx.xx.xxx.xx1,xxx.xx.xxx.xx2,xxx.xx.xxx.xx3

# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW

# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB

# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog=1

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2

# Node #1 address
wsrep_node_address=xxx.xx.xxx.xx1

# SST method
wsrep_sst_method=xtrabackup

# Cluster name
wsrep_cluster_name=my_debian_cluster

# Authentication for SST method
wsrep_sst_auth="username:password"

This way MySQL starts, but when I uncomment wsrep_cluster_address:
wsrep_cluster_address=gcomm://xxx.xx.xxx.xx1,xxx.xx.xxx.xx2,xxx.xx.xxx.xx3
it won't start.

The first node was started this way:
/etc/init.d/mysql start --wsrep-cluster-address="gcomm://"

The other two this way:
/etc/init.d/mysql start

So none of the nodes accepts the wsrep_cluster_address option in my.cnf.

OS: debian 6

mysql> show status like 'wsrep%'; from all 3 nodes looks like this:
+---------------------------+----------------------------------+
| Variable_name | Value |
+---------------------------+----------------------------------+
| wsrep_local_state_uuid | |
| wsrep_protocol_version | 18446744073709551615 |
| wsrep_last_committed | 18446744073709551615 |
| wsrep_replicated | 0 |
| wsrep_replicated_bytes | 0 |
| wsrep_received | 0 |
| wsrep_received_bytes | 0 |
| wsrep_local_commits | 0 |
| wsrep_local_cert_failures | 0 |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_replays | 0 |
| wsrep_local_send_queue | 0 |
| wsrep_local_send_queue_avg | 0.000000 |
| wsrep_local_recv_queue | 0 |
| wsrep_local_recv_queue_avg | 0.000000 |
| wsrep_flow_control_paused | 0.000000 |
| wsrep_flow_control_sent | 0 |
| wsrep_flow_control_recv | 0 |
| wsrep_cert_deps_distance | 0.000000 |
| wsrep_apply_oooe | 0.000000 |
| wsrep_apply_oool | 0.000000 |
| wsrep_apply_window | 0.000000 |
| wsrep_commit_oooe | 0.000000 |
| wsrep_commit_oool | 0.000000 |
| wsrep_commit_window | 0.000000 |
| wsrep_local_state | 0 |
| wsrep_local_state_comment | Initialized |
| wsrep_cert_index_size | 0 |
| wsrep_causal_reads | 0 |
| wsrep_incoming_addresses | |
| wsrep_cluster_conf_id | 18446744073709551615 |
| wsrep_cluster_size | 0 |
| wsrep_cluster_state_uuid | |
| wsrep_cluster_status | Disconnected |
| wsrep_connected | OFF |
| wsrep_local_index | 18446744073709551615 |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy <info@codership.com> |
| wsrep_provider_version | 2.5(r150) |
| wsrep_ready | OFF |
+---------------------------+----------------------------------+

What am I doing wrong?
Thank you!

Now I've managed to start the first node with:
wsrep_cluster_address=gcomm://
in my.cnf.

However, the documentation recommends:
After this single-node cluster is started, variable wsrep_cluster_address should be updated to the list of all nodes in the cluster. For example:
wsrep_cluster_address=gcomm://192.168.70.2,192.168.70.3,192.168.70.4

So I stop mysql, change it in my.cnf to wsrep_cluster_address=gcomm://xxx.xx.xxx.xx1,xxx.xx.xxx.xx2,xxx.xx.xxx.xx3
and it fails again!
Where is the mistake?

Are all of your nodes (including the first node) in the Initialized state? Please paste your logs.

here you go

attachments from 3 nodes

the attachment function in this forum doesn't seem to work

Hmm, I'll let someone know. In the meantime, you can put them on pastebin or sprunge.us (or similar) and just paste the links.

https://www.dropbox.com/sh/odcod4fnwivcx6q/EvD7g8Y9Ti

3 uploaded files here

1 file here

OK, that doesn't work; please use the Dropbox link.
By the way, this forum's threads can't be viewed in Chrome, at least in the Chrome version for Linux.

no worries, but bump

Sorry for the delayed reply here Zuri.

Rereading your comments, I think you're confused about how to bootstrap the cluster. The first node must be bootstrapped by providing 'wsrep_cluster_address=gcomm://'.
This first node should have a status like this:

mysql> show status like 'wsrep%';
+---------------------------+----------------------------------+
| Variable_name | Value |
+---------------------------+----------------------------------+
| wsrep_local_state_comment | Synced |
| wsrep_cluster_conf_id | 1 |
| wsrep_cluster_size | 1 |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_ready | ON |
+---------------------------+----------------------------------+

After this node is started in this state, you can start the other nodes with the full wsrep_cluster_address. They should SST and join the cluster (you should see the cluster size increase, and all nodes in the Synced and Primary states).

After you get the other nodes started, you do want to make sure the first node’s my.cnf has your full cluster address (a restart is not required after you get the other nodes up).
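To make that sequence concrete, here is a sketch of the three steps. The 192.168.70.x addresses are placeholders (taken from the documentation example quoted earlier in this thread), not real IPs; steps 1 and 2 are left as comments because they have to be run as root on the actual nodes, so only the final line, which prints the my.cnf entry for step 3, executes as-is.

```shell
# Placeholder addresses for illustration; substitute your own node IPs.
NODE1=192.168.70.2
NODE2=192.168.70.3
NODE3=192.168.70.4

# 1. Bootstrap the first node with an empty cluster address:
#      /etc/init.d/mysql start --wsrep-cluster-address="gcomm://"
# 2. Start the remaining nodes normally; their my.cnf already lists
#    the full address, so they SST from the first node:
#      /etc/init.d/mysql start
# 3. Once every node reports Synced/Primary, put the full address
#    back into the first node's my.cnf (no restart needed):
echo "wsrep_cluster_address=gcomm://$NODE1,$NODE2,$NODE3"
```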

hi percona.jayj,

I've given it one more try.

The first node's my.cnf:
wsrep_cluster_address=gcomm://
started like this:
/etc/init.d/mysql start --wsrep-cluster-address="gcomm://"

So far, so good:

mysql> show status like 'wsrep%';
+---------------------------+----------------------------------+
| Variable_name | Value |
+---------------------------+----------------------------------+
| wsrep_local_state_comment | Synced |
| wsrep_cluster_conf_id | 1 |
| wsrep_cluster_size | 1 |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_ready | ON |
+---------------------------+----------------------------------+

looks ok.

The second node's my.cnf:
wsrep_cluster_address=gcomm://first_node_ip_here,second_node_ip_here

The second node was started like this:
/etc/init.d/mysql start

The start was not successful. The error log shows:

130708 15:41:28 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130708 15:41:28 mysqld_safe WSREP: Running position recovery with --log_error=/tmp/tmp.QIFh4U6aiJ
130708 15:41:33 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
130708 15:41:33 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'
130708 15:41:33 [Note] WSREP: Read nil XID from storage engines, skipping position init
130708 15:41:33 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'
130708 15:41:33 [Note] WSREP: wsrep_load(): Galera 2.5(r150) by Codership Oy <info@codership.com> loaded succesfully.
130708 15:41:33 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
130708 15:41:33 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
130708 15:41:33 [Note] WSREP: Passing config to GCS: base_host = second_node_ip_here; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
130708 15:41:33 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
130708 15:41:33 [Note] WSREP: wsrep_sst_grab()
130708 15:41:33 [Note] WSREP: Start replication
130708 15:41:33 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
130708 15:41:33 [Note] WSREP: protonet asio version 0
130708 15:41:33 [Note] WSREP: backend: asio
130708 15:41:33 [Note] WSREP: GMCast version 0
130708 15:41:33 [Note] WSREP: (1a786454-e7d4-11e2-0800-4bed3fdb41d0, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
130708 15:41:33 [Note] WSREP: (1a786454-e7d4-11e2-0800-4bed3fdb41d0, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
130708 15:41:33 [Note] WSREP: EVS version 0
130708 15:41:33 [Note] WSREP: PC version 0
130708 15:41:33 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer 'first_node_ip_here:,second_node_ip_here:'
130708 15:41:33 [Warning] WSREP: (1a786454-e7d4-11e2-0800-4bed3fdb41d0, 'tcp://0.0.0.0:4567') address 'tcp://second_node_ip_here:4567' points to own listening address, blacklisting
130708 15:41:36 [Warning] WSREP: no nodes coming from prim view, prim not possible
130708 15:41:36 [Note] WSREP: view(view_id(NON_PRIM,1a786454-e7d4-11e2-0800-4bed3fdb41d0,1) memb {
1a786454-e7d4-11e2-0800-4bed3fdb41d0,
} joined {
} left {
} partitioned {
})
130708 15:41:37 [Warning] WSREP: last inactive check more than PT1.5S ago, skipping check
130708 15:42:06 [Note] WSREP: view((empty))
130708 15:42:06 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
at gcomm/src/pc.cpp:connect():139
130708 15:42:06 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():195: Failed to open backend connection: -110 (Connection timed out)
130708 15:42:06 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1290: Failed to open channel 'my_wsrep_cluster' at 'gcomm://first_node_ip_here,second_node_ip_here': -110 (Connection timed out)
130708 15:42:06 [ERROR] WSREP: gcs connect failed: Connection timed out
130708 15:42:06 [ERROR] WSREP: wsrep::connect() failed: 6
130708 15:42:06 [ERROR] Aborting

130708 15:42:06 [Note] WSREP: Service disconnected.
130708 15:42:07 [Note] WSREP: Some threads may fail to exit.
130708 15:42:07 [Note] /usr/sbin/mysqld: Shutdown complete

130708 15:42:07 mysqld_safe mysqld from pid file /var/lib/mysql/first_node_dnsname_here.pid ended
130708 15:42:56 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130708 15:42:56 mysqld_safe WSREP: Running position recovery with --log_error=/tmp/tmp.LgQydon7i3
130708 15:43:01 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
130708 15:43:01 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'
130708 15:43:01 [Note] WSREP: Read nil XID from storage engines, skipping position init
130708 15:43:01 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'
130708 15:43:01 [Note] WSREP: wsrep_load(): Galera 2.5(r150) by Codership Oy <info@codership.com> loaded succesfully.
130708 15:43:01 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
130708 15:43:01 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
130708 15:43:01 [Note] WSREP: Passing config to GCS: base_host = second_node_ip_here; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
130708 15:43:01 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
130708 15:43:01 [Note] Plugin 'FEDERATED' is disabled.
130708 15:43:01 InnoDB: The InnoDB memory heap is disabled
130708 15:43:01 InnoDB: Mutexes and rw_locks use GCC atomic builtins
130708 15:43:01 InnoDB: Compressed tables use zlib 1.2.3
130708 15:43:01 InnoDB: Using Linux native AIO
130708 15:43:01 InnoDB: Initializing buffer pool, size = 128.0M
130708 15:43:01 InnoDB: Completed initialization of buffer pool
130708 15:43:01 InnoDB: highest supported file format is Barracuda.
130708 15:43:02 InnoDB: Waiting for the background threads to start
130708 15:43:03 Percona XtraDB (http://www.percona.com) 5.5.30-rel30.2 started; log sequence number 1598139
130708 15:43:03 [Note] Event Scheduler: Loaded 0 events
130708 15:43:03 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.30-30.2' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Percona Server (GPL), Release 30.2, wsrep_23.7.4.r3843

After deleting wsrep_cluster_address=gcomm://first_node_ip_here,second_node_ip_here
from my.cnf on the second node, mysql on the second node starts successfully.

Any thoughts or ideas?

regards,
zuri

Can your second node connect to your first node on TCP port 4567? It looks to me like it cannot.
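One quick way to check this from the second node is bash's built-in /dev/tcp redirection (nc or telnet would work just as well). The 192.168.70.2 address below is a placeholder for the first node's real IP:

```shell
# Placeholder: replace 192.168.70.2 with the first node's real IP.
host=192.168.70.2

# Try to open a TCP connection to the Galera replication port,
# giving up after 3 seconds.
if timeout 3 bash -c "exec 3<>/dev/tcp/$host/4567" 2>/dev/null; then
    echo "port 4567 reachable"
else
    echo "port 4567 blocked or filtered - check firewall/iptables rules"
fi
```

If this prints the "blocked" line, fix the firewall before touching any wsrep options; ports 4444 (SST) and 4568 (IST) need to be open between the nodes as well.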

I have tried to find documentation for this, with no luck. Can you provide some links?

I can only guess that I should add this to my.cnf on all nodes:
wsrep_provider_options="gmcast.listen_addr=tcp://0.0.0.0:4567"

but actually that option should only be needed to override the default port, which is already 4567.

I had a lot of trouble with this too.

I was running 5.5.28 and it worked, but after upgrading to 5.5.34 it didn't.

After some trouble I tried moving wsrep_cluster_address from the [mysqld_safe] section to [mysqld], and now it works.
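To spell that fix out, here is a minimal sketch of the relevant my.cnf layout (the IPs are placeholders): all wsrep_* options go under [mysqld], and nothing cluster-related belongs under [mysqld_safe], because mysqld_safe is only the wrapper script and mysqld itself never reads options from that group.

```ini
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock

[mysqld]
# wsrep_* options are read by mysqld itself, so they must live here,
# not in [mysqld_safe]
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_cluster_address=gcomm://192.168.70.2,192.168.70.3,192.168.70.4
```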