Can't start Percona Cluster on second node

Hi there:

I have 4 servers running Ubuntu Server 12.04 and I have installed Percona XtraDB Cluster server from the binary repositories using apt-get. Everything appears to be installed correctly, because the MySQL server runs standalone on each node.

The first node (the one with the wsrep_cluster_address=gcomm:// line) starts without any problem, and I can run mysql -e "SHOW STATUS LIKE '%wsrep%';" -u root -p and get the expected output saying that there is a cluster initialized with wsrep_cluster_size=1.

But when I start the second (or any subsequent) node, service mysql start finishes with a [fail] message. I checked the .err log file and saw that it stops at the following lines:

120505 10:43:03 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 89)
120505 10:43:03 [Note] WSREP: State transfer required:
    Group state: dbf7d14a-959c-11e1-0800-7f1665e7e657:89
    Local state: 00000000-0000-0000-0000-000000000000:-1
120505 10:43:03 [Note] WSREP: New cluster view: global state: dbf7d14a-959c-11e1-0800-7f1665e7e657:89, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 1
120505 10:43:03 [Warning] WSREP: Gap in state sequence. Need state transfer.
120505 10:43:05 [Note] WSREP: Running: 'wsrep_sst_rsync 'joiner' '192.168.10.22' 'root:somepassword' '/var/lib/mysql/' '/etc/mysql/my.cnf' '5579' 2>sst.err'

The sst.err file is empty, the mysql process never starts, and I can't stop it either. I have to kill -9 all the processes containing the word mysql.

The my.cnf file for the first node is:

[client]
port=3306
socket=/var/run/mysqld/mysqld.sock

[mysqld_safe]
socket=/var/run/mysqld/mysqld.sock
nice=0

[mysqld]
port=3306
socket=/var/run/mysqld/mysqld.sock
datadir=/var/lib/mysql
tmpdir=/tmp
#language=/usr/share/mysql/spanish
user=mysql
binlog_format=ROW
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_provider_options="gmcast.listen_addr=tcp://bdpl001.arrod.net; sst.recv_addr=bdpl001.arrod.net"
wsrep_cluster_address=gcomm://
wsrep_slave_threads=8
wsrep_cluster_name=ArrodCluster01
wsrep_sst_method=rsync
wsrep_node_name=BDPL001
wsrep_sst_auth=root:somepassword
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
default_storage_engine=InnoDB

And the my.cnf file for the other nodes is (changing 002 depending on the node):

[client]
port=3306
socket=/var/run/mysqld/mysqld.sock

[mysqld_safe]
socket=/var/run/mysqld/mysqld.sock
nice=0

[mysqld]
port=3306
socket=/var/run/mysqld/mysqld.sock
datadir=/var/lib/mysql
tmpdir=/tmp
#language=/usr/share/mysql/spanish
user=mysql
binlog_format=ROW
wsrep_provider=/usr/lib/libgalera_smm.so
wsrep_provider_options="gmcast.listen_addr=tcp://bdpl002.arrod.net; sst.recv_addr=bdpl002.arrod.net"
wsrep_cluster_address=gcomm://bdpl001.arrod.net
wsrep_slave_threads=8
wsrep_cluster_name=ArrodCluster01
wsrep_sst_method=rsync
wsrep_node_name=BDPL002
wsrep_sst_auth=root:somepassword
innodb_locks_unsafe_for_binlog=1
innodb_autoinc_lock_mode=2
default_storage_engine=InnoDB
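Editor's note, not part of the original setup: with wsrep_cluster_address=gcomm:// left permanently in the first node's my.cnf, every restart of node 1 bootstraps a brand-new cluster instead of rejoining the running one. On PXC 5.5 of this era, one alternative (hedged sketch; verify it against your PXC/Galera version's documentation) was the wsrep_urls option in the [mysqld_safe] section, using the host names from the configs above:

```ini
# Sketch only - same fragment on every node.
[mysqld_safe]
# Try to join an existing cluster via any listed node; the trailing
# empty gcomm:// means "bootstrap a new cluster if nobody answers".
wsrep_urls=gcomm://bdpl001.arrod.net:4567,gcomm://bdpl002.arrod.net:4567,gcomm://
```

This avoids having two different my.cnf layouts and removes the accidental-rebootstrap risk when node 1 restarts.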

I followed the instructions from the Percona XtraDB Cluster Operation Manual and this other tutorial with no luck.

Can you help me with this setup? I need it urgently. If you need further information, I'm glad to provide it.

Thanks in advance.

I have the same problem, which is a problem with /etc/mysql/debian-start.
When I tried to start MySQL with the command
service mysql start
I got a "start failed" message. This is because the mysql user does not have privileges to read the file debian.cnf.
Try starting MySQL with
sudo /etc/init.d/mysql start

This worked for me and I believe it will work for you.
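To illustrate the permission issue described above, here is a hedged sketch of how one might check and confirm it (paths are the Debian/Ubuntu defaults; adjust for your system):

```shell
# The Debian maintenance credentials live in /etc/mysql/debian.cnf and are
# normally readable only by root. Check the ownership and mode:
ls -l /etc/mysql/debian.cnf

# Running the init script as root sidesteps the read-permission problem:
sudo /etc/init.d/mysql start

# Afterwards, confirm the node actually joined the cluster:
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
```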

Hi, thanks for your reply.

I tried that too, but with the same result. Then I restarted all the servers one by one, starting with the first node, and after the restart every additional node was able to connect correctly, so now I have a Percona XtraDB Cluster running on 4 servers.

Now I need to load balance and fail over the servers. On the same servers I also have a small GlusterFS cluster; it will be used to store some files that we need to share with other servers, but it will only be accessed about once or twice per day from around 10 machines. I set up keepalived to load balance and fail over the NFS port, and it works as expected with no issues. I tried to do the same with the MySQL port: I set the bind-address in each MySQL server's config file to bind to the local address instead of the default 0.0.0.0 and restarted the keepalived process, but the setup does not work. When I telnet to port 3306 on the virtual IP I get the error "telnet: Unable to connect to remote host: No route to host". I can connect directly to the IP of each MySQL server, but not to the shared IP.

One thing: when I first configured the NFS failover I used two servers from the cluster, and NFS worked correctly; I was able to telnet to the NFS port. But when I changed the keepalived config to add the MySQL port, telnetting to the NFS port on the virtual IP also returned the same error. I think I messed it all up. Initially I followed this tutorial to enable the NFS failover and it worked well.

This is my keepalived.conf file:

######### /etc/keepalived/keepalived.conf ####################
global_defs {
    notification_email {
        abimael.avila@wnethn.com
    }
    notification_email_from noreply@wnethn.com
    smtp_server mail.wnethn.com
    smtp_connect_timeout 30
    lvs_id LVS1
}

# Ini - Configuracion para NFS
virtual_server 192.168.10.20 2049 {
    delay_loop 30
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP
    real_server 192.168.10.21 2049 {
        weight 100
        TCP_CHECK {
            connect_port 2049
            connect_timeout 3
        }
    }
    real_server 192.168.10.22 2049 {
        weight 100
        TCP_CHECK {
            connect_port 2049
            connect_timeout 3
        }
    }
    real_server 192.168.10.23 2049 {
        weight 100
        TCP_CHECK {
            connect_port 2049
            connect_timeout 3
        }
    }
    real_server 192.168.10.24 2049 {
        weight 100
        TCP_CHECK {
            connect_port 2049
            connect_timeout 3
        }
    }
}
# Fin - Configuracion para NFS #

# Ini - Configuracion para MySQL #
virtual_server 192.168.10.20 3306 {
    delay_loop 30
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP
    real_server 192.168.10.21 3306 {
        weight 100
        TCP_CHECK {
            connect_port 3306
            connect_timeout 3
        }
    }
    real_server 192.168.10.22 3306 {
        weight 100
        TCP_CHECK {
            connect_port 3306
            connect_timeout 3
        }
    }
    real_server 192.168.10.23 3306 {
        weight 100
        TCP_CHECK {
            connect_port 3306
            connect_timeout 3
        }
    }
    real_server 192.168.10.24 3306 {
        weight 100
        TCP_CHECK {
            connect_port 3306
            connect_timeout 3
        }
    }
}
# Fin - Configuracion para MySQL #

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    lvs_sync_daemon_interface eth0
    virtual_router_id 51
    priority 150
    advert_int 1
    smtp_alert
    authentication {
        auth_type PASS
        auth_pass somepassword
    }
    virtual_ipaddress {
        192.168.10.20
    }
}
####### fin /etc/keepalived/keepalived.conf ############

I use the same file on the backup server, but with state BACKUP.
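Editor's note on the "No route to host" symptom, not part of the original thread: with lb_kind DR, the director forwards packets still addressed to the VIP, so each real server must both accept traffic for the VIP and have the service listening on it. Setting bind-address to the node's local address therefore breaks DR. A common real-server setup (a hedged sketch, assuming eth0 and the VIP 192.168.10.20 from the config above) is:

```shell
# On EACH real server (not only the keepalived director):

# 1. In my.cnf, keep bind-address = 0.0.0.0 (or omit it entirely) so
#    mysqld also answers connections addressed to the VIP.

# 2. Add the VIP to the loopback interface so the kernel accepts
#    DR-forwarded packets destined for it:
ip addr add 192.168.10.20/32 dev lo

# 3. Stop the real server from answering ARP for the VIP, so that on
#    the LAN only the director's MAC is associated with 192.168.10.20:
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
sysctl -w net.ipv4.conf.eth0.arp_ignore=1
sysctl -w net.ipv4.conf.eth0.arp_announce=2
```

Without steps 2 and 3, DR-forwarded packets are dropped by the real servers, which matches the "No route to host" errors reported above.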

Do you know a good tutorial for failover and load balancing with only keepalived (both NFS and MySQL, or other ports in the future)? Will using two of the four clustered servers as load balancers be a problem, or must I use two separate servers for this? If I use two other servers for the load balancing and failover of NFS and MySQL, will that affect performance with lb_algo rr and lb_kind DR? What do you recommend for a setup like this?

Thanks in advance for your help.