Percona Cluster: 2nd node can't start on "Ubuntu 12.04.1 LTS"

Hi,

I have 3 servers running "Ubuntu 12.04.1 LTS" and I have installed Percona XtraDB Cluster from the binary repositories using the apt-get method. Everything appears to be installed correctly, because the mysql server runs standalone on each node.
The first node (with wsrep_cluster_address=gcomm://) starts without problems. I can run mysql --defaults-file=/etc/mysql/debian.cnf -e "show status like 'wsrep%';" and get the expected output showing that a cluster is initialized with wsrep_cluster_size=1.

But when I start the second (or third) node, the service fails to start. Whether I use "/etc/init.d/mysql start" or "service mysql start", I get a [fail] message.

"my.cnf" file for the first node:

[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

[mysqld]
user = mysql
default_storage_engine = InnoDB
port = 3306
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock

# MyISAM
key_buffer_size = 32M

# SAFETY
max_allowed_packet = 16M
max_connect_errors = 1000

# DATA STORAGE
datadir = /var/opt/hosting/db

# BINARY LOGGING
log_bin = /var/opt/hosting/tmp/mysql-bin-log/log-bin-node01.log
expire_logs_days = 10

# INNODB
innodb_flush_method = ALL_O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 150M
innodb_import_table_from_xtrabackup = 1
innodb_flush_log_at_trx_commit = 1
innodb_file_per_table = 1
innodb_buffer_pool_size = 1G

# LOGGING
log-error = /var/opt/hosting/db/node1.err
long_query_time = 2
slow-query-log-file = /var/opt/hosting/log/mysql/mysql-slow-queries.node1.log

# GALERA
# Path to Galera library
wsrep_provider = /usr/lib/libgalera_smm.so

# Cluster connection URL contains the IPs of node#1, node#2 and node#3
wsrep_cluster_address = gcomm://

# In order for Galera to work correctly binlog format should be ROW
binlog_format = ROW

# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog = 1

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode = 2

# Node address
wsrep_node_address = node1_IP

# SST method
wsrep_sst_method = xtrabackup

# Cluster name
wsrep_cluster_name = my_first_node

# Authentication for SST method
wsrep_sst_auth = "sst_user:password"
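One thing worth double-checking: wsrep_sst_auth assumes a matching MySQL account exists on the donor node. A sketch of the grants the xtrabackup SST method typically needs, using the placeholder credentials from the config above (run on the bootstrapped first node):

```sql
-- Placeholder user/password taken from wsrep_sst_auth above
CREATE USER 'sst_user'@'localhost' IDENTIFIED BY 'password';
GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sst_user'@'localhost';
FLUSH PRIVILEGES;
```

If this user is missing or its privileges differ, the SST fails on the donor side and the joiner aborts, which looks like a plain startup failure from the init script's point of view.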

"my.cnf" file for the 2nd node:

[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock

[mysqld]
user = mysql
default_storage_engine = InnoDB
port = 3306
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock

# MyISAM
key_buffer_size = 32M

# SAFETY
max_allowed_packet = 16M
max_connect_errors = 1000

# DATA STORAGE
datadir = /var/opt/hosting/db

# BINARY LOGGING
log_bin = /var/opt/hosting/tmp/mysql-bin-log/log-bin-node02.log
expire_logs_days = 10

# INNODB
innodb_flush_method = ALL_O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 150M
innodb_import_table_from_xtrabackup = 1
innodb_flush_log_at_trx_commit = 1
innodb_file_per_table = 1
innodb_buffer_pool_size = 1G

# LOGGING
log-error = /var/opt/hosting/db/poolm/node2.err
long_query_time = 2
slow-query-log-file = /var/opt/hosting/log/mysql/mysql-slow-queries.node2.log

# GALERA
# Path to Galera library
wsrep_provider = /usr/lib/libgalera_smm.so

# Cluster connection URL contains the IPs of node#1, node#2 and node#3
wsrep_cluster_address = gcomm://node1_IP

# In order for Galera to work correctly binlog format should be ROW
binlog_format = ROW

# This is a recommended tuning variable for performance
innodb_locks_unsafe_for_binlog = 1

# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode = 2

# Node address
wsrep_node_address = node2_IP

# SST method
wsrep_sst_method = xtrabackup

# Cluster name
wsrep_cluster_name = my_first_node

# Authentication for SST method
wsrep_sst_auth = "sst_user:password"
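A side note on the config above: the comment says the cluster address lists the IPs of all three nodes, but only node1_IP is actually given. A fuller form (same placeholder names) lets a joining node reach whichever member happens to be up:

```ini
# All three members (placeholder names); a joiner tries them in order
wsrep_cluster_address = gcomm://node1_IP,node2_IP,node3_IP
```

With only one address listed, node2 can only ever join through node1.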

I followed the documentation from the Percona XtraDB Cluster : http://www.percona.com/doc/percona-xtradb-cluster/index.html

During startup with the "/etc/init.d/mysql start" command on node2, it looks like the synchronization of this node (node2) with node1 takes more time than the 14 seconds defined in the script. I think the "/etc/init.d/mysql start" command does not wait for this synchronization to finish before exiting with an error.

What do you think about it? Can you help me with this setup? I need it urgently.

Thanks in advance.

PS : PERCONA SERVER VERSION : 5.5.30-30.2-log Percona Server (GPL), Release 30.2, wsrep_23.7.4.r3843

So – what does the log say on the 2nd node? My guess would be this is some SST error, so you can also check the innobackup.backup.log file in the datadir on the first node for clues.

Thank you for your reply.
For the log, please see the attached file "node1log.txt".
I noticed that the "/etc/init.d/mysql" script deployed during installation of "percona-xtradb-cluster-server-5.5" was faulty; see the attached file "etc_initd_mysql_deployed".
I replaced it with the "etc_initd_mysql_replaced" file before my installation, and the 2nd node then started fine and joined the cluster, probably because that script sets the startup timeout to "service_startup_timeout=900" seconds.
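The timeout behaviour described here can be pictured as a small poll loop, which is roughly what such init scripts do: ask the server if it is up, once per second, and give up after service_startup_timeout seconds. This is a hypothetical, self-contained illustration; mysqld_ping below is a stub standing in for the real mysqladmin ping check, and the sleep is skipped so the sketch runs instantly:

```shell
#!/bin/sh
# Hypothetical sketch of an init-script startup wait loop.
service_startup_timeout=900

i=0
mysqld_ping() {        # stub: pretend the server answers on the 3rd poll
    [ "$i" -ge 3 ]     # real script would run: mysqladmin ping
}

while ! mysqld_ping; do
    if [ "$i" -ge "$service_startup_timeout" ]; then
        echo "timed out after ${service_startup_timeout}s"
        exit 1
    fi
    i=$((i + 1))
    # sleep 1          # real script sleeps between polls
done
echo "server answered after ${i}s"
```

With a 14-second timeout the loop gives up long before a large SST can finish, which matches the [fail] you saw; with 900 seconds the joiner gets 15 minutes.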

Expecting the 2nd node to synchronize with the cluster within 15 minutes does not seem very deterministic to me. If the data in my cluster grows to 10 GB, 15 GB, 50 GB or 1 TB, 15 minutes seems far too little. Wouldn't it be better to wait until the synchronization of the second node has finished, instead of waiting a fixed 15 minutes?

etc_initd_mysql_deployed.sh (8.95 KB)

etc_initd_mysql_replaced.sh (17.3 KB)

node2log.txt (17.3 KB)

Based on your logs, I’m not sure I agree with the assessment that your problems are due to init script timeouts – there are some clear crashes with stack traces in the log. I’d try clearing the datadir on your second node to see if that helps at all. I’d also check the innobackup.apply.log on that node to see if there were any errors recovering the Xtrabackup SST before the node starts up.

Before every installation I clean the datadir on node2, but that does not seem to solve my problem. Even before node2 tries to connect to the cluster, I already have an error in the /var/opt/hosting/db/poolm/node2.err log:

130704 16:27:40 Percona XtraDB (http://www.percona.com) 5.5.30-rel30.2 started; log sequence number 1597945
ERROR: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'ALTER TABLE user ADD column Show_view_priv enum('N','Y') CHARA
130704 16:27:40 [ERROR] Aborting

What do you think of the difference between the "etc_initd_mysql_deployed" and "etc_initd_mysql_replaced" files? (I obtained etc_initd_mysql_replaced from http://www.percona.com/downloads/Percona-XtraDB-Cluster/5.5.30-23.7.4/binary/linux/x86_64/.) Why was "etc_initd_mysql_deployed" deployed during installation of "percona-xtradb-cluster-server-5.5" instead of "etc_initd_mysql_replaced"?

As in my previous post: if the data in my cluster grows to 10 GB, 15 GB, 50 GB or 1 TB, wouldn't it be better to wait until the synchronization of the second node has finished instead of waiting a fixed 15 minutes?

I'm not a huge fan of all the "help" the init scripts try to give you on Ubuntu/Debian. In my experience, errors like "ERROR: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'ALTER TABLE user ADD column Show_view_priv enum('N','Y') CHARA" happen on newly created datadirs, and you may be better off leaving a working datadir on node2 before you start it.

I can’t comment on the inner workings of the init script. If you think there’s something wrong there, then I’d direct you to: http://www.percona.com/doc/percona-x…bugreport.html

OK, thank you. For your information, I tried installing the latest version of "percona-xtradb-cluster-server-5.5 (5.5.31-23.7.5-438.precise)" and I have no errors on node2 during its synchronization with node1. However, something is still wrong in the "/etc/init.d/mysql" script, and I am going to report it.

A hardcoded timeout in the init script can indeed be too short; this was addressed in Percona XtraDB Cluster 5.5.31, see this bug: https://bugs.launchpad.net/percona-x…r/+bug/1099428. The other thing you should make sure of is that newly joined nodes have the same credentials in /etc/mysql/debian.cnf as the primary node. This is also needed for the init script to work properly.
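For reference, a sketch of what /etc/mysql/debian.cnf looks like; the password below is a placeholder, and the point is simply that every node must carry the same values as the first node (on a Galera joiner the SST overwrites the local mysql.user table, so locally generated debian-sys-maint credentials stop working):

```ini
# /etc/mysql/debian.cnf -- must match on every node (placeholder password)
[client]
host     = localhost
user     = debian-sys-maint
password = <same-password-as-node1>
socket   = /var/run/mysqld/mysqld.sock

[mysql_upgrade]
host     = localhost
user     = debian-sys-maint
password = <same-password-as-node1>
socket   = /var/run/mysqld/mysqld.sock
```

Copying this file from node1 to each joiner before starting it is usually enough for the init script's post-start checks to succeed.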