Failure to add new XtraDB cluster nodes

Hi,

I’m currently testing Percona XtraDB Cluster, and I set it up like this:

One of the current MySQL cluster servers is replicating to one of the Percona nodes (galera01). I managed to add a couple of Percona nodes with SST and wsrep_sst_donor = 'galera01'. But when I use any other node as wsrep_sst_donor, or use xtrabackup so that the node joins via IST, it always fails with an error like this:

2014-06-10 18:51:59 28598 [Note] WSREP: New cluster view: global state: d8b28d3b-a321-11e3-94dd-6ac079041663:26810684, view# 136: Primary, number of nodes: 5, my index: 1, protocol version 2
2014-06-10 18:51:59 28598 [Warning] WSREP: Gap in state sequence. Need state transfer.
2014-06-10 18:52:01 28598 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner' --address '<external_ip>' --auth 'root:' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '28598' '' '
WSREP_SST: [INFO] Streaming with tar (20140610 18:52:01.810)
WSREP_SST: [INFO] Using socat as streamer (20140610 18:52:01.814)
WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | tar xfi - --recursive-unlink -h; RC=( ${PIPESTATUS[@]} ) (20140610 18:52:01.978)
2014-06-10 18:52:02 28598 [Note] WSREP: Prepared SST request: xtrabackup|<external_ip>:4444/xtrabackup_sst
2014-06-10 18:52:02 28598 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2014-06-10 18:52:02 28598 [Note] WSREP: REPL Protocols: 5 (3, 1)
2014-06-10 18:52:02 28598 [Note] WSREP: Service thread queue flushed.
2014-06-10 18:52:02 28598 [Note] WSREP: Assign initial position for certification: 26810684, protocol version: 3
2014-06-10 18:52:02 28598 [Note] WSREP: Service thread queue flushed.
2014-06-10 18:52:02 28598 [Note] WSREP: Prepared IST receiver, listening at: tcp://<internal_ip>:4568
2014-06-10 18:52:02 28598 [Note] WSREP: Member 1.0 (galera03) requested state transfer from 'galera04'. Selected 2.0 (galera04)(SYNCED) as donor.
2014-06-10 18:52:02 28598 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 26810696)
2014-06-10 18:52:02 28598 [Note] WSREP: Requesting state transfer: success, donor: 2
WSREP_SST: [INFO] xtrabackup_ist received from donor: Running IST (20140610 18:52:03.010)
WSREP_SST: [INFO] Total time on joiner: 0 seconds (20140610 18:52:03.016)
WSREP_SST: [INFO] Removing the sst_in_progress file (20140610 18:52:03.020)
2014-06-10 18:52:03 28598 [Note] WSREP: SST complete, seqno: 26540326
2014-06-10 18:52:03 28598 [Warning] Using unique option prefix myisam_recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
2014-06-10 18:52:03 28598 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
2014-06-10 18:52:03 28598 [Note] Plugin 'FEDERATED' is disabled.
2014-06-10 18:52:03 28598 [Note] InnoDB: The InnoDB memory heap is disabled
2014-06-10 18:52:03 28598 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2014-06-10 18:52:03 28598 [Note] InnoDB: Compressed tables use zlib 1.2.3.4
2014-06-10 18:52:03 28598 [Note] InnoDB: Using Linux native AIO
2014-06-10 18:52:03 28598 [Note] InnoDB: Using CPU crc32 instructions
2014-06-10 18:52:03 28598 [Note] InnoDB: Initializing buffer pool, size = 45.0G
2014-06-10 18:52:07 28598 [Note] InnoDB: Completed initialization of buffer pool
2014-06-10 18:52:07 28598 [Note] InnoDB: Highest supported file format is Barracuda.
2014-06-10 18:56:37 28598 [Note] InnoDB: 128 rollback segment(s) are active.
2014-06-10 18:56:37 28598 [Note] InnoDB: Waiting for purge to start
2014-06-10 18:56:37 28598 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.15-rel63.0 started; log sequence number 861369093072
/usr/sbin/mysqld: File '/var/lib/mysql/mysql-bin.000024' not found (Errcode: 2 - No such file or directory)
2014-06-10 18:56:37 28598 [ERROR] Failed to open log (file '/var/lib/mysql/mysql-bin.000024', errno 2)
2014-06-10 18:56:37 28598 [ERROR] Could not open log file
2014-06-10 18:56:37 28598 [ERROR] Can't init tc log
2014-06-10 18:56:37 28598 [ERROR] Aborting

Here are my questions:

  • Why does it try to read /var/lib/mysql/mysql-bin.000024, which is not on galera01 either?
  • What is the proper way to add a new Percona node and avoid SST?
  • How do I recover from the error above?

Thanks.

Please share the donor and joiner my.cnf files.
You are probably using wsrep_sst_method set to xtrabackup-v2, and if the binary logs are created inside datadir, that causes problems.
Check this bug report: Bug #1326012, “SST fails when binlogs are in dedicated directory ...” (Percona XtraDB Cluster bugs have moved to https://jira.percona.com/projects/PXC).

The my.cnf files will help identify what is causing the issue.

I don’t use anything mentioned in the bug you pointed to. Here are the my.cnf files.

Donor configuration:

[mysql]

# CLIENT

default-character-set = utf8
port = 3306
socket = /var/run/mysqld/mysqld.sock

[mysqld]

# GENERAL

bind-address = 0.0.0.0
character-set-server = utf8
default-storage-engine = InnoDB
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
ssl-ca = /etc/mysql/cacert.pem
ssl-cert = /etc/mysql/server.pem
ssl-key = /etc/mysql/server.pem
user = mysql

# MyISAM

key-buffer-size = 32M
myisam-recover = FORCE,BACKUP

# SAFETY

innodb = FORCE
innodb-strict-mode = 1
max-allowed-packet = 16M
max-connect-errors = 1000000
sql-mode = STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY
sysdate-is-now = 1

# DATA STORAGE

datadir = /var/lib/mysql/

# BINARY LOGGING

expire-logs-days = 14
log-bin = /var/lib/mysql/mysql-bin
sync-binlog = 1
log_bin_use_v1_row_events = 1

# REPLICATION

binlog_format = ROW
log-slave-updates = 1
relay-log = /var/lib/mysql/mysqld-relay-bin
server_id = 76
skip-slave-start = 1
slave-net-timeout = 60
sync-master-info = 1
sync-relay-log = 1
sync-relay-log-info = 1

slave-skip-errors = 1007,1008,1017,1032,1051,1053,1062,1067,1396

# XTRADB

wsrep_cluster_address = gcomm://1.2.3.4,2.3.4.5
wsrep_cluster_name = 'my_cluster_name'
wsrep_node_address = '1.2.3.4'
wsrep_node_name = 'galera01'
wsrep_provider = /usr/lib/libgalera_smm.so
wsrep_provider_options = "gcache.size=70G"
wsrep_replicate_myisam = ON
wsrep_slave_threads = 16
wsrep_sst_auth = root:password
wsrep_sst_method = xtrabackup

# CACHES AND LIMITS

max-connections = 300
max-heap-table-size = 32M
open-files-limit = 800000
query-cache-size = 0
query-cache-type = 0
table-definition-cache = 4096
table-open-cache = 10240
thread-cache-size = 100
tmp-table-size = 32M

# INNODB

innodb_autoinc_lock_mode = 2
innodb_data_file_path = ibdata1:10M:autoextend
innodb_file_format = Barracuda
innodb-buffer-pool-size = 90G
innodb-file-per-table = 1
innodb-flush-log-at-trx-commit = 1
innodb-flush-method = O_DIRECT
innodb-log-file-size = 1G
innodb-log-files-in-group = 2

# LOGGING

log-error = /var/log/mysql/mysql-error.log
log-queries-not-using-indexes = 1
slow-query-log = 1
slow-query-log-file = /var/log/mysql/mysql-slow.log

Joiner configuration:

[mysql]

# CLIENT

default-character-set = utf8
port = 3306
socket = /var/run/mysqld/mysqld.sock

[mysqld]

# GENERAL

bind-address = 0.0.0.0
character-set-server = utf8
default-storage-engine = InnoDB
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
ssl-ca = /etc/mysql/cacert.pem
ssl-cert = /etc/mysql/server.pem
ssl-key = /etc/mysql/server.pem
user = mysql

# MyISAM

key-buffer-size = 32M
myisam-recover = FORCE,BACKUP

# SAFETY

innodb = FORCE
innodb-strict-mode = 1
max-allowed-packet = 16M
max-connect-errors = 1000000
sql-mode = STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY
sysdate-is-now = 1

# DATA STORAGE

datadir = /var/lib/mysql/

# BINARY LOGGING

expire-logs-days = 14
#Log-bin = /var/lib/mysql/mysql-bin
sync-binlog = 1
log_bin_use_v1_row_events = 1

# XTRADB

wsrep_cluster_address = gcomm://1.2.3.4,2.3.4.5
wsrep_cluster_name = 'pyx_galera_cluster'
wsrep_node_address = '2.3.4.5'
wsrep_node_name = 'galera03'
wsrep_provider = /usr/lib/libgalera_smm.so
wsrep_provider_options = "gcache.size=70G"
wsrep_replicate_myisam = ON
wsrep_slave_threads = 16
wsrep_sst_auth = root:password
wsrep_sst_donor = 'galera01'
wsrep_sst_method = xtrabackup

# CACHES AND LIMITS

max-connections = 300
max-heap-table-size = 32M
open-files-limit = 800000
query-cache-size = 0
query-cache-type = 0
table-definition-cache = 4096
table-open-cache = 10240
thread-cache-size = 100
tmp-table-size = 32M

# INNODB

innodb_autoinc_lock_mode = 2
innodb_data_file_path = ibdata1:10M:autoextend
innodb_file_format = Barracuda
innodb-buffer-pool-size = 45G
innodb-file-per-table = 1
innodb-flush-log-at-trx-commit = 1
innodb-flush-method = O_DIRECT
innodb-log-file-size = 1G
innodb-log-files-in-group = 2

# LOGGING

log-error = /var/log/mysql/mysql-error.log
log-queries-not-using-indexes = 1
slow-query-log = 1
slow-query-log-file = /var/log/mysql/mysql-slow.log

Anything? Shall I open a bug report?

Just for anyone who may be wondering: the fix was to separate logs and data, i.e. use log-bin = /var/log/mysql/mysql-bin. This way Percona’s sync won’t try to copy and use the binlog index files.
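
For anyone applying the same fix, the relevant part of my.cnf would look roughly like this. It is only a minimal sketch of the change described above, assuming that /var/log/mysql already exists and is writable by the mysql user; the datadir value is taken from the configs posted earlier, and nothing else is changed.

```ini
[mysqld]
# Data files stay in the data directory, as in the configs above.
datadir = /var/lib/mysql/

# Binary logs (and their index file) are written outside datadir.
# Assumption: /var/log/mysql exists and is owned by the mysql user.
log-bin = /var/log/mysql/mysql-bin
```

After a restart, SHOW VARIABLES LIKE 'log_bin_basename' should report the new path, which is a quick way to confirm the setting took effect.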