PXC 5.6.24 - BF applier failed to open_and_lock_tables

Xengulai
I am running a 3-node PXC cluster and I keep getting random crashes on all 3 nodes.

Setup:
- Ubuntu 14.04.2 LTS
- 60 GB RAM
- SSD RAID 10 (130GB)
- PXC 5.6.24-72.2-56-log - Percona XtraDB Cluster (GPL), Release rel72.2, Revision 43abf03, WSREP version 25.11, wsrep_25.11
>>
[mysqld]

# GENERAL #
bind-address = 0.0.0.0
character-set-server = utf8
collation-server = utf8_general_ci
default_storage_engine = InnoDB
event-scheduler = ON
pid-file = /var/run/mysqld/mysqld.pid
port = 3306
server-id = 1
socket = /var/run/mysqld/mysqld.sock
user = mysql

# MyISAM #
key-buffer-size = 32M
myisam-recover-options = FORCE,BACKUP

# SAFETY #
innodb = FORCE
innodb-strict-mode = 1
max-allowed-packet = 64M
max-connect-errors = 1000000
skip-external-locking
skip-host-cache
skip-name-resolve
sql-mode = STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ENGINE_SUBSTITUTION
sysdate-is-now = 1

# DATA STORAGE #
datadir = /var/lib/mysql

# BINARY LOGGING #
expire-logs-days = 14
log-bin = /var/lib/mysql/mysql-bin
log-slave-updates
sync-binlog = 1

# CACHES AND LIMITS #
back-log = 1000
connect-timeout = 20
interactive-timeout = 30
join-buffer-size = 8M
max-binlog-size = 100M
max-connections = 2000
max-heap-table-size = 32M
open-files-limit = 65535
preload-buffer-size = 65536
query-cache-size = 0
query-cache-type = 0
sort-buffer-size = 2M
read-buffer-size = 4M
read-rnd-buffer-size = 4M
table-definition-cache = 4096
table-open-cache = 5000
thread-cache-size = 100
thread-stack = 256K
tmp-table-size = 32M
wait-timeout = 30

# INNODB #
innodb-buffer-pool-instances = 8
innodb-buffer-pool-size = 40G
innodb-file-per-table = 1
innodb-flush-log-at-trx-commit = 1
innodb-flush-method = O_DIRECT
innodb-lock-wait-timeout = 15
innodb-log-files-in-group = 2
innodb-log-file-size = 512M

# LOGGING #
log-error = /var/log/mysql/mysql-error.log
log-queries-not-using-indexes = 0
slow-query-log = 0

# WSREP #
wsrep_provider = /usr/lib/galera3/libgalera_smm.so
wsrep_cluster_address = gcomm://<redacted>,<redacted>,<redacted>
binlog_format = ROW
innodb_autoinc_lock_mode = 2
wsrep_node_address = <redacted>
wsrep_node_name = "db01"
wsrep_sst_method = xtrabackup-v2
wsrep_cluster_name = <redacted>
wsrep_sst_auth = "<redacted>"
wsrep_slave_threads = 8
wsrep_notify_cmd = /etc/mysql/wsrep_notify
<<
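
For anyone who wants to sanity-check a config like the one above, here is a minimal Python sketch that reads the [mysqld] section and compares a few options against the values Galera-based clusters are generally documented to need. The function names and the EXPECTED table are hypothetical, made up for this illustration, and the list is deliberately short rather than a complete checklist.

```python
import configparser

# Values Galera-based clusters are generally documented to need.
# This table is an assumption of the sketch, not something from the post.
EXPECTED = {
    "binlog_format": "ROW",
    "default_storage_engine": "InnoDB",
    "innodb_autoinc_lock_mode": "2",
}


def load_mysqld_options(text):
    """Parse my.cnf text; return [mysqld] options with '-' normalized to '_'."""
    parser = configparser.ConfigParser(
        allow_no_value=True,  # bare flags such as skip-name-resolve
        interpolation=None,   # values may contain '%'
        strict=False,         # tolerate repeated options
    )
    parser.read_string(text)
    options = {}
    for key, value in parser.items("mysqld"):
        # Bare flags have value None; quoted values lose their double quotes.
        options[key.replace("-", "_")] = value.strip('"') if value else value
    return options


def check_expected(options, expected=EXPECTED):
    """Return {option: (expected, actual)} for each mismatch or missing option."""
    problems = {}
    for key, want in expected.items():
        got = options.get(key)
        if got is None or got.lower() != want.lower():
            problems[key] = (want, got)
    return problems
```

Calling check_expected(load_mysqld_options(open("/etc/mysql/my.cnf").read())) returns an empty dict when all three options match; the path is only an example. The sketch does not follow !include directives and assumes option lines are not indented.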

Until the release of PXC 5.6.24, all 3 nodes were crashing every 1-3 days. Since then, they crash about once a week. At first I suspected a specific cron job, because the crash timestamps fell on similar minutes of the hour, but I could not reproduce the error by running the job manually. The crash timestamps have since diverged, so I can no longer see any pattern. When the nodes crash, I get the same error across the board:

2015-07-09 00:44:36 19638 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 46677454)
2015-07-09 00:44:36 19638 [Warning] WSREP: RBR event 3 Write_rows apply warning: 1615, 46677454
2015-07-09 00:44:36 19638 [Warning] WSREP: Failed to apply app buffer: seqno: 46677454, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 2th time
2015-07-09 00:44:36 19638 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 46677454)
2015-07-09 00:44:36 19638 [Warning] WSREP: RBR event 3 Write_rows apply warning: 1615, 46677454
2015-07-09 00:44:36 19638 [Warning] WSREP: Failed to apply app buffer: seqno: 46677454, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 3th time
2015-07-09 00:44:36 19638 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 46677454)
2015-07-09 00:44:36 19638 [Warning] WSREP: RBR event 3 Write_rows apply warning: 1615, 46677454
2015-07-09 00:44:36 19638 [Warning] WSREP: Failed to apply app buffer: seqno: 46677454, status: 1
at galera/src/trx_handle.cpp:apply():351
Retrying 4th time
2015-07-09 00:44:36 19638 [Warning] WSREP: BF applier failed to open_and_lock_tables: 1615, fatal: 0 wsrep = (exec_mode: 1 conflict_state: 5 seqno: 46677454)
2015-07-09 00:44:36 19638 [Warning] WSREP: RBR event 3 Write_rows apply warning: 1615, 46677454
2015-07-09 00:44:36 19638 [Warning] WSREP: failed to replay trx: source: f3b46697-1ff6-11e5-af61-0b245f7246eb version: 3 local: 1 state: REPLAYING flags: 129 conn_id: 6286358 trx_id: 366113962 seqnos (l: 6786080, g: 46677454, s: 46677452, d: 46677453, ts: 2721954221674858)
2015-07-09 00:44:36 19638 [Warning] WSREP: Failed to apply trx 46677454 4 times
2015-07-09 00:44:36 19638 [ERROR] WSREP: trx_replay failed for: 6, query: void
2015-07-09 00:44:36 19638 [ERROR] Aborting
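
To look for a pattern across crashes, the excerpt above can be summarized with a short Python sketch. It counts the (error code, seqno) pairs from the BF applier warnings and tallies the minute-of-hour of each "[ERROR] Aborting" line. The regular expressions are written against the exact lines quoted here and are an assumption; other versions may format these messages differently. The name summarize is hypothetical.

```python
import re
from collections import Counter

# Patterns written against the log lines quoted in this post (an assumption).
APPLY_FAIL = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \d+ \[Warning\] WSREP: "
    r"BF applier failed to open_and_lock_tables: (?P<errno>\d+), "
    r".*seqno: (?P<seqno>\d+)\)"
)
ABORT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \d+ \[ERROR\] Aborting"
)


def summarize(lines):
    """Return (apply_failures, abort_minutes) for an iterable of log lines.

    apply_failures counts (errno, seqno) pairs from BF applier warnings.
    abort_minutes counts the minute-of-hour of each '[ERROR] Aborting' line.
    """
    apply_failures = Counter()
    abort_minutes = Counter()
    for line in lines:
        m = APPLY_FAIL.match(line)
        if m:
            apply_failures[(int(m["errno"]), int(m["seqno"]))] += 1
            continue
        m = ABORT.match(line)
        if m:
            # Timestamp is 'YYYY-MM-DD HH:MM:SS'; characters 14-15 are MM.
            abort_minutes[m["ts"][14:16]] += 1
    return apply_failures, abort_minutes
```

Running summarize(open("/var/log/mysql/mysql-error.log")) on each node and comparing the abort_minutes counters would show whether the crashes really cluster around the same minutes, which is what first pointed at the cron job.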

After a node fails, it ALWAYS has to do an SST (instead of an IST). I have searched around and found several people with the same issue, but no resolutions. Is this a configuration problem, or a bug that needs to be reported?

Any help would be appreciated. Thanks in advance!