Hi krunalbauskar,
Do you have an idea under which circumstances this bug might occur? We ran into it several times yesterday in our production instance (5.7.19-29.22-3 on Ubuntu 16.04 LTS, a 3-node cluster), with symptoms matching those of the original post. Our workaround was to revert to 5.7.16-27.19.
For reference, we’d see errors like these:
2017-11-28T01:45:45.840545Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 140037937301248 has waited at lock0lock.cc line 4427 for 365.00 seconds the semaphore:
Mutex at 0x7f967c634068, Mutex LOCK_SYS created lock0lock.cc:451, lock var 1
2017-11-28T01:45:45.840563Z 0 [Warning] InnoDB: A long semaphore wait:
--Thread 140038020478720 has waited at lock0lock.cc line 4427 for 365.00 seconds the semaphore:
Mutex at 0x7f967c634068, Mutex LOCK_SYS created lock0lock.cc:451, lock var 1
--Thread 140038339815168 has waited at ibuf0ibuf.cc line 2531 for 365.00 seconds the semaphore:
S-lock on RW-latch at 0x7f9663ac0870 created in file buf0buf.cc line 1456
a writer (thread id 140032994010880) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file ibuf0ibuf.cc line 4568
Last time write locked in file /mnt/workspace/percona-xtradb-cluster-5.7-debian-binary/label_exp/ubuntu-xenial-64bit/percona-xtradb-cluster-5.7-5.7.19-29.22/storage/innobase/ibuf/ibuf0ibuf.cc line 4568
I chose that excerpt because thread 140038339815168 was the main thread:
Process ID=181443, Main thread ID=140038339815168, state: doing insert buffer merge
Number of rows inserted 39760179, updated 55926900, deleted 38526, read 801630968
693.77 inserts/s, 0.00 updates/s, 0.00 deletes/s, 702.83 reads/s
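Side note for anyone chasing the same symptom: those warnings land in the error log, but the same SEMAPHORES section and main thread state show up live in SHOW ENGINE INNODB STATUS. One way to grab just that section from a shell (the sed range is simply one convenient way to slice the output):

mysql -e "SHOW ENGINE INNODB STATUS\G" | sed -n '/^SEMAPHORES$/,/^TRANSACTIONS$/p'

This is worth watching, since stock InnoDB will deliberately abort the server once a semaphore wait exceeds 600 seconds.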
I initially hypothesized that increased load, after about a week in production, had brought a concurrency issue to light. Specifically, when upgrading from 5.6 to 5.7 we removed the innodb_locks_unsafe_for_binlog setting (it no longer exists in 5.7); since that option had relaxed gap locking much the way READ COMMITTED does, dropping it put us back on full REPEATABLE READ locking behavior. When the problem was occurring, pt-stalk showed information like this:
*************************** 811. row ***************************
trx_id: 38964706
trx_state: RUNNING
trx_started: 2017-11-27 22:41:45
trx_requested_lock_id: NULL
trx_wait_started: NULL
trx_weight: 2
trx_mysql_thread_id: 16964
trx_query: INSERT INTO mdl_sessions (state,sid,sessdata,userid,timemodified,timecreated,lastip,firstip) VALUES('0','xxxxxxx',NULL,'0','1511844105','1511844105','xxx.xxx.xxx.xxx','xxx.xxx.xxx.xxx')
trx_operation_state: NULL
trx_tables_in_use: 1
trx_tables_locked: 1
trx_lock_structs: 1
trx_lock_memory_bytes: 1136
trx_rows_locked: 0
trx_rows_modified: 1
trx_concurrency_tickets: 0
trx_isolation_level: REPEATABLE READ
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 0
trx_is_read_only: 0
trx_autocommit_non_locking: 0
*************************** 1. row ***************************
lock_id: 38964255:7131:449:75
lock_trx_id: 38964255
lock_mode: X
lock_type: RECORD
lock_table: `moodle`.`mdl_sessions`
lock_index: PRIMARY
lock_space: 7131
lock_page: 449
lock_rec: 75
lock_data: 224357301
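pt-stalk gathers those rows from information_schema, so the same data can be pulled directly while the problem is happening; a couple of ad-hoc queries along those lines (note INNODB_LOCKS only lists locks involved in a conflict, and it was deprecated in favor of performance_schema.data_locks in 8.0):

mysql -e "SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id, trx_query FROM information_schema.INNODB_TRX\G"
mysql -e "SELECT * FROM information_schema.INNODB_LOCKS\G"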
Since all of our writes (and most reads) go to a single node, I wasn't expecting the isolation level to cause problems, but I changed it to READ-COMMITTED anyway via:
transaction_isolation = READ-COMMITTED
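On 5.7.19 the runtime variable is still named tx_isolation (transaction_isolation only became a variable alias in 5.7.20, though the option-file spelling above works regardless), so to verify the change took effect after restarting:

mysql -e "SELECT @@global.tx_isolation, @@session.tx_isolation;"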
Since I had made some changes before discovering this thread and reverting, our current configuration looks like this:
[sst]
inno-apply-opts = "--use-memory=40G"
inno-backup-opts = "--parallel=20 --stream=xbstream --compress --compress-threads=20"
[mysqld]
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
datadir = /var/lib/mysql/data
innodb = FORCE
innodb_use_native_aio = 0
innodb_file_format = Barracuda
innodb_file_per_table = 1
innodb_large_prefix = ON
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
skip-character-set-client-handshake = 1
key-buffer-size = 32M
max-heap-table-size = 48M
max-allowed-packet = 16M
tmp-table-size = 48M
query-cache-type = 0
query-cache-size = 0
max-connections = 2000
thread-cache-size = 50
open-files-limit = 65535
table-definition-cache = 4096
table-open-cache = 8192
innodb-log-files-in-group = 2
innodb-log-file-size = 2G
innodb-flush-log-at-trx-commit = 2
innodb-buffer-pool-size = 216G
innodb-buffer-pool-instances = 8
innodb-io-capacity = 6000
innodb-read-io-threads = 15
innodb-write-io-threads = 15
innodb_doublewrite = 1
innodb_adaptive_hash_index = False
pxc_strict_mode = PERMISSIVE
transaction_isolation = READ-COMMITTED
innodb-thread-concurrency = 0
innodb_flush_log_at_timeout = 2
log-error = /var/log/mysql/mysql-error.log
log-queries-not-using-indexes = 0
slow-query-log = 0
slow-query-log-file = /var/log/mysql/mysql-slow.log
long-query-time = 2
wsrep_provider = /usr/lib/libgalera_smm.so
wsrep_node_address = 172.22.xy.10
wsrep_cluster_name = prod
wsrep_node_name = sql.ppsb.learn.illinois.edu
wsrep_cluster_address = gcomm://172.22.xy.10,172.22.xx.10,172.22.xz.10
wsrep_provider_options = "gcache.size=4G;"
wsrep_retry_autocommit = 0
wsrep_sst_method = xtrabackup-v2
wsrep_slave_threads = 4
wsrep_sst_auth = sstuser:xxxxxxx
innodb-autoinc-lock-mode = 2
log-bin = mysql-binlog
binlog_format = ROW
expire-logs-days = 2
sync-binlog = 1
innodb-flush-neighbors = 0
innodb_checksum_algorithm = crc32
innodb_log_group_home_dir = /var/lib/mysql/innodb-logs
server-id = 7
innodb_numa_interleave = 0
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
Initially we had the InnoDB doublewrite buffer disabled (since we run on ZFS), pxc_strict_mode was at its default, and innodb_numa_interleave was set to 1.
Our servers are PowerEdge R630s with 256GB of RAM and an SSD for /var/lib/mysql, with /var/lib/mysql/data having a ZFS recordsize of 16k and /var/lib/mysql/innodb-logs having a recordsize of 128k. Each server has two Intel Xeon E5-2630 v4 processors.
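For completeness, the recordsize split was done per ZFS dataset, roughly like this (the dataset names here are made up for illustration; substitute your actual pool layout):

zfs set recordsize=16k tank/mysql-data        # matches InnoDB's 16k default page size
zfs set recordsize=128k tank/innodb-logs      # larger records suit the sequential redo log writes
zfs get recordsize tank/mysql-data tank/innodb-logs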