Replication hanged at midnight 10/21/2012

Hi!

Percona-Server-server-55-5.5.16-rel22.0.189.rhel6.x86_64

Exactly at 0:00 replication stopped without any error message.
Simultaneously CPU usage raised as well as InnoDB active transactions
and mutex spin waits (please see RRD graphs below).

Slave IO thread still reads master binlog while slave SQL thread freezed:

Master_Log_File Read_Master_Log_Pos Relay_Log_File Relay_Log_Pos
mysql-bin.001042 371236869 relay-log.002962 500282540
mysql-bin.001042 371613510 relay-log.002962 500282540
mysql-bin.001042 371613510 relay-log.002962 500282540
mysql-bin.001042 388124382 relay-log.002962 500282540

Restarting slave and MySQL didn’t help.

Relay log file as well as master binlog lookes strange:

at 500282540

#121021 0:00:00 server id 103 end_log_pos 500282462 Query thread_id=233107850 exec_time=36 error_code=0
SET TIMESTAMP=1350763200/!/;
/!\C latin1 //!/;
SET @@session.character_set_client=8,@@session.collation_connection=8,@@session.collation_server=192/!/;
BEGIN
/!/;

at 500282608

at 500282664

at 500283693

at 500284722

at 500285751

at 500286780

at 500287808

at 500288837

at 500289865

at 500290894

SHOW ENGINE INNODB STATUS:

=====================================
121021 22:12:42 INNODB MONITOR OUTPUT

Per second averages calculated from the last 3 seconds

BACKGROUND THREAD

srv_master_thread loops: 18324 1_second, 18323 sleeps, 1813 10_second, 467 background, 467 flush
srv_master_thread log flush and writes: 18387

SEMAPHORES

OS WAIT ARRAY INFO: reservation count 46729, signal count 2366764
Mutex spin waits 38592331, rounds 40147028, OS waits 28623
RW-shared spins 328396, rounds 1436974, OS waits 6949
RW-excl spins 298041, rounds 3014995, OS waits 9560
Spin rounds per wait: 1.04 mutex, 4.38 RW-shared, 10.12 RW-excl

FILE I/O

I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: 0 [0, 0, 0, 0] , aio writes: 0 [0, 0, 0, 0] ,
ibuf aio reads: 0, log i/o’s: 0, sync i/o’s: 0
Pending flushes (fsync) log: 0; buffer pool: 0
109627 OS file reads, 187792 OS file writes, 31338 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 1.00 writes/s, 1.00 fsyncs/s

INSERT BUFFER AND ADAPTIVE HASH INDEX

Ibuf: size 1, free list len 15180, seg size 15182, 13461 merges
merged operations:
insert 0, delete mark 62934, delete 0
discarded operations:
insert 0, delete mark 0, delete 0
Hash table size 84999127, node heap has 4903 buffer(s)
692966.68 hash searches/s, 2839.39 non-hash searches/s

LOG

Log sequence number 4779687205905
Log flushed up to 4779687205611
Last checkpoint at 4779687203740
Max checkpoint age 3224918385
Checkpoint age target 3124139686
Modified age 2165
Checkpoint age 2165
0 pending log writes, 0 pending chkp writes
21166 log i/o’s done, 1.00 log i/o’s/second

BUFFER POOL AND MEMORY

Total memory allocated 44124078080; in additional pool allocated 0
Internal hash tables (constant factor + variable factor)
Adaptive hash index 760327984 (679993016 + 80334968)
Page hash 42500456 (buffer pool 0 only)
Dictionary cache 171036778 (169999664 + 1037114)
File system 302232 (82672 + 219560)
Lock system 106544576 (106249768 + 294808)
Recovery system 0 (0 + 0)
Dictionary memory allocated 1037114
Buffer pool size 2621439
Buffer pool size, bytes 42949656576
Free buffers 2490028
Database pages 126491
Old database pages 46674
Modified db pages 9
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 109525, created 30009, written 160436
0.00 reads/s, 0.00 creates/s, 0.00 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 126491, unzip_LRU len: 0
I/O sum[116]:cur[0], unzip sum[0]:cur[0]

ROW OPERATIONS

0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
—OLDEST VIEW—
Normal read view
Read view low limit trx n:o 2B6D0A6DF
Read view up limit trx id 2B6910600
Read view low limit trx id 2B6D0A6DF
Read view individually stored trx ids:
Read view trx id 2B6910600

Main thread process no. 4961, id 139991755536128, state: sleeping
Number of rows inserted 2584527, updated 1760614, deleted 1433650, read 572374996
0.00 inserts/s, 0.00 updates/s, 4.67 deletes/s, 5.00 reads/s

TRANSACTIONS

Trx id counter 2B6D0A715
Purge done for trx’s n:o < 2B6D0A6C9 undo n:o < 0
History list length 341
LIST OF TRANSACTIONS FOR EACH SESSION:
—TRANSACTION 0, not started
MySQL thread id 125110, OS thread handle 0x7f5e648ed700, query id 224105070 10.201.201.70 mdb
SHOW ENGINE INNODB STATUS
—TRANSACTION 2B6D0A708, not started
MySQL thread id 125101, OS thread handle 0x7f5e649e2700, query id 224104922 192.168.51.51 imdbuser
—TRANSACTION 2B6D0A714, not started
MySQL thread id 15, OS thread handle 0x7f5e66d72700, query id 224105066 localhost collectd
—TRANSACTION 0, not started
MySQL thread id 13, OS thread handle 0x7f5e66dd4700, query id 224105058 localhost collectd
—TRANSACTION 2B6910600, ACTIVE 18755 sec starting index read
mysql tables in use 1, locked 1
1998 lock struct(s), heap size 293304, 625515 row lock(s), undo log entries 312435
MySQL thread id 2, OS thread handle 0x7f525b15b700, query id 3 Reading event from the relay log
TABLE LOCK table imdb.jrnl_im_items trx id 2B6910600 lock mode IX
RECORD LOCKS space id 97 page no 179011 n bits 864 index enable of table imdb.jrnl_im_items trx id 2B6910600 lock_mode X
RECORD LOCKS space id 97 page no 79017 n bits 368 index GEN_CLUST_INDEX of table imdb.jrnl_im_items trx id 2B6910600 lock_mode X locks rec but not gap
RECORD LOCKS space id 97 page no 79018 n bits 352 index GEN_CLUST_INDEX of table imdb.jrnl_im_items trx id 2B6910600 lock_mode X locks rec but not gap
RECORD LOCKS space id 97 page no 179012 n bits 864 index enable of table imdb.jrnl_im_items trx id 2B6910600 lock_mode X
RECORD LOCKS space id 97 page no 79019 n bits 352 index GEN_CLUST_INDEX of table imdb.jrnl_im_items trx id 2B6910600 lock_mode X locks rec but not gap
RECORD LOCKS space id 97 page no 79020 n bits 352 index GEN_CLUST_INDEX of table imdb.jrnl_im_items trx id 2B6910600 lock_mode X locks rec but not gap
RECORD LOCKS space id 97 page no 179013 n bits 864 index enable of table imdb.jrnl_im_items trx id 2B6910600 lock_mode X
RECORD LOCKS space id 97 page no 79021 n bits 352 index GEN_CLUST_INDEX of table imdb.jrnl_im_items trx id 2B6910600 lock_mode X locks rec but not gap
RECORD LOCKS space id 97 page no 79022 n bits 352 index GEN_CLUST_INDEX of table imdb.jrnl_im_items trx id 2B6910600 lock_mode X locks rec but not gap
TOO MANY LOCKS PRINTED FOR THIS TRX: SUPPRESSING FURTHER PRINTS

END OF INNODB MONITOR OUTPUT