I’m having a weird problem with only one of my 3 PXC nodes. For the past couple of months it has been crashing at random intervals, with the exact same error every time. The mysql service starts up fine and receives a full SST from the donor, but then, sometimes after a couple of days and sometimes after a week, it crashes. The log is as follows:
2015-10-07 01:21:51 7fc7e8ff7700 InnoDB: Assertion failure in thread 140496584275712 in file fsp0fsp.cc line 1509
InnoDB: Failing assertion: frag_n_used > 0
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
23:21:51 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster
key_buffer_size=268435456
read_buffer_size=131072
max_used_connections=0
max_threads=2502
thread_count=3
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1258688 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7fc7cc000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fc7e8ff6a60 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x90013e]
/usr/sbin/mysqld(handle_fatal_signal+0x494)[0x698714]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7fc967f78cb0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7fc9673ce0d5]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x17b)[0x7fc9673d183b]
/usr/sbin/mysqld[0xaa97e7]
/usr/sbin/mysqld[0xab08d2]
/usr/sbin/mysqld[0xa15ffa]
/usr/sbin/mysqld[0xa19a7c]
/usr/sbin/mysqld[0xa13bd7]
/usr/sbin/mysqld[0xa14868]
/usr/sbin/mysqld[0xa148f2]
/usr/sbin/mysqld[0x91f798]
/usr/sbin/mysqld[0x927ef4]
/usr/sbin/mysqld(_Z13ha_commit_lowP3THDbb+0x112)[0x5e5262]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG26process_commit_stage_queueEP3THDS1_+0x36a)[0x8b2aba]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG14ordered_commitEP3THDbb+0x441)[0x8ba281]
/usr/sbin/mysqld(_ZN13MYSQL_BIN_LOG6commitEP3THDb+0x56c)[0x8ba90c]
/usr/sbin/mysqld(_Z15ha_commit_transP3THDbb+0x34f)[0x5e5aef]
/usr/sbin/mysqld(_Z12trans_commitP3THD+0x47)[0x7a32e7]
/usr/sbin/mysqld(_Z15wsrep_commit_cbPvjPK14wsrep_trx_metaPbb+0x20e)[0x5df42e]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM9apply_trxEPvPNS_9TrxHandleE+0x150)[0x7fc930223ed0]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM8recv_ISTEPv+0x2a4)[0x7fc930231d64]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM22request_state_transferEPvRK10wsrep_uuidlPKvl+0x681)[0x7fc930233921]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM19process_conf_changeEPvRK15wsrep_view_infoiNS_10Replicator5StateEl+0xb49)[0x7fc930226249]
/usr/lib/libgalera_smm.so(_ZN6galera15GcsActionSource8dispatchEPvRK10gcs_actionRb+0x67b)[0x7fc9301fe0cb]
/usr/lib/libgalera_smm.so(_ZN6galera15GcsActionSource7processEPvRb+0x5e)[0x7fc9301ff14e]
/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM10async_recvEPv+0x78)[0x7fc930226628]
/usr/lib/libgalera_smm.so(galera_recv+0x1e)[0x7fc930238b5e]
/usr/sbin/mysqld[0x5df881]
/usr/sbin/mysqld(start_wsrep_THD+0x2f8)[0x5c73b8]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7fc967f70e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fc96748b8bd]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 1
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
151007 01:21:52 mysqld_safe Number of processes running now: 0
151007 01:21:52 mysqld_safe WSREP: not restarting wsrep node automatically
151007 01:21:52 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended
Obviously I searched before posting, but every report I could find describes this kind of failure at service startup, not days after a clean start. I also did a full hardware check on this node and nothing came up. I even ran fsck on both members of my RAID1 array with no errors whatsoever, and smartctl came back clean as well. I can’t believe it has anything to do with InnoDB tablespace integrity, since the other 2 nodes hold the exact same data and haven’t thrown an error in all this time.
Does anyone have any clues?
EDIT: the cluster consists of 3 nodes running “5.6.24-72.2-56-log Percona XtraDB Cluster (GPL), Release rel72.2, Revision 43abf03, WSREP version 25.11, wsrep_25.11” with Galera 3.11 (r93aca2d).
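In case it helps anyone reading the backtrace in the log, here is a minimal Python sketch that pulls out only the frames that carry a symbol name, so the call chain is easier to follow. The function name named_frames and the short sample string are my own illustration (the sample is just four frames copied from the trace), not part of any Percona tooling. The symbols stay mangled; piping them through a demangler such as c++filt should make them readable.

```python
import re

# Matches frames such as "/usr/sbin/mysqld(_Z12trans_commitP3THD+0x47)[0x7a32e7]".
# Frames without a symbol name, e.g. "/usr/sbin/mysqld[0xaa97e7]" or
# "libpthread.so.0(+0xfcb0)[...]", are skipped on purpose.
FRAME_RE = re.compile(
    r"(\S+)\(([A-Za-z_]\w*)\+0x[0-9a-fA-F]+\)\[0x[0-9a-fA-F]+\]"
)


def named_frames(log_text):
    """Return (binary, symbol) pairs for every frame that has a symbol name."""
    return FRAME_RE.findall(log_text)


# Four frames copied from the backtrace; the first one has no symbol.
sample = (
    "/usr/sbin/mysqld[0x927ef4] "
    "/usr/sbin/mysqld(_Z13ha_commit_lowP3THDbb+0x112)[0x5e5262] "
    "/usr/sbin/mysqld(_Z12trans_commitP3THD+0x47)[0x7a32e7] "
    "/usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM8recv_ISTEPv+0x2a4)[0x7fc930231d64]"
)

for binary, symbol in named_frames(sample):
    print(binary, symbol)
```

Running it over the whole log instead of the sample only requires reading the error log file into a string and passing it to named_frames.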