mysql crashes with InnoDB: Assertion failure in thread 1459657024 in file srv/srv0srv.c line 2523

Hi,

We are using Percona MySQL builds on CentOS-5.5. The details are as follows

Hardware: HP dc7900 with 16 GB RAM, Intel Core 2 Duo E8400/3Ghz/4GB RAM, 160GB HDD.

OS: CentOS release 5.5 (Final) / Linux blr-cos-mdb01.digi.com 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux.

Percona MySQL:
Percona-XtraDB-1.0.6-10.2-5.1.45-10.2.rhel5
Percona-Server-server-51-5.1.47-rel11.1.51.rhel5
Percona-XtraDB-1.0.3-5-5.1.34-5.rhel5
Percona-Server-shared-compat-5.1.43-3
Percona-Server-client-51-5.1.47-rel11.1.51.rhel5
Percona-Server-test-51-5.1.47-rel11.1.51.rhel5
Percona-Server-devel-51-5.1.47-rel11.1.51.rhel5
Percona-Server-shared-51-5.1.47-rel11.1.51.rhel5

The MySQL Server crashes 3~4 times a days along with crashing the system totally by rendering it totally unresponsive. The ping /Magic-SysRq do not work at all. We have to do hard power recycling (plug out the power cable and plug in back after 30 secs).

The /var/log/mysqld.log has

[B]Quote:[/B]
InnoDB: ###### Diagnostic info printed to the standard error stream InnoDB: Error: semaphore wait has lasted > 600 seconds InnoDB: We intentionally crash the server, because it appears to be hung. 100824 15:19:57 InnoDB: Assertion failure in thread 1459657024 in file srv/srv0srv.c line 2523 InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to [URL]http://bugs.mysql.com.[/URL] InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: [URL="http://dev.mysql.com/doc/refman/5.1/en/forcing-recovery.html"] http://dev.mysql.com/doc/refman/5.1/en/forcing-recovery.html[/URL] InnoDB: about forcing recovery. 100824 15:19:57 - mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.

key_buffer_size=33554432
read_buffer_size=2097152
max_used_connections=101
max_threads=500
threads_connected=100
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 5158166 K
bytes of memory
Hope that’s ok; if not, decrease some variables in the equation.

thd: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong…
stack_bottom = (nil) thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x86892e]
/usr/sbin/mysqld(handle_segfault+0x322)[0x5b0f72]
/lib64/libpthread.so.0[0x35d480eb10]
/lib64/libc.so.6(gsignal+0x35)[0x35d3c30265]
/lib64/libc.so.6(abort+0x110)[0x35d3c31d10]
/usr/sbin/mysqld[0x7914bb]
/lib64/libpthread.so.0[0x35d480673d]
/lib64/libc.so.6(clone+0x6d)[0x35d3cd3d1d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

The “–memlock” argument, which was enabled, uses system calls that are
unreliable and unstable on some operating systems and operating-system
versions (notably, some versions of Linux). This crash could be due to use
of those buggy OS calls. You should consider whether you really need the
“–memlock” parameter and/or consult the OS distributer about “mlockall”
bugs.
100824 18:22:30 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql/data
100824 18:22:30 [Note] Plugin ‘FEDERATED’ is disabled.
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins
InnoDB: Compressed tables use zlib 1.2.3
100824 18:22:32 InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 51613706496
100824 18:22:32 InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files…
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer…
InnoDB: Doing recovery: scanned up to log sequence number 51613709094
InnoDB: 2 transaction(s) which must be rolled back or cleaned up
InnoDB: in total 9 row operations to undo
InnoDB: Trx id counter is 5989A00
100824 18:22:46 InnoDB: Starting an apply batch of log records to the database…
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 57735, file name /var/lib/mysql/binlogs/mtrak32-bin.000123
InnoDB: Starting in background the rollback of uncommitted transactions
100824 18:22:47 InnoDB: Rolling back trx with id 5989818, 6 rows to undo
100824 18:22:47 Percona XtraDB (http://www.percona.com) 1.0.8-11.1 started; log sequence number 51613709094
100824 18:22:47 [Note] Recovering after a crash using /var/lib/mysql/binlogs/mtrak32-bin
100824 18:22:47 [Note] Starting crash recovery…
100824 18:22:47 [Note] Crash recovery finished.
InnoDB: Rolling back of trx id 5989818 completed
100824 18:22:48 InnoDB: Rolling back trx with id 5989817, 3 rows to undo
InnoDB: Rolling back of trx id 5989817 completed
100824 18:22:48 InnoDB: Rollback of non-prepared transactions completed
100824 18:22:53 [Note] Event Scheduler: Loaded 0 events
100824 18:22:53 [Note] /usr/sbin/mysqld: ready for connections.
Version: ‘5.1.47-rel11.1-log’ socket: ‘/var/lib/mysql/mysql.sock’ port: 3306 Percona Server (GPL), 11.1, Revision 51
100824 18:27:57 [ERROR] /usr/sbin/mysqld: Table ‘./mobiq/jms_messages’ is marked as crashed and should be repaired
100824 18:27:57 [ERROR] /usr/sbin/mysqld: Table ‘./mobiq/jms_messages’ is marked as crashed and should be repaired
100824 18:27:57 [ERROR] /usr/sbin/mysqld: Table ‘./mobiq/jms_messages’ is marked as crashed and should be repaired
100824 18:27:57 [ERROR] /usr/sbin/mysqld: Table ‘./mobiq/jms_messages’ is marked as crashed and should be repaired

and the /var/log/messages has
[B]Quote:[/B]
Aug 24 18:05:35 blr-cos-mdb01 kernel: BUG: soft lockup - CPU#0 stuck for 10s! [mysqld:5365] Aug 24 18:05:35 blr-cos-mdb01 kernel: CPU 0: Aug 24 18:05:45 blr-cos-mdb01 kernel: Modules linked in: ipv6 xfrm_nalgo crypto_api hidp l2cap bluetooth lockd sunrpc cpufreq_ondemand acpi_cpufreq freq_table dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy snd_hda_intel sr_mod tpm_infineon snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq cdrom tpm snd_seq_device snd_pcm_oss snd_mixer_oss tpm_bios snd_pcm e1000e snd_timer shpchp serio_raw snd_page_alloc snd_hwdep pcspkr sg snd soundcore dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Aug 24 18:05:45 blr-cos-mdb01 kernel: Pid: 5365, comm: mysqld Not tainted 2.6.18-194.3.1.el5 #1 Aug 24 18:05:45 blr-cos-mdb01 kernel: RIP: 0010:[] [] __d_lookup+0xe2/0xff Aug 24 18:05:45 blr-cos-mdb01 kernel: RSP: 0018:ffff8101a415fc88 EFLAGS: 00000282 Aug 24 18:05:45 blr-cos-mdb01 kernel: RAX: ffff8103c6c864c8 RBX: ffff8103c6c864c8 RCX: 0000000000000015 Aug 24 18:05:45 blr-cos-mdb01 kernel: RDX: 00000000000db1d6 RSI: ffff8101a415fd28 RDI: ffff8103c83274b0 Aug 24 18:05:45 blr-cos-mdb01 kernel: RBP: ffff810417bbf800 R08: 0000000000008001 R09: ffff81041747e5c0 Aug 24 18:05:45 blr-cos-mdb01 kernel: R10: ffff810188c8c580 R11: ffffffff8002c3e0 R12: ffff81040c0840c0 Aug 24 18:05:45 blr-cos-mdb01 kernel: R13: 0000000000000000 R14: ffff8101a394e348 R15: ffff8101a394e348 Aug 24 18:05:45 blr-cos-mdb01 kernel: FS: 00000000405c8940(0063) GS:ffffffff803ca000(0000) knlGS:0000000000000000 Aug 24 18:05:45 blr-cos-mdb01 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Aug 24 18:05:45 blr-cos-mdb01 kernel: CR2: 00002aace6224000 CR3: 0000000401db3000 CR4: 00000000000006e0 Aug 24 18:05:45 blr-cos-mdb01 kernel: Aug 24 18:05:45 blr-cos-mdb01 kernel: Call Trace: Aug 24 18:05:45 blr-cos-mdb01 kernel: [] __d_lookup+0xb0/0xff Aug 24 18:05:45 blr-cos-mdb01 kernel: [] do_lookup+0x2c/0x1e6 Aug 24 18:05:45 blr-cos-mdb01 kernel: [] __link_path_walk+0xa01/0xf42 Aug 24 18:05:45 blr-cos-mdb01 kernel: [] link_path_walk+0x42/0xb2 Aug 24 18:05:45 blr-cos-mdb01 kernel: [] do_path_lookup+0x275/0x2f1 Aug 24 18:05:45 blr-cos-mdb01 kernel: [] __path_lookup_intent_open+0x56/0x97 Aug 24 18:05:45 blr-cos-mdb01 kernel: [] open_namei+0x73/0x6d5 Aug 24 18:05:45 blr-cos-mdb01 kernel: [] do_page_fault+0x4fe/0x874 Aug 24 18:05:45 blr-cos-mdb01 kernel: [] do_filp_open+0x1c/0x38 Aug 24 18:05:45 blr-cos-mdb01 kernel: [] _atomic_dec_and_lock+0x39/0x57 Aug 24 18:05:45 blr-cos-mdb01 kernel: [] do_sys_open+0x44/0xbe Aug 24 18:05:45 blr-cos-mdb01 kernel: [] tracesys+0xd5/0xe0 Aug 24 18:05:45 blr-cos-mdb01 kernel: Aug 24 18:22:23 blr-cos-mdb01 syslogd 1.4.1: restart.
My /etc/my.cnf is
[B]Quote:[/B]
[client] port = 3306 socket = /var/lib/mysql/mysql.sock

[mysqld]
port = 3306
datadir = /var/lib/mysql/data
socket = /var/lib/mysql/mysql.sock
user=mysql
back_log = 50
skip-name-resolve
max_connections = 500
max_connect_errors = 10
table_cache = 4096
max_allowed_packet = 64M
innodb-status-file=1
binlog_cache_size = 32M
max_heap_table_size = 64M
lower_case_table_names=1
sort_buffer_size = 8M
join_buffer_size = 8M
thread_cache_size = 8
thread_concurrency = 8
query_cache_size = 64M
query_cache_limit = 64M
ft_min_word_len = 4
memlock
default_table_type = MYISAM
thread_stack = 192K
transaction_isolation = READ-COMMITTED
max_write_lock_count = 1
net_read_timeout = 60
tmp_table_size = 64M
log_bin = /var/lib/mysql/binlogs/mtrak32-bin
log-bin-index = /var/lib/mysql/binlogs/mtrak32-bin.index
expire_logs_days = 10
max_binlog_size=100M
max_binlog_cache_size=32M
binlog_format=MIXED
binlog-ignore-db=mysql
binlog-ignore-db=test
binlog-ignore-db=information_schema
relay_log = /var/lib/mysql/binlogs/mysql-relay-bin
relay_log_index = /var/lib/mysql/binlogs/mysql-relay-bin.index
log_warnings
long_query_time = 2
server-id = 1
key_buffer_size = 32M
read_buffer_size = 2M
read_rnd_buffer_size = 32M
bulk_insert_buffer_size = 64M
myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 10G
myisam_repair_threads = 10
innodb_file_per_table=1
innodb_additional_mem_pool_size = 20M
innodb_buffer_pool_size = 8G
innodb_data_file_path = ibdata1:100M:autoextend
innodb_read_io_threads = 20
innodb_write_io_threads = 10
innodb_thread_concurrency = 16
innodb_flush_log_at_trx_commit = 2
innodb_log_file_size = 256M
innodb_max_dirty_pages_pct = 90
innodb_flush_method = O_DIRECT
innodb_autoextend_increment = 1
innodb_open_files = 4096
sync-binlog=1
log-slave-updates
auto_increment_increment = 10
auto_increment_offset = 1
innodb_io_capacity = 2000
innodb_log_files_in_group = 3
innodb_log_buffer_size = 64M
innodb_extra_rsegments = 32
innodb_lock_wait_timeout = 120
innodb_table_locks = 0

[mysqldump]
quick
max_allowed_packet = 64M

[mysql]
no-auto-rehash

[isamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M
[myisamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M

[mysqlhotcopy]
interactive-timeout

[mysqld_safe]
open-files-limit = 10240
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

The /etc/security/limits.conf has
[B]Quote:[/B]
mysql soft nofile 24000 mysql hard nofile 32000
I googled and got a bug report [URL]http://bugs.mysql.com/bug.php?id=25645[/URL]

How to resolve this ?

Regards
Shann

I suggest you check your hardware thoroughly. It looks to me like the server is crashing MySQL, not the other way around.

Issue resolved. BAD … BAD MEMORY, after removing one if the DIMM’s, the server no longer crashes.

I wonder how memtest passed the Memory !!!