Heavy swap usage with MySQL 5.5 following kernel BUG: soft lockup

We just built a new development server running CentOS 6.3 (kernel 2.6.32-279.5.2.el6.x86_64) with 4 CPUs and 24GB of memory. The server is a VM running on ESXi 5. The MySQL version is 5.5.27, with 74 databases totaling 28,429 tables.

The import of the databases went without a hitch; this morning, however, the following error appeared on the console:

Message from syslogd@ksoppiscesdbdev02 at Sep 18 08:37:47 … kernel:BUG: soft lockup - CPU#1 stuck for 67s! [mysqld:16744]

This has only occurred once today, but since then the system has been using 9GB of its 12GB of swap space.
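In case it is useful, this is how I have been checking which processes the swap is actually charged to. It is only a quick sketch that sums the VmSwap field from /proc/<pid>/status; mainline kernels gained that field in 2.6.34, and I am assuming the RHEL/CentOS 6 kernel backports it, so if the field is missing the script will simply print nothing.

#!/usr/bin/env python
# Quick sketch: sum the VmSwap field from /proc/<pid>/status to see which
# processes the swap is actually charged to. Assumes the kernel exposes
# VmSwap (mainline 2.6.34+; I believe the RHEL/CentOS 6 kernel backports it).
import os

usage = []
for pid in os.listdir('/proc'):
    if not pid.isdigit():
        continue
    try:
        name, swap_kb = '?', 0
        with open('/proc/%s/status' % pid) as f:
            for line in f:
                if line.startswith('Name:'):
                    name = line.split()[1]
                elif line.startswith('VmSwap:'):
                    swap_kb = int(line.split()[1])   # value is in kB
        if swap_kb:
            usage.append((swap_kb, pid, name))
    except IOError:
        continue  # process went away while we were reading it

# Top 10 swap users, largest first
for swap_kb, pid, name in sorted(usage, reverse=True)[:10]:
    print('%8d kB  pid %-6s %s' % (swap_kb, pid, name))

In my case nearly all of the swapped pages are attributed to mysqld.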

This system is to replace our dev server running CentOS 5.6 with 4 CPUs and 8GB of memory on the same ESXi 5 host. While the old system could use more memory, it does not swap like the new box.

Any ideas about what could be causing this? Any help would be greatly appreciated.

Below are the contents of my.cnf and the full output from the syslog.
my.cnf

[mysqld]
## General
datadir = /var/lib/mysql
tmpdir = /var/lib/mysqltmp
socket = /var/lib/mysql/mysql.sock
skip-name-resolve
sql-mode = NO_ENGINE_SUBSTITUTION
#event-scheduler = 1
## Cache
thread-cache-size = 16
table-open-cache = 20000
table-definition-cache = 2048
query-cache-size = 64M
query-cache-limit = 10M
## Per-thread Buffers
sort-buffer-size = 1M
read-buffer-size = 1M
read-rnd-buffer-size = 1M
join-buffer-size = 24M
## Temp Tables
tmp-table-size = 512M
max-heap-table-size = 512M
## Networking
back-log = 100
max-connections = 300
max-connect-errors = 10000
max-allowed-packet = 16M
interactive-timeout = 600
wait-timeout = 28800
### Storage Engines
default-storage-engine = InnoDB
innodb = FORCE
## MyISAM
key-buffer-size = 64M
myisam-sort-buffer-size = 128M
## InnoDB
innodb-buffer-pool-size = 18G
innodb-log-file-size = 100M
innodb-log-buffer-size = 10M
innodb-flush-method = O_DIRECT
innodb-file-per-table = 1
innodb-flush-log-at-trx-commit = 1
#innodb-open-files = 20000
## Replication
server-id = 1
#log-bin = /var/lib/mysqllogs/bin-log
#relay-log = /var/lib/mysqllogs/relay-log
relay-log-space-limit = 16G
expire-logs-days = 7
#read-only = 1
#sync-binlog = 1
#log-slave-updates = 1
#binlog-format = STATEMENT
#auto-increment-offset = 1
#auto-increment-increment = 2
## Logging
log-output = FILE
slow-query-log = 1
slow-query-log-file = /var/lib/mysqllogs/slow-log
#log-slow-slave-statements
long-query-time = 2

[mysqld_safe]
log-error = /var/log/mysqld.log
open-files-limit = 65535

[mysql]
no-auto-rehash

[mysqldump]
quick
max_allowed_packet = 16M
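As a sanity check on the settings above, here is a rough worst-case estimate of what this configuration can allocate, using the usual "global buffers plus per-connection buffers times max-connections" rule of thumb. It is only a sketch and overstates real usage, since connections rarely allocate every buffer at once, but it suggests the config can outgrow the 24GB of RAM if enough connections get busy.

# Rough worst-case memory estimate for the my.cnf above. Sketch only: the
# "global buffers + per-connection buffers * max-connections" rule of thumb
# overstates real usage, but it shows how close this config can get to 24GB.
GB, MB = 1024 ** 3, 1024 ** 2

global_buffers = (
    18 * GB      # innodb-buffer-pool-size
    + 10 * MB    # innodb-log-buffer-size
    + 64 * MB    # key-buffer-size
    + 64 * MB    # query-cache-size
)

per_connection = (
    1 * MB       # sort-buffer-size
    + 1 * MB     # read-buffer-size
    + 1 * MB     # read-rnd-buffer-size
    + 24 * MB    # join-buffer-size
)

max_connections = 300
worst_case = global_buffers + per_connection * max_connections

print('global buffers  : %.2f GB' % (global_buffers / float(GB)))
print('per connection  : %d MB x %d connections = %.2f GB'
      % (per_connection // MB, max_connections,
         per_connection * max_connections / float(GB)))
print('worst case total: %.2f GB (on a 24 GB box)' % (worst_case / float(GB)))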

/var/log/messages

Sep 18 08:37:47 ksoppiscesdbdev02 kernel: BUG: soft lockup - CPU#1 stuck for 67s! [mysqld:16744]
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: Modules linked in: vsock(U) vmci(U) bnx2fc fcoe libfcoe libfc 8021q garp scsi_transport_fc stp scsi_tgt llc sunrpc ipt_REJECT ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ppdev parport_pc parport vmware_balloon shpchp sg i2c_piix4 i2c_core ext4 mbcache jbd2 e1000 sd_mod crc_t10dif sr_mod cdrom mptspi mptscsih mptbase scsi_transport_spi pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: nf_defrag_ipv4]
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: CPU 1
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: Modules linked in: vsock(U) vmci(U) bnx2fc fcoe libfcoe libfc 8021q garp scsi_transport_fc stp scsi_tgt llc sunrpc ipt_REJECT ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ppdev parport_pc parport vmware_balloon shpchp sg i2c_piix4 i2c_core ext4 mbcache jbd2 e1000 sd_mod crc_t10dif sr_mod cdrom mptspi mptscsih mptbase scsi_transport_spi pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio ipv6 cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi [last unloaded: nf_defrag_ipv4]
Sep 18 08:37:47 ksoppiscesdbdev02 kernel:
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: Pid: 16744, comm: mysqld Not tainted 2.6.32-279.5.2.el6.x86_64 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: RIP: 0010:[] [] _spin_unlock_irqrestore+0x17/0x20
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: RSP: 0000:ffff880551631d88 EFLAGS: 00000282
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: RAX: dead000000200200 RBX: ffff880551631d88 RCX: 0000000000000000
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: RDX: ffff8804e8b79ca8 RSI: 0000000000000282 RDI: 0000000000000282
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: RBP: ffffffff8100bc0e R08: ffff880551631e50 R09: 00000000ffffffff
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: R13: 0000000000000000 R14: ffff880551631db8 R15: ffffffff8100bc0e
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: FS: 00007fb192e7a700(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: CR2: 00007ff0f73c7000 CR3: 0000000452a54000 CR4: 00000000000006e0
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: Process mysqld (pid: 16744, threadinfo ffff880551630000, task ffff88062aa18080)
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: Stack:
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: ffff880551631db8 ffffffff8109248c ffffffff81092450 ffff8804e8b79c80
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: 0000000000000001 ffff88062aa18080 ffff880551631f38 ffffffff811c4139
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: 00000000ffffffff ffff880551631e38 0000000000000000 ffff88062aa18080
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: Call Trace:
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? remove_wait_queue+0x3c/0x50
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? remove_wait_queue+0x0/0x50
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? read_events+0x329/0x450
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? default_wake_function+0x0/0x20
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? timeout_func+0x0/0x20
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? sys_io_getevents+0x56/0xb0
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? sys_io_getevents+0x1/0xb0
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? system_call_fastpath+0x16/0x1b
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: Code: 00 00 00 01 74 05 e8 69 e3 d7 ff c9 c3 0f 1f 80 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 48 89 fa c7 02 00 00 00 00 48 89 f7 57 9d <0f> 1f 44 00 00 c9 c3 66 90 55 48 89 e5 0f 1f 44 00 00 f0 ff 07
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: Call Trace:
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? remove_wait_queue+0x3c/0x50
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? remove_wait_queue+0x0/0x50
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? read_events+0x329/0x450
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? default_wake_function+0x0/0x20
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? timeout_func+0x0/0x20
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? sys_io_getevents+0x56/0xb0
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? sys_io_getevents+0x1/0xb0
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: [] ? system_call_fastpath+0x16/0x1b
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: INFO: task mysqld:17859 blocked for more than 120 seconds.
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: mysqld D 0000000000000002 0 17859 16104 0x00000080
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: ffff88027df7bdf8 0000000000000082 ffff88060383cde0 0000000000000082
Sep 18 08:37:47 ksoppiscesdbdev02 kernel: ffff88027df7bd88 ffff880629145e80 0000000000000000 ffff8806297c90d0

Hi

I suspect you are hitting this bug in CentOS 6.3 ([URL]CentOS Bug Tracker). You said it was working properly with CentOS 5.6; was the MySQL version also changed when migrating from CentOS 5.6 to 6.3? If not, then it doesn't look like a MySQL issue. Can you please check and let me know?

Regards,
Nil

Thanks for the suggestion, Nil. As part of the migration we went from MySQL 5.1 to 5.5.

Since my initial post, I have done testing on a staging server with the same OS (CentOS 6.3, same kernel version as the dev box) and the same MySQL version (5.5): no soft lockups and only minimal swapping under load. That box runs on a different ESXi cluster at our co-lo data center; the VMware host software is the same, but the hardware is different: 4 x Dell PowerEdge R910s there versus 4 x HP ProLiant DL 580 G7s in our local data center.
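To compare the two environments I have been dumping the guest-side memory settings on each box with the small script below. It is only a sketch that reads the standard /proc files (swappiness, overcommit, the headline /proc/meminfo numbers) and checks whether the vmware_balloon module is loaded, since host-level ballooning on an overcommitted cluster could push a guest into swap even when MySQL itself stays within its budget.

# Sketch: dump the guest-side memory knobs so the dev and staging boxes can
# be compared. Only reads standard /proc files on RHEL/CentOS 6.
def show(path):
    try:
        with open(path) as f:
            print('%s: %s' % (path, f.read().strip()))
    except IOError:
        print('%s: <not available>' % path)

show('/proc/sys/vm/swappiness')
show('/proc/sys/vm/overcommit_memory')

# Headline numbers from /proc/meminfo
wanted = ('MemTotal', 'MemFree', 'SwapTotal', 'SwapFree', 'Committed_AS')
with open('/proc/meminfo') as f:
    for line in f:
        if line.split(':')[0] in wanted:
            print(line.strip())

# Is the VMware balloon driver loaded? Host-level ballooning can push a
# guest into swap even when MySQL itself stays within its budget.
with open('/proc/modules') as f:
    balloon = [l.split()[0] for l in f if l.startswith('vmware_balloon')]
print('balloon driver loaded: %s' % (', '.join(balloon) or 'no'))

So far the two boxes report the same guest-side settings, which points at the host side rather than the OS or MySQL configuration.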

It has us very puzzled. I am rebuilding the local dev box with CentOS 6.2 to see if the problem still occurs.