Percona Server for MongoDB crash

I have PSMDB 3.0.10-1.5 with the PerconaFT engine running on CentOS 6.7 (kernel 2.6.32-573.22.1.el6.x86_64). The system has 24 GB of RAM and 8 CPU cores.

Here is my mongod configuration file:

dbpath = /data/db
storageEngine = PerconaFT
PerconaFTIndexCompression = lzma
PerconaFTCollectionCompression = lzma
PerconaFTEngineDirectio = true
logpath = /var/log/mongo/mongod.log
logappend = true
port = 27017
journal = true
fork = true
profile = 1
rest = true
slowms = 1000
quiet = true

I am observing frequent mongod crashes. The crash log is below.

2016-04-25T20:02:51.591+0530 F - Invalid access at address: 0
2016-04-25T20:02:51.672+0530 F - Got signal: 11 (Segmentation fault).

0x10b7ca2 0x10b7553 0x10b78b4 0x7fa6d5b6f7e0 0x15c37ff 0x15c5f8e 0x162eeb4 0x15d2d4b 0x16008a7 0x1601e64 0x15b62ae 0x15fe4ed 0x1621f26 0x7fa6d5b67aa1 0x7fa6d46a093d
----- BEGIN BACKTRACE -----
{“backtrace”:[{“b”:“400000”,“o”:“CB7CA2”,“s”:“_ZN5mongo15printStackTraceERSo”},{“b”:“400000”,“o”:“CB7553”},{“b”:“400000”,“o”:“CB78B4”},{“b”:“7FA6D5B60000”,“o”:“F7E0”},{“b”:“400000”,“o”:“11C37FF”,“s”:“Z7le_packP3uleP7bn_datajPvjjjPP9leafentryPS3”},{“b”:“400000”,“o”:“11C5F8E”,“s”:“_Z23toku_le_garbage_collectP9leafentryP7bn_datajPvjP11txn_gc_infoPS0_Pl”},{“b”:“400000”,“o”:“122EEB4”,“s”:“_Z23toku_ftnode_leaf_run_gcP2ftP6ftnode”},{“b”:“400000”,“o”:“11D2D4B”,“s”:“_Z26toku_ftnode_flush_callbackP9cachefilei10blocknum_sPvPS2_S2_11pair_attr_sPS4_bbbb”},{“b”:“400000”,“o”:“12008A7”,“s”:“_ZN12checkpointer24checkpoint_pending_pairsEv”},{“b”:“400000”,“o”:“1201E64”,“s”:“ZN12checkpointer14end_checkpointEPFvPvES0”},{“b”:“400000”,“o”:“11B62AE”,“s”:“_Z15toku_checkpointP12checkpointerP10tokuloggerPFvPvES3_S5_S3_19checkpoint_caller_t”},{“b”:“400000”,“o”:“11FE4ED”},{“b”:“400000”,“o”:“1221F26”},{“b”:“7FA6D5B60000”,“o”:“7AA1”},{“b”:“7FA6D45B8000”,“o”:“E893D”,“s”:“clone”}],“processInfo”:{ “mongodbVersion” : “3.0.10-1.5”, “gitVersion” : “376fb629b3911b74ee040029dd2bdc57fb2927af”, “uname” : { “sysname” : “Linux”, “release” : “2.6.32-573.22.1.el6.x86_64”, “version” : “#1 SMP Wed Mar 23 03:35:39 UTC 2016”, “machine” : “x86_64” }, “somap” : [ { “elfType” : 2, “b” : “400000”, “buildId” : “B8B62B3C418730D5E40C1942E70D4E6ED3C3128B” }, { “b” : “7FFF42EF8000”, “elfType” : 3, “buildId” : “2F78F7B8A7307DD9C340F3CA735BE2CAA9C157D1” }, { “b” : “7FA6D61B8000”, “path” : “/lib64/libz.so.1”, “elfType” : 3, “buildId” : “5FA8E5038EC04A774AF72A9BB62DC86E1049C4D6” }, { “b” : “7FA6D5FA0000”, “path” : “/lib64/libbz2.so.1”, “elfType” : 3, “buildId” : “732F8FD5054C4FA43CF0CD4CC8C5FF02CEA3CC54” }, { “b” : “7FA6D5D80000”, “path” : “/usr/lib64/libsasl2.so.2”, “elfType” : 3, “buildId” : “90F78E5E1AF40EE481268BEAF15F9A2E3E8EFBD2” }, { “b” : “7FA6D5B60000”, “path” : “/lib64/libpthread.so.0”, “elfType” : 3, “buildId” : “C56DD1B811FC0D9263248EBB308C73FCBCD80FC1” }, { “b” : “7FA6D58F0000”, “path” : “/usr/lib64/libssl.so.10”, 
“elfType” : 3, “buildId” : “B84C31B86733DE212F6886FE6F55630FE56180A9” }, { “b” : “7FA6D5508000”, “path” : “/usr/lib64/libcrypto.so.10”, “elfType” : 3, “buildId” : “A30A68D2F579614CBEA988BDAAC20CD56D8C48FC” }, { “b” : “7FA6D5300000”, “path” : “/lib64/librt.so.1”, “elfType” : 3, “buildId” : “95159178F1A4A3DBDC7819FBEA2C80E5FCDD6BAC” }, { “b” : “7FA6D50F8000”, “path” : “/lib64/libdl.so.2”, “elfType” : 3, “buildId” : “29B61382141595ECBA6576232E44F2310C3AAB72” }, { “b” : “7FA6D4DF0000”, “path” : “/usr/lib64/libstdc++.so.6”, “elfType” : 3, “buildId” : “C03877A9EE01DDC572E2B0F55F64C757773CF8D6” }, { “b” : “7FA6D4B68000”, “path” : “/lib64/libm.so.6”, “elfType” : 3, “buildId” : “989FE3A42CA8CEBDCC185A743896F23A0CF537ED” }, { “b” : “7FA6D4950000”, “path” : “/lib64/libgcc_s.so.1”, “elfType” : 3, “buildId” : “9350579A4970FA47F3144AD8F40B183B0954497D” }, { “b” : “7FA6D45B8000”, “path” : “/lib64/libc.so.6”, “elfType” : 3, “buildId” : “8E6FA4C4B0594C355C1B90C1D49990368C81A040” }, { “b” : “7FA6D63D0000”, “path” : “/lib64/ld-linux-x86-64.so.2”, “elfType” : 3, “buildId” : “959C5E10A47EE8A633E7681B64B4B9F74E242ED5” }, { “b” : “7FA6D4398000”, “path” : “/lib64/libresolv.so.2”, “elfType” : 3, “buildId” : “C39D7FFB49DFB1B55AD09D1D711AD802123F6623” }, { “b” : “7FA6D4160000”, “path” : “/lib64/libcrypt.so.1”, “elfType” : 3, “buildId” : “128802B73016BE233837EA9F2DCBC2153ACC2D6A” }, { “b” : “7FA6D3F18000”, “path” : “/lib64/libgssapi_krb5.so.2”, “elfType” : 3, “buildId” : “441FA45097A11508E50D55A3D1FF169BF2BE7C62” }, { “b” : “7FA6D3C30000”, “path” : “/lib64/libkrb5.so.3”, “elfType” : 3, “buildId” : “F62622218875795666E08B92D176A50791183EEC” }, { “b” : “7FA6D3A28000”, “path” : “/lib64/libcom_err.so.2”, “elfType” : 3, “buildId” : “152E2C18A7A2145021A8A879A01A82EE134E3946” }, { “b” : “7FA6D37F8000”, “path” : “/lib64/libk5crypto.so.3”, “elfType” : 3, “buildId” : “B8DEDADC140347276164C729418C7A37B7224135” }, { “b” : “7FA6D35F0000”, “path” : “/lib64/libfreebl3.so”, “elfType” : 3, “buildId” : 
“58BAC04A1DB3964A8F594EFFBE4838AD01214EDC” }, { “b” : “7FA6D33E0000”, “path” : “/lib64/libkrb5support.so.0”, “elfType” : 3, “buildId” : “4BDFC7A19C1F328EB4FCFBCE7A1E27606928610D” }, { “b” : “7FA6D31D8000”, “path” : “/lib64/libkeyutils.so.1”, “elfType” : 3, “buildId” : “AF374BAFB7F5B139A0B431D3F06D82014AFF3251” }, { “b” : “7FA6D2FB8000”, “path” : “/lib64/libselinux.so.1”, “elfType” : 3, “buildId” : “E6798A06BEE17CF102BBA44FD512FF8B805CEAF1” } ] }}
mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x10b7ca2]
mongod(+0xCB7553) [0x10b7553]
mongod(+0xCB78B4) [0x10b78b4]
libpthread.so.0(+0xF7E0) [0x7fa6d5b6f7e0]
mongod(_Z7le_packP3uleP7bn_datajPvjjjPP9leafentryPS3_+0x4EF) [0x15c37ff]
mongod(_Z23toku_le_garbage_collectP9leafentryP7bn_datajPvjP11txn_gc_infoPS0_Pl+0x13E) [0x15c5f8e]
mongod(_Z23toku_ftnode_leaf_run_gcP2ftP6ftnode+0x264) [0x162eeb4]
mongod(_Z26toku_ftnode_flush_callbackP9cachefilei10blocknum_sPvPS2_S2_11pair_attr_sPS4_bbbb+0x2BB) [0x15d2d4b]
mongod(_ZN12checkpointer24checkpoint_pending_pairsEv+0x267) [0x16008a7]
mongod(_ZN12checkpointer14end_checkpointEPFvPvES0_+0x44) [0x1601e64]
mongod(_Z15toku_checkpointP12checkpointerP10tokuloggerPFvPvES3_S5_S3_19checkpoint_caller_t+0x1EE) [0x15b62ae]
mongod(+0x11FE4ED) [0x15fe4ed]
mongod(+0x1221F26) [0x1621f26]
libpthread.so.0(+0x7AA1) [0x7fa6d5b67aa1]
libc.so.6(clone+0x6D) [0x7fa6d46a093d]
----- END BACKTRACE -----
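
For readers following the trace: the frame lines above have a regular shape, module(symbol+offset) [address], so they can be split apart mechanically before demangling. The following is only a minimal sketch in Python; the names FRAME_RE and parse_frame are made up for illustration and are not part of any MongoDB or Percona tooling.

```python
import re

# Matches frame lines such as:
#   mongod(_Z23toku_ftnode_leaf_run_gcP2ftP6ftnode+0x264) [0x162eeb4]
#   mongod(+0xCB7553) [0x10b7553]
# FRAME_RE and parse_frame are illustrative names, not an existing API.
FRAME_RE = re.compile(
    r"^(?P<module>\S+?)\((?P<symbol>[^+)]*)\+0x(?P<offset>[0-9A-Fa-f]+)\)\s+"
    r"\[0x(?P<address>[0-9A-Fa-f]+)\]$"
)


def parse_frame(line):
    """Return (module, symbol, offset, address), or None if the line is not a frame.

    symbol is None when the frame has no symbol name (only an offset).
    """
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    symbol = match.group("symbol") or None
    return (
        match.group("module"),
        symbol,
        int(match.group("offset"), 16),
        int(match.group("address"), 16),
    )


if __name__ == "__main__":
    sample = [
        "mongod(_Z23toku_ftnode_leaf_run_gcP2ftP6ftnode+0x264) [0x162eeb4]",
        "mongod(+0x11FE4ED) [0x15fe4ed]",
        "libc.so.6(clone+0x6D) [0x7fa6d46a093d]",
    ]
    for line in sample:
        print(parse_frame(line))
```

The extracted symbol names can then be passed through c++filt to get readable signatures; for example, _Z23toku_ftnode_leaf_run_gcP2ftP6ftnode demangles to toku_ftnode_leaf_run_gc(ft*, ftnode*).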

Please help.

Hi pareshp,

I have done some initial investigation. First, I would like to rule out potential memory and disk I/O problems. Can you check which volume your data directory (the dbpath) is mounted on? Also, what is the available space (size and percentage) on that volume?
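
For reference, the volume details asked about here can also be gathered with a few lines of Python instead of reading df output by hand. This is a minimal sketch using only the standard library; the helper names find_mount_point and volume_report are made up for illustration, and the percentage may differ slightly from what df prints because df accounts for reserved blocks differently.

```python
import os
import shutil


def find_mount_point(path):
    """Walk up from path until a mount point is reached and return it."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path


def volume_report(path):
    """Return size, usage and free space for the filesystem that holds path."""
    usage = shutil.disk_usage(path)
    percent_used = 100.0 * usage.used / usage.total if usage.total else 0.0
    return {
        "mount_point": find_mount_point(path),
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "free_bytes": usage.free,
        "percent_used": percent_used,
    }


if __name__ == "__main__":
    # Replace "." with the dbpath from the mongod configuration file.
    print(volume_report("."))
```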

Thanks,
–Dave

Hello DBennett,

I checked dmesg and /var/log/messages; there are no errors related to disk or memory.

Here is the output of df -h on the server:

Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/vg_vaultize-lv_root  447G   29G  396G   7% /
tmpfs                             12G     0   12G   0% /dev/shm
/dev/sdb1                        477M   78M  374M  18% /boot
/dev/sda1                        917G  531G  340G  61% /data/chunks

My db folder resides on the root partition, which is 93% free.

Regards,
Paresh

Hi DBennett,

I have run memtest86+ on the server and found no errors. I have also tested the disks using badblocks and fsck. No errors were reported.

Please help, since mongod is crashing on a daily basis.

Thanks and Regards,
Paresh

Hi pareshp,

I will bring this up in our stand-up meeting today.

–Dave

Hi pareshp,

We believe the issue is due to a corrupt fractal tree. If you can identify which collection is causing the issue, a dump and reload of that collection should fix the problem. If you need further assistance, Percona Support can help: https://www.percona.com/about-percona/contact
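
A dump and reload of a single collection can be sketched with the standard mongodump and mongorestore tools, as below. This is only an illustration under stated assumptions: the database name, collection name and output directory in the usage example are placeholders, the helper functions are made up for this post, and the host and port defaults simply mirror the configuration posted earlier in the thread. Take a backup and stop application writes before trying anything like this on a production system.

```python
import subprocess


def dump_command(db, collection, out_dir, host="localhost", port=27017):
    """Build the mongodump command line for one collection."""
    return [
        "mongodump", "--host", host, "--port", str(port),
        "--db", db, "--collection", collection, "--out", out_dir,
    ]


def restore_command(db, collection, out_dir, host="localhost", port=27017):
    """Build the mongorestore command line that reloads the dumped collection.

    --drop removes the existing collection before restoring, so the
    collection is rebuilt from the dump rather than merged into.
    """
    bson_path = "{}/{}/{}.bson".format(out_dir, db, collection)
    return [
        "mongorestore", "--host", host, "--port", str(port),
        "--db", db, "--collection", collection, "--drop", bson_path,
    ]


def dump_and_reload(db, collection, out_dir, run=subprocess.check_call):
    """Dump the collection, then restore it; stops if the dump fails."""
    run(dump_command(db, collection, out_dir))
    run(restore_command(db, collection, out_dir))


if __name__ == "__main__":
    # Placeholder names: substitute the affected database and collection.
    # Printing instead of running lets the commands be reviewed first.
    dump_and_reload("mydb", "mycollection", "/tmp/psmdb-dump", run=print)
```

Passing run=print, as in the usage example, only shows the two command lines; leaving run at its default executes them in order and raises an error if mongodump exits with a failure, so the restore with --drop is never attempted on an incomplete dump.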

–Dave

Thanks Dave. Will check.

Regards,
Paresh