MongoDB server freezes - large number of collections

We have a large MongoDB database (about 1.4 million collections) running MongoDB 3.0 with the RocksDB storage engine on Ubuntu 14.04.

The database is hosted on a virtual machine (VMware vCloud) with 16 cores and 108 GB of RAM (MongoDB currently uses 70 GB of memory, with no swap in use).

Production setup options:

  • data on dedicated partition - XFS filesystem
  • transparent_hugepage enabled - never
  • transparent_hugepage defrag - never

DB stats:

{
"db" : "ctp",
"collections" : 1369486,
"objects" : 20566852,
"avgObjSize" : 1126.82749999854,
"dataSize" : 23175294422,
"storageSize" : 23231888384,
"numExtents" : 0,
"indexes" : 6686175,
"indexSize" : 685981393,
"ok" : 1
}
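
To put these numbers in perspective, here is a minimal sketch in plain JavaScript that derives per-collection averages from the stats above. The helper name `perCollectionAverages` is ours and purely illustrative; the input values are copied from the output above.

```javascript
// Values copied from the db.stats() output above.
var stats = {
  collections: 1369486,
  objects: 20566852,
  dataSize: 23175294422,
  indexes: 6686175,
  indexSize: 685981393
};

// Hypothetical helper (not part of MongoDB): derive simple averages.
function perCollectionAverages(s) {
  return {
    docsPerCollection: s.objects / s.collections,
    indexesPerCollection: s.indexes / s.collections,
    dataBytesPerCollection: s.dataSize / s.collections,
    indexBytesPerIndex: s.indexSize / s.indexes,
    // Every collection and every index is a separate namespace.
    namespaces: s.collections + s.indexes
  };
}

var avg = perCollectionAverages(stats);
```

So on average each collection holds roughly 15 documents and just under 5 indexes, and the server tracks about 8 million collections and indexes in total. In the mongo shell the result can be shown with `printjson(avg)`.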

Sample collection sizes:


{
"ns" : "ctp._cf123_ct49_dfc-r_dtc-r_tof2_groupat",
"count" : 33,
"size" : 38172,
"avgObjSize" : 1156,
"storageSize" : 38144,
"capped" : false,
"nindexes" : 5,
"totalIndexSize" : 6312,
"indexSizes" : {
"_id_" : 18,
"exAt" : 16,
"unique" : 6246,
"_smp" : 10,
"_smpdf" : 22
},
"ok" : 1
}

{
"ns" : "ctp._afpoznan123_atlondyn49_df2016-09_dt2016-09_tof2_groupdfdt",
"count" : 188,
"size" : 208677,
"avgObjSize" : 1109,
"storageSize" : 208640,
"capped" : false,
"nindexes" : 5,
"totalIndexSize" : 7945,
"indexSizes" : {
"_id_" : 2845,
"exAt" : 256,
"_smp" : 160,
"_smpdf" : 352,
"unique" : 4332
},
"ok" : 1
}

{
"ns" : "ctp._cf123_ct42_dfc-r_dtc-r_tof2_groupat",
"count" : 27,
"size" : 30400,
"avgObjSize" : 1125,
"storageSize" : 30208,
"capped" : false,
"nindexes" : 5,
"totalIndexSize" : 84,
"indexSizes" : {
"_id_" : 18,
"exAt" : 16,
"unique" : 18,
"_smp" : 10,
"_smpdf" : 22
},
"ok" : 1
}

Every 5 minutes a script runs that writes to these collections, creating a collection together with its indexes if it does not exist yet (collection names are derived from the data stored in them).

We've noticed that the server sometimes freezes while data is being written to the collections. A freeze can last anywhere from 5 to 60 seconds.
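
For illustration, one write cycle looks roughly like the sketch below. This is a simplified outline, not our actual script: the function name `writeBatch`, the `indexSpecs` parameter, and the index keys in the usage comment are assumptions made for the example, and it relies on MongoDB creating a missing collection implicitly rather than checking for it first.

```javascript
// Hypothetical sketch of one write cycle. `db` is a mongo shell database
// handle; `indexSpecs` is a list of { keys, options } pairs (assumed shape).
function writeBatch(db, collectionName, indexSpecs, docs) {
  var coll = db.getCollection(collectionName);

  // createIndex implicitly creates the collection if it does not exist yet
  // and does nothing if an identical index is already present.
  indexSpecs.forEach(function (spec) {
    coll.createIndex(spec.keys, spec.options);
  });

  // Insert the batch of documents for this cycle.
  return coll.insert(docs);
}

// Example call (index keys are made up; only the index name appears above):
// writeBatch(db, "_cf123_ct49_dfc-r_dtc-r_tof2_groupat",
//            [{ keys: { exAt: 1 }, options: { name: "exAt" } }],
//            [{ exAt: new Date() }]);
```

With about 5 indexes per collection, every newly seen collection name therefore triggers several index builds in addition to the inserts.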

Has anyone experienced this issue and could help us?


db.serverStatus()["rocksdb"];

{
"stats" : [ 
"", 
"** Compaction Stats [default] **", 
"Level Files Size(MB) Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) Stall(cnt) KeyIn KeyDrop", 
"---------------------------------------------------------------------------------------------------------------------------------------------------------------------", 
" L0 0/0 0.00 0.0 0.0 0.0 0.0 1.4 1.4 0.0 0.0 0.0 120.1 12 39 0.312 0 0 0", 
" L4 0/0 0.00 0.0 1.8 1.8 0.0 1.7 1.7 0.0 1.0 102.0 99.7 18 11 1.606 7 21M 153K", 
" L5 15/0 620.47 1.0 6.6 1.4 5.2 5.5 0.3 0.0 3.9 44.4 37.0 152 25 6.086 0 110M 840K", 
" L6 106/0 6401.43 0.0 3.5 0.3 3.3 3.3 -0.0 0.0 12.6 25.9 23.7 140 7 20.057 0 162M 14M", 
" Sum 121/0 7021.90 0.0 11.9 3.4 8.5 11.9 3.4 0.0 8.3 37.8 37.8 322 82 3.932 7 295M 15M", 
" Int 0/0 0.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 0.000 0 0 0", 
"Flush(GB): cumulative 1.429, interval 0.000", 
"Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 pending_compaction_bytes, 0 memtable_compaction, 7 leveln_slowdown_soft, 0 leveln_slowdown_hard", 
"", 
"** DB Stats **", 
"Uptime(secs): 34952.0 total, 0.2 interval", 
"Cumulative writes: 4990K writes, 17M keys, 4989K batches, 1.0 writes per batch, ingest: 2.02 GB, 0.06 MB/s", 
"Cumulative WAL: 4990K writes, 0 syncs, 4990122.00 writes per sync, written: 2.02 GB, 0.06 MB/s", 
"Cumulative compaction: 11.90 GB write, 0.35 MB/s write, 11.90 GB read, 0.35 MB/s read, 322.4 seconds", 
"Cumulative stall: 00:00:3.548 H:M:S, 0.0 percent", 
"Interval writes: 0 writes, 0 keys, 0 batches, 0.0 writes per batch, ingest: 0.00 MB, 0.00 MB/s", 
"Interval WAL: 0 writes, 0 syncs, 0.00 writes per sync, written: 0.00 MB, 0.00 MB/s", 
"Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds", 
"Interval stall: 00:00:0.000 H:M:S, 0.0 percent"
],
"num-immutable-mem-table" : "0",
"mem-table-flush-pending" : "0",
"compaction-pending" : "0",
"background-errors" : "0",
"cur-size-active-mem-table" : "33MB",
"cur-size-all-mem-tables" : "33MB",
"num-entries-active-mem-table" : "185495",
"num-entries-imm-mem-tables" : "0",
"estimate-table-readers-mem" : "91MB",
"num-snapshots" : "1",
"oldest-snapshot-time" : "1465911051",
"num-live-versions" : "1",
"total-live-recovery-units" : 60,
"block-cache-usage" : "34GB",
"transaction-engine-keys" : NumberLong(4210),
"transaction-engine-snapshots" : NumberLong(1),
"thread-status" : []
}

db.serverStatus()['globalLock'];

{
"totalTime" : NumberLong(34952090000),
"currentQueue" : {
"total" : 57,
"readers" : 56,
"writers" : 1
},
"activeClients" : {
"total" : 124,
"readers" : 57,
"writers" : 1
}
}

And a screenshot from mongostat: (image not included)

Best Regards