Hi all.
We run a 3-member replica set of TokuMX 2.0.2 servers, each with 60 GB of RAM (r3.2xlarge instances).
Ever since upgrading to version 2.0.2, we have been hitting a random issue where the mongod process dies.
The last messages in tokumx.log are normal inserts, updates, etc.
I see this in dmesg:
init: tokumx main process (4312) killed by SEGV signal
Please advise.
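To track how often this happens on each node, the dmesg or syslog output could be scanned for lines like the one above. Below is a minimal sketch in Python, assuming the messages always follow the format shown in the dmesg line; the function name `find_killed` and the sample input are illustrative only and not part of any TokuMX tooling.

```python
import re

# Matches init messages of the form seen in dmesg, e.g.:
#   init: tokumx main process (4312) killed by SEGV signal
# Assumption: other crashes are reported in this same format.
KILLED_RE = re.compile(
    r"init: (?P<job>\S+) main process \((?P<pid>\d+)\) "
    r"killed by (?P<signal>\w+) signal"
)


def find_killed(lines):
    """Return a (job, pid, signal) tuple for every matching line."""
    hits = []
    for line in lines:
        match = KILLED_RE.search(line)
        if match:
            hits.append(
                (match.group("job"), int(match.group("pid")), match.group("signal"))
            )
    return hits


if __name__ == "__main__":
    # Sample input copied from the dmesg output quoted above.
    sample = ["init: tokumx main process (4312) killed by SEGV signal"]
    for job, pid, signal in find_killed(sample):
        print(job, pid, signal)
```

Feeding it the saved output of `dmesg` from each replica set member would give a per-node list of crashes to compare against the timestamps in tokumx.log.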
Hello. Sure, here’s the information:
TokuMX log from just before, and a few lines after, I restarted the server:
[url]http://pastebin.com/5RnFK9fE[/url]
When the issue happens, the rest of the cluster is fine: it reconfigures itself and keeps on humming (although for some reason it takes the frontend .NET driver around 20 minutes to recover with the new configuration, which is another issue).
Nothing out of the ordinary was being performed when it happened. And, as I mentioned, it happens around once a week, randomly, on different nodes of the replica set.
And lastly:
imrs:SECONDARY> db.adminCommand('getCmdLineOpts')
{
    "argv" : [
        "/usr/bin/mongod",
        "--config",
        "/etc/tokumx.conf"
    ],
    "parsed" : {
        "auth" : "true",
        "config" : "/etc/tokumx.conf",
        "dbpath" : "/toku/",
        "expireOplogDays" : 3,
        "keyFile" : "/toku/key",
        "logappend" : "true",
        "logpath" : "/var/log/tokumx/tokumx.log",
        "pluginsDir" : "/usr/lib/tokumx/plugins",
        "replSet" : "imrs"
    },
    "ok" : 1
}
Nov 15 02:08:30 imdb2.integral-marketing.com tokumx.log: Sun Nov 15 02:08:30.834 [conn1054034] command imads_db.$cmd command: { aggregate: "ad_stats_201511", pipeline: [ { $match: { offer_id: "559c052be21adc1b38e56aa7", date: 20151114 } }, { $group: { _id: { offer_id: "$offer_id", date: "$date" }, total: { $sum: "$hours.25.im_revenue" } } } ] } ntoreturn:1 keyUpdates:0 locks(micros) r:101915 reslen:132 101ms