TokuMX: M/R Post Processing Performance Issues

niraj · July 13, 2016, 6:15am

When we checked the mongo logs, we noticed that the actual map & reduce functions run at the same speed both times; the difference comes from the “M/R Reduce Post Processing” phase.

In the first case, it finishes in a jiffy & we see the following lines in the log…

So post processing is done in 4-5 seconds.

So that is more than 10 minutes for ~50k records.

In the case of deletes, TokuMX inserts a delete message into a buffer in the fractal tree, but the actual entry containing the data may still be present in the leaf node. So our guess is that the actual data entries are being deleted one by one during Reduce Post Processing, since new entries with the same IDs need to be added.
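
To make the guess above concrete, here is a rough Python sketch of deferred deletes. It is only a toy model with a single buffer over a single leaf, not TokuMX's or PerconaFT's actual data structures; the class name `ToyBufferedTree` and its methods are hypothetical and exist only for illustration.

```python
class ToyBufferedTree:
    """Toy model only: one message buffer over one leaf (hypothetical, not PerconaFT)."""

    def __init__(self):
        self.leaf = {}    # key -> value, the "actual entries" in the leaf node
        self.buffer = []  # pending messages: (op, key, value)

    def insert(self, key, value):
        # Inserts are buffered as messages too.
        self.buffer.append(("insert", key, value))

    def delete(self, key):
        # Cheap: only a message is recorded; the leaf entry stays in place.
        self.buffer.append(("delete", key, None))

    def flush(self):
        """Apply pending messages to the leaf in order; return how many were applied."""
        applied = 0
        for op, key, value in self.buffer:
            if op == "insert":
                self.leaf[key] = value
            else:
                self.leaf.pop(key, None)
            applied += 1
        self.buffer.clear()
        return applied


tree = ToyBufferedTree()
for i in range(3):
    tree.insert(i, "old")
tree.flush()

for i in range(3):
    tree.delete(i)                       # three cheap delete messages
leaf_size_after_delete = len(tree.leaf)  # entries are still physically present

for i in range(3):
    tree.insert(i, "new")                # same IDs are added again
applied = tree.flush()                   # deletes and inserts are both paid for here
```

In this toy model the deletes cost almost nothing when issued, and the work only shows up later when the same keys are written again, which is the pattern we suspect in the post processing step.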

DBennett · July 18, 2016, 11:25am

Hi niraj,

Yes, we have observed the mapReduce performance issue in both TokuMX and PSMDB using the PerconaFT engine. Your observations regarding the delete messages are also correct. Node deletes are the Achilles heel of the fractal tree. Sadly, now that PerconaFT has been deprecated for PSMDB and the Aggregation Pipeline is preferred over mapReduce in most scenarios, addressing this issue is not on our roadmap at present.

If you require mapReduce functionality, I recommend migrating to Percona Server for MongoDB with the rocksdb or wiredTiger storage engines.

–Dave

niraj · July 19, 2016, 8:49am

Hi Dave,

Appreciate your response & I understand this.

Aggregation would not suffice for our needs & we cannot avoid using map-reduce. We will consider migrating to PSMDB with rocksdb or wiredTiger.

Niraj
