
TokuMX: M/R Post Processing Performance Issues

niraj (Entrant, Participant)
When we checked the mongo logs, we noticed that the actual map and reduce functions run at the same speed both times; the difference is due to "M/R Reduce Post Processing".

In the first case, it finishes in a jiffy and we see the following lines in the log:
Tue Jul 12 03:36:57.033 [conn1546990] M/R Reduce Post Processing Progress: 9800
Tue Jul 12 03:37:00.158 [conn1546990] M/R Reduce Post Processing Progress: 51300
So post-processing was done in 4-5 seconds.

In the second case, it is much slower:
Wed Jul 13 03:00:31.802 [conn1565018] M/R Reduce Post Processing Progress: 200
Wed Jul 13 03:00:36.065 [conn1565018] M/R Reduce Post Processing Progress: 400
Wed Jul 13 03:00:40.378 [conn1565018] M/R Reduce Post Processing Progress: 600
Wed Jul 13 03:00:44.607 [conn1565018] M/R Reduce Post Processing Progress: 800
.
.
.
Wed Jul 13 03:11:51.050 [conn1565018] M/R Reduce Post Processing Progress: 53800

So it took more than 10 minutes for ~50k records.
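
To put a number on the gap, the throughput can be computed directly from the first and last progress lines of each run. This is only a rough sketch: the helper names `parse_progress` and `docs_per_second` are made up for illustration, and the `year` default is an assumption, since the log lines do not carry a year.

```python
import re
from datetime import datetime

# Matches lines such as:
# Tue Jul 12 03:36:57.033 [conn1546990] M/R Reduce Post Processing Progress: 9800
LINE_RE = re.compile(
    r"^\w{3} (\w{3} +\d+ \d{2}:\d{2}:\d{2}\.\d{3}) \[\w+\] "
    r"M/R Reduce Post Processing Progress: (\d+)$"
)


def parse_progress(line, year=2016):
    """Return (timestamp, progress_count) for a progress line, else None.

    The year is an assumption; it only matters if a run crosses New Year.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S.%f")
    return ts, int(m.group(2))


def docs_per_second(first_line, last_line):
    """Average post-processing rate between two progress lines."""
    first, last = parse_progress(first_line), parse_progress(last_line)
    if first is None or last is None:
        raise ValueError("not an M/R Reduce Post Processing progress line")
    elapsed = (last[0] - first[0]).total_seconds()
    return (last[1] - first[1]) / elapsed


fast = docs_per_second(
    "Tue Jul 12 03:36:57.033 [conn1546990] M/R Reduce Post Processing Progress: 9800",
    "Tue Jul 12 03:37:00.158 [conn1546990] M/R Reduce Post Processing Progress: 51300",
)
slow = docs_per_second(
    "Wed Jul 13 03:00:31.802 [conn1565018] M/R Reduce Post Processing Progress: 200",
    "Wed Jul 13 03:11:51.050 [conn1565018] M/R Reduce Post Processing Progress: 53800",
)
print(f"fast run: {fast:.0f} docs/s, slow run: {slow:.0f} docs/s")
```

With the lines quoted above, this works out to about 13,280 documents per second for the fast run versus about 79 for the slow one, using only the progress lines that appear in the log excerpts.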

On a delete, TokuMX inserts a delete message into a buffer in the fractal tree, but the actual entry containing the data can still be present in the leaf node. So our guess is that the actual data entries are being deleted one by one during Reduce Post Processing, since new entries with the same IDs need to be added.
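
The deferred-delete behavior described above can be pictured with a toy model. This is not TokuMX or PerconaFT code: `ToyBufferedTree`, its single buffer, and its `flush_threshold` are all invented for illustration, and the model does not try to reproduce the slowdown itself. It only shows how a key can be logically deleted while its entry is still physically present in the leaf.

```python
class ToyBufferedTree:
    """Toy model: one message buffer in front of one leaf (hypothetical)."""

    def __init__(self, flush_threshold=4):
        self.leaf = {}      # key -> value, stands in for a leaf node
        self.buffer = []    # pending (op, key, value) messages
        self.flush_threshold = flush_threshold

    def insert(self, key, value):
        self._enqueue(("insert", key, value))

    def delete(self, key):
        # A delete is only a message; the leaf is not touched yet.
        self._enqueue(("delete", key, None))

    def _enqueue(self, message):
        self.buffer.append(message)
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Apply buffered messages to the leaf in arrival order.
        for op, key, value in self.buffer:
            if op == "insert":
                self.leaf[key] = value
            else:
                self.leaf.pop(key, None)
        self.buffer.clear()

    def get(self, key):
        # The newest buffered message for a key wins over the leaf contents.
        for op, k, value in reversed(self.buffer):
            if k == key:
                return value if op == "insert" else None
        return self.leaf.get(key)


tree = ToyBufferedTree(flush_threshold=4)
tree.insert("doc1", {"count": 1})
tree.flush()                      # doc1 now lives in the leaf
tree.delete("doc1")               # only a delete message is buffered
logically_gone = tree.get("doc1") is None
still_in_leaf = "doc1" in tree.leaf
tree.flush()                      # the message finally reaches the leaf
removed_after_flush = "doc1" not in tree.leaf
print(logically_gone, still_in_leaf, removed_after_flush)
```

In this model a read has to consult the pending messages before trusting the leaf, and the physical removal happens only when the buffer is flushed.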

Comments

  • DBennett (Percona Director of DS)
    Hi niraj,

    Yes, we have observed the mapReduce performance issue in both TokuMX and PSMDB using the PerconaFT engine. Your observations regarding the delete messages are also correct. Node deletes are the Achilles' heel of the fractal tree. Sadly, since PerconaFT has been deprecated for PSMDB and the Aggregation Pipeline is now preferred over mapReduce in most scenarios, addressing this issue is not on our road map at present.

    If you require mapReduce functionality, I recommend migrating to Percona Server for MongoDB with the rocksdb or wiredTiger storage engines.

    --Dave
  • niraj (Entrant, Participant)
    Hi Dave,

    Appreciate your response & I understand this.

    Aggregation would not suffice for our needs & we cannot avoid using map-reduce. We would consider migrating to PSMDB with rocksdb or wiredTiger.

    - Niraj
