For the record: I ended up generating 6 text files of 4 MM lines each, with one statement per line, e.g. deleteOne (Object of primary key _id)…
Each line contains the primary key of a record I want to delete. Out of 95+ MM records, I want to delete around 24 MM.
I loaded the files like this:
mongo mongodb://%2Ftmp%2Fmongodb-27017.sock/myCollectionName < /tmp/deletes1.json &
mongo mongodb://%2Ftmp%2Fmongodb-27017.sock/myCollectionName < /tmp/deletes6.json &
Each command was put in the background, of course. Loading files 1 and 2 at the same time took about 70 minutes; then I loaded the next 2 files together (files 3 and 4), and finally the last 2 files (5 and 6). In total it took a bit over three and a half hours. That was faster than the Java approach, which took around seven and a half hours; of course, Java was deleting one record at a time without using threads.
On another day, I loaded files 1, 2 and 3 at the same time and, when those 3 finished, I loaded files 4, 5 and 6; that took 2.5 hours. On yet another day I loaded all 6 files at the same time (6 processes), and it also took 2.5 hours.
So I think it is better to load files 1, 2 and 3 and, once those have finished, load files 4, 5 and 6, because that doesn't stress the system as much as loading 6 files at the same time, and the outcome is the same.
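The "3 files, wait, then 3 more" approach can be scripted instead of started by hand. This is only a minimal sketch: the function names run_in_batches and load_one are made up for illustration, and it assumes the same socket path and collection name as the commands above.

```shell
#!/bin/sh
# Sketch only: run_in_batches and load_one are hypothetical helper names.

# Feed one file of delete statements to the mongo shell,
# using the same connection string as in the post.
load_one() {
  mongo "mongodb://%2Ftmp%2Fmongodb-27017.sock/myCollectionName" < "$1"
}

# Usage: run_in_batches BATCH_SIZE FILE...
# Starts BATCH_SIZE loads in the background, waits for all of them
# to finish, then starts the next batch.
run_in_batches() {
  batch_size=$1
  shift
  started=0
  for f in "$@"; do
    load_one "$f" &
    started=$((started + 1))
    if [ $((started % batch_size)) -eq 0 ]; then
      wait   # block until every background load in this batch is done
    fi
  done
  wait       # catch a final, partially filled batch
}
```

Assuming the files in between follow the same naming pattern (deletes2.json to deletes5.json), the two-batch run would be: run_in_batches 3 /tmp/deletes1.json /tmp/deletes2.json /tmp/deletes3.json /tmp/deletes4.json /tmp/deletes5.json /tmp/deletes6.json. Passing 6 as the batch size gives the "all at once" variant, and 2 gives the first run.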