Hi there,
Every day we take a dump from a replica set and restore it on a standalone server, with all replica set configuration removed. On that standalone server a Java application sanitizes (scrambles) personal data such as last names, e-mail addresses, etc. The Java application is multithreaded. Running updateMany over some collections takes many hours; we can have around 300K e-mails to sanitize. One collection is fairly big: 45 GB, 73 million documents, and many indexes. The entire database is about 91 GB, and the Debian 8 box has 14 GB of RAM.
In the mongod config I have raised cacheSizeGB to try to improve performance. What other settings can I change to get better performance?
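To make the workload concrete, here is a minimal sketch of the kind of multithreaded scrambling described above. It is not the actual application: the class name, the hash-based scrambleEmail, the batch size, and the example.invalid domain are all assumptions for illustration, and the MongoDB driver calls are left out so the sketch runs with the JDK alone.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ScrambleSketch {
    // Hypothetical scrambler: maps an e-mail to a deterministic placeholder
    // built from the first 8 bytes of its SHA-256 hash.
    static String scrambleEmail(String email) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(email.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 8; i++) {
                hex.append(String.format("%02x", hash[i] & 0xff));
            }
            return "user-" + hex + "@example.invalid";
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Splits the input into consecutive batches of at most batchSize items.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
        }
        return batches;
    }

    // Scrambles every e-mail, one task per batch, on a fixed-size thread pool.
    // Results come back in the same order as the input.
    static List<String> scrambleAll(List<String> emails, int batchSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<String> batch : partition(emails, batchSize)) {
                futures.add(pool.submit(() -> {
                    List<String> out = new ArrayList<>(batch.size());
                    for (String e : batch) {
                        out.add(scrambleEmail(e));
                    }
                    return out;
                }));
            }
            List<String> result = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                result.addAll(f.get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> emails = List.of("alice@example.com", "bob@example.com", "carol@example.com");
        scrambleAll(emails, 2, 2).forEach(System.out::println);
    }
}
```

In the real job each batch would end in a write to MongoDB rather than a returned list, so the batch size and thread count here only stand in for whatever the application actually uses.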
wiredTiger:
  engineConfig:
    cacheSizeGB: 10
    checkpointSizeMB: 1000
    statisticsLogDelaySecs: 0
    journalCompressor: snappy
    directoryForIndexes: false
  collectionConfig:
    blockCompressor: snappy
  indexConfig:
    prefixCompression: true
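As a rough sanity check on the cacheSizeGB value above, the arithmetic can be sketched as follows. The default-cache formula is the documented WiredTiger default since MongoDB 3.4 (the larger of 50% of (RAM - 1 GB) or 256 MB). The class and method names are made up for illustration, and treating the 45 GB and 91 GB figures as the data the cache has to cover is an assumption, since on-disk sizes are compressed while cached pages are not.

```java
class CacheSizing {
    // WiredTiger default internal cache since MongoDB 3.4:
    // the larger of 50% of (RAM - 1 GB) or 256 MB.
    static double defaultCacheGb(double ramGb) {
        return Math.max(0.5 * (ramGb - 1.0), 0.25);
    }

    // RAM left for the OS, the filesystem cache, and anything else
    // running on the box (such as the Java application).
    static double headroomGb(double ramGb, double cacheGb) {
        return ramGb - cacheGb;
    }

    // Share of a data set that fits in the cache, as a percentage.
    static double coveragePercent(double cacheGb, double dataGb) {
        return 100.0 * cacheGb / dataGb;
    }

    public static void main(String[] args) {
        double ramGb = 14.0;        // from the post
        double cacheGb = 10.0;      // cacheSizeGB from the config
        double collectionGb = 45.0; // big collection, from the post
        double databaseGb = 91.0;   // whole database, from the post

        System.out.printf("default cache: %.1f GB%n", defaultCacheGb(ramGb));
        System.out.printf("headroom with cacheSizeGB=10: %.1f GB%n", headroomGb(ramGb, cacheGb));
        System.out.printf("cache vs big collection: %.0f%%%n", coveragePercent(cacheGb, collectionGb));
        System.out.printf("cache vs database: %.0f%%%n", coveragePercent(cacheGb, databaseGb));
    }
}
```

With these numbers the default would have been 6.5 GB, and a 10 GB cache leaves 4 GB for everything else on the box. Whether the Java application runs on the same box is not stated above, so that part of the headroom is a guess.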
The next thing I plan to try is disabling the journal, to see whether that improves performance.
Server: Percona Mongo v3.6.17-4.0 standalone
Thanks in advance.