Hi there,
my team is currently migrating from MongoDB Atlas (Community Edition) to a self-hosted cluster running a Percona MongoDB Distribution. While in general we are extremely excited to benefit from things like AWS IAM Authentication etc., we just discovered a performance shortcoming in Percona when comparing it to MongoDB for the following query (the content of the document is probably not further relevant, but as you will see it’s about an $or query that queries either for a time range with recurrence set to null, or a document in any timerange with a recurrence set):
{
"$or": [
{
"end": {
"$gte": ISODate('2023-08-29T09:42:30.064Z')
},
"start": {
"$lte": ISODate('2023-08-29T10:42:30.064Z')
},
"recurrence": null,
"resource": {
"$in": [
ObjectId("6074742a035dda4a37174065")
]
}
},
{
"recurrence": {
"$type": "object"
},
"resource": {
"$in": [
ObjectId("6074742a035dda4a37174065")
]
}
}
]
}
The tricky part comes, when one wants to project the result to a certain set of properties only, e.g. like so: { _id: 1, recurrence: 1, start: 1, end: 1, resources: 1 }
: when recurrence: 1
is included, the query becomes extremely slow - taking over a minute on our (admittedly not so small) collection of 13.5 million docs. If recurrence is omitted, it’s as fast as one second again.
Note that we of course have indices applied to that collection, but in all cases the explain query plan looks equal. The indexes being used are on resource,end,start
for the first $or clause and on resource,recurrence
for the second clause.
We are still on version 4.4.22 as of right now. Is this maybe a known issue that has already been improved in a more recent version?
As a workaround, we will split our $or queries where possible and adapt the projections to a minimum.
Looking forward to an answer.
Best regards,
Alex