readPreferenceTags behaviour

From observing the behaviour, it remains unclear how members are judged eligible. According to the docs, latency is taken into account, but it’s not clear whether that is the latency between the driver and the mongos router or the latency between the primary and a secondary.

My issue is that when I query from a region that has one local secondary, while the primary and the other secondary are in a remote region, queries seem to land randomly in both regions. If I remove the last ‘fallback’ readPreferenceTags, which allows other regions too, I get results with the expected latency.

The comprehensive answer to how drivers should behave on nearly all points can be found in the “specifications” repo on MongoDB’s GitHub page.

For read preferences specifically, the page to see is https://github.com/mongodb/specifications/blob/master/source/driver-read-preferences.rst. This is a specification document though, so I’ll warn you it’s not easy reading.

> but it’s not clear if it’s latency between driver and mongo router or latency between primary and secondary

It’s not between primary and secondary. If the connection is to a non-sharded replicaset then it is the latency from the (driver-using) application client to the replica set nodes. If it is a cluster then I would think it’s the latency between the mongos node and the shard replicaset nodes.

As for the query being randomly distributed among valid nodes, that is expected: search the spec document linked above for the word “random”.

Yes, I’ve read that too, but it doesn’t exactly cover my question.

I’ll try to create an example of the setup

secondary mongo, region a - tags something:x, region:a

primary mongo, region b - tags something:x, region:b

secondary mongo, region b - tags something:x, region:b

there’s mongos router between driver and mongo replicaset, located in region:a

query1: readpreference nearest, readPreferenceTags=something:x,region:a; readPreferenceTags=something:x

query2: readpreference nearest, readPreferenceTags=something:x,region:a

While query2 works as expected, query1 seems to choose randomly using the last readPreferenceTags, without respecting the first readPreferenceTags.

I’m fine with random selection when only the second readPreferenceTags is considered, but I cannot understand why the first readPreferenceTags “filter” is dropped or not used.

I don’t get what you mean when you say query 1 is “readpreference nearest, readPreferenceTags=something:x,region:a; readPreferenceTags=something:x”

Does the “readPreferenceTags” parameter appear twice in the connection string URI? Maybe the latter occurrence overrides the first one. This happens with the way BSON objects are parsed when received, e.g. { “a”: 1, “b”: 2, “b”: 3 } is effectively { “a”: 1, “b”: 3 }. I’m not sure that it’s about BSON though; it could also be URI parsing before that, resulting in the same thing.
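For what it’s worth, both outcomes are easy to reproduce in plain JavaScript. The sketch below is only an illustration using Node’s built-in JSON and URLSearchParams. It is not the parser that any MongoDB driver or mongos actually uses, and parseTagSet is a hypothetical helper written for this example. It shows a duplicate key collapsing to its last value, and a generic query-string parser keeping both readPreferenceTags occurrences in the order written.

```javascript
// Duplicate keys in a JSON document: the last value wins.
const collapsed = JSON.parse('{ "a": 1, "b": 2, "b": 3 }');
console.log(collapsed); // { a: 1, b: 3 }

// Option part of a connection string with readPreferenceTags given twice.
const options = new URLSearchParams(
  "readPreference=nearest" +
  "&readPreferenceTags=something:x,region:a" +
  "&readPreferenceTags=something:x"
);

// Hypothetical helper: turn "k1:v1,k2:v2" into { k1: "v1", k2: "v2" }.
function parseTagSet(text) {
  const tagSet = {};
  for (const pair of text.split(",")) {
    const [key, value] = pair.split(":");
    tagSet[key] = value;
  }
  return tagSet;
}

// getAll() keeps every occurrence of a repeated key, in the order written.
const tagSets = options.getAll("readPreferenceTags").map(parseTagSet);
console.log(tagSets);
// [ { something: 'x', region: 'a' }, { something: 'x' } ]
```

So whether the second occurrence overrides the first depends entirely on which kind of parsing is applied; the ordered list is the form that matches the array passed to setReadPref in the shell.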

Either way the idea above is premature as I don’t see your full code example.

Yes, correct, multiple readPreferenceTags.

See the documentation at: https://docs.mongodb.com/manual/reference/connection-string/index.html#read-preference-options

> Order matters when using multiple readPreferenceTags. The readPreferenceTags are tried in order until a match is found. Once found, that specification is used to find all eligible matching members and any remaining readPreferenceTags are ignored.

and at: https://docs.mongodb.com/manual/core/read-preference-tags/#tag-set-and-read-preference-modes

> However, the nearest read mode, when combined with a tag set, selects the matching member with the lowest network latency. This member may be a primary or secondary.


I should have pasted this in the question. What I’m trying to say is that the list of nodes returned for the first set of tags is evidently “not eligible” for some reason, but I have not been able to figure out why those nodes are not eligible.

Please share the exact code example. You’re still explaining it in the abstract, but abstractly there’s no reason for there to be a problem. So let’s look at it in concrete detail.

Yes sure, this can be replicated directly from the mongo router’s console:

  1. db.getMongo().setReadPref('nearest', [ { "something": "x", "region": "a" } ])

  2. db.getMongo().setReadPref('nearest', [ { "something": "x", "region": "a" }, { "something": "x" } ])

followed in both cases by:

    db.getCollection('mytest').find({ _id: ObjectId("my-object-id") });

While using setting 1, I get the expected results. While using setting 2, I can see that the query is executed on a random node matching the second tag set (the region is different).

OK, thanks, now I can rule out several possibilities I was thinking of, and not worry about ones I hadn’t even imagined either.

The mongos node will be using the DBClient* classes in the C++ code of the core server. Think of it as being more or less that it is using the C++ driver to connect to the shards and configsvr replica set.

I soon found DBClientReplicaSet::selectNodeUsingTags() (mongo/dbclient_rs.cpp at 6fe78a092be6b3a87ec9a91693c7dc77bd45fe5e · mongodb/mongo · GitHub). The logic is delegated out to the ReplicaSetMonitor class’s getHostOrRefresh() function, and this ultimately comes to SetState::getMatchingHosts(). In the case of readPreference: “nearest” with tags, the most important code block appears to be this: mongo/replica_set_monitor.cpp at v4.2 · mongodb/mongo · GitHub

I was hoping to find the answer here but the code disagrees. If { “something”: “x”, “region”: “a” } matches exactly one node as your starting scenario showed I expect that to be returned, no falling back to { “something”: “x” }.

Please take a look at the code and tell me what you think!

FYI criteria.tags.getTagBSON() will return an array, not a single BSON object like I was wondering about earlier. So BSONForEach(tagElem, criteria.tags.getTagBSON()) { … many lines … } would iterate first round with { “something”: “x”, “region”: “a” } as the “tag” value and then, if it finds nothing, do a second round with { "something": "x" } as “tag”. I am confident this order of tags will be kept as the user specified it.

If there are any matches in a pass of the loop the function will return, either at line 1150 or 1202. I.e. find anything with { “something”: “x”, “region”: “a” } and the loop iteration with { “something”: “x” } will never happen.
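To make that reading concrete, here is a minimal sketch of the loop as I understand it, in plain JavaScript rather than the server’s C++. It is a simplified model, not the actual code: the host names are hypothetical labels for the three members in the example setup, and matchesTagSet and selectByTagSets are names invented for this illustration. It leaves out the latency and minOpTime handling entirely.

```javascript
// The three members from the example setup, with their tags.
// Host names are hypothetical labels, not real hosts.
const members = [
  { host: "secondary-a", tags: { something: "x", region: "a" } },
  { host: "primary-b", tags: { something: "x", region: "b" } },
  { host: "secondary-b", tags: { something: "x", region: "b" } },
];

// A member matches a tag set when it carries every key/value pair in it.
function matchesTagSet(member, tagSet) {
  return Object.entries(tagSet).every(([key, value]) => member.tags[key] === value);
}

// Try the tag sets in order and return the matches of the first one
// that matches anything.
function selectByTagSets(candidates, tagSets) {
  for (const tagSet of tagSets) {
    const matched = candidates.filter((m) => matchesTagSet(m, tagSet));
    if (matched.length > 0) {
      return matched; // early return: later tag sets are never evaluated
    }
  }
  return [];
}

const query1 = selectByTagSets(members, [{ something: "x", region: "a" }, { something: "x" }]);
const query2 = selectByTagSets(members, [{ something: "x", region: "a" }]);

console.log(query1.map((m) => m.host)); // [ 'secondary-a' ]
console.log(query2.map((m) => m.host)); // [ 'secondary-a' ]
```

Under this model both settings select only the region a secondary, which is why the observed fallback to the second tag set is surprising.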

If the readPreference is “primary”, “secondary” or “primaryPreferred” SetState::getMatchingHosts() can be used recursively and that would be another thing to think about, but in the case of “nearest” the host selected in this L1133-L2103 block is the selected node.

I assume the minOpTime value will be set only when maxStalenessSeconds is used, which you haven’t, and besides the nodes would have to be at least 90 secs behind which is high.

The latency threshold will matter (15ms + best latency of any node) but I believe you would have read about that already and be aware of that.
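As a rough sketch of that threshold rule: keep every candidate whose round-trip time is within 15ms of the fastest one, then pick one of those at random. The round-trip times and host names below are made-up values for illustration, and latencyWindow and pickRandom are hypothetical names, not functions from the server or any driver.

```javascript
// Keep members whose round-trip time is within thresholdMs of the fastest one.
function latencyWindow(candidates, thresholdMs = 15) {
  if (candidates.length === 0) return [];
  const best = Math.min(...candidates.map((m) => m.rttMs));
  return candidates.filter((m) => m.rttMs <= best + thresholdMs);
}

// Pick one eligible member uniformly at random.
// The random source is injectable so the choice can be made deterministic.
function pickRandom(eligible, random = Math.random) {
  if (eligible.length === 0) return null;
  return eligible[Math.floor(random() * eligible.length)];
}

// Hypothetical round-trip times as measured from the connecting side.
const measured = [
  { host: "secondary-a", rttMs: 2 },
  { host: "primary-b", rttMs: 40 },
  { host: "secondary-b", rttMs: 12 },
];

const eligible = latencyWindow(measured);
console.log(eligible.map((m) => m.host)); // [ 'secondary-a', 'secondary-b' ]
console.log(pickRandom(eligible).host);   // one of the two, chosen at random
```

With these numbers the window is 2ms + 15ms = 17ms, so a node at 12ms stays eligible while a node at 40ms does not. This is how a remote node can legitimately receive reads under “nearest” when its latency is close enough to the best one.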

The crucial block doesn’t have log lines we can get to, but these files will print a lot more if you run db.setLogLevel(3, "network") on the mongos node. This turns on many log messages from network-related classes all over, so I would then filter the output by removing lines from unrelated threads. E.g. “[ReplicaSetMonitorWatcher]” is related; “[initandlisten]” is not.

The logLevel thing will possibly be different with v4.4, if you’re using that.

I see. We are using mongo 4.4. There appears to be an extra minOpTime check compared to 4.2, and I cannot tell right now whether it has an effect here or not.

Thank you for some valuable input. As we have some other issues to work through right now, I’ll get back to this.

So as not to leave this without an ending: we filed a bug report with MongoDB, and from their answer I gather it is now confirmed that readPreferenceTags do not work as intended in the mongo router 4.4.

[SERVER-51835] Mongos readPreferenceTags are not working as expected - MongoDB

Thank you for the support, Akira!

regards,

urmas

I see! Well, a pat on our backs for the detective work.

So the reading of the v4.2 code was correct, it’s a 4.4.0 (and 4.4.1) bug. Could in theory be fixed as early as 4.4.2, but I figure that’s about to be released, so I doubt it would be earlier than 4.4.3. When they patch it we’ll merge into Percona Server for MongoDB of the same version of course.

Cheers,

Akira