readPreferenceTags behaviour

urmasu · October 19, 2020, 8:27am

From observing the behaviour, it remains unclear how members are considered eligible. According to the docs, latency is taken into consideration, but it’s not clear whether that is the latency between the driver and the mongo router or the latency between the primary and a secondary.

My issue is that when querying from a region which has one local secondary, while the primary and the other secondary are in a remote region, queries seem to end up randomly in both regions. If I remove the last ‘fallback’ readPreferenceTags, which allows other regions too, then I get results with the expected latency.

Akira_Kurogane · October 20, 2020, 3:38am

The comprehensive answer to how drivers should behave in regards to nearly all points can be found in the “specifications” repo of mongodb’s github page.

For read preferences specifically the page to see is https://github.com/mongodb/specifications/blob/master/source/driver-read-preferences.rst. This is a specifications document though, it’s not easy reading I’ll warn.

> but it’s not clear if it’s latency between driver and mongo router or latency between primary and secondary

It’s not between primary and secondary. If the connection is to a non-sharded replicaset then it is the latency from the (driver-using) application client to the replica set nodes. If it is a cluster then I would think it’s the latency between the mongos node and the shard replicaset nodes.

As for seeing the query end up being randomly distributed around valid nodes that is expected - search in page for the word “random” in the spec document linked above.
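As a rough illustration of why reads can look randomly distributed, here is a minimal sketch of a "nearest"-style pick: keep the members whose round-trip time is within a threshold of the fastest one, then choose among them at random. The function names, host names and round-trip times are hypothetical; the 15 ms default is the usual local threshold, and this is not the actual driver or mongos code.

```javascript
// Sketch only: members within (best RTT + threshold) are all eligible.
function withinLatencyWindow(members, localThresholdMs = 15) {
  if (members.length === 0) return [];
  const best = Math.min(...members.map((m) => m.rttMs));
  return members.filter((m) => m.rttMs <= best + localThresholdMs);
}

// Pick one eligible member at random; `random` is injectable for testing.
function pickNearest(members, localThresholdMs = 15, random = Math.random) {
  const eligible = withinLatencyWindow(members, localThresholdMs);
  if (eligible.length === 0) return null;
  return eligible[Math.floor(random() * eligible.length)];
}

// Hypothetical round-trip times, as measured by whichever process selects.
const members = [
  { host: "a-secondary", rttMs: 2 },
  { host: "b-primary", rttMs: 40 },
  { host: "b-secondary", rttMs: 12 },
];

const eligibleHosts = withinLatencyWindow(members).map((m) => m.host);
console.log(eligibleHosts); // [ 'a-secondary', 'b-secondary' ]
```

With these made-up numbers the window is 2 ms + 15 ms, so two of the three members qualify and either may serve a given read.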

urmasu · October 20, 2020, 3:52am

Yes, I’ve read that too, but it doesn’t exactly cover my question.

I’ll try to create an example of the setup

secondary mongo, region a - tags something:x, region:a

primary mongo, region b - tags something:x, region:b

secondary mongo, region b - tags something:x, region:b

there’s mongos router between driver and mongo replicaset, located in region:a

query1: readpreference nearest, readPreferenceTags=something:x,region:a; readPreferenceTags=something:x

query2: readpreference nearest, readPreferenceTags=something:x,region:a

While query2 works as expected, query1 seems to choose randomly using the last readPreferenceTags, without respecting the first readPreferenceTags.

While I’m fine with random selection among the nodes matching the second readPreferenceTags, I cannot understand why the first readPreferenceTags “filter” is dropped/not used.

Akira_Kurogane · October 20, 2020, 7:25pm

I don’t get what you mean when you say query 1 is “readpreference nearest, readPreferenceTags=something:x,region:a; readPreferenceTags=something:x”

Does the “readPreferenceTags” parameter appear twice in the connection string URI? Maybe the latter occurrence overrides the first one - this will happen with the way BSON objects are parsed when received. Eg. { “a”: 1, “b”: 2, “b”: 3 } is effectively {“a”: 1, “b”: 3 }. I’m not sure that it’s about BSON though, it could also be URI parsing before that, resulting in the same thing.

Either way the idea above is premature as I don’t see your full code example.
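To make the two cases in this reply concrete, here is a small sketch in plain JavaScript. It only shows how a JavaScript object and a generic query-string parser behave; `URLSearchParams` is a stand-in, and nothing here demonstrates how a driver or mongos actually parses the connection string.

```javascript
// Duplicate keys in JSON text (or an object literal): the last value wins.
const parsed = JSON.parse('{ "a": 1, "b": 2, "b": 3 }');
console.log(parsed); // { a: 1, b: 3 }

// Repeated query-string parameters can be kept in full, and in order.
// The option string is the one from query1 in this thread.
const options = new URLSearchParams(
  "readPreference=nearest" +
    "&readPreferenceTags=something:x,region:a" +
    "&readPreferenceTags=something:x"
);
const tagParams = options.getAll("readPreferenceTags");
console.log(tagParams); // [ 'something:x,region:a', 'something:x' ]

// Turn each "k:v,k:v" value into a tag-set object, preserving order.
const tagSets = tagParams.map((param) =>
  Object.fromEntries(param.split(",").map((pair) => pair.split(":")))
);
console.log(tagSets);
```

So a repeated URI option does not have to collapse the way a duplicated object key does; whether it does depends on the parser in use.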

urmasu · October 21, 2020, 1:49am

Yes, correct, multiple readPreferenceTags.

See the documentation at:https://docs.mongodb.com/manual/reference/connection-string/index.html#read-preference-options

Order matters when using multiple readPreferenceTags. The readPreferenceTags are tried in order until a match is found. Once found, that specification is used to find all eligible matching members and any remaining readPreferenceTags are ignored.

— and —https://docs.mongodb.com/manual/core/read-preference-tags/#tag-set-and-read-preference-modes

However, the nearest read mode, when combined with a tag set, selects the matching member with the lowest network latency. This member may be a primary or secondary.

I should have pasted this in the question. What I’m trying to say is that the list of nodes returned for the first set of tags is evidently “not eligible” for some reason, but I have not been able to figure out why those nodes are not eligible.
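The documented ordering rule, applied to the three-member layout described earlier in the thread, can be sketched like this. The member names and helper functions are hypothetical, and the sketch covers only the tag-matching step, not latency or staleness.

```javascript
// A member matches a tag set when it carries every key/value pair in the set.
function matchesTagSet(memberTags, tagSet) {
  return Object.entries(tagSet).every(([key, value]) => memberTags[key] === value);
}

// Try tag sets in order; the first one matching any member decides the
// candidates, and the remaining tag sets are ignored.
function selectByTagSets(members, tagSets) {
  for (const tagSet of tagSets) {
    const matched = members.filter((m) => matchesTagSet(m.tags, tagSet));
    if (matched.length > 0) return matched;
  }
  return [];
}

const replicaSet = [
  { name: "secondary-a", tags: { something: "x", region: "a" } },
  { name: "primary-b", tags: { something: "x", region: "b" } },
  { name: "secondary-b", tags: { something: "x", region: "b" } },
];

const query1 = selectByTagSets(replicaSet, [
  { something: "x", region: "a" },
  { something: "x" },
]);
const query2 = selectByTagSets(replicaSet, [{ something: "x", region: "a" }]);

console.log(query1.map((m) => m.name)); // [ 'secondary-a' ]
console.log(query2.map((m) => m.name)); // [ 'secondary-a' ]
```

Under that rule both queries should be limited to the region a secondary, which is why the observed behaviour of query1 looks wrong.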

Akira_Kurogane · October 21, 2020, 2:01am

Please share the exact code example. You’re still explaining it in the abstract, but abstractly there’s no reason for there to be a problem. So let’s look at it in concrete detail.

urmasu · October 21, 2020, 3:25am

Yes sure, this can be replicated directly from mongo router's console:
db.getMongo().setReadPref('nearest', [ { "something": "x", "region": "a" } ])

db.getMongo().setReadPref('nearest', [ { "something": "x", "region": "a" }, { "something": "x" } ])

db.getCollection('mytest').find({ _id: ObjectId("my-object-id") });

With the first setting I get the expected results. With the second setting I can see that the query is executed on a random node matching the second tag set (as the region is different).

Akira_Kurogane · October 21, 2020, 11:54pm

OK, thanks, now I can rule out several possibilities I was thinking of, and not worry about ones I hadn’t even imagined either.

The mongos node will be using the DBClient* classes in the C++ code of the core server. Think of it as being more or less that it is using the C++ driver to connect to the shards and configsvr replica set.

I soon found DBClientReplicaSet::selectNodeUsingTags() (mongo/dbclient_rs.cpp at 6fe78a092be6b3a87ec9a91693c7dc77bd45fe5e · mongodb/mongo · GitHub). The logic is delegated out to the ReplicaSetMonitor class’s getHostOrRefresh() function and this ultimately comes to SetState::getMatchingHosts(). In the case of readPreference: “nearest” with tags the most important code block appears to be this: mongo/replica_set_monitor.cpp at v4.2 · mongodb/mongo · GitHub

I was hoping to find the answer here but the code disagrees. If { “something”: “x”, “region”: “a” } matches exactly one node as your starting scenario showed I expect that to be returned, no falling back to { “something”: “x” }.

Please take a look at the code and tell me what you think!

FYI criteria.tags.getTagBSON() will return an array, not a single BSON object like I was wondering about earlier. So BSONForEach(tagElem, criteria.tags.getTagBSON()) { … many lines … } would iterate first round with { “something”: “x”, “region”: “a” } as the “tag” value and then, if it finds nothing, do a second round with { "something": "x" } as “tag”. I am confident this order of tags will be kept as the user specified it.

If there are any matches in a pass of the loop the function will return, either at line 1150 or 1202. I.e. find anything with { “something”: “x”, “region”: “a” } and the loop iteration with { “something”: “x” } will never happen.

If the readPreference is “primary”, “secondary” or “primaryPreferred” SetState::getMatchingHosts() can be used recursively and that would be another thing to think about, but in the case of “nearest” the host selected in this L1133-L1203 block is the selected node.

I assume the minOpTime value will be set only when maxStalenessSeconds is used, which you haven’t, and besides the nodes would have to be at least 90 secs behind which is high.

The latency threshold will matter (15ms + best latency of any node) but I believe you would have read about that already and be aware of that.

Akira_Kurogane · October 22, 2020, 12:01am

The crucial block doesn’t have log lines we can get to, but these files will start printing out a lot more if you run db.setLogLevel(3, "network") on the mongos node. This will turn on a lot of log messages from network-related classes all over, so I would then filter by removing lines from unrelated threads. Eg. “[ReplicaSetMonitorWatcher]” is related; “[initandlisten]” is not.
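The filtering step could be done with a few lines of JavaScript. This is a sketch that assumes plain-text log lines carrying the thread name in square brackets; the function name is hypothetical and the sample lines are placeholders, not real mongos output.

```javascript
// Keep only lines that mention one of the given thread names in brackets.
function keepThreads(lines, threadNames) {
  const markers = threadNames.map((name) => `[${name}]`);
  return lines.filter((line) => markers.some((marker) => line.includes(marker)));
}

// Placeholder lines showing only the assumed shape of a log line.
const sample = [
  "<timestamp> <severity> <component> [ReplicaSetMonitorWatcher] <message>",
  "<timestamp> <severity> <component> [initandlisten] <message>",
];

const kept = keepThreads(sample, ["ReplicaSetMonitorWatcher"]);
console.log(kept.length); // 1
```

More thread names can be added to the list as they turn out to be relevant.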

Akira_Kurogane · October 22, 2020, 12:01am

The logLevel thing will possibly be different with v4.4, if you’re using that.

urmasu · October 23, 2020, 7:44am

I see. We are using mongo 4.4. There appears to be an extra minOpTime check compared to 4.2; I cannot tell right now whether it has an effect or not.

Thank you for some valuable input. As we have some other issues to work through right now, I’ll get back to this.

urmasu · November 5, 2020, 6:35am

So as not to leave this without an ending: we filed a bug report with MongoDB, and from their answer I gather it is now confirmed that readPreferenceTags are not working as they are supposed to in the mongo router in 4.4.

[SERVER-51835] Mongos readPreferenceTags are not working as expected - MongoDB

Thank you for the support, Akira!

–

regards,

urmas

Akira_Kurogane · November 6, 2020, 2:05am

I see! Well, a pat on our backs for the detective work.

So the reading of the v4.2 code was correct, it’s a 4.4.0 (and 4.4.1) bug. Could in theory be fixed as early as 4.4.2, but I figure that’s about to be released, so I doubt it would be earlier than 4.4.3. When they patch it we’ll merge into Percona Server for MongoDB of the same version of course.

Cheers,

Akira
