Cache Reader No keys found for HMAC that is valid for time


PSMDB version: 4.4.9-10;
OS: CentOS 7.9;
MongoDB configuration: 3-node Replica Set: Primary-Secondary-Secondary, one of the Secondaries is hidden with priority 0. No Arbiter;
Each node is configured in a VM.

I faced an issue where an application that accesses a database in a MongoDB cluster with the configuration above, had the following error:
MongoDB.Driver.MongoCommandException: Command aggregate failed: Cache Reader No keys found for HMAC that is valid for time: { ts: Timestamp(1683718817, 1) } with id: 7195156194690859009.
This application is using the MongoDB C# driver v2.15.1 since the beginning, and does simple queries to fetch data. It is configured with readPreference=secondaryPreferred and the connection string points to the replica set.

There is a patching process which runs every week on the VMs of the MongoDB cluster that updates packages (excluding MongoDB-related) in every VM, making sure that it is a Secondary node (if not, failover) and restarting the VM after the updates. During a run of this process, there were two relevant restarts (excluding the mention of hidden node restart here since the application doesn’t reed from there). The error mentioned above happened during each restart (where the node was Secondary and the application was reading from there). They happened for around 10-13 seconds and triggered 50k errors each, on the application side, where the application was not able to read.

After investigating on the MongoDB side, I didn’t find any evidence of specific errors in the log. Also, I was able to check that the application was able to read from the Primary while the Secondary was unavailable, as it is supposed when using readPreference=secondaryPreferred. Then, when the Secondary node was back, the application pointed to the Secondary again.

Additional details about the first restart, as an example:
2023-05-10T11:39:15 - shutdown (with restart) on the Secondary started;
2023-05-10T11:39:16 - Primary node started to have connections;
2023-05-10T11:39:37 - last connection opened on the Primary node at (this application usually opens several connections per second, but it can also have gaps of a few seconds (5-10);
2023-05-10T11:40:15 - Secondary node was available for connections;
2023-05-10T11:40:17 - timestamp of the error on the driver - 1683718817;
2023-05-10T11:40:25 - Secondary node passed from STARTUP to STARTUP2;
2023-05-10T11:40:25 - Secondary node passed from STARTUP2 to RECOVERING;
2023-05-10T11:40:25 - Secondary node passed from RECOVERING to SECONDARY;
2023-05-10T11:40:25 - Secondary node started to have connections from the application again;
2023-05-10T11:40:25 - Errors started on the driver;
2023-05-10T11:40:38 - Errors stopped on the driver.

Other details
The patching process occurs very often (every week) so the restarts are frequent. This issue happened 4 times since the service is using the database on the cluster, for the last 8 months (since September 2022).
#1 - November 2022 - 2 restarts - errors only in 1 of the restarts;
#2 - March 2023 - 2 restarts - errors only in 1 of the restarts;
#3 - March 2023 - 2 restarts - errors in the 2 restarts;
#4 - May 2023 - 2 restarts - errors in the 2 restarts (occurrence explained in this post).
There is only 1 database in this cluster. There are 2 other clusters, in different regions, with the same MongoDB configuration, and very similar workload and dataset, that never had any error. Also, I have several more MongoDB clusters configured the same way, for years, that never presented any issue like this one.

After some investigation online I found little information regarding this error but apparently, it has been resolved in previous versions of MongoDB. There was also an issue related to a GO driver.
Since I didn’t find any specific error on the MongoDB side, I will upgrade the C# driver to a more recent stable version 2.19.0 or 2.19.1 as a first step.
I found a great explanation about HMAC and $clusterTime that may be helpful in understanding the problem Error while converting a mongoDB Cluster into a Replica Set - Stack Overflow.

Is anyone aware of why this may be happening? Is there any known issue about this? If it is possible to be a bug on MongoDB, should I open a ticket with this information for further investigation?

Kind regards,
João Soares

Hi @joaorpsoares,

For seeing at the PSMDB version, I would recommend to upgrade to the latest minor patch of 4.4 version. This current version also have some serious bugs which are fixed in the newer patches. Kindly refer the release notes for bug details.

For the above issue, it seems to be a bug which is fixed in newer patch of 4.4. Kindly refer the MongoDB Forum for your reference.


Hi @Parag_Bhayani,

Thank you for your answer.
I have seen those before but the issue stated on SERVER-40535 was apparently solved in previous versions that the one I’m using. Nevertheless, I understand that the advice in MongoDB forum is to upgrade to the latest version.
I was just trying to understand the issue better before going on that path.

I will evaluate it and get back to you. If you have any other suggestions, feel free to reach out.

Kind regards,
João Soares