pbm-agent on shard servers not working

In the following cluster, I’m unable to get pbm-agent running successfully on the shard servers:

pbm status

Cluster:

configReplicaSet:
  • configReplicaSet/mongo-cfg01:27019: pbm-agent v1.8.1 OK
  • configReplicaSet/mongo-cfg02:27019: pbm-agent v1.8.1 OK
  • configReplicaSet/mongo-cfg03:27019: pbm-agent v1.8.1 OK
replicaSet1:
  • replicaSet1/mongo01:27018: pbm-agent NOT FOUND FAILED status:
  • replicaSet1/mongo02:27018: pbm-agent NOT FOUND FAILED status:
  • replicaSet1/mongo-arb01:27018: pbm-agent NOT FOUND FAILED status:
replicaSet2:
  • replicaSet2/mongo03:27018: pbm-agent NOT FOUND FAILED status:
  • replicaSet2/mongo04:27018: pbm-agent NOT FOUND FAILED status:
  • replicaSet2/mongo-arb02:27018: pbm-agent NOT FOUND FAILED status:
replicaSet3:
  • replicaSet3/mongo05:27018: pbm-agent NOT FOUND FAILED status:
  • replicaSet3/mongo06:27018: pbm-agent NOT FOUND FAILED status:
  • replicaSet3/mongo-arb03:27018: pbm-agent NOT FOUND FAILED status:
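
For anyone who wants to check this kind of output from a script, here is a minimal sketch that lists the agents not reported as OK. It assumes the text layout shown in this post (`replica-set/host:port: pbm-agent <state>`); the names `AGENT_LINE` and `failing_agents` are made up for illustration and are not part of PBM.

```python
import re

# Matches agent lines in the layout pasted above, e.g.
#   "replicaSet1/mongo01:27018: pbm-agent NOT FOUND FAILED status:"
# The layout is an assumption taken from this post, not a stable format.
AGENT_LINE = re.compile(
    r"(?P<rs>[\w.-]+)/(?P<host>[\w.-]+:\d+): pbm-agent (?P<state>.*)$"
)


def failing_agents(status_text):
    """Return (replica_set, host) pairs whose agent state does not end in OK."""
    failed = []
    for line in status_text.splitlines():
        match = AGENT_LINE.search(line)
        if match and not match.group("state").rstrip().endswith("OK"):
            failed.append((match.group("rs"), match.group("host")))
    return failed


if __name__ == "__main__":
    sample = """\
configReplicaSet:
  • configReplicaSet/mongo-cfg01:27019: pbm-agent v1.8.1 OK
replicaSet1:
  • replicaSet1/mongo01:27018: pbm-agent NOT FOUND FAILED status:
"""
    for rs, host in failing_agents(sample):
        print(f"agent not OK: {rs}/{host}")
```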

I created the “pbm” user locally on each shard replica set and on the config server replica set.

On the config servers, this connection string works correctly:

PBM_MONGODB_URI="mongodb://pbm:secret@localhost:27019/?authSource=admin"

On the shard servers, I have the following. It does allow the agent to connect, because I can see it creating its collections in the “admin” database:

PBM_MONGODB_URI="mongodb://pbm:secret@localhost:27018/?authSource=admin"
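
To rule out quoting or escaping problems in these lines, here is a small sketch that generates them. It percent-encodes the credentials, which matters once a password contains characters such as `@`, `:` or `/`, and it emits plain ASCII double quotes. The helper names `build_pbm_uri` and `env_line` are hypothetical; only the `PBM_MONGODB_URI` variable and the URI layout come from the lines above.

```python
from urllib.parse import quote_plus


def build_pbm_uri(user, password, host="localhost", port=27018, auth_source="admin"):
    """Build a MongoDB URI with percent-encoded credentials."""
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/?authSource={auth_source}"
    )


def env_line(uri):
    """Format the assignment using plain ASCII double quotes."""
    return f'PBM_MONGODB_URI="{uri}"'


if __name__ == "__main__":
    # Config servers listen on 27019 and shard members on 27018 in this cluster.
    print(env_line(build_pbm_uri("pbm", "secret", port=27019)))
    print(env_line(build_pbm_uri("pbm", "secret", port=27018)))
```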

This is what I see in the logs from the agents on the shard servers:

Aug 22 18:41:27 mongo01 pbm-agent: 2022-08-22T18:41:27.000-0300 I node: replicaSet1/mongo01:27018
Aug 22 18:41:27 mongo01 pbm-agent: 2022-08-22T18:41:27.000-0300 I listening for the commands
Aug 22 18:41:32 mongo01 pbm-agent: 2022-08-22T18:41:32.000-0300 W [agentCheckup] get current storage status: query mongo: mongo: no documents in result
Aug 22 18:41:32 mongo01 pbm-agent: 2022-08-22T18:41:32.000-0300 E [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result

Any ideas where these messages come from?

If I understood the documentation correctly, the collections in the admin database are also used by the agents to communicate with each other. Is that right? But if the agents on the shards only connect to their local mongod instances, how would they be able to access the admin database on the config servers?
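
One possible explanation, which I have not confirmed against the PBM source: every shard member stores a `shardIdentity` document in `admin.system.version`, and its `configsvrConnectionString` field names the config server replica set and its hosts. An agent connected only to its local mongod could read that document and open a second connection to the config servers with the same credentials. A rough sketch of that idea follows; the function name `config_server_uri` is hypothetical, and the sample document is written by hand from the host names in this cluster.

```python
from urllib.parse import quote_plus


def config_server_uri(shard_identity, user, password, auth_source="admin"):
    """Derive a config server URI from a shardIdentity document.

    Assumes configsvrConnectionString has the form "rsName/host1:port,host2:port".
    """
    conn = shard_identity["configsvrConnectionString"]
    rs_name, _, hosts = conn.partition("/")
    if not rs_name or not hosts:
        raise ValueError(f"unexpected configsvrConnectionString: {conn!r}")
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}@{hosts}"
        f"/?replicaSet={rs_name}&authSource={auth_source}"
    )


if __name__ == "__main__":
    # Hand-written example of what the document might look like on mongo01.
    doc = {
        "_id": "shardIdentity",
        "shardName": "replicaSet1",
        "configsvrConnectionString": (
            "configReplicaSet/mongo-cfg01:27019,mongo-cfg02:27019,mongo-cfg03:27019"
        ),
    }
    print(config_server_uri(doc, "pbm", "secret"))
```

If that is how it works, it would also explain why the “pbm” user needs the same credentials on the config servers and on every shard.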

Just found this issue:

Is that a known bug?

After finding the issue above, I compiled pbm from the main branch:

Version: 1.8.1
Platform: linux/amd64
GitCommit: 67f4d63d2cdead023e6f6e50b2ba8d67391c4e50
GitBranch: main
BuildTime: 2022-08-23_11:47_UTC
GoVersion: go1.17.12

And that fixed the issue.

Now I get just a single WARNING:

Aug 23 08:49:07 mongo01 pbm-agent[5187]: 2022-08-23T08:49:07.000-0300 W [agentCheckup] get current storage status: query mongo: mongo: no documents in result

It appears only once, though, and is no longer followed by repeated ERROR messages.

Also, “pbm status” now shows the pbm-agent correctly:

configReplicaSet:
  • configReplicaSet/mongo-cfg01:27019: pbm-agent v1.8.1 OK
  • configReplicaSet/mongo-cfg02:27019: pbm-agent v1.8.1 OK
  • configReplicaSet/mongo-cfg03:27019: pbm-agent v1.8.1 OK
replicaSet1:
  • replicaSet1/mongo01:27018: pbm-agent v1.8.1 OK
  (…)

It looks like that is indeed a bug. Any idea when a new release with this fix will come out?
