Unable to start pbm-agent

Hi,
I’m trying to set up PBM for testing in a cluster consisting of one Data RS and one Config RS (3+3 nodes).

I was able to start pbm-agent on the Config RS nodes, but I’m getting the following error when trying to start it on the data nodes:

#  sudo -u mongod  pbm-agent --mongodb-uri "mongodb://pbmuser:pbmuser@localhost:27018/?authSource=admin"
2025-01-24T10:30:54.000+0000 E Exit: connect to PBM: create mongo connection to configsvr: ping: connection() error occurred during connection handshake: auth error: sasl conversation error: unable to authenticate using mechanism "SCRAM-SHA-1": (AuthenticationFailed) Authentication failed.

I encounter the same issue when starting it via systemctl with PBM_MONGODB_URI.

The pbmuser account has been created on both replica sets with the same properties.
According to mongodb logs pbm-agent disconnects itself for some reason:

2025-01-24T10:06:10.455249+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: 
{
   "t":{
      "$date":"2025-01-24T10:06:10.455+00:00"
   },
   "s":"I",
   "c":"NETWORK",
   "id":22943,
   "ctx":"listener",
   "msg":"Connection accepted",
   "attr":{
      "remote":"127.0.0.1:48620",
      "uuid":"613b96bb-6ce5-4afc-b158-e5be3fcdd246",
      "connectionId":1801,
      "connectionCount":7
   }
}


2025-01-24T10:06:10.456433+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: 
{
   "t":{
      "$date":"2025-01-24T10:06:10.456+00:00"
   },
   "s":"I",
   "c":"NETWORK",
   "id":51800,
   "ctx":"conn1801",
   "msg":"client metadata",
   "attr":{
      "remote":"127.0.0.1:48620",
      "client":"conn1801",
      "doc":{
         "application":{
            "name":"pbm-agent"
         },
         "driver":{
            "name":"mongo-go-driver",
            "version":"1.16.0"
         },
         "os":{
            "type":"linux",
            "architecture":"amd64"
         },
         "platform":"go1.22.8"
      }
   }
}


2025-01-24T10:06:10.458127+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: 
{
   "t":{
      "$date":"2025-01-24T10:06:10.458+00:00"
   },
   "s":"I",
   "c":"NETWORK",
   "id":22944,
   "ctx":"conn1801",
   "msg":"Connection ended",
   "attr":{
      "remote":"127.0.0.1:48620",
      "uuid":"613b96bb-6ce5-4afc-b158-e5be3fcdd246",
      "connectionId":1801,
      "connectionCount":6
   }
}

2025-01-24T10:06:10.458680+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: 
{
   "t":{
      "$date":"2025-01-24T10:06:10.458+00:00"
   },
   "s":"I",
   "c":"NETWORK",
   "id":22943,
   "ctx":"listener",
   "msg":"Connection accepted",
   "attr":{
      "remote":"10.16.128.160:37568",
      "uuid":"c8603449-0c37-4c51-bd12-5485642efef2",
      "connectionId":1802,
      "connectionCount":7
   }
}

2025-01-24T10:06:10.459367+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: {"t":{"$date":"2025-01-24T10:06:10.459+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn1802","msg":"client metadata","attr":{"remote":"10.16.128.160:37568","client":"conn1802","doc":{"application":{"name":"pbm-agent"},"driver":{"name":"mongo-go-driver","version":"1.16.0"},"os":{"type":"linux","architecture":"amd64"},"platform":"go1.22.8"}}}

2025-01-24T10:06:10.460526+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: {"t":{"$date":"2025-01-24T10:06:10.460+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"10.16.128.160:37580","uuid":"5705fce7-0431-4558-941a-bb2152e233b8","connectionId":1803,"connectionCount":8}}


2025-01-24T10:06:10.461003+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: {"t":{"$date":"2025-01-24T10:06:10.460+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn1803","msg":"client metadata","attr":{"remote":"10.16.128.160:37580","client":"conn1803","doc":{"application":{"name":"pbm-agent"},"driver":{"name":"mongo-go-driver","version":"1.16.0"},"os":{"type":"linux","architecture":"amd64"},"platform":"go1.22.8"}}}


2025-01-24T10:06:10.461359+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: {"t":{"$date":"2025-01-24T10:06:10.461+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"10.16.128.160:37582","uuid":"bb043ede-59a3-49cf-822b-6124264a729c","connectionId":1804,"connectionCount":9}}


2025-01-24T10:06:10.461769+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: {"t":{"$date":"2025-01-24T10:06:10.461+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn1804","msg":"client metadata","attr":{"remote":"10.16.128.160:37582","client":"conn1804","doc":{"application":{"name":"pbm-agent"},"driver":{"name":"mongo-go-driver","version":"1.16.0"},"os":{"type":"linux","architecture":"amd64"},"platform":"go1.22.8"}}}


2025-01-24T10:06:10.477184+00:00 sjc03-t05-ldb01 INFO mongod[1593975]:
{
   "t":{
      "$date":"2025-01-24T10:06:10.477+00:00"
   },
   "s":"I",
   "c":"ACCESS",
   "id":20250,
   "ctx":"conn1804",
   "msg":"Authenticationsucceeded",
   "attr":{
      "mechanism":"SCRAM-SHA-256",
      "speculative":true,
      "principalName":"pbmuser",
      "authenticationDatabase":"admin",
      "remote":"10.16.128.160:37582",
      "extraInfo":{
         
      }
   }
}


2025-01-24T10:06:10.481116+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: 
{
   "t":{
      "$date":"2025-01-24T10:06:10.481+00:00"
   },
   "s":"I",
   "c":"-",
   "id":20883,
   "ctx":"conn1802",
   "msg":"Interrupted operation as its client disconnected",
   "attr":{
      "opId":6446527
   }
}

2025-01-24T10:06:10.481423+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: {"t":{"$date":"2025-01-24T10:06:10.481+00:00"},"s":"I",  "c":"NETWORK",  "id":22944,   "ctx":"conn1804","msg":"Connection ended","attr":{"remote":"10.16.128.160:37582","uuid":"bb043ede-59a3-49cf-822b-6124264a729c","connectionId":1804,"connectionCount":8}}

2025-01-24T10:06:10.481900+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: {"t":{"$date":"2025-01-24T10:06:10.481+00:00"},"s":"I",  "c":"NETWORK",  "id":22944,   "ctx":"conn1803","msg":"Connection ended","attr":{"remote":"10.16.128.160:37580","uuid":"5705fce7-0431-4558-941a-bb2152e233b8","connectionId":1803,"connectionCount":7}}

2025-01-24T10:06:10.482299+00:00 sjc03-t05-ldb01 INFO mongod[1593975]: {"t":{"$date":"2025-01-24T10:06:10.482+00:00"},"s":"I",  "c":"NETWORK",  "id":22944,   "ctx":"conn1802","msg":"Connection ended","attr":{"remote":"10.16.128.160:37568","uuid":"c8603449-0c37-4c51-bd12-5485642efef2","connectionId":1802,"connectionCount":6}}

Hi, the pbm-agent installed on the Data RS also needs to establish connectivity to the config RS. Check that no firewall/routing issues, and also that you can log in from one of the Data nodes manually to the config RS using the PBM credentials

Hi, thank you for the quick response.

Yes I’m able to connect from the data node to each config RS member, I also can connect locally with the same credentials using mongosh:

[24/01 12:52:34] root@sjc03-t05-ldb01 /tmp #  sudo -u mongod  mongosh "mongodb://pbmuser:pbmuser@sjc03-t05-ldc01:27019/?authSource=admin"
Current Mongosh Log ID:	67938d2d6e3851b9e5b29d41
Connecting to:		mongodb://<credentials>@sjc03-t05-ldc01:27019/?authSource=admin&directConnection=true&appName=mongosh+2.2.10
Using MongoDB:		6.0.12-9
Using Mongosh:		2.2.10
mongosh 2.3.8 is available for download: https://www.mongodb.com/try/download/shell
.......
ldb-rs11 [direct: primary] test> quit


[24/01 12:51:32] root@sjc03-t05-ldb01 /tmp #  sudo -u mongod  mongosh "mongodb://pbmuser:pbmuser@localhost:27018/?authSource=admin"
Current Mongosh Log ID:	67938cd67fc82de34db29d41
Connecting to:		mongodb://<credentials>@localhost:27018/?authSource=admin&directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+2.2.10
Using MongoDB:		6.0.12-9
Using Mongosh:		2.2.10

For mongosh info see: https://docs.mongodb.com/mongodb-shell/
......

can you check the contents of the db.system.version collection in the Data RS? pbm agent gets the config server URI from that

Hi, thanks for the update.
the collection looks good to me:

ldb-rs11 [direct: primary] test> rs.status().members.forEach(function(z){printjson(z.name + ' -> ' + z.stateStr);})
sjc03-t05-ldb01.stage.companydomain.com:27018 -> PRIMARY
sjc03-t05-ldb02.stage.companydomain.com:27018 -> SECONDARY
sjc03-t05-ldb03.stage.companydomain.com:27018 -> SECONDARY

ldb-rs11 [direct: primary] test> use admin
switched to db admin
ldb-rs11 [direct: primary] admin> db.system.version.find().pretty()
[
  { _id: 'featureCompatibilityVersion', version: '6.0' },
  {
    _id: 'shardIdentity',
    shardName: 'ldb-rs11',
    clusterId: ObjectId('6634c9e813748467e5e4039f'),
    configsvrConnectionString: 'ldc-rs01/sjc03-t04-ldc01.stage.companydomain.com:27019,sjc03-t04-ldc02.stage.companydomain.com:27019,sjc03-t04-ldc03.stage.companydomain.com:27019'
  },
  { _id: 'migrationCoordinatorStats', count: 556 },
  { _id: 'authSchema', currentVersion: 5 },
  {
    _id: 'minOpTimeRecovery',
    minOpTime: { ts: Timestamp({ t: 1731591699, i: 3 }), t: Long('-1') },
    minOpTimeUpdaters: 0
  },
  {
    _id: 'startRangeDeletion',
    ns: 'calllog.calls',
    uuid: UUID('01e9588d-e751-4a1c-bf62-0bfa87f9f356'),
    min: { '_id._usr': Long('505717193980392771') },
    max: { '_id._usr': Long('533981164099930433') }
  }
]



[28/01 15:15:01] root@sjc03-t05-ldb01 / # sudo -u mongod  mongosh "mongodb://pbmuser:pbmuser@sjc03-t05-ldc01.stage.companydomain.com:27019/?authSource=admin"
Current Mongosh Log ID:	6798f4856efaac4239b29d41
Connecting to:		mongodb://<credentials>@sjc03-t05-ldc01.stage.companydomain.com:27019/?authSource=admin&directConnection=true&appName=mongosh+2.2.10
Using MongoDB:		6.0.12-9
Using Mongosh:		2.2.10
.....


It looks like the issue is caused by a misconfiguration between the config and data replica sets. I found some errors in the primary node logs. Now trying to fix it.
Thank you for your hint about the config RS.

1 Like