Error getting MongoDB metrics

Hello,

I have a problem with pmm-agent: from time to time it cannot get data from MongoDB. I could not find out why this is happening from the MongoDB logs. pmm-agent says there is no data and that it cannot get data from MongoDB, but at the same time I can query MongoDB without any problem. Here is my inventory:

PMM Server 2.9.0
PMM Agent 2.9.1
MongoDB 4.4.5

PMM agent status:

The MongoDB profiler is disabled. After restarting the pmm-agent service, the status changed:

I found some articles about this situation:
https://github.com/prometheus/prometheus/issues/6139
https://jira.percona.com/browse/PMM-5848

But no luck :frowning:
On the PMM dashboard, it cannot get data.

For the same MongoDB server, not all metrics are missing data. For example, Replication Set Status, Max Member Ping Time, Oplog Recovery Window and Oplog Buffer Capacity all have the same status: they have no data. But other metrics like Heartbeat Time, Elections and Operations have data without interruption.

I cannot update the PMM server and agent because of a mongodb_exporter incompatibility (MongoDB latency metrics - #16 by Ghan). Does anybody have any idea about this problem? Thanks!

Hi,
Can you check your mongodb_exporter version?
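For example, something like this on the client host (the path below is the usual pmm2-client location and the --version flag is an assumption; adjust both if your build or install path differs):

# default pmm2-client exporter path; adjust if installed elsewhere
/usr/local/percona/pmm2/exporters/mongodb_exporter --version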


Here you are:

By the way, I tried adding a blank node and waited for the sync to finish, then checked its status via pmm-agent, but the result is the same.

I also backed up all data from MongoDB and restored it to a newly created cluster. After the restore finished, pmm-agent showed the same behavior there.

I have almost 25 MongoDB clusters and only this one gives these errors. I think it is related to the data; how can I be sure about that?


Hello again.
I would try running the exporter manually and increasing the debug level. Something like this:

mongodb_exporter --mongodb.uri=mongodb://127.0.0.1:17001 --log.level=debug --compatible-mode
and then browse to http://localhost:9216/metrics and check whether you see metrics in the browser and/or errors in the console output.
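If checking from the shell is easier than a browser, something like this should do (9216 is the exporter's default listen port when you run it manually as above; adjust if you passed a different one):

# fetch the metrics page and show the first few lines
curl -s http://localhost:9216/metrics | head -n 20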
You could also try downloading the latest agent from:

Let me know if that helps.

Regards


Thank you @carlos.salguero, I am happy that I found you, since you are involved with mongodb_exporter :smiling_face_with_three_hearts: I will try running the exporter manually and let you know about the results.

I also have two open cases related to mongodb_exporter:

I would be glad if you had a chance to check them out. :pray:t3:


Update,

I tried to check it manually, but it gave an error:

Then I ran it without that and started debugging. I called the metrics URL several times within 10 seconds, and all of the calls returned data. But at the same time, there is no data in PMM. Any idea why that is?
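For reference, I checked it roughly like this (the port is the one my manually started exporter was listening on; yours may differ):

# hit the exporter every 10 seconds and print status code, total time and response size
while true; do
  curl -s -o /dev/null -w '%{http_code} %{time_total}s %{size_download} bytes\n' http://127.0.0.1:9216/metrics
  sleep 10
done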


Hmmm… given that all other clients are working, I'm suspicious of networking. Can you check https://<pmm-server>/prometheus/targets and see if there are any errors mentioned for that server?
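If the UI is inconvenient, the same information should be reachable over the API; something like the following (the admin:admin credentials and <pmm-server> are placeholders, and the /prometheus/ prefix assumes the way PMM proxies its internal Prometheus):

# list scrape targets and pull out any scrape errors
curl -k -s -u admin:admin 'https://<pmm-server>/prometheus/api/v1/targets' | grep -o '"lastError":"[^"]*"'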

There are three things happening: the PMM client getting metrics from the system (you were able to verify that manually following @carlos.salguero's suggestion), the PMM server asking for those metrics (the PMM server pulling from the client was the only option in your version), and PMM displaying the data.

What you find from that URL above might point us to which of the two remaining pieces to look at.


It seems like the agent is not really working/connecting.
The ONLY metric you have is replset_my_state showing the server is up, but nothing else.


Boy, I missed that… You should be getting HUNDREDS of metrics back if the exporter is retrieving them properly, so the issue actually ISN'T the network or the PMM server; that part is working. So why would this one agent not be able to get metrics? Permissions?
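Two quick things worth checking on the client host (the 9216 port and the 'pmm' user name are assumptions; substitute whatever your setup uses, and add -u/-p to the mongo call if auth is enforced):

# count the mongodb_ series the exporter exposes; it should be in the hundreds
curl -s http://localhost:9216/metrics | grep -c '^mongodb_'

# check the roles granted to the monitoring user
mongo admin --eval 'printjson(db.getUser("pmm"))'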


Hi there,

Thank you for your replies @carlos.salguero and @steve.hoffman. The thing is, the metrics do get data sometimes, but only mongodb_mongod_replset_my_state and mongodb_mongod_metrics_repl_buffer_size_bytes have this problem. I mean, other metrics work fine on that particular server and on the other servers connected to PMM.
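For what it is worth, this is roughly how I check whether those two series show up in a single scrape (<exporter-port> and <agent-id> are placeholders for the values pmm-agent assigned to this exporter):

# look for the two problematic series in one scrape of the agent-managed exporter
curl -s --user 'pmm:/agent_id/<agent-id>' http://127.0.0.1:<exporter-port>/metrics | grep -E 'mongodb_mongod_replset_my_state|mongodb_mongod_metrics_repl_buffer_size_bytes'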

node_exporter can get metrics without delay, but mongodb_exporter cannot. I don't know what has happened… I will migrate that particular machine to another host tonight; maybe it is related to the hardware, I don't know…


@carlos.salguero, do you have any comments about my other cases?


Hello

After migrating the related VM to another host, my problem is gone :flushed: I really don't know how it is related, but I think it is because of I/O?

Can somebody explain to me how that might be related to my problem? Thanks!


The same problem exists on PMM 2.18 with pmm2-client 2.18.0.

Here it is


Hello.
Sadly, what I can see doesn’t say much.
In the images you attached, I can see the output of a curl request, but I would need the console log of the mongodb_exporter run in which you specified --log.level=debug.

Thanks


Here you are @carlos.salguero

Since the server is in the "SECONDARY" state, mongodb_exporter cannot show it.

Here is the debug output:

DEBU[0000] Compatible mode: true
DEBU[0000] Connection URI: mongodb://username:username!123!@servername-s:27017/admin
INFO[0000] Starting HTTP server for http://:42002/metrics … source=“server.go:144”
DEBU[0079] getDiagnosticData result
{}
DEBU[0079] replSetGetStatus result:
{
“$clusterTime”: {
“clusterTime”: {
“T”: 1623368377,
“I”: 1
},
“signature”: {
“hash”: {
“Subtype”: 0,
“Data”: “mFCLs2axEpchQ7rFR8/mTHukWNI=”
},
“keyId”: 6937552495887515649
}
},
“date”: “2021-06-11T02:39:42.114+03:00”,
“electionParticipantMetrics”: {
“electionCandidateMemberId”: 0,
“electionTerm”: 47,
“lastAppliedOpTimeAtElection”: {
“t”: 46,
“ts”: {
“T”: 1623095762,
“I”: 1
}
},
“lastVoteDate”: “2021-06-07T22:56:08.718+03:00”,
“maxAppliedOpTimeInSet”: {
“t”: 46,
“ts”: {
“T”: 1623095762,
“I”: 1
}
},
“newTermAppliedDate”: “2021-06-07T22:56:11.023+03:00”,
“newTermStartDate”: “2021-06-07T22:56:08.88+03:00”,
“priorityAtElection”: 1,
“voteReason”: “”,
“votedForCandidate”: true
},
“heartbeatIntervalMillis”: 2000,
“lastStableRecoveryTimestamp”: {
“T”: 1623368347,
“I”: 1
},
“majorityVoteCount”: 2,
“members”: [
{
“_id”: 0,
“configTerm”: -1,
“configVersion”: 9,
“electionDate”: “2021-06-07T22:56:08+03:00”,
“electionTime”: {
“T”: 1623095768,
“I”: 1
},
“health”: 1,
“infoMessage”: “”,
“lastHeartbeat”: “2021-06-11T02:39:41.463+03:00”,
“lastHeartbeatMessage”: “”,
“lastHeartbeatRecv”: “2021-06-11T02:39:40.427+03:00”,
“name”: “servername-p:27017”,
“optime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“optimeDate”: “2021-06-11T02:39:37+03:00”,
“optimeDurable”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“optimeDurableDate”: “2021-06-11T02:39:37+03:00”,
“pingMs”: 0,
“state”: 1,
“stateStr”: “PRIMARY”,
“syncSourceHost”: “”,
“syncSourceId”: -1,
“uptime”: 272623
},
{
“_id”: 1,
“configTerm”: -1,
“configVersion”: 9,
“health”: 1,
“infoMessage”: “”,
“lastHeartbeatMessage”: “”,
“name”: “servername-s:27017”,
“optime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“optimeDate”: “2021-06-11T02:39:37+03:00”,
“self”: true,
“state”: 2,
“stateStr”: “SECONDARY”,
“syncSourceHost”: “servername-p:27017”,
“syncSourceId”: 0,
“uptime”: 1418613
},
{
“_id”: 2,
“configTerm”: -1,
“configVersion”: 9,
“health”: 1,
“infoMessage”: “”,
“lastHeartbeat”: “2021-06-11T02:39:41.462+03:00”,
“lastHeartbeatMessage”: “”,
“lastHeartbeatRecv”: “2021-06-11T02:39:41.462+03:00”,
“name”: “servername-a:27017”,
“pingMs”: 0,
“state”: 7,
“stateStr”: “ARBITER”,
“syncSourceHost”: “”,
“syncSourceId”: -1,
“uptime”: 1418411
},
{
“_id”: 3,
“configTerm”: -1,
“configVersion”: 9,
“health”: 1,
“infoMessage”: “”,
“lastHeartbeat”: “2021-06-11T02:39:40.118+03:00”,
“lastHeartbeatMessage”: “”,
“lastHeartbeatRecv”: “2021-06-11T02:39:40.117+03:00”,
“name”: “servername-s2:27017”,
“optime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“optimeDate”: “2021-06-11T02:39:37+03:00”,
“optimeDurable”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“optimeDurableDate”: “2021-06-11T02:39:37+03:00”,
“pingMs”: 0,
“state”: 2,
“stateStr”: “SECONDARY”,
“syncSourceHost”: “servername-s:27017”,
“syncSourceId”: 1,
“uptime”: 428
}
],
“myState”: 2,
“ok”: 1,
“operationTime”: {
“T”: 1623368377,
“I”: 1
},
“optimes”: {
“appliedOpTime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“durableOpTime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“lastAppliedWallTime”: “2021-06-11T02:39:37.044+03:00”,
“lastCommittedOpTime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“lastCommittedWallTime”: “2021-06-11T02:39:37.044+03:00”,
“lastDurableWallTime”: “2021-06-11T02:39:37.044+03:00”,
“readConcernMajorityOpTime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“readConcernMajorityWallTime”: “2021-06-11T02:39:37.044+03:00”
},
“set”: “replsetname”,
“syncSourceHost”: “servername-p:27017”,
“syncSourceId”: 0,
“term”: 47,
“votingMembersCount”: 3,
“writableVotingMembersCount”: 2,
“writeMajorityCount”: 2
}
DEBU[0079] replSetGetStatus result:
{
“$clusterTime”: {
“clusterTime”: {
“T”: 1623368377,
“I”: 1
},
“signature”: {
“hash”: {
“Subtype”: 0,
“Data”: “mFCLs2axEpchQ7rFR8/mTHukWNI=”
},
“keyId”: 6937552495887515649
}
},
“date”: “2021-06-11T02:39:42.135+03:00”,
“electionParticipantMetrics”: {
“electionCandidateMemberId”: 0,
“electionTerm”: 47,
“lastAppliedOpTimeAtElection”: {
“t”: 46,
“ts”: {
“T”: 1623095762,
“I”: 1
}
},
“lastVoteDate”: “2021-06-07T22:56:08.718+03:00”,
“maxAppliedOpTimeInSet”: {
“t”: 46,
“ts”: {
“T”: 1623095762,
“I”: 1
}
},
“newTermAppliedDate”: “2021-06-07T22:56:11.023+03:00”,
“newTermStartDate”: “2021-06-07T22:56:08.88+03:00”,
“priorityAtElection”: 1,
“voteReason”: “”,
“votedForCandidate”: true
},
“heartbeatIntervalMillis”: 2000,
“lastStableRecoveryTimestamp”: {
“T”: 1623368347,
“I”: 1
},
“majorityVoteCount”: 2,
“members”: [
{
“_id”: 0,
“configTerm”: -1,
“configVersion”: 9,
“electionDate”: “2021-06-07T22:56:08+03:00”,
“electionTime”: {
“T”: 1623095768,
“I”: 1
},
“health”: 1,
“infoMessage”: “”,
“lastHeartbeat”: “2021-06-11T02:39:41.463+03:00”,
“lastHeartbeatMessage”: “”,
“lastHeartbeatRecv”: “2021-06-11T02:39:40.427+03:00”,
“name”: “servername-p:27017”,
“optime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“optimeDate”: “2021-06-11T02:39:37+03:00”,
“optimeDurable”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“optimeDurableDate”: “2021-06-11T02:39:37+03:00”,
“pingMs”: 0,
“state”: 1,
“stateStr”: “PRIMARY”,
“syncSourceHost”: “”,
“syncSourceId”: -1,
“uptime”: 272623
},
{
“_id”: 1,
“configTerm”: -1,
“configVersion”: 9,
“health”: 1,
“infoMessage”: “”,
“lastHeartbeatMessage”: “”,
“name”: “servername-s:27017”,
“optime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“optimeDate”: “2021-06-11T02:39:37+03:00”,
“self”: true,
“state”: 2,
“stateStr”: “SECONDARY”,
“syncSourceHost”: “servername-p:27017”,
“syncSourceId”: 0,
“uptime”: 1418613
},
{
“_id”: 2,
“configTerm”: -1,
“configVersion”: 9,
“health”: 1,
“infoMessage”: “”,
“lastHeartbeat”: “2021-06-11T02:39:41.462+03:00”,
“lastHeartbeatMessage”: “”,
“lastHeartbeatRecv”: “2021-06-11T02:39:41.462+03:00”,
“name”: “servername-a:27017”,
“pingMs”: 0,
“state”: 7,
“stateStr”: “ARBITER”,
“syncSourceHost”: “”,
“syncSourceId”: -1,
“uptime”: 1418411
},
{
“_id”: 3,
“configTerm”: -1,
“configVersion”: 9,
“health”: 1,
“infoMessage”: “”,
“lastHeartbeat”: “2021-06-11T02:39:42.119+03:00”,
“lastHeartbeatMessage”: “”,
“lastHeartbeatRecv”: “2021-06-11T02:39:42.118+03:00”,
“name”: “servername-s2:27017”,
“optime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“optimeDate”: “2021-06-11T02:39:37+03:00”,
“optimeDurable”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“optimeDurableDate”: “2021-06-11T02:39:37+03:00”,
“pingMs”: 0,
“state”: 2,
“stateStr”: “SECONDARY”,
“syncSourceHost”: “servername-s:27017”,
“syncSourceId”: 1,
“uptime”: 428
}
],
“myState”: 2,
“ok”: 1,
“operationTime”: {
“T”: 1623368377,
“I”: 1
},
“optimes”: {
“appliedOpTime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“durableOpTime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“lastAppliedWallTime”: “2021-06-11T02:39:37.044+03:00”,
“lastCommittedOpTime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“lastCommittedWallTime”: “2021-06-11T02:39:37.044+03:00”,
“lastDurableWallTime”: “2021-06-11T02:39:37.044+03:00”,
“readConcernMajorityOpTime”: {
“t”: 47,
“ts”: {
“T”: 1623368377,
“I”: 1
}
},
“readConcernMajorityWallTime”: “2021-06-11T02:39:37.044+03:00”
},
“set”: “replsetname”,
“syncSourceHost”: “servername-p:27017”,
“syncSourceId”: 0,
“term”: 47,
“votingMembersCount”: 3,
“writableVotingMembersCount”: 2,
“writeMajorityCount”: 2
}
DEBU[0079] getDiagnosticData result
{}

I am facing the same problem again, even though there is no VM- or host-related problem. I isolated that particular server, and the problem still exists. I kindly request your help.


One more piece of information:

The post above is about mongodb_exporter v20 and pmm2-client 2.18. I tried to collect metrics with pmm-agent 2.9.0, whose mongodb_exporter version shows as N/A, and it can collect data.

But when I check it in the PMM Explore view, it does not have data.
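In case it helps, I also try querying the metric straight from the PMM server to see whether it ever arrives there (the address and admin:admin credentials are placeholders, and the /prometheus/ query path is an assumption based on how PMM proxies its metrics store):

# ask the PMM server for the series directly
curl -k -s -u admin:admin 'https://<pmm-server>/prometheus/api/v1/query?query=mongodb_mongod_replset_my_state'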

I sense that it takes too much time to get data from pmm-agent. I tested it via:

curl -o /dev/null -s -w 'Establish Connection: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n' --user 'pmm:/agent_id/68f968fd-6496-432f-b60a-1b6f899da1d0' http://10.0.0.0:42003/metrics

Here is the result

Isn't the time to first byte too high? Could it be giving a timeout error because of this? How can I be sure about it?
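To see whether this is a one-off or the scrape regularly comes close to the scraper's timeout (Prometheus defaults scrape_timeout to 10s, though PMM may use its own values), I repeat the measurement like this:

# repeat the timing measurement 20 times, 5 seconds apart
for i in $(seq 1 20); do
  echo -n "run $i: "
  curl -o /dev/null -s -w 'TTFB: %{time_starttransfer}s Total: %{time_total}s\n' --user 'pmm:/agent_id/68f968fd-6496-432f-b60a-1b6f899da1d0' http://10.0.0.0:42003/metrics
  sleep 5
done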


Hi @carlos.salguero, do you have time to check, please? Thanks!


Is anybody here to help, or has everyone gone on vacation :slight_smile:?


Hello,

Any news, @carlos.salguero?