We have configured an email alert for "MongoDB service down" from the default template. For shards and config servers it works properly: whenever the service is down, an alert is raised. For the mongos servers, however, we get alerts frequently. When we check, the services are running on the node, but the Percona monitoring tool shows the service as down.
Thanks for the quick response. But I can't find a log file for pmm-agent on the servers.
Is there any other way to check?
root@server:~# find / -name pmm-agent.log
find: ‘/proc/1416010/task/1444412’: No such file or directory
find: ‘/proc/1416010/task/1444421’: No such file or directory
find: ‘/proc/1416010/task/1444487’: No such file or directory
find: ‘/proc/1416010/task/1444496’: No such file or directory
find: ‘/proc/1416010/task/1444499’: No such file or directory
find: ‘/proc/1416010/task/1444501’: No such file or directory
find: ‘/proc/1416010/task/1444505’: No such file or directory
root@server:~#
Hi Matthewb,
We thought that having the agents on one VM monitor both QRs was causing the false alerts, so we removed the remote QR from QR1 and registered it on its own node. But the issue is still not resolved. Both QRs still often show as down in Percona, even though the nodes are actually up.
Yes, load can be an issue. If the mongos is busy handling other queries, it will not be able to respond to the PMM query for “alive” in a timely manner, thus it might alarm. I suggest increasing the number of failures needed to trigger the alarm state.
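To illustrate why a higher failure count helps, here is a minimal Python sketch of the general "N consecutive failures" idea. It is not PMM's actual implementation; the function name `alert_states` and the parameter `failures_needed` are made up for this example, and the sample data is invented. Where you set the equivalent value depends on how your alert rule is defined.

```python
from typing import Iterable, List


def alert_states(up_samples: Iterable[bool], failures_needed: int) -> List[bool]:
    """For each check result, report whether the alert would be firing.

    up_samples: True means the "alive" check got an answer in time,
                False means it failed or timed out.
    failures_needed: hypothetical threshold of consecutive failed checks
                     required before the alert fires.
    """
    if failures_needed < 1:
        raise ValueError("failures_needed must be at least 1")

    consecutive_failures = 0
    states = []
    for is_up in up_samples:
        # A single successful check resets the failure streak.
        consecutive_failures = 0 if is_up else consecutive_failures + 1
        states.append(consecutive_failures >= failures_needed)
    return states


# Invented samples: a busy mongos misses one check, later it is really down.
samples = [True, False, True, False, False, False, True]

print(alert_states(samples, failures_needed=1))
print(alert_states(samples, failures_needed=3))
```

With a threshold of 1, the single missed check at the second sample already fires the alert. With a threshold of 3, only the run of three failed checks does, so a mongos that is briefly slow under load no longer triggers it. The trade-off is that a real outage is reported a few check intervals later.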