The MongoDB cluster had been running reasonably well, but for the past few days I have been getting the above error. Versions in use:
crVersion: 1.7.0
image: percona/percona-server-mongodb:4.4.3-5
Since this version has the “close most files” behaviour hard-coded after 27h, we have to restart it every 24h. Each pod restart takes about 30 min, and right after the restart I see the above message in the log.
- Heartbeat failed after max retries: what does this mean, and how can I control it?
- Is there some “limit” being hit, or why else does the pod restart?