I am facing high CPU utilization (up to 90-95%) on all the mongo nodes where the PMM client is installed. When I run the top command on those servers, I can see a “fold” process consuming most of the resources (screenshot attached). Also, on the PMM dashboard I can see the CPU utilization being attributed to a “steal” process. We are planning to deploy it in the production environment, but before that we need to understand what these processes are, and whether there is a way to restrict the resources they consume.
CPU utilization in the “steal” state means that another VM on the same physical host as your VM is taking resources from you. It is inherent to how cloud computing works; however, in your case it is much higher than normally observed. See this article to understand steal:
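If you want to check steal independently of PMM, here is a minimal sketch that samples the aggregate "cpu" line of /proc/stat twice and reports the share of time spent in steal. It assumes a Linux host with the standard /proc/stat layout (user, nice, system, idle, iowait, irq, softirq, steal); the function names and the 5-second interval are only illustrative, not anything PMM itself uses.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
# guest and guest_nice are left out because they are already counted in user/nice.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu' line into a dict of jiffy counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Very old kernels may not report every field; treat missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def steal_percent(before, after):
    """Percentage of elapsed CPU time spent in steal between two samples."""
    total = sum(after.values()) - sum(before.values())
    if total <= 0:
        return 0.0
    return 100.0 * (after["steal"] - before["steal"]) / total


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which is the aggregate 'cpu' line."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    first = parse_cpu_line(read_cpu_line())
    time.sleep(5)  # illustrative sampling interval
    second = parse_cpu_line(read_cpu_line())
    print("steal: %.1f%%" % steal_percent(first, second))
```

A value that stays high across several samples would suggest the contention is coming from the hypervisor rather than from a process inside the VM.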
Regarding the fold process - I am not sure what this is. First thing that came to mind: https://foldingathome.org/
Thanks a lot, Michael, for your valuable feedback. I disabled the Linux metrics and I no longer see the high CPU utilization. I am using CloudWatch for now to monitor the Linux metrics. I am not sure if that was the reason behind the CPU utilization. I am planning to deploy it in production in the next 2 weeks. Hopefully we can find a solution before that.
OK, great! In case you missed it, we released 1.13 yesterday, and it included an upgrade to Prometheus, taking us to version 2.3, which I expect will have a dramatically positive impact on your experience with PMM. Take a look at our blog post: