Hi there,
First of all, I want to mention that I am quite happy with XtraBackup (innobackupex) and that it indeed saves a lot of time. Thank you for this great software. Sadly, we are facing some Nagios problems with the Percona MySQL plugins when they are executed via NRPE on a server under considerable load.
Scenario: Every full hour, innobackupex starts on our backup slave; its output goes to tar, then to pigz, and the result is copied to its final destination. Because our database is quite big, the whole procedure takes about half an hour. This works like a charm. But whenever the server is doing that, we get Nagios messages saying: “check_nrpe: Socket time out after 10 seconds.”, which raises a CRITICAL - and as far as I know, there is no option to raise that socket timeout. So I concluded that whenever I use NRPE on a server under some load, this socket may fail to answer in time. This only affects the Percona plugins.
So I thought: why do I need NRPE at all? I “converted” pmp-check-mysql-innodb (max_duration) from an NRPE check to a regular Nagios check. Since this plugin only needs a MySQL user, the check works fine without NRPE - and more importantly, those socket timeouts are gone!
Motivated by that result, I went on to convert the other checks, such as pmp-check-mysql-file-privs. It took me some time to work out why I can execute it on the database server but not from the Nagios server. But since you people at Percona are great guys who write your Nagios plugins in bash, I was able to find out what happens.
pmp-check-mysql-file-privs cannot be run from a dedicated Nagios server against a REMOTE MySQL server to check file privileges on that remote server, because it DOES read the configuration of the given (remote) MySQL server but then looks for the MySQL datadir LOCALLY. I have difficulty understanding that logic: reading a remote config but searching for the datadir locally. This means I can only use those plugins (the ones that need some system privileges) where Nagios AND MySQL both run on the same machine, or through NRPE.
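To illustrate what I mean, here is a minimal sketch of the behaviour as I understand it. This is not the plugin's actual code; MYSQL_BIN and both function names are made up for the example. The datadir is asked from whatever MySQL server is given, but the directory test runs on the machine that executes the check.

```shell
#!/usr/bin/env bash
# Illustrative sketch only, NOT the actual pmp-check-mysql-file-privs code.
# MYSQL_BIN and both function names are hypothetical.
MYSQL_BIN=${MYSQL_BIN:-mysql}

# Ask the (possibly remote) MySQL server where its datadir is.
get_datadir() {
    local host=$1
    "$MYSQL_BIN" -h "$host" -N -B -e 'SELECT @@datadir'
}

# Look for that directory on the machine that runs this check.
# Exit codes follow the Nagios convention (0 = OK, 3 = UNKNOWN).
check_datadir_locally() {
    local host=$1 datadir
    datadir=$(get_datadir "$host") || {
        echo "UNK could not query MySQL on $host"
        return 3
    }
    if [ ! -d "$datadir" ]; then
        echo "UNK datadir $datadir reported by $host not found on this machine"
        return 3
    fi
    echo "OK datadir $datadir found on this machine"
    return 0
}
```

Called from the Nagios server as check_datadir_locally db1.example.com (a hypothetical host name), this ends in the UNK branch unless the Nagios server happens to have the same path, which is the mismatch described above.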
So, assuming my observations are correct, I wonder whether you could add a feature to all those plugins that lets me define a system user, so that the checks also run on the REMOTE system. Perhaps the best way would be to provide arguments for an SSH account that can execute find and ps on the remote server (or even to use the same user for MySQL and for Unix). This would be awesome, and that is my suggestion. Maybe others have fallen into this pitfall too.
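A rough sketch of what I have in mind, under my own assumptions: the names REMOTE_RUNNER, remote_exec and check_remote_file_owner are invented, the ownership test is only my guess at what such a check would look for, and none of this exists in the plugins today. The idea is simply to run find through an SSH account on the database server instead of locally.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the proposed feature, not existing plugin code.
# REMOTE_RUNNER, remote_exec and check_remote_file_owner are made-up names.
REMOTE_RUNNER=${REMOTE_RUNNER:-ssh}

# Run one command string on the remote host. BatchMode avoids password
# prompts and ConnectTimeout keeps the check from hanging on a dead host.
remote_exec() {
    local target=$1 cmd=$2
    "$REMOTE_RUNNER" -o BatchMode=yes -o ConnectTimeout=5 "$target" "$cmd"
}

# Report files under the datadir that do not belong to the expected
# user and group. Exit codes: 0 = OK, 1 = WARNING, 3 = UNKNOWN.
check_remote_file_owner() {
    local target=$1 datadir=$2 owner=${3:-mysql} group=${4:-mysql}
    local cmd bad
    # %q quotes the arguments so they survive the remote shell.
    cmd=$(printf 'find %q \\( ! -user %q -o ! -group %q \\) -print 2>/dev/null | head -n 5' \
        "$datadir" "$owner" "$group")
    bad=$(remote_exec "$target" "$cmd") || {
        echo "UNK could not run find on $target"
        return 3
    }
    if [ -n "$bad" ]; then
        echo "WARN files with wrong owner under $datadir on $target: ${bad//$'\n'/ }"
        return 1
    fi
    echo "OK all files under $datadir on $target belong to $owner:$group"
    return 0
}
```

With key-based login set up, a call such as check_remote_file_owner nagios@db1.example.com /var/lib/mysql (user, host and path are placeholders) would then work from a dedicated Nagios server without NRPE.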
Otherwise I will try to modify your plugins myself, using an SSH connection to do what I want. So, once again, thank you for providing bash shell plugins - this is a great learning resource!
If there are any other possibilities for running that plugin with almost no socket timeouts, I’d appreciate any answers and suggestions on that matter.
All the best wishes
Ron