Feature Request: Make pmp-check-mysql-* usable for remote servers

Hi there,

first of all, I want to mention that I am quite happy with innobackupex and that it indeed saves a lot of time. Thank you for this great software. Sadly, we are facing some Nagios problems with the Percona MySQL plugins when they are executed via NRPE on a server that is under considerable load.

Scenario: Every full hour innobackupex starts on our backup slave, its output is piped to tar, then to pigz, and the result is copied to its final destination. Because our database is quite big, this whole procedure takes about half an hour. This works like a charm. But whenever the server is busy doing that, we get Nagios messages saying: “check_nrpe: Socket timeout after 10 seconds.”, which raises a critical - and as far as I know, there is no option to raise that socket timeout. So I concluded that whenever I use NRPE on a server under some load, the NRPE socket may not answer in time. This only affects the Percona plugins.

So I thought, well, why do I need NRPE? So I “converted” pmp-check-mysql-innodb (max_duration) from NRPE to a regular Nagios check. Since this plugin only needs a MySQL user, the check works fine without NRPE - and more importantly: those socket timeouts are gone!

Motivated by that result I went on to convert the other checks, like pmp-check-mysql-file-privs. It took me some time to work out why I could run it on the database server but not from the Nagios server. But since you people at Percona are just great guys who produce “bash nagios plugins”, I was able to find out what happens.

pmp-check-mysql-file-privs cannot be run from a dedicated Nagios server to check file privileges on a REMOTE MySQL server, because it actually WILL read the config of the given (remote) MySQL server but then looks for the MySQL datadir LOCALLY. I have difficulty understanding that logic - reading a remote config but searching for the datadir locally. This means I can only use those plugins (the ones that need some system privileges) where Nagios AND MySQL are both running on the same machine, or via NRPE.
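
To illustrate the mismatch, here is a minimal sketch of the behaviour as I understand it. It is NOT the plugin's actual code: the function names, the host name "database", the credentials and the check for the owner "mysql" are assumptions made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical illustration only; not taken from pmp-check-mysql-file-privs.
MYSQL_BIN="${MYSQL_BIN:-mysql}"

# Step 1: ask the (possibly remote) MySQL server where its datadir is.
# -N suppresses the column header, -B selects plain batch output.
remote_datadir() {
  "$MYSQL_BIN" -h "$1" -u nagios -pXXX -N -B -e 'SELECT @@datadir'
}

# Step 2: inspect that path with find. This runs on the machine that
# executes the check, so on a dedicated Nagios server it looks at the
# wrong filesystem (or at a directory that does not exist at all).
wrong_owner_files() {
  find "$1" ! -user mysql 2>/dev/null
}

# Usage: wrong_owner_files "$(remote_datadir database)"
```

Step 1 is answered by the remote server, step 2 by the local one, which is why the check only makes sense when both are the same machine.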

So, assuming my observations are correct, I wonder if it would be possible for you guys to add a feature to all those plugins that lets me define a system user so that the checks are also run on the REMOTE system. Maybe the best way would be to provide arguments for an SSH account that can execute find and ps on the remote server (or even use the same user for MySQL as well as for Unix). This would be awesome, and that is my suggestion. Maybe there are others who have also fallen into this pitfall.

Otherwise I will try to modify your plugins myself, using an SSH connection to do what I want. So, once again, thank you for providing bash shell plugins - this is a great learning resource!

If there are any other possibilities to have that plugin run with almost no socket timeouts, I’d appreciate any answers and suggestions on that matter.

All the best wishes

Oops, and yes, thank you very much for your quick response!

Surely, I am open to other practical solutions because I cannot imagine that I am the only one who faces such a situation with NRPE. THX!

Sorry, Tom, I cannot follow you. Do you just point me to your docs?

I have already read them all, even multiple times - I also created my own short docs based on yours. I cannot see what information you were trying to give me. Do you mean a certain doc that I should read? As far as I know I have read them all, which in turn led me to this forum. But never mind, I have already started rewriting pmp-check-mysql-file-privs so that it does its job on the remote server (which is where I would expect such a tool to gather its data).

Nevertheless I am open to hearing from anyone who has also suffered from Nagios NRPE socket timeouts while innobackupex is running. (Timing the checks around the backup procedure is not a solution.)

I found a simple solution that works without modifying the script. Basically, one needs a Unix system account on the remote system that can “sudo” without a password (/etc/sudoers), plus an SSH key pair. Since “ssh” can open a real terminal on the remote machine, it is quite easy to call a script that lives on the remote server. So in case you want to work around NRPE, maybe my approach helps you out:

ssh -t -i "/path/to/.ssh/nagios_id_rsa" nagios@database 'sudo /usr/lib64/nagios/plugins/pmp-check-mysql-file-privs -H localhost -l nagios -p XXX'

[Execute SSH with a real tty (-t) and a specific identity file (-i), connect as the remote user on the remote server (nagios@database), and on THAT system call “sudo”, which runs the plugin script with all its arguments. Because we are already remote, “localhost” is interpreted on the remote system.]
Since I use an SSH key without a passphrase and since I allowed the system user “nagios” to “sudo” exactly that script without a password, this works without entering any password, so it can be used automatically in a script.

In the end I get this for example: WARN files with wrong ownership: /var/lib/mysql/test
Connection to database closed.

That last line (“Connection to database closed.”) is printed by ssh itself, not by the plugin, and should be filtered out of the one-liner because we do not want that message in our monitoring. Of course, one should really add some error handling. Consider this a proof of concept. Maybe this information can be included in your documentation so that you can be more detailed when writing “Executing those scripts is possible with or without NRPE but NRPE is the preferred way.”
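
The clean-up and error handling mentioned above could look roughly like this. It is only a sketch under assumptions: the function names, the environment variables and the "--check" switch are invented for the example, and the key path, host and plugin arguments are the placeholders from my one-liner. Mapping every exit status above 3 to UNKNOWN is my own choice, based on the usual Nagios convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and on ssh exiting with 255 when the connection itself fails.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the ssh one-liner; all names are assumptions.
# Assumed sudoers entry on the remote host:
#   nagios ALL=(root) NOPASSWD: /usr/lib64/nagios/plugins/pmp-check-mysql-file-privs
SSH_BIN="${SSH_BIN:-ssh}"
SSH_KEY="${SSH_KEY:-/path/to/.ssh/nagios_id_rsa}"
SSH_TARGET="${SSH_TARGET:-nagios@database}"
REMOTE_CMD="${REMOTE_CMD:-sudo /usr/lib64/nagios/plugins/pmp-check-mysql-file-privs -H localhost -l nagios -p XXX}"

# Remove the carriage returns a tty adds and ssh's closing message.
clean_output() {
  tr -d '\r' | grep -v '^Connection to .* closed\.$' || true
}

run_check() {
  local output status=0
  # Note: when there is no local terminal (as under Nagios), -t alone does
  # not allocate a tty; ssh needs -tt to force one.
  output="$("$SSH_BIN" -t -i "$SSH_KEY" "$SSH_TARGET" "$REMOTE_CMD" 2>&1)" || status=$?
  output="$(printf '%s\n' "$output" | clean_output)"
  # Anything above 3 is not a plugin status (ssh itself uses 255).
  if [ "$status" -gt 3 ]; then
    printf 'UNKNOWN ssh or remote command failed (exit %s): %s\n' "$status" "$output"
    return 3
  fi
  printf '%s\n' "$output"
  return "$status"
}

# Entry point: run the check only when called with --check.
if [ "${1:-}" = "--check" ]; then
  run_check
  exit $?
fi
```

Saved under a name of your choice and called with --check, the script prints only the plugin's own line and hands the plugin's exit status back to Nagios.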

Thank you for this challenge - once again I learned something. All the best wishes

[topic can be closed]