Hi all,
I am new to the Percona Toolkit, and have just downloaded the percona-monitoring-plugin for Nagios.
My slaves are replicating OK, but the replication-running check returns UNKNOWN for them. This is the relevant output from a good slave:
SHOW SLAVE STATUS;
Slave_IO_State: Waiting for master to send event
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Last_Errno: 0
Last_Error:
The logic in the 0.9 plugin seems to leave the exit status at its default of STATE_UNKNOWN=3 when both threads are running and there is no error, which by my reckoning should be OK.
I can’t imagine that this is a bug, so I presume I am missing the purpose of this check. Is it supposed to be dependent on some other check or something?
Thanks
Tom
–
# ########################################################################
# Set up constants, etc.
# ########################################################################
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
STATE_DEPENDENT=4
EXITSTATUS=$STATE_UNKNOWN
# ########################################################################
# Run the program.
# ########################################################################
main() {
   # Get options
   for o; do
      case "${o}" in
         -c)              shift; OPT_CRIT="${1}"; shift; ;;
         --defaults-file) shift; OPT_DEFT="${1}"; shift; ;;
         -H)              shift; OPT_HOST="${1}"; shift; ;;
         -l)              shift; OPT_USER="${1}"; shift; ;;
         -p)              shift; OPT_PASS="${1}"; shift; ;;
         -P)              shift; OPT_PORT="${1}"; shift; ;;
         -S)              shift; OPT_SOCK="${1}"; shift; ;;
         -w)              shift; OPT_WARN="${1}"; shift; ;;
         --version)       grep -A2 '^=head1 VERSION' "$0" | tail -n1; exit 0 ;;
         --help)          perl -00 -ne 'm/^ Usage:/ && print' "$0"; exit 0 ;;
         -*)              echo "Unknown option ${o}. Try --help."; exit 1; ;;
      esac
   done
   if [ -e '/etc/nagios/mysql.cnf' ]; then
      OPT_DEFT="${OPT_DEFT:-/etc/nagios/mysql.cnf}"
   fi
   # Get replication status into a temp file. TODO: move this into a subroutine
   # and test it.
   local TEMP=$(mktemp "/tmp/${0##*/}.XXXX") || exit $?
   trap 'rm -rf "${TEMP}" >/dev/null 2>&1' EXIT
   mysql_exec 'SHOW SLAVE STATUS\G' > "${TEMP}"
   if [ $? = 0 ]; then
      # SHOW SLAVE STATUS isn't an error if the server isn't a replica. The file
      # will be empty if that happens.
      if [ -s "${TEMP}" ]; then
         NOTE=$(awk '$1 ~ /_Running:|Last_Error:/{print substr($0, 1, 100)}' "${TEMP}")
         if grep 'Last_Error: .' "${TEMP}" >/dev/null 2>&1; then
            EXITSTATUS=$STATE_CRITICAL
            NOTE="CRIT $NOTE"
         elif grep '_Running: No' "${TEMP}" >/dev/null 2>&1; then
            if [ "${OPT_CRIT}" ]; then
               EXITSTATUS=$STATE_CRITICAL
               NOTE="CRIT $NOTE"
            else
               EXITSTATUS=$STATE_WARNING
               NOTE="WARN $NOTE"
            fi
         fi
      elif [ "${OPT_WARN}" ]; then
         # Empty file; not a replica, but that's not supposed to happen.
         NOTE="WARN This server is not configured as a replica."
         EXITSTATUS=$STATE_WARNING
      else
         # Empty file; not a replica.
         NOTE="OK This server is not configured as a replica."
         EXITSTATUS=$STATE_OK
      fi
   else
      EXITSTATUS=$STATE_UNKNOWN
      NOTE="UNK could not determine replication status"
   fi
   echo $NOTE
   exit $EXITSTATUS
}
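
For comparison with the plugin code above, here is a minimal sketch of the decision logic as I expected it to behave. It is not the plugin's code: the function name classify_replication and its second argument are my own invention, and the final OK branch is only my assumption about what a healthy replica should report. It reuses the same awk and grep patterns as the plugin, but reads the SHOW SLAVE STATUS\G output from a file passed as an argument.

```shell
#!/bin/sh
# Sketch only: classify_replication is a hypothetical helper, not part of
# the Percona plugin. It reads SHOW SLAVE STATUS\G output from the file
# named in $1, prints a note, and returns a Nagios-style exit code.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2

classify_replication() {
   status_file="$1"
   crit_on_stopped="$2"   # non-empty: a stopped thread is CRITICAL, not WARNING

   # Same filter as the plugin: keep the *_Running and Last_Error lines.
   note=$(awk '$1 ~ /_Running:|Last_Error:/{print substr($0, 1, 100)}' "$status_file")

   if grep 'Last_Error: .' "$status_file" >/dev/null 2>&1; then
      # Any text after "Last_Error: " means replication hit an error.
      echo "CRIT $note"
      return $STATE_CRITICAL
   elif grep '_Running: No' "$status_file" >/dev/null 2>&1; then
      # At least one replication thread is stopped.
      if [ -n "$crit_on_stopped" ]; then
         echo "CRIT $note"
         return $STATE_CRITICAL
      fi
      echo "WARN $note"
      return $STATE_WARNING
   fi

   # Assumed branch (absent from the 0.9 code quoted above): no error text
   # and no stopped thread, so report OK instead of falling through.
   echo "OK $note"
   return $STATE_OK
}
```

With the status output shown at the top of this post saved to a file, calling classify_replication on that file returns 0, whereas the quoted 0.9 logic never reassigns EXITSTATUS in that case and so exits with 3.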