Nagios HACMP Clustercheck Script

I wrote this script to check HACMP or PowerHA cluster states:

#!/bin/ksh

# Check HACMP Cluster via SNMP

# 1. cluster service state, OID .1.3.6.1.4.1.2.3.1.2.1.5.1.4 (up(2), down(4), unknown(8), notconfigured(256))
# 2. cluster state, OID .1.3.6.1.4.1.2.3.1.2.1.5.1.8 (unstable(16), error(64), stable(32), unknown(8), reconfig(128), notconfigured(256), notsynced(512))
# 3. cluster node state, OID .1.3.6.1.4.1.2.3.1.2.1.5.2.1.1.2 (up(2), down(4), joining(32), leaving(64))
# 4. cluster node name, OID .1.3.6.1.4.1.2.3.1.2.1.5.2.1.1.4 (name of node)

# Define variables
snmpget=/usr/bin/snmpget
clstr_service=.1.3.6.1.4.1.2.3.1.2.1.5.1.4.0
clstr_name=.1.3.6.1.4.1.2.3.1.2.1.5.1.2.0
clstr_state=.1.3.6.1.4.1.2.3.1.2.1.5.1.8.0
clstr_nodestate=.1.3.6.1.4.1.2.3.1.2.1.5.2.1.1.2
clstr_nodename=.1.3.6.1.4.1.2.3.1.2.1.5.2.1.1.4

# Define passed variables
check=$1
hostip=$2
com=$3
nodeid=$4

# Any incomplete call sets check to 'clstr', which matches no check below and prints the usage
if [ -z "$check" ]; then
    check='clstr'
else
    if [ -z "$hostip" ]; then
        echo "hostip missing, aborting\n\r"
        check='clstr'
    else
        if [ -z "$com" ]; then
            echo "community missing, aborting\n\r"
            check='clstr'
        fi
    fi
fi
case $check in
clstr_service)
    check_ret=`$snmpget -v1 -c "$com" "$hostip" $clstr_service | awk '{print $4}'`
    clstrname=`$snmpget -v1 -c "$com" "$hostip" $clstr_name | awk '{print $4}'`
    if [ "$check_ret" -eq '2' ]; then
        msg="Cluster service for $clstrname is up"
        ret=0
    elif [ "$check_ret" -eq '4' ]; then
        msg="Cluster service for $clstrname is down"
        ret=2
    elif [ "$check_ret" -eq '8' ]; then
        msg="Cluster service for $clstrname is unknown"
        ret=2
    elif [ "$check_ret" -eq '256' ]; then
        msg="Cluster service for $clstrname is not configured"
        ret=1
    elif [ "$clstrname" = "" ]; then
        # No cluster name came back, so the SNMP agent did not answer
        msg="cluster probably down"
        ret=1
    else
        msg="unknown returncode. check cluster configuration"
        ret=3
    fi
    echo "$msg"
    ;;
clstr_state)
    check_ret=`$snmpget -v1 -c "$com" "$hostip" $clstr_state | awk '{print $4}'`
    clstrname=`$snmpget -v1 -c "$com" "$hostip" $clstr_name | awk '{print $4}'`
    if [ "$check_ret" -eq '8' ]; then
        msg="Cluster state for cluster $clstrname is unknown"
        ret=2
    elif [ "$check_ret" -eq '16' ]; then
        msg="Cluster state for cluster $clstrname is unstable"
        ret=2
    elif [ "$check_ret" -eq '32' ]; then
        msg="Cluster state for cluster $clstrname is stable"
        ret=0
    elif [ "$check_ret" -eq '64' ]; then
        msg="Cluster state for cluster $clstrname is error"
        ret=2
    elif [ "$check_ret" -eq '128' ]; then
        msg="Cluster state for cluster $clstrname is reconfiguring"
        ret=1
    elif [ "$check_ret" -eq '256' ]; then
        msg="Cluster state for cluster $clstrname is not configured"
        ret=1
    elif [ "$check_ret" -eq '512' ]; then
        msg="Cluster state for cluster $clstrname is not synced"
        ret=2
    elif [ "$clstrname" = "" ]; then
        # No cluster name came back, so the SNMP agent did not answer
        msg="cluster probably down"
        ret=1
    else
        msg="unknown returncode. check cluster configuration"
        ret=3
    fi
    echo "$msg"
    ;;
clstr_nodestate)
    nodename=`$snmpget -v1 -c "$com" "$hostip" $clstr_nodename.$nodeid | awk '{print $4}'`
    check_ret=`$snmpget -v1 -c "$com" "$hostip" $clstr_nodestate.$nodeid | awk '{print $4}'`
    if [ "$check_ret" -eq '2' ]; then
        msg="Cluster state for node $nodename is up"
        ret=0
    elif [ "$check_ret" -eq '4' ]; then
        msg="Cluster state for node $nodename is down"
        ret=2
    elif [ "$check_ret" -eq '32' ]; then
        msg="Cluster state for node $nodename is joining"
        ret=1
    elif [ "$check_ret" -eq '64' ]; then
        msg="Cluster state for node $nodename is leaving"
        ret=1
    elif [ "$nodename" = "" ]; then
        # This branch never sets clstrname, so test the node name instead
        msg="cluster probably down"
        ret=1
    else
        msg="unknown returncode. check cluster configuration"
        ret=3
    fi
    echo "$msg"
    ;;
-h|*)
    echo "Usage: ./check_hacmp_snmp.sh [clstr_service|clstr_state|clstr_nodestate] <hostip> <community> <nodeid>"
    echo "Example:"
    echo " "
    echo "\t./check_hacmp_snmp.sh clstr_service <hostip> <community> "
    echo " "
    echo "\t \tCluster service is up"
    echo " "
    echo "\t./check_hacmp_snmp.sh clstr_state <hostip> <community>"
    echo " "
    echo "\t \tCluster state is stable"
    echo " "
    echo "\t./check_hacmp_snmp.sh clstr_nodestate <hostip> <community> <nodeid>"
    echo " "
    echo "\t \tCluster node state is up"
    echo " "
    ret=1
    ;;
esac
exit $ret

If the cluster does not answer, Nagios will set the return code to ‘error’. If the check returns a state the script does not recognise, the result is ‘unknown’.
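The mapping from cluster state codes to Nagios return codes can also be written as a single lookup. The following is only a minimal sketch, not part of the script above: the function name `state_to_nagios` is hypothetical, the state codes are the ones listed in the script's comments, and the return values assume the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Treating an empty code as "no answer" is likewise an assumption for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): translate an HACMP
# cluster state code into a message and a Nagios-style return status.
state_to_nagios() {
    code=$1
    case "$code" in
        32)  msg="Cluster state is stable";         ret=0 ;;
        128) msg="Cluster state is reconfiguring";  ret=1 ;;
        256) msg="Cluster state is not configured"; ret=1 ;;
        8)   msg="Cluster state is unknown";        ret=2 ;;
        16)  msg="Cluster state is unstable";       ret=2 ;;
        64)  msg="Cluster state is error";          ret=2 ;;
        512) msg="Cluster state is not synced";     ret=2 ;;
        "")  msg="cluster probably down";           ret=1 ;;  # assumption: empty code means no SNMP answer
        *)   msg="unknown returncode. check cluster configuration"; ret=3 ;;
    esac
    echo "$msg"
    return $ret
}
```

Calling `state_to_nagios 32` prints "Cluster state is stable" and returns 0. A plugin built this way would pass the value parsed from `snmpget` to the function and exit with its return status.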

