Monthly Archives: December 2011

Sending snmp traps from shell scripts

This is a short post describing a simple way to send SNMP traps from shell scripts using the net-snmp snmptrap program on Linux systems.
Create the shell script below. Make sure you have at least one trapsink line in your snmpd.conf; the script sends a trap to every host listed there.

#!/bin/sh
SNMPCOMMUNITY=public

snmptrap() {
  # Match on the first field so that commented-out trapsink lines are skipped
  for traphost in $(awk '$1 == "trapsink" {print $2}' /etc/snmp/snmpd.conf)
  do
    /usr/bin/snmptrap -v 2c -c "$SNMPCOMMUNITY" "$traphost" '' DISMAN-SCRIPT-MIB::smScriptResult smRunArgument s "$1" smRunResult s "$2"
  done
}

snmptrap "TestTrap" "This is a test trap"

This produces an SNMP trap on your management server that looks a bit like this:

2010-02-26 15:30 : DISMAN-SCRIPT-MIB:smScriptResult SNMP Trap
Received Time:2010-02-26 15:30:02
Source:192.168.1.1 (192.168.1.1)
Community:public
Variable Bindings
sysUpTime:= 160 days 6 hours 36 minutes 37,28 seconds (1384779728)
snmpTrapOID:= DISMAN-SCRIPT-MIB : smScriptResult (1.3.6.1.2.1.64.2.0.2)
smRunArgument:= TestTrap
smRunResult:= This is a test trap

Note: One of the difficulties I ran into when investigating this was finding a suitable generic MIB to use, so I ended up with the DISMAN-SCRIPT-MIB. If you know a better way to do this, feel free to post a comment.
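
To see which hosts the script would send to without actually sending a trap, the trapsink lookup can be tried on its own. This is only a minimal sketch: it assumes the usual "trapsink HOST [COMMUNITY [PORT]]" layout, and the function name list_trap_hosts and the sample addresses are made up for illustration. It matches on the first field, so commented-out lines are skipped.

```shell
#!/bin/sh
# Hypothetical helper: print the host of every active trapsink line
# in the config file given as $1. Assumes the directive is the first
# field on the line, so lines starting with '#' never match.
list_trap_hosts() {
  awk '$1 == "trapsink" { print $2 }' "$1"
}

# Sample config for a dry run (made-up addresses)
conf=$(mktemp)
cat > "$conf" <<'EOF'
# trapsink 10.0.0.99 public
trapsink 192.168.1.10 public
trap2sink 192.168.1.11 public
trapsink 192.168.1.12
EOF

# Prints the two active trapsink hosts, one per line
list_trap_hosts "$conf"
rm -f "$conf"
```

On a real system you would point it at /etc/snmp/snmpd.conf instead of the sample file.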

Monitoring VMware Data Recovery using Nagios

This is a modification of some code I found on the VMware forums.
I've added some if checks and Nagios return codes, plus a way to call the script from Nagios NRPE.
(Sorry, I don't remember where I got the original code, so I can't credit the author.)
The script checks the VMDR log /var/vmware/datarecovery/operations_log.utx and raises an alert if it finds errors.
It also alerts when no data has been written to the log for more than a day. I run VMDR every day so this suits me fine, but you may want to disable this check.
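
For readers new to Nagios plugins: a check reports its result through the exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN) plus one line of output, with optional performance data after a "|". Here is a minimal sketch of that convention. The helper name nagios_result is made up for illustration and is not part of the script below.

```shell
#!/bin/sh
# Hypothetical helper: print a Nagios-style status line and return
# the matching plugin exit code.
#   $1 = status word, $2 = message, $3 = performance data
nagios_result() {
  status=$1
  message=$2
  perfdata=$3
  case $status in
    OK)       code=0 ;;
    WARNING)  code=1 ;;
    CRITICAL) code=2 ;;
    *)        status=UNKNOWN; code=3 ;;
  esac
  echo "$status - $message | $perfdata"
  return "$code"
}

# Example: report a warning and show the exit code Nagios would see
rc=0
nagios_result WARNING "3 new VMDR errors exist" "errors=3" || rc=$?
echo "exit code: $rc"
```

Nagios only looks at the exit code to decide the state; the text is what ends up in the web interface and in notifications.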

  1. Log on to the VMDR appliance via SSH
  2. Create the file /usr/lib64/nagios/plugins/check_vmdr.sh containing this code:
    #!/bin/bash
    
    errorfile=/var/vmware/datarecovery/errors/$(date "+%Y-%m-%d_%H%M%S").txt
    
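    # Make sure the previous error list exists (needed on the first run)
    cat /dev/null >> /var/vmware/datarecovery/olderror.out
    # Pull the error lines out of the 16-bit little-endian log into the current error list
    strings -e l /var/vmware/datarecovery/operations_log.utx | grep -i "error" | awk -F"$" '{print substr($2,5, 30) substr($3, 9, 100) substr ($6, 17, 30) $8}' > /var/vmware/datarecovery/newerror.out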
    
    # Compare new list of all errors to old list of all errors
    if ! diff /var/vmware/datarecovery/olderror.out /var/vmware/datarecovery/newerror.out >/dev/null
    then
      # Keep only the lines diff marks as added, i.e. the new errors
      diff /var/vmware/datarecovery/olderror.out /var/vmware/datarecovery/newerror.out | grep "^>" > "$errorfile"
      err=$(grep -c "^>" "$errorfile")
      mv -f /var/vmware/datarecovery/newerror.out /var/vmware/datarecovery/olderror.out
    
      echo "WARNING - $err new VMDR errors exist. See log file $errorfile for more details. | errors=$err"
      exit 1
    else
      # No errors, but check that VMDR has run recently (today or yesterday)
    
      today=$(date "+%-m/%-d/%Y")
      yesterday=$(date "+%-m/%-d/%Y" --date="1 days ago")
      strings -e l /var/vmware/datarecovery/operations_log.utx | grep "Performing incremental back up" | egrep "${today}|${yesterday}" >/dev/null 2>&1
      if [ "$?" -ne "0" ]; then
        echo "CRITICAL - VMDR has not run recently (today or yesterday) | errors=1"
        exit 2
      else
        echo "OK - No VMDR errors | errors=0"
        exit 0
      fi
    fi
  3. Create the directory /var/vmware/datarecovery/errors
  4. Edit /etc/sudoers and add:
    nagios  ALL=NOPASSWD: /usr/lib64/nagios/plugins/check_vmdr.sh
    (don't forget to comment out the Defaults requiretty line; otherwise sudo refuses to run without a TTY and NRPE can't use it)
  5. The nagios-nrpe RPM needs to be installed on the appliance. It has some dependencies, but I installed it with --nodeps and it works just fine.
  6. Add this row to /etc/nagios/nrpe.cfg:
    command[check_vmdr]=sudo /usr/lib64/nagios/plugins/check_vmdr.sh
  7. I also copied the check_disk plugin to the appliance and added this row to nrpe.cfg to monitor disk usage:
    command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 10% -c 5% -p /SCSI*
  8. Start NRPE and enable it at boot: service nrpe start ; chkconfig nrpe on
  9. Done.
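
The core of the plugin is the comparison between the previous and the current list of error lines. To try that logic away from the appliance, here is a minimal sketch on made-up sample data. The function name new_error_lines and the file contents are assumptions for illustration, not taken from a real VMDR log.

```shell
#!/bin/sh
# Hypothetical helper: print the lines that diff reports as added in
# the new list ($2) compared with the old list ($1), with the leading
# "> " marker stripped.
new_error_lines() {
  diff "$1" "$2" | sed -n 's/^> //p'
}

# Made-up sample data: one known error, one new error
old=$(mktemp)
new=$(mktemp)
printf '%s\n' "12/1/2011 Error: task failed" > "$old"
printf '%s\n' "12/1/2011 Error: task failed" \
              "12/2/2011 Error: disk full" > "$new"

# Count the new error lines, then list them
count=$(new_error_lines "$old" "$new" | grep -c .)
echo "new errors: $count"
new_error_lines "$old" "$new"
rm -f "$old" "$new"
```

Once NRPE is running, the real check can be exercised from the Nagios server with something like check_nrpe -H <appliance> -c check_vmdr; where check_nrpe lives depends on your installation.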