How to check for hung or corrupt EMs

Nov 4, 2009
Aug 3, 2010
I am looking into using an external monitoring program to verify the health of an EM. I currently use a cron script and would like to know how other people do this.

I want to be able to detect two conditions:

1) That an EM allows a login.
2) That data can be retrieved from an EM.

The current process I run through cron does the following:

A) Attempts to log in to an EM through a CLW.
B) The CLW extracts data from the EM to a file.
C) Logs out of the EM.
D) Starts another script that checks whether the original script is still running, which indicates that the EM login has hung.
E) Checks the output of the data extract to verify that data can be extracted.

This is all easily doable through cron, but the problem comes in when I have a third-party program (Tivoli ITMA) that needs to check whether the login has hung. Does anyone else do this, and how do you do it?

Sample cron snippets:

    cd /wily/moms/moma/lib
    /wily/moms/moma/jre/bin/java -Dhost=localhost -Dport=9999 -Duser=xx -Dpassword=yy -jar CLWorkstation.jar list agents matching \"Custom Metric Agent*\" > /logs/processcheck/colla.status.log

This verifies the login and writes a data extract to a file.

    # A CLW java process still running for this user means the login has hung,
    # so the test is for a non-zero process count.
    if [ `ps -aef|grep java|grep userxx|grep -v grep|wc -l` -gt 0 ]; then
        status="One or more CLW scripts are HUNG on $CollHost."
    fi

The above checks that the login hasn't hung and does a data extract.

Then tail the log file and look for "Exception", indicating that data cannot be extracted:

    if [ `tail /logs/processcheck/colla.status.log|grep "Exception"|grep -cv grep` = 1 ]; then