DX Unified Infrastructure Management

  • 1.  Monitoring a floating process

    Posted May 07, 2009 04:28 AM
    I am trying to set up monitoring on several processes that may float to any one of 6 Linux servers that are clustered together.  I need to trigger an alert if a process is not running on any of the 6.  It has been suggested that I use a NAS script, but I have very little experience writing these types of scripts.  Can someone please point me in the right direction?

    Thanks


  • 2.  Monitoring a floating process

    Posted May 07, 2009 11:25 AM
    You could definitely do this in a NAS script, but you might be able to do it with triggers, which is probably simpler if it does what you need.  Here is what I am picturing:
    • Create six triggers for the process down alarm, one for each host
    • Create an AO profile that fires when all six triggers are active
    • Have the profile create a new alarm that says the process is down on all hosts
    If you do not want to see the individual alarms in the alarm list, you should be able to create a preprocessing rule that makes them invisible when they arrive at the NAS.  They should still activate triggers even when they are invisible.
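
    To make the idea concrete, here is a rough sketch of the logic the triggers and the AO profile would implement together. It is plain Python, not NAS configuration or a NAS script, and the host names and class name are made up for illustration: each host has one "process down" trigger, and the combined alarm is raised only when all of them are active.

```python
from typing import Iterable


class FloatingProcessWatch:
    """Models one 'process down' trigger per host (hypothetical names)."""

    def __init__(self, hosts: Iterable[str]) -> None:
        self.hosts = frozenset(hosts)
        self.active = set()  # hosts whose "process down" trigger is active

    def on_alarm(self, host: str) -> bool:
        """Record a process-down alarm; return True if down on every host."""
        if host in self.hosts:
            self.active.add(host)
        return self.down_everywhere()

    def on_clear(self, host: str) -> bool:
        """Record a clear for a host; return True if still down everywhere."""
        self.active.discard(host)
        return self.down_everywhere()

    def down_everywhere(self) -> bool:
        # Equivalent of the AO profile condition: all triggers are active.
        return bool(self.hosts) and self.active == self.hosts


# Hypothetical cluster members; substitute the real host names.
HOSTS = ["node1", "node2", "node3", "node4", "node5", "node6"]

if __name__ == "__main__":
    watch = FloatingProcessWatch(HOSTS)
    for h in HOSTS[:5]:
        watch.on_alarm(h)  # process runs on node6 only: the normal state
    print("down everywhere?", watch.down_everywhere())
    watch.on_alarm("node6")  # process is now missing on the last host too
    print("down everywhere?", watch.down_everywhere())
```

    With five of six triggers active the check stays false, which is the normal state for a floating process; it only becomes true once the sixth host also reports the process as down.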

    Does that make sense?

    Regards,
    Keith


  • 3.  Monitoring a floating process

    Posted May 07, 2009 01:55 PM
    Michael,
    I would go with Keith's suggested use of 6 triggers (one per host/process) and an AO profile that evaluates them and creates a new alarm. I would add that, if you set the incoming alerts to invisible, I would set the visibility criteria on the trigger definition to "invisible" rather than the default "ignore". If other AO profiles have their visibility criteria set to "ignore" and might get triggered by the incoming process-related alerts (even though they are invisible), then their criteria should be set to "visible". From what I understand from the post, at any given time 5 triggers would be "True" when the situation is normal, and only when the process isn't running anywhere would there be 6 "True" conditions, so this should work. Give it a try and let us know if it doesn't fit the bill.
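
    As a rough illustration of the visibility criteria described above, the sketch below models my reading of the three settings. It is plain Python rather than anything from the NAS itself, and the function name is invented: "ignore" matches an alarm regardless of visibility, "visible" matches only visible alarms, and "invisible" matches only invisible ones.

```python
def trigger_matches(criteria: str, alarm_visible: bool) -> bool:
    """Return True if an alarm passes a visibility criteria setting.

    Assumed semantics, taken from the description in this thread:
      "ignore"    -> visibility is not checked
      "visible"   -> only visible alarms match
      "invisible" -> only invisible alarms match
    """
    if criteria == "ignore":
        return True
    if criteria == "visible":
        return alarm_visible
    if criteria == "invisible":
        return not alarm_visible
    raise ValueError(f"unknown visibility criteria: {criteria!r}")


if __name__ == "__main__":
    # An invisible process-down alarm checked against each setting.
    for setting in ("ignore", "visible", "invisible"):
        print(setting, trigger_matches(setting, alarm_visible=False))
```

    Under these assumptions, an invisible process alarm still matches anything left at "ignore", which is why unrelated profiles would need "visible" to stay out of the way.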


  • 4.  Monitoring a floating process

    Posted May 08, 2009 06:07 AM
    Thanks for the suggestions.  I will give it a try and let you know how it works out.

    Thanks

    Mike


  • 5.  Monitoring a floating process

    Posted May 12, 2009 05:45 AM
    The triggers and the AO profile seem to be working.  The alert created by the AO profile is only informational, though.  I have tried to set it to critical, but it always reverts to informational.

    Any way around this?

    Thanks

    Mike


  • 6.  Monitoring a floating process

    Posted May 14, 2009 03:13 AM
    Mike,

    Which version of the NAS are you using?  I opened a support case last year for NAS 3.12 for this problem or one very similar.  It looks like I confirmed it was fixed in 3.14.  Of course, even 3.14 is a bit old already, so you would want to upgrade to something newer.  We were using 3.14 until a few weeks ago, and then we upgraded to 3.23 (current beta).  If you are on robot 2.x, NAS 3.19 would be the equivalent version you could use (also beta).

    -Keith


  • 7.  Monitoring a floating process

    Posted May 21, 2009 06:03 AM
    Keith,

    I'm still on 3.12, but I think the triggers and associated AO profiles are working well.  They need to be refined a little, but I'm working on it.  I should upgrade to 3.14 anyway.

    Thanks

    Mike


  • 8.  Monitoring a floating process

    Posted May 21, 2009 01:47 PM
    Mike,

    If I recall correctly, the AO profiles might have worked okay if I changed the severity in the config file.  I am not 100% sure about that, but I think the problem with the severity setting came from the GUI and not the probe itself...

    -Keith