DX Unified Infrastructure Management

  • 1.  Auto-Operator overdue age reset

    Posted Aug 15, 2016 04:28 PM

    I'm working on a couple of auto-operator rules and am having a hard time getting them to work the way I would expect.  Basically, I am finding some alerts stuck in a loop of being assigned and unassigned by the 2 profiles.  I thought I had the criteria set properly so this wouldn't happen, but I am obviously doing something wrong, as it is not working as intended.

     

    For the first profile ("clear assignment"), this is the matching criteria:

    • Any severity (except clear)
    • Message assigned to "create-ticket"

    The profile runs on an interval of 5 minutes and the action is to run a script.  Within the script I check the difference between the current time and the time the alert was assigned.  If it has been 60 seconds or more since it was assigned, the script clears the assignment:

    local al = alarm.get()
    local now = os.time()
    local assignedAt = al.assigned_at
    local diffSecs = now - assignedAt

    -- Clear the assignment once the alert has been assigned for 60 seconds or more
    if diffSecs >= 60 then
      action.assign("", al.nimid)
    end

     

    The second profile ("assign alert") has this matching criteria:

    • Severity level = critical
    • Message assigned to <none>

    This profile runs on an overdue age of 10 seconds and the action is Assign to: create-ticket.

     

    I have another process that listens to a queue for assigned messages and processes any alerts assigned to the create-ticket user.  If the processing is successful, the alert is then assigned to a different user.  The "clear assignment" profile is there to catch any alerts that are not successfully processed and unassign them so the alert is visible to the operators and they can process it manually.

     

    The problem I'm having is that the "clear assignment" profile seems to reset the overdue age, so alerts that are not properly processed get stuck in a loop of being unassigned and reassigned.  (On a related note, manually clearing the assignment in IM or USM does the same thing.)  Is there a way to clear the assignment without resetting the overdue age?  Or is there a better way to set up the profiles to accomplish the same end result?



  • 2.  Re: Auto-Operator overdue age reset

    Broadcom Employee
    Posted Aug 17, 2016 04:38 PM

    Hi,

     

    Can you please review this and make sure I understand what you are trying to do:

    1) Every 5 minutes, check ALL open alarms except those at clear level, and run the script provided.

    2) On an overdue age of 10 seconds, run a second AO that matches alarms with severity level critical and assigned to NONE, and assigns them to the create-ticket user.

     

    I would expect this to loop, because modifying an alarm resets the time on it.

    Currently there is no way around this.

    In this case I would expect the following to happen:

    Number 2 above executes.

    About 5 minutes later, number 1 executes.

    Then, 10 seconds later, number 2 executes again.
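
    The sequence above can be reproduced with a small stand-alone Lua model.  This is only a toy simulation, not NAS code: the `simulate` function, the alarm table and its `modified_at` field are made up for illustration, and it simply assumes (as stated above) that any change to an alarm restarts the clock the overdue age is measured from.

```lua
-- Toy model of the two Auto-Operator profiles; this is NOT NAS code.
-- Assumption: any modification of an alarm resets the time that the
-- overdue age is measured from.

local CLEAR_INTERVAL = 300  -- "clear assignment" runs every 5 minutes
local CLEAR_MIN_AGE  = 60   -- seconds an alert must have been assigned
local ASSIGN_OVERDUE = 10   -- overdue age of the "assign alert" profile

local function simulate(duration)
  -- Hypothetical alarm record: a critical alarm arriving at t = 0.
  local alarm = { assigned_to = nil, assigned_at = nil, modified_at = 0 }
  local events = {}

  for t = 1, duration do
    if alarm.assigned_to == nil and t - alarm.modified_at >= ASSIGN_OVERDUE then
      -- "assign alert": unassigned and overdue by 10 seconds.
      alarm.assigned_to = "create-ticket"
      alarm.assigned_at = t
      alarm.modified_at = t
      events[#events + 1] = { time = t, action = "assign" }
    elseif t % CLEAR_INTERVAL == 0
        and alarm.assigned_to == "create-ticket"
        and t - alarm.assigned_at >= CLEAR_MIN_AGE then
      -- "clear assignment": on each 5-minute tick, unassign stale alerts.
      alarm.assigned_to = nil
      alarm.assigned_at = nil
      alarm.modified_at = t  -- this reset is what restarts the cycle
      events[#events + 1] = { time = t, action = "clear" }
    end
  end
  return events
end

for _, e in ipairs(simulate(700)) do
  print(string.format("t=%ds %s", e.time, e.action))
end
```

    Under these assumptions the model assigns at 10 seconds, clears at 300, assigns again at 310, clears at 600, and so on, which matches the loop described in the original post.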

     

    I think you might be better served by changing the first AO from on interval to overdue 5 minutes.

    Instead of looping through ALL messages, as on interval does, this will only go through alarms that are overdue by 5 minutes.

     

    I think this should accomplish what you are looking to do.



  • 3.  Re: Auto-Operator overdue age reset

    Posted Aug 17, 2016 05:21 PM

    Gene_Howard, your understanding of my process is almost correct.  The only thing you were missing is that step 1 only matches alerts that are assigned to the create-ticket user.

     

    However, based on your point that modifying an alarm resets the time on it, I can definitely see why it will loop, so my clarification above is of no relevance.  Unfortunately, changing it to an overdue age won't work for me, as there are alerts at other severity levels that may be manually assigned to the create-ticket user well after 5 minutes.  For these alerts, I still want to be able to clear the assignment if it stays assigned to that user for an extended period of time.

     

    After some additional thought, I think I may be able to mark one of the user tag fields when the "clear assignment" profile takes action, and add an additional match criterion to the "assign alert" profile to ignore alerts with a value in the user tag field.  I still need to test this, but at the moment it's the best I'm able to think of.
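
    A rough, untested sketch of the marking idea is below.  The NAS calls are passed in as plain functions so the logic can be run outside the NAS.  The `set_field` helper, the `user_tag1` field name and the marker value are all assumptions for illustration and should be checked against the NAS Lua reference.  In the real setup the "ignore marked alerts" check would be a match criterion on the "assign alert" profile rather than script code, and that profile's severity criterion is left out here.

```lua
-- Sketch only; names marked "hypothetical" are not confirmed NAS API.

local STALE_AFTER  = 60                 -- seconds, as in the original script
local MARKER_FIELD = "user_tag1"        -- hypothetical field name
local MARKER_VALUE = "auto-unassigned"  -- hypothetical marker value

-- "clear assignment" side: mark the alert, then clear the assignment.
-- `assign` stands in for action.assign; `set_field` is a hypothetical
-- setter for a single alarm field.
local function clear_stale_assignment(al, now, assign, set_field)
  if al.assigned_at == nil or now - al.assigned_at < STALE_AFTER then
    return false
  end
  set_field(al.nimid, MARKER_FIELD, MARKER_VALUE)
  assign("", al.nimid)
  return true
end

-- "assign alert" side: the extra criterion, written as a predicate.
local function eligible_for_auto_assign(al)
  local tag = al[MARKER_FIELD]
  return al.assigned_to == nil and (tag == nil or tag == "")
end

-- Stand-ins for the NAS so the sketch runs on its own.
local store = {
  A1 = { nimid = "A1", assigned_to = "create-ticket", assigned_at = 1000 },
}

local function fake_assign(user, nimid)
  if user == "" then
    store[nimid].assigned_to = nil
  else
    store[nimid].assigned_to = user
  end
  store[nimid].assigned_at = nil
end

local function fake_set_field(nimid, field, value)
  store[nimid][field] = value
end

local cleared = clear_stale_assignment(store.A1, 1100, fake_assign, fake_set_field)
print(cleared, eligible_for_auto_assign(store.A1))
```

    With the stand-ins above, the alert is unassigned and marked, so the second check no longer considers it eligible and the loop is broken, while operators can still see the alert as unassigned.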



  • 4.  Re: Auto-Operator overdue age reset

    Broadcom Employee
    Posted Aug 18, 2016 09:18 AM

    Hi,

     

    Could you just add another AO for clearing at overdue 10 minutes as well?

    If not, then yes, I think you will need to add additional information to a custom or user field so that you know whether the alarm has been processed before.