DX Application Performance Management

APM - Meaningful data to SOI

  • 1.  APM - Meaningful data to SOI

    Posted Mar 01, 2013 07:46 AM
    Hello all,

    I currently have APM 9.1 reporting CEM Incidents to SOI 3.0 as alerts.

    My issue is how to make the APM data more meaningful, given that CEM Incidents remain open until an APM operator closes them, regardless of whether the problem has gone away. An example:

    07:00 First slow time defect.
    08:00 Slow time defects reach impact 1000 which triggers a moderate impact level on APM - in turn providing an alert to SOI. SOI treats this as a 'slight' health issue. Not so much of a problem.
    09:00 Defect impact then reaches 2000 which means Severe on APM - providing an alert update to SOI, changing SOI health to 'moderate'. Now, by default, SOI treats this as an outage - the start of a period of service unavailability.
    09:15 The defects now stop as the underlying problem goes away. APM still has an Open incident. SOI is still thinking the outage is ongoing.

    Eight hours later... the APM operator closes the CEM Incident. SOI (if you are at APM v9.1.5) now gets an alert close message and removes the alert, bringing the service up to normal health and considering that the end of the outage.

    However, I now have more than 8 hours of downtime in my SOI stats, reports and SLA calculations...
    Actual downtime should really be assessed as the time from the start of the outage to the time of the last defect, i.e. 09:00 to 09:15, just 15 minutes.
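
To make the gap concrete, here is a minimal sketch of the two downtime calculations, using the times from the example above. The calendar date is arbitrary, and the closure time is assumed to be exactly eight hours after the last defect; the `downtime` helper is purely illustrative and not part of APM or SOI.

```python
from datetime import datetime, timedelta


def downtime(outage_start: datetime, outage_end: datetime) -> timedelta:
    """Length of an outage window."""
    return outage_end - outage_start


# Arbitrary date; only the times of day come from the example.
day = datetime(2013, 3, 1)
outage_start = day.replace(hour=9, minute=0)       # impact reaches Severe in APM
last_defect = day.replace(hour=9, minute=15)       # defects stop
operator_close = last_defect + timedelta(hours=8)  # assumed manual closure time

# What SOI records today: the outage ends when the incident is closed.
reported = downtime(outage_start, operator_close)

# What the outage arguably was: it ends at the last defect.
actual = downtime(outage_start, last_defect)

print("reported:", reported)
print("actual:  ", actual)
```

Under these assumptions the reported outage is 8 hours 15 minutes, against 15 minutes measured to the last defect.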

    There is little that can be done in SOI to deal with this case, because SOI relies on the APM feed to tell it when CEM Incidents open and close.
    I can make sure SOI doesn't treat a CEM Incident as an outage until it reaches a 'Critical' level, but that doesn't fix the issue of waiting for APM to send a closure.

    Is there any feature or workaround in APM 9.1 to allow Age-out of Open (not Pending) Incidents?
    Are there any Auto-close facilities in APM that can be set by GUI configuration?
    Would an IT PAM Workflow be required to detect and call an API/web service to get APM to close or age-out an Incident?

    Help!


  • 2.  Re: APM - Meaningful data to SOI

    Posted Apr 17, 2015 10:42 AM

    Hello, does anybody have a reasonable solution for this problem?


    Mapping CEM incidents onto alerts in SOI is a bad idea. It does not allow service availability to be measured in SOI in a credible, automated way: until an operator manually closes the incident, the service is always shown as unavailable.


     

    One possible solution would be to turn off the alerts generated by CEM incidents and configure SNMP alerts in Introscope based on the Total Defect Ratio (%) metric of the desired transactions. Unfortunately, these alerts are not related to the business transaction CIs but to a single running-software CI (ca-catalyst_CEM_TESS Agent).



  • 3.  Re: APM - Meaningful data to SOI

    Posted Apr 17, 2015 10:47 AM

    Hello,

    I am having a similar problem. This integration is pointless from SOI's service level management perspective.



  • 4.  Re: APM - Meaningful data to SOI

    Broadcom Employee
    Posted Apr 17, 2015 10:51 AM

    You can automate the closure of CEM incidents using the CEMExportTool rule 20.

     

    I haven't used SOI in a long time, but if closing the incident is what you want, can you trigger a command-line tool in response to an action?



  • 5.  Re: APM - Meaningful data to SOI

    Broadcom Employee
    Posted Apr 17, 2015 11:22 AM

    Hi Bob,

     

    I experienced the same problem at one of my customers.

     

    The solution was to implement an auto-watcher that checks in CEM whether the last defect in any of the open incidents is older than X minutes (configurable). If so, we consider the problem gone and close the incident in CEM (and therefore in SOI), using CEMExportTool or a web service (I do not remember which).

     

    This auto-watcher consists of a simple Java tool executed periodically with cron or the Windows Task Scheduler (my case).
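
The watcher logic described above could be sketched roughly as follows. This is not the attached tool (which is written in Java); it is an illustrative Python sketch in which the `Incident` class, its fields, and the `close` callback are all hypothetical. How incidents are actually read from CEM and closed (CEMExportTool or a web service) is left to the caller, since the thread does not specify it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List


@dataclass
class Incident:
    """Hypothetical, simplified view of a CEM incident."""
    incident_id: str
    status: str            # e.g. "Open", "Pending", "Closed"
    last_defect: datetime  # timestamp of the most recent defect


def find_stale(incidents: Iterable[Incident], now: datetime,
               max_idle: timedelta) -> List[Incident]:
    """Return Open incidents whose last defect is older than max_idle."""
    return [
        inc for inc in incidents
        if inc.status == "Open" and now - inc.last_defect > max_idle
    ]


def close_stale(incidents: Iterable[Incident], now: datetime,
                max_idle: timedelta,
                close: Callable[[Incident], None]) -> List[str]:
    """Call close() on every stale incident and return the closed IDs.

    close() stands in for whatever mechanism really closes the
    incident in CEM; it is an assumption of this sketch.
    """
    stale = find_stale(incidents, now, max_idle)
    for inc in stale:
        close(inc)
    return [inc.incident_id for inc in stale]


if __name__ == "__main__":
    now = datetime(2015, 4, 17, 12, 0)
    sample = [
        Incident("INC-1", "Open", now - timedelta(minutes=45)),
        Incident("INC-2", "Open", now - timedelta(minutes=5)),
        Incident("INC-3", "Pending", now - timedelta(hours=3)),
    ]
    closed = close_stale(sample, now, timedelta(minutes=30),
                         close=lambda inc: print("closing", inc.incident_id))
    print(closed)
```

A script like this would be run on a schedule (cron or the Windows Task Scheduler), with `max_idle` playing the role of the configurable X minutes. Only Open incidents are considered, so Pending ones are left alone.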

     

    Let me know if you want me to share.

     

    Additionally, by using CEMExportTool and SOI Escalation Policies, we close Incidents in CEM when the related alert in the SOI console is closed by the operator.

     

    Regards



  • 6.  Re: APM - Meaningful data to SOI

    Posted Apr 21, 2015 03:18 AM

    Hi Jose,

    Could you please share your script here? It would help me, and probably others, a lot.



  • 7.  Re: APM - Meaningful data to SOI

    Posted Apr 22, 2015 09:24 AM

    Hi Jose, it would be nice if you'd share it here.



  • 8.  Re: APM - Meaningful data to SOI

    Posted Apr 23, 2015 10:27 AM

    Hi, it would be very useful. Could you please share your script here?



  • 9.  Re: APM - Meaningful data to SOI

    Posted Apr 24, 2015 06:00 AM

    This solution is great! Could you share it here please? Or do I need to write it on my own?



  • 10.  Re: APM - Meaningful data to SOI

    Broadcom Employee
    Posted May 27, 2015 07:21 PM

    Hi all,

     

    I am so sorry, but I missed your replies.

     

    Please find attached the tool and documentation to close the incidents in CEM when no defects have been registered in the last few minutes.

     

    Regards

    Attachment(s)



  • 11.  Re: APM - Meaningful data to SOI

    Broadcom Employee
    Posted May 27, 2015 07:57 PM

    Hi again,

     

    Now you can find the tool and documentation to close incidents in CEM when the associated SOI alert is closed in the console.

     

    Regards

    Attachment(s)



  • 12.  Re: APM - Meaningful data to SOI

    Broadcom Employee
    Posted May 28, 2015 05:15 PM

    Hi Jose,

     

    Can you also post the tools and docs in the Documents section? It would be best to create a short document describing the use case and solution, and attach the files.

     

    Ciao,

    Guenter