DX Application Performance Management

  • 1.  Get all ConnectionStatus

    Posted Dec 07, 2016 06:34 AM

    Hello,

     

    I'm trying to get the ConnectionStatus from all agents and alert if any of them is down.

     

    I have the following:

     

    But I'm not receiving values.

    Is this possible, or do I need to set up one alert for each ConnectionStatus?

     

    Thx,

    Fernando



  • 2.  Re: Get all ConnectionStatus
    Best Answer

    Broadcom Employee
    Posted Dec 07, 2016 06:53 AM

    Hi, I'm not sure about the definition of the metric grouping you are using, but this works for me:

     

     

    and the alert is defined like this:

     

     

    and I received the alert when the agent disconnected.

     

    But bear in mind that the connection status will only show agents that were connected and then disconnected. If an agent never connects in the first place, there will be no metric value and so no alert. Also, once the agent is unmounted, it will disappear from the grouping.

     

    thanks

    Mike



  • 3.  Re: Get all ConnectionStatus

    Posted Dec 07, 2016 07:49 AM

    Hi Mike,

     

    Thank you!

     

    Regards,

    Fernando



  • 4.  Re: Get all ConnectionStatus

    Broadcom Employee
    Posted Dec 07, 2016 08:06 AM

    You're welcome. I just noticed a stray "t" in the metric expression; it should be:

     

    Agents\|(.*)\|(.*)\|(.*):ConnectionStatus

     

    It was fine for me as all my agents were Tomcat.
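The corrected expression can be tried against a few sample paths to check what it captures. This is a minimal sketch in Python, assuming metric paths of the form Agents|host|process|agent:ConnectionStatus. The host, process and agent names are invented for illustration, and whether the product anchors the expression to the whole path is an assumption (the sketch uses a full match).

```python
import re

# Corrected metric expression from the post above.
PATTERN = re.compile(r"Agents\|(.*)\|(.*)\|(.*):ConnectionStatus")

# Hypothetical metric paths; host, process and agent names are invented.
paths = [
    "Agents|host01|Tomcat|Tomcat Agent:ConnectionStatus",
    "Agents|host02|WebSphere|WAS Agent:ConnectionStatus",
    "Agents|host02|WebSphere|WAS Agent:Some Other Metric",
]


def matching_agents(metric_paths):
    """Return (host, process, agent) for every path the expression matches."""
    found = []
    for path in metric_paths:
        # Assumption: the expression must match the whole path.
        m = PATTERN.fullmatch(path)
        if m:
            found.append(m.groups())
    return found


for host, process, agent in matching_agents(paths):
    print(host, process, agent)
```

Because each segment is a plain `(.*)`, the expression is not tied to any particular agent type, which is the point of the correction.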



  • 5.  Re: Get all ConnectionStatus

    Posted Dec 08, 2016 08:37 AM

    There have been several discussions on using agent connection status as an up/down check.

    One of the issues with this is the behavior of an agent during an Alert Downtime Schedule (ADS) combined with the agent unmount time.

     

    Agent Connection Status - Unmounted during ADS 

     

    To help get around this issue, I've started to use a Calculator for the number of EM Collectors.

     

    We have two alerts set up. The first is on the value of the connection status (1, 2, 3). The second is on a calculator that sums all of the collectors' connection statuses together and alerts if the total is not equal to 4 (in this environment we have four collectors).

    Additionally, the resolution is changed to 6 minutes and the trigger is each period, so we get another email message every 6 minutes.

    There is an ADS that covers any maintenance windows and has both alerts included within it. But as soon as the ADS expires, the calculator alert will fire, even if the agent's metric has been unmounted.
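The idea behind the summing calculator can be sketched as follows. This is only an illustration in Python, not the actual calculator: the collector names are hypothetical, each healthy collector is assumed to report 1, and a missing (unmounted) metric is modelled as None.

```python
EXPECTED_TOTAL = 4  # four collectors in the environment described above


def collector_total(statuses):
    """Sum the reported connection statuses.

    A value of None stands for "no data" (for example an unmounted
    metric) and contributes nothing to the sum.
    """
    return sum(v for v in statuses.values() if v is not None)


def should_alert(statuses, expected_total=EXPECTED_TOTAL):
    """Alert whenever the total differs from the expected healthy total."""
    return collector_total(statuses) != expected_total


# Hypothetical collector names; 1 is assumed to mean "connected".
healthy = {"col1": 1, "col2": 1, "col3": 1, "col4": 1}
one_down = {"col1": 1, "col2": 3, "col3": 1, "col4": 1}
one_unmounted = {"col1": 1, "col2": None, "col3": 1, "col4": 1}

for name, sample in [("healthy", healthy),
                     ("one_down", one_down),
                     ("one_unmounted", one_unmounted)]:
    print(name, collector_total(sample), should_alert(sample))
```

The useful property is that a missing value changes the total, so an unmounted metric still trips the check even though the metric itself has no data. A sum alone cannot tell every combination apart (for example 1 + 3 with two values missing also gives 4), so it complements the per-status alert rather than replacing it.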

     



  • 6.  Re: Get all ConnectionStatus

    Broadcom Employee
    Posted Dec 08, 2016 09:46 AM

    Thanks for the info, Billy. How are you handling the alerts when the MOM re-load-balances the agents?



  • 7.  Re: Get all ConnectionStatus

    Posted Dec 08, 2016 09:57 AM

     

    There are two settings we are using to help with the load balancing of agents. The first is to set the combination to "all". During a rebalance the connection status would be 3 on the original collector and 1 on the new collector, so not every matching metric is over the threshold.

     

    The second is to extend the periods over threshold from 1,1 to 8,8 (two minutes). Typically the load balancing takes less than a minute, but it could take a bit longer depending on how loaded the EM collector is.
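The effect of moving from 1,1 to 8,8 can be illustrated with a small sliding-window check. This is a rough sketch in Python, not the product's evaluation logic: it assumes a 15-second reporting interval (which is what makes 8 periods equal two minutes), that 1 means connected, and that the alert threshold is "status greater than 1".

```python
from collections import deque

INTERVAL_SECONDS = 15  # assumed reporting interval: 8 periods * 15 s = 120 s


def make_evaluator(periods_over, observed_periods, threshold=1):
    """Return a function that is fed one value per period.

    It reports True once at least `periods_over` of the last
    `observed_periods` values were above `threshold`.
    """
    window = deque(maxlen=observed_periods)

    def evaluate(value):
        window.append(value > threshold)
        return sum(window) >= periods_over

    return evaluate


# A rebalance lasting three periods (45 s), then connected again.
rebalance = [3, 3, 3, 1, 1, 1, 1, 1, 1, 1]

fast = make_evaluator(1, 1)  # original 1,1 setting
slow = make_evaluator(8, 8)  # extended 8,8 setting

fast_fired = [fast(v) for v in rebalance]
slow_fired = [slow(v) for v in rebalance]
print("1,1 fired:", any(fast_fired))
print("8,8 fired:", any(slow_fired))
```

With 1,1 a single period over the threshold is enough to fire, while 8,8 needs eight consecutive periods, so a short rebalance passes without a notification and a sustained disconnect is reported about two minutes later.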

     

    These settings would not prevent or catch the case where there is an ADS and the agent is stopped and not restarted. This is mainly due to two factors. The first is that the alert notification trigger is set to "Whenever Severity Changes".

     

    With the trigger set to "Whenever Severity Changes", the value would need to change at a time when the alert is not covered by an ADS.

     

    The second is the agent unmount period. If an agent goes down while it is within an ADS, and the ADS is longer than the unmount period, there is a good chance that the agent will already be unmounted by the time the ADS expires, so the agent connection data would be "no data". There isn't an alert setting to catch a "no data" condition.



  • 8.  Re: Get all ConnectionStatus

    Broadcom Employee
    Posted Dec 11, 2016 11:35 AM

    Be careful about using a Management Module to do this aggregation. Load balancing will cause false positives.

     

    Consider using the JavaScript calculator I posted a while ago, which collects these metrics and 'moves' them to the MOM virtual agent. Then you can create your groups and alerts.