DX Application Performance Management

  • 1.  Wily Collectors missing in CA APM

    Posted May 02, 2017 08:35 AM

    Hello there, 

     

    Recently I found that all three Wily Collector nodes were disconnecting from the Wily MOM. Based on the EM log files, I suspected the Wily MOM was the cause, and connectivity was restored after I restarted it.

     

    However, the root cause remains unknown, and I am afraid the issue will happen again.

     

    I have checked the perflog, and all checks are green for the Wily MOM and the Collectors. I am out of ideas and would appreciate it if you could shed some light on which areas I should drill into to find the cause. Thanks.

     

    ****

     

    4/30/17 05:39:19.002 AM GMT [WARN] [Collector gsolmanpcp28.sportstar.com@6028] [Manager.Cluster] Waited 15000 ms But did not receive the response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_128.client_main:263 to service address Server.main:273 from thread Collector gsolmanpcp28.sportstar.com@6028 -- We will keep waiting and don't log further messages until we receive the reply or time out

    4/30/17 05:39:19.273 AM GMT [WARN] [Collector gsolmanpbp27.sportstar.com@6027] [Manager.Cluster] Waited 15000 ms But did not receive the response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_84.client_main:263 to service address Server.main:273 from thread Collector gsolmanpbp27.sportstar.com@6027 -- We will keep waiting and don't log further messages until we receive the reply or time out
    4/30/17 05:39:20.175 AM GMT [WARN] [Collector gsolmanpdp29.sportstar.com@6029] [Manager.Cluster] Waited 15000 ms But did not receive the response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_0.client_main:263 to service address Server.main:273 from thread Collector gsolmanpdp29.sportstar.com@6029 -- We will keep waiting and don't log further messages until we receive the reply or time out


    ...


    4/30/17 05:40:05.450 AM GMT [WARN] [pool-1-thread-1] [Manager.Cluster] [Collector gsolmanpdp29.sportstar.com@6029 is sending data slowly to the MOM and will be disconnected now. MOM will try to reconnect to the collector. [TSIndexDiffFromMom : 60 cycles,MaxAllowedTSIndexDiffFromMom :60,275 cycles ]
    4/30/17 05:40:05.455 AM GMT [WARN] [pool-1-thread-1] [Manager.Cluster] [Collector gsolmanpcp28.sportstar.com@6028 is sending data slowly to the MOM and will be disconnected now. MOM will try to reconnect to the collector. [TSIndexDiffFromMom : 60 cycles,MaxAllowedTSIndexDiffFromMom :61,454 cycles ]
    4/30/17 05:40:05.456 AM GMT [WARN] [pool-1-thread-1] [Manager.Cluster] [Collector gsolmanpbp27.sportstar.com@6027 is sending data slowly to the MOM and will be disconnected now. MOM will try to reconnect to the collector. [TSIndexDiffFromMom : 60 cycles,MaxAllowedTSIndexDiffFromMom :61,182 cycles ]
    4/30/17 05:40:19.002 AM GMT [WARN] [Collector gsolmanpcp28.sportstar.com@6028] [Manager.Cluster] Not waiting for response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_128.client_main:263 to service address Server.main:273 from thread Collector gsolmanpcp28.sportstar.com@6028
    4/30/17 05:40:19.003 AM GMT [ERROR] [Collector gsolmanpcp28.sportstar.com@6028] [Manager.Cluster] Caught exception trying to get the difference between MOM and this Collector's harvest time: Collector gsolmanpcp28.sportstar.com@6028: com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
    4/30/17 05:40:19.003 AM GMT [WARN] [Collector gsolmanpcp28.sportstar.com@6028] [Manager.Cluster] Lost contact with the Introscope Enterprise Manager at gsolmanpcp28.sportstar.com@6028
    4/30/17 05:40:19.273 AM GMT [WARN] [Collector gsolmanpbp27.sportstar.com@6027] [Manager.Cluster] Not waiting for response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_84.client_main:263 to service address Server.main:273 from thread Collector gsolmanpbp27.sportstar.com@6027
    4/30/17 05:40:19.273 AM GMT [ERROR] [Collector gsolmanpbp27.sportstar.com@6027] [Manager.Cluster] Caught exception trying to get the difference between MOM and this Collector's harvest time: Collector gsolmanpbp27.sportstar.com@6027: com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
    4/30/17 05:40:19.274 AM GMT [WARN] [Collector gsolmanpbp27.sportstar.com@6027] [Manager.Cluster] Lost contact with the Introscope Enterprise Manager at gsolmanpbp27.sportstar.com@6027
    4/30/17 05:40:20.175 AM GMT [WARN] [Collector gsolmanpdp29.sportstar.com@6029] [Manager.Cluster] Not waiting for response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_0.client_main:263 to service address Server.main:273 from thread Collector gsolmanpdp29.sportstar.com@6029
    4/30/17 05:40:20.176 AM GMT [ERROR] [Collector gsolmanpdp29.sportstar.com@6029] [Manager.Cluster] Caught exception trying to get the difference between MOM and this Collector's harvest time: Collector gsolmanpdp29.sportstar.com@6029: com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
    4/30/17 05:40:20.176 AM GMT [WARN] [Collector gsolmanpdp29.sportstar.com@6029] [Manager.Cluster] Lost contact with the Introscope Enterprise Manager at gsolmanpdp29.sportstar.com@6029
    4/30/17 05:41:42.421 AM GMT [INFO] [btpool0-2285] [Manager] Logout Admin
    4/30/17 05:42:01.196 AM GMT [INFO] [btpool0-2285] [Manager] Logout Admin

    ...


    4/30/17 05:42:20.699 AM GMT [WARN] [ClusterManager Async Executor] [Manager] Unable to update load balancing for collector "gsolmanpdp29.sportstar.com@6029"
    com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
    at com.wily.isengard.postoffice.Mailbox.sendMessage(Mailbox.java:238)
    at com.wily.isengard.messageprimitives.service.AAsyncMessageServiceClient.sendRequestAsync(AAsyncMessageServiceClient.java:113)
    at com.wily.isengard.messageprimitives.service.MessageServiceClient.sendRequest(MessageServiceClient.java:159)
    at com.wily.isengard.messageprimitives.service.MessageServiceClient.invoke(MessageServiceClient.java:356)
    at com.sun.proxy.$Proxy101.removeCollector(Unknown Source)
    at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancer.removeCollector(ClusteredLoadBalancer.java:733)
    at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancerBean.collectorsRemoved(ClusteredLoadBalancerBean.java:436)
    at com.wily.introscope.spec.server.beans.clusters.ClusterNotification.dataRemoved(ClusterNotification.java:28)
    at com.wily.isengard.ongoingquery.AbstractQueryServiceManager$NotifyRemoved.run(AbstractQueryServiceManager.java:438)
    at com.wily.isengard.ongoingquery.QueryServiceManager2$1.execute(QueryServiceManager2.java:46)
    at com.wily.isengard.ongoingquery.QueryServiceManager2.runNotification(QueryServiceManager2.java:85)
    at com.wily.isengard.ongoingquery.AbstractQueryServiceManager.stateRemoved(AbstractQueryServiceManager.java:231)
    at com.wily.introscope.server.beans.AOngoingQueriableBean.stateRemoved(AOngoingQueriableBean.java:89)
    at com.wily.introscope.server.beans.clusters.ClusterManager.collectorRemoved(ClusterManager.java:277)
    at com.wily.introscope.server.beans.clusters.ClusterManager.access$1(ClusterManager.java:261)
    at com.wily.introscope.server.beans.clusters.ClusterManager$CollectorRemovedCommand.run(ClusterManager.java:376)
    at com.wily.EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(QueuedExecutor.java:88)
    at java.lang.Thread.run(Thread.java:798)

    4/30/17 05:42:20.699 AM GMT [WARN] [ClusterManager Async Executor] [Manager] Unable to update load balancing for collector "gsolmanpbp27.sportstar.com@6027"
    com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
    at com.wily.isengard.postoffice.Mailbox.sendMessage(Mailbox.java:238)
    at com.wily.isengard.messageprimitives.service.AAsyncMessageServiceClient.sendRequestAsync(AAsyncMessageServiceClient.java:113)
    at com.wily.isengard.messageprimitives.service.MessageServiceClient.sendRequest(MessageServiceClient.java:159)
    at com.wily.isengard.messageprimitives.service.MessageServiceClient.invoke(MessageServiceClient.java:356)
    at com.sun.proxy.$Proxy101.removeCollector(Unknown Source)
    at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancer.removeCollector(ClusteredLoadBalancer.java:733)
    at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancerBean.collectorsRemoved(ClusteredLoadBalancerBean.java:436)
    at com.wily.introscope.spec.server.beans.clusters.ClusterNotification.dataRemoved(ClusterNotification.java:28)
    at com.wily.isengard.ongoingquery.AbstractQueryServiceManager$NotifyRemoved.run(AbstractQueryServiceManager.java:438)
    at com.wily.isengard.ongoingquery.QueryServiceManager2$1.execute(QueryServiceManager2.java:46)
    at com.wily.isengard.ongoingquery.QueryServiceManager2.runNotification(QueryServiceManager2.java:85)
    at com.wily.isengard.ongoingquery.AbstractQueryServiceManager.stateRemoved(AbstractQueryServiceManager.java:231)
    at com.wily.introscope.server.beans.AOngoingQueriableBean.stateRemoved(AOngoingQueriableBean.java:89)
    at com.wily.introscope.server.beans.clusters.ClusterManager.collectorRemoved(ClusterManager.java:277)
    at com.wily.introscope.server.beans.clusters.ClusterManager.access$1(ClusterManager.java:261)
    at com.wily.introscope.server.beans.clusters.ClusterManager$CollectorRemovedCommand.run(ClusterManager.java:376)
    at com.wily.EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(QueuedExecutor.java:88)
    at java.lang.Thread.run(Thread.java:798)

    4/30/17 05:42:20.730 AM GMT [WARN] [ClusterManager Async Executor] [Manager] Unable to update load balancing for collector "gsolmanpdp29.sportstar.com@6029"
    com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
    at com.wily.isengard.postoffice.Mailbox.sendMessage(Mailbox.java:238)
    at com.wily.isengard.messageprimitives.service.AAsyncMessageServiceClient.sendRequestAsync(AAsyncMessageServiceClient.java:113)
    at com.wily.isengard.messageprimitives.service.MessageServiceClient.sendRequest(MessageServiceClient.java:159)
    at com.wily.isengard.messageprimitives.service.MessageServiceClient.invoke(MessageServiceClient.java:356)
    at com.sun.proxy.$Proxy101.removeCollector(Unknown Source)
    at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancer.removeCollector(ClusteredLoadBalancer.java:733)
    at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancerBean.collectorsRemoved(ClusteredLoadBalancerBean.java:436)
    at com.wily.introscope.spec.server.beans.clusters.ClusterNotification.dataRemoved(ClusterNotification.java:28)
    at com.wily.isengard.ongoingquery.AbstractQueryServiceManager$NotifyRemoved.run(AbstractQueryServiceManager.java:438)
    at com.wily.isengard.ongoingquery.QueryServiceManager2$1.execute(QueryServiceManager2.java:46)
    at com.wily.isengard.ongoingquery.QueryServiceManager2.runNotification(QueryServiceManager2.java:85)
    at com.wily.isengard.ongoingquery.AbstractQueryServiceManager.stateRemoved(AbstractQueryServiceManager.java:231)
    at com.wily.introscope.server.beans.AOngoingQueriableBean.stateRemoved(AOngoingQueriableBean.java:89)
    at com.wily.introscope.server.beans.clusters.ClusterManager.collectorRemoved(ClusterManager.java:277)
    at com.wily.introscope.server.beans.clusters.ClusterManager.access$1(ClusterManager.java:261)
    at com.wily.introscope.server.beans.clusters.ClusterManager$CollectorRemovedCommand.run(ClusterManager.java:376)
    at com.wily.EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(QueuedExecutor.java:88)
    at java.lang.Thread.run(Thread.java:798)



  • 2.  Re: Wily Collectors missing in CA APM
    Best Answer

    Broadcom Employee
    Posted May 02, 2017 10:00 AM

    You can check the MOM logs for the following:

     

    1. [Manager.Cluster] Collector clock is too far skewed from MOM. Collector clock is skewed from MOM clock by XXXX ms. The maximum allowed skew is 3,000 ms. Please change the system clock on the collector EM.

     

    Where XXXX is the time noted in your own log.

     

    2. [Manager] Outgoing message queue is not moving

     

    3. [Manager] Outgoing message queue is moving slowly

     

    On the Collectors that are getting disconnected, check for:

    1. Outgoing message queue is not moving

    2. Outgoing message queue is moving slowly

     

    Also check the Collector perflogs to see whether the number of traces exceeds 500,000.

     

    Thanks,
    Matt
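
    The log checks listed in this reply can be sketched as a small Python script. This is only an illustration: the function name `scan_log` and the labels in `PATTERNS` are hypothetical, the regular expressions match nothing beyond the message fragments quoted above, and the sketch assumes the skew value may be printed with thousands separators (as in "3,000 ms").

```python
import re
from collections import Counter

# Message fragments quoted in the reply above. The labels on the left are
# hypothetical names used only in this sketch.
PATTERNS = {
    "clock_skew": re.compile(
        r"Collector clock is skewed from MOM clock by ([\d,]+) ms"
    ),
    "queue_stalled": re.compile(r"Outgoing message queue is not moving"),
    "queue_slow": re.compile(r"Outgoing message queue is moving slowly"),
}


def scan_log(lines):
    """Count matching log lines and collect reported clock-skew values in ms."""
    counts = Counter()
    skews_ms = []
    for line in lines:
        for label, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match is None:
                continue
            counts[label] += 1
            if label == "clock_skew":
                # Assumption: the value may contain thousands separators.
                skews_ms.append(int(match.group(1).replace(",", "")))
    return counts, skews_ms
```

    To use it, pass an open log file (or any iterable of lines) to `scan_log`, first for the MOM log and then for each disconnected Collector's log. Any skew value above the 3,000 ms limit quoted in the message, or a non-zero count for either queue message, is worth following up.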



  • 3.  Re: Wily Collectors missing in CA APM

    Broadcom Employee
    Posted May 02, 2017 10:01 AM

    Dear Brian:

    Our Knowledge Base has a document on that error:

    What does the following message mean? "Waited 120000 ms But did not receive the response for the message com.wily.isenga… 

     

    You can get the same message for CLW -- Seeing a Warning Message: "Waited 120000 ms But did not receive the response" while using CLW. 

     

    Please review and let us know if it was helpful.

     

    Thanks

    Hal German



  • 4.  Re: Wily Collectors missing in CA APM

    Posted May 02, 2017 10:03 AM

    Brian,

    Have you had a chance to check the Collector logs?

    How many agents do you have in your cluster?

    Which APM version are you running?

    How many Collectors do you have?

    There is an out-of-the-box Management Module called MOM Infra monitoring. Deploy that MM and enable alerting for Collector disconnection so that you are aware if a Collector disconnects again.

    Please add the following lines to the IntroscopeEnterpriseManager.properties file on the MOM and on all Collectors.

    A restart is required for these properties to take effect.

    ################################################################################
    transport.override.isengard.high.concurrency.pool.max.size=10
    transport.override.isengard.high.concurrency.pool.min.size=10
    transport.outgoingMessageQueueSize=6000
    ################################################################################
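
    Since the same three lines have to be present on the MOM and on every Collector, a quick consistency check may help. The following is a minimal Python sketch under stated assumptions: `parse_properties` and `check_properties` are hypothetical helper names, and the parser only handles plain `key=value` lines with `#` or `!` comments (no line continuations or `:` separators).

```python
# Values suggested in the reply above.
EXPECTED = {
    "transport.override.isengard.high.concurrency.pool.max.size": "10",
    "transport.override.isengard.high.concurrency.pool.min.size": "10",
    "transport.outgoingMessageQueueSize": "6000",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_properties(text, expected=EXPECTED):
    """Return {key: (found, wanted)} for keys that are missing or differ."""
    props = parse_properties(text)
    return {
        key: (props.get(key), wanted)
        for key, wanted in expected.items()
        if props.get(key) != wanted
    }
```

    Reading each EM's properties file and passing its text to `check_properties` returns an empty dict when all three values match; anything else lists the keys to fix before the restart.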
                             



  • 5.  Re: Wily Collectors missing in CA APM

    Broadcom Employee
    Posted May 02, 2017 10:20 AM

    Dear Brian:   

    Some suggestions have been posted. In addition, check out:

     

    http://www.ca.com/us/support/ca-support-online/product-content/knowledgebase-articles/tec604648.aspx

    -- APM Cluster Performance Health Check

    https://communities.ca.com/message/100045562#100045562

    -- APM Health Check Tips

    https://communities.ca.com/message/16546906#16546906

    -- CA Tuesday Tip: Top 10 common causes for EM performance issues - Checklist

    https://communities.ca.com/message/100642256#100642256

    -- CA Tuesday Tip:MARCH 2013 - Top 20 common causes for EM performance issues

     

    I have marked an answer as correct, but you are free to add further posts or questions.

    If none of these suggestions helps, please open a case. Once it is resolved, please post the solution here to help others.

     

    Thanks

    Hal German