Hello there,
I recently found that our three Wily collector nodes were disconnecting from the Wily MOM. Based on the EM log files, I suspect the issue originates on the MOM side, and connectivity was restored after I restarted the Wily MOM.
However, the root cause remains unknown, and I am afraid the issue will happen again.
I have checked the perflog, and all checks are green for the Wily MOM and the collectors. I am out of ideas and would appreciate it if you could shed some light on which areas I can drill into further to find the cause. Thanks.
****
4/30/17 05:39:19.002 AM GMT [WARN] [Collector gsolmanpcp28.sportstar.com@6028] [Manager.Cluster] Waited 15000 ms But did not receive the response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_128.client_main:263 to service address Server.main:273 from thread Collector gsolmanpcp28.sportstar.com@6028 -- We will keep waiting and don't log further messages until we receive the reply or time out
4/30/17 05:39:19.273 AM GMT [WARN] [Collector gsolmanpbp27.sportstar.com@6027] [Manager.Cluster] Waited 15000 ms But did not receive the response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_84.client_main:263 to service address Server.main:273 from thread Collector gsolmanpbp27.sportstar.com@6027 -- We will keep waiting and don't log further messages until we receive the reply or time out
4/30/17 05:39:20.175 AM GMT [WARN] [Collector gsolmanpdp29.sportstar.com@6029] [Manager.Cluster] Waited 15000 ms But did not receive the response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_0.client_main:263 to service address Server.main:273 from thread Collector gsolmanpdp29.sportstar.com@6029 -- We will keep waiting and don't log further messages until we receive the reply or time out
...
4/30/17 05:40:05.450 AM GMT [WARN] [pool-1-thread-1] [Manager.Cluster] [Collector gsolmanpdp29.sportstar.com@6029 is sending data slowly to the MOM and will be disconnected now. MOM will try to reconnect to the collector. [TSIndexDiffFromMom : 60 cycles,MaxAllowedTSIndexDiffFromMom :60,275 cycles ]
4/30/17 05:40:05.455 AM GMT [WARN] [pool-1-thread-1] [Manager.Cluster] [Collector gsolmanpcp28.sportstar.com@6028 is sending data slowly to the MOM and will be disconnected now. MOM will try to reconnect to the collector. [TSIndexDiffFromMom : 60 cycles,MaxAllowedTSIndexDiffFromMom :61,454 cycles ]
4/30/17 05:40:05.456 AM GMT [WARN] [pool-1-thread-1] [Manager.Cluster] [Collector gsolmanpbp27.sportstar.com@6027 is sending data slowly to the MOM and will be disconnected now. MOM will try to reconnect to the collector. [TSIndexDiffFromMom : 60 cycles,MaxAllowedTSIndexDiffFromMom :61,182 cycles ]
4/30/17 05:40:19.002 AM GMT [WARN] [Collector gsolmanpcp28.sportstar.com@6028] [Manager.Cluster] Not waiting for response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_128.client_main:263 to service address Server.main:273 from thread Collector gsolmanpcp28.sportstar.com@6028
4/30/17 05:40:19.003 AM GMT [ERROR] [Collector gsolmanpcp28.sportstar.com@6028] [Manager.Cluster] Caught exception trying to get the difference between MOM and this Collector's harvest time: Collector gsolmanpcp28.sportstar.com@6028: com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
4/30/17 05:40:19.003 AM GMT [WARN] [Collector gsolmanpcp28.sportstar.com@6028] [Manager.Cluster] Lost contact with the Introscope Enterprise Manager at gsolmanpcp28.sportstar.com@6028
4/30/17 05:40:19.273 AM GMT [WARN] [Collector gsolmanpbp27.sportstar.com@6027] [Manager.Cluster] Not waiting for response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_84.client_main:263 to service address Server.main:273 from thread Collector gsolmanpbp27.sportstar.com@6027
4/30/17 05:40:19.273 AM GMT [ERROR] [Collector gsolmanpbp27.sportstar.com@6027] [Manager.Cluster] Caught exception trying to get the difference between MOM and this Collector's harvest time: Collector gsolmanpbp27.sportstar.com@6027: com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
4/30/17 05:40:19.274 AM GMT [WARN] [Collector gsolmanpbp27.sportstar.com@6027] [Manager.Cluster] Lost contact with the Introscope Enterprise Manager at gsolmanpbp27.sportstar.com@6027
4/30/17 05:40:20.175 AM GMT [WARN] [Collector gsolmanpdp29.sportstar.com@6029] [Manager.Cluster] Not waiting for response for the message com.wily.isengard.messageprimitives.service.MessageServiceCallMessage: {com.wily.introscope.spec.server.beans.console.IConsoleService.ping, v1, []} from address Workstation_0.client_main:263 to service address Server.main:273 from thread Collector gsolmanpdp29.sportstar.com@6029
4/30/17 05:40:20.176 AM GMT [ERROR] [Collector gsolmanpdp29.sportstar.com@6029] [Manager.Cluster] Caught exception trying to get the difference between MOM and this Collector's harvest time: Collector gsolmanpdp29.sportstar.com@6029: com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
4/30/17 05:40:20.176 AM GMT [WARN] [Collector gsolmanpdp29.sportstar.com@6029] [Manager.Cluster] Lost contact with the Introscope Enterprise Manager at gsolmanpdp29.sportstar.com@6029
4/30/17 05:41:42.421 AM GMT [INFO] [btpool0-2285] [Manager] Logout Admin
4/30/17 05:42:01.196 AM GMT [INFO] [btpool0-2285] [Manager] Logout Admin
...
4/30/17 05:42:20.699 AM GMT [WARN] [ClusterManager Async Executor] [Manager] Unable to update load balancing for collector "gsolmanpdp29.sportstar.com@6029"
com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
at com.wily.isengard.postoffice.Mailbox.sendMessage(Mailbox.java:238)
at com.wily.isengard.messageprimitives.service.AAsyncMessageServiceClient.sendRequestAsync(AAsyncMessageServiceClient.java:113)
at com.wily.isengard.messageprimitives.service.MessageServiceClient.sendRequest(MessageServiceClient.java:159)
at com.wily.isengard.messageprimitives.service.MessageServiceClient.invoke(MessageServiceClient.java:356)
at com.sun.proxy.$Proxy101.removeCollector(Unknown Source)
at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancer.removeCollector(ClusteredLoadBalancer.java:733)
at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancerBean.collectorsRemoved(ClusteredLoadBalancerBean.java:436)
at com.wily.introscope.spec.server.beans.clusters.ClusterNotification.dataRemoved(ClusterNotification.java:28)
at com.wily.isengard.ongoingquery.AbstractQueryServiceManager$NotifyRemoved.run(AbstractQueryServiceManager.java:438)
at com.wily.isengard.ongoingquery.QueryServiceManager2$1.execute(QueryServiceManager2.java:46)
at com.wily.isengard.ongoingquery.QueryServiceManager2.runNotification(QueryServiceManager2.java:85)
at com.wily.isengard.ongoingquery.AbstractQueryServiceManager.stateRemoved(AbstractQueryServiceManager.java:231)
at com.wily.introscope.server.beans.AOngoingQueriableBean.stateRemoved(AOngoingQueriableBean.java:89)
at com.wily.introscope.server.beans.clusters.ClusterManager.collectorRemoved(ClusterManager.java:277)
at com.wily.introscope.server.beans.clusters.ClusterManager.access$1(ClusterManager.java:261)
at com.wily.introscope.server.beans.clusters.ClusterManager$CollectorRemovedCommand.run(ClusterManager.java:376)
at com.wily.EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(QueuedExecutor.java:88)
at java.lang.Thread.run(Thread.java:798)
4/30/17 05:42:20.699 AM GMT [WARN] [ClusterManager Async Executor] [Manager] Unable to update load balancing for collector "gsolmanpbp27.sportstar.com@6027"
com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
at com.wily.isengard.postoffice.Mailbox.sendMessage(Mailbox.java:238)
at com.wily.isengard.messageprimitives.service.AAsyncMessageServiceClient.sendRequestAsync(AAsyncMessageServiceClient.java:113)
at com.wily.isengard.messageprimitives.service.MessageServiceClient.sendRequest(MessageServiceClient.java:159)
at com.wily.isengard.messageprimitives.service.MessageServiceClient.invoke(MessageServiceClient.java:356)
at com.sun.proxy.$Proxy101.removeCollector(Unknown Source)
at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancer.removeCollector(ClusteredLoadBalancer.java:733)
at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancerBean.collectorsRemoved(ClusteredLoadBalancerBean.java:436)
at com.wily.introscope.spec.server.beans.clusters.ClusterNotification.dataRemoved(ClusterNotification.java:28)
at com.wily.isengard.ongoingquery.AbstractQueryServiceManager$NotifyRemoved.run(AbstractQueryServiceManager.java:438)
at com.wily.isengard.ongoingquery.QueryServiceManager2$1.execute(QueryServiceManager2.java:46)
at com.wily.isengard.ongoingquery.QueryServiceManager2.runNotification(QueryServiceManager2.java:85)
at com.wily.isengard.ongoingquery.AbstractQueryServiceManager.stateRemoved(AbstractQueryServiceManager.java:231)
at com.wily.introscope.server.beans.AOngoingQueriableBean.stateRemoved(AOngoingQueriableBean.java:89)
at com.wily.introscope.server.beans.clusters.ClusterManager.collectorRemoved(ClusterManager.java:277)
at com.wily.introscope.server.beans.clusters.ClusterManager.access$1(ClusterManager.java:261)
at com.wily.introscope.server.beans.clusters.ClusterManager$CollectorRemovedCommand.run(ClusterManager.java:376)
at com.wily.EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(QueuedExecutor.java:88)
at java.lang.Thread.run(Thread.java:798)
4/30/17 05:42:20.730 AM GMT [WARN] [ClusterManager Async Executor] [Manager] Unable to update load balancing for collector "gsolmanpdp29.sportstar.com@6029"
com.wily.isengard.message.MessageUndeliverableException: Outgoing mailbox is closed. Message cannot be sent
at com.wily.isengard.postoffice.Mailbox.sendMessage(Mailbox.java:238)
at com.wily.isengard.messageprimitives.service.AAsyncMessageServiceClient.sendRequestAsync(AAsyncMessageServiceClient.java:113)
at com.wily.isengard.messageprimitives.service.MessageServiceClient.sendRequest(MessageServiceClient.java:159)
at com.wily.isengard.messageprimitives.service.MessageServiceClient.invoke(MessageServiceClient.java:356)
at com.sun.proxy.$Proxy101.removeCollector(Unknown Source)
at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancer.removeCollector(ClusteredLoadBalancer.java:733)
at com.wily.introscope.server.beans.loadbalancer.ClusteredLoadBalancerBean.collectorsRemoved(ClusteredLoadBalancerBean.java:436)
at com.wily.introscope.spec.server.beans.clusters.ClusterNotification.dataRemoved(ClusterNotification.java:28)
at com.wily.isengard.ongoingquery.AbstractQueryServiceManager$NotifyRemoved.run(AbstractQueryServiceManager.java:438)
at com.wily.isengard.ongoingquery.QueryServiceManager2$1.execute(QueryServiceManager2.java:46)
at com.wily.isengard.ongoingquery.QueryServiceManager2.runNotification(QueryServiceManager2.java:85)
at com.wily.isengard.ongoingquery.AbstractQueryServiceManager.stateRemoved(AbstractQueryServiceManager.java:231)
at com.wily.introscope.server.beans.AOngoingQueriableBean.stateRemoved(AOngoingQueriableBean.java:89)
at com.wily.introscope.server.beans.clusters.ClusterManager.collectorRemoved(ClusterManager.java:277)
at com.wily.introscope.server.beans.clusters.ClusterManager.access$1(ClusterManager.java:261)
at com.wily.introscope.server.beans.clusters.ClusterManager$CollectorRemovedCommand.run(ClusterManager.java:376)
at com.wily.EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(QueuedExecutor.java:88)
at java.lang.Thread.run(Thread.java:798)
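To line up the events per collector, something like the following could be used to group the WARN and ERROR lines by collector. This is only a rough sketch: the function name `collector_timeline` and both regular expressions are my own assumptions based on the line layout in the excerpt above, not anything provided by Introscope.

```python
import re
from collections import defaultdict

# Header of an EM log line as it appears in the excerpt:
# timestamp, [LEVEL], [thread], [module], then the message.
LINE_RE = re.compile(
    r"^(?P<ts>\d{1,2}/\d{1,2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [AP]M \w+) "
    r"\[(?P<level>[A-Z]+)\] \[(?P<thread>[^\]]+)\] \[(?P<module>[^\]]+)\] "
    r"(?P<msg>.*)$"
)

# Collector identifier in the form host@port, optionally quoted.
COLLECTOR_RE = re.compile(r"[Cc]ollector \"?([\w.-]+@\d+)")


def collector_timeline(lines):
    """Group WARN/ERROR log lines by the collector they mention.

    Returns a dict mapping "host@port" to a list of
    (timestamp, level, message) tuples in file order.
    Stack-trace lines and INFO lines are skipped.
    """
    timeline = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match or match.group("level") not in ("WARN", "ERROR"):
            continue
        # The collector name may sit in the thread name or in the message.
        haystack = match.group("thread") + " " + match.group("msg")
        collector = COLLECTOR_RE.search(haystack)
        if collector:
            timeline[collector.group(1)].append(
                (match.group("ts"), match.group("level"), match.group("msg"))
            )
    return dict(timeline)
```

Feeding it the lines of the MOM log (for example `collector_timeline(open(path))`, where `path` is wherever the EM log lives) gives one ordered list per collector, which makes it easier to see whether all three collectors hit the ping timeout and the "sending data slowly" disconnect at the same moment.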