Fake server activity

Discussion created by AttilaMolnar on Feb 12, 2015
Latest reply on Mar 7, 2015 by Garin

Hi,

 

I've occasionally seen some strange behavior. Sometimes a robot in IM shows only the controller as 'green' while everything else is 'red', even HDB and SPOOLER. In the 'serverlist of domain' the server still shows as OK.

Before this happens, I get the 'Robot is inactive' alarm message, but only once.

 

This happened again last night. I checked the Windows event logs and found a network error:

e1kexpress, 11.02.2015 20:13:03, Event-ID: 27, Warning
Intel(R) 82567LM-3 Gigabit Network Connection
Network link has been disconnected.

 

This event was reflected in the probe logs:

- controller.log:

Feb 11 20:13:28:997 [9152] Controller: hub dkf7nmhub04(10.170.245.204) NO CONTACT (communication error)

 

- spooler.log:

Feb 11 20:13:07:999 [4320] spooler: nimSession - failed to connect session to 10.170.245.204:48001, error code 10065

 

- Other probes, such as CDM, HDB, and a self-made Perl script probe, simply stopped logging and measuring.
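
For reference, the timestamps above can be lined up with a short script to see how quickly each probe reacted to the link loss. This is only a rough sketch: it assumes the Windows event log and the probe logs use the same clock and time zone, and that the probe log lines are from 2015 (they carry no year). The function name `parse_probe_ts` and the variable names are made up for illustration. Error code 10065 is the Windows Sockets code WSAEHOSTUNREACH (host unreachable), which fits a dropped link.

```python
from datetime import datetime

# Assumption: the probe log lines carry no year, so it is taken from the event-log entry.
YEAR = 2015


def parse_probe_ts(line: str) -> datetime:
    """Parse the 'Feb 11 20:13:28:997' prefix of a probe log line."""
    month, day, clock = line.split()[:3]
    # The last colon separates seconds from milliseconds.
    hms, millis = clock.rsplit(":", 1)
    ts = datetime.strptime(f"{YEAR} {month} {day} {hms}", "%Y %b %d %H:%M:%S")
    return ts.replace(microsecond=int(millis) * 1000)


# Windows event log entry: "Network link has been disconnected."
link_down = datetime.strptime("11.02.2015 20:13:03", "%d.%m.%Y %H:%M:%S")

# Log lines copied from the post.
probe_lines = {
    "spooler": "Feb 11 20:13:07:999 [4320] spooler: nimSession - failed to "
               "connect session to 10.170.245.204:48001, error code 10065",
    "controller": "Feb 11 20:13:28:997 [9152] Controller: hub "
                  "dkf7nmhub04(10.170.245.204) NO CONTACT (communication error)",
}

# Seconds between the link going down and each probe noticing it.
delays = {
    name: (parse_probe_ts(line) - link_down).total_seconds()
    for name, line in probe_lines.items()
}

for name, delay in sorted(delays.items(), key=lambda kv: kv[1]):
    print(f"{name}: {delay:.3f} s after link down")
```

With the values from the post, the spooler fails about 5 seconds after the link drops and the controller reports NO CONTACT about 26 seconds after it.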

 

When I noticed the problem, I triggered a robot restart in IM, and everything came back:

- probes became green in IM

- measurements resumed

- logging continued.

 

In some previous cases the robot restart resulted in "full inactivity" (the robot really became inactive). The cause then turned out to be that the server was down.

 

 

I don't know if anyone has experienced this or something similar, but I'm curious why and how this could happen.

And how could I prevent or resolve this automatically without adding another monitoring item (probe, script, etc.)?

 

 

BR,

Attila
