I sometimes see a strange behavior: in IM a robot shows only the controller as 'green' and every other probe as 'red', even HDB and SPOOLER. In the 'serverlist of domain' the server still shows as OK.
Before this happens I get the 'Robot is inactive' alarm message, once.
This happened again last night. I checked the Windows event logs and found a network error:
e1kexpress, 11.02.2015 20:13:03, Event-ID: 27, Warning
Intel(R) 82567LM-3 Gigabit Network Connection
Network link has been disconnected.
The disconnect then showed up in the probe logs:
Feb 11 20:13:28:997  Controller: hub dkf7nmhub04(10.170.245.204) NO CONTACT (communication error)
Feb 11 20:13:07:999  spooler: nimSession - failed to connect session to 10.170.245.204:48001, error code 10065
- Other probes, such as CDM, HDB and a self-made Perl script probe, simply stopped logging and measuring.
When I noticed this, I triggered a robot restart in IM, and then everything came back:
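To line up the timestamps quoted above, here is a rough Python sketch. It only reads the quoted lines offline and is not meant as a monitoring probe. The year (2015) is taken from the Windows event because the probe logs do not carry one, and the regex and the function name `seconds_after_link_down` are my own assumptions about the log layout, not anything provided by the product.

```python
import re
from datetime import datetime

# Windows event log timestamp of the "Network link has been disconnected" warning.
LINK_DOWN = datetime(2015, 2, 11, 20, 13, 3)

# Probe log lines exactly as quoted in the post.
PROBE_LINES = [
    "Feb 11 20:13:28:997  Controller: hub dkf7nmhub04(10.170.245.204) NO CONTACT (communication error)",
    "Feb 11 20:13:07:999  spooler: nimSession - failed to connect session to 10.170.245.204:48001, error code 10065",
]

MONTHS = {m: i + 1 for i, m in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"])}

# Assumed layout: "Mon DD HH:MM:SS:mmm  probe: message"
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2}):(\d{3})\s+(\w+):")


def seconds_after_link_down(line, year=2015):
    """Return (probe name, seconds elapsed since the link-down event)."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("unrecognised log line: " + line)
    mon, day, hh, mm, ss, ms, probe = m.groups()
    stamp = datetime(year, MONTHS[mon], int(day),
                     int(hh), int(mm), int(ss), int(ms) * 1000)
    return probe.lower(), (stamp - LINK_DOWN).total_seconds()


if __name__ == "__main__":
    # Print the probes in the order they reacted to the disconnect.
    for probe, delay in sorted(map(seconds_after_link_down, PROBE_LINES),
                               key=lambda item: item[1]):
        print(f"{probe}: {delay:.3f} s after link down")
```

With the lines above, the spooler fails about 5 seconds after the link-down warning and the controller reports NO CONTACT about 26 seconds after it, assuming both logs use the same clock.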
- probes became green in IM
- measurements started
- logging continued.
In some previous cases the robot restart resulted in "full inactivity" (the robot really became inactive). The cause then turned out to be that the server was down.
I don't know if anyone has experienced this or something similar, but I'm curious why and how this could happen.
And how could I prevent or solve this automatically without adding another monitoring item (probe/script/etc.)?