Kag,
I really think Ping Up/Down should be built into the product rather than requiring a Lua script and all this NAS modification with the Auto-Operators! "Robot Inactive" is ambiguous: it doesn't say whether the robot itself is the problem or whether the server being down is the root cause.
Your script looks similar to the one I got from someone at CA, but it's a bit different. Here is the full script I was given, except that I modified the message_add lines that append to the alarm text (I left the original lines in, but commented them out). It works on my UIM server.
-- Find inactive robots and ping them to see whether just the robot is down or the whole server.
-- The script assumes robot inactive alarms from the hub have been changed to major; this could also be handled by the script, of course:
-- just insert the following lines after line 26
-- a.level = 4
-- a.severity = "major"
-- Find inactive robot alarm(s)
al = alarm.list("message", "Robot % is inactive")
if al ~= nil then
   for i = 1, #al do
      -- Place current row al[i] into a (for readability)
      a = al[i]
      -- Print index, source, severity and message for troubleshooting
      printf("%02d %s %s %s", i, a.source, a.severity, a.message)
      -- Get the ip of the robot from the alarm
      ip_addr = a.source
      -- Print for troubleshooting
      print(ip_addr)
      -- Ping the ip
      ping_success = action.ping(ip_addr)
      if ping_success then
         -- Print the status for troubleshooting
         print("Ping success "..ip_addr)
         -- Edit the alarm message to assist ops
         -- message_add_OK = "but server responds to ping OK"
         message_add_OK = "- PING OK"
         a.message = a.message.." "..message_add_OK
         alarm.set(a)
      else
         -- Print the status for troubleshooting
         print("Ping fail "..ip_addr)
         -- Edit the alarm message to assist ops
         -- message_add_fail = "and no response to ping!"
         message_add_fail = "- PING Failure"
         a.message = a.message.." "..message_add_fail
         -- Change the severity to critical
         a.level = 5
         a.severity = "critical"
         alarm.set(a)
      end
   end
end
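A possible refinement, offered only as an untested sketch: the message editing can be pulled into a small helper that skips alarms already carrying a ping suffix, so the text is not appended twice if the same alarm is processed again. Whether that can actually happen depends on how alarm.list matches and how the profile triggers, which is an assumption here. The helper name annotate_alarm and the sample alarm data are hypothetical and not part of the NAS API; the block runs in a plain Lua interpreter without NAS.

```lua
-- Hypothetical helper, not part of the NAS API.
-- Appends the ping result to the alarm message unless it is already there.
local PING_OK = "- PING OK"
local PING_FAIL = "- PING Failure"

local function annotate_alarm(a, ping_success)
   -- Plain-text find (4th argument true) so "-" is not read as a pattern character
   if string.find(a.message, PING_OK, 1, true)
      or string.find(a.message, PING_FAIL, 1, true) then
      return a, false   -- already annotated, leave untouched
   end
   if ping_success then
      a.message = a.message .. " " .. PING_OK
   else
      a.message = a.message .. " " .. PING_FAIL
      -- Same escalation as the script above
      a.level = 5
      a.severity = "critical"
   end
   return a, true       -- changed, caller should save it
end

-- Stand-alone check with made-up alarm data (no NAS required)
local sample = {
   source = "192.0.2.10",
   level = 4,
   severity = "major",
   message = "Robot web01 is inactive",
}
local updated, changed = annotate_alarm(sample, false)
print(updated.message, updated.severity, changed)
```

Inside the NAS script, the loop body would then call annotate_alarm(a, action.ping(a.source)) and only call alarm.set(a) when the second return value is true. One trade-off of this sketch: an alarm already marked "- PING Failure" is not rewritten if a later ping succeeds.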
Either change the Robot Inactive alarms to Major, or add the two lines noted in the script's header comments so the script sets the severity itself.
You will also need an Auto-Operator profile to trigger the script when the alarms come in. Something like this:
Action Type: Script
Script: {Script name}
Action Mode: On Message Arrival
Severity Level: Major & Critical
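Message String: /(?i).*Robot[\w\d\s]*inactive.*/
(This regular expression catches the "Robot [Server Name] is inactive" alarms. The character class [\w\d\s] only covers letters, digits, underscores and whitespace, so server names containing a hyphen or a dot would need a wider class.)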
Let me know if this resolves your problem...
Regards,
Mike