This is regarding correlating multiple alarms into a single alarm.
I have a set of hosts added to monitoring in Nimsoft. At present, I am monitoring CPU, Disk and Memory related alarms. My requirement is that when all 3 of these alarms have been received from the same host, they are correlated into a single alarm for that host.
The solution I have thought of is not good in terms of performance and might even fail. It is described below.
1) Create 3 triggers, one each for CPU, Disk and Memory, generic across all hosts.
2) Create a Lua script (as shown below) that is called every AO interval (e.g. 5 minutes), so that it works on the alarms already present as well as on new alarms.
Note: I am only writing the logic here; the syntax might not be accurate.
function CorrelateAlerts ()
   -- fetch each trigger's alarm list once, outside the loops
   local all_CPU_alerts = trigger.list ("CPU_Alert")
   local all_Disk_alerts = trigger.list ("Disk_Alert")
   local all_Memory_alerts = trigger.list ("Memory_Alert")
   for i = 1,#all_CPU_alerts do
      -- extract hostname (string.match takes the subject string first, then the pattern)
      local hostname1 = string.match (all_CPU_alerts[i].message, "CPU Alert from ([^ ]*)")
      local messageid1 = all_CPU_alerts[i].nimid
      for j = 1,#all_Disk_alerts do
         -- extract hostname
         local hostname2 = string.match (all_Disk_alerts[j].message, "Disk Alert from ([^ ]*)")
         local messageid2 = all_Disk_alerts[j].nimid
         -- skip a non-matching host instead of breaking out of the whole loop
         if hostname1 ~= nil and hostname1 == hostname2 then
            for k = 1,#all_Memory_alerts do
               -- extract hostname
               local hostname3 = string.match (all_Memory_alerts[k].message, "Memory Alert from ([^ ]*)")
               local messageid3 = all_Memory_alerts[k].nimid
               if hostname2 == hostname3 then
                  nimbus.alarm (5, "Correlated alert from " .. hostname1 )
               end
            end
         end
      end
   end
end

if trigger.state ("CPU_Alert") and trigger.state ("Disk_Alert") and trigger.state ("Memory_Alert") then
   CorrelateAlerts ()
end
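One way the nested loops might be avoided is to make a single pass over each alarm list and count, per host, how many distinct alert types have been seen. The following is only a sketch in plain Lua, not tested inside NAS: hosts_with_all_alerts is a hypothetical helper name, the alarm lists are passed in as ordinary tables with a message field, and the message format ("<type> Alert from <host>") is assumed from the patterns used above.

```lua
-- Hypothetical helper (not part of the NAS API).
-- lists: table mapping an alert type ("CPU", "Disk", ...) to an array of
--        alarm tables, each with a 'message' field.
-- Returns a sorted array of hostnames that have at least one alarm of every type.
function hosts_with_all_alerts (lists)
   local seen = {}    -- hostname -> set of alert types already counted
   local count = {}   -- hostname -> number of distinct alert types
   local total = 0    -- number of alert types supplied
   for kind, alarms in pairs (lists) do
      total = total + 1
      -- assumes 'kind' contains no Lua pattern magic characters
      local pattern = "^" .. kind .. " Alert from ([^ ]+)"
      for _, alarm in ipairs (alarms) do
         local host = string.match (alarm.message, pattern)
         if host then
            seen[host] = seen[host] or {}
            if not seen[host][kind] then
               seen[host][kind] = true
               count[host] = (count[host] or 0) + 1
            end
         end
      end
   end
   local result = {}
   for host, n in pairs (count) do
      if n == total then
         result[#result + 1] = host
      end
   end
   table.sort (result)
   return result
end

-- Example with made-up alarms: only hostA has all three alert types.
local correlated = hosts_with_all_alerts ({
   CPU    = { { message = "CPU Alert from hostA" }, { message = "CPU Alert from hostB" } },
   Disk   = { { message = "Disk Alert from hostA" } },
   Memory = { { message = "Memory Alert from hostA" }, { message = "Memory Alert from hostB" } },
})
for _, host in ipairs (correlated) do
   print ("Correlated alert from " .. host)
end
```

In the actual script the three lists would presumably come from trigger.list, and nimbus.alarm would be called once for each returned host. The work then grows with the total number of alarms rather than with the product of the three list sizes.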