Service Operations Insight

Tech Tip: CA SOI - Event / Alarm Storm in eHealth caused Alert counts to be out of sync in SOI

Sep 03, 2018 10:13 AM

Question:
The customer had an "event storm" last week, which left many events uncleared because the limit was reached in eHealth.

Can the systems be brought back into sync manually now, for example by a restart? What must be done to achieve this?
Answer:
Alarm/alert storms are known occurrences with products like Spectrum, UIM and eHealth. I'm aware of prevention measures you can take in UIM to detect and suppress potential alarm storms. For eHealth, I would suggest raising a new ticket with the eHealth team to ask for best practices or guides to prevent this from recurring.

SOI can handle a large number of alarms. However, if there is a flood of alerts within just a few seconds, the SOI Manager will go into a hung state and stop processing those alerts. We have an Idea open in the SOI Community for this scenario, and I would suggest voting for it here:

SOI Alarm functionality to detect Alarm Storms - https://communities.ca.com/ideas/235736086-soi-alarm-functionality-to-detect-alarm-storms 
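
To illustrate what this kind of storm detection could look like, here is a minimal Python sketch of a sliding-window rate check. It is not SOI, UIM or eHealth functionality. The class name StormDetector, the threshold and the window length are all hypothetical and chosen only for illustration.

```python
from collections import deque


class StormDetector:
    """Hypothetical helper: flags a storm when more than `threshold`
    alerts arrive within `window_seconds` of each other."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> bool:
        """Record one alert arrival (seconds); return True if the
        sliding window now holds more than `threshold` alerts."""
        self._timestamps.append(timestamp)
        # Drop arrivals that have fallen out of the sliding window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold


# Assumed example values: more than 3 alerts within 5 seconds is a storm.
detector = StormDetector(threshold=3, window_seconds=5.0)
arrivals = [0.0, 1.0, 2.0, 2.5, 30.0]
flags = [detector.record(t) for t in arrivals]
print(flags)
```

In this sketch the fourth alert trips the check, and the alert arriving much later does not, because the earlier arrivals have aged out of the window. A real implementation would need to decide what to do once a storm is flagged (for example, suppress or batch alerts), which is outside the scope of this tip.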

The solution to this problem would be to clear the duplicate or flood alerts from the eHealth side and then recycle (restart) the Connector services. After this, the alarms should be synchronized and the problem on the SOI side resolved.

 

Link to KB: CA SOI - Event / Alarm Storm in eHealth caused Ale - CA Knowledge  
