Hi there,
We have the following issue:
- We use a process probe to gather information about the availability of key OS processes.
- We create SLAs based on that information.
- The process probe has a known feature where the probing frequency is not exactly what is configured, so every now and then the configured 60 s turns out to be 61 s, or just 60.1 s. An example from this morning: one sample is from 9:48:59 and the next from 9:50:00, which means we get a one-minute SLA breach at 9:49.
- The SLA calculation then decides that data is missing -> SLA breach.
- However, interval-based calculation has to stay in place, because when a host is down no data arrives at all, so we need to detect those missing samples.
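To make the 9:49 example concrete, here is a rough Python sketch of a strict per-interval check. It is only an illustration: the function name `missing_intervals` and the calendar date are made up, and the real SLA engine may well compute this differently.

```python
from datetime import datetime, timedelta

def missing_intervals(samples, start, end, interval=timedelta(seconds=60)):
    """Return the start times of intervals in [start, end) that contain no sample."""
    missing = []
    bucket = start
    while bucket < end:
        # An interval counts as covered only if a sample falls inside it.
        if not any(bucket <= s < bucket + interval for s in samples):
            missing.append(bucket)
        bucket += interval
    return missing

day = datetime(2024, 1, 1)  # arbitrary date (assumption), only the times matter
samples = [
    day.replace(hour=9, minute=48, second=59),
    day.replace(hour=9, minute=50, second=0),
]
gaps = missing_intervals(
    samples,
    start=day.replace(hour=9, minute=48),
    end=day.replace(hour=9, minute=51),
)
print([t.strftime("%H:%M") for t in gaps])  # prints ['09:49']
```

The samples are only 61 s apart, yet the 9:49 interval is reported as having no data at all.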
In addition:
- There seems to be no way to make the probe's timing precise; a fix is hopefully coming.
- Asynchronous SLA calculation cannot be used, because then a host could be down and the SLA would still show 100%.
- There seems to be no tuning option for the SLA calculation to add some "fuzziness", so that it would understand that data arriving very soon after the "required time" is still OK.
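What we have in mind with "fuzziness" is roughly the following. This is a Python sketch of the idea, not an existing option in the product: `breached_gaps` and the 5 s `tolerance` value are hypothetical.

```python
from datetime import datetime, timedelta

def breached_gaps(samples, interval=timedelta(seconds=60),
                  tolerance=timedelta(seconds=5)):
    """Return (previous, next) sample pairs spaced more than interval + tolerance apart."""
    ordered = sorted(samples)
    return [
        (a, b)
        for a, b in zip(ordered, ordered[1:])
        if b - a > interval + tolerance
    ]

day = datetime(2024, 1, 1)  # arbitrary date (assumption), only the times matter
samples = [
    day.replace(hour=9, minute=48, second=59),
    day.replace(hour=9, minute=50, second=0),   # 61 s later: within tolerance
    day.replace(hour=9, minute=55, second=0),   # 300 s later: real gap
]
breaches = breached_gaps(samples)
for a, b in breaches:
    print(a.time(), "->", b.time())  # prints 09:50:00 -> 09:55:00
```

With this kind of check a sample that is one second late is not a breach, while a host that stops sending data still is. The sketch only compares consecutive samples, so silence before the first or after the last sample would need a separate check.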
Have others had this same issue? Any ideas on how to overcome it?
Br
Teppo