CA Folks,
We are facing an issue with the NAS probe in CA UIM: sometimes the scheduler does not work as expected.
For example, we have the following rules:
rule 1:
rule 2:
Both rules are added to the following scheduler:
In short, we should not receive alarms from items whose hostname starts with FRQ or whose IP starts with 10.106 between 9:00 PM and 7:00 AM.
However, we got an alarm last night.
HOSTNAME: FRQ_9119
IP: 10.106.101.110
alarm origin time: 02h15m
alarm id: JJ24247294-84254
dev_id = DAF0FD27DAFD4814434CE932952BAA636
probe: cisco_monitor
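To make the expectation explicit, here is a minimal Python sketch of the check we expect the scheduler and the two rules to perform together. It is only an illustration of the intended behavior, not how NAS implements it. The function names `in_window` and `should_exclude` are hypothetical, and it assumes "IP 10.106" means any address starting with `10.106.` and that the window includes 9:00 PM but not 7:00 AM.

```python
from datetime import time

def in_window(t, start=time(21, 0), end=time(7, 0)):
    """Return True if t falls in [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= t < end
    # Overnight window: late evening OR early morning.
    return t >= start or t < end

def should_exclude(hostname, ip, t):
    """Hypothetical combination of rule 1 (hostname) and rule 2 (IP) with the schedule."""
    matches_rule = hostname.startswith("FRQ") or ip.startswith("10.106.")
    return matches_rule and in_window(t)

# The alarm reported above: FRQ_9119 / 10.106.101.110 at 02h15m.
alarm_should_be_excluded = should_exclude("FRQ_9119", "10.106.101.110", time(2, 15))
print(alarm_should_be_excluded)
```

Under these assumptions, the 02h15m alarm matches both rules and falls inside the window, so we expected it to be excluded.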
Below are the NAS logs at log level 5:
2h09m - First alarm, excluded by rule
Oct 10 02:09:05:614 [18564] nas: Device_Approver APPROVED: dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'
Oct 10 02:09:05:614 [18564] nas: maint: entering inMaintenanceMode function
Oct 10 02:09:05:614 [18564] nas: maint: Entered getMaintenanceMode function
Oct 10 02:09:05:614 [18564] nas: maint: getMaintModeChecker passed
Oct 10 02:09:05:614 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed
Oct 10 02:09:05:614 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.
Oct 10 02:09:05:614 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5
2h15 – second event, not excluded by the rule, so an alarm was generated
Oct 10 02:15:03:647 [43192] nas: dbsRun committed 1 requests. 0 remaining in queue...
Oct 10 02:15:05:062 [82536] nas: Scheduler rescheduled profile:'RITM0362702 - Citrix Lefosse Reboot EventID', next run Tue Oct 10 02:15, 2017
Oct 10 02:15:05:062 [82536] nas: Scheduler rescheduled profile:'RITM0358210 - Exclude Backup Spo49 e brsmcpr54', next run Tue Oct 10 02:15, 2017
Oct 10 02:15:05:062 [82536] nas: Scheduler rescheduled profile:'RITM0347960 - Janela de backup, SRVMBX01', next run Tue Oct 10 02:15, 2017
Oct 10 02:15:05:062 [82536] nas: Scheduler rescheduled profile:'RITM0182412 - Horario de Backup - BR1SAP', next run Tue Oct 10 02:15, 2017
Oct 10 02:15:05:062 [82536] nas: Scheduler rescheduled profile:'RITM0432408 - Exclude de alarmes Franquias', next run Tue Oct 10 02:15, 2017
Oct 10 02:15:05:154 [14916] nas: SqliteExecuteCallback: sqlite3_finalize returned:0
Oct 10 02:15:05:154 [14916] nas: SqliteExecuteCallback: sqlite3_finalize returned:0
Oct 10 02:15:05:155 [102280] nas: dbBeginTransaction actLogRun, OK - rc:0
Oct 10 02:15:05:199 [102280] nas: dbCommitTransaction actLogRun, OK - rc:0
Oct 10 02:15:05:284 [18564] nas: RREQUEST: hubpost <-10.55.249.10/48002 h=258 d=846
Oct 10 02:15:05:284 [18564] nas: Device_Approver APPROVED: dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'
Oct 10 02:15:05:284 [18564] nas: maint: entering inMaintenanceMode function
Oct 10 02:15:05:284 [18564] nas: maint: Entered getMaintenanceMode function
Oct 10 02:15:05:284 [18564] nas: maint: getMaintModeChecker passed
Oct 10 02:15:05:284 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed
Oct 10 02:15:05:284 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.
Oct 10 02:15:05:284 [18564] nas: dbBeginTransaction subscriber, OK - rc:0
Oct 10 02:15:05:284 [18564] nas: SqliteExecuteCallback: sqlite3_finalize returned:0
Oct 10 02:15:05:286 [18564] nas: SREPLY: status = 0(OK) ->10.55.249.10/48002
Oct 10 02:15:05:291 [18564] nas: dbCommitTransaction subscriber, OK - rc:0
Oct 10 02:15:05:291 [18564] nas: pubCommitMonitor: subscr_waiting: 'flushUncommitedAlarms'
Oct 10 02:15:05:291 [18564] nas: pubCommitMonitor: subscr_released: 'flushUncommitedAlarms'
Oct 10 02:15:05:291 [18564] nas: RREQUEST: hubpost <-10.55.249.10/48002 h=258 d=948
2h21 – excluded by rule
Oct 10 02:21:05:498 [18564] nas: Device_Approver APPROVED: dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'
Oct 10 02:21:05:498 [18564] nas: maint: entering inMaintenanceMode function
Oct 10 02:21:05:498 [18564] nas: maint: Entered getMaintenanceMode function
Oct 10 02:21:05:498 [18564] nas: maint: getMaintModeChecker passed
Oct 10 02:21:05:498 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed
Oct 10 02:21:05:498 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.
Oct 10 02:21:05:499 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5
Oct 10 02:21:05:499 [18564] nas: SREPLY: status = 0(OK) ->10.55.249.10/48002
2h27 – excluded by rule
Oct 10 02:27:05:568 [18564] nas: Device_Approver APPROVED: dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'
Oct 10 02:27:05:568 [18564] nas: maint: entering inMaintenanceMode function
Oct 10 02:27:05:568 [18564] nas: maint: Entered getMaintenanceMode function
Oct 10 02:27:05:568 [18564] nas: maint: getMaintModeChecker passed
Oct 10 02:27:05:568 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed
Oct 10 02:27:05:568 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.
Oct 10 02:27:05:568 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5
Oct 10 02:27:05:568 [18564] nas: SREPLY: status = 0(OK) ->10.55.249.10/48002
Oct 10 02:27:05:580 [18564] nas: RREQUEST: hubpost <-10.55.249.10/48002 h=258 d=846
2h33 – excluded by rule
Oct 10 02:33:05:432 [18564] nas: Device_Approver APPROVED: dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'
Oct 10 02:33:05:432 [18564] nas: maint: entering inMaintenanceMode function
Oct 10 02:33:05:432 [18564] nas: maint: Entered getMaintenanceMode function
Oct 10 02:33:05:432 [18564] nas: maint: getMaintModeChecker passed
Oct 10 02:33:05:432 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed
Oct 10 02:33:05:432 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.
Oct 10 02:33:05:432 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5
2h39 – last event, excluded by rule
Oct 10 02:39:07:109 [18564] nas: Device_Approver APPROVED: dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'
Oct 10 02:39:07:109 [18564] nas: maint: entering inMaintenanceMode function
Oct 10 02:39:07:109 [18564] nas: maint: Entered getMaintenanceMode function
Oct 10 02:39:07:109 [18564] nas: maint: getMaintModeChecker passed
Oct 10 02:39:07:109 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed
Oct 10 02:39:07:109 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.
Oct 10 02:39:07:109 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5
Oct 10 02:39:07:109 [18564] nas: SREPLY: status = 0(OK) ->10.55.249.10/48002
Oct 10 02:39:08:339 [111976] nas: ptNetIpToHost - getaddrinfo failed for HCSAP4AND-11
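In case it helps anyone checking their own nas.log for the same symptom, below is a rough Python sketch that lists the `Device_Approver APPROVED` events for a given dev_id that were not followed by an `EXCLUDED BY RULE` line. The function name `find_unexcluded` is hypothetical, and the sketch assumes that each NAS thread handles one message at a time, so the next line of interest on the same thread id belongs to the same event. That is an assumption based on the excerpts above, not documented NAS behavior.

```python
import re

# Matches lines such as: "Oct 10 02:09:05:614 [18564] nas: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \d{1,2} \d{2}:\d{2}:\d{2}:\d{3}) \[(?P<tid>\d+)\] nas: (?P<msg>.*)$"
)

def find_unexcluded(lines, dev_id):
    """Return timestamps of APPROVED events for dev_id with no EXCLUDED BY RULE
    line before the next APPROVED event on the same thread."""
    pending = {}  # thread id -> timestamp of an APPROVED event still awaiting exclusion
    missed = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        tid, msg = m["tid"], m["msg"]
        if msg.startswith("Device_Approver APPROVED"):
            # A new event on this thread closes the previous one.
            if tid in pending:
                missed.append(pending.pop(tid))
            if dev_id in msg:
                pending[tid] = m["ts"]
        elif msg.startswith("EXCLUDED BY RULE"):
            pending.pop(tid, None)
    missed.extend(pending.values())
    return missed

# A few lines copied from the excerpts above.
sample = """
Oct 10 02:09:05:614 [18564] nas: Device_Approver APPROVED: dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'
Oct 10 02:09:05:614 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5
Oct 10 02:15:05:284 [18564] nas: Device_Approver APPROVED: dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'
Oct 10 02:15:05:286 [18564] nas: SREPLY: status = 0(OK) ->10.55.249.10/48002
Oct 10 02:21:05:498 [18564] nas: Device_Approver APPROVED: dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'
Oct 10 02:21:05:499 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5
""".splitlines()

result = find_unexcluded(sample, "DAF0FD27DAFD4814434CE932952BAA636")
print(result)
```

On the sample lines this reports only the 02:15:05:284 event. In the full excerpt, that event arrives roughly 220 ms after the "Scheduler rescheduled profile:'RITM0432408 - Exclude de alarmes Franquias'" line at 02:15:05:062, which may or may not be related.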
Has anyone seen this scenario before?
Regards