
NAS - Scheduler not working sometimes

Question asked by Jean Gomes on Oct 10, 2017
Latest reply on Oct 11, 2017 by Jean Gomes

CA Folks,

 

We are facing an intermittent issue with the NAS probe in CA UIM: the Scheduler sometimes fails to apply its exclusion rules.

 

For example, we have the following rules:

 

rule 1:

 

rule 2:

 

Both rules are added in the following scheduler:

 

In short, no alarms should be raised for items whose hostname starts with FRQ or whose IP starts with 10.106 between 9:00 PM and 7:00 AM.
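
To make the expected behavior concrete, here is a minimal Python sketch of an overnight window check. It only illustrates what we expect from a 9:00 PM to 7:00 AM window that wraps past midnight; it is not how NAS evaluates its schedules internally, and the function name `in_window` and its defaults are hypothetical.

```python
from datetime import time

def in_window(t, start=time(21, 0), end=time(7, 0)):
    """Return True if t falls inside [start, end), allowing the window to wrap past midnight."""
    if start <= end:
        # Same-day window, e.g. 09:00 to 17:00.
        return start <= t < end
    # Overnight window, e.g. 21:00 to 07:00: late evening OR early morning.
    return t >= start or t < end

print(in_window(time(2, 15)))  # True
```

By this reading, an event at 02:15 is inside the window and should be excluded.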

 

However, we got an alarm last night.

 

 

HOSTNAME: FRQ_9119

IP: 10.106.101.110

alarm time origin: 02h15m

alarm id: JJ24247294-84254

dev_id = DAF0FD27DAFD4814434CE932952BAA636

probe: cisco_monitor

 

Here are the NAS logs at log level 5:

 

 

2h09m - First alarm, excluded by rule

Oct 10 02:09:05:614 [18564] nas: Device_Approver APPROVED:  dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'

Oct 10 02:09:05:614 [18564] nas: maint: entering inMaintenanceMode function

Oct 10 02:09:05:614 [18564] nas: maint: Entered getMaintenanceMode function

Oct 10 02:09:05:614 [18564] nas: maint: getMaintModeChecker passed

Oct 10 02:09:05:614 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed

Oct 10 02:09:05:614 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.

Oct 10 02:09:05:614 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5

 

2h15 – second event, not excluded by the rule, which generated an alarm

Oct 10 02:15:03:647 [43192] nas: dbsRun committed 1 requests. 0 remaining in queue...

Oct 10 02:15:05:062 [82536] nas: Scheduler rescheduled profile:'RITM0362702 - Citrix Lefosse Reboot EventID', next run Tue Oct 10 02:15, 2017

Oct 10 02:15:05:062 [82536] nas: Scheduler rescheduled profile:'RITM0358210 - Exclude Backup Spo49 e brsmcpr54', next run Tue Oct 10 02:15, 2017

Oct 10 02:15:05:062 [82536] nas: Scheduler rescheduled profile:'RITM0347960 - Janela de backup, SRVMBX01', next run Tue Oct 10 02:15, 2017

Oct 10 02:15:05:062 [82536] nas: Scheduler rescheduled profile:'RITM0182412 - Horario de Backup - BR1SAP', next run Tue Oct 10 02:15, 2017

Oct 10 02:15:05:062 [82536] nas: Scheduler rescheduled profile:'RITM0432408 - Exclude de alarmes Franquias', next run Tue Oct 10 02:15, 2017

Oct 10 02:15:05:154 [14916] nas: SqliteExecuteCallback: sqlite3_finalize returned:0

Oct 10 02:15:05:154 [14916] nas: SqliteExecuteCallback: sqlite3_finalize returned:0

Oct 10 02:15:05:155 [102280] nas: dbBeginTransaction actLogRun, OK - rc:0

Oct 10 02:15:05:199 [102280] nas: dbCommitTransaction actLogRun, OK - rc:0

Oct 10 02:15:05:284 [18564] nas: RREQUEST: hubpost <-10.55.249.10/48002  h=258 d=846

Oct 10 02:15:05:284 [18564] nas: Device_Approver APPROVED:  dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'

Oct 10 02:15:05:284 [18564] nas: maint: entering inMaintenanceMode function

Oct 10 02:15:05:284 [18564] nas: maint: Entered getMaintenanceMode function

Oct 10 02:15:05:284 [18564] nas: maint: getMaintModeChecker passed

Oct 10 02:15:05:284 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed

Oct 10 02:15:05:284 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.

Oct 10 02:15:05:284 [18564] nas: dbBeginTransaction subscriber, OK - rc:0

Oct 10 02:15:05:284 [18564] nas: SqliteExecuteCallback: sqlite3_finalize returned:0

Oct 10 02:15:05:286 [18564] nas: SREPLY: status = 0(OK) ->10.55.249.10/48002

Oct 10 02:15:05:291 [18564] nas: dbCommitTransaction subscriber, OK - rc:0

Oct 10 02:15:05:291 [18564] nas: pubCommitMonitor:  subscr_waiting:  'flushUncommitedAlarms'

Oct 10 02:15:05:291 [18564] nas: pubCommitMonitor:  subscr_released: 'flushUncommitedAlarms'

Oct 10 02:15:05:291 [18564] nas: RREQUEST: hubpost <-10.55.249.10/48002  h=258 d=948

 

2h21 – excluded by rule

Oct 10 02:21:05:498 [18564] nas: Device_Approver APPROVED:  dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'

Oct 10 02:21:05:498 [18564] nas: maint: entering inMaintenanceMode function

Oct 10 02:21:05:498 [18564] nas: maint: Entered getMaintenanceMode function

Oct 10 02:21:05:498 [18564] nas: maint: getMaintModeChecker passed

Oct 10 02:21:05:498 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed

Oct 10 02:21:05:498 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.

Oct 10 02:21:05:499 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5

Oct 10 02:21:05:499 [18564] nas: SREPLY: status = 0(OK) ->10.55.249.10/48002

 

2h27 – excluded by rule

Oct 10 02:27:05:568 [18564] nas: Device_Approver APPROVED:  dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'

Oct 10 02:27:05:568 [18564] nas: maint: entering inMaintenanceMode function

Oct 10 02:27:05:568 [18564] nas: maint: Entered getMaintenanceMode function

Oct 10 02:27:05:568 [18564] nas: maint: getMaintModeChecker passed

Oct 10 02:27:05:568 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed

Oct 10 02:27:05:568 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.

Oct 10 02:27:05:568 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5

Oct 10 02:27:05:568 [18564] nas: SREPLY: status = 0(OK) ->10.55.249.10/48002

Oct 10 02:27:05:580 [18564] nas: RREQUEST: hubpost <-10.55.249.10/48002  h=258 d=846

 

2h33 – excluded by rule

Oct 10 02:33:05:432 [18564] nas: Device_Approver APPROVED:  dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'

Oct 10 02:33:05:432 [18564] nas: maint: entering inMaintenanceMode function

Oct 10 02:33:05:432 [18564] nas: maint: Entered getMaintenanceMode function

Oct 10 02:33:05:432 [18564] nas: maint: getMaintModeChecker passed

Oct 10 02:33:05:432 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed

Oct 10 02:33:05:432 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.

Oct 10 02:33:05:432 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5

 

2h39 – last event, excluded by rule

Oct 10 02:39:07:109 [18564] nas: Device_Approver APPROVED:  dev_id: 'DAF0FD27DAFD4814434CE932952BAA636' from '/Logicalis-Infrastructure-Management/HS1B-Dia/br5oimsnmdia001/cisco_monitor'

Oct 10 02:39:07:109 [18564] nas: maint: entering inMaintenanceMode function

Oct 10 02:39:07:109 [18564] nas: maint: Entered getMaintenanceMode function

Oct 10 02:39:07:109 [18564] nas: maint: getMaintModeChecker passed

Oct 10 02:39:07:109 [18564] nas: maint: validateMaintenanceIntervalsIncludeTime passed

Oct 10 02:39:07:109 [18564] nas: maint: dev_id 'DAF0FD27DAFD4814434CE932952BAA636' from 'subscriber' '10.106.101.110' is NOT in maintenance.

Oct 10 02:39:07:109 [18564] nas: EXCLUDED BY RULE 'RITM0432408 - Exclude de alarmes Franquias, 21h - 7h _ 2' - msg:The SNMP Agent at '10.106.101.110' in group 'Franquias' is not responding. [FRQ_9119],src:10.106.101.110,sev:5

Oct 10 02:39:07:109 [18564] nas: SREPLY: status = 0(OK) ->10.55.249.10/48002

Oct 10 02:39:08:339 [111976] nas: ptNetIpToHost - getaddrinfo failed for HCSAP4AND-11
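
For anyone who wants to look for the same pattern in their own logs, below is a rough Python sketch that scans NAS log lines and reports events for a given dev_id that were approved but never followed by an EXCLUDED BY RULE line on the same thread. It assumes the line layout shown in the excerpts above (timestamp, thread id in brackets, `nas:` prefix); the function name `find_unexcluded` is hypothetical, and an event at the very end of a truncated log may be reported even if its exclusion line was simply cut off.

```python
import re

# Matches e.g. "Oct 10 02:15:05:284 [18564] nas: <message>"
LINE_RE = re.compile(r"^(\w{3} \d{1,2} \d{2}:\d{2}:\d{2}):\d{3} \[(\d+)\] nas: (.*)$")

def find_unexcluded(lines, dev_id):
    """Return timestamps of APPROVED events for dev_id that had no
    EXCLUDED BY RULE line before the same thread's next APPROVED event."""
    pending = {}  # thread id -> timestamp of the APPROVED event still awaiting exclusion
    missed = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        stamp, thread, msg = m.groups()
        if msg.startswith("Device_Approver APPROVED"):
            # A new event starts on this thread; the previous one was never excluded.
            if thread in pending:
                missed.append(pending.pop(thread))
            if dev_id in msg:
                pending[thread] = stamp
        elif msg.startswith("EXCLUDED BY RULE"):
            pending.pop(thread, None)
    # Events still open at end of input had no exclusion line either.
    missed.extend(pending.values())
    return missed
```

It can be called with the lines of a log file, for example `find_unexcluded(open("nas.log"), "DAF0FD27DAFD4814434CE932952BAA636")`, where `nas.log` is a placeholder path.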

 

Has anyone seen this scenario before?

 

Regards
