DX Unified Infrastructure Management

  • 1.  NAS and Alarm-enrichment queue monitoring

    Posted Oct 24, 2017 09:07 AM

    Do we have any option to monitor the NAS and alarm_enrichment queue issue?



  • 2.  Re: NAS and Alarm-enrichment queue monitoring

    Posted Oct 24, 2017 11:28 AM

    Hi Martin!

     

    You can explore the CA UIM Hub Queue Statistics Probe:

     

    http://marketplace.ca.com/shop/uim/ca-uim-hub-queue-statistics-probe.html

     

    OR

     

    You can also use a small Lua script: QueueCheck LUA script v2.1 (or QueueCheck LUA script v2.2). This tool creates the alarms and QoS entries needed for alerting and follow-up (a sample list view is included).

     

    Another thing to do in parallel is to monitor the nas logs with the logmon probe and alarm on failure and timeout messages.
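    To illustrate the kind of matching such a log watch would perform, here is a minimal Python sketch. It is not logmon configuration; the regular expression and the sample log lines are hypothetical and should be replaced with the messages your nas log actually contains.

```python
import re

# Hypothetical pattern: matches "fail", "failed", "failure", "timeout",
# "time out", "timed out" (case-insensitive). Adjust to your real log text.
PATTERN = re.compile(r"\b(?:fail(?:ed|ure)?|timed?\s?out)\b", re.IGNORECASE)


def find_suspect_lines(lines):
    """Return the lines that mention a failure or a timeout."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


# Invented sample lines, for illustration only.
sample = [
    "Oct 24 11:20:01 nas: subscribed to queue",
    "Oct 24 11:21:14 nas: request timed out",
    "Oct 24 11:22:40 nas: connection failure, retrying",
]
hits = find_suspect_lines(sample)
for line in hits:
    print(line)
```

    The same expression could serve as a starting point for a logmon watcher rule, after checking it against real log output.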

     

    Regards,



  • 3.  Re: NAS and Alarm-enrichment queue monitoring

    Posted Oct 24, 2017 11:43 AM

    Another good practice is to monitor the disk space used and the number of files in the probes/hub/q/nas and probes/hub/q/alarm_enrichment directories. The Hub Queue Statistics Probe is definitely something to use in addition to this.
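    A minimal sketch of such a directory check in Python, assuming it runs on the hub itself. The thresholds and the function names are hypothetical; the directory paths are relative to the UIM installation directory, which varies by system.

```python
from pathlib import Path


def queue_dir_stats(path):
    """Return (file_count, total_bytes) for all files under path."""
    count = 0
    total = 0
    for p in Path(path).rglob("*"):
        try:
            if p.is_file():
                total += p.stat().st_size
                count += 1
        except OSError:
            # Queue files come and go; skip any that vanish mid-scan.
            continue
    return count, total


def check_queue_dir(path, max_files, max_bytes):
    """Return a list of problem descriptions (empty if within limits)."""
    count, total = queue_dir_stats(path)
    problems = []
    if count > max_files:
        problems.append(f"{path}: {count} files (limit {max_files})")
    if total > max_bytes:
        problems.append(f"{path}: {total} bytes (limit {max_bytes})")
    return problems


if __name__ == "__main__":
    # Hypothetical limits; tune them to what is normal on your hub.
    for queue_dir in ("probes/hub/q/nas", "probes/hub/q/alarm_enrichment"):
        for problem in check_queue_dir(queue_dir, max_files=50, max_bytes=500_000_000):
            print(problem)
```

    A steadily growing file count or size in either directory suggests the queue is not draining, which is the condition to alert on.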



  • 4.  Re: NAS and Alarm-enrichment queue monitoring

    Broadcom Employee
    Posted Oct 24, 2017 01:59 PM

    If you are on Microsoft SQL Server, you can monitor the NAS_TRANSACTION_LOG table in the database using a SQL Server Agent job.

     

    Open SQL Server Management Studio.

    Configure Database Mail as described in the following Microsoft article:

    Configure Database Mail | Microsoft Docs 

     

    Start the SQL Server Agent node at the bottom of the tree by right-clicking it and choosing Start. (Note that the confirmation window may open behind your Management Studio window.)

    After it starts, right click the node and choose Properties

    Click Alert System

    Check Enable mail profile

    Choose the Mail profile you configured above

    Click OK

     

    Expand SQL Server Agent

    Right click Operators

    Click New Operator

    Name: Alarm Failure

    Email name: (Your internal distribution list for getting these failure emails)

    Click OK

     

    Right click on Jobs and choose New Job

    Enter a name like "Alarm Count"

    Click Steps

    Click New

    Step name: Query

    Database: CA_UIM (or the name of your DB if different)

    Command: 

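    -- Look-back window in minutes (negative offset for DATEADD).
    DECLARE @minutes INT = -30;
    DECLARE @rows INT;

    -- No GROUP BY: the aggregate then always returns one row, so @rows is 0
    -- (not NULL) when nothing was logged in the window.
    SELECT @rows = COUNT(time)
    FROM NAS_TRANSACTION_LOG
    WHERE time > DATEADD(mi, @minutes, GETDATE());

    -- Severity 16 fails the job step, which triggers the e-mail notification.
    IF @rows = 0 BEGIN
    RAISERROR ('No alarms logged in the last 30 minutes', 16, 1);
    END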

     

    Click Schedules

    Click New

    Name: Alarm Count Schedule

    Occurs: Daily

    Occurs every:  30 minutes

    Click OK

     

    Click Notifications

    Check E-mail

    Choose Alarm Failure

     

    Click OK

     

    At this point, SQL Server will check every 30 minutes whether any alarms have been logged within the last 30 minutes, and e-mail the operator if none were.
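    One detail worth verifying when adapting the query: an aggregate without GROUP BY always returns exactly one row (a count of 0 on empty input), whereas an aggregate with GROUP BY returns no rows at all on empty input, which in T-SQL would leave the variable NULL so that a test for = 0 never fires. The sketch below demonstrates this standard SQL behaviour using Python's built-in sqlite3 as a stand-in for SQL Server; the table contents and the helper name are invented for illustration.

```python
import sqlite3


def count_recent(conn, cutoff, grouped):
    """Return the count the query yields, or None if no row comes back."""
    sql = "SELECT COUNT(time) FROM NAS_TRANSACTION_LOG WHERE time > ?"
    if grouped:
        sql += " GROUP BY substr(time, 1, 1)"
    row = conn.execute(sql, (cutoff,)).fetchone()
    return None if row is None else row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE NAS_TRANSACTION_LOG (time TEXT)")
# One old record only, so nothing falls inside the look-back window.
conn.execute("INSERT INTO NAS_TRANSACTION_LOG VALUES ('2017-10-24 09:00:00')")

cutoff = "2017-10-24 13:30:00"
ungrouped = count_recent(conn, cutoff, grouped=False)
grouped = count_recent(conn, cutoff, grouped=True)
print("without GROUP BY:", ungrouped)
print("with GROUP BY:", grouped)
```

    Testing the job once with the NAS deliberately idle (or against an empty look-back window) is a quick way to confirm the alert actually fires.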

     

    Bear in mind that the email is not very verbose, but you can modify the message above to include a little more information. This is intended as a starting point.

     

    Obviously, this is not officially supported by CA; it is intended only as a community-aided solution to a common problem.



  • 5.  Re: NAS and Alarm-enrichment queue monitoring

    Posted Oct 24, 2017 05:42 PM

    Hi,

     

    The method here is interesting, but I want to point out that it doesn't primarily monitor queues. It monitors whether alarms are going through nis_bridge properly. If the alarm_enrichment or nas queues have a problem, this check will of course be affected and you won't get any new records in the SQL database. However, the queues can be fine while a nis_bridge issue affects only the NAS_ALARMS and NAS_TRANSACTION_* tables. Also note that if you disable NAS transaction logging in the nas settings, you won't get records in the NAS_TRANSACTION_LOG table.

     

    However, if you use UMP, I recommend monitoring the NAS_ALARMS table to be sure it's moving.



  • 6.  Re: NAS and Alarm-enrichment queue monitoring

    Broadcom Employee
    Posted Oct 24, 2017 06:16 PM

    You're right, it doesn't monitor queues directly.  However, if one is monitoring alarm_enrichment/nas queues, then it's likely they want to be alerted of failures.  And if the alarm_enrichment/nas queues are not processing then any alerts generated based on the lack of processing will not be received.  So the above method provides an "out of band" method of alerting to alarms suddenly not being available any longer.

     

    Is it bulletproof?  No.  Again, you are correct that if nis_bridge synchronization fails or is disabled, this will generate alerts as well.  Ideally you would have these features enabled, and then you'd have alerting if your nis_bridge failed too.

     

    I recommend NAS_TRANSACTION_LOG over NAS_ALARMS because it is more comprehensive.  A clear alarm will not be represented in NAS_ALARMS, nor will an acknowledge, but you will find evidence of both in NAS_TRANSACTION_LOG.