DX Unified Infrastructure Management

  • 1.  UIM: How to detect QOS not receiving data

    Broadcom Employee
    Posted Nov 08, 2016 08:58 AM

    Hello,

     

    What is your current approach to detecting and alarming on a QOS that has not received data for the last hour or the last few days?

     

    I have created a very simple bat file that checks a QOS via the REST API and returns "0" if data is present for the last hour, "1" otherwise. It can be wrapped in a script to monitor critical QOS that must be "alive" and receiving data all the time.
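A minimal sketch of such a wrapper, in Python. It assumes that checkqos.bat reports its "0"/"1" result as the process exit code (the post does not say whether it is an exit code or printed output), and the `stale_qos` helper and its `run` parameter are hypothetical names used only for illustration.

```python
import subprocess


def stale_qos(checks, script="checkqos.bat", run=subprocess.run):
    """Return the QOS names whose check did not come back as 0.

    checks: list of 6-tuples in the bat file's parameter order
            (umpserver, user, password, source, target, qos).
    run:    injectable for testing; defaults to subprocess.run.
    """
    stale = []
    for args in checks:
        # "cmd /c" is used so that the .bat file is run by the Windows shell.
        result = run(["cmd", "/c", script, *args])
        # ASSUMPTION: exit code 0 means data was found, anything else means no data.
        if result.returncode != 0:
            stale.append(args[-1])
    return stale
```

The list returned by `stale_qos` could then be fed to whatever alarm mechanism is already in place.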

     

    Sample of usage:

    C:\Users\falne02\Desktop>checkqos.bat falne02-ump <user> <password> uimdemo uimdemo QOS_CPU_USAGE

    Calling URL: http://falne02-ump/rest/qos/data/name/QOS_CPU_USAGE/uimdemo/uimdemo/lasthour/now/0

    QOS data found for last hour

     

    The bat file takes 6 parameters in this order:

    - umpserver

    - user

    - password

    - source

    - target

    - qos

    NOTE: the script requires curl to be in the PATH of the server it is executed from.
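For anyone who prefers not to depend on curl, the same check can be sketched in Python. This is not the bat file itself, only an approximation under a few assumptions: the URL path order is qos/source/target (the sample uses the same value for source and target, so the order cannot be confirmed from it), the endpoint accepts HTTP Basic authentication, and the response is JSON in which an empty result contains no non-empty list. The function names are hypothetical, and `has_samples` in particular should be adjusted to the payload your UMP actually returns.

```python
import base64
import json
import sys
import urllib.request
from urllib.parse import quote


def build_url(ump_server, source, target, qos):
    # Same shape as the "Calling URL" line in the sample run.
    # ASSUMPTION: path order is qos, then source, then target.
    parts = [quote(p, safe="") for p in (qos, source, target)]
    return "http://{}/rest/qos/data/name/{}/{}/{}/lasthour/now/0".format(
        ump_server, *parts)


def fetch(url, user, password):
    # ASSUMPTION: the REST endpoint accepts HTTP Basic authentication.
    token = base64.b64encode(
        "{}:{}".format(user, password).encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, headers={
        "Authorization": "Basic " + token,
        "Accept": "application/json",
    })
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def has_samples(body):
    # ASSUMPTION: a JSON response with data contains at least one non-empty list.
    try:
        stack = [json.loads(body)]
    except ValueError:
        return False
    while stack:
        node = stack.pop()
        if isinstance(node, list) and node:
            return True
        if isinstance(node, dict):
            stack.extend(node.values())
    return False


def check_qos(ump_server, user, password, source, target, qos, fetch=fetch):
    """Return 0 if data is present for the last hour, 1 otherwise."""
    url = build_url(ump_server, source, target, qos)
    print("Calling URL:", url)
    try:
        body = fetch(url, user, password)
    except OSError:
        # An unreachable server or HTTP error is treated as "no data" here.
        return 1
    return 0 if has_samples(body) else 1


if __name__ == "__main__":
    if len(sys.argv) == 7:
        # Same parameter order as the bat file.
        sys.exit(check_qos(*sys.argv[1:]))
    print("usage: checkqos.py umpserver user password source target qos")
```

Treating a connection failure as "no data" is a design choice: it keeps the 0/1 contract simple, at the cost of not telling an unreachable UMP apart from a silent QOS.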

     

    Thanks for any comments/feedback

    Nestor



  • 2.  Re: UIM: How to detect QOS not receiving data

    Posted Nov 08, 2016 09:19 AM

    Must this be executed on the UIM or the UMP server?



  • 3.  Re: UIM: How to detect QOS not receiving data

    Broadcom Employee
    Posted Nov 09, 2016 05:07 AM

    Hello,

    It does not matter, as long as you can reach the UMP server. Note that you need the curl utility in the PATH of the box from which you launch the tool.

    Nestor



  • 4.  Re: UIM: How to detect QOS not receiving data

    Posted Nov 09, 2016 07:41 AM

    I have a Lua script that runs the following SQL:

     

    SELECT MAX(q.source) AS source,
           -- age of the newest sample in seconds; 99999999 when no sample is found
           ISNULL(MIN(DATEDIFF(SECOND, d.sampletime, GETDATE()) - 21600 + rn.tz_offset), 99999999) AS age,
           MIN(r.user_tag_1) AS user_tag_1
    FROM S_QOS_DATA q
    LEFT JOIN S_QOS_SNAPSHOT d ON q.table_id = d.table_id
    LEFT JOIN CM_NIMBUS_ROBOT r ON q.source = r.robot
    LEFT JOIN RN_QOS_DATA_0012 rn ON q.table_id = rn.table_id AND d.sampletime = rn.sampletime
    WHERE q.qos = 'QOS_COMPUTER_UPTIME'
      AND r.is_hub = 1
    GROUP BY q.source
    ORDER BY age DESC

     

    I then create alarms based on the age.
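As a rough illustration of alarming on the age, here is a Python sketch rather than the Lua script itself. The threshold values and severity labels are hypothetical (the post does not say which ones are used); the only value taken from the query is 99999999, which it returns when no sample is found.

```python
# Sentinel the query returns through ISNULL(...) when there is no sample.
NO_DATA_AGE = 99999999

# HYPOTHETICAL thresholds in seconds, highest first; tune to your environment.
THRESHOLDS = [(3600, "critical"), (900, "major"), (300, "warning")]


def severity_for_age(age_seconds):
    """Map an age in seconds to a severity label, or None if it is fresh enough."""
    if age_seconds >= NO_DATA_AGE:
        return "no data"
    for limit, severity in THRESHOLDS:
        if age_seconds >= limit:
            return severity
    return None


def alarms(rows):
    """rows: iterable of (source, age, user_tag_1) as returned by the query."""
    result = []
    for source, age, _user_tag_1 in rows:
        severity = severity_for_age(age)
        if severity is not None:
            result.append((source, severity, age))
    return result
```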

     

    And because this is actually being used to detect hubs (is_hub=1) that aren't reporting as a way to determine how long their tunnels have been down, I'm saving off the age in another table so that when the data for a hub ages out of QOS_COMPUTER_UPTIME I still know when it was last heard from.
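The "save off the age" idea can be sketched as follows. This is only an illustration in Python with SQLite as a stand-in store; the table name `hub_last_seen`, its columns, and the helper functions are hypothetical and not the table described in the post. The point it shows is that a row is only updated while real data exists, so the last known time survives once the query starts returning the 99999999 sentinel.

```python
import sqlite3

# Sentinel the query returns when no sample is found for a hub.
NO_DATA_AGE = 99999999


def open_store(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hub_last_seen ("
        "source TEXT PRIMARY KEY, last_seen INTEGER NOT NULL)")
    return conn


def record_ages(conn, rows, now):
    """rows: iterable of (source, age_seconds); now: current time in epoch seconds."""
    for source, age in rows:
        if age >= NO_DATA_AGE:
            # Data has aged out: keep whatever was stored earlier.
            continue
        conn.execute(
            "INSERT OR REPLACE INTO hub_last_seen (source, last_seen) VALUES (?, ?)",
            (source, now - age))
    conn.commit()


def last_seen(conn, source):
    row = conn.execute(
        "SELECT last_seen FROM hub_last_seen WHERE source = ?",
        (source,)).fetchone()
    return row[0] if row else None
```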

     

    -Garin