DX NetOps

Checking for OC-Server workload conditions

    Broadcom Employee
    Posted Jul 16, 2015 05:25 AM

    Dear all,

     

    One important component of a CA Spectrum installation is the OneClick server/service. It is important to know the workload condition of the OC-server and to recognize whether a workload condition may be affecting the service.

    By design, the OC-server is based on the Tomcat framework and runs Java code. You may see good improvements from reconfiguration: run the OneClick server without the option "java.compiler=NONE" for R9.3+.

     

    How to verify the current workload condition:

    A: Look at the "Longest queue update" time, listed under OC-weblogon / Administration / Debugging / Alarm Statistics

    B: Look at the "OC-Console"/"OC-Server" response test values, as described in "Ranking OC-Console/Workstation to OC-Server performance"

    C: Look at the "SSPerformance" application model on the MLS SpectroSERVER (Events tab) and find the OC-server performance events 0x4820002

     

     

    For A:

    The "Longest queue update" time is a maximum value recorded since the last OC-server startup and should stay below 20 seconds. Under a temporary massive alarm workload, processing times of 40 seconds or more may appear. This is not a problem as long as "Current update queue size" and "Current queue alarms" are at zero: it indicates that a workload condition occurred at some point and that update processing has since recovered. Longest queue update times well above 60,000 ms (60 seconds) are critical. Unfortunately, there is no "reset" option during OC-server uptime to reset and re-verify this timer.
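The rules for A can be written down as a small decision function. This is only a sketch: the function name, parameter names and the returned labels are made up for illustration, the three values have to be read manually from the Alarm Statistics page, and "well above 60 seconds" is simplified here to a plain "> 60,000 ms" check.

```python
# Hypothetical helper: classifies the three values read from
# OC-weblogon / Administration / Debugging / Alarm Statistics.
# Thresholds (20 s, 60 s) are the ones named in the post.

OK_LIMIT_MS = 20_000        # "should stay below 20 seconds"
CRITICAL_LIMIT_MS = 60_000  # "well above 60 seconds is critical" (simplified)


def classify_queue_state(longest_update_ms, update_queue_size, queue_alarms):
    """Return a rough label for the OC-server alarm queue condition."""
    if longest_update_ms > CRITICAL_LIMIT_MS:
        return "critical"
    if update_queue_size > 0 or queue_alarms > 0:
        # Updates are still queued right now, so the workload is ongoing.
        return "backlog"
    if longest_update_ms >= OK_LIMIT_MS:
        # A peak happened since startup, but both queues are back at zero.
        return "recovered"
    return "ok"


if __name__ == "__main__":
    print(classify_queue_state(45_000, 0, 0))
```

Because the timer cannot be reset while the OC-server is up, a "recovered" result only says that a peak occurred at some point since the last startup, not when.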

     

    For B:

    As described in the "Ranking document", you need to record the "common" values at regular intervals. That baseline lets you interpret the values you see once a workload condition is suspected.

     

    For C:

    By default, the OC-server connects to the MLS to retrieve the Landscape Map information covering all landscapes/SpectroSERVERs (we have sometimes seen a different configuration via the locServerName parameter in context.xml). The OC service uses this information to write performance event records to the SSPerformance application model. The performance events 0x4820002 carry counters that start at OC-server startup and increase at 1-minute intervals. Exporting these events and processing them, e.g. with Excel macros, shows the workload characteristic of the OC-server and whether it is affected by significant changes at the modeling or monitoring level. Find below weekly graphs for model and alarm updates per minute. The processing limits per minute depend on the number of active OC-Console logons, the OC-server hardware (here a multi-threaded, multi-core system), and the network performance between OC-server and OC-Console. Values > 10k/minute are critical and require specific analysis.

     

    diff_mdl_updates.PNG

     

     

    diff_alarm_updates.PNG
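As an alternative to Excel macros, the "diff" values behind such graphs could be computed with a few lines of Python. This is a minimal sketch under assumptions: it presumes you have already extracted (timestamp, counter) pairs from the exported 0x4820002 events yourself, since the export format is not shown here. The function names and the sample data are hypothetical; only the 10k/minute limit comes from the text above.

```python
from datetime import datetime

CRITICAL_PER_MINUTE = 10_000  # "> 10k/minute" from the post


def per_minute_rates(samples):
    """Turn cumulative counter samples into per-minute update rates.

    samples: list of (datetime, counter) tuples, sorted by time.
    Pairs where the counter drops are skipped, because the counters
    start again at OC-server startup.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        minutes = (t1 - t0).total_seconds() / 60.0
        if minutes <= 0 or c1 < c0:
            continue  # out-of-order sample or OC-server restart
        rates.append((t1, (c1 - c0) / minutes))
    return rates


def critical_intervals(rates, limit=CRITICAL_PER_MINUTE):
    """Return the intervals whose rate exceeds the given limit."""
    return [(t, r) for t, r in rates if r > limit]


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    samples = [
        (datetime(2015, 7, 16, 5, 0), 0),
        (datetime(2015, 7, 16, 5, 1), 1200),
        (datetime(2015, 7, 16, 5, 2), 13200),
        (datetime(2015, 7, 16, 5, 3), 100),  # counter restarted
    ]
    for when, rate in critical_intervals(per_minute_rates(samples)):
        print(when, rate)
```

Dividing by the actual time difference instead of assuming exactly one minute keeps the rates comparable if an event record is missing from the export.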

     

     

    Cheers, Joerg