DX Application Performance Management

  • 1.  CPU Metrics - Process and Aggregate Values

    Posted Aug 03, 2015 05:43 AM

    Hi,

    I have attached a screenshot of CPU (%process) and CPU (%aggregate). According to the Wily docs, CPU (%process) describes the percentage of CPU the JVM is using, and CPU (%aggregate) describes the total load on the CPU, including JVM and non-JVM processes.

    We have stopped all the Java-related processes, so the CPU process value is down now. But the CPU aggregate value is still high. How can I find out which processes or services are eating up the CPU? I checked using the top command, but it shows a normal, low value.

    Also, which value should I use, CPU process or CPU aggregate, to show the health of our services, which are mostly built in Java?

     

    I am unable to explain to the customer why the CPU metrics show different stats, as I am confused myself.

    Please help.

    Attachment: CPU Metrics.docx (128 KB)


  • 2.  Re: CPU Metrics - Process and Aggregate Values
    Best Answer

    Broadcom Employee
    Posted Aug 03, 2015 05:52 AM

    Here is what our old KB says about this:

     

    Question

    What is the difference between the metrics CPU|Utilization % (process) and CPU|{Processor number}: Utilization % (aggregate)?

     

    Answer

    CPU|Utilization % (process) describes the percentage of CPU the JVM is using.

    CPU|{Processor number}: Utilization % (aggregate) describes the total utilization of the CPU, including JVM and non-JVM processes. This is what you would see for the total load on the CPU using the top or ps commands.

    If you want to know the overall CPU utilization on a 4-processor machine, including JVM and non-JVM, this data is available in Introscope by creating a Metric Grouping and a Calculator for each processor.
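
    As a rough illustration of the arithmetic only (this is not Introscope Calculator syntax), the overall figure for a multi-processor machine can be taken as the mean of the per-processor aggregate values. The function name and the sample readings below are hypothetical, and averaging is an assumption about what "overall" means here.

```python
def overall_utilization(per_processor_pct: list[float]) -> float:
    """Average the per-processor aggregate utilization values (in percent).

    Assumption: the overall machine utilization is the plain mean of the
    individual processor readings.
    """
    if not per_processor_pct:
        raise ValueError("need at least one processor reading")
    return sum(per_processor_pct) / len(per_processor_pct)


# Hypothetical readings for a 4-processor machine.
readings = [90.0, 10.0, 50.0, 50.0]
print(overall_utilization(readings))  # 50.0
```

    One busy processor and three quiet ones can therefore still produce a moderate overall value, which is why the per-processor metrics are worth looking at individually as well.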



  • 3.  Re: CPU Metrics - Process and Aggregate Values

    Posted Aug 03, 2015 11:23 AM

    Hi, I have already read this. I just want to know how to find which processes are included under CPU (%process) and which under CPU (%aggregate), because the top command shows different stats that don't match either of them.



  • 4.  Re: CPU Metrics - Process and Aggregate Values

    Broadcom Employee
    Posted Aug 03, 2015 07:02 PM

    CPU|Utilization % (process) describes the percentage of CPU the JVM is using. Note that top shows each process's CPU percentage relative to a single CPU, i.e. if a process uses two full CPUs, top will show 200%. If there are 4 CPUs in the server, Introscope will only show 50% because it divides by the number of CPUs.
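
    The conversion described above can be sketched in a few lines. The function name is made up for illustration; the numbers are the ones from this post (200% in top on a 4-CPU server).

```python
def normalize_top_cpu(top_percent: float, cpu_count: int) -> float:
    """Convert a per-process reading from top (100% = one full CPU)
    into a share of the whole machine, as described in the post."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return top_percent / cpu_count


# A process using two full CPUs on a 4-CPU server.
print(normalize_top_cpu(200.0, 4))  # 50.0
```

    So before comparing a top reading for the JVM with CPU (%process), divide the top value by the number of CPUs in the server.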

     

    CPU (%process) is the CPU% of the JVM that the agent is running in, i.e. your WAS, WL, Tomcat, ... app server.

     

    CPU (%aggregate) is the CPU% of all processes on the server. If you run top, it is (user + sys) = (100% - idle).
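
    To cross-check the (100% - idle) figure outside of top, here is a minimal sketch that computes it from two samples of the aggregate "cpu" line in /proc/stat. It assumes a Linux host, and it is only an illustration of the formula above, not a statement of how the agent itself collects the metric. Everything that is not idle (including iowait, irq and steal) is counted as busy here.

```python
def parse_cpu_line(line: str) -> tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    ticks = [int(value) for value in fields[1:]]
    idle = ticks[3]          # columns: user nice system idle iowait irq softirq steal ...
    total = sum(ticks[:8])   # guest columns are skipped: they are already part of user/nice
    return idle, total


def aggregate_utilization(before: str, after: str) -> float:
    """Percentage of non-idle CPU time between two samples, i.e. 100% - idle."""
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    total_delta = total1 - total0
    if total_delta <= 0:
        raise ValueError("samples must be taken at different times, oldest first")
    idle_delta = idle1 - idle0
    return 100.0 * (total_delta - idle_delta) / total_delta
```

    On the server, read the first line of /proc/stat twice a few seconds apart and pass both lines to aggregate_utilization. The counters are summed over all CPUs, so the result is already a share of the whole machine.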



  • 5.  Re: CPU Metrics - Process and Aggregate Values

    Posted Aug 04, 2015 02:10 PM

    Hi,

    You cannot compare the 'top' command stats directly with (aggregate%).

    The 'top' command will only help you identify which non-JVM process (PID) is the culprit on your server (not within the JVM or on a particular CPU, i.e. processor).

     

    Regarding %process and %aggregate, your approach should be as follows:

    • First check the %process value. If it is high, your application (or indirectly the JVM where your application is deployed) is consuming excessive CPU resources.
    • If not, check %aggregate. If that value is low, your system is not in a caution/danger state for resource utilization.
    • If %aggregate is high, the 'top' command will help you find which particular PID (or system/kernel process) is consuming system resources.
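
    The three checks above can be written down as a small decision function. This is only a sketch of the order of the checks: the function name, the return labels and the 80% threshold are hypothetical, so pick a threshold that matches your own alerting.

```python
def triage_cpu(process_pct: float, aggregate_pct: float,
               high_threshold: float = 80.0) -> str:
    """Apply the checks in order: %process first, then %aggregate.

    high_threshold is a hypothetical cut-off for what counts as "high".
    """
    if process_pct >= high_threshold:
        # The JVM hosting the application is the main consumer.
        return "jvm"
    if aggregate_pct < high_threshold:
        # Neither the JVM nor the machine as a whole is under pressure.
        return "ok"
    # Machine is busy but the JVM is not: look at other PIDs with top.
    return "other-process"


print(triage_cpu(process_pct=5.0, aggregate_pct=85.0))  # other-process
```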

     

    Hope it helps.

     

    Regards,

    Vaibhav



  • 6.  Re: CPU Metrics - Process and Aggregate Values

    Posted Aug 06, 2015 11:07 AM

    Hi Vaibhav,

    Thanks for your support.

    I am confused about the last point. The aggregate value on one agent has been around 80% for the last two weeks, and when I run the top command as you suggested, it shows a value of around 0.3 for the same agent.

    The %process value is very low. So how can I find out what is eating up the CPU in the %aggregate value?

     

    Secondly, two weeks ago we saw the same stats for %process and %aggregate. Then the development team made some changes to the new service they had implemented, and the %process value came down.

    But the %aggregate value still remains around 80%, so I am confused.
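
    One way to approach the question above is to list the heaviest CPU consumers on the server and compare them with the JVM. The sketch below parses the text output of `ps -eo pid,pcpu,comm` and sorts it by CPU share. The function name is hypothetical, and it assumes a ps that supports the `-eo` option and prints a header line followed by PID, %CPU and command columns.

```python
def top_cpu_consumers(ps_output: str, limit: int = 5) -> list[tuple[int, float, str]]:
    """Return up to `limit` (pid, cpu_percent, command) rows, highest CPU first.

    Expects the text printed by `ps -eo pid,pcpu,comm`, header line included.
    """
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # ignore blank or malformed lines
        pid, pcpu, command = parts
        rows.append((int(pid), float(pcpu), command.strip()))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]
```

    To feed it real data, capture the command output, for example with `subprocess.run(["ps", "-eo", "pid,pcpu,comm"], capture_output=True, text=True, check=True).stdout`. Keep in mind that on Linux the %CPU column of ps is averaged over the lifetime of each process, so a short burst may not show up there even though it is visible in top.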