DX Application Performance Management

Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

  • 1.  Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Jan 28, 2016 01:45 PM

    We are running into an issue where a 2-minute AJAX function is reported as a stall every single time, since it exceeds the 30-second stall threshold.


    As a result, our live error viewer is filled with these "false stalls".

     

    However, the other metrics (average response time, concurrent invocations, etc) are still useful to us, so skipping the particular class entirely is less than ideal. We also make use of the stalls metric in other places, so we don't want to disable stall tracking entirely, nor can we change the time frame.

     

    If I'm reading correctly, it appears the BlamePointTracer is responsible for Average Response Time, Concurrent Invocations, Errors Per Interval, Responses Per Interval, and Stall Count. Is it possible to omit this particular class from JUST the stall count metrics? If so, how is this accomplished?

     

    Thanks for your assistance!



  • 2.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?
    Best Answer

    Posted Feb 01, 2016 01:23 PM

    Hi,

    Stall thresholds are global, so you can't exclude a particular class/method pair from stall monitoring. However, instead of using BlamePointTracer (which produces five metrics, including Stalls), you can deploy individual tracers on the required method to get the other four metrics, e.g.:

    - Average Response Time (e.g. BlamedMethodTimer)

    - Responses Per Interval (e.g. BlamedPerIntervalCounter)

    - Errors Per Interval (e.g. ExceptionErrorReporter)

    - Concurrent Invocations (e.g. ConcurrentInvocationCounter)

     

    This means you are looking at writing a custom pbd, which, depending on your skill level, might require additional reading.

    Please see the links below for further reference:

    List of Probe Builder Directives

    Instrumentation Best Practices

    Custom Instrumentation with CA APM

     

    Hope this helps.

     

    Regards,

    Kulbir.



  • 3.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 01, 2016 01:30 PM

    Kulbir -

     

    That information is very helpful! Am I right in understanding that each of these tracers would be a custom pbd of its own, listing the above-mentioned items? I will also peruse the links you provided - thank you!



  • 4.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 01, 2016 01:48 PM

    Hi,

    Those are four individual tracers, each producing its corresponding metric. For example, ConcurrentInvocationCounter, when applied to a class/method, will produce the "Concurrent Invocations" metric.

    You will need just a single pbd with four individual directives, each specifying the required method to be probed with one of the above tracers (or a subset, as needed).

    It will become clearer once you go through the Custom Instrumentation link provided.

     

    Regards,

    Kulbir.
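
As a rough illustration of the single-pbd approach described above, the fragment below applies the four tracers to one method. The class com.example.LongPollHandler, the method handle, the flag name CustomLongPollTracing, the file name, and the metric paths are all hypothetical placeholders. The directive forms used (SetFlag, TurnOn, IdentifyClassAs, TraceOneMethodIfFlagged) should be checked against the Probe Builder Directives list for your agent version before use.

```text
# custom-longpoll.pbd (hypothetical file name)
# Assumption: the class, method, and flag names below are placeholders.

SetFlag: CustomLongPollTracing
TurnOn: CustomLongPollTracing

IdentifyClassAs: com.example.LongPollHandler CustomLongPollTracing

# One directive per metric. BlamePointTracer is deliberately not used here,
# the intent being that this pbd adds no Stall Count metric of its own.
TraceOneMethodIfFlagged: CustomLongPollTracing handle BlamedMethodTimer "Custom|{classname}|{method}:Average Response Time (ms)"
TraceOneMethodIfFlagged: CustomLongPollTracing handle BlamedPerIntervalCounter "Custom|{classname}|{method}:Responses Per Interval"
TraceOneMethodIfFlagged: CustomLongPollTracing handle ExceptionErrorReporter "Custom|{classname}|{method}:Errors Per Interval"
TraceOneMethodIfFlagged: CustomLongPollTracing handle ConcurrentInvocationCounter "Custom|{classname}|{method}:Concurrent Invocations"
```

This only helps if the same method is not also being traced by an out-of-the-box BlamePointTracer directive; otherwise the stall metric would still be generated by that directive.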



  • 5.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 01, 2016 03:18 PM

    Kulbir - question regarding this then:

     

    If we use a skip class directive and then instrument that class within the custom tracers, will the custom tracers still function, or will the skipclass directive override them?

     

    I ask because I would presume we have to tell the default BlamePointTracer to skip that class, or else we are still going to have the Stalls metric being generated, correct?



  • 6.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 01, 2016 04:12 PM

    Correct, but you can use “SkipClassForFlag” so that your class gets put on a blacklist just for the specified “out of the box” flag but not for your “custom” one.



  • 7.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 02, 2016 08:28 AM

    I think I understand, but just to be safe-

     

    The stall itself contains this:

    Error at 16:11:51.167 (27 Jan 2016)

    Frontends|Apps|GTConnect|URLs|Default (0 ms)

    Application Name: GTConnect

    Class: com.gtnet.httpClient.comms.UnifiedAcceptor

    Context Path: /GTConnect

    CorGUID: *omitted*

    DataCreationType: 0

    Error Message: Stalled Transaction

    HTTP Method: POST

    Is dynamic: false

    Is temporary: false

    Method: service

    Method Descriptor: (Ljavax/servlet/http/HttpServletRequest;Ljavax/servlet/http/HttpServletResponse;)V

    Resource Name: Servlets|{classname}

    Scheme: https

    Server Name: *omitted*

    Server Port: *omitted*

    Session ID: *omitted*

    Thread Group Name: main

    Thread Name: WebContainer : 61

    Trace ID: 1453929146997:876180%1

    Trace Type: ErrorSnapshot

     

    my understanding is that we would want to set up a pbd with the following directive:

    "SkipClassForFlag: <class-name> <Tracer-group>".

     

    SkipClassForFlag: com.gtnet.httpClient.comms.UnifiedAcceptor ErrorSnapshot

     

    Is that correct?



  • 8.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 02, 2016 08:56 AM

    Nope, you would have to do this:

     

    SkipClassForFlag: com.gtnet.httpClient.comms.UnifiedAcceptor HTTPServletTracing

    But that won’t work. The problem is that it’s a Servlet, and the “service” method bytecode (which is what the Agent works with) is not in the class itself, but in the javax.servlet.http.HttpServlet parent class. You would have to skip that class and manually instrument all of your Servlets one by one, which is going to be quite painful.

     

    So, you’re really better off living with the Stall in this case. What’s the problem with having a Stall? Is it because it shows up in an alert and you want to filter it out?



  • 9.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 02, 2016 09:10 AM

    Ah dang...

     

    The reason we're trying to knock this out is this:

     

    [Screenshot: Stalls 1.PNG]

    [Screenshot: stalls 2.PNG]

     

    Basically, the "stalls" from this comet transaction are flooding the live error viewer and burying other, actual errors. It isn't a huge problem per se; I was just hoping to clear it up to keep our error viewer organized and tidy.



  • 10.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 02, 2016 09:51 AM

    Ah, there might be a solution to this problem.

     

    Look at this section of the Agent profile:

     

    #######################
    # Error Detector Configuration
    #
    # ================
    # Configuration settings for Error Detector

    # Please include errors.pbd in your pbl (or in introscope.autoprobe.directivesFile)

    # The error snapshot feature captures transaction details about serious errors
    # and enables recording of error count metrics.
    # Changes to this property take effect immediately and do not require the managed application to be restarted.
    introscope.agent.errorsnapshots.enable=true

    # The following setting configures the maximum number of error snapshots
    # that the Agent can send in a 15-second period.
    # Changes to this property take effect immediately and do not require the managed application to be restarted.
    introscope.agent.errorsnapshots.throttle=10

    # The following series of properties lets you specify error messages
    # to ignore.  Error snapshots will not be generated or sent for
    # errors with messages matching these filters.  You may specify
    # as many as you like (using .0, .1, .2 ...). You may use wildcards (*).
    # The following are examples only.
    # Changes to this property take effect immediately and do not require the managed application to be restarted.
    #introscope.agent.errorsnapshots.ignore.0=com.company.HarmlessException
    #introscope.agent.errorsnapshots.ignore.1=HTTP Error Code: 404

     

     

    Maybe you can specify something like:

    introscope.agent.errorsnapshots.ignore.0=com.gtnet.httpClient.comms.UnifiedAcceptor

     

    I have no idea if that’s gonna work or not, but worth a shot.

     

    Otherwise, you can also adjust the Stall timeout for this particular Agent. By default it’s 30 seconds, but maybe if you set it to 45 seconds or 1 minute you wouldn’t get flooded anymore?

     

    # Minimum threshold for stall event duration
    # Changes to this property take effect immediately and do not require the managed application to be restarted.
    introscope.agent.stalls.thresholdseconds=30



  • 11.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 02, 2016 10:52 AM

    I will certainly give that a try and see what happens! Once we get a chance to make the change and observe, I'll report the results!

     

     

    Something else we were looking through, regarding all this:

     

    The class generating the stalls (GTConnect) is directly related to the UnifiedAcceptorRequestWrapper. Would it be possible to remove the stall tracer from the tracer group that is instrumenting the UnifiedAcceptorRequestWrapper and then reinstate the tracer for any other class being instrumented by name, and would doing so prevent the Stall tracer being run on the UnifiedAcceptorRequestWrapper?

     

    My senior was walking me through the logs - it looks like the UnifiedAcceptorRequestWrapper is called here:

     

    Processing class com/gtnet/httpClient/security/config/RequestURIConfig

    Processing class com/gtnet/httpClient/security/config/GeneralErrorPage

    Processing class com/gtnet/httpClient/security/config/Page

    Processing class com/gtnet/httpClient/comms/AcceptorMode

    Processing class com/gtnet/httpClient/comms/UnifiedAcceptorRequestWrapper

    Processing class javax/servlet/http/HttpServletRequestWrapper

    Processing class javax/servlet/ServletRequestWrapper

            getInputStream:0                   inserted method tracer object allocation: com/wily/introscope/agent/trace/hc2/DirectStreamAccessWatcherTracer

            getReader:0                        inserted method tracer object allocation: com/wily/introscope/agent/trace/hc2/DirectStreamAccessWatcherTracer

     

    The UnifiedAcceptorRequestWrapper is being instrumented with the DirectStreamAccessWatcherTracer and is the only ServletRequestWrapper being instrumented within the context of the Java Agent on this server.

     

    Could we disable that tracer and change the definition to exclude the stalls?



  • 12.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 02, 2016 11:54 AM

    This tracer is not responsible for creating any Stall metrics (it does some internal stuff only), so shutting it down wouldn’t do what you’re trying to do.



  • 13.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 02, 2016 11:59 AM

    The thing that's odd is that, looking at our logs as everything loads up, that seems to be the only tracer really referencing the UnifiedAcceptor at all.

     

     

    Ah well, I'll pass that along - Thanks Florian! I've put together a proposal on testing the change to ignore the errors in the introscope agent properties - we're just waiting on a time where we can get access and support from our websphere team to implement and test.



  • 14.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 02, 2016 01:43 PM

    Actually, I do have another question: Looking at the documentation on the introscope.agent.errorsnapshots.ignore.0= property, I found this:

     

    Ignore Specific Error Messages

    Specifies which error messages to ignore.  Error snapshots will not be generated or sent for errors with messages matching these filters.  You may specify any number using 0, 1, 2 and wildcards (*).

    The following are examples only. Changes to this property take effect immediately and do not require the managed application to be restarted.

    Property Name: introscope.agent.errorsnapshots.ignore.0=*

    Examples:

    introscope.agent.errorsnapshots.ignore.0=*com.company.HarmlessException*

    introscope.agent.errorsnapshots.ignore.1=*HTTP Error Code: 404*

    I am not certain though... what does the number represent? Is that the number of times it will ignore the error in a given interval before it reports it?



  • 15.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 02, 2016 01:54 PM

    That number is just a unique identifier that lets you specify multiple errors that Error Detector should not report. It is NOT the number of times an error will be ignored.

    Whenever you define one, the agent will not report any error snapshots for that error, as long as the error string matches what's defined in the property.

    In any case, it's not going to work for stalls, so as Florian suggested, you might need to look into increasing the stall threshold itself. Keep in mind, however, that it's a global setting.

     

    Regards,

    Kulbir.
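
To make the point about the index concrete, here is a small profile fragment. It reuses the two example filters from the documentation quoted above and adds a third, hypothetical one (com.example.AnotherHarmlessException is a placeholder, not a real class). The numeric suffix only tells the entries apart and has no effect on how often an error is ignored.

```properties
# Each suffix (.0, .1, .2) names a separate filter; it is not a count.
introscope.agent.errorsnapshots.ignore.0=*com.company.HarmlessException*
introscope.agent.errorsnapshots.ignore.1=*HTTP Error Code: 404*
# Hypothetical third filter, for illustration only.
introscope.agent.errorsnapshots.ignore.2=*com.example.AnotherHarmlessException*
```

As noted in the reply above, these filters are matched against error messages and are not expected to suppress stall snapshots.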



  • 16.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 02, 2016 01:58 PM

    Kulbir -

     

    I'm a bit confused now. Florian suggested trying that, and Francis updated my support case recommending to try it as well. Now you are saying that the introscope.agent.errorsnapshots.ignore will not work?

     

    I just want to make sure I'm understanding. The problem with increasing the stall threshold is that 30 seconds is where it needs to be for pretty much every other transaction - it's this one single transaction that it needs to not be 30 seconds on (or simply not recorded at all).



  • 17.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 03, 2016 04:46 AM

    Yes, I suggested trying it, as it’s quite easy and it would solve your problem if it worked. I’m not sure if that will work or not, but I think it’s worth a try.

     

    Now Kulbir thinks it’s not going to work. Knowing Kulbir, he’s probably right ☺ But I’d still try it anyhow.

     

    If that doesn’t work, can you tell me what’s the “average response time” for this particular transaction?



  • 18.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 03, 2016 08:18 AM

    The average response time for this transaction has a pretty decent variance - for example:

     

    [Screenshot: GTConnect ART 60 minute.PNG]

     

    The thing that makes this a bit harder to pin down to a specific time frame is that anything that returns within the two minute mark is considered acceptable. Over the last week, over a one hour resolution, we had values ranging from 1100ms up to 59,016ms. If I up the resolution to 6 minute intervals, over the same time frame, we start seeing the 120,000ms times.

     

    [Screenshot: GTConnect ART 6 minute.PNG]

     

    [Screenshot: GTConnect ART 1 minute.PNG]

     

    We have not had the chance yet to deploy the change you suggested and test it due to other activities in the environment - I am hoping we will get a chance to do that today or tomorrow.

     

    Thanks again Florian!



  • 19.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 03, 2016 08:57 AM

    Well if the errorsnapshot property doesn’t work, here’s my suggestion:

     

    Increase the timeout to about 100 seconds. That way you will get around most of the problem with this method, yet retain the ability to capture “Stalls” for all the other transactions in the application that take more than 1 minute and 40 seconds (up from 30 seconds).

     

    It’s not “great”, but at least you won’t get flooded with useless errors anymore.



  • 20.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 03, 2016 09:01 AM

    Well, actually scratch my previous answer.

     

    “The thing that makes this a bit harder to pin down to a specific time frame is that anything that returns within the two minute mark is considered acceptable.”

     

    That’s curious, because from the graphs you sent, that response actually never seems to take more than 120 seconds, it seems to be capped at 120 seconds. Is there a timeout configured?



  • 21.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 03, 2016 09:56 AM

    My (albeit limited) understanding is that it is a COMET push mechanism that allows the server to send info to the browser in several chunks without the browser having to re-open the connection each time. This is all delivered to a Java session on the user's machine. It is along the lines of:

     

     

    Browser                                                        Server

    Request ----- wait -------------------------> send data

    ----------------- wait -------------------------> send data

    ----------------- wait -------------------------> send  data

    ----------------- wait -------------------------> send data

    ----------------- wait -------------------------> send data

    While the server is sending data back to the browser, the browser is rendering or doing other tasks as well. From what I understand, the two minute timeout is hardcoded at some level, though we are not sure where.



  • 22.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 03, 2016 10:14 AM

    Right… Well, the only way I can think of is to set the stall timeout to 121 seconds! Not perfect, but at least you’ll know that if you get a Stall metric/event, it’s a very real one!



  • 23.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 03, 2016 10:25 AM

    *nod*

     

    One thing I've been trying to get organized in my head -

     

    If we adjust the stall threshold, that is on the APM agent on that server, correct? Not globally across all agents (i.e., it isn't set on the Enterprise Collector to be evaluated as it comes in)?

     

    If that is accurate, my next step would be to look at whether we could forgo the current stall window in favor of the larger one for the particular servers/JVMs that those agents monitor.

     

    I apologize for my confusion!

     

    Thanks Florian!



  • 24.  Re: Is it possible to disable Stall tracking for a single java class, instead of disabling stall tracking entirely or skipping the entire class?

    Posted Feb 03, 2016 10:58 AM

    Correct, you can have one Agent.profile just for the very JVMs that host the application with these long running transactions.
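
A sketch of what such a dedicated profile might change, assuming the 121-second value suggested earlier in the thread. The file name is hypothetical, and everything else in the profile would stay as it is in the standard one.

```properties
# IntroscopeAgent-longpoll.profile (hypothetical name), used only by the
# JVMs that host the long-running COMET transaction.

# Minimum threshold for stall event duration.
# Raised from the default of 30 so that requests capped at 120 seconds
# are not reported as stalls.
introscope.agent.stalls.thresholdseconds=121
```

Agents on other JVMs keep their own profile with the 30-second threshold, so stall detection there is unchanged. The trade-off, as discussed above, is that on the affected JVMs any transaction shorter than 121 seconds will no longer be flagged as a stall.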