Symantec Access Management

java.net.connectException

  • 1.  java.net.connectException

    Posted Dec 08, 2017 09:33 AM

    Hi,

     

    I'm performance testing a WS-FED partnership set up with CA Access Gateway.

     

    I have 4 servers load balanced round robin.

     

    After a particular load level, Tomcat starts refusing connections with java.net.ConnectException: Connection refused.

     

    Are there any tuning parameters that would let the fedproxy and Tomcat accept more connections?

     

    As of now my Java heap size is 512 MB. The server can handle more, so I can increase it, but I'm not running into OOM errors, so I'm not sure that is necessary.

     

    I'm running on Windows, so I assume the Access Gateway uses the mpm_winnt module. I see MaxThreads set to 1920. Would changing this help?

     

    Regards,

    Anand.



  • 2.  Re: java.net.connectException

    Posted Dec 08, 2017 09:53 AM

    There are quite a few parameters that could be tweaked.

     

    MaxThreads is a front-end Apache setting. Picture a funnel: widening the top makes no difference if the narrow end stays the same. What matters is the end of the funnel and the underlying component, because that is where the bottleneck is.

     

    Understanding the layers through which a request flows in CA AG is quite important.

    https://docops.ca.com/ca-single-sign-on/12-7/en/implementing/implementing-ca-access-gateway/ca-access-gateway-architecture-introduced 

     

    To begin, I'd recommend reviewing the following material, which Mark has explained.

    https://communities.ca.com/community/ca-security/ca-single-sign-on/blog/2017/02/01/techtip-testing-solution-to-agent-gatewaysps-with-one-bad-back-end

    https://communities.ca.com/message/241968217-re-recommended-value-for-http-socket-timeout?commentID=241968217#comment-241968217

    https://docops.ca.com/ca-single-sign-on/12-7/en/configuring/ca-access-gateway-configuration/configure-the-proxy-service-settings-manually

     

    We are not just looking at CA AG, but also at backend components.

     

    Also, we have 4 CA AG instances in round robin behind an LB. Have we first run the test with a single CA AG behind the LB and tuned that to a good value? Performance testing is an art that needs planning and a stepped approach, which lets us investigate, monitor, tune and derive results at every component. Throwing all the apples into one basket is not the ideal way to approach performance testing. Just a side note, to think beyond the error message.



  • 3.  Re: java.net.connectException

    Posted Dec 08, 2017 10:00 AM

    Thanks HubertDennis

     

    I don't have a backend. My backend is the Tomcat that comes packaged with the Access Gateway. I'm just running federation and not using it as a reverse proxy.

     

    Is there a way to have Tomcat service more requests? Would I have to tweak the mod_jk_ajp parameters?

     

    Regards,

    Anand.



  • 4.  Re: java.net.connectException

    Posted Dec 08, 2017 10:07 AM

    I would say cut the CA AG count down to one, as it makes the investigation much easier. Once we have an ideal tuning combination, replicate it on all CA AG instances. Then run the other 3 CA AG instances individually, to rule out any other issue and to make sure the numbers from all 4 match up on average (variation should be no more than +/- 5%).

     

    If the issue crops up, let's widen the mod_jk funnel and retest.

     

    Bear in mind, you'll also need to monitor the connections between CA AG and PS, and between PS and UD.
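
The +/- 5% comparison across gateways mentioned above can be checked mechanically. A minimal sketch, assuming each CA AG run yields a single throughput figure; the function name and the sample numbers are hypothetical, not measurements from this thread:

```python
def within_tolerance(throughputs, tolerance=0.05):
    """Return {name: (deviation, ok)} comparing each value with the mean."""
    mean = sum(throughputs.values()) / len(throughputs)
    report = {}
    for name, value in throughputs.items():
        deviation = (value - mean) / mean
        report[name] = (deviation, abs(deviation) <= tolerance)
    return report

# Hypothetical transactions-per-second figures, one per gateway.
sample = {"ag1": 100.0, "ag2": 104.0, "ag3": 97.0, "ag4": 88.0}
for name, (deviation, ok) in within_tolerance(sample).items():
    print(f"{name}: {deviation:+.1%} {'ok' if ok else 'investigate'}")
```

Comparing against the mean is only one possible reading of "match up on average"; comparing against the best-performing gateway would be an equally reasonable choice.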



  • 5.  Re: java.net.connectException

    Posted Dec 08, 2017 10:15 AM

    Like I said, performance testing is an art: we don't just run a performance test, we plan, design, evaluate and execute. Here you go. These are just the mod_jk parameters.

     

    I made this for a customer to assist with performance testing. We adjusted the numbers in each run and evaluated the result afterwards. We then matched the result to the change made, drew a conclusion, and based the change for the next run on that conclusion.

     

    These parameters are used by mod_jk:

    Parameter | Description | Default Value | Current Value | Run1 Value | Run2 Value | Run3 Value
    --- | --- | --- | --- | --- | --- | ---
    worker.ajp13.accept_count | Number of requests waiting in queue (queue length). This represents the maximum queue length for incoming connection requests when all possible request processing threads are in use. Any requests received when the queue is full are refused. | 10 | 10 | 50 | |
    worker.ajp13.min_spare_threads | Number of threads created at initialization time. This represents the number of request processing threads that will be created when this connector is initialized. This attribute should be set to a value smaller than that set for worker.ajp13.max_threads. | 10 | 50 | 100 | |
    worker.ajp13.max_threads | Maximum number of concurrent connections possible. This represents the maximum number of request processing threads to be created by this connector, which therefore determines the maximum number of simultaneous requests that can be handled. | 100 | 500 | 650 | |
    worker.ajp13.reply_timeout | The maximum time (milliseconds) that can elapse between any two packets received from the proxy engine, after which the connection between the HTTP listener and the proxy engine is dropped. A value of zero makes it wait indefinitely until a response is received (default). The parameter value should be kept equivalent to the http_connection_timeout. | 0 (infinite/never timeout) | 0 | 180000 | |
    worker.ajp13.retries | The maximum number of times that the worker will send a request to the Proxy Engine in case of a communication error. Each retry will be done over another connection. The first time already gets counted, so retries=2 means one retry after error. | 2 | 2 | 3 | |
    worker.ajp13.max_packet_size | This attribute sets the maximum AJP packet size in bytes. The maximum value is 65536. This same value will be used as the 'packetSize' attribute for the AJP connector on the Tomcat side. | | 16384 | 16384 | |
    worker.ajp13.connection_pool_timeout | Defines the maximum time (in seconds) that idle connections (between Apache and Tomcat over mod_jk) remain in the connection pool before timing out. | 0 | 0 | 200 | |
    ajp13.max_header_count | The maximum number of headers in a request that are allowed by the container. A request that contains more headers than the specified limit will be rejected. A value of less than 0 means no limit. If not specified, a default of 100 is used. | 100 | 100 | 100 | |
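
To make the accept_count and max_threads descriptions concrete, here is a minimal sketch of how a burst of simultaneous requests would split into served, queued and refused, following only the descriptions in the table above. It is a simplified model, not how mod_jk or Tomcat is actually implemented, and the function name and the burst size of 700 are hypothetical:

```python
def split_burst(requests, max_threads, accept_count):
    """Split a burst of simultaneous requests per the table's descriptions."""
    served = min(requests, max_threads)             # one thread per request
    queued = min(requests - served, accept_count)   # waits for a free thread
    refused = requests - served - queued            # queue is full: refused
    return served, queued, refused

# Values read from the table: Current (max_threads=500, accept_count=10)
# and Run1 (max_threads=650, accept_count=50), for a burst of 700 requests.
print(split_burst(700, max_threads=500, accept_count=10))  # (500, 10, 190)
print(split_burst(700, max_threads=650, accept_count=50))  # (650, 50, 0)
```

The model ignores request duration and timeouts, so it shows only where refusals start once both the threads and the queue are exhausted.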


  • 6.  Re: java.net.connectException

    Posted Dec 15, 2017 07:31 PM

    Hi HubertDennis

     

    I've been working on this a bit more and have determined that my front-end Apache is refusing connections. Tomcat and the backend (underlying user directories, policy server, policy store) seem to be holding up okay. The Apache logs say the thread limit has been reached and suggest increasing the MaxThreadsPerChild setting.

     

    When I try to increase MaxThreadsPerChild, Apache logs a warning in the error log at startup that MaxThreadsPerChild is set higher than the ThreadLimit of 1920, and it reduces MaxThreadsPerChild to 1920.

     

    Is there a hard limit of 1920 on the Apache threads? Can this be increased? mpm_winnt has a limit of 15000; is it possible to set the SPS value closer to this number?

     

    I have 14 SPS instances running to serve a load of 70,000 WS-FED transactions within a 10-minute period. So that is a combined total of about 25,000 concurrent requests.
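
As a rough cross-check of these figures: 70,000 transactions in 10 minutes gives an average arrival rate, and Little's Law (requests in flight = arrival rate times average response time) relates that rate to concurrency. This is a sketch under stated assumptions. The response times are hypothetical, the load is treated as evenly spread, and a WS-FED transaction may involve several HTTP requests, which the sketch ignores:

```python
transactions = 70_000
window_seconds = 10 * 60
instances = 14

rate = transactions / window_seconds          # average transactions per second
per_instance = rate / instances

print(f"average rate: {rate:.1f}/s overall, {per_instance:.1f}/s per instance")

# Little's Law: requests in flight = arrival rate * average response time.
for response_seconds in (0.5, 2.0, 10.0):     # hypothetical response times
    in_flight = rate * response_seconds
    print(f"{response_seconds:>4}s response -> about {in_flight:.0f} in flight")
```

Peaks in a real test can sit far above the average, so this gives only a baseline against which to compare the observed connection counts.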



  • 7.  Re: java.net.connectException

    Posted Dec 15, 2017 07:35 PM

    Hi Anand anand3g

     

    Looking at blogs: "Default value for ThreadLimit is 1920 when used with mpm_winnt". Recompiling is out of the equation.

     

    https://httpd.apache.org/docs/2.4/mod/mpm_common.html#threadlimit

     

     

     

    Questions for clarity

     

    • Have we checked how many connections are on the Apache port?

    • Have we checked the state of the connections on the port Apache is listening on (CLOSE_WAIT, ESTABLISHED, LISTENING, etc.)?

    • What is the version and bitness of this CA AG?

    • What hardware is this running on (CPUs / RAM)?

    • Is there only one CA AG instance running on each server?

    • MaxThreadsPerChild? I can't seem to find a reference to 'MaxThreadsPerChild'. Could you attach the httpd.conf from the Windows SPS? I do not have a Windows SPS.

    • The other thing I have been wondering about: did we change the value from the default to greater than 1920, and is such a high value needed? Did we test with values lower than 1920, e.g. 500, 1000, 1500, 1900? A stepped approach is preferable. If we did take a stepped approach, what was the result?


  • 8.  Re: java.net.connectException

    Posted Dec 15, 2017 10:13 PM

    Version r12.52 SP1 CR05

     

    I have this in my httpd.conf

     

    <IfModule mpm_winnt.c>
    ThreadLimit 15000
    ThreadsPerChild 15000
    MaxRequestsPerChild 0
    </IfModule>

     

    But when I turn on debug logging in Apache and restart, this is what I see.

     

    [Fri Dec 15 22:05:24 2017] [info] mod_ssl/2.2.22 compiled against Server: Apache/2.2.22, Library: OpenSSL/0.9.8x-fips
    [Fri Dec 15 22:05:24 2017] [notice] Child 11492: Child process is running
    [Fri Dec 15 22:05:24 2017] [debug] mpm_winnt.c(398): Child 11492: Retrieved our scoreboard from the parent.
    [Fri Dec 15 22:05:24 2017] [info] Parent: Duplicating socket 380 and sending it to child process 11492
    [Fri Dec 15 22:05:24 2017] [info] Parent: Duplicating socket 372 and sending it to child process 11492
    [Fri Dec 15 22:05:24 2017] [info] Parent: Duplicating socket 376 and sending it to child process 11492
    [Fri Dec 15 22:05:24 2017] [debug] mpm_winnt.c(595): Parent: Sent 3 listeners to child 11492
    [Fri Dec 15 22:05:24 2017] [debug] mpm_winnt.c(554): Child 11492: retrieved 3 listeners from parent
    [Fri Dec 15 22:05:26 2017] [notice] Child 11492: Acquired the start mutex.
    [Fri Dec 15 22:05:26 2017] [notice] Child 11492: Starting 150 worker threads.
    [Fri Dec 15 22:05:26 2017] [notice] Child 11492: Starting thread to listen on port 443.
    [Fri Dec 15 22:05:26 2017] [notice] Child 11492: Starting thread to listen on port 80.
    [Fri Dec 15 22:05:26 2017] [notice] Child 11492: Starting thread to listen on port 443.

     

    So it seems it just ignores the ThreadsPerChild setting that I put in httpd.conf.

     

    Does this mean I'm only running with 150 threads?
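
One way to confirm the effective thread count is to read it back from the startup lines in the error log rather than from httpd.conf. A minimal sketch that matches the "Starting N worker threads" notice quoted above; the helper name is hypothetical and the log format is assumed to match those lines:

```python
import re

WORKER_LINE = re.compile(r"Child (\d+): Starting (\d+) worker threads")

def worker_threads(log_lines):
    """Return (child_pid, thread_count) pairs found in error-log lines."""
    found = []
    for line in log_lines:
        match = WORKER_LINE.search(line)
        if match:
            found.append((int(match.group(1)), int(match.group(2))))
    return found

sample = [
    "[Fri Dec 15 22:05:26 2017] [notice] Child 11492: Acquired the start mutex.",
    "[Fri Dec 15 22:05:26 2017] [notice] Child 11492: Starting 150 worker threads.",
]
print(worker_threads(sample))  # [(11492, 150)]
```

Running this over the log after each configuration change shows whether the new ThreadsPerChild value actually took effect.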



  • 9.  Re: java.net.connectException

    Posted Dec 15, 2017 10:48 PM

    Okay, it turns out the setting in extra/httpd-mpm.conf was overriding this.

     

    I changed it there and was able to get ThreadLimit and ThreadsPerChild to 3000, and Apache now does start 3000 threads.

     

    I tried going up to 5760, but it maxed out at around 3600 and Apache died on me. So I think 3000 is a safe limit for my 6 GB of physical RAM. (My server actually has 12 GB, but I have two instances running. Some of this physical RAM is assigned to my Tomcat JVM, so there is an upper limit on the amount of memory Apache can use.)



  • 10.  Re: java.net.connectException

    Posted Dec 18, 2017 01:16 PM

    Thank you Anand anand3g for confirming the finer details and for showing that the configuration can be flexed.

     

    Did this mean you were able to achieve your numbers?



  • 11.  Re: java.net.connectException

    Posted Dec 18, 2017 01:20 PM

    My Apache scales now, but Tomcat runs out of memory. I can't seem to stretch that past 1328M, so the only option left to me is to upgrade to 64-bit. I'm planning to upgrade to 12.7.

     




  • 12.  Re: java.net.connectException

    Posted Dec 18, 2017 01:23 PM

    True with a 32-bit JVM (if your error message is a JVM out-of-memory error). It makes good sense to use a 64-bit JVM and a 64-bit CA AG. As a side note, keep an eye on the version of your Policy Server.



  • 13.  Re: java.net.connectException

    Posted Dec 18, 2017 01:26 PM

    I’m going to need a 12.7 PS as well, right? Or does the 12.7 AG work with a 12.52 SP1 PS?

     

     

     

    Regards,

     

    Anand.

     

     

     




  • 14.  Re: java.net.connectException

    Posted Dec 18, 2017 01:36 PM

    It would be a safer bet to use a 12.7 PS with the 12.7 CA AG. The longer-term and performance benefits are vast, because all JVMs, the CA AG and the PS are then 64-bit. It also avoids support hassles and doubts about whether something would work or fail with an R12.52 PS.



  • 15.  Re: java.net.connectException

    Posted Dec 18, 2017 01:52 PM

    Anand,

     

    Take the latest Service Pack, 12.7 SP1, as it has more defect fixes and, as Hubert mentioned, the performance benefit is vast.

     

    Regards

    Kapil



  • 16.  Re: java.net.connectException

    Posted Dec 15, 2017 08:17 PM

    I have about 1500-1600 connections when I run netstat -an.

     

    A lot of them are in the FIN_WAIT_2 state, but the Apache logs clearly say the threads were exhausted.

     

    [Fri Dec 15 13:34:52.368247 2017] [mpm_winnt:error] [pid 4852:tid 3360] AH00326: Server ran out of threads to serve requests. Consider raising the ThreadsPerChild setting

     

    TCP 10.19.56.123:443 10.19.10.25:32914 ESTABLISHED
    TCP 10.19.56.123:443 10.19.10.25:32931 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:32964 ESTABLISHED
    TCP 10.19.56.123:443 10.19.10.25:32998 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33010 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33048 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33096 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33129 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33170 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33223 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33241 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33312 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33450 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33620 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33663 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33700 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33743 FIN_WAIT_2
    TCP 10.19.56.123:443 10.19.10.25:33762 FIN_WAIT_2
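
Counting connections per state by hand gets tedious at 1500+ lines. A minimal sketch that tallies TCP states for one local port from netstat -an style output, assuming the four-column layout shown above (protocol, local address, remote address, state); the function name is hypothetical:

```python
from collections import Counter

def count_states(netstat_lines, local_port="443"):
    """Count TCP states for connections whose local address ends in :port."""
    states = Counter()
    suffix = ":" + local_port
    for line in netstat_lines:
        fields = line.split()
        # Expected layout: proto, local address, remote address, state.
        if len(fields) >= 4 and fields[0] == "TCP" and fields[1].endswith(suffix):
            states[fields[3]] += 1
    return states

sample = """\
TCP 10.19.56.123:443 10.19.10.25:32914 ESTABLISHED
TCP 10.19.56.123:443 10.19.10.25:32931 FIN_WAIT_2
TCP 10.19.56.123:443 10.19.10.25:32964 ESTABLISHED
TCP 10.19.56.123:443 10.19.10.25:32998 FIN_WAIT_2
""".splitlines()

for state, count in count_states(sample).most_common():
    print(state, count)
```

To use live data instead of the sample, the lines could come from subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout.splitlines().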