Automic Workload Automation

  • 1.  PSA: JOBFs stuck in "Connecting"

    Posted Jun 26, 2017 10:08 AM
    Public Service Announcement:

    Say you have two JOBFs inside a JOBP. Now assume the first JOBF takes a file off of server #1 and puts it onto server #2. The second JOBF takes the same file off of server #2 and puts it on server #3 (I'm not saying this is overly useful, I'm just saying that I found myself debugging this exact scenario).

    You might then find the second JOBF occasionally getting stuck in a state called "Connecting". It will be stuck until the end of time (or until a power outage occurs, whichever happens first). You may also notice that when this happens, the agent logs show that the connection to the first server is terminated AFTER the second JOBF has already begun its file transfer.

    This appears especially likely to happen if your first JOBF runs over a high-latency connection. For us, the first JOBF in the chain used a network link to Germany from Russia (where even the railroad track width changes somewhere mid-route, so I imagine TCP handovers are an equally time-consuming affair ;)

    The problem goes away once you introduce a sufficiently long artificial delay between the two JOBFs - in my case, a "Generate at Runtime" Automic Script with a ":wait 3" was enough to make the problem disappear.

    So if you chain JOBFs - don't. But if you absolutely have to, put some extra delay in there.

    p.s. this is with 10.0.3 core and 10.0.8 agents. Your mileage may vary.


  • 2.  PSA: JOBFs stuck in "Connecting"

    Posted Aug 01, 2017 11:15 AM
    Hi Carsten_Schmitz_7883 ,

    Is this problem a known error?


    Do you know?
    Thanks
    Karina


  • 3.  PSA: JOBFs stuck in "Connecting"

    Posted Aug 01, 2017 11:25 AM
    Hey Karina_Mankauskas_553,

    Good question ...

    It has not been confirmed by Automic as a "known problem", but they also didn't rule it out. But since we're on engine version 10, Automic has said that we've only demonstrated this with an engine version which is out of mainstream support, and that the issue is not critical enough to warrant further investigation (by means of e.g. passing it on to development). That is, presumably, unless we can demonstrate it with a fully supported version such as V12. I can't do that at present for logistical reasons.

    I do, however, have a second thread on the matter (honest mistake, I forgot I posted this already lol). It's here: https://community.automic.com/discussion/10169/race-condition-with-two-subsequent-jobf

    In that one, Josef from Automic has kindly offered to have another look into the matter.

    Out of curiosity: Did you encounter this problem, too?

    Cheers,
    Carsten


  • 4.  RE: PSA: JOBFs stuck in "Connecting"

    Posted May 12, 2023 08:46 AM

    Hi Carsten_Schmitz_7883 ,

    we are having this same issue in version 12.3.2. Is there a known issue with this in version 12.3? Unfortunately, it could not be resolved by Support.

    thanks




  • 5.  RE: PSA: JOBFs stuck in "Connecting"

    Posted May 22, 2023 11:21 AM

    Please upgrade the agent to the latest 12.3.9.

    Regards,
    Ashwin




  • 6.  PSA: JOBFs stuck in "Connecting"

    Posted Aug 01, 2017 03:01 PM
    Hi, Carsten_Schmitz_7883

    I found one customer that is receiving the status "Connecting" for a long time in a JOBF, or the JOBF abends with "ended abnormally":

    2017-08-01 09:55:43 - U0011124 Selection started with filter '/sftp/....xxxxx530011.zip*' ...

    2017-08-01 09:55:43 - U0011125   '/sftp/...xxxx9530011.zip'

    2017-08-01 09:55:43 - U0011126 Files selected: '1'.

    2017-08-01 09:55:43 - U0063073 FT'432622437' Connection to Agent '***' terminated.

    2017-08-01 09:55:43 - U0011409 FT '432622437': FileTransfer ended abnormally.

    Could this be the same problem?


    Thanks and regards,

    Karina







  • 7.  PSA: JOBFs stuck in "Connecting"

    Posted Aug 01, 2017 03:08 PM
    Hi, Carsten_Schmitz_7883

    I found one customer that is receiving the status "Connecting" for a long time in a JOBF, or the JOBF abends with status "ended abnormally":

    2017-08-01 09:55:43 - U0011124 Selection started with filter '/sftp/....xxxxx530011.zip*' ...

    2017-08-01 09:55:43 - U0011125   '/sftp/...xxxx9530011.zip'

    2017-08-01 09:55:43 - U0011126 Files selected: '1'.

    2017-08-01 09:55:43 - U0063073 FT'432622437' Connection to Agent '***' terminated.

    2017-08-01 09:55:43 - U0011409 FT '432622437': FileTransfer ended abnormally.

    Could this be the same problem?

    thanks,
    Karina





  • 8.  PSA: JOBFs stuck in "Connecting"

    Posted Aug 02, 2017 04:20 AM
    Hi Karina_Mankauskas_553

    not entirely sure without analysing our own logs again, but at first sight, this does not look like the exact same problem to me.

    We have, however, had the same or very similar issues to the one you describe, too. We had that happen (i.e. "Files selected" immediately followed by "Connection terminated" and "File Transfer ended abnormally") on JOBFs to one particular Linux server. We occasionally saw this over the course of several weeks, and then it suddenly stopped happening. It hasn't happened since, for maybe two months.
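    If you want to check how often that sequence shows up in a log, something like the following Python sketch might help. It is not an Automic tool, just a rough idea: the message numbers (U0011126, U0063073, U0011409) and the "timestamp - Unnnnnnn text" layout are taken from the excerpt posted in this thread, and the function name find_aborted_transfers is made up for illustration. Other log formats or versions may well look different.

```python
import re
from typing import Iterable, List

# Message numbers copied from the log excerpt in this thread.
FILES_SELECTED = "U0011126"
CONNECTION_TERMINATED = "U0063073"
ENDED_ABNORMALLY = "U0011409"

# Assumed layout, based on the excerpt: "YYYY-MM-DD HH:MM:SS - U<digits> <text>"
LINE_RE = re.compile(r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - (U\d+)\b")


def find_aborted_transfers(lines: Iterable[str]) -> List[str]:
    """Return the timestamps of 'Files selected' messages that are directly
    followed by 'Connection terminated' and 'FileTransfer ended abnormally'."""
    # Keep only lines that match the assumed layout (blank lines are skipped).
    parsed = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            parsed.append((match.group(1), match.group(2)))

    wanted = (FILES_SELECTED, CONNECTION_TERMINATED, ENDED_ABNORMALLY)
    hits = []
    # Slide a window of three consecutive messages over the parsed log.
    for i in range(len(parsed) - 2):
        codes = tuple(code for _, code in parsed[i:i + 3])
        if codes == wanted:
            hits.append(parsed[i][0])
    return hits
```

    Feed it the lines of a log file (for example via open(path) with a path of your choosing) and it returns one timestamp per occurrence. It only looks at messages that are directly adjacent, so any other message in between would hide a match.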

    In our case, the root cause might have been a network issue, but we also observed that even after the network issues were no longer occurring, the agent would still not reconnect, even though it should have. Automic ultimately recognized that the agent should have reconnected, but didn't. But long story short, there won't be fixes for the AE version we're on anymore (Version 10), and therefore also no problem reports.

    I heard at the time, and got confirmation at AutomicWorld too, that significant parts of the net code for the Agent have been rewritten in version 12. Specifically, the old agent reportedly had a bit of feature creep and thus ended up with multiple socket libraries. The V12 agent is said to contain a new, unified socket library. So for our part, we are placing our hopes in version 12 (that's 12.1, because we decided 12.0 is a no-go for us).

    What AE version and agent version are you on? Are you able to replicate the problem reliably, or does it happen "once in a blue moon" as it did for us?

    Best,
    Carsten