AutoSys Workload Automation

  • 1.  Continuous Autosys file-watcher issues when the agent on the server loses connectivity

    Posted Oct 29, 2015 02:37 PM

    Hi all -

     

    We have implemented an Autosys file-watcher as a continuous job that looks for files and runs to success only when it finds one or more files. The parent box is driven by the continuous file-watcher and a global variable, so it doesn't need a schedule to kick off. It worked as a clean data-driven design until we found the drawback below, which is related to Autosys agent connectivity issues.

     

    When the Autosys agent on the server loses connectivity, the file-watcher loses its connection too. However, when the agent re-establishes the connection, the file-watcher does not pick it up and remains stuck in "Running" status. Teams end up killing and restarting the file-watcher to reset it.

     

    Has anyone noticed similar issues with a continuous file-watcher? If so, what resolution did you put in place?

     

    Any suggestions or workarounds would be really appreciated.

     

    Thanks,
    Swetha



  • 2.  Re: Continuous Autosys file-watcher issues when the agent on the server loses connectivity

    Posted Oct 29, 2015 04:14 PM

    Swetha,

    Could you give some additional details on this? What connection is lost? I assume the lost connection is the file mount for the file that the file-watcher monitors, or something like that? What OS?

    Thanks,

    Brad E



  • 3.  Re: Continuous Autosys file-watcher issues when the agent on the server loses connectivity
    Best Answer

    Posted Oct 29, 2015 04:32 PM

    Hi Brad -

    The Autosys agent loses the connection when the machine where the FW runs is taken offline, goes through a quick bounce, etc. (could be any OS). Continuous jobs have issues with this because the agent can no longer send the job status back to the scheduler. We just got the response below from CA as a workaround.

     

    Response from CA:

    "If the machine is taken offline, whether via sendevent or reboot, the recommendation would be to force the continuous job to complete via sendevent -E KILLJOB -J job_name.  The reason for this is that the scheduler will lose track of the job, because the agent will no longer be communicating the status of the job back to the scheduler.

     

    You would want to kill the job prior to taking offline, and then start back up after the machine is brought back online, for clean job processing."
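
    For anyone who wants to script this around a maintenance window, here is a rough sketch of the sequence CA describes. The KILLJOB call is taken from their response. Restarting with FORCE_STARTJOB is my assumption (STARTJOB may suit you better if the job's starting conditions should be honored), and the function names and the DRY_RUN switch are made up for illustration. By default it only prints the commands, so nothing is sent to the scheduler.

    ```shell
    #!/bin/sh
    # Sketch only: wraps the kill-before / restart-after sequence from CA's response.
    # The function names and the DRY_RUN switch are hypothetical helpers.

    # DRY_RUN=1 (default) prints each command; DRY_RUN=0 actually runs it.
    DRY_RUN="${DRY_RUN:-1}"

    run() {
        if [ "$DRY_RUN" = "1" ]; then
            echo "$*"
        else
            "$@"
        fi
    }

    # Before the machine goes offline: force the continuous job to complete.
    pre_maintenance() {
        run sendevent -E KILLJOB -J "$1"
    }

    # After the machine and agent are back online: start the file-watcher again.
    # FORCE_STARTJOB is an assumption; CA only said to "start back up".
    post_maintenance() {
        run sendevent -E FORCE_STARTJOB -J "$1"
    }
    ```

    Usage would be something like `pre_maintenance my_fw_job` before the bounce and `post_maintenance my_fw_job` once the agent is reachable again, where `my_fw_job` is a placeholder for your file-watcher job name.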

     

    ~Swetha