We have a job that occasionally fails to connect to a remote host; restarting the job manually works every time. I would like ESP to retry the job up to three times, with a 5-minute sleep before each retry, before it is treated as failed.
A similar scenario is listed in the Examples Cookbook on DocOps.
Great, thank you, very helpful.
If a particular job fails, it should be resubmitted from the top after a 5-minute delay.
Take the following steps:
1. When you define the job, use a NOTIFY statement that identifies the Alert to be triggered if the job fails. For example:
NOTIFY FAILURE ALERT(BAD)
2. Define the Alert using the ALERTDEF command or initialization parameter. For example:
OPER ALERTDEF ADD ID(BAD) EVENT(CYBER.RESUB_JOB)
3. Set up the Alert Procedure and Event. The Alert Event invokes the following Procedure:
IF ESPREEXEC#=0 THEN DO
  SEND '%MNJOB HAS FAILED WITH %MNCMPC' U(*)
  SEND 'AUTOMATIC RESUBMISSION IN 5 MINUTES' U(*)
  REEXEC IN(5)
ENDDO
ELSE ESP AJ %MNJOB RESUB APPL(%MNAPPL..%MNAPPLGEN)
When the Alert Procedure is invoked, the ESPREEXEC# variable is 0. Two SEND messages are sent to the user. The Procedure is then scheduled to re-execute 5 minutes later. At that time, ESPREEXEC# is 1, and an AJ command resubmits the failed job. Monitor variables are used to ensure the job is resubmitted in the correct generation (MNAPPLGEN) of the correct Application (MNAPPL).
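The two-pass flow described above can be easier to follow as a small simulation. The following is only a sketch in Python, not ESP code: the names AlertRun, alert_procedure, and simulate are hypothetical, as are the job name, Application name, generation number, and completion code. It assumes the first pass (re-execution count 0) sends the two messages and asks to be run again in 5 minutes, and the second pass (count 1) issues the resubmit.

```python
from dataclasses import dataclass, field


@dataclass
class AlertRun:
    """Hypothetical stand-in for one failed-job Alert; not an ESP API."""
    job: str              # plays the role of %MNJOB
    appl: str             # plays the role of %MNAPPL
    generation: int       # plays the role of %MNAPPLGEN
    completion_code: str  # plays the role of %MNCMPC
    log: list = field(default_factory=list)


def alert_procedure(run, reexec_count):
    """Mimic one invocation of the Alert Procedure.

    Returns the delay in minutes before the next invocation,
    or None when no further invocation is requested.
    """
    if reexec_count == 0:
        # First pass: notify the user, then ask to be re-executed.
        run.log.append(f"SEND {run.job} HAS FAILED WITH {run.completion_code}")
        run.log.append("SEND AUTOMATIC RESUBMISSION IN 5 MINUTES")
        return 5
    # Later pass: resubmit in the same Application generation.
    # In the ESP text, the doubled period in %MNAPPL..%MNAPPLGEN ends the
    # first variable name and then supplies the literal period.
    run.log.append(f"AJ {run.job} RESUB APPL({run.appl}.{run.generation})")
    return None


def simulate(run):
    """Drive the procedure and record (minute, reexec_count) per invocation."""
    clock = 0
    reexec_count = 0
    timeline = []
    while True:
        delay = alert_procedure(run, reexec_count)
        timeline.append((clock, reexec_count))
        if delay is None:
            return timeline
        clock += delay
        reexec_count += 1


run = AlertRun(job="PAYJOB", appl="PAYROLL", generation=12, completion_code="S0C7")
timeline = simulate(run)
```

Note that this flow resubmits once per Alert. Capping the number of retries at three, as asked in the original question, would need an extra counter that this sketch does not model.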
Another approach is to have the Alert Procedure build a one-job Application that contains a link with a DELAYSUB time of NOW PLUS 5 MINUTES. The link can issue the ESP AJ command to resubmit the failed job when the DELAYSUB time is met.
Rick beat me by 30 seconds, but I noticed that my example is also out in DocOps:
And, for the record, I would feel remiss if I didn't include a link to the main Cookbook page:
Great, thank you, this is very helpful. I looked and looked and could not find the cookbook page.