Objective
If a particular job fails, it should be resubmitted from the top after a 5-minute delay.
Solution
Take the following steps:
1. When you define the job, use a NOTIFY statement that identifies the Alert to be triggered if the job fails. For example:
JOB A
  NOTIFY FAILURE ALERT(BAD)
  RUN ANYDAY
ENDJOB
2. Define the Alert using the ALERTDEF command or initialization parameter. For example:
OPER ALERTDEF ADD ID(BAD) EVENT(CYBER.RESUB_JOB)
3. Set up the Alert Event (CYBER.RESUB_JOB in this example) and its Procedure. The Alert Event invokes the following Procedure:
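/* First pass (ESPREEXEC#=0): notify the user and re-execute in 5 minutes. */
/* Second pass (ESPREEXEC#=1): resubmit the failed job.                   */
IF ESPREEXEC#=0 THEN DO
  SEND '%MNJOB HAS FAILED WITH %MNCMPC' U(*)
  SEND 'AUTOMATIC RESUBMISSION IN 5 MINUTES' U(*)
  REEXEC IN(5)
ENDDO
ELSE ESP AJ %MNJOB RESUB APPL(%MNAPPL..%MNAPPLGEN)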
Explanation
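When the Alert Procedure is first invoked, the ESPREEXEC# variable is 0, so the Procedure sends two messages to the user and schedules itself to re-execute 5 minutes later. On that re-execution, ESPREEXEC# is 1, so the ELSE branch runs and an AJ command resubmits the failed job. Monitor variables are used to ensure the job is resubmitted in the correct generation (MNAPPLGEN) of the correct Application (MNAPPL). In %MNAPPL..%MNAPPLGEN, the first period ends the variable name and the second is the literal period that separates the Application name from the generation number.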
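The two-pass flow can also be modeled outside ESP to make the control logic concrete. The following Python sketch is only an illustration and is not ESP code: the function name alert_procedure, the dictionary of monitor values, the sample values, and the returned action tuples are all hypothetical stand-ins for the Procedure, the monitor variables, and the SEND, REEXEC, and AJ commands.

```python
# Illustrative model of the Alert Procedure's two passes (not ESP code).
# All names here are hypothetical stand-ins for ESP variables and commands.

def alert_procedure(reexec_count, mon):
    """Return the actions the Procedure would take on a given pass.

    reexec_count models ESPREEXEC#; mon models the monitor variables.
    """
    if reexec_count == 0:
        # First pass: notify the user, then ask to run again in 5 minutes.
        return [
            ("SEND", f"{mon['MNJOB']} HAS FAILED WITH {mon['MNCMPC']}"),
            ("SEND", "AUTOMATIC RESUBMISSION IN 5 MINUTES"),
            ("REEXEC", 5),
        ]
    # Later pass: resubmit in the same Application and generation.
    target = f"{mon['MNAPPL']}.{mon['MNAPPLGEN']}"
    return [("AJ", f"{mon['MNJOB']} RESUB APPL({target})")]


# Sample values are made up for the illustration.
mon = {"MNJOB": "A", "MNCMPC": "S0C7", "MNAPPL": "PAYROLL", "MNAPPLGEN": 12}
first_pass = alert_procedure(0, mon)
second_pass = alert_procedure(1, mon)
print(first_pass)
print(second_pass)
```

The model shows why a single Procedure is enough: the re-execution counter selects the branch, so the same Event handles both the notification and the delayed resubmission.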
Variation
Another approach is to have the Alert Procedure build a one-job Application that contains a link with a DELAYSUB time of NOW PLUS 5 MINUTES. When the DELAYSUB time is reached, the link issues the ESP AJ command to resubmit the failed job.