CA Tuesday Tip: (CA ESP) Start CA WA ESP System With Different Parms

Discussion created by Lucy_zhang Employee on Jan 3, 2012
CA Tuesday Tip by Lucy Zhang, Principal Support Engineer for 01/03/2012

Note:
- More detail can be found in the Operator's Guide.
- It is best practice to add the following to ESPPARM so that JESMSGLG shows how CA WA ESP was started (not all parms have a related variable; SKIP and PURGE, for example, do not):
ECHO '----------------'
ECHO 'START UP OPTIONS'
ECHO '----------------'
ECHO 'COLD =' %COLD
ECHO 'QUIESCE =' %QUIESCE
ECHO 'RELOAD =' %RELOAD
ECHO 'QFORM =' %QFORM
ECHO 'RESFORM =' %RESFORM


QUIESCE: Starts the CA WA ESP system in the quiesced state, which defers event execution. Use the RESTART command to take it out of the quiesced state. This is commonly used for Disaster Recovery tests and for major changes to the CA WA ESP system (such as increasing the APPLFILE), so that operators can validate the system first.

RELOAD: Specifies that all CA WA ESP modules residing in CSA be reloaded (that is, control blocks are re-initialized). Use this when the CA WA ESP system has been upgraded, or when new aggregate maintenance or a group of PTFs has been applied.

SKIP: Skips the first TDR (time-driven request) in the queue. For an 0C4 abend, the exception is most often related only to the first TDR, so recycling ESP with PARM=SKIP skips that TDR and lets ESP process the others. The skipped TDR then needs to be handled manually.

PURGE: Forces the CA WA ESP system to perform an initial start, as if there had been an IPL. This is useful for inconsistent behavior after an upgrade, after applying PTFs, or after a system outage.
Note: If you use the CA WA EE Restart Option (formerly called Encore), stop the Encore auxiliary address space before bringing down CA WA ESP for a start with the PURGE parameter, by issuing an "/F esp_stc,AUX_AS CANCEL" command or a FORCE ARM operator command.

COLD: Reformats the checkpoint file. This is needed when the checkpoint file must be enlarged, or when error messages such as “INSUFFICIENT CHECKPOINT SPACE TO QUEUE ……” or “No CKPT space to issue ……” keep appearing even 30 minutes after corrective action has been taken (such as turning TRACKING back to STORE). The following items are lost in a cold start:
■ Event class actions - Any CLASS actions entered like hold, ignore or suspend are lost.
■ Trigger-added Events - If an Event was added to the schedule via a TRIGGER ADD command, it is lost over a cold start.
■ Pending application manager actions - Application actions that are pending are lost in a cold start. These are entries that can show as queued for submission in CSF. Resubmit these jobs. (Items displayed with a leading + in the output from the LISTSCH command are pending application actions.)

Because a COLD start can have a negative impact, we recommend:
- Make sure the checkpoint file has enough space by issuing LISTCKPT in a batch job when the system is busy. Usage should not be over 70%.
- 3 cylinders is normally big enough for a Master system, and 5 cylinders is enough even for a very busy Master.
- Before stopping the CA WA ESP system for a COLD start:
1. Issue “OPER QUIESCE”.
2. Issue “LSCH DATE” and keep a copy of the output. The operator should check the lines with a leading ‘+’ after the COLD start.
3. Issue “OPER LISTHIST” and “OPER LISTEVS”, and validate the output against the ESPCOLD member.
4. Stop the proxies before the COLD start of the Master, so that they do not flood the Master with data before it is initialized.
- After starting the CA WA ESP system with PARM=COLD:
1. Reply “Y” to the prompt to format the checkpoint file.
2. Issue “LSCH DATE” again and validate the output against the previous output.
3. Issue “OPER RESTART” to resume processing.
4. Warm start the proxies one by one.
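
Comparing the two “LSCH DATE” outputs by eye is error-prone on a busy system, so the check could be scripted off-host. The following is only a rough Python sketch, not a CA-supplied tool. It assumes both outputs have been saved as plain text, and the only thing it relies on is the convention described above: a line with a leading '+' is a pending application action. The function names and the sample lines are hypothetical, and real LSCH output will look different.

```python
from __future__ import annotations


def pending_lines(listing: str) -> list[str]:
    """Return the lines flagged with a leading '+' (pending application actions).

    Whitespace is normalized so that spacing differences between the two
    captures do not produce false mismatches.
    """
    result = []
    for line in listing.splitlines():
        normalized = " ".join(line.split())
        if normalized.startswith("+"):
            result.append(normalized)
    return result


def missing_after_cold_start(before: str, after: str) -> list[str]:
    """Pending lines seen before the COLD start that are absent afterwards."""
    still_present = set(pending_lines(after))
    return [line for line in pending_lines(before) if line not in still_present]


if __name__ == "__main__":
    # Hypothetical sample text, NOT real LSCH output.
    before = "+ ENTRY_A\n  ENTRY_B\n+ ENTRY_C\n"
    after = "  ENTRY_B\n"
    for line in missing_after_cold_start(before, after):
        print("check manually:", line)
```

Each line the script reports is a candidate for the manual follow-up described for pending application manager actions (for example, resubmitting the job). The operator still decides what to do with it.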
