PeterCaci

Longer run times after consolidating batch to single LPAR

Discussion created by PeterCaci on Dec 5, 2017
Latest reply on Dec 18, 2017 by PeterCaci

Hello all,

First, I'm an MVS guy very new to IDMS, so please excuse my lack of IDMS experience/knowledge.  I'm running with this since our IDMS folks are busy and I'm impatient.  Plus, I have a bad feeling I'm the cause.  :-)

Traditionally, our batch IDMS jobs have been split across two LPARs: SYSA and SYSB.  Both LPARs are participants in a sysplex and share DASD.  SYSA is designated as production and SYSB as development.  The split workload is a leftover from a near-forgotten time when there were two physical mainframes on the floor (today there is only one).  The production IDMS database lives on SYSA, and all jobs running in CV mode ran on SYSA.  All jobs running in LOCAL mode ran on SYSB.

We've been pushing the capacity of our box, so, as an MVS'er, I wanted the ability to manipulate LPAR weighting to give production workloads more system resources as needed.  Can't do that when production is also running on the dev LPAR.  Technically I could, but I didn't want to squash the prod work on SYSB.  So I convinced the team to consolidate all production batch to a single LPAR (SYSA), which, of course, includes the previously mentioned IDMS jobs.

Since the change, some IDMS batch jobs (not all) are running longer and experiencing greater variation in overall run time.  Job stats show CPU and I/O are nearly the same as before.  Only the elapsed time appears to be significantly affected, resulting in a noticeably longer batch cycle (long enough to generate complaints from the app folks).  The longer elapsed time does not appear to correlate with CV mode jobs or LOCAL mode jobs.  Other than being a 'database job', I can't find any commonality.  The job schedule hasn't changed; jobs that ran concurrently before are running concurrently now (JES initiators were adjusted accordingly).

Any thoughts?  Suggestions?

If it's not CPU or I/O, that leads me to suspect wait time.  But where is the wait time?  I don't notice any significant ENQs in RMF.  DASD response time is mostly sub-millisecond.

Talking with our IDMS folks, there is awareness of some index, buffer, and page size optimizations that need to be made.  However, this was already the case before the batch consolidation.  Could batch consolidation have exacerbated the aforementioned issues?

Thanks in advance!
