ESP Workload Automation

  • 1.  Monitor ESP Job status

    Posted Jul 25, 2017 02:42 PM

    Does anybody have a method for automatically monitoring job statuses such as 'ready', 'reswait', etc.? I ask because we are being asked to add logic that will let 24x7 know if a job has been stuck in Ready for x minutes.

     

    Does anyone monitor these types of statuses, or any others? Is it manual or automated? I'm open to other suggestions that could add value. Thanks!
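
    One way to frame the "stuck in Ready for x minutes" check is to poll job states on a fixed interval and remember when each job was first seen in the watched state. The following is a minimal sketch under stated assumptions, not an ESP feature: the class name, the snapshot shape (job name mapped to a state string), and the state labels are all hypothetical, and how the snapshot is obtained from ESP is left out entirely.

```python
from datetime import datetime, timedelta
from typing import Dict, List


class StuckJobTracker:
    """Tracks how long each job has been continuously seen in one watched state.

    Hypothetical helper: the snapshot format is an assumption, not ESP output.
    """

    def __init__(self, watched_state: str, threshold: timedelta) -> None:
        self.watched_state = watched_state
        self.threshold = threshold
        self._first_seen: Dict[str, datetime] = {}

    def update(self, snapshot: Dict[str, str], now: datetime) -> List[str]:
        """Record one polling snapshot and return the jobs at or over the threshold."""
        # Forget jobs that left the watched state or no longer appear at all.
        for job in list(self._first_seen):
            if snapshot.get(job) != self.watched_state:
                del self._first_seen[job]
        # Start the clock for jobs newly seen in the watched state.
        for job, state in snapshot.items():
            if state == self.watched_state:
                self._first_seen.setdefault(job, now)
        return sorted(
            job
            for job, since in self._first_seen.items()
            if now - since >= self.threshold
        )


# Usage with made-up job names and state labels.
tracker = StuckJobTracker("READY", timedelta(minutes=15))
t0 = datetime(2017, 7, 25, 14, 0)
first = tracker.update({"JOBA": "READY", "JOBB": "EXEC"}, t0)
second = tracker.update({"JOBA": "READY", "JOBB": "READY"}, t0 + timedelta(minutes=20))
```

    Because the clock resets whenever a job leaves the watched state, a job that bounces in and out of Ready is not flagged; whether that is the desired behavior depends on the site.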



  • 2.  Re: Monitor ESP Job status
    Best Answer

    Posted Jul 25, 2017 04:17 PM

    We have created a number of different views in CSF that all Operators are expected to look at on a defined interval. 

     

    One of them is the READY View (PNODE = READY).

     

    If jobs are present, they are expected to be researched further by the Operator.

     

    We also have an automated process that creates a job the Operators must address, so we know that they are going through the CSF Views at the defined interval.



  • 3.  Re: Monitor ESP Job status

    Broadcom Employee
    Posted Jul 26, 2017 09:10 AM

    Rick, thank you so much for your continuous support. About the views you mentioned: if possible, could you please send me some examples via email? From a monitoring perspective, which combinations of data is your team most interested in: job states, anticipated end/start time, minimum run time? It will help me better understand the use cases you have.

     

    Thank you

    Tatevik 



  • 4.  Re: Monitor ESP Job status

    Posted Jul 26, 2017 12:46 AM

    We have only one Operator per 12-hour shift, so we had to automate pretty much everything. At this time we check for jobs that have exceeded their average run time, agent notify/transmitter busy, externals, queued appls, reswait, held appls, held jobs, held events, suspended events, and wob trigger errors. With a few exceptions, one can get everything needed from an 'lap - all incomplete' and parse out the information. So, every half hour we issue the above, dump it to a flat file, take a snapshot of the online log (for the whos), and run a series of jobs to retrieve info on all of the above and dump it to files. That data is then drawn upon by a verbose web page, a consolidated web page, and a series of ISPF panels that are more intuitive. It's then up to the Ops to determine what needs to be actioned immediately and what can be left.
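
    The consolidation step described above (many job records reduced to a short list of things an Operator should look at) can be sketched roughly as follows. This is an illustration only: it assumes the flat file has already been parsed into (job name, state) pairs, and the function name and state labels are hypothetical rather than actual ESP output.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical labels for states that need Operator attention.
ACTION_STATES = {"READY", "RESWAIT", "HELD"}


def summarize(records: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group job names by state, keeping only the states that need attention."""
    by_state: Dict[str, List[str]] = defaultdict(list)
    for job, state in records:
        if state in ACTION_STATES:
            by_state[state].append(job)
    # Sort states and job names so successive reports are easy to compare.
    return {state: sorted(jobs) for state, jobs in sorted(by_state.items())}


# Usage with made-up records.
report = summarize([
    ("JOBA", "READY"),
    ("JOBB", "EXEC"),
    ("JOBC", "RESWAIT"),
    ("JOBD", "READY"),
])
```

    A report shaped like this could feed either a web page or a panel; the parsing of the real command output is the site-specific part and is deliberately not shown.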



  • 5.  Re: Monitor ESP Job status

    Broadcom Employee
    Posted Jul 26, 2017 08:56 AM

    Thank you for bringing this up for discussion. The ESP team has been working on public REST APIs for monitoring, and this could be a good candidate for extending those endpoints to show the data you are interested in. We are also building an interface to display this monitoring data. We are going to have a sprint review on August 1st; if you are registered on validate.ca.com, you should already have the invite. We will show you our progress and share the next steps and expectations. If you do not have the invite and are interested, just contact me at tatevik.stepanyan@ca.com and I will forward it to you. I would also be happy to set up a call with you and our internal team to discuss the use cases you are solving and to see the web page you have created, which can help us better reflect your needs in our planning.

     

    Thank you for your interest, 

    Tatevik