System healthcheck

Discussion created by Antoine_Sauteron_1266 on Aug 4, 2016
These questions come up a lot: How should I check the health of my Automation Engine? Is there anything I should monitor?

So I put together a set of objects that summarize some of the most commonly requested system information:

  • Automation Engine and initial data version.
  • CPU and DB performance (as can be seen in the server's logs).
  • Status of server processes.
  • Count of active tasks.
  • Usage over the last minute, the last 10 minutes, and the last hour.
  • Number of records stored in the RH table for each client, which is usually a good indicator of whether maintenance should be performed, and for which client(s) first.

The output looks like this:
[Image: y4f4onb856hz.jpg]

A few adjustments are needed before you execute the objects in your environment:

1) Make sure that the SQL is stored in the tab that matches your database type in the VARA.SQLI objects.

2) Adjust the location of the WP log in this script line:
:SET &HND# = PREP_PROCESS_FILE(<Agent>,"<directory>\WP_srv_log_001_00.txt","*Performance*")
> The Agent should be running on the Automation Engine server.
> The directory should be the one where the server logs are stored (e.g. C:\AutomationEngine\temp).
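
As an illustration, the adjusted line might look like the sketch below. The agent name WIN01 is a hypothetical placeholder, the directory is simply the example path given above, and the log file name is the one used in the SCRI.HEALTHCHECK listing, so it may differ in your installation.

```
! Hypothetical values: agent WIN01 runs on the Automation Engine server,
! and the server logs are written to C:\AutomationEngine\temp.
:SET &HND# = PREP_PROCESS_FILE("WIN01","C:\AutomationEngine\temp\WP_srv_log_001_00.txt","*Performance*")
```

The third parameter is the filter: only log lines that match *Performance* are read by the :PROCESS loop.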

Of course, this only offers a limited view and can (and should) be improved. I'm attaching the objects to this post; please feel free to import them and give some feedback.

I'm putting the code below, in case the import fails:

SCRI.HEALTHCHECK
!VERSION SUMMARY
:P [VERSION]
:SET &VERSION# = SYS_INFO(SERVER, VERSION, ALL)
:PRINT Automation Engine version &VERSION#
:SET &INITIALDATA# = SYS_INFO(INITIALDATA, VERSION, ALL)
:P Initial data version &INITIALDATA#
:P ""

!SYSTEM PERFORMANCE
:P [GLOBAL PERFORMANCE]
:SET &HND# = PREP_PROCESS_FILE(<Agent>,"<directory>\WP_srv_log_001_00.txt","*Performance*")
:PROCESS &HND#
:   SET &LINE# = GET_PROCESS_LINE(&HND#)
:   SET &LINE# = MID(&LINE#,104)
:   PRINT &LINE#
:ENDPROCESS
:P ""

! Display status of server processes
:P [STATUS OF SERVER PROCESSES]
:SET &HANDLE# = PREP_PROCESS_VAR("VARA.SQLI.SERVER.PROCESSES")
:PROCESS &HANDLE#
:   SET &SERVER_PROCESS# = GET_PROCESS_LINE(&HANDLE#,1)
:   SET &PROCESS_ALIVE# = SYS_SERVER_ALIVE("&SERVER_PROCESS#")
:   SWITCH &PROCESS_ALIVE#
:     CASE "N"
:       PRINT &SERVER_PROCESS# is down.
:     CASE "Y"
:       PRINT &SERVER_PROCESS# is up.
:   ENDSWITCH
:ENDPROCESS
:P ""

!Display total workload usage
:P [TASKS COUNT]
!COUNT active tasks
:SET &COUNT_ACTIVE_TASKS#=SYS_ACTIVE_COUNT("ANY_ALIVE","*")
:SET &COUNT_ACTIVE_TASKS#=FORMAT(&COUNT_ACTIVE_TASKS#)
:P There are currently &COUNT_ACTIVE_TASKS# active tasks.
:P ""

!Calculate workload
:P [WORKLOAD]
!during the last minute
:SET &SYSBUSY01# = SYS_BUSY_01()
:SET &SYSBUSY01#=FORMAT(&SYSBUSY01#)
:P &SYSBUSY01#% during the last minute.
! during the last 10 minutes
:SET &SYSBUSY10# = SYS_BUSY_10()
:SET &SYSBUSY10#=FORMAT(&SYSBUSY10#)
:P &SYSBUSY10#% during the last 10 minutes.
! during the last hour
:SET &SYSBUSY60# = SYS_BUSY_60()
:SET &SYSBUSY60#=FORMAT(&SYSBUSY60#)
:P &SYSBUSY60#% during the last hour.
:P ""

!COUNT RECORDS IN RH TABLE
:P [NUMBER OF REPORTS PER CLIENT]
:SET &HND2# = PREP_PROCESS_VAR(VARA.SQLI.COUNT.REPORTS.PER.CLIENT)
:PROCESS &HND2#
:   SET &CLIENT# = GET_PROCESS_LINE(&HND2#,'2')
:   SET &VALUE# = GET_PROCESS_LINE(&HND2#,'3')
:   SET &CLIENT# = FORMAT(&CLIENT#, "0000")
:   PRINT Client &CLIENT# : &VALUE# records
:ENDPROCESS

VARA.SQLI.COUNT.REPORTS.PER.CLIENT
SELECT RH_Client AS Client, COUNT(*) AS RecordCount
FROM RH
GROUP BY RH_Client
ORDER BY COUNT(*) DESC
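
To act on these counts, the script could also flag the clients that exceed a chosen record count. This is only a sketch: the variable &LIMIT# and its value are assumptions made for illustration, and the column positions are the ones already used in SCRI.HEALTHCHECK.

```
! Assumption: 1000000 is an arbitrary example threshold, not a recommended value.
:SET &LIMIT# = 1000000
:SET &HND2# = PREP_PROCESS_VAR(VARA.SQLI.COUNT.REPORTS.PER.CLIENT)
:PROCESS &HND2#
:   SET &CLIENT# = GET_PROCESS_LINE(&HND2#,2)
:   SET &VALUE# = GET_PROCESS_LINE(&HND2#,3)
!   Only report the clients above the threshold
:   IF &VALUE# > &LIMIT#
:      PRINT Client &CLIENT# has &VALUE# records, consider running maintenance.
:   ENDIF
:ENDPROCESS
:CLOSE_PROCESS &HND2#
```

Because the query sorts by record count in descending order, the clients that need attention first appear at the top of the output.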

VARA.SQLI.SERVER.PROCESSES
SELECT OH_Name
FROM OH
WHERE OH_OTYPE='SERV'
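
Since the report is meant to be extended, the [WORKLOAD] section of SCRI.HEALTHCHECK could also compare the measured workload with a limit instead of only printing it. The sketch below assumes a hypothetical variable &BUSY_LIMIT# and an arbitrary limit of 80 percent; pick whatever value makes sense for your system.

```
! Assumption: 80 is an arbitrary example limit (percent), not a recommended value.
:SET &BUSY_LIMIT# = 80
:SET &SYSBUSY10# = SYS_BUSY_10()
:SET &SYSBUSY10# = FORMAT(&SYSBUSY10#)
:IF &SYSBUSY10# > &BUSY_LIMIT#
:   PRINT WARNING: workload was &SYSBUSY10#% during the last 10 minutes (limit &BUSY_LIMIT#%).
:ELSE
:   PRINT Workload during the last 10 minutes is within the limit of &BUSY_LIMIT#%.
:ENDIF
```

The same pattern would apply to SYS_BUSY_01() and SYS_BUSY_60() if a shorter or longer window is more relevant.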