Automic Workload Automation

SYS_HOST_ALIVE Results

1. SYS_HOST_ALIVE Results

Anon Anon
Posted Oct 25, 2013 05:26 PM

I had always mistakenly assumed that the SYS_HOST_ALIVE script function was similar to a UNIX ping and reflected the current state of the Agent’s connection and its ability to perform work. It appears, though, that this is only accurate when the Agent has been stopped or terminated by UC4 functions. If the network connection has been lost or the server is genuinely dead, the results are suspect for a period of time.

I believe that this delay is controlled by the KEEP_ALIVE parameter in the UC_HOSTCHAR_DEFAULT System Variable object in Client 0, which is currently set to 1800 seconds for all of our Agents. That means, as far as I can tell, UC4 checks to see if the Agent is “alive” every 30 minutes. It appears that even at its best, with this value set to 60 seconds, the status returned by SYS_HOST_ALIVE could be up to two minutes out of date.

Many of our applications have multiple server configurations, with a primary host and one or more alternate hosts on which to execute their tasks. The way the host is selected depends almost entirely on the result of the SYS_HOST_ALIVE script function. We have used this technique for years and it has never been reported as an issue. That means that either something has changed in the UC4 definitions (I don’t think so), nobody ever noticed, or this is the first time that we have specifically tested this particular condition (the server was physically turned off while it was active). I believe it is the last of these.

So, does anyone have a suggestion for making the state of an Agent reflect its true status more immediately, or am I missing something obvious?

Thanks,
Mark

p.s. We are on Operations Manager Version 8.
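The staleness described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of how UC4 actually schedules its checks: it assumes the server polls an Agent at fixed multiples of the KEEP_ALIVE interval and learns of a failure only at the next poll. The function name and the timings are hypothetical.

```python
def reported_alive(died_at: float, query_at: float, interval: float) -> bool:
    """Status a purely poll-based monitor would report at query_at.

    Assumed model: polls happen at t = 0, interval, 2*interval, ...
    and the monitor only notices a dead host at the first poll that
    takes place at or after died_at.
    """
    # Time of the most recent poll at or before the query.
    last_poll = (query_at // interval) * interval
    # If that poll happened before the host died, the cached status is still "alive".
    return last_poll < died_at


# Host is switched off 100 s after a poll, with KEEP_ALIVE = 1800 s.
print(reported_alive(died_at=100, query_at=1700, interval=1800))  # stale "alive"
print(reported_alive(died_at=100, query_at=1800, interval=1800))  # failure noticed
```

Under this model a query made shortly before the next poll still sees the host as alive, which matches the behaviour observed when the server was powered off.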
2. SYS_HOST_ALIVE Results

Michael A. Lowry
Posted Oct 29, 2013 12:27 PM

You could accomplish this using PREP_PROCESS and UNIXCMD.
! Linux ping expects a timeout in seconds after -W; the value 2 is an assumption.
:SET &HND# = PREP_PROCESS(&AGENT#,UNIXCMD,"*","CMD=ping -c 1 -W 2 &REMOTE_HOST# && echo success","UC_LOGIN=&LOGIN#")
:PROCESS &HND#
:  SET &LINE# = GET_PROCESS_LINE(&HND#,1)
:  PRINT &LINE#
:  IF &LINE# = "success"
:    SET &HOST_ALIVE# = "TRUE"
:  ELSE
:    SET &HOST_ALIVE# = "FALSE"
:  ENDIF
:ENDPROCESS
:CLOSE_PROCESS &HND#
:PRINT "Remote host &REMOTE_HOST# alive? &HOST_ALIVE#"
Save this UC4 script in a JOBI. Then check whether the remote host is alive directly from your UC4 scripts/jobs:

1. Set the values of &AGENT#, &REMOTE_HOST#, and &LOGIN#.
2. Run the JOBI with an :INC statement.
3. Perform actions based on the value of &HOST_ALIVE#.

For this to work, you must turn on the "Generate at runtime" option for any script or job within which you include this JOBI.
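The "perform actions based on the result" step is, in the primary/alternate setup from the first post, a first-alive-host selection. A minimal sketch of that selection logic follows, written in Python rather than UC4 script so it can be run anywhere. The function name, the host names, and the stubbed liveness check are all hypothetical; in practice the check would be whatever mechanism sets &HOST_ALIVE#.

```python
from typing import Callable, Iterable, Optional


def pick_host(candidates: Iterable[str],
              is_alive: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate whose liveness check passes, else None."""
    for host in candidates:
        if is_alive(host):
            return host
    return None


# Stubbed check results standing in for a real liveness test.
status = {"primary01": False, "alternate01": True}
print(pick_host(["primary01", "alternate01"], lambda h: status[h]))
```

The selection is only as good as the check passed in, so a stale status leads straight to a wrong choice of host.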
3. SYS_HOST_ALIVE Results

Anon Anon
Posted Oct 29, 2013 02:03 PM

Michael: Thank you for your response. This is a technique that we are aware of, but we have discounted it because we don’t believe that it will work consistently for us, for the following reasons:

1. A genuine ping will only verify that the selected host responds on the network. It will not tell us that the Agent is genuinely “alive” and able to perform UC4 tasks.
2. We have many different platforms, and not all of them are currently configured to support a ping type of command or response.
3. The host that is to execute the ping must be assumed to have an active Agent itself. If it does not, we are possibly back to the original issue, and we do not want to be delayed by a “Waiting on Host”. Given the nature of our environment (connections, firewalls, security, etc.), it would be very difficult to coordinate which host has a connection to the Agent that we wish to check without some very elaborate scheme.

Again, thank you for this solution, but executing a separate job just to determine whether the Agent is alive is probably not very practical for us. I was hoping for a built-in UC4 function or technique that provides a genuine status, as I had mistakenly believed SYS_HOST_ALIVE did. My search continues. I will be at Innovate next week and will probably raise this issue in the Support Labs or another appropriate venue.
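To illustrate the first objection, a check that sits one step closer to the Agent than a ping is a TCP connection attempt to the port the Agent listens on. The sketch below is an assumption-laden illustration, not a UC4 feature: the function name, host, and port are hypothetical, and a successful connection still shows only that something is listening, not that the Agent can perform work. The third objection applies unchanged, since the check has to run from a machine that can reach the Agent.

```python
import socket


def agent_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Hypothetical usage; replace the host and port with real values.
# print(agent_port_open("agent-host.example", 2300))
```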
4. SYS_HOST_ALIVE Results

Timothy_Dodd_84
Posted Oct 29, 2013 02:26 PM

Hi Mark! I know that 'tcp_keepalive=' was added to the Linux ini file. I'm not sure in which version it was added, though, and I don't have my 8.00A docs handy. But then again, you say UNIX and not Linux. You could also explore 'tcp_keepalive_time=' in the ucsrv.ini file.
5. SYS_HOST_ALIVE Results

Anon Anon
Posted Oct 29, 2013 03:56 PM

Tim: It appears that the 'tcp_keepalive' parameter is only associated with the OS/390 Agent, and 'tcp_keepalive_time' is not documented in my Version 8.00A228-561 OM documentation. While my ping example was for UNIX, we would need this facility on all of our platforms: Linux, Windows, SAP, OS/390, etc. Thanks anyway.
