A prospect has the following concern about how Nimsoft is able to handle a job with a status of MSGW when it is used to monitor named jobs.
A job can have a status of MSGW for one of two reasons:
1. The job has stopped due to an error, has sent an inquiry message to a message queue, and is waiting for a reply to that message before it can continue. This is the most common reason for a job to be in MSGW.
2. The job is intentionally monitoring a message queue, waiting for a message to arrive.
In both cases the job status is MSGW. In case 1 the job has stopped and an alert should be raised; in case 2 the wait is by design and is part of the job’s normal operation.
The problem arises when a job that normally monitors a message queue encounters an error and stops to wait for a reply to the error message: its status was already MSGW, so the status does not change and nothing signals the failure. There are a few ways to tell the two situations apart:
1. If checking the job manually, option 7 against the job in question on the WRKACTJOB display will attempt to display the message that the job is waiting on. If the job is monitoring a message queue (case 2 above) then a message will be returned that the job is not waiting for a specific message.
2. From a program, the IBM i API QUSRJOBI can be used to interrogate any job and return a flag that indicates whether the job is not in MSGW, is in MSGW but not waiting for a specific message, or in MSGW and stopped waiting for a message reply.
3. Another way to tell the difference would be to look at the job’s joblog and see whether the last message in the joblog is an inquiry message that hasn’t been replied to, but this is reasonably fiddly compared to the other methods.
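To make the decision in method 2 concrete, here is a minimal Python sketch of the alerting logic only. It does not call QUSRJOBI itself; it assumes some collector has already retrieved the job status and the three-state flag. The names classify_job and should_alert are hypothetical, and the flag values "0", "1" and "2" are my assumption about how the API encodes the three states, so please check them against the IBM documentation before relying on them.

```python
from enum import Enum


class MsgwKind(Enum):
    NOT_WAITING = "not in MSGW"
    MONITORING_QUEUE = "in MSGW, monitoring a message queue (normal)"
    AWAITING_REPLY = "in MSGW, stopped waiting for a message reply (alert)"


# ASSUMPTION: encoding of the three-state flag returned by QUSRJOBI.
# Verify these values against the IBM i API documentation.
FLAG_TO_KIND = {
    "0": MsgwKind.NOT_WAITING,       # job is not in message wait
    "1": MsgwKind.AWAITING_REPLY,    # waiting for a reply to a specific message
    "2": MsgwKind.MONITORING_QUEUE,  # in MSGW, no specific message pending
}


def classify_job(job_status: str, message_reply_flag: str) -> MsgwKind:
    """Classify a job from its status and the (assumed) message reply flag."""
    # The flag is only meaningful when the job status is MSGW.
    if job_status.strip().upper() != "MSGW":
        return MsgwKind.NOT_WAITING
    try:
        return FLAG_TO_KIND[message_reply_flag]
    except KeyError:
        raise ValueError(f"unexpected message reply flag: {message_reply_flag!r}")


def should_alert(job_status: str, message_reply_flag: str) -> bool:
    """Alert only when the job has stopped and is waiting for a reply."""
    return classify_job(job_status, message_reply_flag) is MsgwKind.AWAITING_REPLY


if __name__ == "__main__":
    # A queue-monitoring job in its normal state, then after hitting an error.
    print(should_alert("MSGW", "2"))
    print(should_alert("MSGW", "1"))
```

The point of the sketch is that a probe checking only the status string cannot separate the two cases, while one that also reads the flag can raise an alert for a queue-monitoring job the moment it stops on an error.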
Does anyone have a suggestion or advice on how to respond to this customer?