With the information presented, there is not much to go on for a definitive cause. The job aborted with:
Child: Job launch failed. : Resource temporarily unavailable
4 23:45:01- Child:Error executing program [È×Ôÿ¥� ]
,command
4 23:45:01- Child:Error Number = -1
4 23:45:01- Child:Error errno = 22:0
Child: Job return = -1
4 23:45:01- Child: put to memory:[-1]
4 23:45:01- Child: In memory:[-1]
and it shows the program name as:
- program [È×Ôÿ¥� ]
That program name looks corrupted. When the job runs normally, does the name display correctly in the job output?
The first message is "Resource temporarily unavailable", followed by the strangely displayed program name. Later the log shows "Error errno = 22", and errno 22 normally means EINVAL, "Invalid argument".
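If you want to confirm how the agent host itself maps those error numbers, here is a minimal sketch in Python. It assumes Python is available on that machine and that the numbers in the job log are plain OS errno values, which I cannot confirm from the log alone. The helper name describe_errno is just for illustration.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, OS message) for an errno value on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # 22 is the value from the job log; EAGAIN is the errno that usually sits
    # behind the message "Resource temporarily unavailable" on Unix-like systems.
    for code in (22, errno.EAGAIN):
        name, message = describe_errno(code)
        print(f"{code}: {name} - {message}")
```

The exact message text depends on the platform's C library, so compare what this prints with the wording in your job log.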
As you mention, this issue does not occur all the time. When it does occur, does your OS admin notice any pressure on system resources such as memory? Without more information on what exactly is happening on the system during one of these occurrences, my guess is that the OS is hitting a resource limit (memory, for example), so the full argument list that needed to be passed to the program could not be completed.
I would suggest working with your OS admin to see whether there is any trend in resource usage on the machine at the times these failures occur. I would also recommend looking at the system rmi/agent logs, as they tend to have more information than the job itself about what is happening on the system at that time.
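To help spot such a trend, something like the following could be run periodically (from cron, for example) around the time the job normally starts. This is only a rough sketch under several assumptions: the agent host is Unix-like, Python is installed, /proc/meminfo exists (Linux only), and the script runs as the same user as the agent so that the process limit it reports is the relevant one. The function names read_meminfo and snapshot are made up for this example; your OS admin may already have better tooling.

```python
import os
import time

try:
    import resource  # available on Unix-like systems only
except ImportError:
    resource = None


def read_meminfo(path="/proc/meminfo"):
    """Return selected fields (in kB) from /proc/meminfo, or {} if unreadable."""
    wanted = {"MemTotal", "MemAvailable", "SwapFree"}
    values = {}
    try:
        with open(path) as fh:
            for line in fh:
                key, _, rest = line.partition(":")
                if key in wanted:
                    values[key] = int(rest.split()[0])
    except (OSError, ValueError, IndexError):
        pass
    return values


def snapshot():
    """Collect a timestamped view of load, process limit, and memory."""
    snap = {"time": time.strftime("%Y-%m-%d %H:%M:%S")}
    if hasattr(os, "getloadavg"):
        try:
            snap["loadavg"] = os.getloadavg()
        except OSError:
            pass
    if resource is not None and hasattr(resource, "RLIMIT_NPROC"):
        # (soft, hard) limit on processes for the user running this script
        snap["nproc_limit"] = resource.getrlimit(resource.RLIMIT_NPROC)
    snap["meminfo_kb"] = read_meminfo()
    return snap


if __name__ == "__main__":
    print(snapshot())
```

Appending the output to a file and lining the timestamps up against the failed job runs should show whether memory or the process limit was tight when the launch failed.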