Description of problem:
VM Universe jobs that have problems executing get put on Hold with a HoldReason set. A failure to run on one machine does not mean it would fail to execute on another VM capable execute node. VM Universe jobs that fail to run should be placed back in Idle state rather than on Hold.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Are there any details on the HoldReason?
I don't recall the HoldReason. Likely a statement that the job could not run.
Looking through the code, this is expected behavior when the job outright fails to execute with an error (usually caused by a botched submission or configuration), e.g.:
Hold reason: Error from slot1.com: VMGAHP_ERR_INTERNAL
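For triage, it may help to split a hold reason like the one quoted above into the node that reported it and the error code, so failures tied to one machine can be told apart from failures that follow the job. This is only a rough sketch: the "Error from <source>: <error>" layout is inferred from that single example, and `parse_hold_reason` is a hypothetical helper, not part of any existing tool.

```python
import re

# Assumed layout, inferred from the one example in this report:
#   "Error from <slot or host>: <error code>"
HOLD_REASON_RE = re.compile(r"^Error from (?P<source>[^:]+):\s*(?P<error>\S+)$")


def parse_hold_reason(reason):
    """Return (source, error) for a matching hold reason, else None.

    Hypothetical helper for illustration only.
    """
    match = HOLD_REASON_RE.match(reason.strip())
    if match is None:
        return None
    return match.group("source"), match.group("error")


if __name__ == "__main__":
    print(parse_hold_reason("Error from slot1.com: VMGAHP_ERR_INTERNAL"))
```

If the same error code shows up from several different sources, the problem is more likely in the submission than on any one execute node.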