Jobs can now be suspended/continued, similar to hold/release. Cumin needs to expose these operations.
Fixed in revision 5425. Added Suspend and Continue buttons to the job selection list in a submission drill down. Added "Suspend job" and "Continue job" task links inside a job details page.
Correction: fixed in 5246. The revision number in Comment 3 is transposed (5245), and the initial commit added only the buttons and the links; a lot was missing. This revision adds database upgrades, updated schema files, and changes to the Submission table drawing.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:

Cause
Suspend and continue actions on jobs were added to the QMF and Aviary interfaces in condor.

Consequence
These actions were missing from the job control options in Cumin.

Change
Suspend and Continue buttons were added alongside the Hold, Release, and Remove buttons on the job selection table under a submission. Suspend and Continue task links were added to the list of job control tasks on a job details page. Additionally, the Suspended job count statistic for submissions was added as a column on submissions lists. The "Enqueued" column values were abbreviated to make room, with the full value now displayed when hovering with the mouse.

Result
Suspend and continue functionality has been integrated into Cumin.
Note on testing:

1) Submit a job from Cumin.
2) Let it run.
3) Drill into the Submission and use the "Suspend" button to suspend the job.
4) Check the status of the job with condor_q. Status should change to "S".
5) Check the Job status column. It should show "Suspended" (with some delay).
6) Drill into the job and check the details page. Job Status here should also show "Suspended".
7) Return to the Submission list. Check the Idle/Running/Completed/Held/Suspended columns; the Suspended column should show "1".
8) Drill into the Submission and use the "Continue" button to continue the job.
9) Check the status of the job with condor_q. Status should change to "I" or "R".
10) Check the Job status column. It should show "Running" (with some delay).
11) Drill into the job and check the details page. Job Status here should also show "Running".
12) Return to the Submission list. Check the Idle/Running/Completed/Held/Suspended columns; the Running column should show "1".

The test can be repeated using the Suspend and Continue links in a job drill down instead of the buttons. Multiple jobs can also be selected at the same time.

Things complicating this test at the moment:

I managed to perform this test on a VM with just a few jobs. However, it may be affected by issues related to Bug 799838. In one instance, I saw a job disappear from QMF and Aviary after I suspended it (though condor_q still showed it as suspended). It did not return until I "Continue"'d the job and it completed. Obviously, if jobs disappear after they are suspended, it is difficult to test the continue function. We may have to re-evaluate after Bug 799838 is fixed.
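The condor_q checks in steps 4 and 9 could be scripted roughly as below. This is only a sketch, not part of the fix: the helper names (`describe_status`, `matches_expected`, `query_job_status`) are made up for illustration, and it assumes condor_q is on the PATH and that the job id is given in `cluster.proc` form. The numeric codes are the standard Condor JobStatus values, mapped to the one-letter codes condor_q prints.

```python
import subprocess

# Standard Condor JobStatus codes -> (condor_q letter, readable label).
JOB_STATUS = {
    1: ("I", "Idle"),
    2: ("R", "Running"),
    3: ("X", "Removed"),
    4: ("C", "Completed"),
    5: ("H", "Held"),
    6: (">", "Transferring Output"),
    7: ("S", "Suspended"),
}


def describe_status(code):
    """Translate a numeric JobStatus into (letter, label)."""
    return JOB_STATUS.get(int(code), ("?", "Unknown"))


def matches_expected(code, expected_letters):
    """True if the status letter for `code` is one of `expected_letters`."""
    return describe_status(code)[0] in expected_letters


def query_job_status(job_id):
    """Hypothetical helper: ask condor_q for the JobStatus of one job.

    Returns ("?", "Unknown") if condor_q prints nothing for that id,
    for example because the job has already left the queue.
    """
    out = subprocess.check_output(
        ["condor_q", "-format", "%d\n", "JobStatus", job_id]
    )
    text = out.decode().strip()
    if not text:
        return ("?", "Unknown")
    return describe_status(text.splitlines()[0])
```

After step 3, `query_job_status("<cluster>.<proc>")` would be expected to return `("S", "Suspended")`; after step 8, `matches_expected(code, {"I", "R"})` covers both acceptable outcomes. The Cumin columns still need to be checked by hand, since they update with some delay.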
Tested on RHEL 5.9/6.4 x i386/x86_64 with

condor-7.8.8-0.3
condor-classads-7.8.8-0.3
condor-qmf-7.8.8-0.3
condor-wallaby-base-db-1.25-1
condor-wallaby-client-5.0.5-1
condor-wallaby-tools-5.0.5-1
cumin-0.1.5648-1
python-condorutils-1.5-6
python-qpid-0.18-4
python-qpid-qmf-0.18-13
python-wallaby-0.16.3-1
python-wallabyclient-5.0.5-1
qpid-cpp-client-0.18-13
qpid-cpp-server-0.18-13
qpid-qmf-0.18-13
qpid-tools-0.18-7
ruby-condor-wallaby-5.0.5-1
ruby-qpid-qmf-0.18-13
ruby-wallaby-0.16.3-1
wallaby-0.16.3-1
wallaby-utils-0.16.3-1

and it works.

-->VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-0564.html