Bug 756384 - RFE: Add suspend/continue job operations
Summary: RFE: Add suspend/continue job operations
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: cumin
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: 2.3
Target Release: ---
Assignee: Trevor McKay
QA Contact: Martin Kudlej
URL:
Whiteboard:
Depends On:
Blocks: 850563 876588
 
Reported: 2011-11-23 12:23 UTC by Matthew Farrellee
Modified: 2013-03-06 18:39 UTC

Fixed In Version: cumin-0.1.5251-1
Doc Type: Enhancement
Doc Text:
Cause: Suspend and continue actions on jobs were added to the QMF and Aviary interfaces in condor.
Consequence: These actions were missing from the job control options in Cumin.
Change: Suspend and Continue buttons were added alongside the Hold, Release, and Remove buttons on the job selection table under a submission. Suspend and Continue task links were added to the list of job control tasks on a job details page. Additionally, the Suspended job count statistic for submissions was added as a column on submissions lists. The "Enqueued" column values were abbreviated to make room, with the full value now displayed when hovering with the mouse.
Result: Suspend and continue functionality has been integrated into Cumin.
Clone Of:
: 876588 (view as bug list)
Environment:
Last Closed: 2013-03-06 18:39:55 UTC




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2013:0564 normal SHIPPED_LIVE Low: Red Hat Enterprise MRG Grid 2.3 security update 2013-03-06 23:37:09 UTC

Description Matthew Farrellee 2011-11-23 12:23:37 UTC
Jobs can now be suspended/continued, similar to hold/release. Cumin needs to expose these operations.

Comment 3 Trevor McKay 2012-03-09 21:46:28 UTC
Fixed in revision 5425.

Added Suspend and Continue buttons to the job selection list in a submission drill down.  Added "Suspend job" and "Continue job" task links inside a job details page.

Comment 4 Trevor McKay 2012-03-14 16:25:09 UTC
Correction: fixed in revision 5246.  The revision number in Comment 3 is transposed (it should be 5245), and that initial commit added only the buttons and the links; a lot was missing.

This revision adds database upgrades, updated schema files, and changes to the Submission table drawing.

Comment 5 Trevor McKay 2012-03-16 21:28:21 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause
    Suspend and continue actions on jobs were added to the QMF and Aviary interfaces in condor.

Consequence
    These actions were missing from the job control options in Cumin.

Change
    Suspend and Continue buttons were added alongside the Hold, Release, and Remove buttons on the job selection table under a submission.  Suspend and Continue task links were added to the list of job control tasks on a job details page.  Additionally, the Suspended job count statistic for submissions was added as a column on submissions lists.  The "Enqueued" column values were abbreviated to make room, with the full value now displayed when hovering with the mouse.

Result
    Suspend and continue functionality has been integrated into Cumin.

Comment 6 Trevor McKay 2012-04-26 17:36:56 UTC
Note on testing:

1) Submit a job from Cumin.
2) Let it run.
3) Drill into the Submission and use the "Suspend" button to suspend the job.
4) Check the status of the job with condor_q.  Status should change to "S".
5) Check the Job status column.  It should show "Suspended" (with some delay).
6) Drill into the job and check the details page.  Job Status here should also show "Suspended".
7) Return to the Submission list.  Check the Idle/Running/Completed/Held/Suspended columns; the Suspended column should show "1".

8) Drill into the Submission and use the "Continue" button to continue the job.
9) Check the status of the job with condor_q.  Status should change to "I" or "R".
10) Check the Job status column.  It should show "Running" (with some delay).
11) Drill into the job and check the details page.  Job Status here should also show "Running".
12) Return to the Submission list.  Check the Idle/Running/Completed/Held/Suspended columns; the Running column should show "1".
 

The test can be repeated using the Suspend and Continue links in a job drill down, as opposed to the buttons.  Also, multiple jobs can be selected at the same time.
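
For anyone who wants to script the condor_q side of the checks above, here is a minimal sketch in Python. It is not part of Cumin. The helper names (describe_status, parse_status_output, tally, query_job_status) are hypothetical and exist only for illustration. The sketch assumes condor_q is on the PATH and reads the job's JobStatus attribute, where 7 means Suspended and corresponds to the "S" shown by condor_q.

```python
import subprocess

# HTCondor JobStatus ClassAd codes, with a label and the letter condor_q shows.
JOB_STATUS = {
    1: ("Idle", "I"),
    2: ("Running", "R"),
    3: ("Removed", "X"),
    4: ("Completed", "C"),
    5: ("Held", "H"),
    6: ("Transferring Output", ">"),
    7: ("Suspended", "S"),
}


def describe_status(code):
    """Return (label, letter) for a JobStatus code, or ("Unknown", "?")."""
    return JOB_STATUS.get(int(code), ("Unknown", "?"))


def parse_status_output(text):
    """Turn condor_q output (one JobStatus integer per line) into a list of ints."""
    return [int(line) for line in text.splitlines() if line.strip()]


def tally(codes):
    """Count jobs per status label, for comparison with the submission list columns."""
    counts = {}
    for code in codes:
        label = describe_status(code)[0]
        counts[label] = counts.get(label, 0) + 1
    return counts


def query_job_status(job_id):
    """Ask condor_q for the JobStatus of one job id such as "12.0".

    Hypothetical helper: needs a working condor installation to run.
    """
    out = subprocess.check_output(
        ["condor_q", "-format", "%d\n", "JobStatus", job_id]
    )
    return parse_status_output(out.decode())
```

After step 3, describe_status(query_job_status("12.0")[0]) would be expected to report Suspended once the schedd has processed the request, where "12.0" stands in for the real job id. The tally helper can be fed the codes for every job in a submission to cross-check the Idle/Running/Completed/Held/Suspended columns.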

Things complicating this test at the moment:

I managed to perform this test on a VM with just a few jobs.  However, this may be affected by issues related to Bug 799838.  In one instance, I saw a job disappear from QMF and Aviary after I suspended it (though condor_q showed it as suspended).  It did not return until I "Continue"'d the job and it completed.

Obviously, if jobs disappear after they are suspended it is difficult to test the continue function.  We may have to re-evaluate after Bug 799838 is fixed.

Comment 10 Martin Kudlej 2013-01-15 10:39:54 UTC
Tested on RHEL 5.9/6.4 x i386/x86_64 with
condor-7.8.8-0.3
condor-classads-7.8.8-0.3
condor-qmf-7.8.8-0.3
condor-wallaby-base-db-1.25-1
condor-wallaby-client-5.0.5-1
condor-wallaby-tools-5.0.5-1
cumin-0.1.5648-1
python-condorutils-1.5-6
python-qpid-0.18-4
python-qpid-qmf-0.18-13
python-wallaby-0.16.3-1
python-wallabyclient-5.0.5-1
qpid-cpp-client-0.18-13
qpid-cpp-server-0.18-13
qpid-qmf-0.18-13
qpid-tools-0.18-7
ruby-condor-wallaby-5.0.5-1
ruby-qpid-qmf-0.18-13
ruby-wallaby-0.16.3-1
wallaby-0.16.3-1
wallaby-utils-0.16.3-1

and it works. -->VERIFIED

Comment 12 errata-xmlrpc 2013-03-06 18:39:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0564.html

