Bug 639383 - Cumin: "Longest Running Grid Submissions" based on QMF object creation time
Summary: Cumin: "Longest Running Grid Submissions" based on QMF object creation time
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: cumin
Version: 1.3
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: 2.1
Target Release: ---
Assignee: Trevor McKay
QA Contact: Stanislav Graf
URL:
Whiteboard:
Depends On: 736709
Blocks: 743350
Reported: 2010-10-01 15:47 UTC by Matthew Farrellee
Modified: 2012-01-23 17:25 UTC (History)

Fixed In Version: cumin-0.1.5033-1
Doc Type: Bug Fix
Doc Text:
Previously, Cumin based the age of a submission around the creation times of the QMF object that represented the submission in the MRG Messaging space. However, displays such as the Longest Running Grid Submissions table in the default persona were affected by events in the MRG Messaging space, and could therefore be inaccurate. This update ensures that the data generated by condor and integrated into Cumin present the earliest queue date of any job included in a submission, with the result that the Longest Running Grid Submissions display should now be accurate. In addition, a new column which shows the queue date has been added to the table.
Clone Of:
Environment:
Last Closed: 2012-01-23 17:25:41 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 736709 | 0 | unspecified | CLOSED | Add a "runtime" or "queue date" statistic to the schema for submissions | 2021-02-22 00:41:40 UTC
Red Hat Product Errata | RHEA-2012:0045 | 0 | normal | SHIPPED_LIVE | Red Hat Enterprise MRG Grid 2.1 bug fix and enhancement update | 2012-01-23 22:22:58 UTC

Internal Links: 736709

Description Matthew Farrellee 2010-10-01 15:47:16 UTC
Description of problem:

The Longest Running Grid Submissions list reports Duration based on when the QMF Submission object was last created, not on when the Submission itself was created or on some other useful metric such as the longest-running job within the submission.

Fixing this may require extra information from the QMF plugins.

Comment 2 Trevor McKay 2011-08-25 19:30:47 UTC
Maybe this part of the default persona overview page should just be removed.  I believe it is the only place in cumin where we talk about "longest running".  It's not part of the grid persona, and we don't have the correct data anyway.

Comment 3 Matthew Farrellee 2011-08-25 19:36:35 UTC
If you do that, file a BZ to bring it back when we do have appropriate data. The use case of identifying long running submissions is a powerful one.

Comment 4 Trevor McKay 2011-08-25 20:35:55 UTC
Changing priority to low based on persona views (only visible in default, not grid or messaging)

Comment 5 Trevor McKay 2011-09-08 13:22:55 UTC
More feedback from UI review.  Assuming better data becomes available from condor, we should consider putting an "accumulated runtime" or "queue date" column (depending on what info is available) on the submissions table and making it sortable.

Adding a BZ for Grid to add data to the schema for submissions.

Comment 6 Martin Kudlej 2011-09-29 11:07:10 UTC
How can we test this bug? Is there a reproduction scenario?

Comment 7 Trevor McKay 2011-09-29 13:00:08 UTC
Martin,

  Here at least is a reproducer to show the problem.  More about verifying the solution when it is finished.

1. Set "persona: default" in the [web] section in cumin.config

2. Start cumin (and condor if it is not already running)

3. Wait for submissions to show up under Longest Running Submission on the front page.  The duration here will be the elapsed time since the QMF submission object was created.

4. Shut condor down and wait for all the submissions to disappear from Cumin's Longest Running Submission display.

5. Start condor again.  The durations under longest running Submissions should reset and begin near zero.

6. You can also look in the cumin database to verify that the durations reported match the _qmf_create_time column.

$ psql -d cumin -U cumin -h localhost
cumin=# select * from "com.redhat.grid"."Submission";

(maybe limit the query to a particular submission, etc.)
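
The reset described in step 5 can be illustrated with a small sketch. This is not Cumin's code: the helper name and all timestamps are made up for illustration. It only shows that a duration measured from the QMF object's creation time starts over whenever the object is recreated, while a duration measured from a fixed queue time keeps counting.

```python
from datetime import datetime, timedelta

def duration_since(start, now):
    """Elapsed time between a start timestamp and now (hypothetical helper)."""
    return now - start

# Made-up timestamps, for illustration only.
queued_at = datetime(2011, 9, 29, 8, 0, 0)            # when the jobs were actually queued
qmf_created_first = datetime(2011, 9, 29, 8, 0, 5)    # QMF object first published
qmf_created_again = datetime(2011, 9, 29, 12, 0, 0)   # object recreated after a condor restart
now = datetime(2011, 9, 29, 12, 1, 0)

# Had condor never been restarted, both bases would nearly agree.
before = duration_since(qmf_created_first, now)
# After the restart, the QMF-based duration starts over near zero...
after = duration_since(qmf_created_again, now)
# ...while a duration based on the queue time is unaffected.
by_queue_time = duration_since(queued_at, now)

print(before, after, by_queue_time)
```

Comparing `after` with `by_queue_time` shows the gap that the reproducer makes visible on the front page.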

Comment 8 Trevor McKay 2011-09-30 14:50:49 UTC
Fixed in revision 5031.

Longest Running Submissions on the default persona front page now uses QDate.
Schema has been updated to include QDate (includes upgrade script).
"Enqueued" column replaces "Scheduler" column on Admin->Grid->Submissions tab.

Scheduler link added to submission summary (Admin->Grid->Submissions->pick one) to make up for the lost column.  This was done because adding "enqueued" to the submission table made the table too wide with the scheduler column still there.

Comment 10 Trevor McKay 2011-10-05 12:51:21 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause
    Cumin's idea of the age of a submission was based on the creation time of the QMF object representing the submission in the MRG Messaging space because there was not adequate data to determine submission age.

Consequence
    Displays such as the Longest Running Grid Submissions table in the default persona were affected by events in the MRG Messaging space and were not accurate.

Fix
    New data generated by condor and integrated in Cumin gives the earliest queue date of any job included in a submission.

Result
    Cumin uses queue date values as the age of a submission.  The Longest Running Grid Submission display should be accurate.  Additionally, a new column showing queue date has been added to submission table displays.
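
The rule described under Fix and Result (a submission's age is taken from the earliest queue date of any of its jobs) can be sketched as follows. This is a minimal illustration, not the condor or Cumin implementation: the function names are hypothetical, and queue dates are assumed to be epoch seconds.

```python
import time

def submission_qdate(job_qdates):
    """Return the earliest queue date (epoch seconds) among a submission's jobs.

    Hypothetical helper: job_qdates is assumed to be an iterable of
    per-job queue dates expressed as epoch seconds.
    """
    qdates = list(job_qdates)
    if not qdates:
        raise ValueError("submission has no jobs")
    return min(qdates)

def submission_age(job_qdates, now=None):
    """Age of a submission in seconds, measured from its earliest queue date."""
    if now is None:
        now = time.time()
    return now - submission_qdate(job_qdates)
```

For example, `submission_age([300, 100, 200], now=1000)` measures from the job queued at 100, regardless of when any monitoring object was created.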

Comment 11 Stanislav Graf 2011-10-18 12:46:17 UTC
Reproduction on RHEL5/6 i386/x86_64:

0. Install cumin and condor:
# rpm -q condor
condor-7.6.3-0.3.el5
condor-7.6.3-0.3.el6
# rpm -q cumin
cumin-0.1.4916-1.el5
cumin-0.1.4916-1.el6
1. Set "persona: default" in the [web] section in cumin.config
2a. Start cumin (and condor if it is not already running)
2b. Submit jobs: 
# su -c 'echo -e "cmd=/bin/true\nhold=true\nqueue 10" | condor_submit' test
3. Wait for submissions to show up under Longest Running Submission on the
front page.  The duration here will be the elapsed time since the QMF
submission object was created.
4. Shut condor down and wait for all the submissions to disappear from Cumin's
Longest Running Submission display.
5. Start condor again.  The durations under longest running Submissions should
reset and begin near zero.
6. You can also look in the cumin database to verify that the durations
reported match the _qmf_create_time column.
$ psql -d cumin -U cumin -h localhost
cumin=# select * from "com.redhat.grid"."Submission";

Verification on RHEL5/6 i386/x86_64:

0. Update cumin and condor:
# rpm -q condor
condor-7.6.4-0.8.el5
condor-7.6.4-0.8.el6
# rpm -q cumin
cumin-0.1.5068-1.el5
cumin-0.1.5068-1.el6
# cumin-admin upgrade-schema
1. skip
2a. Restart cumin
2b. skip
3. Wait for submissions to show up under Longest Running Submission on the
front page. 
4. Shut condor down and wait for all the submissions to disappear from Cumin's
Longest Running Submission display.
5. Start condor again.  The durations under longest running Submissions should NOT
reset and should NOT begin near zero.
6. You can also look in the cumin database to verify that the durations
reported now match the new QDate column.
$ psql -d cumin -U cumin -h localhost
cumin=# select * from "com.redhat.grid"."Submission";
7. Longest Running Submissions on the default persona front page now uses QDate.
Schema has been updated to include QDate (includes upgrade script).
"Enqueued" column replaces "Scheduler" column on Admin->Grid->Submissions tab.
8. Scheduler link added to submission summary (Admin->Grid->Submissions->pick one)
to make up for the lost column.  This was done because adding "enqueued" to the
submission table made the table too wide with the scheduler column still there.
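
The comparison in step 6 can be sketched as a simple tolerance check. This is only an illustration under stated assumptions: the function name and the 60-second tolerance are hypothetical, and the QDate value is assumed to have been converted to epoch seconds before the comparison (how the cumin database stores it is not shown here).

```python
def duration_matches_qdate(displayed_seconds, qdate_epoch, now_epoch, tolerance=60):
    """Return True if a displayed duration agrees with (now - QDate).

    All arguments are in seconds. The tolerance (an assumed value) allows
    for page refresh delay between reading the display and the database.
    """
    expected = now_epoch - qdate_epoch
    return abs(displayed_seconds - expected) <= tolerance
```

A duration that restarted near zero after a condor restart would fail this check for any submission queued more than a minute earlier.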

---> VERIFIED

Comment 12 Douglas Silas 2011-11-16 16:10:54 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,11 +1 @@
-Cause
+Previously, Cumin based the age of a submission around the creation times of the QMF object that represented the submission in the MRG Messaging space. However, displays such as the Longest Running Grid Submissions table in the default persona were affected by events in the MRG Messaging space, and could therefore be inaccurate. This update ensures that the data generated by condor and integrated into Cumin present the earliest queue date of any job included in a submission, with the result that the Longest Running Grid Submissions display should now be accurate. In addition, a new column which shows the queue date has been added to the table.
-    Cumin's idea of the age of a submission was based on the creation time of the QMF object representing the submission in the MRG Messaging space because there was not adequate data to determine submission age.
-
-Consequence
-    Displays such as the Longest Running Grid Submissions table in the default persona were affected by events in the MRG Messaging space and were not accurate.
-
-Fix
-    New data generated by condor and integrated in Cumin gives the earliest queue date of any job included in a submission.
-
-Result
-    Cumin uses queue date values as the age of a submission.  The Longest Running Grid Submission display should be accurate.  Additionally, a new column showing queue date has been added to submission table displays.

Comment 13 errata-xmlrpc 2012-01-23 17:25:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2012-0045.html

