Bug 782839 - Cumin should report changes in condor pool as INFO (instead ERROR)
Summary: Cumin should report changes in condor pool as INFO (instead ERROR)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: cumin
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: unspecified
Target Milestone: 3.0
: ---
Assignee: Trevor McKay
QA Contact: Peter Belanyi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-01-18 16:33 UTC by Stanislav Graf
Modified: 2014-11-18 02:24 UTC (History)

Fixed In Version: cumin-0.1.5251-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-05-06 13:52:23 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 760567 0 low CLOSED Change of DynamicQuota causes KeyError on empty data 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 765846 0 high CLOSED Submit VM job - doesn't work 2021-02-22 00:41:40 UTC

Internal Links: 760567 765846

Description Stanislav Graf 2012-01-18 16:33:17 UTC
Description of problem:
I was submitting VM jobs from cumin while another person was working on the pool at the same time.

During our session, according to the log file, we hit several "Exception: Object '...' is unknown" errors:
---
3972 2012-01-18 14:18:10,557 INFO Request GET /index.update?session=index.html%3Fframe%3Dmain.grid%3Bmain.m%3Dgrid%3Bmain.grid.id%3D8%3Bmain.grid.view.body.m%3Dpool_submissions;widget=main.tasks;widget=main.grid.view.body.pool_submissions.table
3972 2012-01-18 14:18:10,581 INFO Response 200 OK
3972 2012-01-18 14:18:14,863 ERROR Object 'rhel5i-xen-4' is unknown
Traceback (most recent call last):
  File "/usr/share/cumin/python/cumin/model.py", line 729, in run
    self.store.update(cursor)
  File "/usr/share/cumin/python/cumin/model.py", line 786, in update
    self.machine_name)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 230, in get_job_summaries
    return self._call(submission, "GetJobSummaries", callback, 0, 0)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 236, in _call
    self.session.call_method(cb, obj, meth, args)
  File "/usr/share/cumin/python/cumin/session.py", line 109, in call_method
    raise Exception("Object '%s' is unknown" % object_id)
Exception: Object 'rhel5i-xen-4' is unknown
3972 2012-01-18 14:18:20,407 INFO Request GET /index.update?session=index.html%3Fframe%3Dmain.grid%3Bmain.m%3Dgrid%3Bmain.grid.id%3D8%3Bmain.grid.view.body.m%3Dpool_submissions;widget=main.tasks;widget=main.grid.view.body.pool_submissions.table
3972 2012-01-18 14:18:20,425 INFO Response 200 OK
---

We also hit one "Exception: Agent disconnected":
---
3972 2012-01-18 14:07:57,155 INFO Request GET /index.update?session=index.html%3Fframe%3Dmain.grid.submission%3Bmain.m%3Dgrid%3Bmain.grid.m%3Dsubmission%3Bmain.grid.id%3D4%3Bmain.grid.view.body.m%3Dpool_submissions%3Bmain.grid.submission.m%3Dview%3Bmain.grid.submission.id%3D11;widget=main.tasks;widget=main.grid.submission.view.body.jobs.table
3972 2012-01-18 14:07:57,337 INFO Response 200 OK
3972 2012-01-18 14:08:30,155 ERROR Agent disconnected
Traceback (most recent call last):
  File "/usr/share/cumin/python/cumin/model.py", line 729, in run
    self.store.update(cursor)
  File "/usr/share/cumin/python/cumin/model.py", line 786, in update
    self.machine_name)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 230, in get_job_summaries
    return self._call(submission, "GetJobSummaries", callback, 0, 0)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 236, in _call
    self.session.call_method(cb, obj, meth, args)
  File "/usr/share/cumin/python/cumin/session.py", line 104, in call_method
    qmf_objs = agent.getObjects(_objectId=oid)
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 3277, in getObjects
    raise Exception(context.exception)
Exception: Agent disconnected
3972 2012-01-18 14:08:40,849 INFO Expired 0 client sessions
3972 2012-01-18 14:08:53,705 INFO Request GET /index.update?session=index.html%3Fframe%3Dmain.grid.submission%3Bmain.m%3Dgrid%3Bmain.grid.m%3Dsubmission%3Bmain.grid.id%3D4%3Bmain.grid.view.body.m%3Dpool_submissions%3Bmain.grid.submission.m%3Dview%3Bmain.grid.submission.id%3D11;widget=main.tasks;widget=main.grid.submission.view.body.jobs.table
---

Version-Release number of selected component (if applicable):
cumin-0.1.5184-1.el5.noarch

How reproducible:
10% 

Steps to Reproduce:
1. Work with cumin while someone else simultaneously changes the condor pool
  
Actual results:
Cumin logs several exceptions with tracebacks at ERROR level.

Expected results:
Cumin handles problems on the pool without errors.

Additional info:
Maybe related to Bug 760567
The Cumin web interface wasn't visibly affected.

Comment 2 Trevor McKay 2012-02-13 21:40:42 UTC
Fixed in revision 5218.

All log.exception(e) calls have been changed to the form log.debug(msg, exc_info=True).

This will remove all exception traces from log files unless the logging level is set to debug.  None of these traces was useful to users in general, and in all cases the exception was already being handled (there were no cases of log.exception followed by a raise statement).
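
For illustration only, here is a minimal sketch of the logging change described above, written against the standard Python logging module rather than the actual cumin code. The logger names, the make_logger helper, the update function, and the "Update failed" message are all hypothetical; only the log.debug(msg, exc_info=True) form comes from this comment.

```python
import io
import logging


def make_logger(level):
    """Return (logger, stream): a logger writing to an in-memory buffer."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    # Hypothetical logger name; one logger per level keeps the two runs apart.
    logger = logging.getLogger("cumin.example.%s" % logging.getLevelName(level))
    logger.handlers = [handler]
    logger.propagate = False
    logger.setLevel(level)
    return logger, stream


def update(logger):
    """Hypothetical stand-in for an update that hits a vanished pool object."""
    try:
        raise Exception("Object 'rhel5i-xen-4' is unknown")
    except Exception as e:
        # Old form: logger.exception(e) always emits ERROR plus a traceback.
        # New form: a DEBUG record; the traceback appears only at debug level.
        logger.debug("Update failed: %s", e, exc_info=True)


info_logger, info_out = make_logger(logging.INFO)
debug_logger, debug_out = make_logger(logging.DEBUG)
update(info_logger)
update(debug_logger)
```

With the level at INFO the buffer stays empty, while at DEBUG it holds the message followed by the full traceback, which matches the behaviour this comment describes.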

Comment 3 Trevor McKay 2012-02-13 21:43:54 UTC
imho, it is very difficult to do any testing around this beyond a careful code review of a diff against the previous revision.  There is no way to inject exceptions in all of the various different places to test all execution paths without editing the code as a white box test to raise an exception in each affected block.
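
As a rough sketch of what exercising a single path could look like, the example below injects an exception by patching a method with unittest.mock instead of editing the source. It assumes a modern Python with unittest.mock (not the Python 2.4 seen in the tracebacks), and the Store class, its fetch and update methods, and the logger name are hypothetical stand-ins, not cumin code. It covers one block only, so it does not remove the difficulty of reaching every affected path.

```python
import io
import logging
from unittest import mock

# Hypothetical logger name, not one used by cumin.
log = logging.getLogger("cumin.example.whitebox")


class Store(object):
    """Hypothetical stand-in for a store that queries the pool."""

    def fetch(self):
        return ["job-1"]

    def update(self):
        try:
            return self.fetch()
        except Exception as e:
            # The form introduced by the fix: handled, logged at DEBUG.
            log.debug("Update failed: %s", e, exc_info=True)
            return None


def run_with_injected_failure(level):
    """Make fetch() raise, then return (result, captured log text)."""
    stream = io.StringIO()
    log.handlers = [logging.StreamHandler(stream)]
    log.propagate = False
    log.setLevel(level)
    # Inject the exception without touching the Store source.
    with mock.patch.object(Store, "fetch",
                           side_effect=Exception("Agent disconnected")):
        result = Store().update()
    return result, stream.getvalue()


info_result, info_text = run_with_injected_failure(logging.INFO)
debug_result, debug_text = run_with_injected_failure(logging.DEBUG)
```

In both runs the exception is swallowed and update() returns None; only the DEBUG run captures any log text.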

Comment 12 Peter Belanyi 2013-04-04 14:59:18 UTC
As suggested in comment 3, I did a code review of the diff between revisions 5217 and 5218 of cumin. All log.exception calls have been changed to log.debug calls or something more suitable.

--> VERIFIED

