Bug 782839 - Cumin should report changes in condor pool as INFO (instead of ERROR)
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: cumin
Version: 2.1
Hardware: Unspecified OS: Unspecified
Priority: low Severity: unspecified
Target Milestone: 3.0
Target Release: ---
Assigned To: Trevor McKay
QA Contact: Peter Belanyi
Keywords: Reopened
Reported: 2012-01-18 11:33 EST by Stanislav Graf
Modified: 2014-11-17 21:24 EST
Fixed In Version: cumin-0.1.5251-1
Doc Type: Bug Fix
Last Closed: 2013-05-06 09:52:23 EDT
Description Stanislav Graf 2012-01-18 11:33:17 EST
Description of problem:
I was submitting VM jobs from cumin while another person was working on the pool at the same time.

During our session, according to the log file, we hit several "Exception: Object '...' is unknown" errors:
---
3972 2012-01-18 14:18:10,557 INFO Request GET /index.update?session=index.html%3Fframe%3Dmain.grid%3Bmain.m%3Dgrid%3Bmain.grid.id%3D8%3Bmain.grid.view.body.m%3Dpool_submissions;widget=main.tasks;widget=main.grid.view.body.pool_submissions.table
3972 2012-01-18 14:18:10,581 INFO Response 200 OK
3972 2012-01-18 14:18:14,863 ERROR Object 'rhel5i-xen-4' is unknown
Traceback (most recent call last):
  File "/usr/share/cumin/python/cumin/model.py", line 729, in run
    self.store.update(cursor)
  File "/usr/share/cumin/python/cumin/model.py", line 786, in update
    self.machine_name)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 230, in get_job_summaries
    return self._call(submission, "GetJobSummaries", callback, 0, 0)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 236, in _call
    self.session.call_method(cb, obj, meth, args)
  File "/usr/share/cumin/python/cumin/session.py", line 109, in call_method
    raise Exception("Object '%s' is unknown" % object_id)
Exception: Object 'rhel5i-xen-4' is unknown
3972 2012-01-18 14:18:20,407 INFO Request GET /index.update?session=index.html%3Fframe%3Dmain.grid%3Bmain.m%3Dgrid%3Bmain.grid.id%3D8%3Bmain.grid.view.body.m%3Dpool_submissions;widget=main.tasks;widget=main.grid.view.body.pool_submissions.table
3972 2012-01-18 14:18:20,425 INFO Response 200 OK
---

And one "Exception: Agent disconnected"
---
3972 2012-01-18 14:07:57,155 INFO Request GET /index.update?session=index.html%3Fframe%3Dmain.grid.submission%3Bmain.m%3Dgrid%3Bmain.grid.m%3Dsubmission%3Bmain.grid.id%3D4%3Bmain.grid.view.body.m%3Dpool_submissions%3Bmain.grid.submission.m%3Dview%3Bmain.grid.submission.id%3D11;widget=main.tasks;widget=main.grid.submission.view.body.jobs.table
3972 2012-01-18 14:07:57,337 INFO Response 200 OK
3972 2012-01-18 14:08:30,155 ERROR Agent disconnected
Traceback (most recent call last):
  File "/usr/share/cumin/python/cumin/model.py", line 729, in run
    self.store.update(cursor)
  File "/usr/share/cumin/python/cumin/model.py", line 786, in update
    self.machine_name)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 230, in get_job_summaries
    return self._call(submission, "GetJobSummaries", callback, 0, 0)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 236, in _call
    self.session.call_method(cb, obj, meth, args)
  File "/usr/share/cumin/python/cumin/session.py", line 104, in call_method
    qmf_objs = agent.getObjects(_objectId=oid)
  File "/usr/lib/python2.4/site-packages/qmf/console.py", line 3277, in getObjects
    raise Exception(context.exception)
Exception: Agent disconnected
3972 2012-01-18 14:08:40,849 INFO Expired 0 client sessions
3972 2012-01-18 14:08:53,705 INFO Request GET /index.update?session=index.html%3Fframe%3Dmain.grid.submission%3Bmain.m%3Dgrid%3Bmain.grid.m%3Dsubmission%3Bmain.grid.id%3D4%3Bmain.grid.view.body.m%3Dpool_submissions%3Bmain.grid.submission.m%3Dview%3Bmain.grid.submission.id%3D11;widget=main.tasks;widget=main.grid.submission.view.body.jobs.table
---

Version-Release number of selected component (if applicable):
cumin-0.1.5184-1.el5.noarch

How reproducible:
10% 

Steps to Reproduce:
1. Work with cumin while the condor pool is being changed at the same time
  
Actual results:
Cumin logs several exceptions with tracebacks at ERROR level.

Expected results:
Cumin handles changes in the pool without logging errors.

Additional info:
Maybe related to Bug 760567.
The cumin web interface wasn't visibly affected.
Comment 2 Trevor McKay 2012-02-13 16:40:42 EST
Fixed in revision 5218.

All log.exception(e) calls have been changed to the form log.debug(msg, exc_info=True).

This will remove all exception traces from log files unless the logging level is set to debug.  None of these traces was useful to users in general, and in all cases the exception was already being handled (there were no cases of log.exception followed by a raise statement).
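The effect of that change can be illustrated with a minimal, self-contained sketch. It is not the cumin code: fetch_job_summaries, update, captured_output, the logger names, and the message text are all hypothetical, and it targets Python 3 rather than the Python 2.4 seen in the tracebacks above. It only shows that log.debug(msg, exc_info=True) keeps the traceback out of the log unless the level is DEBUG.

```python
import io
import logging


def fetch_job_summaries(object_id):
    # Hypothetical stand-in for the QMF call that fails when the pool changes.
    raise Exception("Object '%s' is unknown" % object_id)


def update(log, object_id):
    try:
        fetch_job_summaries(object_id)
    except Exception as e:
        # Old form: log.exception(e) -> ERROR record plus traceback, always.
        # New form: the record and its traceback appear only at DEBUG level.
        log.debug("Update failed: %s", e, exc_info=True)


def captured_output(level):
    """Run update() with a logger set to `level` and return what was logged."""
    stream = io.StringIO()
    log = logging.getLogger("cumin.sketch.%s" % logging.getLevelName(level))
    log.setLevel(level)
    log.propagate = False
    log.handlers[:] = []  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    log.addHandler(handler)
    update(log, "rhel5i-xen-4")
    return stream.getvalue()
```

Under these assumptions, captured_output(logging.INFO) returns an empty string, while captured_output(logging.DEBUG) returns the message followed by the full traceback.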
Comment 3 Trevor McKay 2012-02-13 16:43:54 EST
imho, it is very difficult to do any testing around this beyond a careful code review of a diff against the previous revision.  Injecting exceptions in all of the affected places to cover every execution path would require white box testing, that is, editing the code to raise an exception in each affected block.
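For a single block, this kind of injection can be approximated with mocking instead of source edits. The sketch below assumes a modern Python 3 with unittest.mock and assertLogs, neither of which exists in the Python 2.4 environment shown in the tracebacks. Store, its methods, and the logger name are hypothetical stand-ins, not cumin classes, and it does not address the real difficulty of covering every affected code path.

```python
import logging
import unittest
from unittest import mock

log = logging.getLogger("cumin.sketch.store")


class Store:
    """Hypothetical stand-in for one affected block."""

    def get_job_summaries(self):
        return []

    def update(self):
        try:
            return self.get_job_summaries()
        except Exception as e:
            # Same logging form as described in comment 2.
            log.debug("Update failed: %s", e, exc_info=True)
            return None


class InjectedFailureTest(unittest.TestCase):
    def test_failure_is_logged_at_debug(self):
        store = Store()
        boom = Exception("Agent disconnected")
        # Inject the failure without touching the source of Store.
        with mock.patch.object(store, "get_job_summaries", side_effect=boom):
            with self.assertLogs(log, level="DEBUG") as captured:
                result = store.update()
        self.assertIsNone(result)
        self.assertEqual([r.levelname for r in captured.records], ["DEBUG"])
        self.assertIsNotNone(captured.records[0].exc_info)


def run_sketch():
    """Run the test case above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(InjectedFailureTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_sketch() executes the test; one such test would still be needed per affected block, which is the cost the comment above points out.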
Comment 12 Peter Belanyi 2013-04-04 10:59:18 EDT
As suggested in comment 3, I did a code review of the diff between revisions 5217 and 5218 of cumin. All log.exception calls have been changed to log.debug calls or something more suitable.

--> VERIFIED
