Bug 760567 - Change of DynamicQuota causes KeyError on empty data
Summary: Change of DynamicQuota causes KeyError on empty data
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: cumin
Version: 2.1
Hardware: Unspecified
OS: Unspecified
low
unspecified
Target Milestone: 2.3
: ---
Assignee: Chad Roberts
QA Contact: Peter Belanyi
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-12-06 14:24 UTC by Stanislav Graf
Modified: 2014-11-18 02:24 UTC (History)
8 users (show)

Fixed In Version: cumin-0.1.5388-2
Doc Type: Bug Fix
Doc Text:
Cause: Occasionally the group information could be None when the call to the Negotiator was made, the cumin code in place was not able to handle that. Consequence: Stack trace on screen Fix: The fix was a natural fallout of switching to the non-flash-based charting. The new mechanism has a try/except block that would log an error if this condition ever came up again, but should not show a stack trace on screen. Result: Quota editing should be free of stack traces on screen again.
Clone Of:
Environment:
Last Closed: 2013-03-06 18:39:59 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Patch to introduce an intentional whitebox test error (695 bytes, patch)
2013-02-07 18:13 UTC, Trevor McKay
no flags Details | Diff
Quota config file to place in /etc/condor/config.d to set up test (307 bytes, application/octet-stream)
2013-02-07 18:15 UTC, Trevor McKay
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 782359 0 medium CLOSED Condor HFS quota example returns "Unknown config:" from QMF 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 782839 0 low CLOSED Cumin should report changes in condor pool as INFO (instead ERROR) 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 908266 0 unspecified CLOSED Quota page expiry causes KeyError when trying to edit dynamic quotas 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHSA-2013:0564 0 normal SHIPPED_LIVE Low: Red Hat Enterprise MRG Grid 2.3 security update 2013-03-06 23:37:09 UTC

Internal Links: 782359 782839 908266

Description Stanislav Graf 2011-12-06 14:24:57 UTC
Description of problem:
I was playing with changing dynamic quota... I probably try to set 100% and nothing changed, so I click again to change and this error occurs.

Condor setup:
condor_config_val -dump > condor_config.txt (see attachment)

Cumin web log:
cp /var/log/cumin/web.log cumin_web.txt (see attachment)

Error mesage from browser:
http://s6x:45672/form.html?modes.m=NegotiatorEditDynamicQuota_66392912;modes.NegotiatorEditDynamicQuota_66392912.return=index.html%3Fframe%3Dmain.grid%3Bmain.m%3Dgrid%3Bmain.grid.id%3D1%3Bmain.grid.view.body.m%3Dquotas%3Bmain.grid.view.body.quotas.expand%3Dgroup_chemistry;modes.NegotiatorEditDynamicQuota_66392912.id=1;modes.NegotiatorEditDynamicQuota_66392912.neg=1;modes.NegotiatorEditDynamicQuota_66392912.group_leader=group_chemistry.lab1

Traceback (most recent call last):
  File "/usr/share/cumin/python/wooly/wsgiserver/__init__.py", line 1250, in communicate
    req.respond()
  File "/usr/share/cumin/python/wooly/wsgiserver/__init__.py", line 663, in respond
    self._respond()
  File "/usr/share/cumin/python/wooly/wsgiserver/__init__.py", line 669, in _respond
    response = self.wsgi_app(env, self.start_response)
  File "/usr/share/cumin/python/wooly/server.py", line 83, in service_request
    status, headers, content = self.service_page_request(page, env)
  File "/usr/share/cumin/python/wooly/server.py", line 130, in service_page_request
    content = page.service_error(session)
  File "/usr/share/cumin/python/wooly/__init__.py", line 484, in service_error
    return self.render(session)
  File "/usr/share/cumin/python/wooly/__init__.py", line 314, in render
    string = self.do_render(session, *args)
  File "/usr/share/cumin/python/wooly/__init__.py", line 328, in do_render
    self.__main_tmpl.render(writer, session, *args)
  File "/usr/share/cumin/python/wooly/template.py", line 90, in render
    result = frag(self.__object, session, *args)
  File "/usr/share/cumin/python/cumin/widgets.py", line 1290, in render_content
    writer.write(self.modes.render(session))
  File "/usr/share/cumin/python/wooly/__init__.py", line 314, in render
    string = self.do_render(session, *args)
  File "/usr/share/cumin/python/wooly/__init__.py", line 328, in do_render
    self.__main_tmpl.render(writer, session, *args)
  File "/usr/share/cumin/python/wooly/template.py", line 90, in render
    result = frag(self.__object, session, *args)
  File "/usr/share/cumin/python/wooly/widgets.py", line 34, in render_content
    return mode.render(session)
  File "/usr/share/cumin/python/wooly/__init__.py", line 314, in render
    string = self.do_render(session, *args)
  File "/usr/share/cumin/python/wooly/__init__.py", line 328, in do_render
    self.__main_tmpl.render(writer, session, *args)
  File "/usr/share/cumin/python/wooly/template.py", line 95, in render
    result = frag.render(session)
  File "/usr/share/cumin/python/wooly/__init__.py", line 314, in render
    string = self.do_render(session, *args)
  File "/usr/share/cumin/python/wooly/__init__.py", line 328, in do_render
    self.__main_tmpl.render(writer, session, *args)
  File "/usr/share/cumin/python/wooly/template.py", line 90, in render
    result = frag(self.__object, session, *args)
  File "/usr/share/cumin/python/cumin/stat.py", line 217, in render_img_href
    return super(StatFlashChart, self).render_href(session)
  File "/usr/share/cumin/python/cumin/stat.py", line 176, in render_href
    params = self.get_href_params(session)
  File "/usr/share/cumin/python/cumin/grid/negotiator.py", line 390, in get_href_params
    names = ["name=%s" % x for x in groups if info[x]["GROUP_QUOTA_DYNAMIC"] != ""]
KeyError: 'GROUP_QUOTA_DYNAMIC'


Version-Release number of selected component (if applicable):
MRG 2.1
cumin-0.1.5098-2.el6.noarch
condor-7.6.5-0.8.el6.x86_64


How reproducible:
I'm not sure.

Steps to Reproduce:
I'm not sure.
  
Actual results:
Cumin generates error.

Expected results:
Cumin runs without error.

Additional info:

Comment 1 Stanislav Graf 2011-12-06 14:27:02 UTC
Created attachment 541412 [details]
condor config

Comment 2 Stanislav Graf 2011-12-06 14:27:55 UTC
Created attachment 541413 [details]
cumin web.log errors

Comment 3 Chad Roberts 2011-12-06 14:53:46 UTC
I'm seeing dozens of the following error in the web.log file.  I'm not sure if this is indicating a config problem, or perhaps something else is going wrong.

24570 2011-12-06 14:07:44,295 ERROR CallSync: call returned with status of Unknown config: GROUP_QUOTA_DYNAMIC_group_chemistry
24570 2011-12-06 14:07:44,468 ERROR CallSync: call returned with status of Unknown config: GROUP_QUOTA_DYNAMIC_group_physics
24570 2011-12-06 15:05:02,901 ERROR CallSync: call returned with status of Unknown config: GROUP_QUOTA_group_physics.lab1
24570 2011-12-06 15:05:03,002 ERROR CallSync: call returned with status of Unknown config: GROUP_QUOTA_group_physics.lab2
24570 2011-12-06 15:05:03,155 ERROR CallSync: call returned with status of Unknown config: GROUP_QUOTA_group_physics.lab3
24570 2011-12-06 15:05:03,311 ERROR CallSync: call returned with status of Unknown config: GROUP_QUOTA_group_chemistry.lab1
24570 2011-12-06 15:05:03,481 ERROR CallSync: call returned with status of Unknown config: GROUP_QUOTA_group_chemistry.lab2

Later on in the log file, I'm seeing a few instances of this exception, which is also surprising:
24570 2011-12-06 14:52:20,808 ERROR Request timed out after 60 seconds
Traceback (most recent call last):
  File "/usr/share/cumin/python/cumin/model.py", line 729, in run
    self.store.update(cursor)
  File "/usr/share/cumin/python/cumin/model.py", line 809, in update
    completion)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 214, in get_raw_config
    return self._call(negotiator, "GetRawConfig", callback, 0, 0, name)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 236, in _call
    self.session.call_method(cb, obj, meth, args)
  File "/usr/share/cumin/python/cumin/session.py", line 104, in call_method
    qmf_objs = agent.getObjects(_objectId=oid)
  File "/usr/lib64/python2.6/site-packages/qmf/console.py", line 3277, in getObjects
    raise Exception(context.exception)

I'm not sure what to make of this yet.

Comment 4 Trevor McKay 2011-12-08 19:45:23 UTC
Thought this might be related to making quota editing dynamic (739658), but I did my best to overdrive the system and couldn't get a stack trace.

I'll try to synthesize some exceptions in some of the places above and see if I can figure out what happened.  We may just need an exception handler in negotiator.py.

Comment 5 Stanislav Graf 2012-01-13 09:55:14 UTC
I was able to hit this again... 

Use configuration example from GUG: Hierarchical Fair Share (HFS)
-----
GROUP_NAMES = group_physics, group_chemistry, group_physics.lab1,
group_physics.lab2, group_physics.lab3, group_physics.lab3.team1,
group_physics.lab3.team2, group_physics.lab3.team3,
group_chemistry.lab1, group_chemistry.lab2

GROUP_QUOTA_DYNAMIC_group_physics = .4
GROUP_QUOTA_DYNAMIC_group_chemistry = .4
GROUP_QUOTA_DYNAMIC_group_chemistry.lab1 = .4
GROUP_QUOTA_DYNAMIC_group_chemistry.lab2 = .6
GROUP_QUOTA_DYNAMIC_group_physics.lab1= .2
GROUP_QUOTA_DYNAMIC_group_physics.lab2= .2
GROUP_QUOTA_DYNAMIC_group_physics.lab3= .6
GROUP_QUOTA_DYNAMIC_group_physics.lab3.team1 = .2
GROUP_QUOTA_DYNAMIC_group_physics.lab3.team2 = .2
GROUP_QUOTA_DYNAMIC_group_physics.lab3.team3 = .4
-----

And then stress cumin whit constant changing quota, while 'Pending' from previous change is still there. This time cumin was OK and error happend while I was changing quota in popup window on the background of original cumin window, when I change quota, window refreshed ok.
-----
12392 2012-01-13 10:39:05,853 INFO Request GET /index.update?session=index.html%3Fframe%3Dmain.gr
id%3Bmain.m%3Dgrid%3Bmain.grid.id%3D1%3Bmain.grid.view.body.m%3Dquotas;widget=main.tasks;widget=m
ain.grid.view.body.quotas.table
12392 2012-01-13 10:39:05,871 INFO Response 200 OK
12392 2012-01-13 10:39:09,697 ERROR Request timed out after 60 seconds
Traceback (most recent call last):
  File "/usr/share/cumin/python/cumin/model.py", line 669, in execute
    remote.get_raw_config(negotiator, prepend+group, sync.get_completion())
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 214, in get_raw_config
    return self._call(negotiator, "GetRawConfig", callback, 0, 0, name)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 236, in _call
    self.session.call_method(cb, obj, meth, args)
  File "/usr/share/cumin/python/cumin/session.py", line 104, in call_method
    qmf_objs = agent.getObjects(_objectId=oid)
  File "/usr/lib64/python2.4/site-packages/qmf/console.py", line 3277, in getObjects
    raise Exception(context.exception)
Exception: Request timed out after 60 seconds
12392 2012-01-13 10:39:09,769 ERROR Request timed out after 60 seconds
Traceback (most recent call last):
  File "/usr/share/cumin/python/cumin/model.py", line 669, in execute
    remote.get_raw_config(negotiator, prepend+group, sync.get_completion())
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 214, in get_raw_config
    return self._call(negotiator, "GetRawConfig", callback, 0, 0, name)
  File "/usr/share/cumin/python/sage/qmf/qmfoperations.py", line 236, in _call
    self.session.call_method(cb, obj, meth, args)
  File "/usr/share/cumin/python/cumin/session.py", line 104, in call_method
    qmf_objs = agent.getObjects(_objectId=oid)
  File "/usr/lib64/python2.4/site-packages/qmf/console.py", line 3277, in getObjects
    raise Exception(context.exception)
Exception: Request timed out after 60 seconds
12392 2012-01-13 10:39:09,788 ERROR CallSync: call returned with status of Unknown config: GROUP_
QUOTA_group_physics
12392 2012-01-13 10:39:10,809 INFO Request GET /index.update?session=index.html%3Fframe%3Dmain.gr
id%3Bmain.m%3Dgrid%3Bmain.grid.id%3D1%3Bmain.grid.view.body.m%3Dquotas;widget=main.tasks;widget=m
ain.grid.view.body.quotas.table
12392 2012-01-13 10:39:10,827 INFO Response 200 OK
-----

Version:
cumin-0.1.5184-1.el5.noarch

Notes:
I saved broken machine for further examining. I can probably reproduce this again. Cumin wasn't patched for delay to see pending. This is original build.

Comment 6 Chad Roberts 2012-01-16 14:31:13 UTC
When you say that you "stress" cumin while reproducing this, does that mean that you were running additioal load against it (via some automated tool)?  Or were you just editing quotas several times in a row rapidly?

Comment 7 Stanislav Graf 2012-01-16 14:42:11 UTC
Cumin was running on the same machine as condor - VM guest with 512MB ram, 2 cores. I was just editing quotas several times in a row rapidly. Between my browser and guest was slow VPN. No other tasks was performed on guest during this test.

Comment 8 Stanislav Graf 2012-01-17 09:36:57 UTC
- CallSync ERROR is not regression (reproduced on older version too, moved into Bug 782359

- python exception is probably regression introduced by Bug 739658, but cannot be easily reproduced

Comment 10 Trevor McKay 2012-01-17 14:51:10 UTC
Splitting this into multiple issues, changing the title.

1) KeyError, the original report on this BZ.  Cumin needs to deal with the case of empty quota data gracefully.

2) The exception from timeout listed above, this is a separate exception

3) A BZ against the condor plugins investigating why we receive "Uknown config" messages.

Assigning this to croberts, but tmckay is working on this as well...

Comment 11 Trevor McKay 2012-01-17 15:01:30 UTC
(In reply to comment #10)
> Splitting this into multiple issues, changing the title.

> 3) A BZ against the condor plugins investigating why we receive "Uknown config"
> messages.
> 

This is actually https://bugzilla.redhat.com/show_bug.cgi?id=782359 as noted by sgraf above.

Comment 12 Trevor McKay 2012-01-17 15:50:08 UTC
Update, looks like we really are dealing with only a single issue.

(In reply to comment #10)
> Splitting this into multiple issues, changing the title.
> 
> 1) KeyError, the original report on this BZ.  Cumin needs to deal with the case
> of empty quota data gracefully.

This is a real issue, still investigating

> 2) The exception from timeout listed above, this is a separate exception
> 

The timeout exception is handled gracefully and does not cause a screen trace.  It is only noted in the logs, and the user is notified of the timeout.

> 3) A BZ against the condor plugins investigating why we receive "Uknown config"
> messages.
>

See the noted BZ.  This is apparently due to a limitation in the data available to cumin at this time and is expected.  The errors in the log file are troublesome, but expected.
 
> Assigning this to croberts, but tmckay is working on this as well...

Comment 13 Trevor McKay 2012-01-17 20:35:32 UTC
Possible that this is another manifestation of empty group data?

Got this trace after repeatedly modifying grid.one quota

Traceback (most recent call last):
  File "/usr/share/cumin/python/wooly/wsgiserver/__init__.py", line 1250, in communicate
    req.respond()
  File "/usr/share/cumin/python/wooly/wsgiserver/__init__.py", line 663, in respond
    self._respond()
  File "/usr/share/cumin/python/wooly/wsgiserver/__init__.py", line 669, in _respond
    response = self.wsgi_app(env, self.start_response)
  File "/usr/share/cumin/python/wooly/server.py", line 83, in service_request
    status, headers, content = self.service_page_request(page, env)
  File "/usr/share/cumin/python/wooly/server.py", line 130, in service_page_request
    content = page.service_error(session)
  File "/usr/share/cumin/python/wooly/__init__.py", line 484, in service_error
    return self.render(session)
  File "/usr/share/cumin/python/wooly/__init__.py", line 314, in render
    string = self.do_render(session, *args)
  File "/usr/share/cumin/python/wooly/__init__.py", line 328, in do_render
    self.__main_tmpl.render(writer, session, *args)
  File "/usr/share/cumin/python/wooly/template.py", line 90, in render
    result = frag(self.__object, session, *args)
  File "/usr/share/cumin/python/cumin/widgets.py", line 1290, in render_content
    writer.write(self.modes.render(session))
  File "/usr/share/cumin/python/wooly/__init__.py", line 314, in render
    string = self.do_render(session, *args)
  File "/usr/share/cumin/python/wooly/__init__.py", line 328, in do_render
    self.__main_tmpl.render(writer, session, *args)
  File "/usr/share/cumin/python/wooly/template.py", line 90, in render
    result = frag(self.__object, session, *args)
  File "/usr/share/cumin/python/wooly/widgets.py", line 34, in render_content
    return mode.render(session)
  File "/usr/share/cumin/python/wooly/__init__.py", line 314, in render
    string = self.do_render(session, *args)
  File "/usr/share/cumin/python/wooly/__init__.py", line 328, in do_render
    self.__main_tmpl.render(writer, session, *args)
  File "/usr/share/cumin/python/wooly/template.py", line 95, in render
    result = frag.render(session)
  File "/usr/share/cumin/python/wooly/__init__.py", line 314, in render
    string = self.do_render(session, *args)
  File "/usr/share/cumin/python/wooly/__init__.py", line 328, in do_render
    self.__main_tmpl.render(writer, session, *args)
  File "/usr/share/cumin/python/wooly/template.py", line 90, in render
    result = frag(self.__object, session, *args)
  File "/usr/share/cumin/python/cumin/stat.py", line 217, in render_img_href
    return super(StatFlashChart, self).render_href(session)
  File "/usr/share/cumin/python/cumin/stat.py", line 176, in render_href
    params = self.get_href_params(session)
  File "/usr/share/cumin/python/cumin/grid/negotiator.py", line 390, in get_href_params
    names = ["name=%s" % x for x in groups if info[x]["GROUP_QUOTA_DYNAMIC"] != ""]
KeyError: 'GROUP_QUOTA_DYNAMIC'

Comment 14 Trevor McKay 2012-01-18 15:34:29 UTC
This seems to be very intermittent.  Tried again, and could not reproduce any abnormal behavior.  Previously, I was able to see the error multiple times over a few hours.  This may have an environmental component.

Recommend patching cumin in general to handle empty group data with a notice to the user and a retry to get the config data.

Comment 18 Trevor McKay 2012-03-01 19:20:14 UTC
Changing to public.

Comment 19 Chad Roberts 2012-05-16 18:24:21 UTC
Fixed in revision 5379 on trunk.

Comment 20 Chad Roberts 2012-05-16 18:26:45 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause:  Occasionally the group information could be None when the call to the Negotiator was made, the cumin code in place was not able to handle that.

Consequence:  Stack trace on screen

Fix:  The fix was a natural fallout of switching to the non-flash-based charting.  The new mechanism has a try/except block that would log an error if this condition ever came up again, but should not show a stack trace on screen.

Result:  Quota editing should be free of stack traces on screen again.

Comment 27 Peter Belanyi 2013-02-04 13:04:11 UTC
I have accidentally found a very similar bug:

Steps to reproduce:
1. Go to Quotas page
2. Let the page expire
3. After it expired, click on a displayed dynamic quota value to edit it
4. Redirected to login page (url=http://<ip>:45672/login.html?origin=form.html%3Fmodes.m%3DNegotiatorEditDynamicQuota_174437580%3Bmodes.NegotiatorEditDynamicQuota_174437580.return%3Dindex.html%253Fframe%253Dmain.grid%253Bmain.grid.id%253D12%253Bmain.grid.view.body.m%253Dquotas%3Bmodes.NegotiatorEditDynamicQuota_174437580.id%3D12%3Bmodes.NegotiatorEditDynamicQuota_174437580.neg%3D29%3Bmodes.NegotiatorEditDynamicQuota_174437580.group_leader%3Dgroup_chemistry)
5. After logging in we get to page showing the traceback (url=http://<ip>:45672/form.html?modes.m=NegotiatorEditDynamicQuota_174437580;modes.NegotiatorEditDynamicQuota_174437580.return=index.html%3Fframe%3Dmain.grid%3Bmain.grid.id%3D12%3Bmain.grid.view.body.m%3Dquotas;modes.NegotiatorEditDynamicQuota_174437580.id=12;modes.NegotiatorEditDynamicQuota_174437580.neg=29;modes.NegotiatorEditDynamicQuota_174437580.group_leader=group_chemistry)

Traceback (most recent call last):
  File "/usr/share/cumin/python/wooly/wsgiserver/__init__.py", line 1250, in communicate
    req.respond()
  File "/usr/share/cumin/python/wooly/wsgiserver/__init__.py", line 663, in respond
    self._respond()
  File "/usr/share/cumin/python/wooly/wsgiserver/__init__.py", line 669, in _respond
    response = self.wsgi_app(env, self.start_response)
  File "/usr/share/cumin/python/wooly/server.py", line 104, in service_request
    status, headers, content = self.service_page_request(page, env)
  File "/usr/share/cumin/python/wooly/server.py", line 170, in service_page_request
    content = page.service_error(session)
  File "/usr/share/cumin/python/wooly/__init__.py", line 593, in service_error
    return self.render(session)
  File "/usr/share/cumin/python/wooly/__init__.py", line 348, in render
    string = self.do_render(session, *args)
  File "/usr/share/cumin/python/wooly/__init__.py", line 362, in do_render
    self.__main_tmpl.render(writer, session, *args)
  File "/usr/share/cumin/python/wooly/template.py", line 90, in render
    result = frag(self.__object, session, *args)
  File "/usr/share/cumin/python/cumin/widgets.py", line 1423, in render_content
    writer.write(self.modes.render(session))
  File "/usr/share/cumin/python/wooly/__init__.py", line 348, in render
    string = self.do_render(session, *args)
  File "/usr/share/cumin/python/wooly/__init__.py", line 362, in do_render
    self.__main_tmpl.render(writer, session, *args)
  File "/usr/share/cumin/python/wooly/template.py", line 90, in render
    result = frag(self.__object, session, *args)
  File "/usr/share/cumin/python/wooly/widgets.py", line 34, in render_content
    return mode.render(session)
  File "/usr/share/cumin/python/wooly/__init__.py", line 348, in render
    string = self.do_render(session, *args)
  File "/usr/share/cumin/python/wooly/__init__.py", line 362, in do_render
    self.__main_tmpl.render(writer, session, *args)
  File "/usr/share/cumin/python/wooly/template.py", line 90, in render
    result = frag(self.__object, session, *args)
  File "/usr/share/cumin/python/cumin/grid/negotiator.py", line 439, in render_subshares_title
    subshares = self.group_helper.get_children(session, self.group_leader.get(session))
  File "/usr/share/cumin/python/cumin/grid/negotiator.py", line 134, in get_children
    info[group]['children'] = list()
KeyError: u'group_chemistry'

(6. When I reloaded that page, I got to the "Edit Dynamic Group Quota" page showing an empty chart)

I could reproduce this 100% on RHEL 5/6 i386/x86_64, with cumin-0.1.5658-1.el5 

I could not decide whether this is the same bug or a different one. The traceback is similar, but not exactly the same, and the steps to reproduce are different as well, but the root cause might be the same thing. So first I just wanted to add this comment, so that we can decide whether to open a new bz or set this one back to assigned.

Comment 30 Peter Belanyi 2013-02-06 12:16:45 UTC
1. I have opened bug 908266 for the issue mentioned in comment 27.

2. Regarding this bug: I could not reproduce it and I don't know how I could verify that this is no longer an issue. Can you give us any suggestions or hints on testing?

Comment 33 Trevor McKay 2013-02-07 18:10:56 UTC
Okay, here is the best I can do.  It is a white box test using a common patch for cumin-0.1.5444-3 and the latest version.  The patch forces GROUP_QUOTA_DYNAMIC to be deleted for a specific quota subshare in a routine that is common to both versions.  This routine is called immediately before line 390 in negotiator.py which appears in the traceback in comment 1 and comment 13.

The necessary corresponding quota config file is attached as well to set up the expected quotas in condor.

Under cumin-0.1.5444-3, this will cause the KeyError 100% when the subshare or one of its siblings is edited (GRID.ONE or GRID.TWO).  The traceback in comment 1 and comment 13 will be seen (or something very, very close as this was produced against an earlier cumin)

In the latest version, editing will proceed without an error.

In both cases, web.stdout will contain this message to confirm that the quota values were modified programmatically:

Delete config GROUP_QUOTA_DYNAMIC for group coupling.normal

Steps to reproduce on old cumin:

1) Install cumin-0.1.5444-3 and condor on the same machine
2) cd /usr/share/cumin
3) patch -p1 < BZ760567.patch
4) put the attached quota config file in /etc/condor/config.d/80quota.config to create the quotas.
5) service condor restart
6) service cumin restart
7) Navigate to the quota tab.
8) Notice that any of the top level groups can be edited
9) Expand the Grid group.  Click on the link to edit Grid.One or Grid.Two
10) Boom!  traceback

Test against new Cumin:

1) Install latest cumin for testing
2) cd /usr/share/cumin
3) patch -p1 < BZ760567.patch
4) put the attached quota config file in /etc/condor/config.d/80quota.config to create the quotas.
5) service condor restart
6) service cumin restart
7) Navigate to the quota tab.
8) Notice that any of the top level groups can be edited
9) Expand the Grid group.  Click on the link to edit Grid.One or Grid.Two
10) Everything is fine!
11) Examine web.stderr to confirm that the quota group was deleted.

Steps to verify on both versions that the patch does not have unintended side effects beyond the test:

1) Set up either/both versions as above
2) Modify the /etc/condor/config.d/80quota.config file to set up dynamic quota groups with subshares but *CHANGE* the GRID group names to something else so the deletion code will not run
3) service condor restart
4) Edit dynamic quotas in cumin.  Unless the original BZ is inadvertently stumbled upon in the early version, editing should be successful with no errors.  Even if the error does occur on the older version, it should not be 100% reproducible as it is when the quota names match the patch.

The error should *never* occur on the newer version.  If a traceback is seen, please attach it as it is likely to be a new error.

Comment 34 Trevor McKay 2013-02-07 18:13:32 UTC
Created attachment 694684 [details]
Patch to introduce an intentional whitebox test error

This patch is common to cumin-0.1.5444-3 and the latest versions under test

Comment 35 Trevor McKay 2013-02-07 18:15:29 UTC
Created attachment 694685 [details]
Quota config file to place in /etc/condor/config.d to set up test

Comment 36 Peter Belanyi 2013-02-08 16:15:43 UTC
Thank you very much for the whitebox test.

Using the patch I was able to reproduce the bug on cumin-0.1.5444-3, the traceback was almost identical to the original one. After changing the group names it was working fine, so the patch did not have unintended side effects.

For the new version I tested cumin-0.1.5675-1 on RHEL 5/6 i386/x86_64, everything was working as expected, I did not see any tracebacks, and the patch did not have unintended side effects.

--> VERIFIED

Comment 38 errata-xmlrpc 2013-03-06 18:39:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0564.html


Note You need to log in before you can comment on or make changes to this bug.