Bug 880855
Summary: | race in createRepo (_link_rpms trying to link non-existent rpm) | ||
---|---|---|---|
Product: | [Retired] Beaker | Reporter: | Dan Callaghan <dcallagh> |
Component: | scheduler | Assignee: | Nick Coghlan <ncoghlan> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | tools-bugs <tools-bugs> |
Severity: | unspecified | Docs Contact: | |
Priority: | high | ||
Version: | 0.10 | CC: | asaha, dcallagh, llim, pbunyan, qwan, rglasz, rmancy, xjia |
Target Milestone: | 0.13 | Keywords: | Regression |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | Scheduler | ||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2013-06-25 06:27:45 UTC | Type: | Bug |
Description
Dan Callaghan
2012-11-27 22:49:22 UTC
This is not too serious, since the recipe will be retried (and should proceed) on the next iteration of queued_recipes, and it has only happened 11 times in our production Beaker environment since 0.10 was released. However, we should fix it since it might be indicative of other, worse problems. Also not giving this devel_ack+ yet since I'm not sure where the missing flock acquisition is (nor whether that is even the problem).

*** Bug 953209 has been marked as a duplicate of this bug. ***

Need to reconsider this for 1.0, since it appears the symptoms have changed in 0.12 (to abort rather than retry).

Unfortunately, we still don't know how this is getting triggered, so we can't commit to having it fixed in 1.0 :(

Then again... Task.disable unlinks RPMs without holding the flock [1]. The other Task XML-RPC APIs (upload and save) only work with new tasks, so they couldn't cause any problems, but a disable operation while a createRepo() call was running could definitely cause these symptoms.

[1] http://git.beaker-project.org/cgit/beaker/tree/Server/bkr/server/model.py?h=develop#n6477

Initial patch: http://gerrit.beaker-project.org/1926

Based on feedback on the patch, I'm going to rework this to create a clear TaskLibrary abstraction that localises all responsibility for manipulation of the RPM library in one place.
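To illustrate the race described above, here is a minimal sketch of what a single lock-holding abstraction might look like. It is not the Beaker implementation: the class body, the `_locked` helper, the `unlink_rpm` method name, and the choice of a shared lock for snapshots versus an exclusive lock for removal are all assumptions made for illustration. Only the names `TaskLibrary`, `make_snapshot_repo`, `_link_rpms`-style hard-linking, and `_all_rpms` echo the bug report.

```python
import fcntl
import os
from contextlib import contextmanager


class TaskLibrary:
    """Hypothetical sketch: every operation on the RPM directory takes the same flock."""

    def __init__(self, rpmspath):
        self.rpmspath = rpmspath

    @contextmanager
    def _locked(self, exclusive):
        # Assumption: the lock is taken on the library directory itself.
        fd = os.open(self.rpmspath, os.O_RDONLY)
        try:
            fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
            yield
        finally:
            os.close(fd)  # closing the descriptor releases the lock

    def _all_rpms(self):
        # Yield (name, full path) for each RPM file currently in the library.
        for name in sorted(os.listdir(self.rpmspath)):
            srcpath = os.path.join(self.rpmspath, name)
            if name.endswith(".rpm") and os.path.isfile(srcpath):
                yield name, srcpath

    def make_snapshot_repo(self, repo_dir):
        # Listing and hard-linking happen under one lock, so no RPM can
        # disappear between the directory scan and the os.link() call.
        with self._locked(exclusive=False):
            for name, srcpath in self._all_rpms():
                os.link(srcpath, os.path.join(repo_dir, name))

    def unlink_rpm(self, name):
        # Removal waits until no snapshot is in progress.
        with self._locked(exclusive=True):
            path = os.path.join(self.rpmspath, name)
            if os.path.exists(path):
                os.unlink(path)
```

With this shape, a disable operation that calls `unlink_rpm` blocks while a snapshot holds the shared lock, instead of removing a file that the snapshot is about to link.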
dcallagh found a couple of errors in the reworked patch:

2013-05-21 16:55:19,267 beakerd ERROR Error in schedule_queued_recipe(541)
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/bkr/server/tools/beakerd.py", line 397, in schedule_queued_recipes
    schedule_queued_recipe(recipe_id)
  File "/usr/lib/python2.6/site-packages/bkr/server/tools/beakerd.py", line 496, in schedule_queued_recipe
    recipe.createRepo()
  File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 5278, in createRepo
    Task.make_snapshot_repo(snapshot_repo)
  File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 6587, in make_snapshot_repo
    return cls.library.make_snapshot_repo(repo_dir)
  File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 6451, in make_snapshot_repo
    self._link_rpms(repo_dir)
  File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 6436, in _link_rpms
    for srcpath in self._all_rpms():
  File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 6429, in _all_rpms
    if os.path.isdir(srcname):
NameError: global name 'srcname' is not defined

2013-05-21 16:59:15,453 beakerd ERROR Error in schedule_queued_recipe(542)
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/bkr/server/tools/beakerd.py", line 397, in schedule_queued_recipes
    schedule_queued_recipe(recipe_id)
  File "/usr/lib/python2.6/site-packages/bkr/server/tools/beakerd.py", line 496, in schedule_queued_recipe
    recipe.createRepo()
  File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 5278, in createRepo
    Task.make_snapshot_repo(snapshot_repo)
  File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 6587, in make_snapshot_repo
    return cls.library.make_snapshot_repo(repo_dir)
  File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 6451, in make_snapshot_repo
    self._link_rpms(repo_dir)
  File "/usr/lib/python2.6/site-packages/bkr/server/model.py", line 6437, in _link_rpms
    dstname = os.path.join(dst, name)
NameError: global name 'name' is not defined

I tracked down the gap in the test suite: those two code paths are only hit when there are tasks in the task library, and the test suite doesn't add any before it checks that provisioning works.

/me bumps "getting basic coverage data for the current test suite" further up the wish list...

Updated with fixes and test suite enhancements: http://gerrit.beaker-project.org/#/c/1958/

Beaker 0.13.1 has been released.
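The test-suite gap mentioned above can be reproduced in isolation. The sketch below is a hypothetical reconstruction, not the code from the patch: only the line `if os.path.isdir(srcname):` comes from the traceback, and the surrounding function, the `raises_name_error` helper, and the file name `task.rpm` are invented for illustration. It shows why a misspelled name inside a loop body goes unnoticed when the directory being iterated is empty.

```python
import os
import tempfile


def all_rpms_buggy(rpmspath):
    # Hypothetical reconstruction of the faulty loop from the traceback.
    for name in os.listdir(rpmspath):
        srcpath = os.path.join(rpmspath, name)
        if os.path.isdir(srcname):  # typo: should be srcpath
            continue
        yield srcpath


def raises_name_error(rpmspath):
    # Drain the generator and report whether the typo was reached.
    try:
        list(all_rpms_buggy(rpmspath))
    except NameError:
        return True
    return False


# Empty library: the loop body never runs, so the typo stays hidden.
with tempfile.TemporaryDirectory() as empty:
    empty_result = raises_name_error(empty)

# Populated library: the first iteration evaluates the misspelled name.
with tempfile.TemporaryDirectory() as populated:
    open(os.path.join(populated, "task.rpm"), "w").close()
    populated_result = raises_name_error(populated)
```

A test that adds at least one task to the library before exercising the snapshot path would have caught both NameErrors.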