Bug 950646
| Summary: | OverflowError in beah test harness poll call | ||
|---|---|---|---|
| Product: | [Retired] Beaker | Reporter: | Petr Sklenar <psklenar> |
| Component: | scheduler | Assignee: | beaker-dev-list |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | tools-bugs <tools-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 0.9 | CC: | aigao, asaha, azelinka, dcallagh, llim, psklenar, qwan, rmancy, xjia |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-10-02 04:36:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 798718 | ||
Description
Petr Sklenar
2013-04-10 14:50:20 UTC
(In reply to comment #0)
> Description of problem:
> Beaker job is aborted when test rpm is not downloaded
>
> Version-Release number of selected component (if applicable):
> Version - 0.11.3
>
> How reproducible:
> rarely
>
> Steps to Reproduce:
> 1, sometime (for unknown reason to me) yum doesn't download rpm:
> https://beaker.engineering.redhat.com/recipes/844894#task11845642
> #Dont know why rpm cannot be downloaded. It's possible that someone could
> create another rpm in that time but beaker shouldn't abort the whole job.

The util-linux-ng package wasn't installed because it's not present in the RHEL7 tree you used.

$ repoquery --disablerepo=* --enablerepo=RHEL-7.0-20130306.0 --repofrompath=RHEL-7.0-20130306.0,http://download.eng.bos.redhat.com/rel-eng/RHEL-7.0-20130306.0/compose/Server/x86_64/os/ util-linux-ng
$ repoquery --disablerepo=* --enablerepo=RHEL-7.0-20130306.0 --repofrompath=RHEL-7.0-20130306.0,http://download.eng.bos.redhat.com/rel-eng/RHEL-7.0-20130306.0/compose/Server/x86_64/os/ util-linux
util-linux-0:2.22.1-2.4.el7.x86_64

But that didn't abort your job. The actual error seems to be here:

2013-04-09 21:49:25,697 backend async_proc: INFO Extending Watchdog for task 11845649 by 9000..
04/09/13 21:49:25 JobID:402207 Test:/CoreOS/vixie-cron/Regression/bug-232439_fail_on_first_Jan Response:1
2013-04-09 21:49:25,804 rhts_task checkin_start: INFO setting nohup
04/09/13 21:49:25 testID:11845649 start:
2006-12-31 09:56:00,282 backend.twisted emit: ERROR Unhandled Error
Traceback (most recent call last):
  File "/usr/bin/beah-beaker-backend", line 9, in <module>
    load_entry_point('beah==0.6.43.dev201303102204', 'console_scripts', 'beah-beaker-backend')()
  File "/usr/lib/python2.7/site-packages/beah/backends/beakerlc.py", line 2007, in main
    debug.runcall(reactor.run)
  File "/usr/lib/python2.7/site-packages/beah/core/debug.py", line 11, in runcall
    a_callable(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 1169, in run
    self.mainLoop()
--- <exception caught here> ---
  File "/usr/lib64/python2.7/site-packages/twisted/internet/base.py", line 1181, in mainLoop
    self.doIteration(t)
  File "/usr/lib64/python2.7/site-packages/twisted/internet/epollreactor.py", line 362, in doPoll
    l = self._poller.poll(timeout, len(self._selectables))
exceptions.OverflowError: timeout is too large

The OverflowError is repeated forever until the watchdog aborts the job. I'm not sure why this would happen; it seems like it must be a harness bug, particularly since you had the same thing happen at the same point in your recipe on another system.

I also noticed on the console log for R:844894 a very large number of RAID and SCSI offline errors from the kernel. Are those expected as part of the util-linux tests?

Hi Petr, as per Dan's question above, could you provide a bit more info on the expected impact of the util-linux tests?

Hi, I think that this is not due to util-linux(-ng) on RHEL7. We have a set of tier tests with roughly 100 tests for the whole team. Sometimes a user pushes several updates to one of the tests while a job is being scheduled and bumps the version several times .... Then the whole job is aborted instead of one test failing.
I will try it to be sure and will let you know.

Petr, bug 880855 affected versions prior to Beaker 0.13 and could result in jobs failing due to new task versions being uploaded. That's not the bug covered by this issue though - we're interested in the OverflowError noted above.

I tried to reproduce it but didn't succeed. I ran the same sets of tests and it works now (J:506814 or J:506813). FYI, the util-linux(|-ng) test cases do not expect any RAID/SCSI errors.

OK, we made a few reliability improvements to both beah and task repo creation over the last few releases, so it's quite plausible that this has been fixed since it was first encountered. Closing this one - please file a new bug report if anything similar recurs.
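
For reference, the traceback ends in `self._poller.poll(timeout, ...)` raising `OverflowError: timeout is too large`. The following is a minimal sketch of how such an error can arise, assuming the poller converts the timeout from seconds to milliseconds and requires the result to fit in a signed 32-bit C int (the type of the `timeout` argument of `epoll_wait`). The helper names `poll_timeout_ms` and `clamp_timeout`, the 24-hour cap, and the scheduled time are hypothetical and chosen only for illustration. The two dates are taken from the log in this report, where entries jump from 2013-04-09 to 2006-12-31 just before the error; whether that clock change produced the oversized timeout is an assumption, not something this report confirms.

```python
from datetime import datetime

# Largest value of a signed 32-bit C int; epoll_wait takes its timeout
# as an int number of milliseconds.
INT_MAX = 2**31 - 1


def poll_timeout_ms(timeout_seconds):
    """Convert a timeout in seconds to milliseconds (hypothetical helper).

    Mirrors the kind of range check a poller has to make before handing
    the value to epoll_wait.
    """
    ms = timeout_seconds * 1000.0
    if ms > INT_MAX:
        raise OverflowError("timeout is too large")
    return int(ms)


def clamp_timeout(timeout_seconds, limit_seconds=86400.0):
    """Cap a timeout so it always passes the check above (hypothetical).

    The 24-hour default is an arbitrary illustrative choice.
    """
    return min(timeout_seconds, limit_seconds)


# Assumption: a call was scheduled against the 2013 wall clock, and the
# clock was then moved back to 2006, as the log timestamps suggest.
scheduled = datetime(2013, 4, 9, 21, 49, 26)
now_after_reset = datetime(2006, 12, 31, 9, 56, 0)
delay = (scheduled - now_after_reset).total_seconds()

try:
    poll_timeout_ms(delay)
except OverflowError as exc:
    print("unclamped:", exc)

print("clamped ms:", poll_timeout_ms(clamp_timeout(delay)))
```

Under these assumptions, any timeout longer than about 24.8 days exceeds the limit, so a delay computed across a multi-year backwards clock change would fail on every poll iteration unless it is capped first.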