Bug 994970
Summary: | [RFE] Actively fight machine pool depletion | ||
---|---|---|---|
Product: | [Retired] Beaker | Reporter: | Hubert Kario <hkario> |
Component: | general | Assignee: | beaker-dev-list |
Status: | CLOSED WONTFIX | QA Contact: | tools-bugs <tools-bugs> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 0.13 | CC: | cbouchar, fedora, pholica, qwan, tools-bugs |
Target Milestone: | --- | Keywords: | FutureFeature |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Enhancement | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-11-19 21:55:10 UTC | Type: | Bug |
Bug Depends On: | 991236 | ||
Bug Blocks: |
Description
Hubert Kario
2013-08-08 10:20:41 UTC
(In reply to Hubert Kario from comment #0)
> Beaker should schedule a "hardware check" job every few ours that installs

That should obviously be:
> Beaker should schedule a "hardware check" job every few hours that installs
                                                          ^^^^^

This is an interesting idea. Is this primarily so then you don't waste time reserving a system that would have otherwise failed this hardware test? Has this been a common problem for you?

The shortage of resources is already quite acute, and having more machines taken out of circulation while they have basic hardware tests performed on them would cause even further strain. I wonder if it might make more sense to do similar testing before/after a recipe is run on a system.

The proposal is to do this on *Broken* machines, to see if they can be set back to Automated. The idea is to automatically pick up systems that actually failed due to some external problem in the lab, rather than anything being inherently wrong with the system itself.

We need a more comprehensive machine health check that *maintainers* can initiate before we can consider automating any such check, though.

(In reply to Raymond Mancy from comment #3)
> This is an interesting idea. Is this primarily so then you don't waste time
> reserving a system that would have otherwise failed this hardware test? Has
> this been a common problem for you?

This is a different problem, but yes, I've seen 7-8% provisioning (/distribution/install task) failure rate.

> The shortage of resources is already quite acute, and having more machines
> taken out of circulation while they have basic hardware tests performed on
> them would cause even further strain.

As Nick said, the issue at hand is supposed to alleviate the shortage of machines.

(In reply to Nick Coghlan from comment #4)
> The proposal is to do this on *Broken* machines

Oh, ok. That makes more sense.