Bug 1095028
Summary: | [Consume] fix for bug 1167735 [3.5-7.0] Abort media check will cause system halt [targeted to 7.3] | |
---|---|---|---
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | haiyang,dong <hadong>
Component: | ovirt-node | Assignee: | Douglas Schilling Landgraf <dougsland>
Status: | CLOSED ERRATA | QA Contact: | cshao <cshao>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 3.5.0 | CC: | aberezin, aburden, anande, cshao, dfediuck, fdeutsch, gklein, gouyang, hadong, harald, jspahr, leiwang, mgoldboi, ricardo.arguello, ycui
Target Milestone: | ovirt-3.6.10 | Keywords: | Reopened, TestOnly, Tracking, ZStream
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | rhev-hypervisor7-7.3-20160923.2 | Doc Type: | Known Issue
Doc Text: |
Aborting the media integrity check during Red Hat Enterprise Virtualization Hypervisor 7.0 boot causes a system halt and failure to boot. This behaviour is fixed and the integrity check can be skipped.
|
Story Points: | --- | |
Clone Of: | | |
: | 1415068 | Environment: |
Last Closed: | 2017-01-20 07:16:05 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Node | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1167735 | |
Bug Blocks: | 1094719, 1415068 | |
Attachments: |
|
Judging from bz#817419 and bz#555107, this is intended behavior, or perhaps an unintended consequence. Previously, dracut's livemediacheck ignored bad returns from checkisomd5, so aborting the check would continue. But per bz#555107, aborting returns 1, which triggers a failure in checkisomd5, which brings us to the current behavior (Node dies if the media check is aborted). I'm going to close this as NOTABUG. If you want to reopen, we'll need to ask the isomd5sum and dracut maintainers to return a different code when the media check is aborted, which dracut can then check for and let pass.

I can still reproduce this issue with the latest build.

Version-Release number of selected component (if applicable):
rhev-hypervisor7-7.0-20141119.0.el7ev
ovirt-node-3.1.0-0.27.20141119git24e087e.el7.noarch

How reproducible: 100%

Steps to Reproduce:
1. Boot from RHEV-H iso.
2. Press Esc key to abort media check.
3. Focus on screen.

Actual results: [3.5-7.0] Abort media check will cause system halt.
Expected results: Abort media check will not cause system halt.

Reopening this bug because a system halt is not acceptable to users, and there is no such issue with the RHEV 3.4 build. Thanks!

RHEV-H 3.4 is RHEL 6.6 based, so the behavior can differ. As noted in another bug, please use rd.live.check=0 to disable the check.

Okay, reopening, but only to track this in the release notes.

Moving to the dracut component and reopening this bug. Hopefully dracut can fix the system halt, because the issue is not acceptable to users. Thanks!

Hey Fabian, if you still suggest using the workaround to fix this bug, then I suggest adding "rd.live.check=0" to the kernel arguments by default. So I need to assign this bug again.

(In reply to haiyang,dong from comment #8)
> Hey Fabian, if you still suggest using the workaround to fix this bug, then
> I suggest adding "rd.live.check=0" to the kernel arguments by default.
> So I need to assign this bug again.

No. The default is to run the check, as we want to be safe that the iso is good. My suggestion is that we add a known issue for this. Would you agree on that solution?

(In reply to Fabian Deutsch from comment #9)
> (In reply to haiyang,dong from comment #8)
> > Hey Fabian, if you still suggest using the workaround to fix this bug, then
> > I suggest adding "rd.live.check=0" to the kernel arguments by default.
> > So I need to assign this bug again.
>
> No. The default is to run the check, as we want to be safe that the iso is
> good. My suggestion is that we add a known issue for this. Would you agree
> on that solution?

Hey Fabian, on server machines the media check takes a long time during boot, so users may press "Esc" to abort it. But aborting the media check causes a system halt, which is not acceptable for us. I still suggest we fix this issue, with no workaround and no known issue in the release notes.

(In reply to haiyang,dong from comment #10)
...
> On server machines the media check takes a long time during boot, so users
> may press "Esc" to abort it.

How long does the check take?

> But aborting the media check causes a system halt, which is not acceptable
> for us.

Why is the time to do the check unacceptable? And if it's unacceptable to you: you can disable it using rd.live.check=0. I just do not want to make this the default.

> I still suggest we fix this issue, with no workaround and no known issue in
> the release notes.

It is going to be fixed in dracut.

(In reply to Fabian Deutsch from comment #11)
> (In reply to haiyang,dong from comment #10)
> ...
> > On server machines the media check takes a long time during boot, so users
> > may press "Esc" to abort it.
>
> How long does the check take?

The media check takes about 110 seconds on my server machines.

> > But aborting the media check causes a system halt, which is not acceptable
> > for us.
>
> Why is the time to do the check unacceptable? And if it's unacceptable to
> you: you can disable it using rd.live.check=0

Sorry for the misunderstanding. I mean that if users press "Esc" to abort the media check, it causes a system halt, and that is what is not acceptable for us.

> I just do not want to make this the default.
>
> > I still suggest we fix this issue, with no workaround and no known issue in
> > the release notes.
>
> It is going to be fixed in dracut.

ok, thanks

(In reply to haiyang,dong from comment #12)
> (In reply to Fabian Deutsch from comment #11)
> > (In reply to haiyang,dong from comment #10)
> > ...
> > > On server machines the media check takes a long time during boot, so
> > > users may press "Esc" to abort it.
> >
> > How long does the check take?
>
> The media check takes about 110 seconds on my server machines.

That is indeed very long. We should investigate why it is taking so long.

> > > But aborting the media check causes a system halt, which is not
> > > acceptable for us.
> >
> > Why is the time to do the check unacceptable? And if it's unacceptable to
> > you: you can disable it using rd.live.check=0
>
> Sorry for the misunderstanding. I mean that if users press "Esc" to abort
> the media check, it causes a system halt, and that is what is not acceptable
> for us.

Thanks for the clarification.

Moving this back to ON_QA, because we've got the workaround, and once it's fixed in dracut, it will land in RHEV-H as well.

Test version:
rhev-hypervisor7-7.0-20141212.0.iso
ovirt-node-3.1.0-0.34.20141210git0c9c493.el7.noarch

Hey Fabian, the workaround of adding the kernel argument rd.live.check=0 to the kernel command line did not prevent the media check. I needed to remove "rd.live.check" from the kernel command line to prevent it. Since the release notes give the wrong workaround, I need to reassign this bug.

Test version:
rhev-hypervisor7-7.0-20150112.0.el7ev
ovirt-node-3.1.0-0.42.20150109gitd06b7c5.el7.noarch

Workaround: remove the kernel argument rd.live.check from the kernel command line to prevent the media check. This workaround works well, so I changed the status to "VERIFIED".

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2015-0160.html

(In reply to haiyang,dong from comment #15)
> Test version:
> rhev-hypervisor7-7.0-20150112.0.el7ev
> ovirt-node-3.1.0-0.42.20150109gitd06b7c5.el7.noarch
>
> Workaround: remove the kernel argument rd.live.check from the kernel command
> line to prevent the media check. This workaround works well, so I changed
> the status to "VERIFIED".

I see that 'rd.live.check' is still present as a kernel argument in rhev-hypervisor7-7.1-20150512.1.iso [See screenshot]

Created attachment 1038382
screenshot1
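The workaround verified above amounts to dropping the `rd.live.check` token from the kernel command line before booting. A minimal sketch of that edit as a string operation follows. The function name and the example command line are hypothetical and not taken from any RHEV-H image; only the argument name `rd.live.check` comes from this thread.

```python
def strip_kernel_arg(cmdline: str, name: str = "rd.live.check") -> str:
    """Return cmdline without the bare argument `name` or any `name=value` form."""
    kept = [
        token
        for token in cmdline.split()
        if token != name and not token.startswith(name + "=")
    ]
    return " ".join(kept)


# Hypothetical boot entry, used only to illustrate the edit.
example = "root=live:CDLABEL=EXAMPLE ro rd.live.check quiet"
print(strip_kernel_arg(example))
```

The sketch also removes the `rd.live.check=0` form, since the thread reports that this form did not disable the check on the tested build.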
(In reply to Anand Nande from comment #18)
> (In reply to haiyang,dong from comment #15)
> > Test version:
> > rhev-hypervisor7-7.0-20150112.0.el7ev
> > ovirt-node-3.1.0-0.42.20150109gitd06b7c5.el7.noarch
> >
> > Workaround: remove the kernel argument rd.live.check from the kernel
> > command line to prevent the media check. This workaround works well, so I
> > changed the status to "VERIFIED".
>
> I see that 'rd.live.check' is still present as a kernel argument in
> rhev-hypervisor7-7.1-20150512.1.iso [See screenshot]

There is no patch for this bug. The "Doc Text" gives the workaround: the user needs to manually remove the kernel argument rd.live.check from the kernel command line to prevent the media check. If you want a patch that deletes "rd.live.check" from the kernel command line, please reopen this bug.

(In reply to haiyang,dong from comment #20)
> (In reply to Anand Nande from comment #18)
> > (In reply to haiyang,dong from comment #15)
> > > Test version:
> > > rhev-hypervisor7-7.0-20150112.0.el7ev
> > > ovirt-node-3.1.0-0.42.20150109gitd06b7c5.el7.noarch
> > >
> > > Workaround: remove the kernel argument rd.live.check from the kernel
> > > command line to prevent the media check. This workaround works well, so
> > > I changed the status to "VERIFIED".
> >
> > I see that 'rd.live.check' is still present as a kernel argument in
> > rhev-hypervisor7-7.1-20150512.1.iso [See screenshot]
>
> There is no patch for this bug. The "Doc Text" gives the workaround: the
> user needs to manually remove the kernel argument rd.live.check from the
> kernel command line to prevent the media check. If you want a patch that
> deletes "rd.live.check" from the kernel command line, please reopen this bug.

Yes

(In reply to Anand Nande from comment #21)
> (In reply to haiyang,dong from comment #20)
> > There is no patch for this bug. The "Doc Text" gives the workaround: the
> > user needs to manually remove the kernel argument rd.live.check from the
> > kernel command line to prevent the media check. If you want a patch that
> > deletes "rd.live.check" from the kernel command line, please reopen this
> > bug.
>
> Yes

Hey Fabian, could you state whether or not we need a patch in ovirt-node that deletes "rd.live.check" from the kernel command line?

Let me first ask: Anand, why would you like to see the kernel argument removed? RHEV-H is livecd based, thus the initial check is very valuable, because it ensures that the ISO for booting has no errors.

Closing this bug according to comment 17. Regarding the discussion from comment 20 on: if you really want to discuss the removal of that kernel argument, then please open a new bug.

We will compare the behavior between RHEV-H and RHEL to ensure that the behavior on RHEV-H is the same as on RHEL.

Bug 1167735 is tracking this misbehavior in RHEL. Once that is fixed, it will directly be inherited in RHEV-H. I'm leaving this one open for tracking purposes.

No code change should be made from the node side, since the behavior should be equal to RHEL.

This bug is not considered a blocker for the release (removing the regression keyword so it won't reappear).

I can reproduce this issue with RHEV-H 7.1 for RHEV 3.5.5 (rhev-hypervisor-7-7.1-20150917.0)+(ovirt-node-3.2.3-23.el7.noarch) build.

Hi,

To check whether rhev-h 7.3.x contains the fix:
- I removed rhgb and quiet from grub and booted the iso.
- Waited until the media check and pressed ESC
- The installation continued

Based on the above, moving the bug to ON_QA.

Test version:
rhev-hypervisor7-7.3-20161007.0
ovirt-node-3.6.1-31.0.el7ev.noarch
plymouth-0.8.9-0.26.20140113.el7.x86_64

Test steps:
1. Boot from ISO.
2. Wait until the media check and press ESC.

Test result: The installation continued.

So the bug is fixed; changing bug status to VERIFIED.

As per comment #17 this bug shouldn't have been reopened. A new bug should have been opened to address the issues. I'm now cloning this bug to a new one and moving this back to closed errata. I'll move the cloned bug to verified, since this bug is currently in verified state. Please move further comments to bug #1415068.
|
Created attachment 893053
Screenshot of the system halted screen

Description of problem:
Right after booting from the installation CD, an integrity check of the medium is performed. It says that it can be cancelled by pressing Esc. However, after doing so, dracut says that it cannot continue because the integrity check failed, and the system halts (see system haled screen.png).

Version-Release number of selected component (if applicable):
rhevh-7.0-20140424.0.iso
ovirt-node-3.1.0-0.2.20140424gitbfdfc00.el7

How reproducible: 100%

Steps to Reproduce:
Actual results:
Expected results:

Additional info:
This issue does not occur in rhevh 6.5 GA, so it is a regression.
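The analysis in the first comment of this thread comes down to how the caller interprets the exit code of the media check. A minimal sketch of the two policies follows, under stated assumptions: only "aborting returns 1" comes from the thread, while "0 means passed" is the usual exit-code convention and the distinct abort code is a hypothetical value chosen for illustration. This is not the dracut or isomd5sum implementation.

```python
from enum import Enum


class BootAction(Enum):
    CONTINUE = "continue booting"
    HALT = "halt"


# Exit codes used in this sketch. Only "aborting returns 1" comes from the
# thread; the separate abort code is a hypothetical value for illustration.
CHECK_PASSED = 0
CHECK_FAILED = 1
CHECK_ABORTED_HYPOTHETICAL = 2


def reported_behaviour(exit_code: int) -> BootAction:
    """Any non-zero result stops the boot, so an abort reported as 1 halts."""
    return BootAction.CONTINUE if exit_code == CHECK_PASSED else BootAction.HALT


def suggested_behaviour(exit_code: int) -> BootAction:
    """A distinct abort code lets the caller skip the check and keep booting."""
    if exit_code in (CHECK_PASSED, CHECK_ABORTED_HYPOTHETICAL):
        return BootAction.CONTINUE
    return BootAction.HALT
```

With a separate abort code, a genuinely corrupt medium still halts the boot, while a user who presses Esc is allowed to continue, which is the outcome later verified on the 7.3 build.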