Bug 1095028 - [Consume] fix for bug 1167735 [3.5-7.0] Abort media check will cause system halt [targeted to 7.3]
Summary: [Consume] fix for bug 1167735 [3.5-7.0] Abort media check will cause system h...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-3.6.10
Target Release: ---
Assignee: Douglas Schilling Landgraf
QA Contact: cshao
URL:
Whiteboard:
Depends On: 1167735
Blocks: rhevh-7.0 1415068
Reported: 2014-05-07 02:26 UTC by haiyang,dong
Modified: 2020-02-14 17:27 UTC

Fixed In Version: rhev-hypervisor7-7.3-20160923.2
Doc Type: Known Issue
Doc Text:
Aborting the media integrity check during Red Hat Enterprise Virtualization Hypervisor 7.0 boot causes a system halt and failure to boot. This behaviour is fixed, and the integrity check can now be skipped.
Clone Of:
: 1415068 (view as bug list)
Environment:
Last Closed: 2017-01-20 07:16:05 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:


Attachments
attached Screenshot for system haled screen (16.78 KB, image/png)
2014-05-07 02:26 UTC, haiyang,dong
screenshot1 (7.47 KB, image/png)
2015-06-13 17:21 UTC, Anand Nande


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2015:0160 0 normal SHIPPED_LIVE ovirt-node bug fix and enhancement update 2015-02-12 01:34:52 UTC

Description haiyang,dong 2014-05-07 02:26:30 UTC
Created attachment 893053 [details]
attached Screenshot for system haled screen

Description of problem:
Right after booting from the installation CD, an integrity check of the medium is performed. The screen says the check can be cancelled by pressing Esc. However, after doing so, dracut reports that it cannot continue because the integrity check failed, and the system halts (see the attached screenshot).


Version-Release number of selected component (if applicable):
rhevh-7.0-20140424.0.iso 
ovirt-node-3.1.0-0.2.20140424gitbfdfc00.el7


How reproducible:
100%

Steps to Reproduce:


Actual results:

Expected results:

Additional info:
This issue does not occur in RHEV-H 6.5 GA, so it is a regression.

Comment 2 Ryan Barry 2014-07-22 15:43:07 UTC
Judging from bz#817419 and bz#555107, this is intended behavior, or perhaps an unintended consequence.

Previously, dracut's livemediacheck ignored bad returns from checkisomd5, so aborting the check would continue. But per bz#555107, aborting returns 1, which triggers a failure in checkisomd5, which brings us to the current behavior (Node dies if the media check is aborted).

I'm going to close this as NOTABUG.

If you want to reopen, we'll need to ask the isomd5sum and dracut maintainers to return a different code when the media check is aborted, one that dracut checks for and lets pass.
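
A minimal sketch of the idea in this comment, written in Python for illustration only (the real dracut hook is a shell script). The exit codes 0, 1 and 2 and all names below are assumptions made for the example, not values confirmed anywhere in this bug: an aborted check gets its own return code, and the boot logic treats that code as "continue" instead of "halt".

```python
from enum import Enum

# Hypothetical exit codes for the media checker. The actual values used by
# isomd5sum and dracut are not stated in this bug and may differ.
CHECK_PASSED = 0
CHECK_FAILED = 1
CHECK_ABORTED = 2


class BootAction(Enum):
    CONTINUE = "continue"
    HALT = "halt"


def action_for_check_result(exit_code: int) -> BootAction:
    """Map a media-check exit code to what the boot process should do."""
    if exit_code in (CHECK_PASSED, CHECK_ABORTED):
        # A passed check and a user abort (Esc) both let the boot continue.
        return BootAction.CONTINUE
    # A genuine checksum failure, or any unknown code, still halts.
    return BootAction.HALT
```

With a single shared "non-zero means failure" code, as described above, the abort case cannot be told apart from a corrupt medium, which is why a separate code would be needed.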

Comment 3 cshao 2014-11-25 10:51:43 UTC
I can still reproduce this issue with the latest build:

Version-Release number of selected component (if applicable):
rhev-hypervisor7-7.0-20141119.0.el7ev
ovirt-node-3.1.0-0.27.20141119git24e087e.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Boot from RHEV-H iso.
2. Press Esc key to abort media check.
3. Watch the screen.

Actual results:
Aborting the media check causes a system halt.

Expected results:
Aborting the media check does not cause a system halt.

Reopening this bug because a system halt is not acceptable to users, and the issue does not occur with the RHEV 3.4 build.

Thanks!

Comment 4 Fabian Deutsch 2014-11-25 10:53:03 UTC
RHEV-H 3.4 is RHEL 6.6 based, so the behavior can differ.

As noted in another bug, please use rd.live.check=0 to disable the check.

Comment 5 Fabian Deutsch 2014-11-25 11:02:48 UTC
Okay, reopening, but only to track this in the release notes.

Comment 6 cshao 2014-11-25 11:06:12 UTC
Moving this bug to the dracut component and reopening it.
Hopefully dracut can fix the system halt, because the issue is not acceptable to users.

Thanks!

Comment 8 haiyang,dong 2014-12-11 09:12:23 UTC
Hey Fabian, 

If you still suggest using the workaround to address this bug, then I suggest adding "rd.live.check=0" to the kernel arguments by default.

So I need to reassign this bug.

Comment 9 Fabian Deutsch 2014-12-11 10:33:17 UTC
(In reply to haiyang,dong from comment #8)
> Hey Fabian, 
> 
> If you still suggest using Workaround method to fixed this bug, so i suggest
> add "rd.live.check=0" into the kernel argument by default.
> 
> so i need to assigned this bug again.

No. The default is to run the check, as we want to be sure that the ISO is good.

My suggestion is that we add a known issue for this. Would you agree to that solution?

Comment 10 haiyang,dong 2014-12-11 11:49:27 UTC
(In reply to Fabian Deutsch from comment #9)
> (In reply to haiyang,dong from comment #8)
> > Hey Fabian, 
> > 
> > If you still suggest using Workaround method to fixed this bug, so i suggest
> > add "rd.live.check=0" into the kernel argument by default.
> > 
> > so i need to assigned this bug again.
> 
> No. The default is to run the check, as we want to be safe that the iso is
> good.
> 
> My suggestion is that we add a known issue for this, would you agree on
> that solution?

Hey Fabian,

On server machines the media check takes a long time during boot, so users may press "Esc" to abort it.
But aborting the media check causes a system halt, which is not acceptable for us.
I still suggest that we fix this issue instead of relying on a workaround or a known issue in the release notes.

Comment 11 Fabian Deutsch 2014-12-11 14:47:58 UTC
(In reply to haiyang,dong from comment #10)
...
> Due to for server machines,  media checking will take a long time for users
> during boot, the users maybe press "esc" to abort media check.

How long does the check take?

> But Abort media check will cause system halt, it's not unacceptable for us.

Why is the time to do the check unacceptable?
And if it's unacceptable to you: You can disable it using rd.live.check=0

I just do not want to make this the default.

> I still suggest we should fixed this issue, no workaround and no a known
> issue in the release notes.

It is going to be fixed in dracut.

Comment 12 haiyang,dong 2014-12-12 02:20:18 UTC
(In reply to Fabian Deutsch from comment #11)
> (In reply to haiyang,dong from comment #10)
> ...
> > Due to for server machines,  media checking will take a long time for users
> > during boot, the users maybe press "esc" to abort media check.
> 
> How long does the check take?

The media check takes about 110 seconds on my server machines.
> 
> > But Abort media check will cause system halt, it's not unacceptable for us.
> 
> Why is the time to do the check inacceptable?
> And if it's unacceptable to you: You can disable it using rd.live.check=0

Sorry for the misunderstanding. I mean that if users press "Esc" to abort the media check, it causes a system halt, and that is not acceptable for us.

> 
> I just do not want to make this the default.
> 
> > I still suggest we should fixed this issue, no workaround and no a known
> > issue in the release notes.
> 
> It is going to be fixed in dracut.
ok, thanks

Comment 13 Fabian Deutsch 2014-12-15 07:21:42 UTC
(In reply to haiyang,dong from comment #12)
> (In reply to Fabian Deutsch from comment #11)
> > (In reply to haiyang,dong from comment #10)
> > ...
> > > Due to for server machines,  media checking will take a long time for users
> > > during boot, the users maybe press "esc" to abort media check.
> > 
> > How long does the check take?
> 
> media checking will take about 110 seconds in my server machines.

That is indeed very long. We should investigate why it is taking so long.

> > 
> > > But Abort media check will cause system halt, it's not unacceptable for us.
> > 
> > Why is the time to do the check inacceptable?
> > And if it's unacceptable to you: You can disable it using rd.live.check=0
> 
> Sorry for your misunderstand what i mean. I mean that if the users want to
> press "esc" to abort media check, it will cause system halt, this is not
> unacceptable for us. 

Thanks for the clarification.

Moving this back to ON_QA, because we've got the workaround, and once it's fixed in dracut, it will land in RHEV-H as well.

Comment 14 haiyang,dong 2014-12-18 08:38:28 UTC
Test version:
rhev-hypervisor7-7.0-20141212.0.iso
ovirt-node-3.1.0-0.34.20141210git0c9c493.el7.noarch

Hey Fabian,

Workaround: adding the kernel argument rd.live.check=0 to the kernel command line did not prevent the media check.

I needed to remove "rd.live.check" from the kernel command line to prevent the media check.

Since the release notes give the wrong workaround, I need to reassign this bug.

Comment 15 haiyang,dong 2015-01-13 07:56:25 UTC
Test version:
rhev-hypervisor7-7.0-20150112.0.el7ev
ovirt-node-3.1.0-0.42.20150109gitd06b7c5.el7.noarch


Workaround: remove the kernel argument rd.live.check from the kernel command line to prevent the media check.

This workaround works well, so I changed the status to "VERIFIED".
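
For illustration, the workaround amounts to dropping one token from the boot command line. The helper below is a hypothetical sketch, not part of ovirt-node or dracut, and the example command line is invented for the demonstration. It removes both the bare `rd.live.check` form and any `rd.live.check=<value>` form, since comment 14 reports that setting the value to 0 did not prevent the check.

```python
def strip_kernel_arg(cmdline: str, name: str = "rd.live.check") -> str:
    """Return cmdline without the bare `name` argument or any `name=value` form.

    Assumes arguments are separated by whitespace and that no argument value
    contains quoted spaces.
    """
    kept = [
        arg
        for arg in cmdline.split()
        if arg != name and not arg.startswith(name + "=")
    ]
    return " ".join(kept)


# Hypothetical command line, for demonstration only.
example = "initrd=initrd0.img root=live:CDLABEL=RHEV-H rd.live.check rhgb quiet"
print(strip_kernel_arg(example))
```

In practice the same edit is made by hand in the boot menu before starting the installer. The function only shows which token has to go.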

Comment 17 errata-xmlrpc 2015-02-11 20:56:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-0160.html

Comment 18 Anand Nande 2015-06-13 17:20:21 UTC
(In reply to haiyang,dong from comment #15)
> Test version:
> rhev-hypervisor7-7.0-20150112.0.el7ev
> ovirt-node-3.1.0-0.42.20150109gitd06b7c5.el7.noarch
> 
> 
> Workaround :Remove the kernel argument rd.live.check from the kernel
> commandline to prevent the media check.
> 
> this workaround works well, so changed it's status into "VERIFIED".

I see that 'rd.live.check' is still present as a kernel argument in rhev-hypervisor7-7.1-20150512.1.iso [see screenshot].

Comment 19 Anand Nande 2015-06-13 17:21:44 UTC
Created attachment 1038382 [details]
screenshot1

Comment 20 haiyang,dong 2015-06-15 02:30:42 UTC
(In reply to Anand Nande from comment #18)
> (In reply to haiyang,dong from comment #15)
> > Test version:
> > rhev-hypervisor7-7.0-20150112.0.el7ev
> > ovirt-node-3.1.0-0.42.20150109gitd06b7c5.el7.noarch
> > 
> > 
> > Workaround :Remove the kernel argument rd.live.check from the kernel
> > commandline to prevent the media check.
> > 
> > this workaround works well, so changed it's status into "VERIFIED".
> 
> I see that 'rd.live.check' is still present as the kernel argument in
> rhev-hypervisor7-7.1-20150512.1.iso  [See screenshot]

There is no patch for this bug. The "Doc Text" gives the workaround: the user needs to manually remove the kernel argument rd.live.check from the kernel command line to prevent the media check.

If you want a patch that deletes "rd.live.check" from the kernel command line, please reopen this bug.

Comment 21 Anand Nande 2015-06-30 09:12:35 UTC
(In reply to haiyang,dong from comment #20)
> (In reply to Anand Nande from comment #18)
> > (In reply to haiyang,dong from comment #15)
> > > Test version:
> > > rhev-hypervisor7-7.0-20150112.0.el7ev
> > > ovirt-node-3.1.0-0.42.20150109gitd06b7c5.el7.noarch
> > > 
> > > 
> > > Workaround :Remove the kernel argument rd.live.check from the kernel
> > > commandline to prevent the media check.
> > > 
> > > this workaround works well, so changed it's status into "VERIFIED".
> > 
> > I see that 'rd.live.check' is still present as the kernel argument in
> > rhev-hypervisor7-7.1-20150512.1.iso  [See screenshot]
> 
> No patch for this bug, from "Doc Text", we could get the workaround method
> for this bug, the user need to remove the kernel argument rd.live.check from
> the kernel commandline by manual to prevent the media check.
> 
> If you want have a patch to deleted "rd.live.check" from the kernel
> commandline, please open this bug again

Yes

Comment 22 haiyang,dong 2015-06-30 10:10:53 UTC
(In reply to Anand Nande from comment #21)
> (In reply to haiyang,dong from comment #20)

> > 
> > No patch for this bug, from "Doc Text", we could get the workaround method
> > for this bug, the user need to remove the kernel argument rd.live.check from
> > the kernel commandline by manual to prevent the media check.
> > 
> > If you want have a patch to deleted "rd.live.check" from the kernel
> > commandline, please open this bug again
> 
> Yes

Hey Fabian,

Could you say whether or not we need a patch in ovirt-node to delete "rd.live.check" from the kernel command line?

Comment 23 Fabian Deutsch 2015-06-30 10:20:11 UTC
Let me first ask:

Anand, why would you like to see the kernel argument removed?

RHEV-H is livecd based, so the initial check is very valuable, because it ensures that the ISO used for booting has no errors.

Comment 24 Fabian Deutsch 2015-07-02 09:11:29 UTC
Closing this bug according to comment 17.

Regarding the discussion from comment 20 on: if you really want to discuss the removal of that kernel argument, then please open a new bug.

Comment 31 Fabian Deutsch 2015-08-18 09:33:21 UTC
We will compare the behavior between RHEV-H and RHEL to ensure that the behavior on RHEV-H is the same as on RHEL.

Comment 37 Fabian Deutsch 2015-08-31 06:57:03 UTC
Bug 1167735 is tracking this misbehavior in RHEL; once that is fixed, the fix will be inherited directly by RHEV-H.

Comment 40 Moran Goldboim 2015-09-20 13:48:52 UTC
I'm leaving this one open for tracking purposes.
No code change should be made from node side since behavior should be equal to RHEL.
This bug is not considered a blocker for the release (removing the regression keyword so it won't reappear)

Comment 41 cshao 2015-09-22 10:48:41 UTC
I can reproduce this issue with RHEV-H 7.1 for RHEV 3.5.5 (rhev-hypervisor-7-7.1-20150917.0)+(ovirt-node-3.2.3-23.el7.noarch) build.

Comment 48 Douglas Schilling Landgraf 2016-09-29 20:13:09 UTC
Hi,

To check if rhev-h 7.3.x contains the fix:

    - I have removed rhgb and quiet from grub and booted the iso.
    - Waited until the media check and pressed ESC
    - The installation continued

Based on above, moving the bug to ON_QA

Comment 49 cshao 2016-10-10 05:34:58 UTC
Test version:
rhev-hypervisor7-7.3-20161007.0
ovirt-node-3.6.1-31.0.el7ev.noarch
plymouth-0.8.9-0.26.20140113.el7.x86_64

Test steps:
1. Boot from the ISO.
2. Wait for the media check and press ESC.

Test result:
The installation continued.
So the bug is fixed; changing the bug status to VERIFIED.

Comment 50 Sandro Bonazzola 2017-01-20 07:09:03 UTC
As per comment #17 this bug shouldn't have been reopened. A new bug should have been opened to address the issues.

I'm now cloning this bug to a new one and moving this back to closed errata.
I'll move the cloned bug to verified, since this bug is currently in the verified state.

Comment 51 Sandro Bonazzola 2017-01-20 07:17:15 UTC
Please move further comments to bug #1415068.

