Bug 1346872

Summary: Hide boot entry of original root filesystem created by anaconda
Product: [oVirt] ovirt-node
Reporter: Fabian Deutsch <fdeutsch>
Component: Installation & Update
Assignee: Ryan Barry <rbarry>
Status: CLOSED CURRENTRELEASE
QA Contact: Huijuan Zhao <huzhao>
Severity: high
Docs Contact:
Priority: high
Version: 4.0
CC: bugs, cshao, dfediuck, huzhao, leiwang, mgoldboi, rbarry, tlitovsk, weiwang, yaniwang, ycui
Target Milestone: ovirt-4.0.5
Keywords: Reopened
Target Release: 4.0
Flags: dfediuck: ovirt-4.0.z+
huzhao: testing_plan_complete+
mgoldboi: planning_ack+
fdeutsch: devel_ack+
ycui: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: imgbased-0.8.5-0.1.el7ev
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-01-18 07:37:17 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Node
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Fabian Deutsch 2016-06-15 13:47:32 UTC
Description of problem:
Currently, after installation, 3 entries are shown:

ovirt-node-ng-x.y.z
CentOS …
tboot …

The "CentOS" entry points to the original LV created by anaconda. This entry should be removed, because that LV should not be accessed anymore.
At best we can just hide the entry, to be able to show it again if needed.
But if this does not work we can probably just delete it.

The tboot entry should be kept for trusted boots.

Thus this bug is really just about removing the initial boot entry from anaconda.

Comment 1 Huijuan Zhao 2016-07-19 11:06:30 UTC
Test version:
redhat-virtualization-host-4.0-20160713.0.iso
imgbased-0.7.2-0.1.el7ev.noarch

Test steps:
1. Install redhat-virtualization-host-4.0-20160713.0.iso
2. Reboot and focus on boot entry

Test results:
After step 2, 2 entries are shown:
rhvh-4.0-0.20160713.0
tboot 1.8.1

So this issue is fixed in redhat-virtualization-host-4.0-20160713.0.iso. I will change the status to VERIFIED.

Comment 2 Fabian Deutsch 2016-07-21 15:04:55 UTC
We should move this out to 4.1 because the current patches are quite risky.

In the worst case, the user can be left without any boot entries.

After the two new patches got merged, this bug should be moved to 4.1.

Comment 3 Red Hat Bugzilla Rules Engine 2016-07-21 15:05:01 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 4 Ryan Barry 2016-07-21 15:53:37 UTC
(In reply to Fabian Deutsch from comment #2)
> We should move this out to 4.1 because the current patches are quite risky.
> 
> In the worst case, the user can be left without any boot entries.

In order for the user to be left without any boot entries, no exceptions must be thrown during the update/initialization process, since removing other entries is the last step taken.

For the user to end up without any boot entries, the bootloader abstraction must be working "enough" to add a new entry (and parse entries), but broken enough that parsing the other entries fails.

There's some risk in removing bootloader entries at any time (or modifying bootloader entries), but this chain of events is unlikely.

We could also add an additional safeguard to remove_other_entries so that it only removes them as long as there are valid entries.
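
A minimal sketch of such a safeguard, for illustration only. This is not the imgbased code: the signature, the predicates is_ours and is_protected, and the entry titles are all hypothetical. The idea is that pruning is skipped entirely unless at least one valid entry would remain.

```python
from typing import Callable, List


def remove_other_entries(
    entries: List[str],
    is_ours: Callable[[str], bool],
    is_protected: Callable[[str], bool],
) -> List[str]:
    """Return the boot entry titles to keep (hypothetical helper).

    is_ours: True for entries managed by the image-based layout.
    is_protected: True for entries that must stay, such as tboot.
    """
    ours = [e for e in entries if is_ours(e)]
    if not ours:
        # Safeguard: no valid entry found, so do not remove anything.
        return list(entries)
    # At least one valid entry exists, so foreign entries can go.
    return [e for e in entries if is_ours(e) or is_protected(e)]
```

With this shape, a failure to recognise our own entries leaves the boot menu untouched instead of empty.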

Comment 5 Fabian Deutsch 2016-07-21 18:29:45 UTC
We need to be careful.
If we mess up the boot entries, which are in fact the root of the whole boot chain, then the user cannot boot anymore, and there is a high chance that the user needs to reinstall or has a lot of manual work.

The issue here, a superfluous entry, is just cosmetic and thus does not justify taking the risk of leaving the machine in an unbootable state.

Comment 6 Ryan Barry 2016-07-21 18:37:23 UTC
Yes, we need to be careful. 

That this is already verified and would be caught before a smoke test was even sent is true, but not relevant.

It's more that we face the same risks whether we remove bootloader entries now, in 4.1, or later. Without a better system-level abstraction than grubby, we must ask whether this is an acceptable risk to take at any point, and balance that against the "cosmetic" possibility of users booting into a non-imgbased installation accidentally.

Comment 7 Fabian Deutsch 2016-07-22 09:33:29 UTC
The verification was only done in a positive flow.

It should not be too difficult to find a flow where the problem is triggered.

To mitigate the risk of losing all boot entries and not being able to react to it, we could move the removal of the entries to a post-install part, i.e. as a service.

1. Service starts, removes entries
2. Login prompt is shown with a degraded status, because the check detected that all entries got removed
3. The user can still log in and fix it.

In addition, we can keep a backup of the original grub.cfg.
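
A rough sketch of the backup and of the check from step 2, for illustration only. The helper names, the ".orig" suffix and the "ok"/"degraded" strings are assumptions, not what the service actually does, and the regex only recognises simple quoted menuentry titles.

```python
import re
import shutil
from pathlib import Path
from typing import List

# Matches lines such as: menuentry 'tboot 1.8.1' {
MENUENTRY = re.compile(r"""^\s*menuentry\s+(['"])(.*?)\1""", re.MULTILINE)


def list_entries(cfg_text: str) -> List[str]:
    """Return the titles of all menuentry stanzas in a grub.cfg text."""
    return [m.group(2) for m in MENUENTRY.finditer(cfg_text)]


def backup_config(cfg: Path) -> Path:
    """Copy grub.cfg next to itself once, before any entry is removed.

    The '.orig' suffix is a hypothetical choice. An existing backup is
    never overwritten, so it always holds the original configuration.
    """
    backup = cfg.with_name(cfg.name + ".orig")
    if not backup.exists():
        shutil.copy2(cfg, backup)
    return backup


def boot_status(cfg_text: str) -> str:
    """Report 'degraded' when no boot entry is left, otherwise 'ok'."""
    return "ok" if list_entries(cfg_text) else "degraded"
```

The backup is taken before the removal and the status check runs after it, so a user who logs in to a degraded host still has the original file to restore.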

But yes, we will always have the risk of making the machine unbootable, but there are measures we can take to reduce the risk.

Let's move the solution discussion to gerrit.

Comment 8 Ying Cui 2016-08-08 10:07:51 UTC
Could that be possible in RHVH 4.0 GA? The grub info is confusing to end users.

Comment 9 Ryan Barry 2016-09-17 00:23:45 UTC
This needs an additional change to redhat-release-virtualization-host to enable the service.

For verification, please install, "systemctl enable imgbased-clean-grub.service", then reboot (this will happen automatically in the next build)

Comment 11 Ying Cui 2016-09-20 00:41:35 UTC
This bug should be for 4.1. Move it back to MODIFIED status; after cloning, we can move the cloned bug to ON_QA on z-stream. Thanks.

Comment 12 Fabian Deutsch 2016-09-21 14:23:52 UTC
Considering the bug workflow for upstream bugs, this bug is now for z-stream.

Comment 13 Huijuan Zhao 2016-09-22 03:12:44 UTC
Test version:
RHVH-4.0-20160919.1-RHVH-x86_64-dvd1.iso
imgbased-0.8.5-0.1.el7ev.noarch

Test steps:
1. Install redhat-virtualization-host-4.0-20160919.1.iso
2. Reboot and focus on boot entry

Test results:
After step 2, 2 entries are shown:

rhvh-4.0-0.20160919.0
tboot 1.8.1


Ryan, are the two boot entries in grub right? And is this the final solution? If yes, I will change the status to VERIFIED.
But what about "tboot 1.8.1"?

Comment 14 Ryan Barry 2016-09-22 04:17:59 UTC
I'm in favor of leaving this for trusted boot. See comment #1

Comment 15 Huijuan Zhao 2016-09-23 09:38:43 UTC
(In reply to Huijuan Zhao from comment #13)
> Test version:
> RHVH-4.0-20160919.1-RHVH-x86_64-dvd1.iso
> imgbased-0.8.5-0.1.el7ev.noarch
> 
> Test steps:
> 1. Install redhat-virtualization-host-4.0-20160919.1.iso
> 2. Reboot and focus on boot entry
> 
> Test results:
> After step2, there are 2 entries are shown:
> 
> rhvh-4.0-0.20160919.0
> tboot 1.8.1
> 

Update:

Test steps:
1. Install redhat-virtualization-host-4.0-20160919.1.iso
2. Reboot and focus on boot entry
3. Reboot again and focus on boot entry

Test results:
1. After step 2, 3 boot entries are still shown:
----------
rhvh-4.0-0.20160919.0
Red Hat Enterprise Linux (3.10.0-327.36.1.el7.x86_64) 7.2
tboot 1.8.1
----------

2. After step 3, 2 boot entries are shown:
-------------
rhvh-4.0-0.20160919.0
tboot 1.8.1
-------------

So there are only 2 boot entries once we have rebooted RHVH at least 2 times after install. Is this correct?

Comment 16 Ryan Barry 2016-09-23 13:48:52 UTC
Yes, this is correct.

See comment #7 -- the service triggers after the image is booted for the first time, so a second reboot is necessary.
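
The timing can be illustrated with a toy simulation (an illustration only, not how the service is implemented; boot_menus and is_foreign are hypothetical names). The menu for each boot is displayed before the system starts, and the cleanup only runs once the image is up, so its effect is first visible at the following boot.

```python
from typing import Callable, List


def boot_menus(
    initial_entries: List[str],
    is_foreign: Callable[[str], bool],
    boots: int,
) -> List[List[str]]:
    """Return the entries visible in the boot menu at each boot."""
    entries = list(initial_entries)
    seen = []
    for _ in range(boots):
        # The menu is shown before the system (and the service) starts.
        seen.append(list(entries))
        # After the image has booted, the cleanup removes foreign entries.
        entries = [e for e in entries if not is_foreign(e)]
    return seen


menus = boot_menus(
    [
        "rhvh-4.0-0.20160919.0",
        "Red Hat Enterprise Linux (3.10.0-327.36.1.el7.x86_64) 7.2",
        "tboot 1.8.1",
    ],
    lambda title: title.startswith("Red Hat Enterprise Linux"),
    boots=2,
)
```

Here menus[0] holds all three entries and menus[1] holds only the rhvh and tboot entries, matching the results in comment 15.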

Comment 17 Huijuan Zhao 2016-09-26 02:11:29 UTC
According to comment 15 and comment 16, I will change the status to VERIFIED.

Comment 18 Sandro Bonazzola 2016-09-28 13:06:48 UTC
Moving back to POST since a few patches are still to be merged.

Comment 19 Red Hat Bugzilla Rules Engine 2016-09-28 13:06:57 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 20 Sandro Bonazzola 2016-09-28 13:07:35 UTC
Moving to 4.0.5 since 4.0.4 has already been released and didn't contain the latest patches.

Comment 21 Red Hat Bugzilla Rules Engine 2016-09-28 13:07:42 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 22 Huijuan Zhao 2016-10-10 03:18:28 UTC
Test version:
redhat-virtualization-host-4.0-20161007.0
imgbased-0.8.5-0.1.el7ev.noarch
redhat-virtualization-host-image-update-placeholder-4.0-5.0.el7.noarch
kernel-3.10.0-512.el7.x86_64

Test steps:
1. Install redhat-virtualization-host-4.0-20161007
2. Reboot and focus on boot entry
3. Reboot again and focus on boot entry

Test results:
2. After step 3, 2 boot entries are shown:
-------------
rhvh-4.0-0.20161007.0
tboot 1.9.4
-------------

So this bug is fixed in redhat-virtualization-host-4.0-20161007.0. Changing the status to VERIFIED.