888635 – don't boot from un-selected devices

Bug 888635 - don't boot from un-selected devices

Keywords:
Status:	CLOSED CANTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	libvirt
Sub Component:
Version:	6.4
Hardware:	Unspecified
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Laine Stump
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:	888633 903204 1039446
Blocks:

Reported:	2012-12-19 04:25 UTC by Dave Allan
Modified:	2018-11-12 07:47 UTC
CC List:	18 users
Fixed In Version:
Doc Type:	Known Issue
Doc Text:	The "seabios" BIOS code (used by QEMU as the initial boot code for every virtual machine) can be configured to prioritize the order in which various boot targets (hard disk 1, CD, PXE, etc) are checked for valid bootstrap code to load an operating system, and libvirt supports setting these priorities.

Seabios' boot selector works like this: it starts with the highest priority boot target, looking for valid bootstrap code. If seabios is unable to get valid bootstrap code from this target, it moves on to the next target in the list (for example, if booting from CD fails, an attempt will be made to boot from hard disk if that is next in the list), and so on all the way down the list. If all targets fail, the bios will pause for 60 seconds, then restart the virtual machine, causing a retry of this process, again starting from the highest priority target.

However, if any target does have "valid" bootstrap code (determined by computing a checksum of the returned data block, and meaning that it contains bootstrap code which will attempt to load a full operating system), that code will be executed and the bios will consider its job done (and so no restart/retry will be attempted), even if that valid boot sector is actually unable to boot an OS. For example, most hard disks that have a partition table but don't yet have an operating system installed will still have a valid boot sector, and when that boot sector is executed, it will display a message something like "BOOT DISK FAILURE, PRESS ANY KEY", and wait there "forever" until a key is pressed, after which the bootstrap code that was loaded from the hard disk will again attempt to boot from the same hard disk ad infinitum until the guest is manually restarted.

Unfortunately, seabios will attempt to retrieve a boot sector from every potential boot target, not just those that have been assigned a priority, so there is no way to completely disable any specific boot target (i.e. remove it from the list); it can only be moved lower in the priority list. This can be problematic if, for example, PXE (network) boot is set as the highest priority on a system that also contains a non-bootable hard disk; if the PXE boot code is unable to contact the PXE boot server before it times out (the timeout is quite short, and unfortunately not configurable), the seabios boot selector will move on to the hard disk (even if it hasn't been given any boot priority by libvirt at virtual machine startup time), and execute its boot sector, which results in the virtual machine being stuck at a "press any key" input prompt, thus requiring direct user intervention to boot the virtual machine.

Workaround: The only known way to work around this problem currently is to create a small disk image with a valid boot sector that simply reboots the virtual machine, and mark that disk with a boot priority higher than the priority of the existing hard disk (but lower than the priority of the desired boot target). Here is a short shell command that will create the appropriate disk image:

    echo -e "00: b0 fe e6 64\n1fc: 00 00 55 aa\n" \
         "7fffc: 00 00 00 00" \
         | xxd -r -g 1 -c 4 \
         >/var/lib/libvirt/images/reboot.img

(the xxd command is part of the vim-common package).

And here is the XML required to add that device to a guest:

    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/reboot.img'/>
      <target dev='hdb' bus='ide'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
      <boot order='2'/>
    </disk>

Assuming you've saved the above XML to the file reboot.xml, you can add this disk to domain "X" with the following command:

    virsh attach-device X reboot.xml --config

(You can change the controller/bus/target/unit and/or target dev to any available values you like, as long as <boot order='x'/> places this device after PXE boot, and prior to any unwanted device in the priority list.)

Note the "<boot order='2'/>": this assumes that you will add "<boot order='1'/>" to the guest's <interface> definition to ensure that PXE boot is attempted before booting from this disk. (Also note that specifying <boot order='x'/> for individual devices is incompatible with the older method of specifying multiple "<boot dev='hd|net|etc'/>" elements inside the <os> element of the guest configuration; you will need to remove any such lines from your guest's configuration.)
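For hosts where xxd is not available, the image from the Doc Text above could probably also be produced with a few lines of Python. This is a minimal sketch, not part of the documented workaround: the function names are illustrative, the 512 KiB size is inferred from the last xxd record ending at offset 0x7ffff, and the reading of the first four bytes as x86 instructions that write 0xfe to I/O port 0x64 (a CPU reset request via the keyboard controller) is an interpretation that the bug text does not state.

```python
IMAGE_SIZE = 0x80000  # assumption: the xxd input's last record ends at 0x7ffff


def build_reboot_image() -> bytes:
    """Return the bytes of a reboot-only disk image like the one described above."""
    image = bytearray(IMAGE_SIZE)  # zero-filled, matching the gaps xxd -r leaves
    # Offset 0x00: b0 fe e6 64 (read here as: mov al, 0xfe ; out 0x64, al)
    image[0x000:0x004] = bytes.fromhex("b0fee664")
    # Offset 0x1fc: 00 00 55 aa, which puts the 55 aa boot signature at 0x1fe
    image[0x1FC:0x200] = bytes.fromhex("000055aa")
    return bytes(image)


def write_reboot_image(path: str) -> None:
    """Write the image to `path` (for example, a reboot.img under the libvirt images directory)."""
    with open(path, "wb") as f:
        f.write(build_reboot_image())
```

Calling write_reboot_image("/var/lib/libvirt/images/reboot.img") should give a file usable with the disk XML above, assuming the byte layout matches what the xxd command writes.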
Clone Of:	888633
Clones:	1037593
Environment:
Last Closed:	2013-01-09 02:08:07 UTC
Target Upstream Version:
Embargoed:

Description Dave Allan 2012-12-19 04:25:23 UTC

+++ This bug was initially created as a clone of Bug #888633 +++

Description of problem:

As I said in https://bugzilla.redhat.com/show_bug.cgi?id=831273#c10

After a network boot fails, VMs try to boot from other devices even if they are not selected. This problem also exists in upstream seabios.

Currently we adjust the priority according to the boot_devices parameter in seabios; the default priority is 9999. We can resolve this issue by ignoring a boot device if its priority is 9999.
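The proposal above can be illustrated with a small simulation. This is a hedged sketch of the idea only, not seabios code: the BootTarget class, the boot_attempts function, and the skip_unselected flag are hypothetical names, and the only values taken from this report are the default priority of 9999 and the rule that a target with a valid boot sector ends the search.

```python
from dataclasses import dataclass
from typing import List

DEFAULT_PRIORITY = 9999  # per the report: priority of a device that was not selected


@dataclass
class BootTarget:
    name: str
    priority: int = DEFAULT_PRIORITY
    has_valid_boot_sector: bool = False


def boot_attempts(targets: List[BootTarget], skip_unselected: bool) -> List[str]:
    """Return the names of the targets that would be tried, in order.

    The search stops at the first target with a valid boot sector. With
    skip_unselected=True, targets left at the default priority are ignored,
    which is the change proposed in this report.
    """
    tried = []
    for target in sorted(targets, key=lambda t: t.priority):
        if skip_unselected and target.priority == DEFAULT_PRIORITY:
            continue
        tried.append(target.name)
        if target.has_valid_boot_sector:
            break
    return tried


# The reproduction case from this report: a selected but non-bootable disk
# and a network device that was never selected.
targets = [BootTarget("disk", priority=1), BootTarget("network")]
print(boot_attempts(targets, skip_unselected=False))  # ['disk', 'network']
print(boot_attempts(targets, skip_unselected=True))   # ['disk']
```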

Version-Release number of selected component (if applicable):
seabios-0.6.1.2-25.el6.x86_64

How reproducible:

Steps to Reproduce:
1. qemu-kvm -boot order=c rhel6.image.qcow2   (vm disk is non-bootable)
  
Actual results:
1. The VM tries to boot from disk and fails.
2. The VM also tries to boot from the un-selected network device.

Expected results:
The VM only tries to boot from disk.

Additional info:
related bug: Bug 821331 - [RFE] KVM guest retry pxe booting even after failure

Comment 1 Jiri Denemark 2012-12-20 09:22:43 UTC

I think we do all we can in libvirt and no further changes should be required. Libvirt explicitly sets bootindex (starting from 1) for devices that are configured as bootable in domain XML. All other devices have no bootindex associated with them.
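To illustrate the mapping described in this comment, here is a rough sketch that reads per-device <boot order='N'/> elements from a domain XML string and reports which devices would carry a boot index. It is not libvirt's implementation: the function name, the sample XML, and the choice to label each device by its <target dev='...'/> attribute (falling back to the element name) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical guest: PXE first, a second disk next, and one disk with no boot order.
SAMPLE_DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <interface type='network'>
      <boot order='1'/>
    </interface>
    <disk type='file' device='disk'>
      <target dev='hdb' bus='ide'/>
      <boot order='2'/>
    </disk>
    <disk type='file' device='disk'>
      <target dev='hda' bus='ide'/>
    </disk>
  </devices>
</domain>
"""


def bootindex_by_device(domain_xml: str) -> Dict[str, int]:
    """Map each device that has a <boot order='N'/> child to N.

    Devices without a <boot> element are left out, mirroring the statement
    that they get no bootindex at all.
    """
    result: Dict[str, int] = {}
    devices = ET.fromstring(domain_xml).find("devices")
    if devices is None:
        return result
    for dev in devices:
        boot = dev.find("boot")
        if boot is None or boot.get("order") is None:
            continue
        target = dev.find("target")
        label = target.get("dev") if target is not None and target.get("dev") else dev.tag
        result[label] = int(boot.get("order"))
    return result


print(bootindex_by_device(SAMPLE_DOMAIN_XML))  # {'interface': 1, 'hdb': 2}
```

In this sample the 'hda' disk is absent from the result. As the report explains, that absence does not stop seabios from trying the disk once the listed targets have failed.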

Comment 2 Gleb Natapov 2012-12-20 09:25:13 UTC

(In reply to comment #1)
> I think we do all we can in libvirt and no further changes should be
> required. Libvirt explicitly sets bootindex (starting from 1) for devices
> that are configured as bootable in domain XML. All other devices have no
> bootindex associated with them.

We will likely not change the current default behaviour. A new parameter to select this new behaviour will be needed.

Comment 4 Dave Allan 2013-01-09 02:08:07 UTC

Unfortunately, this behavior cannot be changed without work in seabios, but the existing behavior is now documented.

Comment 5 Laine Stump 2013-02-06 16:30:13 UTC

The current behavior and the suggested workaround have been documented in a knowledgebase article:

https://access.redhat.com/knowledge/node/306863

Comment 6 Laine Stump 2013-12-03 10:21:12 UTC

For future reference, newer qemu has added a "-boot strict" option which prevents this behavior, and upstream libvirt will now add that option to the qemu commandline any time it is present in the qemu binary:

commit 96fddee322c7d39a57cfdc5e7be71326d597d30a
Author: Laine Stump <laine>
Date:   Mon Dec 2 14:07:12 2013 +0200

    qemu: add "-boot strict" to commandline whenever possible

See Bug 888633 and Bug 903204 for details of the qemu fix.

Comment 7 Laine Stump 2013-12-03 12:37:34 UTC

The above commit also requires this commit to avoid build failures:

commit 9f6f2fa467e57f41e60835e7e94fd2487c9b0e5e
Author: Laine Stump <laine>
Date:   Tue Dec 3 12:58:50 2013 +0200

    tests: add forgotten boot-strict test files
