Bug 1474730 - Can't boot from a migrated/cloned boot disk
Summary: Can't boot from a migrated/cloned boot disk
Keywords:
Status: CLOSED DUPLICATE of bug 1020622
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: virt-manager
Version: 7.3
Hardware: x86_64
OS: Linux
Priority: medium
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Pavel Hrdina
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-07-25 09:33 UTC by Nikola
Modified: 2017-09-05 13:45 UTC
CC List: 15 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-09-05 13:45:06 UTC
Target Upstream Version:
Embargoed:



Description Nikola 2017-07-25 09:33:44 UTC
Description of problem:

When migrating /boot from sda1 to sdb1, the KVM guest fails to boot with "No bootable device found" when directed to boot from sdb1.
When the same procedure is performed using IDE disks, the system boots successfully.

Workaround: changing the disk bus from SCSI to IDE (or VirtIO) allows the system to boot. Curiously, changing it back from IDE to SCSI afterwards still leaves the system bootable.
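
For reference, a CLI equivalent of the VMM bus change (a sketch only, not part of the original report; GUEST is a placeholder for the domain name, and virt-xml ships in the virt-install package):

# virt-xml GUEST --edit target=sdb --disk bus=ide
and, once the guest boots, back again:
# virt-xml GUEST --edit target=sdb --disk bus=scsi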


Version-Release number of selected component (if applicable):

RHEL 7.3, 3.10.0-514.21.2.el7.x86_64
qemu-kvm-rhev-2.6.0-28.el7_3.9.x86_64
libvirt-2.0.0-10.el7_3.9.x86_64
virt-manager-1.4.0-2.el7.noarch

Disk layout:
sda             8:0    0     1G  0 disk
└─sda1          8:1    0     1G  0 part /boot
sdb             8:16   0     1G  0 disk 
sdc             8:32   0     8G  0 disk 
└─sdc1          8:33   0     7G  0 part
  ├─rhel-root 253:0    0     6G  0 lvm  /
  └─rhel-swap 253:1    0     1G  0 lvm  [SWAP]

sda - original (old) boot device
sdb - new device to inherit the old boot device function
sdc - root LVM


Steps to Reproduce:
1. Create a boot partition on the new disk:
# sfdisk -d /dev/sda | grep -E 'sectors|bootable' | sfdisk --force /dev/sdb
2. Clone the /boot partition:
# dd if=/dev/sda1 of=/dev/sdb1 bs=512 conv=noerror,sync
3. Install the bootloader on the new disk (done in a rescue environment):
# grub2-install /dev/sdb
4. Detach the sda device and reboot
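
Optional sanity checks before detaching sda (a suggestion, not part of the original report; standard file/util-linux tools):

# file -s /dev/sdb
(should identify a DOS/MBR boot sector once grub2-install has run)
# lsblk -f /dev/sdb1
(the dd clone carries the same filesystem UUID and label as sda1)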

Actual results:
System does not boot (no bootable device found), unless the workaround is applied (changing the disk bus to IDE or VirtIO via VMM).

Expected results:
System boots successfully, ultimately allowing the permanent removal of the old boot device, with the new one taking its place.

Additional info:
Reproduced on RHEL 7 and Fedora 25, both times using Virtual Machine Manager (VMM).

Comment 2 Ademar Reis 2017-07-26 13:21:50 UTC
CongLi, can you please validate the testing by reproducing it in our QE environment? Testing this with the latest RHEL-7.4 packages should be enough.

Comment 3 CongLi 2017-07-31 13:00:57 UTC
Reproduced this bug on the following version:
kernel-3.10.0-693.el7.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.3.x86_64
libvirt-3.2.0-14.el7.x86_64
virt-manager-1.4.1-7.el7.noarch

The steps are the same as in comment 0, via virt-manager.
1. virtio-scsi  --> cannot boot up
2. virtio-blk   --> boots up successfully
3. ide          --> boots up successfully
4. virtio-scsi (failed) -> ide (succeeded) -> virtio-scsi (succeeded)


Thanks.

Comment 4 Fam Zheng 2017-08-17 12:49:23 UTC
Seems a duplicate of bug 1020622? Cong, could you please retest with seabios-1.10.2-3.el7 or above?

Comment 5 Sam Yangsao 2017-08-17 19:53:37 UTC
Just verified this occurs in RHV 4.1 as well, and the same workaround addresses the issue; please test there too.

Let me know if you need packaging info and if I should file a separate bz.

Thanks much.

Comment 6 CongLi 2017-08-18 02:47:22 UTC
(In reply to Fam Zheng from comment #4)
> Seems a duplicate of bug 1020622? Cong, could you please retest with
> seabios-1.10.2-3.el7 or above?

I could still reproduce this bug with seabios-1.10.2-3.el7.x86_64.

Comment 7 CongLi 2017-08-18 02:51:31 UTC
(In reply to Sam Yangsao from comment #5)
> Just verified this occurs in RHV 4.1 as well, and the same workaround
> addresses the issue; please test there too.

Could you reproduce this problem with an IDE drive for the boot disk?

If yes, please provide your qemu, seabios, libvirt and virt-manager versions and I will give it a try.

Thanks.

> Let me know if you need packaging info and if I should file a separate bz.
> 
> Thanks much.

Comment 8 Fam Zheng 2017-08-18 10:50:23 UTC
Looks like a virt-manager/libvirt issue. After the steps above, the bootindex= property is set on the initial disks but not on the cloned one.

Setting "<boot order=... />" attributes to disks explicitly with "virsh edit" fixes it, as long as the seabios is new enough to handle booting from non-zero LUN, or alternatively set the new boot hd as LUN 0.

FYI, the reproducer I have yields this final QEMU command line:

...

-drive file=/stor/images/3.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-1,cache=unsafe \
-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi0-0-0-1,id=scsi0-0-0-1,bootindex=1 \
-drive file=/stor/images/2.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-2 \
-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=2,drive=drive-scsi0-0-0-2,id=scsi0-0-0-2
...

This corresponds to the following libvirt XML:

  <os>
    <boot dev='hd'/>
  </os>

  ...

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='unsafe'/>
      <source file='/stor/images/3.qcow2'/>
      <target dev='sdb' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/stor/images/2.qcow2'/>
      <target dev='sdc' bus='scsi'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>

As said above, the workaround is to remove the <boot dev='hd'/> line and add a <boot order='...'/> line to each <disk> node. For an old SeaBIOS it is also necessary to change the boot disk's "unit='...'" to "unit='0'".
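
For illustration (a sketch derived from the XML above, not part of the original comment), the first disk entry would become:

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='unsafe'/>
      <source file='/stor/images/3.qcow2'/>
      <target dev='sdb' bus='scsi'/>
      <boot order='1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

with the <boot dev='hd'/> line removed from the <os> block.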

Reassigning to virt-manager for further investigation.

Comment 10 Sam Yangsao 2017-08-21 14:14:32 UTC
(In reply to CongLi from comment #7)
> (In reply to Sam Yangsao from comment #5)
> > Just verified this occurs in RHV 4.1 as well, and the same workaround
> > addresses the issue; please test there too.
> 
> Could you reproduce this problem with an IDE drive for the boot disk?
> 
> If yes, please provide your qemu, seabios, libvirt and virt-manager
> versions and I will give it a try.
> 
> Thanks.
> 
> > Let me know if you need packaging info and if I should file a separate bz.
> > 
> > Thanks much.

The original first disk was virtio-scsi.  I added a second disk, also virtio-scsi, to test disk mirroring.  When I deactivated the first disk and attempted to boot off the second disk, it failed.  After changing the second disk to "ide", the OS booted up fine.

Here is the list of packages from my RHEL-H:

# rpm -qa |grep qemu
ipxe-roms-qemu-20160127-5.git6366fa7a.el7.noarch
qemu-kvm-rhev-2.6.0-28.el7_3.10.x86_64
libvirt-daemon-driver-qemu-2.0.0-10.el7_3.9.x86_64
qemu-kvm-common-rhev-2.6.0-28.el7_3.10.x86_64
qemu-kvm-tools-rhev-2.6.0-28.el7_3.10.x86_64
qemu-img-rhev-2.6.0-28.el7_3.10.x86_64

# rpm -qa |grep seabios
seabios-bin-1.9.1-5.el7_3.3.noarch

# rpm -qa |grep virt
fence-virt-0.3.2-5.el7.x86_64
libvirt-daemon-2.0.0-10.el7_3.9.x86_64
libvirt-daemon-config-nwfilter-2.0.0-10.el7_3.9.x86_64
libvirt-daemon-driver-nodedev-2.0.0-10.el7_3.9.x86_64
libvirt-daemon-driver-qemu-2.0.0-10.el7_3.9.x86_64
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
virt-what-1.13-8.el7.x86_64
ovirt-imageio-common-1.0.0-0.el7ev.noarch
ovirt-vmconsole-1.0.4-1.el7ev.noarch
libvirt-client-2.0.0-10.el7_3.9.x86_64
libvirt-daemon-driver-network-2.0.0-10.el7_3.9.x86_64
libvirt-python-2.0.0-2.el7.x86_64
libvirt-daemon-driver-secret-2.0.0-10.el7_3.9.x86_64
libvirt-daemon-driver-interface-2.0.0-10.el7_3.9.x86_64
libvirt-daemon-driver-storage-2.0.0-10.el7_3.9.x86_64
libvirt-daemon-kvm-2.0.0-10.el7_3.9.x86_64
virt-v2v-1.32.7-3.el7_3.2.x86_64
collectd-virt-5.7.1-4.el7.x86_64
libvirt-daemon-driver-nwfilter-2.0.0-10.el7_3.9.x86_64
libvirt-lock-sanlock-2.0.0-10.el7_3.9.x86_64
ovirt-imageio-daemon-1.0.0-0.el7ev.noarch

# uname -a
Linux XXXX 3.10.0-514.21.2.el7.x86_64 #1 SMP Sun May 28

Comment 11 Paolo Bonzini 2017-08-22 16:44:44 UTC
Sam, what is the QEMU command line when boot fails?

It may be that you are booting from a non-zero LUN, which was only fixed in 7.4 (bug 1020622).
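
(A quick way to check, as a side note and assuming shell access to the host, with GUEST as a placeholder: in the domain XML the unit= value of each disk's <address> element is the SCSI LUN, so anything other than unit='0' means a non-zero LUN.)

# virsh dumpxml GUEST | grep "address type='drive'"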

Comment 12 Sam Yangsao 2017-09-05 00:16:42 UTC
(In reply to Paolo Bonzini from comment #11)
> Sam, what is the QEMU command line when boot fails?
> 
> It may be that you are booting from non-zero LUNs, which was only fixed in
> 7.4 (bug 1020622).

Sorry, I didn't see this comment earlier. I'm not sure how you can get the QEMU command line when using RHV.
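
(One way to get it, assuming shell access to the RHV hypervisor and with GUESTNAME as a placeholder: libvirt records the full QEMU command line in the guest's log file.)

# less /var/log/libvirt/qemu/GUESTNAME.log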

Comment 13 Pavel Hrdina 2017-09-05 06:38:48 UTC
Hi, so the issue is in virt-manager.  When installing the guest, virt-manager uses the old libvirt syntax, which always marks only the first disk as bootable.  The fix would be to update the virt-manager and virt-install code to always use the new per-device syntax.

Old syntax:

  ...
  <os>
    ...
    <boot dev='hd'/>
    ...
  </os>

New syntax:

  ...
  <devices>
    ...
    <disk ...>
      ...
      <boot order='1'/>
      ...
    </disk>
    ...
  </devices>

In addition, when the old syntax is used and there are multiple disk devices, virt-manager incorrectly shows all disk devices as configured to be bootable.

There is a workaround: if you modify the boot order using virt-manager, it will use the new syntax, but only in that case.
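
(The per-device syntax can also be set from the command line; a sketch, assuming virt-xml from the virt-install package, with GUEST and the sdb target as placeholders. The boot_order property maps to <boot order='1'/>.)

# virt-xml GUEST --edit target=sdb --disk boot_order=1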

Comment 14 Pavel Hrdina 2017-09-05 13:45:06 UTC
Disregard my previous comment :) I've given it some more testing and checked the QEMU command line, and after installing seabios-1.10.2-3.el7 it works.  It doesn't matter whether the old or the new boot XML is used; the correct disk is always marked as bootable.  The issue is indeed the LUN being different from 0.  I'm closing this bug as a duplicate of BZ 1020622.

For the virt-manager issue of incorrectly showing all disk devices as bootable when the old XML syntax is used, I've created upstream BZ 1488480.

*** This bug has been marked as a duplicate of bug 1020622 ***

