Bug 1687348

Summary: Guest can't boot successfully when there are two bootable disks in the guest
Product: Red Hat Enterprise Linux 8
Version: 8.0
Component: dracut
Reporter: Meina Li <meili>
Assignee: Lukáš Nykrýn <lnykryn>
QA Contact: qe-baseos-daemons
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
CC: dracut-maint-list, dtardon, dyuan, hhan, lmen, lnykryn, xuzhang, yafu
Target Milestone: rc
Target Release: 8.0
Hardware: Unspecified
OS: Unspecified
Bug Blocks: 1707693
Last Closed: 2021-02-01 07:39:18 UTC
Type: Bug
Attachments: boot.log

Description Meina Li 2019-03-11 10:46:31 UTC
Created attachment 1542826 [details]
boot.log

Description of problem:
Guest can't boot successfully when there are two bootable disks in the guest.

Version-Release number of selected component (if applicable):
libvirt-5.0.0-5.virtcov.el8.x86_64
qemu-kvm-3.1.0-18.module+el8+2834+fa8bb6e2.x86_64
kernel-4.18.0-74.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Prepare a guest with two bootable disks.
# virsh edit guest
...
 <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/lmn.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/var/lib/libvirt/images/rhel76.qcow2'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
    </disk>
...

2. Start guest and check the status.
# virsh start guest
Domain guest started
# virsh list --all
 Id    Name   State
-----------------------
 256   guest    running

3. Check the guest console with virt-manager; the guest hangs during boot.
...
[    4.373732] ata2: SATA link down (SStatus 0 SControl 300)
[    4.373982] ata1: SATA link down (SStatus 0 SControl 300)
[    4.381651] ata4: SATA link down (SStatus 0 SControl 300)
[    4.381896] ata3: SATA link down (SStatus 0 SControl 300)
[    4.397653] ata6: SATA link down (SStatus 0 SControl 300)
[    4.397898] ata5: SATA link down (SStatus 0 SControl 300)
...

Actual results:
The guest fails to boot when it has two bootable disks.

Expected results:
The guest should boot successfully.

Additional info:
1) Please refer to the boot.log in the attachment.
2) The UUIDs of the VGs and LVs are different; the fstab files of the two images are:
For lmn.qcow2:
/dev/mapper/rhel-root   /                       xfs     defaults        0 0
UUID=d6423e79-98a9-4ff9-b5fc-a6ca2356f9b5 /boot                   xfs     defaults        0 0
/dev/mapper/rhel-swap   swap                    swap    defaults        0 0

For rhel76.qcow2:
/dev/mapper/rhel-root   /                       xfs     defaults        0 0
UUID=bd8118d2-aa31-4045-bcf5-fec6c7792e2a /boot                   xfs     defaults        0 0
/dev/mapper/rhel-swap   swap                    swap    defaults        0 0
3) The problem also reproduces when both the rhel-root and rhel-swap fstab entries are changed to UUID= paths before starting the guest.
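A minimal sketch of why the two images clash at the fstab level (the fstab contents below are copied from the report, with the UUID=/boot lines omitted since they differ): both images reference identical /dev/mapper paths, so the mapper name alone cannot say which disk's root should be mounted.

```shell
# Simulated fstab device columns from the two images in this report.
fstab_a='/dev/mapper/rhel-root / xfs defaults 0 0
/dev/mapper/rhel-swap swap swap defaults 0 0'
fstab_b='/dev/mapper/rhel-root / xfs defaults 0 0
/dev/mapper/rhel-swap swap swap defaults 0 0'
# Collect the device column of both files and keep entries present in both.
shared=$(printf '%s\n%s\n' "$fstab_a" "$fstab_b" | awk '{print $1}' | sort | uniq -d)
echo "$shared"
```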

Comment 2 David Tardon 2020-08-05 13:03:54 UTC
(In reply to Meina Li from comment #0)
> 2) The UUID of vg, lv are different

That doesn't matter.

>  and the following are fstab file for two
> images:
> For lmn.qcow2:
> /dev/mapper/rhel-root   /                       xfs     defaults        0 0
> UUID=d6423e79-98a9-4ff9-b5fc-a6ca2356f9b5 /boot                   xfs    
> defaults        0 0
> /dev/mapper/rhel-swap   swap                    swap    defaults        0 0
> 
> For rhel76.qcow2:
> /dev/mapper/rhel-root   /                       xfs     defaults        0 0
> UUID=bd8118d2-aa31-4045-bcf5-fec6c7792e2a /boot                   xfs    
> defaults        0 0
> /dev/mapper/rhel-swap   swap                    swap    defaults        0 0
> 3) Change both the path of rhel-root and rhel-swap to UUID and start guest,
> it can also reproduce.

What matters is that the VG and LV names are the same. Therefore when lvm scans the disks, it sees duplicate VGs and refuses to activate anything:

Aug 05 14:04:29 localhost.localdomain dracut-initqueue[381]: Scanning devices vda2 vdb2  for LVM logical volumes rhel/root rhel/swap
Aug 05 14:04:29 localhost.localdomain dracut-initqueue[381]: inactive '/dev/rhel/swap' [2.00 GiB] inherit
Aug 05 14:04:29 localhost.localdomain dracut-initqueue[381]: inactive '/dev/rhel/root' [<17.00 GiB] inherit
Aug 05 14:04:29 localhost.localdomain dracut-initqueue[381]: inactive '/dev/rhel/swap' [2.00 GiB] inherit
Aug 05 14:04:29 localhost.localdomain dracut-initqueue[381]: inactive '/dev/rhel/root' [<17.00 GiB] inherit
Aug 05 14:04:29 localhost.localdomain dracut-initqueue[381]: Multiple VGs found with the same name: skipping rhel
Aug 05 14:04:29 localhost.localdomain dracut-initqueue[381]: Use --select vg_uuid=<uuid> in place of the VG name.
Aug 05 14:04:29 localhost.localdomain dracut-initqueue[381]: Multiple VGs found with the same name: skipping rhel
Aug 05 14:04:29 localhost.localdomain dracut-initqueue[381]: Use --select vg_uuid=<uuid> in place of the VG name.
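The clash described above can be sketched in plain shell; the VG name list below is simulated (on a real system it would come from `lvm vgs --noheadings -o vg_name`), and any name that appears more than once is exactly what lvm reports as "Multiple VGs found with the same name":

```shell
# Simulated `lvm vgs --noheadings -o vg_name` output for the two disks:
# both physical volumes carry a VG called "rhel".
vg_names='rhel
rhel'
# Any name appearing more than once is ambiguous, so lvm skips it.
dup=$(printf '%s\n' "$vg_names" | sort | uniq -d)
echo "duplicate VG names: $dup"
```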

Meanwhile, dracut is waiting until /sysroot is mounted. That is handled via a .mount unit, which in turn makes systemd wait for /dev/mapper/rhel-root, which never appears because the LV behind it has not been activated. Finally, the timeout runs out and systemd gives up:

Aug 05 14:04:53 localhost.localdomain systemd[1]: dev-mapper-rhel\x2droot.device: Job dev-mapper-rhel\x2droot.device/start timed out.
Aug 05 14:04:53 localhost.localdomain systemd[1]: Timed out waiting for device dev-mapper-rhel\x2droot.device.

After which dracut gives up.
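The odd-looking unit name in the log is just the systemd-escaped form of the device path. A rough sketch of that escaping (on a real system `systemd-escape --path` does this, with more rules than shown here): '-' in the path becomes \x2d and '/' separators become '-', then the .device suffix is appended.

```shell
# Reproduce the unit name from the log by hand-escaping the device path.
path='dev/mapper/rhel-root'
unit="$(printf '%s' "$path" | sed -e 's/-/\\x2d/g' -e 's,/,-,g').device"
echo "$unit"
```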

I guess we could add a new option rd.lvm.vg.uuid= and pass --select vg_uuid=... to lvm lvchange if it's set...
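The proposal above could look roughly like this. Note that the option name rd.lvm.vg.uuid= and the UUID value are hypothetical (no such option exists in dracut), and the lvchange command is only printed, not executed:

```shell
# Hypothetical kernel command line carrying the proposed option.
cmdline='ro rd.lvm.lv=rhel/root rd.lvm.vg.uuid=Zn0dEa-example-uuid quiet'
# Parse rd.lvm.vg.uuid= the way dracut hooks parse other rd.* options.
for arg in $cmdline; do
    case "$arg" in
        rd.lvm.vg.uuid=*) vg_uuid="${arg#rd.lvm.vg.uuid=}" ;;
    esac
done
# With the UUID known, the VG could be selected unambiguously by UUID
# instead of by its (duplicated) name.
activate="lvm lvchange -ay --select vg_uuid=$vg_uuid"
echo "$activate"
```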

Comment 5 RHEL Program Management 2021-02-01 07:39:18 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 6 Meina Li 2021-03-17 06:05:00 UTC
Hi,

This bug still exists on:
libvirt-7.0.0-8.module+el8.4.0+10233+8b7fd9eb.x86_64
qemu-kvm-5.2.0-10.module+el8.4.0+10217+cbdd2152.x86_64
kernel-4.18.0-293.el8.x86_64

I think this problem may be worth solving.
Do you think we should reopen this bug?

Thanks.

Comment 7 yafu 2021-09-22 07:36:37 UTC
Cannot reproduce the issue with:
kernel-4.18.0-339.el8.x86_64
libvirt-7.6.0-3.module+el8.5.0+12510+80564ecf.x86_64
qemu-kvm-6.0.0-30.module+el8.5.0+12586+476da3e1.x86_64

Comment 8 Red Hat Bugzilla 2023-09-15 00:16:09 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days