Bug 1990808

Summary: Guest whose OS is installed across multiple disks but whose boot partition is on a single disk can't boot into the OS on RHEL 8
Product: Red Hat Enterprise Linux 9 Reporter: Vera <vwu>
Component: seabios Assignee: Virtualization Maintenance <virt-maint>
Status: CLOSED WONTFIX QA Contact: Xueqiang Wei <xuwei>
Severity: medium Docs Contact:
Priority: medium    
Version: 9.0 CC: ahadas, coli, jinzhao, juzhang, juzhou, kraxel, lmen, meili, mxie, mzhan, phrdina, rjones, tgolembi, tyan, tzheng, virt-maint, xiaodwan, xuwei, xuzhang, ymankad
Target Milestone: beta Keywords: Automation, Regression, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1924972 Environment:
Last Closed: 2022-08-16 09:34:15 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1924972    
Bug Blocks:    

Comment 1 Xueqiang Wei 2021-08-10 16:36:44 UTC
Following https://bugzilla.redhat.com/show_bug.cgi?id=1924972#c41, I retested on RHEL 9 and hit the same issue.

Versions:
kernel-5.14.0-0.rc4.35.el9.x86_64
qemu-kvm-6.0.0-10.el9
seabios-bin-1.14.0-5.el9.noarch


1. Create two images for the guest:
# qemu-img create -f qcow2 /home/kvm_autotest_root/images/rhel900-64-virtio-scsi-.qcow2 30G
# qemu-img create -f qcow2 /home/kvm_autotest_root/images/data1.qcow2 20G

2. Install a RHEL 9.0 guest and partition the disks manually during installation: the root and swap partitions are installed with two disks selected (sda and sdb), but the boot partition is installed on a single disk, for example by selecting only sdb for boot.

qemu command lines:
/usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -machine q35,memory-backend=mem-machine_mem \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 14336 \
    -object memory-backend-ram,size=14336M,id=mem-machine_mem  \
    -smp 12,maxcpus=12,cores=6,threads=1,dies=1,sockets=2  \
    -cpu 'Opteron_G5',+kvm_pv_unhalt \
    -chardev socket,server=on,wait=off,path=/tmp/avocado_1ot3f7dy/monitor-qmpmonitor1-20210809-101140-hem3YaW3,id=qmp_id_qmpmonitor1  \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,server=on,wait=off,path=/tmp/avocado_1ot3f7dy/monitor-catch_monitor-20210809-101140-hem3YaW3,id=qmp_id_catch_monitor  \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idnjDEZR \
    -chardev socket,server=on,wait=off,path=/tmp/avocado_1ot3f7dy/serial-serial0-20210809-101140-hem3YaW3,id=chardev_serial0 \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -chardev socket,id=seabioslog_id_20210809-101140-hem3YaW3,path=/tmp/avocado_1ot3f7dy/seabios-20210809-101140-hem3YaW3,server=on,wait=off \
    -device isa-debugcon,chardev=seabioslog_id_20210809-101140-hem3YaW3,iobase=0x402 \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio-scsi-.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -blockdev node-name=file_data,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/data1.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_data,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_data \
    -device scsi-hd,id=data,drive=drive_data,write-cache=on \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:18:70:50:40:66,id=iduEbo1J,netdev=idYATUsl,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idYATUsl \
    -blockdev node-name=file_cd1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/iso/linux/RHEL-9.0.0-20210803.6-x86_64-dvd1.iso,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_cd1,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_cd1 \
    -device scsi-cd,id=cd1,drive=drive_cd1,write-cache=on \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=d,strict=off  \
    -no-shutdown \
    -enable-kvm \
    -monitor stdio \


3. Finish the OS installation for the guest.

4. Start the guest without 'bootindex'.

5. Start the guest with 'bootindex':
add bootindex=1 to the device for image data1.qcow2, e.g. "-device scsi-hd,id=data,drive=drive_data,write-cache=on,bootindex=1 \"


After step 3, the guest installs successfully.
After step 4, the guest starts successfully.
After step 5, the guest cannot start.
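
For reference, the only difference between step 4 and step 5 is the bootindex property on the data disk's device. A minimal sketch of that difference follows; scsi_hd_arg is a hypothetical helper introduced here for illustration and is not part of the original test case.

```shell
# Hypothetical helper (not part of the original test case): build the
# "-device scsi-hd" argument, appending bootindex only when one is given.
# $1 = device id, $2 = blockdev node name, $3 = optional bootindex
scsi_hd_arg() {
    arg="scsi-hd,id=$1,drive=$2,write-cache=on"
    if [ -n "${3:-}" ]; then
        arg="$arg,bootindex=$3"
    fi
    printf '%s\n' "-device $arg"
}

# Step 4: the data disk has no bootindex.
scsi_hd_arg data drive_data
# Step 5: bootindex=1 is added to the data disk only.
scsi_hd_arg data drive_data 1
```

The second call prints the same argument string as the example in step 5, without the trailing line continuation.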

Comment 2 Xueqiang Wei 2021-08-11 04:34:23 UTC
Tested with edk2-ovmf-20210527gite1999b264f1f-5.el9.noarch and hit the same issue, so I don't think this is a seabios issue. Can we move it to qemu?


Versions:
kernel-5.14.0-0.rc4.35.el9.x86_64
qemu-kvm-6.0.0-10.el9
edk2-ovmf-20210527gite1999b264f1f-5.el9.noarch



1. Create two images for the guest:
# qemu-img create -f qcow2 /home/kvm_autotest_root/images/rhel900-64-virtio-scsi-.qcow2 30G
# qemu-img create -f qcow2 /home/kvm_autotest_root/images/data1.qcow2 20G

2. Install a RHEL 9.0 guest and partition the disks manually during installation: the root and swap partitions are installed with two disks selected (sda and sdb), but the /boot/efi partition is installed on a single disk, for example by selecting only sdb for boot.


# cp /usr/share/OVMF/OVMF_VARS.fd /home/kvm_autotest_root/images/avocado-vt-vm1_rhel900-64-virtio-scsi.qcow2_VARS.fd

qemu command lines:
/usr/libexec/qemu-kvm \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox on  \
    -blockdev node-name=file_ovmf_code,driver=file,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_ovmf_code,driver=raw,read-only=on,file=file_ovmf_code \
    -blockdev node-name=file_ovmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel900-64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_ovmf_vars,driver=raw,read-only=off,file=file_ovmf_vars \
    -machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 15360 \
    -object memory-backend-ram,size=15360M,id=mem-machine_mem  \
    -smp 12,maxcpus=12,cores=6,threads=1,dies=1,sockets=2  \
    -cpu 'Opteron_G5',+kvm_pv_unhalt \
    -chardev socket,id=qmp_id_qmpmonitor1,server=on,wait=off,path=/tmp/avocado_qrgu9qi7/monitor-qmpmonitor1-20210809-121558-wBNvFjXB  \
    -mon chardev=qmp_id_qmpmonitor1,mode=control \
    -chardev socket,id=qmp_id_catch_monitor,server=on,wait=off,path=/tmp/avocado_qrgu9qi7/monitor-catch_monitor-20210809-121558-wBNvFjXB  \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=idUo2RS7 \
    -chardev socket,id=chardev_serial0,server=on,wait=off,path=/tmp/avocado_qrgu9qi7/serial-serial0-20210809-121558-wBNvFjXB \
    -device isa-serial,id=serial0,chardev=chardev_serial0  \
    -chardev socket,id=seabioslog_id_20210809-121558-wBNvFjXB,path=/tmp/avocado_qrgu9qi7/seabios-20210809-121558-wBNvFjXB,server=on,wait=off \
    -device isa-debugcon,chardev=seabioslog_id_20210809-121558-wBNvFjXB,iobase=0x402 \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio-scsi-.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -blockdev node-name=file_data,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/data1.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_data,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_data \
    -device scsi-hd,id=data,drive=drive_data,write-cache=on \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:50:b5:dd:b4:3e,id=idf1ZeRk,netdev=idbwJAZM,bus=pcie-root-port-3,addr=0x0  \
    -netdev tap,id=idbwJAZM,vhost=on \
    -blockdev node-name=file_cd1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/iso/linux/RHEL-9.0.0-20210803.6-x86_64-dvd1.iso,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_cd1,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_cd1 \
    -device scsi-cd,id=cd1,drive=drive_cd1,write-cache=on \
    -vnc :0  \
    -rtc base=utc,clock=host,driftfix=slew  \
    -boot menu=off,order=cdn,once=d,strict=off  \
    -no-shutdown \
    -enable-kvm \
    -monitor stdio \


3. Finish the OS installation for the guest.

4. Start the guest without 'bootindex'.

5. Start the guest with 'bootindex':
add bootindex=1 to the device for image data1.qcow2, e.g. "-device scsi-hd,id=data,drive=drive_data,write-cache=on,bootindex=1 \"


After step 3, the guest installs successfully.
After step 4, the guest starts successfully.
After step 5, the guest cannot start.

Comment 3 Gerd Hoffmann 2021-08-11 08:08:19 UTC
(In reply to Xueqiang Wei from comment #2)
> Tested with edk2-ovmf-20210527gite1999b264f1f-5.el9.noarch, also hit this
> issue. So I think it's not seabios issue, can move it to qemu?

No, it's clearly firmware.

> 4. Start the guest without 'bootindex'

What does 'fdisk -l /dev/sda /dev/sdb' in the guest print?

> After step 4, guest can start successfully.
> After step 5, guest can not start.

When you go to the ovmf boot menu, any difference between step 4 and step 5?

Comment 4 Xueqiang Wei 2021-08-13 09:15:59 UTC
(In reply to Gerd Hoffmann from comment #3)
> (In reply to Xueqiang Wei from comment #2)
> > Tested with edk2-ovmf-20210527gite1999b264f1f-5.el9.noarch, also hit this
> > issue. So I think it's not seabios issue, can move it to qemu?
> 
> No, it's clearly firmware.
> 
> > 4. Start the guest without 'bootindex'
> 
> What does 'fdisk -l /dev/sda /dev/sdb' in the guest print?


I set up the partitions manually, following the attached screenshot "how-to-partition-disks.png" in bug 1924972. I have also attached a screenshot, "boot_efi_screenshot".

In the guest:
# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda      8:0    0   30G  0 disk 
├─sda1   8:1    0   25G  0 part /
└─sda2   8:2    0    2G  0 part [SWAP]
sdb      8:16   0   20G  0 disk 
└─sdb1   8:17   0    1G  0 part /boot/efi
sr0     11:0    1  7.5G  0 rom  /run/media/xuwei/RHEL-9-0-0-BaseOS-x86_64


# fdisk -l /dev/sda /dev/sdb
Disk /dev/sda: 30 GiB, 32212254720 bytes, 62914560 sectors
Disk model: QEMU HARDDISK   
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 6825D157-A510-4229-8C36-867969E6EE70

Device        Start      End  Sectors Size Type
/dev/sda1      2048 52430847 52428800  25G Linux filesystem
/dev/sda2  52430848 56625151  4194304   2G Linux swap


Disk /dev/sdb: 20 GiB, 21474836480 bytes, 41943040 sectors
Disk model: QEMU HARDDISK   
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 9FE7F0DA-C765-42EC-B60B-37DA94D8F7D4

Device     Start     End Sectors Size Type
/dev/sdb1   2048 2099199 2097152   1G EFI System
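
To check which disk carries the EFI System partition without reading the tables by eye, the fdisk output can be filtered. This is only a sketch: efi_partitions is a hypothetical helper, and the sample input is copied from the partition tables above.

```shell
# Hypothetical helper: read "fdisk -l" output on stdin and print the
# device name of every partition whose type column is "EFI System".
efi_partitions() {
    awk '/^\/dev\// && /EFI System *$/ { print $1 }'
}

# In the guest this would be (needs root):
#   fdisk -l /dev/sda /dev/sdb | efi_partitions
# Self-contained demonstration with the partition lines shown above:
efi_partitions <<'EOF'
/dev/sda1      2048 52430847 52428800  25G Linux filesystem
/dev/sda2  52430848 56625151  4194304   2G Linux swap
/dev/sdb1   2048 2099199 2097152   1G EFI System
EOF
```

For the tables above this prints /dev/sdb1, i.e. the EFI System partition is on the second disk only.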



> 
> > After step 4, guest can start successfully.
> > After step 5, guest can not start.
> 
> When you go to the ovmf boot menu, any difference between step 4 and step 5?

In the Boot Manager Menu, the order of the disks is different. Please refer to the attached screenshots. Thanks.


I also found:
If I select the first item "Red Hat Enterprise Linux", the guest starts successfully.
If I select the item "UEFI QEMU QEMU HARDDISK", the guest starts successfully.
If I select the item "UEFI QEMU QEMU HARDDISK 2", the guest cannot start.

Comment 11 Gerd Hoffmann 2021-08-16 05:44:20 UTC
> Device        Start      End  Sectors Size Type
> /dev/sda1      2048 52430847 52428800  25G Linux filesystem
> /dev/sda2  52430848 56625151  4194304   2G Linux swap

> Device     Start     End Sectors Size Type
> /dev/sdb1   2048 2099199 2097152   1G EFI System

Ok, so disk2 has the efi partition (as intended).

> In the Boot Manager Menu, I found the order of disks is different. Please
> refer to the attached screenshots. Thanks.

As expected.  Disks are sorted according to the boot order ...

> And I found, 
> If I select the first item "Red Hat Enterprise Linux", the guest starts
> successfully.
> If I select the item "UEFI QEMU QEMU HARDDISK", the guest starts
> successfully.
> If I select the item "UEFI QEMU QEMU HARDDISK 2", the guest can not start.

... so with "bootindex=1" for the data1 disk it goes first.

So, it all looks rather normal to me.  ovmf finds both disks and orders
them according to the bootindex.

The failure screenshot looks like shim+grub loaded just fine, but then grub
has trouble finding its config file.  That might be a result of the changed
disk ordering after install.

In general shuffling around disks after install is not guaranteed to work.
It does work in many cases because disks and partitions are referenced by
uuid these days.  But there always can be corner cases where things do not
work.

I'd suggest changing the testcase to use bootindex right from the start
(i.e. also for the install) rather than adding it later on.
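
The reordering can be illustrated with a simplified model. This is an assumption made for illustration, not firmware code: devices with a bootindex are tried first, lowest value first, and devices without one follow in the order they were listed. boot_order is a hypothetical helper, and "-" stands for "no bootindex".

```shell
# Simplified model (an assumption, not what seabios or ovmf actually run):
# input lines are "<device id> <bootindex or ->"; output is the device ids
# in the order this model would try them.
boot_order() {
    awk '{
        none = ($2 == "-") ? 1 : 0   # 1 when the device has no bootindex
        idx  = ($2 == "-") ? 0 : $2  # bootindex value, 0 as a filler
        print none, idx, NR, $1
    }' |
        sort -k1,1n -k2,2n -k3,3n |
        awk '{ print $4 }'
}

# Install-time layout: no bootindex anywhere.
printf '%s\n' 'image1 -' 'data -' | boot_order
# Step 5 layout: bootindex=1 on the data disk only, so data moves to the front.
printf '%s\n' 'image1 -' 'data 1' | boot_order
```

Under this model the install-time layout yields image1 then data, while the step 5 layout yields data then image1, which matches the changed disk order seen in the boot menu.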

Comment 12 Xueqiang Wei 2021-08-18 06:13:48 UTC
(In reply to Gerd Hoffmann from comment #11)
> > Device        Start      End  Sectors Size Type
> > /dev/sda1      2048 52430847 52428800  25G Linux filesystem
> > /dev/sda2  52430848 56625151  4194304   2G Linux swap
> 
> > Device     Start     End Sectors Size Type
> > /dev/sdb1   2048 2099199 2097152   1G EFI System
> 
> Ok, so disk2 has the efi partition (as intended).
> 
> > In the Boot Manager Menu, I found the order of disks is different. Please
> > refer to the attached screenshots. Thanks.
> 
> As expected.  Disks are sorted according to the boot order ...
> 
> > And I found, 
> > If I select the first item "Red Hat Enterprise Linux", the guest starts
> > successfully.
> > If I select the item "UEFI QEMU QEMU HARDDISK", the guest starts
> > successfully.
> > If I select the item "UEFI QEMU QEMU HARDDISK 2", the guest can not start.
> 
> ... so with "bootindex=1" for the data1 disk it goes first.
> 
> So, it all looks rather normal to me.  ovmf finds both disks and orders
> them according to the bootindex.
> 
> The failure screenshot looks like shim+grub loaded just fine, but then grub
> has trouble finding its config file.  That might be a result of the changed
> disk ordering after install.
> 
> In general shuffling around disks after install is not guaranteed to work.
> It does work in many cases because disks and partitions are referenced by
> uuid these days.  But there always can be corner cases where things do not
> work.
> 
> I'd suggest to change the testcase to use bootindex right from the start
> (i.e. also for the install), not add it later on.


I added bootindex to disk1 and disk2 and tested with seabios and edk2; both installation and booting work well. Thanks, Gerd.

Versions:
kernel-5.14.0-0.rc4.35.el9.x86_64
qemu-kvm-6.0.0-10.el9
edk2-ovmf-20210527gite1999b264f1f-5.el9.noarch
seabios-bin-1.14.0-5.el9.noarch

qemu command lines:
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio-scsi-.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on,bootindex=1 \
    -blockdev node-name=file_data,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/data1.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_data,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_data \
    -device scsi-hd,id=data,drive=drive_data,write-cache=on,bootindex=2 \
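
A test case could guard against the original situation by checking that every scsi-hd device carries a bootindex before the guest is installed. The following is a rough sketch only: scsi_hd_without_bootindex is a hypothetical helper, and it assumes that id= is the first property after scsi-hd, as in the command lines in this report.

```shell
# Hypothetical helper: read a QEMU command line on stdin and print the id
# of every scsi-hd device that has no bootindex property.
# Assumes "id=" directly follows "scsi-hd," as in this report.
scsi_hd_without_bootindex() {
    grep -o 'scsi-hd,[^ ]*' |
        grep -v 'bootindex=' |
        sed -n 's/^scsi-hd,id=\([^,]*\),.*/\1/p'
}

# Step 5 layout from the earlier comments: only the data disk has a bootindex.
scsi_hd_without_bootindex <<'EOF'
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device scsi-hd,id=data,drive=drive_data,write-cache=on,bootindex=1 \
EOF
```

For the step 5 layout this prints image1; for the command line above, where both disks have a bootindex, it prints nothing.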


Since it's a corner case, I set the priority to medium. If I am wrong, please correct me. Thanks.



Hi Vera,

Could you try it with bootindex? Is Gerd's suggestion (changing the testcase to use bootindex) acceptable to you? Thanks.