Bug 1443345
| Summary: | OVMF can not load windows iso and AHCI device at the same time | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | FuXiangChun <xfu> | ||||||||
| Component: | ovmf | Assignee: | Laszlo Ersek <lersek> | ||||||||
| Status: | CLOSED NOTABUG | QA Contact: | FuXiangChun <xfu> | ||||||||
| Severity: | urgent | Docs Contact: | |||||||||
| Priority: | high | ||||||||||
| Version: | 7.4 | CC: | chayang, jcm, jsnow, juzhang, lijin, michen, xfu | ||||||||
| Target Milestone: | rc | ||||||||||
| Target Release: | --- | ||||||||||
| Hardware: | x86_64 | ||||||||||
| OS: | Windows | ||||||||||
| Whiteboard: | |||||||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||||
| Doc Text: | Story Points: | --- | |||||||||
| Clone Of: | Environment: | ||||||||||
| Last Closed: | 2017-04-25 18:04:03 UTC | Type: | Bug | ||||||||
| Regression: | --- | Mount Type: | --- | ||||||||
| Documentation: | --- | CRM: | |||||||||
| Verified Versions: | Category: | --- | |||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||
| Embargoed: | |||||||||||
| Attachments: |
(CC John)

So, in this case, the "ahci" device model name (visible on the comment 0 command line), is an alias for "ich9-ahci" (see "qdev_alias_table").

The built-in ich9-ahci device, at 0x1f.2, creates IDE buses ide.0 through ide.5, inclusive. The Windows ISO is available on ide.1.

The explicitly created ich9-ahci device gets an auto-assigned PCI B/D/F from QEMU. And, it creates IDE buses ide.6 through ide.11.

Multiple SATA controllers appear to be supported by libvirt as well, so I can't immediately reject this bug report as invalid. John, do you agree this is a valid use case?

I think I'll have to reproduce the problem locally, with the same installer ISO that QE has.

From the OVMF debug log attached to comment 0, I see that the Windows Server 2016 boot loader is launched successfully, and indeed it is loaded from port #1 of the SATA controller at 0x1f.2 (aka "ide.1"):

> FSOpen: Open '\EFI\BOOT\BOOTX64.EFI' Success
> [Bds] DevicePath expand: PciRoot(0x0)/Pci(0x1F,0x2)/Sata(0x1,0xFFFF,0x0) ->
> PciRoot(0x0)/Pci(0x1F,0x2)/Sata(0x1,0xFFFF,0x0)/CDROM(0x1,0x235,0x2A1D3C)
> /\EFI\BOOT\BOOTX64.EFI
> [Security] 3rd party image[0] can be loaded after EndOfDxe:
> PciRoot(0x0)/Pci(0x1F,0x2)/Sata(0x1,0xFFFF,0x0)/CDROM(0x1,0x235,0x2A1D3C)
> /\EFI\BOOT\BOOTX64.EFI.
> InstallProtocolInterface: [EfiLoadedImageProtocol] 7EB43D40
> Loading driver at 0x00010000000 EntryPoint=0x00010001090 cdboot.efi

(In reply to Laszlo Ersek from comment #3)
> Multiple SATA controllers appear to be supported by libvirt as well, so I
> can't immediately reject this bug report as invalid. John, do you agree this
> is a valid use case?

I have no ready-tailored explanations for why this shouldn't work. I don't know if it's a "useful" use case, but I don't expect it to fail, either...

When you say the ISO doesn't load, too, what exactly does that mean? Is windows loading but failing to properly initialize the installer (in that UEFI loaded it, but windows flubs the handoff and can't find its own data) or is it not getting that far? etc.

(In reply to John Snow from comment #5)
> (In reply to Laszlo Ersek from comment #3)
> > Multiple SATA controllers appear to be supported by libvirt as well, so I
> > can't immediately reject this bug report as invalid. John, do you agree this
> > is a valid use case?
>
> I have no ready-tailored explanations for why this shouldn't work. I don't
> know if it's a "useful" use case, but I don't expect it to fail, either...

Right, so we agree on that. Thank you for confirming.

> When you say the ISO doesn't load, too, what exactly does that mean? Is
> windows loading but failing to properly initialize the installer (in that
> UEFI loaded it, but windows flubs the handoff and can't find its own data)
> or is it not getting that far? etc.

This is exactly the information I'm missing from the report. Personally I can't repro the issue at all, and from QE's report, I see that at least the windows boot loader is successfully started.

Anyhow, looks like I'm getting access to QE's test env; I hope I can reproduce it there, live.

Thanks!

OK, I can reproduce it in QE's environment. The entry point function of the cdboot.efi (aka \EFI\BOOT\BOOTX64.EFI) boot loader is called alright, but it exits with an error.
One confusing thing is that the error message in the log from comment 0 is:

> Loading driver at 0x00010000000 EntryPoint=0x00010001090 cdboot.efi
> InstallProtocolInterface: BC62157E-3E33-4FEC-9920-2D3B36D750DF 7EBB7B98
> ProtectUefiImageCommon - 0x7EB43D40
>   - 0x0000000010000000 - 0x000000000010D000
> Error: Image at 00010000000 start failed: Time out <------- EFI_TIMEOUT
> ProtectUefiImageCommon - 0x7EB43D40
>   - 0x0000000010000000 - 0x000000000010D000
> Image Return Status = Time out <------- EFI_TIMEOUT

This is returned in the case when a key is *not* pressed in time, while the windows boot loader is displaying the prompt

> Press any key to boot from CD or DVD...

So the log file attached to comment 0 doesn't reflect the actual problematic case. Namely, in case a key is pressed in time, the real failure produces:

> Loading driver at 0x00010000000 EntryPoint=0x00010001090 cdboot.efi
> InstallProtocolInterface: BC62157E-3E33-4FEC-9920-2D3B36D750DF 7EBB7B98
> ProtectUefiImageCommon - 0x7EAFE040
>   - 0x0000000010000000 - 0x000000000010D000
> Error: Image at 00010000000 start failed: No mapping <---- EFI_NO_MAPPING
> ProtectUefiImageCommon - 0x7EAFE040
>   - 0x0000000010000000 - 0x000000000010D000
> Image Return Status = No mapping <---- EFI_NO_MAPPING

This is a really rare error code in edk2. I first thought that cdboot.efi forwarded the error code verbatim from underlying edk2 service; however, after grepping edk2 for both EFI_NO_MAPPING and RETURN_NO_MAPPING, I don't think that's the case. I think cdboot.efi genuinely produces this error code, but I don't know why.

Some more info, this time from the "info qtree" output.
* When the "-device ahci,id=ahci0" option is not present, and the boot works fine, we have:

> dev: ich9-ahci, id ""
>   addr = 1f.2
>   romfile = ""
>   rombar = 1 (0x1)
>   multifunction = true
>   command_serr_enable = true
>   x-pcie-lnksta-dllla = true
>   x-pcie-extcap-init = true
>   class SATA controller, addr 00:1f.2, pci id 8086:2922 (sub 1af4:1100)
>   bar 4: i/o at 0x6080 [0x609f]
>   bar 5: mem at 0x98002000 [0x98002fff]
>   bus: ide.5
>     type IDE
>     dev: ide-cd, id "ide-cd1"
>       drive = "cdrom1"
>       logical_block_size = 512 (0x200)
>       physical_block_size = 512 (0x200)
>       min_io_size = 0 (0x0)
>       opt_io_size = 0 (0x0)
>       discard_granularity = 512 (0x200)
>       write-cache = "auto"
>       share-rw = false
>       rerror = "auto"
>       werror = "auto"
>       ver = "2.5+"
>       wwn = 0 (0x0)
>       serial = "QM00011"
>       model = ""
>       unit = 0 (0x0)
>   bus: ide.4
>     type IDE
>   bus: ide.3
>     type IDE
>   bus: ide.2
>     type IDE
>   bus: ide.1
>     type IDE
>     dev: ide-drive, id "ide1-0-1"
>       drive = "drive-ide1-0-1"
>       logical_block_size = 512 (0x200)
>       physical_block_size = 512 (0x200)
>       min_io_size = 0 (0x0)
>       opt_io_size = 0 (0x0)
>       discard_granularity = 512 (0x200)
>       write-cache = "auto"
>       share-rw = false
>       rerror = "auto"
>       werror = "auto"
>       ver = "2.5+"
>       wwn = 0 (0x0)
>       serial = "QM00003"
>       model = ""
>       unit = 0 (0x0)
>   bus: ide.0
>     type IDE

That is, both the windows ISO (ide.1) and UefiShell.iso (ide.5) are hooked to the only (built-in) ich9-ahci device.
Note that the UefiShell.iso CD-ROM is not assigned explicitly on the QEMU command line, it gets auto-assigned:

  -device ide-cd,drive=cdrom1,id=ide-cd1,bootindex=4 \

* whereas, with "-device ahci,id=ahci0" added, we get:

> dev: ich9-ahci, id "ahci0"
>   addr = 02.0
>   romfile = ""
>   rombar = 1 (0x1)
>   multifunction = false
>   command_serr_enable = true
>   x-pcie-lnksta-dllla = true
>   x-pcie-extcap-init = true
>   class SATA controller, addr 00:02.0, pci id 8086:2922 (sub 1af4:1100)
>   bar 4: i/o at 0x60c0 [0x60df]
>   bar 5: mem at 0x98005000 [0x98005fff]
>   bus: ahci0.5
>     type IDE
>     dev: ide-cd, id "ide-cd1"
>       drive = "cdrom1"
>       logical_block_size = 512 (0x200)
>       physical_block_size = 512 (0x200)
>       min_io_size = 0 (0x0)
>       opt_io_size = 0 (0x0)
>       discard_granularity = 512 (0x200)
>       write-cache = "auto"
>       share-rw = false
>       rerror = "auto"
>       werror = "auto"
>       ver = "2.5+"
>       wwn = 0 (0x0)
>       serial = "QM00023"
>       model = ""
>       unit = 0 (0x0)
>   bus: ahci0.4
>     type IDE
>   bus: ahci0.3
>     type IDE
>   bus: ahci0.2
>     type IDE
>   bus: ahci0.1
>     type IDE
>   bus: ahci0.0
>     type IDE

I.e., the UefiShell.iso CD-ROM is auto-assigned to "ahci0.5", while:

>
> dev: ich9-ahci, id ""
>   addr = 1f.2
>   romfile = ""
>   rombar = 1 (0x1)
>   multifunction = true
>   command_serr_enable = true
>   x-pcie-lnksta-dllla = true
>   x-pcie-extcap-init = true
>   class SATA controller, addr 00:1f.2, pci id 8086:2922 (sub 1af4:1100)
>   bar 4: i/o at 0x6080 [0x609f]
>   bar 5: mem at 0x98002000 [0x98002fff]
>   bus: ide.5
>     type IDE
>   bus: ide.4
>     type IDE
>   bus: ide.3
>     type IDE
>   bus: ide.2
>     type IDE
>   bus: ide.1
>     type IDE
>     dev: ide-drive, id "ide1-0-1"
>       drive = "drive-ide1-0-1"
>       logical_block_size = 512 (0x200)
>       physical_block_size = 512 (0x200)
>       min_io_size = 0 (0x0)
>       opt_io_size = 0 (0x0)
>       discard_granularity = 512 (0x200)
>       write-cache = "auto"
>       share-rw = false
>       rerror = "auto"
>       werror = "auto"
>       ver = "2.5+"
>       wwn = 0 (0x0)
>       serial = "QM00003"
>       model = ""
>       unit = 0 (0x0)
>   bus: ide.0
>     type IDE

the built-in SATA controller only gets the explicitly assigned windows ISO.

This shouldn't be a problem, really, because in the UEFI shell I can readily access both ISO images.

Also, as a side effect, the "ahci0" controller's appearance causes QEMU to auto-assign the PCI address 02.0 to it, which pushes the rest of the auto-addressed devices higher. Namely, "virtio-scsi-pci" moves from 02.0 to 03.0, and "virtio-net-pci" moves from 03.0 to 04.0. Still unclear why this should matter at all.

So here's the strangest thing.
* When the "-device ahci,id=ahci0" option is not present, and the boot works fine, cdboot.efi displays the
> Press any key to boot from CD or DVD...
prompt *graphically*. It uses EFI_GRAPHICS_OUTPUT_PROTOCOL to display the prompt.
* When the option is added, and the boot fails, the same prompt is displayed *textually*, using EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL. Those characters are then rendered to graphics by the edk2 console / graphics stack.
I'll attach two screenshots now, to illustrate.
Created attachment 1273981 [details]
good (graphical prompt display) case from comment 10

Created attachment 1273982 [details]
bad (textual prompt display) case from comment 10

This problem has been encountered by others:

https://bbs.archlinux.org/viewtopic.php?pid=1505053#p1505053

This is getting crazy.

Keeping the "ahci0" device, but moving the UefiShell.iso CD-ROM off it (i.e., from the automatically assigned bus=ahci0.5) to the builtin controller's bus=ide.5 explicitly, as in

  -device ahci,id=ahci0 \
  -device ide-cd,drive=cdrom1,id=ide-cd1,bootindex=4,bus=ide.5 \

makes things work.

Additionally, leaving the UefiShell.iso auto-assigned to bus=ahci0.5, but moving the Windows ISO to ahci0.1 explicitly, as in:

  -device ahci,id=ahci0 \
  -device ide-cd,drive=cdrom1,id=ide-cd1,bootindex=4 \
  -device ide-drive,drive=drive-ide1-0-1,id=ide1-0-1,bus=ahci0.1,unit=0,bootindex=1 \

*also* works.

Apparently, it is fine to have multiple (built-in and explicitly added) AHCI controllers, as long as *all* CD-ROMs are hooked to the same one. Is this a known Windows boot loader restriction?

OK, I'm going to say that this is not a bug in OVMF. The currently available evidence suggests that the symptoms seen are an undocumented peculiarity of cdboot.efi. If we want to further research cdboot.efi's behavior, I guess we should contact Microsoft, and ask them for an explanation.

*** Bug 1524282 has been marked as a duplicate of this bug. ***
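The observation in the comments above (the boot works as long as all CD-ROMs sit behind the same AHCI controller) could be checked mechanically against pasted "info qtree" output. The following is only a rough sketch: the function names are hypothetical, it assumes every ide-cd / ide-drive device in the dump is a CD-ROM (true for this report, not in general), and it derives the controller name from the bus name prefix ("ide" for the built-in controller, "ahci0" for the added one), which is a convenience of this sketch and not anything QEMU itself reports.

```python
import re
from collections import defaultdict

# Patterns for the two kinds of "info qtree" lines we care about.
BUS_RE = re.compile(r'^bus:\s+(\S+)$')
DEV_RE = re.compile(r'^dev:\s+(ide-cd|ide-drive),\s+id\s+"([^"]*)"$')


def drives_by_controller(qtree_text):
    """Group ide-cd / ide-drive devices by the controller owning their bus.

    Returns a dict: controller name -> list of (device id, bus name).
    The controller name is the bus name with its ".N" suffix removed
    (an assumption of this sketch).
    """
    result = defaultdict(list)
    current_bus = None
    for raw in qtree_text.splitlines():
        # Drop Bugzilla-style "> " quoting and indentation.
        line = raw.lstrip('> \t').rstrip()
        m = BUS_RE.match(line)
        if m:
            current_bus = m.group(1)
            continue
        m = DEV_RE.match(line)
        if m and current_bus is not None:
            controller = current_bus.rsplit('.', 1)[0]
            result[controller].append((m.group(2), current_bus))
    return dict(result)


def split_across_controllers(qtree_text):
    """True if the listed drives are spread over more than one controller."""
    return len(drives_by_controller(qtree_text)) > 1


# Abbreviated excerpt of the failing layout described in this bug.
SAMPLE_BAD = '''\
bus: ahci0.5
  type IDE
  dev: ide-cd, id "ide-cd1"
bus: ide.1
  type IDE
  dev: ide-drive, id "ide1-0-1"
'''

if __name__ == "__main__":
    print(drives_by_controller(SAMPLE_BAD))
    print(split_across_controllers(SAMPLE_BAD))
```

For the failing configuration from this report, the two drives end up under different controller names, which is the layout that cdboot.efi appears not to tolerate.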
Created attachment 1272494 [details]
ovmf log

Description of problem:
Boot qemu-kvm process with ovmf+ahci+win2016/10 iso. windows iso failed to load.

Notes:
1. seabios works
2. linux iso works.

Version-Release number of selected component (if applicable):
3.10.0-653.el7.x86_64
qemu-kvm-rhev-2.9.0-0.el7.mrezanin201703210848.x86_64
OVMF-20170228-3.gitc325e41585e3.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. /usr/libexec/qemu-kvm \
-M q35 \
-cpu Westmere \
-nodefaults -rtc base=utc \
-m 2G \
-smp 4,sockets=2,cores=2,threads=1 \
-enable-kvm \
-name rhel7.4 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-global driver=cfi.pflash01,property=secure,value=on \
-device ahci,id=ahci0 \
-drive file=/usr/share/OVMF/UefiShell.iso,if=none,cache=none,snapshot=off,aio=native,media=cdrom,id=cdrom1 -device ide-cd,drive=cdrom1,id=ide-cd1,bootindex=4 \
-drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0 \
-drive file=/home/ovmf/guest/OVMF_VARS.fd,if=pflash,format=raw,unit=1 \
-k en-us \
-debugcon file:/home/test/ovmf.log \
-global isa-debugcon.iobase=0x402 \
-serial unix:/tmp/console,server,nowait \
-boot menu=on,splash-time=100 \
-qmp tcp::4446,server,nowait \
-drive file=/home/ovmf/guest/win2016.qcow2,if=none,id=drive0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=threads \
-device virtio-scsi-pci,id=scsi1,disable-legacy=off,disable-modern=off \
-device scsi-hd,id=virtio-disk0,drive=drive0,bus=scsi1.0,bootindex=3 \
-vnc :1 \
-monitor stdio \
-drive file=/home/en_windows_server_2016_x64_dvd_9327751.iso,if=none,media=cdrom,format=raw,id=drive-ide1-0-1 -device ide-drive,drive=drive-ide1-0-1,id=ide1-0-1,bus=ide.1,unit=0,bootindex=1 \
-device virtio-net-pci,netdev=tap10,mac=08:9e:01:c2:6d:6e,disable-legacy=off,disable-modern=off,bootindex=2 \
-netdev tap,id=tap10 \
-smbios type=1,manufacturer=redhat-kvmqe,product=rhel7.4-kvm,version=7.444444,serial=123456789,uuid=4C4C4544-0044-3010-8047-B4C04F313232,sku=fuxc,family=rhel7 \
-fda /usr/share/virtio-win/virtio-win_amd64.vfd \
-vga qxl \
2.
3.

Actual results:
Fail to boot windows iso.

Expected results:
successful

Additional info:
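When triaging a report like this one, it may help to pull the "Image Return Status" lines out of the OVMF debug log (the file written through the -debugcon option on the command line above), since the thread shows that "Time out" only means no key was pressed at the CD boot prompt, while "No mapping" is the actual failure. The snippet below is a minimal sketch under stated assumptions: the function name is hypothetical, and only the two status strings that appear in this bug are mapped to their EFI names; anything else is reported as "unmapped" rather than guessed.

```python
import re

# Matches the status text after "Image Return Status = ", stopping before any
# hand-written "<---- ..." annotation such as the ones added in this thread.
STATUS_RE = re.compile(r'Image Return Status = ([^<\r\n]+)')

# Only the two status strings seen in this bug report are mapped here.
KNOWN_STATUSES = {
    'Time out': 'EFI_TIMEOUT',
    'No mapping': 'EFI_NO_MAPPING',
}


def image_return_statuses(log_text):
    """Return a list of (status text, EFI name or 'unmapped') from log text."""
    statuses = []
    for line in log_text.splitlines():
        m = STATUS_RE.search(line)
        if m:
            text = m.group(1).strip()
            statuses.append((text, KNOWN_STATUSES.get(text, 'unmapped')))
    return statuses


if __name__ == "__main__":
    sample = (
        "Loading driver at 0x00010000000 EntryPoint=0x00010001090 cdboot.efi\n"
        "Image Return Status = No mapping\n"
    )
    for text, name in image_return_statuses(sample):
        print(text, name)
```

To run it against a real log, read the file given to -debugcon (here /home/test/ovmf.log) and pass its contents to image_return_statuses().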