Created attachment 1917114 [details]
Log of run of reproducer.sh

Description of problem:
When running virt-install on an aarch64 system (Ampere Altra), if I provide the `--cloud-init` option the VM starts up and then terminates. Without `--cloud-init` the VM boots and gets to a login prompt.

With `--cloud-init`, the VM boots and then shows a message about "Boot Option Restoration", "Press any key to stop system reset". After about 5 seconds the VM terminates.

Running the exact same reproducer script on x86_64 works (using the image https://dl.fedoraproject.org/pub/fedora/linux/releases/36/Cloud/x86_64/images/Fedora-Cloud-Base-36-1.5.x86_64.qcow2). On x86_64 it boots and I can log in as the user `fedora` with the password `Password1`.

Version-Release number of selected component (if applicable):
virt-manager-4.0.0-1.fc36.noarch

How reproducible:
Always

Steps to Reproduce:
Running as a local user who is part of the `libvirt` group.

$ cat reproducer.sh
#!/bin/bash

set -u
set -x
set -e

TOP_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
VIRSH_POOL="${TOP_DIR}/virsh-pool"
mkdir -pv "${VIRSH_POOL}"

LIBVIRT_IMAGE_DIR="/var/lib/libvirt/images/"
FEDORA_QCOW_URL="https://dl.fedoraproject.org/pub/fedora/linux/releases/36/Cloud/aarch64/images/Fedora-Cloud-Base-36-1.5.aarch64.qcow2"
FEDORA_QCOW="${TOP_DIR}/$(basename "${FEDORA_QCOW_URL}")"
FEDORA_QCOW_VIRSH_POOL="${VIRSH_POOL}/$(basename "${FEDORA_QCOW}")"

if [[ ! -f "${FEDORA_QCOW}" ]]; then
    wget "${FEDORA_QCOW_URL}"
fi

CLOUD_INIT_USER_DATA="${VIRSH_POOL}/user-data.txt"
CLOUD_INIT_META_DATA="${VIRSH_POOL}/meta-data.txt"

function cloud_config {
    cat <<EOF > "${CLOUD_INIT_USER_DATA}"
#cloud-config
password: Password1
chpasswd: { expire: False }
ssh_pwauth: True
hostname: foo-tester
EOF

    cat <<EOF > "${CLOUD_INIT_META_DATA}"
instance-id: foo-tester
local-hostname: foo-tester
EOF
}

cloud_config

cp -v "${FEDORA_QCOW}" "${FEDORA_QCOW_VIRSH_POOL}"

virt_args=()
virt_args+=(--name foo-tester)
virt_args+=(--memory 1024)
# virt_args+=(--disk size=10,backing_store="${FEDORA_QCOW_VIRSH_POOL}",bus=virtio)
virt_args+=(--disk "${FEDORA_QCOW_VIRSH_POOL}",size=10,format=qcow2,bus=virtio)
virt_args+=(--os-variant fedora36)
virt_args+=(--nographics)
virt_args+=(--network bridge=virbr0,model=virtio)
virt_args+=(--import)
virt_args+=(--cloud-init user-data="${CLOUD_INIT_USER_DATA}",meta-data="${CLOUD_INIT_META_DATA}")
# virt_args+=(--debug)

virt-install "${virt_args[@]}"
I tried to create my own ISO image to hold the `user-data` and `meta-data` files. Using the `--cdrom` option to virt-install did not work. But using the `--disk myisoimage.iso,device=cdrom` did work. I was able to login as user `fedora` with password `Password1` into the VM.
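For readers who want to try the same workaround, the manual ISO approach can be sketched roughly as below. This is a minimal sketch and not the reporter's actual commands: it assumes `genisoimage` is installed, and the function names (`seed_dir_ok`, `make_seed_iso`, `seed_disk_arg`) and the paths are hypothetical. The files inside the image are named exactly `user-data` and `meta-data` and the volume label is `cidata`, which is what cloud-init's NoCloud datasource looks for.

```shell
#!/bin/sh
# Sketch only: function names and paths are illustrative, not from the report.

# Succeed only if the directory holds both files under the exact names the
# NoCloud datasource expects (no .txt suffix).
seed_dir_ok() {
    [ -f "$1/user-data" ] && [ -f "$1/meta-data" ]
}

# Pack the two files into an ISO with the volume label "cidata".
# $1 = output ISO, $2 = directory holding user-data and meta-data.
# Assumes genisoimage is installed.
make_seed_iso() {
    genisoimage -output "$1" -volid cidata -joliet -rock \
        "$2/user-data" "$2/meta-data"
}

# Build the virt-install --disk value that attaches the ISO as a CD-ROM
# device instead of booting from it.
seed_disk_arg() {
    printf '%s,device=cdrom\n' "$1"
}
```

A possible use would be `seed_dir_ok ./seed && make_seed_iso seed.iso ./seed`, followed by adding `--disk "$(seed_disk_arg seed.iso)"` to the virt-install arguments in place of `--cloud-init`.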
Thanks for the report.

For the failing case, please add --debug and post the failing output. Then do the same for your manual `--disk ISO,device=cdrom` case.

The manual `--cdrom` attempt won't work since that's telling the VM to boot off the cloud-init media.
Created attachment 1917336 [details] Run with --debug and --cloud-init
Created attachment 1917337 [details] Run with --debug and an ISO image
(In reply to Cole Robinson from comment #2)
> Thanks for the report.
>
> For the failing case, please add --debug and post the failing output.
> Then do the same for your manual `--disk ISO,device=cdrom` case
>
> The manual `--cdrom` attempt won't work since that's telling the VM to boot
> off the cloud-init media

Thanks Cole. I have uploaded the two different log files as attachments.
We debugged a bit more on IRC. There's one issue with the virtio-scsi controller not being added, but it's not the root issue. I filed that here: https://github.com/virt-manager/virt-manager/issues/445

The main issue is that OVMF is performing a one-time VM reset, which virt-install is not expecting. virt-install expects the VM to shut down only after the OS has booted and cloud-init has run; virt-install then redefines the VM to remove the cloud-init ISO media and drop the cloud-init smbios data from the VM config. But since the VM is resetting before even hitting the OS, virt-install is ejecting the media too early.

I'd like to know why aarch64 OVMF/AAVMF is resetting. kraxel and/or lersek, is this expected? Here's the output with some control codes removed. It says 'Boot Option Restoration' before the reset is invoked:

[Tue, 11 Oct 2022 10:22:46 virt-install 32000] DEBUG (cli:265) Running text console command: virsh --connect qemu:///session console foo-tester
Connected to domain 'foo-tester'
Escape character is ^] (Ctrl + ])
Tpm2GetCapabilityPcrs - 00000004
alg - 4
alg - B
alg - C
alg - D
Image type X64 can't be loaded on AARCH64 UEFI system.
[2J[01;01H[=3h[2J[01;01H[2J[01;01H[=3h[2J[01;01HBdsDxe: loading Boot0002 "UEFI Misc Device 2" from PciRoot(0x0)/Pci(0x1,0x4)/Pci(0x0,0x0)
BdsDxe: starting Boot0002 "UEFI Misc Device 2" from PciRoot(0x0)/Pci(0x1,0x4)/Pci(0x0,0x0)
[0m[37m[44m[01;01H/------------------------------------------------------------------------------\[02;01H| Boot Option Restoration ...------------------------------------------------------------------------------/
[26;24HPress any key to stop system reset
[48;03HBooting in 5 seconds
[48;03HBooting in 4 seconds
[48;03HBooting in 3 seconds
[48;03HBooting in 2 seconds
[48;03HBooting in 1 second
[05;01H[0m[37m[40mReset System
UEFI firmware starting.
SyncPcrAllocationsAndPcrMask!
Tpm2GetCapabilityPcrs - 00000004
alg - 4
alg - B
alg - C
alg - D
Image type X64 can't be loaded on AARCH64 UEFI system.
[2J[01;01H[=3h[2J[01;01H[2J[01;01H[=3h[2J[01;01HBdsDxe: loading Boot0004 "Fedora" from HD(1,GPT,5C481284-4AF4-437F-93B6-6FB2721A75D2,0x800,0x32000)/\EFI\fedora\shimaa64.efi
BdsDxe: starting Boot0004 "Fedora" from HD(1,GPT,5C481284-4AF4-437F-93B6-6FB2721A75D2,0x800,0x32000)/\EFI\fedora\shimaa64.efi
[0m[30m[40m[2J[01;01H[0m[37m[40m[02;30HGNU GRUB version 2.06
[04;02H/---------------------------------------------------------------------------....\----------------------------------------------------------------------------/[43;02H[44;02H
Use the ^ and v keys to select which entry is highlighted. Press enter to boot the selected OS, `e' to edit the commands before booting or `c' for a command-line. ESC to return previous menu.
> The main issue is that OVMF is performing a one time VM reset, which
> virt-install is not expecting. virt-install is expecting the VM to shutdown
> only after the OS is booted and cloud-init has run, then virt-install
> redefines the VM to remove cloud-init ISO media and drop cloud-init smbios
> data from the VM config. But since the VM is resetting before even hitting
> the OS, virt-install is ejecting media too early.

> Tpm2GetCapabilityPcrs - 00000004

I think this is TPM initialization. I have seen this "Boot Option Restoration" screen in guests with a TPM added, and I know that certain TPM operations need to be done early in boot (PEI phase). When changing TPM config options in the OVMF setup menu (like enabling/disabling TPM banks), OVMF also goes through a reboot to actually apply them.
Gerd is right about the TPM PPI (Physical Presence Interface) opcodes heavily interfering with the normal boot process, but IMO this is something else. Namely, IMO what you're seeing is the "fallback behavior" of shim.

"Boot Option Restoration" is a message from "fallback.c" in the "shim" project. This "fallback" logic is active when you have an installed UEFI operating system on your disk (such as Fedora or RHEL), but you have no UEFI Boot Options for booting specifically that operating system. In such cases, a default / fallback boot logic takes place in the platform firmware (the UEFI boot manager), and the installed OS provides a utility -- named the same as the normal boot loader on *removable* UEFI media -- for restoring (recreating) UEFI boot options. That's what you are seeing here. Once the "fallback" portion of "shim" restores / recreates a UEFI boot option for launching the installed OS, it resets the system.

The root cause of this problem is that you define a UEFI domain (a disk image, effectively) without a matching UEFI varstore file (one that would be in sync with the completed installation process of the existent OS image on the disk). Shim perceives this as a loss of UEFI Boot Options, recreates them, and then reboots the VM.
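One way to confirm from a captured console log that a given run went through this fallback path is to look for the two markers that appear in the console output in this report. The following is a minimal sketch under that assumption; the function name `saw_fallback_reset` and the log file name are hypothetical, and it only checks that both strings occur somewhere in the file, not their order.

```shell
#!/bin/sh
# Sketch only: the function name is illustrative.

# Return 0 if the console log at $1 contains both the fallback banner and
# the reset message seen in this report; return non-zero otherwise.
saw_fallback_reset() {
    grep -q 'Boot Option Restoration' "$1" && grep -q 'Reset System' "$1"
}
```

For example, `saw_fallback_reset console.log && echo "fallback reset seen"` on a log saved from the `virsh console` session.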
*** Bug 2140164 has been marked as a duplicate of this bug. ***
Thanks for the info Laszlo + Gerd. I looked at shim fallback.c; the reset only happens when a TPM is present, otherwise it just boots the first image. Justification is here: https://github.com/rhboot/shim/commit/431b8a2e

I reproduced this on x86_64 with the cloud image too, so it's not aarch64 specific. We are less likely to hit this on x86_64 since EFI isn't the default there.
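Since the reset depends on a TPM being present, it can help to check whether the defined domain actually has a TPM device. Below is a minimal sketch that scans domain XML on standard input for a `<tpm` element; the function name `domain_has_tpm` is hypothetical, and a plain `grep` is a rough stand-in for real XML parsing.

```shell
#!/bin/sh
# Sketch only: the function name is illustrative, and grep is a crude
# substitute for parsing the XML properly.

# Read libvirt domain XML on stdin; return 0 if it contains a <tpm> element.
domain_has_tpm() {
    grep -q '<tpm[ >]'
}
```

Usage could look like `virsh --connect qemu:///session dumpxml foo-tester | domain_has_tpm && echo "TPM present"`. If a TPM is present, one possible experiment (an assumption here, not something tried in this thread) is to rerun virt-install with `--tpm none`, provided the installed version accepts that value.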
(In reply to Cole Robinson from comment #10)
> Thanks for the info Laszlo + Gerd. I looked at shim fallback.c; the reset
> only happens when TPM is present, otherwise it just boots the first image.
> Justification is here: https://github.com/rhboot/shim/commit/431b8a2e
>
> I reproduced with x86_64 and cloud image too, so it's not aarch64 specific.
> We are less likely to hit this on x86_64 since efi isn't the default.

We want to be able to move people to EFI for x86_64 though, so we definitely need a solution.

If we assume that *every* Linux pre-built disk image that has EFI support is going to contain 'shim' as the first bootloader, then we know we will always get this early reboot. We can query libosinfo to ask what disks have EFI support IIRC.

I see the initial XML is being given:

-  <sysinfo type="smbios">
-    <system>
-      <entry name="serial">ds=nocloud</entry>
-    </system>
-  </sysinfo>
-  <on_reboot>destroy</on_reboot>
 </domain>

IIUC, this use of the SMBIOS should be redundant according to docs:

https://cloudinit.readthedocs.io/en/latest/topics/datasources/nocloud.html

[quote]
You can provide meta-data and user-data to a local vm boot via files on a vfat or iso9660 filesystem. The filesystem volume label must be cidata or CIDATA.

Alternatively, you can provide meta-data via kernel command line or SMBIOS “serial number” option.
[/quote]

IOW, as long as we use the label 'cidata', there's no need for SMBIOS settings. Git history shows cloud-init supported 'cidata' since 2017, and 'CIDATA' since 2019.
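The label requirement from the quoted documentation is easy to check on an existing seed image. The following is a minimal sketch: the function names `is_nocloud_label` and `image_label` are hypothetical, and reading the label assumes `blkid` from util-linux is available and able to probe the image file.

```shell
#!/bin/sh
# Sketch only: function names are illustrative.

# Return 0 if $1 is one of the two volume labels the quoted NoCloud
# documentation names (cidata or CIDATA); return 1 for anything else.
is_nocloud_label() {
    case "$1" in
        cidata|CIDATA) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the filesystem label of an image file.
# Assumes blkid (util-linux) is installed.
image_label() {
    blkid -o value -s LABEL "$1"
}
```

For instance, `is_nocloud_label "$(image_label seed.iso)" || echo "label is not cidata/CIDATA"` flags an image that the NoCloud datasource would not pick up by label alone.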
This message is a reminder that Fedora Linux 36 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 36 on 2023-05-16. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '36'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see it. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 36 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 36 entered end-of-life (EOL) status on 2023-05-16. Fedora Linux 36 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora Linux please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field. If you are unable to reopen this bug, please file a new report against an active release. Thank you for reporting this bug and we are sorry it could not be fixed.
Reopening, I believe this is still relevant for virt-install/virt-manager
This message is a reminder that Fedora Linux 38 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 38 on 2024-05-21. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '38'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see it. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 38 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 38 entered end-of-life (EOL) status on 2024-05-21. Fedora Linux 38 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora Linux please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field. If you are unable to reopen this bug, please file a new report against an active release. Thank you for reporting this bug and we are sorry it could not be fixed.