Bug 1308678
Summary: | clearly separate SB-less, SMM-less OVMF binary from SB+SMM OVMF binary
---|---
Product: | Red Hat Enterprise Linux 7
Reporter: | Laszlo Ersek <lersek>
Component: | ovmf
Assignee: | Laszlo Ersek <lersek>
Status: | CLOSED ERRATA
QA Contact: | aihua liang <aliang>
Severity: | unspecified
Docs Contact: | Jiri Herrmann <jherrman>
Priority: | unspecified
Version: | 7.3
CC: | areis, chayang, dgilbert, huding, juzhang, lersek, mprivozn, mrezanin, qizhu, xfu, zhguo
Target Milestone: | rc
Target Release: | ---
Hardware: | Unspecified
OS: | Unspecified
Whiteboard: |
Fixed In Version: | ovmf-20160608-1.git988715a.el7
Doc Type: | No Doc Update
Doc Text: | no doc update
Story Points: | ---
Clone Of: |
Environment: |
Last Closed: | 2016-11-04 08:39:52 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
CRM: |
Verified Versions: |
Category: | ---
oVirt Team: | ---
RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | ---
Target Upstream Version: |
Embargoed: |
Bug Depends On: | 1341733
Bug Blocks: | 1086864, 1304483
Attachments: |

Description
Laszlo Ersek
2016-02-15 17:43:51 UTC
How to test this:

The first part of the testing / setup happens *before* upgrading the OVMF binary package to the version that has the fix for this bug.

(01) Verify that the "nvram" stanza in "/etc/libvirt/qemu.conf" looks as follows:

       nvram = [
         "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd"
       ]

     If it doesn't look like this, modify it, and then restart libvirtd (systemctl restart libvirtd).

(02) Install a new OVMF virtual machine -- let's call it "Fedora-i440fx-no-secboot". In virt-manager, in the last dialog, select "Customize configuration before install", and flip the Firmware selection from BIOS to UEFI. Proceed with the installation as usual. (Including the final boot into the freshly installed guest.)

(03) Install another new OVMF virtual machine -- let's call it "Fedora-i440fx-almost-secboot". Do the same as in step (02), but after the installation completes, and the installed guest has been booted up, there are other steps as well:

     (a) Shut down the VM.

     (b) Attach the ISO image "/usr/share/OVMF/UefiShell.iso" to the virtual machine. IDE or virtio-scsi CD-ROM are both okay. Make sure that the CD-ROM comes first in the boot order, and the main disk (virtio-blk, virtio-scsi, or IDE -- not relevant) comes second.

     (c) Boot the virtual machine. You will end up in the UEFI shell. Execute the following command:

           EnrollDefaultKeys.efi

         It should succeed. Then run:

           reset -s

     (d) Boot the virtual machine again. This time Fedora should be launched automatically. Log in to the virtual machine, and run the following command:

           dmesg | grep -i secure

         The output should contain "Secure Boot enabled".

     (e) Shut down the VM.

At this point, upgrade the OVMF package. Continue with the following steps:

(04) Boot the "Fedora-i440fx-no-secboot" virtual machine. Verify that it boots okay, and that the OpenSSL banner has disappeared from the TianoCore splash screen. Shut down the VM.

(05) Modify the configuration of the "Fedora-i440fx-almost-secboot" virtual machine: remove the CD-ROM with the "UefiShell.iso" image from the *boot order only* (do not remove the device or the image itself).

(06) Boot the "Fedora-i440fx-almost-secboot" VM.

     (a) Verify that it boots okay, and that the OpenSSL banner has disappeared from the TianoCore splash screen.

     (b) Log in to the guest, and repeat the command

           dmesg | grep -i secure

         The secure boot feature should *not* be reported.

     (c) Reboot the guest, and interrupt the boot process at the TianoCore splash screen. Navigate to "Boot Manager" | "EFI Internal Shell". This option should exist, and the UEFI shell should be launched.

     (d) Execute the following command:

           EnrollDefaultKeys.efi

         The command should run, but it should *fail* with the following message:

           error: GetVariable("SetupMode", 8BE4DF61-93CA-11D2-AA0D-00E098032B8C): Not Found

     (e) Shut off the VM with the

           reset -s

         UEFI shell command.

At this point we have verified that a firmware binary has been preserved under the previous pathname (/usr/share/OVMF/OVMF_CODE.fd), that the secure boot feature has been removed from it, and that preexistent guests continue working with it (except the secure boot aspects themselves, which is intentional).

The rest of the steps can be used optionally, to verify the secure boot functionality as well. Please note that if you decide to do this, it will incur some host dependencies:

- the host kernel must contain the fix for bug 1271404
- qemu-kvm-rhev must contain the fix for bug 1202822
- libvirt must contain the fix for bug 1304483

(07) Verify that the "nvram" stanza in "/etc/libvirt/qemu.conf" looks as follows:

       nvram = [
         "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd",
         "/usr/share/OVMF/OVMF_CODE.secboot.fd:/usr/share/OVMF/OVMF_VARS.fd"
       ]

     If it doesn't look like this, modify it, and then restart libvirtd (systemctl restart libvirtd).

(08) Install a new virtual machine, called "Fedora-q35-secboot".
     In virt-manager, in the last dialog, select "Customize configuration before install". Flip the machine type from i440fx to Q35, and *also* change the firmware type from BIOS to "Custom -- /usr/share/OVMF/OVMF_CODE.secboot.fd". (It is possible that virt-manager will call this a different name later on.) Installation should complete, including the boot into the freshly installed guest.

(09) Repeat steps (a) through (e) from (03), for the "Fedora-q35-secboot" VM.

Note that at some points the Linux boot process may feel a bit slower. This is a consequence of the SMM feature in OVMF. Functionally it should have no impact.

Actually, I think I can open up this bug to the grand public, with one clarification:

(In reply to Laszlo Ersek from comment #0)
> (1b) Removing the Secure Boot feature from the OVMF_CODE.fd binary solves
> the "invalid sense of security" problem that "OVMF_CODE.fd" has in
> RHEL-7.2 (i.e., SB enabled, but no SMM). [...]

This is a problem in name only, of course -- we had been aware of what "SB without SMM" meant; SB was enabled only for testing / development purposes, and OVMF was Tech Preview.

The docs for this bug will be squashed into the relnotes for bug 1202819.

Further changes are required for this bug; moving it back to ASSIGNED.

Namely, an agreement has been reached on virt-devel: "OVMF_CODE.fd" shall not be shipped at all. This means that there will be no support for i440fx machine types.

Also, in comment 1, steps 01 through 06 are no longer relevant per se, and step 07 should be updated to remove the "OVMF_CODE.fd" line.

Sorry, this hasn't FailedQA; it's a scope change that supersedes the earlier state of this bug. Removing FailedQA.

Created attachment 1163001
ovmf log
Packages used:
kernel-3.10.0-375.el7.x86_64
OVMF-20160419-2.git90bb4c5.el7.noarch
qemu-kvm-rhev-2.6.0-4.el7.x86_64
qemu cmd line:
/usr/libexec/qemu-kvm -name fedora22-3 \
-machine pc-q35-rhel7.2.0,accel=kvm,usb=off,vmport=off,smm=on \
-cpu IvyBridge \
-drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0,readonly=on \
-drive file=/var/lib/libvirt/qemu/nvram/fedora22-3_VARS.fd,if=pflash,format=raw,unit=1 \
-m 2048 -realtime mlock=off \
-smp 1,sockets=1,cores=1,threads=1 -uuid 9a69847b-ff96-41bc-8a66-ff96113a1da3 \
-nodefaults -rtc base=utc,driftfix=slew \
-global driver=cfi.pflash01,property=secure,value=on \
-global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
-boot strict=on \
-device virtio-scsi-pci,id=scsi0 -device virtio-serial-pci,id=virtio-serial0 -drive file=/home/Fedora-q35-secboot.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 \
-drive file=/usr/share/OVMF/UefiShell.iso,if=none,id=drive-scsi0-0-0-0,readonly=on,format=raw -device scsi-cd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
-spice port=5900,addr=10.66.11.59,disable-ticketing,image-compression=off,seamless-migration=on \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16 -monitor stdio \
-global isa-debugcon.iobase=0x402 -debugcon file:/home/ovmf.log
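
A command line like the one above is easy to get subtly wrong when it is assembled by hand. The following is a minimal sketch of a pre-flight check, assuming that the SB+SMM firmware needs `smm=on` on the machine type, the `cfi.pflash01` `secure=on` global, and a read-only pflash unit 0 (these are the options used in this report and, as far as I know, the ones the edk2 OvmfPkg README lists for SMM builds). `check_secboot_args` is a hypothetical helper name, not part of QEMU, OVMF or libvirt, and the unit 0 pattern only matches the option spelling used in this report.

```shell
#!/bin/sh
# check_secboot_args ARGS...
# Hypothetical helper: scans a qemu-kvm argument list for the options that the
# SB+SMM setup in this report relies on. Returns 0 if all are present,
# 1 otherwise, and names each missing option on stderr.
check_secboot_args() {
    all=" $* "
    rc=0

    # SMM must be enabled on the (Q35) machine type.
    case $all in
        *smm=on*) ;;
        *) echo "missing: smm=on in -machine" >&2; rc=1 ;;
    esac

    # The pflash devices must be restricted to SMM-only write access.
    case $all in
        *driver=cfi.pflash01,property=secure,value=on*) ;;
        *) echo "missing: -global driver=cfi.pflash01,property=secure,value=on" >&2; rc=1 ;;
    esac

    # Firmware code goes to pflash unit 0, read-only (assumes the option
    # ordering used in this report).
    case $all in
        *if=pflash,format=raw,unit=0,readonly=on*) ;;
        *) echo "missing: read-only pflash unit=0 for the firmware code" >&2; rc=1 ;;
    esac

    return "$rc"
}
```

The function only inspects strings; it does not start QEMU, so it can be run against a command line before launching it.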
Following the steps from step (07) onward, the UEFI shell always gets stuck in step (03)(c), like this:
Shell>EnrollDefaultKeys.efi
info: SetupMode=1 SecureBoot=0 SecureBootEnable=0 CustomMode=0 VendorKeys=1
The debug log is attached.
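
The "info:" line printed by EnrollDefaultKeys.efi lists several UEFI variables; in the output above, SetupMode=1 and SecureBoot=0 describe a firmware with no Platform Key enrolled and Secure Boot not in force. When comparing captured console output from several runs, a small helper for pulling one field out of such a line may be handy. This is only a sketch: `enroll_info_field` is a hypothetical name, and it assumes the space-separated NAME=VALUE layout shown above.

```shell
#!/bin/sh
# enroll_info_field LINE NAME
# Hypothetical helper: prints the value of NAME from an "info:" line as shown
# in this report. Returns 1 if NAME is not present.
enroll_info_field() {
    line=$1
    name=$2
    # Rely on word splitting to walk the space-separated NAME=VALUE pairs.
    for word in $line; do
        case $word in
            "$name"=*)
                # Strip everything up to and including the first "=".
                printf '%s\n' "${word#*=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Example with the line from this comment:
#   line='info: SetupMode=1 SecureBoot=0 SecureBootEnable=0 CustomMode=0 VendorKeys=1'
#   enroll_info_field "$line" SetupMode     # prints 1
```

Matching on `NAME=` exactly keeps SecureBoot and SecureBootEnable apart, even though one name is a prefix of the other.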
Hi Laszlo,
Could you look at this?

Hello Zhiyi, I'm looking into this. Meanwhile could you please provide your host's /proc/cpuinfo? Thanks. (I'm not clearing the NEEDINFO just yet.)

My host info follows. Only cpu0 is shown; cpu1 through cpu7 are identical to cpu0:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 58
model name      : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
stepping        : 9
microcode       : 0x1b
cpu MHz         : 1670.117
cache size      : 8192 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt
bogomips        : 6784.88
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

Regarding the issue reported in comment 14, I can reproduce it.

I've done a fair amount of analysis in the upstream thread <http://thread.gmane.org/gmane.comp.bios.edk2.devel/12864>, but I'm not finished yet. I've identified the lower-level symptoms of the bug, but I cannot explain why the symptoms exist.

I also cannot tie the bug to any specific commit. I did bisect edk2 and I identified two commits: one that "introduces" the bug (and "causes" the issue in comment 14), and another that "fixes" the bug (makes it disappear, present in upstream only). However, those commits are not actually related to the symptoms; they just cause some perturbation in timing, or in something else that provokes / masks the symptoms.
(For example, reverting the supposed "fix" on top of edk2 master doesn't make the symptoms reappear.)

If we rebased OVMF to current upstream edk2 (2f7b34b20842), then the symptoms would go away (they don't reproduce with upstream), but I wouldn't be able to tell what caused them originally, and what "fixed" them.

At the moment I can't even identify the culprit component (KVM, QEMU, OVMF) with certainty. Changing QEMU and KVM versions doesn't seem to make a difference (the symptoms reproduce with TCG even), while edk2 has specific tree *states* (not individual patches!) that trigger and mask the problem. This implies OVMF is the most likely culprit, but I've got no proof.

I'll leave the NEEDINFO in place for a bit longer.

I found the bug (with some gentle hints from Paolo :)), and I have a fix for it as well (it is a tunables bug in upstream too).

The bug is independent from this RHBZ; it's just that the testing of this RHBZ triggered it. I will file a new RHBZ (for OVMF) about it, and make it block this RHBZ. Once that RHBZ is fixed, QE can return to testing this RHBZ as well.

For now I am moving this RHBZ back to ASSIGNED. Once we have a new build, we can immediately move this RHBZ to MODIFIED (no additional patches needed here).

Thank you for the report, Zhiyi; you uncovered a nasty bug (SMM stack overflow).

Per https://bugzilla.redhat.com/show_bug.cgi?id=1308678#c12, i440fx is not supported with OVMF, so I only tested steps (07), (08) and (09).

Packages used:
kernel-3.10.0-433.el7.x86_64
OVMF-20160608-1.git988715a.el7.noarch
qemu-kvm-rhev-2.6.0-5.el7.x86_64

Install Fedora with the following qemu command line, as in step (08):
/usr/libexec/qemu-kvm -name fedora22-3 \
-machine q35,accel=kvm,usb=off,vmport=off,smm=on \
-cpu IvyBridge \
-drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0,readonly=on \
-drive file=/usr/share/OVMF/OVMF_VARS.fd,if=pflash,format=raw,unit=1 \
-m 2048 -realtime mlock=off \
-smp 1,sockets=1,cores=1,threads=1 -uuid aa8e03b1-8953-450f-b63d-c1064b9e442d \
-nodefaults -rtc base=utc,driftfix=slew \
-boot strict=on \
-global driver=cfi.pflash01,property=secure,value=on \
-global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
-device virtio-scsi-pci,id=scsi0 -drive file=/home/Fedora-Live-Workstation-x86_64-23-10.iso,if=none,id=drive-scsi0-0-0-0,readonly=on,format=raw -device scsi-cd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
-device virtio-serial-pci,id=virtio-serial0 -drive file=/home/Fedora-q35-secboot-2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 \
-spice port=5900,addr=10.66.11.59,disable-ticketing,image-compression=off,seamless-migration=on \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16 -monitor stdio \
-global isa-debugcon.iobase=0x402 -debugcon file:/home/ovmf.log

Change the ISO from /home/Fedora-Live-Workstation-x86_64-23-10.iso to /usr/share/OVMF/UefiShell.iso, launch the Fedora guest, then follow step (09). After step (09), the Fedora guest launches successfully, and dmesg reports:
[root@dhcp-11-67 ~]#dmesg | grep -i secure
[ 0.000000] Secure boot enabled

After 5 trials, all tests pass.

Hi Laszlo,
I retested steps (07), (08) and (09), and they all pass.
As we discussed, this bug can be used to verify 1202819, 1271404, 1202822 and 1341733. Could you check my test result in https://bugzilla.redhat.com/show_bug.cgi?id=1308678#c21 and comment on whether we can verify these bugs?
BR/
Guo, Zhiyi

(1) Thanks for the testing, it looks good, except for one part:

    -drive file=/usr/share/OVMF/OVMF_VARS.fd,if=pflash,format=raw,unit=1

This option is incorrect. (And I think it makes sense to emphasize its incorrectness at this point, because I assume QE will continue testing OVMF with direct qemu-kvm invocations, and this issue should be clarified once and for all.)

Namely, the unit=1 pflash drive is a read-write one. It is not read-only. This pflash drive contains the UEFI variable store for the virtual machine. It is a host side file that is private to the specific virtual machine (just as writeable hard disk images), and must not be shared between virtual machines.

So, what libvirt does is, whenever you create a new OVMF virtual machine, it creates a *copy* of "/usr/share/OVMF/OVMF_VARS.fd", when the VM is launched for the very first time. That copy is stored, on the host, under "/var/lib/libvirt/qemu/nvram/". And when the guest actually runs, the command line option is something like (for a guest called "ovmf.fedora"):

    -drive file=/var/lib/libvirt/qemu/nvram/ovmf.fedora.q35_VARS.fd,if=pflash,format=raw,unit=1

The point is, if you run an OVMF virtual machine directly with qemu-kvm, then this initial copying step falls on you. You must first create a copy of OVMF_VARS.fd for the given virtual machine, and then in all further invocations of QEMU, for the same virtual machine, you must pass that *copy* as the unit=1 pflash drive.

In my own QEMU test scripts, I have logic like this:

    # if the varstore does not exist for this VM yet, then copy it first from
    # the OVMF_VARS.fd template file
    if ! test -e my_varstore.fd; then
      cp -v /usr/share/OVMF/OVMF_VARS.fd my_varstore.fd
    fi

    # use the VM's own private varstore
    /usr/libexec/qemu-kvm ... \
      -drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0,readonly=on \
      -drive file=my_varstore.fd,if=pflash,format=raw,unit=1 \
      ...

Thus,
- OVMF_CODE.secboot.fd is read-only and shared between VMs,
- OVMF_VARS.fd is also read-only, but never accessed by VMs. It is only used as a template file, by libvirt and by the human tester, for creating the VM's own, actual, private varstore file, before the first run.

If you pass the OVMF_VARS.fd template file directly to QEMU, as the unit=1 pflash drive, then two things can happen:
- if you run qemu-kvm as an unprivileged user, then the file will be mapped read-only, and UEFI variable services will be broken in the guest, in obscure ways
- if you run qemu-kvm as root, then the /usr/share/OVMF/OVMF_VARS.fd template file will be overwritten, which breaks packaging (--> it modifies a system-wide file, you can check it with "rpm --verify OVMF"). In order to recover from this, you have to uninstall and reinstall the OVMF package.

(The second could never happen with libvirtd: not only does libvirtd create the private varstore from the template, it also changes to UID "qemu", which has no permission to overwrite files under /usr/share/OVMF.)

(2) Another thing I've noticed is the bootindex / CD changes, between installation and installed startup. It is actually not necessary to change ISOs and bootindexes at all. For example, with the following scheme:

- system disk: bootindex=0
- UefiShell.iso: bootindex=1
- Fedora LiveCD: bootindex=2

you can test everything necessary, without touching the QEMU command line at all.

Namely, at the very first boot, the system disk will be empty, so it cannot be booted, and OVMF will skip over it. UefiShell.iso will be booted, and you will end up at the UEFI shell prompt. This is when you can run "EnrollDefaultKeys.efi", and the "reset -s" (or "reset -c", for rebooting) commands.

At the second boot, the system disk is still empty, so it will be skipped.
UefiShell.iso will also be skipped, because the UEFI shell binary is not signed, and we enabled secure boot in the previous step. So, the third option, Fedora LiveCD, will be booted (because that one is signed). This is when you install the guest to the system disk (already running in secure boot mode), and after installation, reboot the guest.

At the third boot, the system disk is populated, and the boot loader on it (shim.efi) is signed as well. So the installed Fedora OS will be booted (as option #0), and you can verify it is running in secure boot mode (dmesg).

---------------

Summary:

(1) Whenever running OVMF guests directly with /usr/libexec/qemu-kvm, please take care to copy the varstore template to a private varstore file first, and then pass the *copy* to the VM, as unit=1 pflash.

(2) It is possible to insert both the UefiShell.iso and the Fedora Live CDs at the same time (using two virtio-scsi CD-ROMs, for example), and employ bootindices such that you never have to modify the QEMU command line.

Remark (2) is just for your convenience, but remark (1) is important. I would like to request that you repeat the test, with (1) addressed. Then it will be fine to mark all of: bug 1202819, bug 1271404, bug 1202822, and bug 1341733 as VERIFIED. (Plus this one too.)

Before retesting, please be sure to reinstall the pristine OVMF package, because on your host, "/usr/share/OVMF/OVMF_VARS.fd" may have been overwritten already.

Thanks!

Furthermore, please upload the ovmf debug log when your testing is complete, so that I can sanity-check a few things in it. Thank you!

Created attachment 1168166
ovmf log retest

Packages used:
kernel-3.10.0-433.el7.x86_64
OVMF-20160608-1.git988715a.el7.noarch
qemu-kvm-rhev-2.6.0-5.el7.x86_64

Before reinstalling OVMF-20160608-1.git988715a.el7.noarch, I executed:
[root@dhcp-11-59 ovmf]# rpm --verify OVMF
..5....T.    /usr/share/OVMF/OVMF_VARS.fd
.....UG..    /usr/share/OVMF/UefiShell.iso

After reinstalling, rpm --verify prints nothing, so the OVMF package is clean.

qemu command line used:
/usr/libexec/qemu-kvm -name fedora22-3 \
-machine q35,accel=kvm,usb=off,vmport=off,smm=on \
-cpu IvyBridge \
-drive file=/usr/share/OVMF/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0,readonly=on \
-drive file=/home/zhguo/fedora_VARS.fd,if=pflash,format=raw,unit=1 \
-m 2048 -realtime mlock=off \
-smp 1,sockets=1,cores=1,threads=1 -uuid aa8e03b1-8953-450f-b63d-c1064b9e442d \
-nodefaults -rtc base=utc,driftfix=slew \
-boot strict=on \
-global driver=cfi.pflash01,property=secure,value=on \
-global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
-device virtio-serial-pci,id=virtio-serial0 -drive file=/home/zhguo/Fedora-q35-secboot.qcow2,if=none,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=0 \
-device virtio-scsi-pci,id=scsi0 -drive file=/usr/share/OVMF/UefiShell.iso,if=none,id=drive-scsi0-0-0-0,readonly=on,format=raw -device scsi-cd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
-device virtio-scsi-pci,id=scsi1 -drive file=/home/zhguo/Fedora-Live-Workstation-x86_64-23-10.iso,if=none,id=drive-scsi0-0-0-1,readonly=on,format=raw -device scsi-cd,bus=scsi0.0,channel=0,scsi-id=0,lun=1,drive=drive-scsi0-0-0-1,id=scsi0-0-0-1,bootindex=2 \
-spice port=5900,addr=10.66.11.59,disable-ticketing,image-compression=off,seamless-migration=on \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16 -monitor stdio \
-global isa-debugcon.iobase=0x402 -debugcon file:/home/zhguo/ovmf.log

I ran this as an unprivileged user and followed the steps of (2) from https://bugzilla.redhat.com/show_bug.cgi?id=1308678#c23; every boot behaved as explained there.
After the third boot, I logged in; output from dmesg:
[root@dhcp-11-67 ~]#dmesg | grep -i secure
[ 0.000000] Secure boot enabled

The ovmf log is attached for checking.

Hi Laszlo,
Here is the latest result, please check!
BR/
Guo Zhiyi

Hi Guo Zhiyi,

thanks for retesting, everything looks fine! Please go ahead with the planned status changes for the bugzillas in question.

Thanks!
Laszlo

Verified; SB works normally.

Verified versions:
Kernel version: 3.10.0-504.el7.x86_64
qemu-kvm-rhev version: qemu-kvm-rhev-2.6.0-22.el7.x86_64
OVMF version: OVMF-20160608-3.git988715a.el7.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2608.html
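
For direct qemu-kvm runs, the varstore advice from comment 23 can be wrapped into a small helper. This is a minimal sketch along the lines of the script quoted there, not a packaged tool: `ensure_varstore` is a hypothetical name, and the extra guard against passing the template itself is an addition for illustration. It compares path strings only, so a differently spelled path to the same file would slip through.

```shell
#!/bin/sh
# ensure_varstore TEMPLATE VARSTORE
# Hypothetical helper: makes sure VARSTORE exists as a private copy of
# TEMPLATE before QEMU is started. Never overwrites an existing VARSTORE.
ensure_varstore() {
    template=$1
    varstore=$2

    # Refuse to hand the shared template itself to a VM (string comparison
    # only; this does not resolve symlinks or relative paths).
    if [ "$template" = "$varstore" ]; then
        echo "error: the varstore must be a private copy, not the template" >&2
        return 1
    fi

    # Copy only on first use, so that later boots keep the enrolled keys and
    # boot options stored in the VM's own varstore.
    if ! [ -e "$varstore" ]; then
        cp -- "$template" "$varstore" || return 1
    fi
    return 0
}

# Usage sketch (the per-VM path is an example):
#   ensure_varstore /usr/share/OVMF/OVMF_VARS.fd "$HOME/fedora_VARS.fd" &&
#   /usr/libexec/qemu-kvm ... \
#     -drive file="$HOME/fedora_VARS.fd",if=pflash,format=raw,unit=1 ...
```

Keeping the copy step idempotent means the same launch script can be used for the first boot and for every later boot of the same VM.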