Bug 1382641 - Cannot generate BSOD crash dump with secure boot enabled
Summary: Cannot generate BSOD crash dump with secure boot enabled
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: virtio-win
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Ladi Prosek
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-10-07 09:47 UTC by Ladi Prosek
Modified: 2017-04-29 17:23 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-15 10:36:00 UTC
Target Upstream Version:
Embargoed:



Description Ladi Prosek 2016-10-07 09:47:20 UTC
Description of problem:
Windows guests running on OVMF firmware with secure boot enabled fail to generate crash dumps if the system drive uses a virtio-blk or virtio-scsi controller.

How reproducible:
100% deterministic for me

Windows 10 Anniversary Edition (1607) 64-bit
virtio-win-1.9.0-3.el7

Steps to Reproduce:
1. Install Win10 in an OVMF VM

QEMU command line:
qemu-system-x86_64 \
-name Windows10 \
-machine q35,accel=kvm,usb=off,vmport=off \
-cpu Broadwell -m 2048 -realtime mlock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid ddc56ddc-a3c3-4c9f-b1d5-264fc8231691 \
-no-user-config -nodefaults \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=discard \
-no-hpet -no-shutdown \
-boot strict=on,menu=on \
-device ich9-usb-ehci1,id=usb \
-device ich9-usb-uhci1,masterbus=usb.0,firstport=0 \
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2 \
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4 \
-device ich9-ahci,id=sata0 \
-device virtio-serial-pci,id=virtio-serial0 \
-drive file=/home/lprosek/vm/1607.qcow2,if=none,id=drive-virtio0-0-0,format=qcow2 \
-device virtio-scsi-pci,id=scsi0 \
-device scsi-hd,bus=scsi0.0,drive=drive-virtio0-0-0,id=scsi2 \
-drive if=none,id=drive-sata0-0-0,readonly=on \
-drive file=/usr/share/virtio-win/virtio-win.iso,if=none,id=drive-sata0-0-1,readonly=on,format=raw \
-drive file=/usr/share/edk2/ovmf/UefiShell.iso,if=none,id=drive-sata0-0-2,readonly=on,format=raw \
-device ide-cd,bus=sata0.0,drive=drive-sata0-0-0,id=sata0-0-0,bootindex=2 \
-device ide-cd,bus=sata0.1,drive=drive-sata0-0-1,id=sata0-0-1 \
-device ide-cd,bus=sata0.2,drive=drive-sata0-0-2,id=sata0-0-2 \
-netdev user,id=hostnet0 \
-device e1000,netdev=hostnet0,id=net0,mac=52:54:00:f4:20:99 \
-chardev pty,id=charserial0 \
-device isa-serial,chardev=charserial0,id=serial0 \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16 \
-chardev spicevmc,id=charredir0,name=usbredir \
-chardev spicevmc,id=charredir1,name=usbredir \
-device virtio-balloon-pci,id=balloon0 \
-msg timestamp=on \
-drive unit=0,if=pflash,format=raw,readonly,file=/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd \
-drive unit=1,if=pflash,format=raw,file=/home/theuser/OVMF_VARS.fd

2. Enable secure boot

Boot into UefiShell (UefiShell.iso is inserted in a virtual CD) and run:
EnrollDefaultKeys.efi
reset -s

Then enable secure boot in OVMF setup program:
Device Manager -> Secure Boot Configuration

3. Download and run the NotMyFault tool to trigger a BSOD
https://live.sysinternals.com/notmyfault64.exe


Actual results:
Crash dump is not generated. "The system could not successfully load the crash dump driver." is written to the Windows event log.

Expected results:
Crash dump is generated.

Comment 1 Ladi Prosek 2016-10-07 09:51:20 UTC
This issue is hard to debug because secure boot and live kernel debugging are mutually exclusive. It is also not possible to easily use a debug/instrumented vioscsi/viostor driver as secure boot only allows Microsoft-signed drivers on 1607.

Comment 3 Laszlo Ersek 2016-10-07 16:13:49 UTC
(In reply to Ladi Prosek from comment #0)

> 2. Enable secure boot
> 
> Boot into UefiShell (UefiShell.iso is inserted in a virtual CD) and run:
> EnrollDefaultKeys.efi
> reset -s
> 
> Then enable secure boot in OVMF setup program:
> Device Manager -> Secure Boot Configuration

Running "EnrollDefaultKeys.efi" and then rebooting the platform are sufficient. "Device Manager -> Secure Boot Configuration" is only necessary if you'd like to disable Secure Boot after it's been enabled, or you'd like to manually enroll or delete some certificates.
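
To confirm from inside the guest that secure boot is actually active after the reboot, something like the following could be used. This is only a sketch: it assumes the Windows guest has Python installed and that the state is reported in the UEFISecureBootEnabled registry value under HKLM\SYSTEM\CurrentControlSet\Control\SecureBoot\State. The function names and the injectable reader parameter are made up for illustration.

```python
"""Check whether UEFI Secure Boot is reported as enabled inside a Windows guest."""

# Assumed registry location where Windows reports the Secure Boot state.
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\SecureBoot\State"
VALUE_NAME = "UEFISecureBootEnabled"


def read_registry_dword(key_path, value_name):
    """Read a value from HKEY_LOCAL_MACHINE (only works on Windows)."""
    import winreg  # imported lazily so the module still loads on other platforms

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        value, _value_type = winreg.QueryValueEx(key, value_name)
    return value


def secure_boot_enabled(reader=read_registry_dword):
    """Return True or False for the reported state, or None if it cannot be read."""
    try:
        return reader(KEY_PATH, VALUE_NAME) == 1
    except (OSError, ImportError):
        # Key or value missing (e.g. a non-UEFI guest), or not running on Windows.
        return None


if __name__ == "__main__":
    print("Secure boot enabled:", secure_boot_enabled())
```

A result of None means the value could not be read at all, which is different from secure boot being reported as disabled.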

Thanks!
Laszlo

Comment 4 Ladi Prosek 2016-10-07 16:23:31 UTC
(In reply to Laszlo Ersek from comment #3)
> (In reply to Ladi Prosek from comment #0)
> 
> > 2. Enable secure boot
> > 
> > Boot into UefiShell (UefiShell.iso is inserted in a virtual CD) and run:
> > EnrollDefaultKeys.efi
> > reset -s
> > 
> > Then enable secure boot in OVMF setup program:
> > Device Manager -> Secure Boot Configuration
> 
> Running "EnrollDefaultKeys.efi" and then rebooting the platform are
> sufficient. "Device Manager -> Secure Boot Configuration" is only necessary
> if you'd like to disable Secure Boot after it's been enabled, or you'd like
> to manually enroll or delete some certificates.

Thanks! I should really get into the habit of writing these steps down as I do them instead of trying to remember what I did last week.

Comment 5 Ladi Prosek 2017-02-14 15:21:15 UTC
It seems to work fine on Windows 10 1511 upgraded to 1607. This would suggest that it is related to the new driver signing requirements in clean-installed 1607 with secure boot enabled (ref: bug 1376048).

Laszlo, given the state of support of OVMF in RHEL, would you be fine with moving this to 7.5? Thanks!

Comment 11 Ladi Prosek 2017-02-15 09:54:41 UTC
I have retested three builds of Windows 10 64-bit
1607 14393.0
1607 14393.187
1607 14393.447

with OVMF and secure boot enabled and cannot reproduce this anymore.

I had trouble installing the right viostor driver on the VM that I believe exhibited this bug back in October 2016. Only after I removed all prior versions from the driver store (Device Uninstall with "Delete the driver software for this device" checked) did the driver successfully load. So I would speculate that this was also causing the issue I saw earlier. The storage driver needs to be reloaded on BSOD and perhaps Windows wasn't finding the correct one for some reason.
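
For spotting leftover storage driver packages in the driver store, a rough sketch along these lines may help. It assumes the text comes from "pnputil /enum-drivers" inside the guest and that the output uses "Published Name", "Original Name" and "Signer Name" fields separated by blank lines. The sample input below is made up for illustration and was not captured from the VM in this bug; the function names are hypothetical too.

```python
"""List virtio storage driver packages and their signers from pnputil-style output."""

# Original INF names of the virtio-win storage drivers discussed in this bug.
STORAGE_INFS = {"viostor.inf", "vioscsi.inf"}


def parse_driver_packages(text):
    """Split the output into one dict per driver package (blank line = separator)."""
    packages, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current package record, if any.
            if current:
                packages.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        packages.append(current)
    return packages


def storage_driver_signers(text):
    """Return (published name, original name, signer) for the storage packages."""
    return [
        (p.get("Published Name"), p.get("Original Name"), p.get("Signer Name"))
        for p in parse_driver_packages(text)
        if p.get("Original Name", "").lower() in STORAGE_INFS
    ]


# Illustrative input only: invented package names, assumed field layout.
SAMPLE = """\
Published Name:     oem3.inf
Original Name:      viostor.inf
Provider Name:      Red Hat, Inc.
Signer Name:        Microsoft Windows Hardware Compatibility Publisher

Published Name:     oem7.inf
Original Name:      viostor.inf
Provider Name:      Red Hat, Inc.
Signer Name:        Red Hat, Inc.

Published Name:     oem9.inf
Original Name:      netkvm.inf
Provider Name:      Red Hat, Inc.
Signer Name:        Red Hat, Inc.
"""

if __name__ == "__main__":
    for published, original, signer in storage_driver_signers(SAMPLE):
        print(f"{published}  {original}  signed by: {signer}")
```

Two viostor.inf packages with different signers in the store would match the situation described above, where an older package had to be deleted before the intended driver loaded.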

Comment 12 Ladi Prosek 2017-02-15 10:36:00 UTC
OK, mystery solved. Here's how it can be reproduced:

1) Install Windows with a Microsoft signed (WHQL) storage driver
2) In Device Manager update the driver to a Red Hat signed one
3) Reboot
4) Trigger BSOD

Apparently the new signing policy has an exception for boot drivers: Windows boots even with a cross-signed driver. As far as I can tell, this exception is undocumented. It is not implemented on the BSOD code path though, so the driver fails to load there.

Nothing for us to do here, closing.

Comment 21 lijin 2017-04-14 07:48:07 UTC
(In reply to Ladi Prosek from comment #1)
> This issue is hard to debug because secure boot and live kernel debugging
> are mutually exclusive. It is also not possible to easily use a
> debug/instrumented vioscsi/viostor driver as secure boot only allows
> Microsoft-signed drivers on 1607.

Hi Ladi,

May I confirm with you the driver signature requirement when secure boot is enabled?

Is a Microsoft-signed driver a must when secure boot is enabled?

I ran Windows 2016 under q35/OVMF recently and found that the Red Hat signed netkvm and balloon drivers cannot be loaded correctly with secure boot enabled, while the blk and scsi drivers load fine.

Is this a driver issue, or is it by design?

Comment 22 Ladi Prosek 2017-04-18 07:07:25 UTC
(In reply to lijin from comment #21)
> (In reply to Ladi Prosek from comment #1)
> > This issue is hard to debug because secure boot and live kernel debugging
> > are mutually exclusive. It is also not possible to easily use a
> > debug/instrumented vioscsi/viostor driver as secure boot only allows
> > Microsoft-signed drivers on 1607.

Hi lijin,

> Hi Ladi,
> 
> May I confirm with you the driver signature requirement when secure boot is
> enabled?
> 
> Is a Microsoft-signed driver a must when secure boot is enabled?

Yes, that's the official message from Microsoft. More on this in bug 1376048.

> I ran Windows 2016 under q35/OVMF recently and found that the Red Hat signed
> netkvm and balloon drivers cannot be loaded correctly with secure boot
> enabled, while the blk and scsi drivers load fine.

That's in line with the observation made in this bug. Boot-critical drivers seem to load fine with secure boot even without a Microsoft signature. As far as I know this fact is undocumented.

> Is this a driver issue, or is it by design?

Seems to be either a bug in the OS or a by-design behavior of the OS. It is unlikely that a driver issue would be causing it to load when it shouldn't.
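
The behaviour observed across this bug can be summarized as a small toy model. This is not how Windows implements its signing policy; it only encodes what was seen on clean-installed 1607 and Windows 2016 guests with secure boot enabled, and all names in it are invented for illustration.

```python
"""Toy model of the driver loading behaviour observed in this bug (not Windows code)."""

from enum import Enum


class Signer(Enum):
    MICROSOFT = "Microsoft-signed (WHQL)"
    CROSS_SIGNED = "cross-signed, e.g. Red Hat signed"


class LoadPath(Enum):
    BOOT = "boot-critical driver loaded at boot"
    RUNTIME = "non-boot driver such as netkvm or balloon"
    CRASH_DUMP = "storage driver reloaded on BSOD"


def observed_to_load(signer, path):
    """Return True if a driver was observed to load in this situation."""
    if signer is Signer.MICROSOFT:
        # Microsoft-signed drivers loaded everywhere they were tried.
        return True
    # The apparently undocumented exception: cross-signed drivers load only
    # when they are boot-critical and are being loaded on the boot path.
    return path is LoadPath.BOOT


if __name__ == "__main__":
    for signer in Signer:
        for path in LoadPath:
            result = "loads" if observed_to_load(signer, path) else "fails to load"
            print(f"{signer.value}, {path.value}: {result}")
```

The cross-signed crash dump case is the failure originally reported here: the system boots from the cross-signed storage driver, but the same driver is rejected when it is reloaded for the dump.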

