Bug 1496170

Summary: Inconsistent MOR control variables exposed by OVMF, breaks Windows Device Guard
Product: Red Hat Enterprise Linux 7
Component: ovmf
Version: 7.4
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Reporter: Ladi Prosek <lprosek>
Assignee: Laszlo Ersek <lersek>
QA Contact: FuXiangChun <xfu>
CC: ailan, areis, chayang, jinzhao, juzhang, michen, mrezanin, vrozenfe, xfu
Fixed In Version: ovmf-20171011-1.git92d07e48907f.el7
Last Closed: 2018-04-10 16:28:00 UTC
Type: Bug
Bug Depends On: 1469787, 1488203, 1544696
Attachments:
Description                                            Flags
DG_Readiness fail                                      none
fix screenshot-1                                       none
fix screenshot-2                                       none
Expected output of DG_Readiness_Tool_v3.2.ps1 -Ready   none
Capable-HVCI                                           none
msinfo32                                               none

Description Ladi Prosek 2017-09-26 15:06:18 UTC
Description of problem:
With virtualization-based security enabled, Windows will try to use the Secure MOR feature for "platform reset attack mitigation":

https://docs.microsoft.com/en-us/windows-hardware/drivers/bringup/device-guard-requirements

https://www.trustedcomputinggroup.org/wp-content/uploads/Platform-Reset-Attack-Mitigation-Specification.pdf

Here's the relevant code in winload.efi:

winload!BlTcgFwSetAndLockMemoryOverwriteRequestControl:
[...]
00000001`40071d91 lea     r12,[winload!`string' (00000001`4012b890)]
                          ^ "MemoryOverwriteRequestControlLock"
00000001`40071d98 mov     rdx,r15
00000001`40071d9b mov     rcx,r12
00000001`40071d9e lea     r9,[rbp+48h]
00000001`40071da2 lea     r8,[rbp+40h]

00000001`40071da6 call    winload!EfiGetVariable
00000001`40071dab cmp     eax,0C0000023h
00000001`40071db0 je      00000001`40071ed1 ; not taken
00000001`40071db6 test    eax,eax
00000001`40071db8 js      00000001`40071dc7 ; not taken
00000001`40071dba cmp     qword ptr [rbp+48h],1
00000001`40071dbf jne     00000001`40071ed1 ; not taken
00000001`40071dc5 test    eax,eax

00000001`40071dc7 setns   bl ; bl <- 1
00000001`40071dca mov     dl,bl
00000001`40071dcc call    winload!BlTcgFwSetMemoryOverwriteRequestBit
00000001`40071dd1 test    eax,eax
00000001`40071dd3 jns     00000001`40071dde ; not taken
00000001`40071dd5 test    bl,bl
00000001`40071dd7 je      00000001`40071de2 ; not taken
00000001`40071dd9 jmp     00000001`40071ed6
[...]
00000001`40071de2 xor     eax,eax ; success
[...]
00000001`40071ed6 add     rsp,30h
[epilog]


winload!EfiGetVariable returns NTSTATUS (sign bit set on failure). BlTcgFwSetMemoryOverwriteRequestBit accesses the "MemoryOverwriteRequestControl" variable and also returns NTSTATUS.

OVMF exposes MemoryOverwriteRequestControlLock but, unless SecurityPkg/Tcg/MemoryOverwriteControl/TcgMor.inf is enabled, not MemoryOverwriteRequestControl. Windows therefore succeeds in reading MemoryOverwriteRequestControlLock at 00000001`40071da6 above and records this fact in bl. BlTcgFwSetMemoryOverwriteRequestBit then fails because MemoryOverwriteRequestControl is missing, and because bl is 1 the failure is returned to the caller instead of being ignored. This bubbles up and leads to a "The virtualization-based security enablement policy check at phase 0 failed with status: The object was not found." error message in the System event log.

Had MemoryOverwriteRequestControlLock not been found and the call at 00000001`40071da6 failed, the branch at 00000001`40071dd7 would have been taken and everything would be fine (i.e. oh well, the platform does not support MOR, never mind and carry on).

This has been confirmed by rebuilding OVMF with:

--- a/MdeModulePkg/Universal/Variable/RuntimeDxe/TcgMorLockSmm.c
+++ b/MdeModulePkg/Universal/Variable/RuntimeDxe/TcgMorLockSmm.c
@@ -390,5 +390,6 @@ MorLockInit (
   //
   // Set variable to report capability to OS
   //
-  return SetMorLockVariable (0);
+  return EFI_SUCCESS;
 }

and verified that Windows then enables virtualization-based security.
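
For readers who prefer not to follow the disassembly, the branch structure described above can be modeled in a few lines of Python. This is only a sketch under stated assumptions, not the winload code: the function and helper names are hypothetical, a missing variable is assumed to yield STATUS_NOT_FOUND (0xC0000225, which matches the "The object was not found." text), the path at 00000001`40071dde is elided in the listing and is simply treated as success, and the buffer-size branches to 00000001`40071ed1 are left out.

```python
# Simplified model of the control flow in the disassembly above.
# All names are hypothetical; only the branch structure comes from the listing.

STATUS_SUCCESS = 0x00000000
STATUS_NOT_FOUND = 0xC0000225  # assumed status for a missing variable

MOR = "MemoryOverwriteRequestControl"
MOR_LOCK = "MemoryOverwriteRequestControlLock"


def failed(status):
    """NTSTATUS failure test: the sign bit (bit 31) is set."""
    return bool(status & 0x80000000)


def get_variable(firmware_vars, name):
    """Stand-in for EfiGetVariable: succeeds only if the variable exists."""
    return STATUS_SUCCESS if name in firmware_vars else STATUS_NOT_FOUND


def set_and_lock_mor(firmware_vars):
    """Model of BlTcgFwSetAndLockMemoryOverwriteRequestControl."""
    lock_status = get_variable(firmware_vars, MOR_LOCK)  # call at ...1da6
    lock_present = not failed(lock_status)               # setns bl

    # Stand-in for BlTcgFwSetMemoryOverwriteRequestBit (call at ...1dcc).
    mor_status = get_variable(firmware_vars, MOR)
    if not failed(mor_status):                           # jns ...1dde
        return STATUS_SUCCESS  # remainder of this path is elided in the listing
    if not lock_present:                                 # je ...1de2
        return STATUS_SUCCESS  # no MOR support at all: carry on
    return mor_status                                    # jmp ...1ed6
```

In this model, a variable store containing only MemoryOverwriteRequestControlLock returns the failing status, while an empty store returns success, which is the asymmetry the report is about.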


Version-Release number of selected component (if applicable):
ovmf-20170228-5.gitc325e41585e3.el7


How reproducible:
100%

Steps to Reproduce:
1. Install Windows Server 2016 or Windows 10 64-bit Enterprise (note that the edition is important here) on an OVMF-based VM.
2. Enable secure boot.
3. Run the Device Guard and Credential Guard hardware readiness tool to enable DG
   https://www.microsoft.com/en-us/download/details.aspx?id=53337
4. Reboot as instructed.

Actual results:
Run msinfo32.exe and see that "Device Guard Virtualization based security" is marked as enabled but not running in "System Summary". Go to Event Viewer and find the following messages in Windows Logs -> System:

"The virtualization-based security enablement policy check at phase 0 failed with status: The object was not found."
"Credential Guard (LsaIso.exe) is configured but the secure kernel is not running; continuing without Credential Guard."


Expected results:
"Device Guard Virtualization based security" is marked as running. No kernel errors in event log. Readiness tool run with -Ready is all green.

Additional info:
The ask here is to either enable the MOR module in OVMF builds or make sure that reading MemoryOverwriteRequestControlLock *without* the MOR module compiled in returns EFI_NOT_FOUND as expected.

Comment 2 Laszlo Ersek 2017-09-26 17:07:25 UTC
I remember reading (and writing) about MemoryOverwriteRequestControl. I've
found the following references now:

https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg04546.html

> (2f) Modules that we *could* use, but *should not*, at this point:
> 
>        MemoryOverwriteControl/TcgMor.inf
> 
>      MOR is "Memory Overwrite Request". It is a feature specified
>      separately, in another TCG specification ("Platform Reset Attack
>      Mitigation"), and it is optional for a firmware platform to
>      support. (For example, as far as I can see, Linux doesn't even try
>      to detect or use it.) If you care about the threat model and how
>      MOR mitigates that threat, please read the spec on the TCG website.
> 
>      For initial TPM enablement in OVMF, we should avoid MOR support.
>      The module above initializes the "MemoryOverwriteRequestControl"
>      variable, which is one third of the MOR implementation.

The following Intel whitepaper also discusses it (I think I've only read v1
of this paper so far, and MOR first appeared in v2):

https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-white-papers
https://github.com/tianocore-docs/Docs/raw/master/White_Papers/A_Tour_Beyond_BIOS_Implementing_UEFI_Authenticated_Variables_in_SMM_with_EDKII_V2.pdf

I'll have to read up on this material first. Thanks!

Comment 3 Ladi Prosek 2017-09-27 13:30:22 UTC
(In reply to Laszlo Ersek from comment #2)
> [...]
> I'll have to read up on this material first. Thanks!

Thank you! This is by no means urgent, I have unblocked myself by rebuilding OVMF with the hack in comment 0.

Also, feel free to assign this back to me with high-level instructions -- I'll be happy to write a patch and/or test it.

Comment 4 Laszlo Ersek 2017-09-28 09:58:32 UTC
Here's my understanding:

(1) The "MemoryOverwriteRequestControl" variable comes from the "TCG Platform Reset Attack Mitigation Specification".

(2) The "MemoryOverwriteRequestControlLock" variable is a Microsoft-only addition, from <https://docs.microsoft.com/en-us/windows-hardware/drivers/bringup/device-guard-requirements>. (Thanks for the link in comment 1.)

(3) From staring at the disassembly, it looks like the code intends to cope with the following cases:

- "MemoryOverwriteRequestControl" present, "MemoryOverwriteRequestControlLock"
  present (set the former and lock it with the latter)

- "MemoryOverwriteRequestControl" present, "MemoryOverwriteRequestControlLock"
  missing (I guess in this case it still sets the former but doesn't lock it
  via the latter)

- "MemoryOverwriteRequestControl" missing, "MemoryOverwriteRequestControlLock"
  missing (no support for MOR at all)

It fails when the variables are inconsistent, namely "MemoryOverwriteRequestControl" is missing, but "MemoryOverwriteRequestControlLock" is present.

I agree this is invalid, in particular because Microsoft (very correctly) introduced "MemoryOverwriteRequestControlLock" under a new variable namespace GUID (namely BB983CCF-151D-40E1-A07B-4A17BE168292), so they effectively *own* the variable. If platform firmware sets a variable that Microsoft specifies, then the variable should satisfy the constraints and other bits of that specification.

I think that the current edk2 code does not consider the three possible levels of support for these variables:
- no support for either of MemoryOverwriteRequestControl /
  MemoryOverwriteRequestControlLock
- support for MemoryOverwriteRequestControl only
- support for both.
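
To see which of these levels a given firmware build actually exposes, one option is to inspect the variable store from a Linux guest booted on the same firmware. The following Python sketch assumes that efivarfs is mounted at /sys/firmware/efi/efivars and that entries are named <VariableName>-<GUID>. The function name and the level labels are made up for illustration. The lock GUID is the one quoted above; the GUID used here for MemoryOverwriteRequestControl is an assumption recalled from the edk2 headers and should be checked against MdePkg before relying on it.

```python
import os

# Assumption: vendor GUID of MemoryOverwriteRequestControl; verify in MdePkg.
MOR_GUID = "e20939be-32d4-41be-a150-897f85d49829"
# GUID quoted in this report for MemoryOverwriteRequestControlLock.
MOR_LOCK_GUID = "bb983ccf-151d-40e1-a07b-4a17be168292"


def mor_support_level(efivars_dir="/sys/firmware/efi/efivars"):
    """Classify the MOR / MorLock variable pair found in an efivarfs mount."""
    # Compare case-insensitively so the GUID spelling does not matter.
    names = {name.lower() for name in os.listdir(efivars_dir)}
    mor = ("MemoryOverwriteRequestControl-" + MOR_GUID).lower() in names
    lock = ("MemoryOverwriteRequestControlLock-" + MOR_LOCK_GUID).lower() in names

    if mor and lock:
        return "both"
    if mor:
        return "mor-only"
    if lock:
        return "inconsistent"  # the combination reported in this bug
    return "none"
```

Calling mor_support_level() with no argument inspects the running system; passing a directory makes the check easy to try against a copy of the variable names.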

I'll start an upstream thread and CC you.

Comment 6 Laszlo Ersek 2017-10-03 21:29:52 UTC
Posted

[edk2] [PATCH 0/6] MdeModulePkg/VariableSmm: fix MOR / MorLock inconsistency
Message-Id: <20171003212834.25740-1-lersek>
https://lists.01.org/pipermail/edk2-devel/2017-October/015547.html

Comment 7 Laszlo Ersek 2017-10-10 10:13:46 UTC
Fixed in upstream commit range 35ac962b5473..fda8f631edbb.

Comment 9 Laszlo Ersek 2017-10-10 10:20:47 UTC
Ladi, can you please provide verification instructions for virt-QE?

(If you have a separate BZ for enabling Device Guard, and you've already written up the instructions under that BZ, please just leave a pointer.

... It might make sense to make that BZ dependent on this one, too, or else to cross-reference them at least, using the "See Also" field.)

Thanks!

Comment 11 Ladi Prosek 2017-10-19 13:13:54 UTC
(In reply to Laszlo Ersek from comment #9)
> Ladi, can you please provide verification instructions for virt-QE?

Apologies for the delay, I'll do it as soon as a RHEL kernel with the SMM fixes is available for testing. Thanks!

Comment 12 Ladi Prosek 2017-12-06 11:01:48 UTC
Bug 1488203 is now in MODIFIED state and kernel-3.10.0-809.el7.x86_64.rpm has the SMM fix.

I am providing verification instructions as promised.

1. Set up a host with:
kernel-3.10.0-809.el7
qemu-kvm-rhev-2.10.0-11.el7
ovmf-20171011-1.git92d07e48907f.el7

Make sure that the kvm-intel kernel module is loaded with nested=1

2. Download a 64-bit Win10 ISO from MSDN, one that has the Enterprise SKU. I used:
en_windows_10_multiple_editions_version_1703_updated_july_2017_x64_dvd_10925340.iso

3. Create a new virtual drive:
$ qemu-img create -f qcow2 win10.qcow2 20G

4. Run QEMU, modifying paths as necessary:
/usr/libexec/qemu-kvm \
-machine q35 \
-cpu host,hv-relaxed,hv_spinlocks=0x2000,hv_time,vmx,invtsc \
-drive id=drive_image1,if=none,cache=none,snapshot=off,aio=threads,format=qcow2,file=/home/win10.qcow2 \
-enable-kvm \
-m 4G \
-smp 2 \
-drive id=drive_cd1,if=none,cache=none,snapshot=off,aio=threads,media=cdrom,file=/home/en_windows_10_multiple_editions_version_1703_updated_july_2017_x64_dvd_10925340.iso \
-drive id=drive_cd2,if=none,cache=none,snapshot=off,aio=threads,media=cdrom,file=/usr/share/OVMF/UefiShell.iso \
-device ide-cd,id=cd1,drive=drive_cd1,bus=ide.0,unit=0 \
-device ide-cd,id=cd2,drive=drive_cd2,bus=ide.1,unit=0 \
-device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16 \
-no-hpet \
-monitor stdio \
-device e1000e,netdev=netdev1 \
-netdev user,id=netdev1 \
-global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 \
-device ide-hd,drive=drive_image1,id=virtio-disk0,bootindex=1 \
-drive unit=0,if=pflash,format=raw,readonly,file=/usr/share/OVMF/OVMF_CODE.secboot.fd \
-drive unit=1,if=pflash,format=raw,file=/usr/share/OVMF/OVMF_VARS.fd \
-vnc 0.0.0.0:1

5. Install certificates by executing the following in the UEFI shell:
Shell> FS0:
FS0:\> EnrollDefaultKeys.efi

6. Install Windows, make sure to install the Enterprise edition.

7. Download the Device Guard and Credential Guard hardware readiness tool from:
https://www.microsoft.com/en-us/download/details.aspx?id=53337
and unzip it in the guest.

8. In the guest open a PowerShell window and run:
> DG_Readiness_Tool_{version}.ps1 -Enable

9. Reboot.


To verify that Device Guard is really enabled use either:
> DG_Readiness_Tool_{version}.ps1 -Ready

or

> C:\Windows\System32\msinfo32.exe
(look for Device Guard in the System Summary screen)

With older OVMF without this fix you won't see Device Guard enabled and should be able to find an entry in the Event Viewer (Windows Logs -> System) similar to:
"The virtualization-based security enablement policy check at phase 0 failed with status: The object was not found."
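
Step 1 above depends on nested virtualization being enabled on the host. A host-side check could look like the following Python sketch. It assumes the usual sysfs location of the kvm_intel module parameter and accepts both "Y" and "1", since the representation differs between kernel versions. The function name is hypothetical.

```python
def nested_enabled(param_path="/sys/module/kvm_intel/parameters/nested"):
    """Return True if the kvm_intel 'nested' parameter reads as enabled."""
    try:
        with open(param_path) as f:
            value = f.read().strip()
    except OSError:
        # Module not loaded, or the parameter is not exposed on this host.
        return False
    return value in ("Y", "1")
```

If this returns False on an Intel host, reloading kvm_intel with nested=1 before starting the guest is the setting that step 1 asks for.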

Comment 13 Laszlo Ersek 2017-12-06 14:29:29 UTC
Great, thank you!

Comment 14 FuXiangChun 2017-12-13 06:06:36 UTC
Ladi,

Following comment 12, I can't load the script (DG_Readiness_Tool_{version}.ps1 -Enable) inside the guest. I attached a guest screenshot. Could you help me check it? Thanks!

Comment 15 FuXiangChun 2017-12-13 06:08:08 UTC
Created attachment 1367099
DG_Readiness fail

Comment 16 FuXiangChun 2017-12-13 06:09:24 UTC
In addition, I tested it with OVMF-20171011-4.git92d07e48907f.el7.noarch.

Comment 17 Ladi Prosek 2017-12-13 06:20:41 UTC
Hi,

(In reply to FuXiangChun from comment #14)
> Ladi,
> 
> Following comment 12, I can't load the script
> (DG_Readiness_Tool_{version}.ps1 -Enable) inside the guest. I attached
> a guest screenshot. Could you help me check it? Thanks!

Running

  Set-ExecutionPolicy -ExecutionPolicy Unrestricted

in the PowerShell window should fix it.

Comment 18 FuXiangChun 2017-12-14 12:28:15 UTC
Thanks Ladi,

Following comment 17, both Set-ExecutionPolicy -ExecutionPolicy Unrestricted and DG_Readiness_Tool_{version}.ps1 -Enable ran successfully. I added two screenshots as attachments. I will set this bug to VERIFIED. If that is wrong, please let me know.

Comment 19 FuXiangChun 2017-12-14 12:30:33 UTC
Created attachment 1367955
fix screenshot-1

Comment 20 FuXiangChun 2017-12-14 12:31:17 UTC
Created attachment 1367956
fix screenshot-2

Comment 21 Ladi Prosek 2017-12-14 14:20:43 UTC
(In reply to FuXiangChun from comment #20)
> Created attachment 1367956
> fix screenshot-2

"HVCI is not running" - that doesn't look right.

What do you see when you run msinfo32.exe (check the last ~10 lines of comment 12)? Thanks!

Comment 22 Ladi Prosek 2017-12-14 14:28:45 UTC
Created attachment 1368040
Expected output of DG_Readiness_Tool_v3.2.ps1 -Ready

Comment 23 FuXiangChun 2017-12-21 16:33:05 UTC
Ladi,

You are right, I also see "HVCI is not running". I tried running DG_Readiness_Tool_v3.2.ps1 -Capable -HVCI, but it still reports "HVCI is not running".

Do you have any ideas?

Comment 24 FuXiangChun 2017-12-21 16:34:24 UTC
Created attachment 1370959
Capable-HVCI

Comment 25 FuXiangChun 2017-12-21 16:35:33 UTC
Created attachment 1370973
msinfo32

Comment 33 errata-xmlrpc 2018-04-10 16:28:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0902

Comment 34 Laszlo Ersek 2020-03-24 00:36:50 UTC
For the record, I've also run into some trouble with enabling VBS. The
key error message was in the Event Viewer:

> Hypervisor launch failed; Processor does not provide the features
> necessary to run the hypervisor (leaf 0x1, register 0x2: features
> needed 0x7FFA3223, features supported 0xFFFA3223).

This error message identifies bit#31 (value 0x8000_0000) in CPUID EAX=1
("leaf 1") / ECX ("register 2").
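
The bit can be read off mechanically by XOR-ing the two masks from the message. A small Python sketch (the helper name is made up for illustration):

```python
def differing_bits(needed, supported):
    """Return the bit positions at which two 32-bit feature masks differ."""
    diff = needed ^ supported
    return [bit for bit in range(32) if diff & (1 << bit)]


NEEDED = 0x7FFA3223     # "features needed" from the event log message
SUPPORTED = 0xFFFA3223  # "features supported" from the event log message

print(differing_bits(NEEDED, SUPPORTED))  # prints [31]
```

The only difference is bit 31, which is set in the supported mask but must be clear in the needed one.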

This bit is the "hypervisor present" bit, per
<https://kb.vmware.com/s/article/1009458>. When my Windows 10 Enterprise
2015 LTSB N guest (v1507, aka build 10.0.10240) sees that this bit is
set, it will not start the Hyper-V role. (It refuses to run
nested.) But this bit can be hidden with the following domain XML
snippet:

<domain ...>
  <cpu ...>
    <feature policy='disable' name='hypervisor'/>
  </cpu>
</domain>

With this domain XML tweak, VBS is enabled OK for me (Credential-Guard
and HVCI are running, according to "DG_Readiness_Tool_v3.2.ps1 -Ready",
and "msinfo32.exe" reports "Device Guard Virtualization based security"
as "running").