Bug 1734505 - RHEL 8 TPM passthrough having bitlocker encryption issues in guest [NEEDINFO]
Summary: RHEL 8 TPM passthrough having bitlocker encryption issues in guest
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Marc-Andre Lureau
QA Contact: Qinghua Cheng
URL:
Whiteboard:
Depends On: 1754508 1754906 1758153
Blocks: 1771318
 
Reported: 2019-07-30 17:50 UTC by amashah
Modified: 2020-07-23 10:11 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-14 15:38:17 UTC
Type: Bug
Target Upstream Version:
marcandre.lureau: needinfo? (giuseppe.scardino)


Attachments
turn on passthrough tpm device fail (45.88 KB, image/png)
2019-08-07 08:25 UTC, FuXiangChun
manage-bde-fail (29.96 KB, image/png)
2019-08-08 11:15 UTC, FuXiangChun
tpm.msc fail (29.72 KB, image/png)
2019-08-12 09:33 UTC, FuXiangChun
Bitlocker-could-not-be-enabled (91.96 KB, image/png)
2019-08-20 05:05 UTC, FuXiangChun
manage-bde-c-disk (20.51 KB, image/png)
2019-08-20 05:06 UTC, FuXiangChun
TCGLogs collected (29.35 KB, application/zip)
2019-08-29 07:02 UTC, Giuseppe
TCG PCR Event log reader (1.15 KB, text/plain)
2019-08-29 11:30 UTC, Marc-Andre Lureau
Test 2 (105.05 KB, application/zip)
2019-08-30 07:55 UTC, Giuseppe
Windows PCRs reader (5.63 KB, application/zip)
2019-09-05 14:08 UTC, Giuseppe

Description amashah 2019-07-30 17:50:27 UTC
Description of problem:

We are testing on a machine with a hardware TPM v1.2. The machine is running RHEL 8.0 with a VM configured to run Windows 10. The TPM is passed through to the VM and is visible to Windows. 
We are attempting to enable BitLocker (hard disk encryption) using the TPM. The normal process is to run a command on Windows (manage-bde -on c:), which configures BitLocker and then does a test reboot to make sure that the drive can be unlocked before it actually starts to encrypt the hard drive. From the event log it seems that this is successful up to the reboot, but after the reboot a dialog is displayed reporting that the drive couldn't be encrypted.

Version-Release number of selected component (if applicable):


How reproducible:

Always repeatable. 

Steps to Reproduce:
1. Passthrough TPM to Windows 10 guest
2. Enable BitLocker 
3. After reboot bitlocker should start encrypting drive
4. Further reboots should unlock the drive with the TPM

Actual results:

After the reboot a dialog is displayed reporting that the drive couldn't be encrypted.


Expected results:

It should be possible to enable BitLocker. After the reboot BitLocker should start encrypting the hard drive. Further reboots should unlock the hard drive with the TPM.

Additional info:
The attached images show the results of manage-bde, apparently successful, then the dialog that is displayed after the reboot, and finally the contents of the Windows event log showing the BitLocker events. 
I've also confirmed that BitLocker can be enabled on the same hardware if Windows is installed directly on the machine. Also BitLocker can be enabled in the VM if it doesn't use the TPM.

Comment 4 FuXiangChun 2019-08-07 08:25:41 UTC
Created attachment 1601264 [details]
turn on passthrough tpm device fail

Comment 5 FuXiangChun 2019-08-07 08:26:00 UTC
Hi Marc-Andre,

I found that the passthrough TPM device can't be turned on inside the Win10 guest (via tpm.msc), with either OVMF or SeaBIOS. I added a screenshot as an attachment. vTPM works, though. If a bug needs to be filed to track this, please let QE know. In addition, do I need to add this scenario (comment 0) to the vTPM test plan?

Comment 6 Marc-Andre Lureau 2019-08-07 08:29:50 UTC
(In reply to FuXiangChun from comment #5)
> Hi Marc-Andre,
> 
> I found that the passthrough TPM device can't be turned on inside the Win10
> guest (via tpm.msc), with either OVMF or SeaBIOS. I added a screenshot as an
> attachment. vTPM works, though. If a bug needs to be filed to track this,
> please let QE know.

It is expected for passthrough to fail this.

> In addition, do I need to add this scenario (comment 0) to the vTPM test
> plan?

A BitLocker test? That would be nice.

Comment 7 FuXiangChun 2019-08-07 09:07:18 UTC
(In reply to Marc-Andre Lureau from comment #6)
> (In reply to FuXiangChun from comment #5)
> > Hi Marc-Andre,
> > 
> > I found that the passthrough TPM device can't be turned on inside the
> > Win10 guest (via tpm.msc), with either OVMF or SeaBIOS. I added a
> > screenshot as an attachment. vTPM works, though. If a bug needs to be
> > filed to track this, please let QE know.
> 
> It is expected for passthrough to fail this.
> 
> > In addition, do I need to add this scenario (comment 0) to the vTPM test
> > plan?
> 
> A BitLocker test? That would be nice.

OK, I will add a BitLocker test case to the vTPM test plan.

Comment 8 FuXiangChun 2019-08-07 09:08:37 UTC
Hi amashah,

QE tried to reproduce this bug. The detailed steps follow. If I missed any steps, please correct me.

1) Pass the TPM through to the Windows 10 guest with virsh:

    <tpm model='tpm-tis'>
      <backend type='passthrough'>
        <device path='/dev/tpm0'/>
      </backend>
    </tpm>

2) The TPM device is visible inside the Win10 guest, but it can't be turned on via tpm.msc.

3) Encrypt D: inside the guest. The D: disk is encrypted directly (no reboot needed).
  manage-bde -on D:

4) After rebooting the guest, any user can still access the contents of D:.

Notes: For step 3, if I use 'Turn on BitLocker' from the right-click menu, I have to set a password to access D:, and the password still works after rebooting the guest.

Anyway, it seems I can't reproduce this bug.

Comment 9 Kit Patterson 2019-08-07 11:54:00 UTC
Hi FuXiangChun, 

Using Bitlocker to encrypt a data drive won't use the TPM, as you've discovered. You need to encrypt the main system partition - typically the C: drive. You can use the same command to do this - "manage-bde -on c:". 

Depending on how Windows was installed you may also need to prepare the system drive by creating an extra boot partition. You can do this with the following command: 
bdehdcfg -target c: shrink -newdriveletter s: -size 300 -quiet

Can you repeat this test? 

Thanks.

Comment 11 FuXiangChun 2019-08-08 11:14:39 UTC
Kit,

Following comment 9, the command ("manage-bde -on c:") fails to execute inside the guest, because the passthrough TPM doesn't support 'turn on' (please check comment 4 and comment 6). I will add a screenshot as an attachment. Please see whether you hit the same problem before. Thanks.

Comment 12 FuXiangChun 2019-08-08 11:15:34 UTC
Created attachment 1601773 [details]
manage-bde-fail

Comment 13 Kit Patterson 2019-08-08 18:57:09 UTC
Hi FuXiangChun

You need to reset the TPM before doing this test. You may be able to do this with the TPM.msc utility inside Windows, or alternatively you may have to do it from the BIOS boot menu.
If you're doing this from the BIOS you may have to reboot twice - first find the relevant menu item and select 'clear', then reboot and find the relevant menu item and make sure the TPM is set to 'active' or 'enabled'. After that you can reboot the machine to start RHEL, the VM and Windows. 
From Windows you can confirm that the TPM is ready to use with the tpm.msc utility. 

Once the TPM is ready and active you can run the manage-bde command to enable bitlocker on the C: drive with the TPM for encryption - after rebooting you should see the error dialog showing that BitLocker has failed to be enabled. 

Thanks.

Comment 14 FuXiangChun 2019-08-12 09:32:59 UTC
(In reply to Kit Patterson from comment #13)
> Hi FuXiangChun
> 
> You need to reset the TPM before doing this test. You may be able to do this
> with the TPM.msc utility inside Windows, or alternatively you may have to do
> it from the BIOS boot menu.
> If you're doing this from the BIOS you may have to reboot twice - first find
> the relevant menu item and select 'clear', then reboot and find the relevant
> menu item and make sure the TPM is set to 'active' or 'enabled'. After that
> you can reboot the machine to start RHEL, the VM and Windows. 
> From Windows you can confirm that the TPM is ready to use with the tpm.msc
> utility. 

I know what you mean, but I am a little confused. Inside my Windows guest, I can't reset the TPM device with tpm.msc; it shows an error as in comment 4. In comment 6, Marc-Andre said that this is expected for passthrough. Why does it work inside your Windows guest? If possible, I will try another machine later (I'm not sure I can reserve a machine with a TPM device).
 
> 
> Once the TPM is ready and active you can run the manage-bde command to
> enable bitlocker on the C: drive with the TPM for encryption - after
> rebooting you should see the error dialog showing that BitLocker has failed
> to be enabled. 
> 
> Thanks.

Comment 15 FuXiangChun 2019-08-12 09:33:34 UTC
Created attachment 1602799 [details]
tpm.msc fail

Comment 16 Giuseppe 2019-08-13 10:09:29 UTC
Hi FuXiangChun,

In your previous comment 11 you said that the TPM passthrough doesn't support 'turn on', and that could be OK, but maybe you should check your BIOS configuration.
You should turn on and clear the TPM from the BIOS and then run the command to encrypt the disk.

Thanks,
Giuseppe

Comment 17 FuXiangChun 2019-08-15 07:37:01 UTC
(In reply to Giuseppe from comment #16)
> Hi FuXiangChun,
> 
> In your previous comment 11 you said that the TPM passthrough doesn't
> support 'turn on', and that could be OK, but maybe you should check your
> BIOS configuration.
> You should turn on and clear the TPM from the BIOS and then run the
> command to encrypt the disk.
> 
> Thanks,
> Giuseppe

Hi Giuseppe,

I got this error message. Is it similar to your error message in comment0?

C:\Users\Administrator>manage-bde -on c:
BitLocker Drive Encryption: Configuration Tool version 10.0.18362
Copyright (c)2013 Microsoft Corporation. All rights reserved.

volume C:[]
[OS Volume]
Key Protectors Added:

ERROR: An error occurred (code 0x80284001):
An internal error has occurred within the Trusted Platform Module support program.

NOTE: If the -on switch has failed to add key protectors or start encryption,
you may need to call "manage-bde -off" before attempting -on again. 

Thanks
Xiangchun Fu

Comment 18 Giuseppe 2019-08-16 07:38:42 UTC
Hi XiangChun,
The error suggests that your boot is configured as UEFI. Could you set the boot to legacy for both the Red Hat host and the VM, so that we can work in a simpler environment to isolate the problem?

Thanks,
Giuseppe

Comment 19 FuXiangChun 2019-08-20 05:04:12 UTC
Hi Giuseppe,

I think I reproduced this issue. Please check the screenshots in the attachments.

Comment 20 FuXiangChun 2019-08-20 05:05:30 UTC
Created attachment 1605948 [details]
Bitlocker-could-not-be-enabled

Comment 21 FuXiangChun 2019-08-20 05:06:12 UTC
Created attachment 1605949 [details]
manage-bde-c-disk

Comment 22 Giuseppe 2019-08-20 06:28:55 UTC
Hi FuXiang,
Yes, this is the issue we discovered.

Thanks,
Giuseppe

Comment 23 Marc-Andre Lureau 2019-08-22 06:01:42 UTC
Is attachment 1605948 [details] what you get after attachment 1605949 [details] and a reboot? Is there anything worthwhile in the event log?

Comment 24 Giuseppe 2019-08-22 08:12:38 UTC
Hi, the attachments reproduce the same result we obtained.
Unfortunately, the event log contains no useful information.
What we discovered is that when bypassing the hardware check and also adding a key as a protector (manage-bde -on c: -s -rp -used), the encryption starts, but the TPM is not used, because PCRs 8, 9, 10 and 11 change after the reboot for some reason. This could be the problem, as BitLocker uses PCRs 0, 2, 4 and 11.

Comment 29 Marc-Andre Lureau 2019-08-24 13:49:57 UTC
Is this the targeted configuration?

host: rhel 8.0, TPM 1.2 on host
guest: win10 q35 with BIOS (seabios) and TPM 1.2 TIS device

Could you collect the BIOS log too? -device isa-debugcon,iobase=0x402,chardev=log -chardev file,id=log,path=/tmp/bios.log

TPM PCRs can't be reset without a hardware reboot.

We need to identify the set of PCRs that SeaBIOS and Windows extend. We must prevent the RHEL host from extending the PCR values after the VM is started. /dev/tpm access should be exclusive (I believe there is no concurrency/RM with 1.2).

If VM restart (without HW restart) needs to be supported, we will need to remove the SeaBIOS TPM measurements too (if SeaBIOS is actually doing measurements with passthrough, which remains to be verified with the logs).

The remaining subset of PCRs can be used to tighten the key. The list of PCRs can be configured in "Configure TPM platform validation profile" in the policy editor. For some reason, PCR11 must be included. In theory, that should be enough, but this setup remains fragile imho. A Windows security expert should probably be involved to validate the profile, depending on the use case.

Comment 33 Marc-Andre Lureau 2019-08-26 07:55:59 UTC
Thanks, so TPM support in SeaBIOS is currently disabled in RHEL. Good, I suppose.

So, what remains should be Windows-side only.

I found some tools to dump TCG log: https://github.com/mattifestation/TCGLogTools

Getting the JSON dump might help: ConvertTo-TCGEventLog -LogBytes (Get-TCGLogContent -LogType SRTMBoot) -MinimizedX509CertInfo | ConvertTo-Json -Depth 8 | Out-File 'TCGlog.json'

Please try to collect (at least 2) logs: from host _power on_, vm boot, collect json (repeat).

Comment 39 Giuseppe 2019-08-29 07:01:38 UTC
Hi,
I confirm the configuration:
host: rhel 8.0, TPM 1.2 on host
guest: win10 q35 with BIOS (seabios) and TPM 1.2 TIS device
/dev/tpm access should be exclusive to the VM: at boot no other program uses the TPM, and the tcsd service is inactive.
In our configuration, rebooting only the VM is not the default behaviour; we always do a full reboot (host + guest).

I have used the tool you suggested, but it fails to convert the data. ConvertTo-TCGEventLog fails even with the examples from the TCGLogTools project. Is it possible that this tool only works well with TPM 2.0?
I attach the data collected by running the command "Get-TCGLogContent -LogType SRTMBoot" after a full double reboot: host power on -> VM boot -> collect data -> reboot VM and host and collect the info again.

Comment 40 Giuseppe 2019-08-29 07:02:27 UTC
Created attachment 1609279 [details]
TCGLogs collected

Comment 41 Marc-Andre Lureau 2019-08-29 11:30:22 UTC
Created attachment 1609371 [details]
TCG PCR Event log reader

With the attached Python script, I can read the Windows log.

I think the logs are "partial", since we only get the Windows measurements. Thus they do not respect the complete format specified in https://docs.microsoft.com/en-us/windows/win32/api/tbs/nf-tbs-tbsi_get_tcg_log_ex#remarks

With 2 consecutive boots I get:

elmarco@boraha:~/Downloads$ python3 tcglog.py 0000000009-0000000000.log
Entry(pcr=8, event=<Event.EV_COMPACT_HASH: 12>, digest=b'\x9e\xf3\xd2\xc1 \xaf\xdb\xa2\x00Iuh\x13B\xe7{h%\x88R', size=4)
Entry(pcr=9, event=<Event.EV_COMPACT_HASH: 12>, digest=b'BYSU\x96\x81 \xef\xb6\x8d\x84L\xda<\x8eOP\xf3\xa7\xed', size=4)
Entry(pcr=10, event=<Event.EV_COMPACT_HASH: 12>, digest=b'\xf0]\xa4\xa5\xe7s1b\x89\x98h\xe8\xfb\xa7N~\x96\x94\x885', size=4)
Entry(pcr=12, event=<Event.EV_EVENT_TAG: 6>, digest=b'G\xe2\x8c\x9e`x\xe8\xfe\x84/j\xae\xce\xa2\xfc&D\\\x12\xdb', size=344)
Entry(pcr=13, event=<Event.EV_EVENT_TAG: 6>, digest=b'\x1b\x9b\xd7\xb9\x12p\x8c{1\xaa\x12K\x9f\xa5f\xf2\xe9\x16?\x93', size=1084)
Entry(pcr=14, event=<Event.EV_EVENT_TAG: 6>, digest=b"\x01\xfd`\xa7\x1944\xb2^\xe8\x87\x08'\xfdCk\x12Z\xa0=", size=302)
Entry(pcr=12, event=<Event.EV_EVENT_TAG: 6>, digest=b'\xcc\xdd\xd2\x98\x1b\xa3)\x18a\xe8\x85\xb0\x1e\xcc7\x14\xe2\xd6\x99\\', size=4759)
Entry(pcr=13, event=<Event.EV_EVENT_TAG: 6>, digest=b'\xa3\xdex\xff\xcb\x96\xd0\x00`\xa4g\x00\xd5-\xef\xe71\x10fS', size=24797)
Entry(pcr=14, event=<Event.EV_EVENT_TAG: 6>, digest=b"\xc5\xfd\xd5\xda\x06Jz'\x15|\x01T\x88\xe8`r\xba\x83\x0e\xfa", size=612)
Entry(pcr=11, event=<Event.EV_COMPACT_HASH: 12>, digest=b':@r\xcckw\xe2c\x9dO\xdc\x91\xc9\x1e\xfc\x11\xbc>3\xc3', size=4)
Entry(pcr=12, event=<Event.EV_SEPARATOR: 4>, digest=b'\x9d\x7fI\x93\x88\xda\xa8\xe7\xd7\xf1\xe3\x99an9\xe5\x89\x1d9\x9d', size=4)
Entry(pcr=13, event=<Event.EV_SEPARATOR: 4>, digest=b'\x9d\x7fI\x93\x88\xda\xa8\xe7\xd7\xf1\xe3\x99an9\xe5\x89\x1d9\x9d', size=4)
Entry(pcr=14, event=<Event.EV_SEPARATOR: 4>, digest=b'\x9d\x7fI\x93\x88\xda\xa8\xe7\xd7\xf1\xe3\x99an9\xe5\x89\x1d9\x9d', size=4)
elmarco@boraha:~/Downloads$ python3 tcglog.py 0000000008-0000000000.log
Entry(pcr=8, event=<Event.EV_COMPACT_HASH: 12>, digest=b'\x9e\xf3\xd2\xc1 \xaf\xdb\xa2\x00Iuh\x13B\xe7{h%\x88R', size=4)
Entry(pcr=9, event=<Event.EV_COMPACT_HASH: 12>, digest=b'BYSU\x96\x81 \xef\xb6\x8d\x84L\xda<\x8eOP\xf3\xa7\xed', size=4)
Entry(pcr=10, event=<Event.EV_COMPACT_HASH: 12>, digest=b'\xf0]\xa4\xa5\xe7s1b\x89\x98h\xe8\xfb\xa7N~\x96\x94\x885', size=4)
Entry(pcr=12, event=<Event.EV_EVENT_TAG: 6>, digest=b'D\xed\xff\xd1\xb8\xa2\xcc\x8b\x99_k\x11\xcd\x8dB\xd1\x9c\xca\x02\x02', size=344)
Entry(pcr=13, event=<Event.EV_EVENT_TAG: 6>, digest=b'\x1b\x9b\xd7\xb9\x12p\x8c{1\xaa\x12K\x9f\xa5f\xf2\xe9\x16?\x93', size=1084)
Entry(pcr=14, event=<Event.EV_EVENT_TAG: 6>, digest=b"\x01\xfd`\xa7\x1944\xb2^\xe8\x87\x08'\xfdCk\x12Z\xa0=", size=302)
Entry(pcr=12, event=<Event.EV_EVENT_TAG: 6>, digest=b'\xdf\x07:\x15\x06i\x93\xbc|s\\\x19z\x9a\x08\xb1\t\x07]K', size=4759)
Entry(pcr=13, event=<Event.EV_EVENT_TAG: 6>, digest=b'\xa3\xdex\xff\xcb\x96\xd0\x00`\xa4g\x00\xd5-\xef\xe71\x10fS', size=24797)
Entry(pcr=14, event=<Event.EV_EVENT_TAG: 6>, digest=b"\xc5\xfd\xd5\xda\x06Jz'\x15|\x01T\x88\xe8`r\xba\x83\x0e\xfa", size=612)
Entry(pcr=11, event=<Event.EV_COMPACT_HASH: 12>, digest=b':@r\xcckw\xe2c\x9dO\xdc\x91\xc9\x1e\xfc\x11\xbc>3\xc3', size=4)
Entry(pcr=12, event=<Event.EV_SEPARATOR: 4>, digest=b'\x9d\x7fI\x93\x88\xda\xa8\xe7\xd7\xf1\xe3\x99an9\xe5\x89\x1d9\x9d', size=4)
Entry(pcr=13, event=<Event.EV_SEPARATOR: 4>, digest=b'\x9d\x7fI\x93\x88\xda\xa8\xe7\xd7\xf1\xe3\x99an9\xe5\x89\x1d9\x9d', size=4)
Entry(pcr=14, event=<Event.EV_SEPARATOR: 4>, digest=b'\x9d\x7fI\x93\x88\xda\xa8\xe7\xd7\xf1\xe3\x99an9\xe5\x89\x1d9\x9d', size=4)

It looks like we should have repeatable values in the PCRs on HW host reboot.
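
For reference, a minimal sketch of a reader for logs like these. This is not the attached script, which is not reproduced in this thread. It assumes each record follows the SHA-1 layout of the TCG PC Client event log: little-endian PCR index, event type, 20-byte digest and event data size, followed by the event data. The names `Entry`, `read_entries` and `differing_pcrs` are made up for this illustration.

```python
import struct
from typing import List, NamedTuple

# Assumed record header: PCR index, event type, 20-byte SHA-1 digest and
# event data size (integers little-endian). The event data follows it.
HEADER = struct.Struct("<II20sI")


class Entry(NamedTuple):
    pcr: int
    event: int
    digest: bytes
    size: int


def read_entries(data: bytes) -> List[Entry]:
    """Parse a raw log into entries, skipping each record's event data."""
    entries = []
    offset = 0
    while offset < len(data):
        if offset + HEADER.size > len(data):
            raise ValueError("truncated record header at offset %d" % offset)
        pcr, event, digest, size = HEADER.unpack_from(data, offset)
        offset += HEADER.size + size
        if offset > len(data):
            raise ValueError("truncated event data for PCR %d" % pcr)
        entries.append(Entry(pcr, event, digest, size))
    return entries


def differing_pcrs(log_a: bytes, log_b: bytes) -> List[int]:
    """Compare two logs position by position and list the PCRs that differ.

    Only the common prefix is compared if the logs have different lengths.
    """
    pairs = zip(read_entries(log_a), read_entries(log_b))
    return sorted({a.pcr for a, b in pairs if a != b})
```

In the two listings above, the only digests that differ between the boots belong to PCR 12, so that is what `differing_pcrs` should report for those two files, provided they parse under the assumed layout.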

Comment 42 Giuseppe 2019-08-30 07:55:24 UTC
Created attachment 1609779 [details]
Test 2

Hi,
In case this is useful, I attach two more tests, including the PCR values read with an MS tool as well.
The first file, RebootVM.zip, includes a double sequence of reboots: a reboot of only the VM and a reboot of guest + host.
The second file, Bitlocker.zip, includes a reboot (guest + host) after the BitLocker command to encrypt the disk is called. I have included the values from a starting point, immediately after the command is called, and after the reboot.

All the PCR values are saved using both tools (TCGLogTools and the MS tool).

Comment 43 Kit Patterson 2019-09-01 09:36:58 UTC
This is just to capture some information that's been discussed by email

With MS BitLocker, PCR 11 is reset to zero during the reboot (like all PCRs.) The BitLocker Volume Encryption Key (VEK) is then sealed against this 0 value so that it can only be unsealed after a reboot. The Microsoft boot manager “BOOTMGR” will do the un-seal on the VEK and it will then ‘cap’ PCR 11 by extending it with a static value. This means that after BOOTMGR has run, since it’s never possible to reset PCR11 to zero, it’s never possible to unseal the VEK again. The VEK is secure after the boot is complete. 
 
The fact that PCR 11 is capped after BOOTMGR runs means that, as in our case, if BOOTMGR is run twice then the second time will already see the capped value and won’t be able to unseal VEK. Normally the BOOTMGR will use the VEK to unlock the BitLocker disk directly, but since our first invocation of BOOTMGR is on the physical machine, which isn’t encrypted, that doesn’t work. 
 
So this sounds like it could be related to the problems we’re seeing. 
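
The capping behaviour described above can be illustrated with a toy model. This is only a sketch of the reasoning, not of how a real TPM or BOOTMGR is implemented: `ToyTpm`, `run_bootmgr` and `CAP_VALUE` are hypothetical names, the cap value is a placeholder, and sealing is reduced to a plain comparison against the PCR 11 value the key was sealed to.

```python
import hashlib
from typing import Optional

ZERO_PCR = bytes(20)  # a SHA-1 PCR after a platform reset
# Placeholder only: the static value BOOTMGR really uses is not known here.
CAP_VALUE = hashlib.sha1(b"hypothetical-cap").digest()


def extend(pcr: bytes, digest: bytes) -> bytes:
    """TPM 1.2 style extend: new value = SHA1(old value || digest)."""
    return hashlib.sha1(pcr + digest).digest()


class ToyTpm:
    """Tracks PCR 11 only; not a model of any real TPM interface."""

    def __init__(self) -> None:
        self.pcr11 = ZERO_PCR

    def unseal(self, sealed_to: bytes, secret: bytes) -> Optional[bytes]:
        # The secret is released only while PCR 11 has the sealed-to value.
        return secret if self.pcr11 == sealed_to else None

    def cap(self) -> None:
        self.pcr11 = extend(self.pcr11, CAP_VALUE)


def run_bootmgr(tpm: ToyTpm, vek: bytes) -> Optional[bytes]:
    """Try to unseal the key sealed against PCR 11 == 0, then cap PCR 11."""
    key = tpm.unseal(ZERO_PCR, vek)
    tpm.cap()
    return key


tpm = ToyTpm()
first_run = run_bootmgr(tpm, b"volume-key")   # PCR 11 still zero: key released
second_run = run_bootmgr(tpm, b"volume-key")  # PCR 11 already capped: None
print(first_run, second_run)
```

In this model the second invocation cannot get the key back until the PCR is reset, which matches the scenario of BOOTMGR running once on the physical machine and again inside the VM.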
 
The only problem is that we also had problems enabling BitLocker when RHEL was booted with GRUB. In that case BOOTMGR will only be run once, inside the VM, so it’s not clear that PCR 11 would be capped too early. Maybe there’s a different reason that BitLocker was failing in this case… 
 
 

DMA/SLAT
So one idea we had was; is it possible that the TPM pass through is being messed up by second layer address translation (SLAT) in the virtual machine when it attempts to perform DMA? If the OS running in the VM passes a ‘physical’ address range to the TPM, but that address range has actually been mapped through a SLAT lookup table to a different physical address, then could the TPM end up performing the measure against the wrong bit of memory? 
If this is the case then we would expect to see completely different random values for the PCRs on each boot. However, after further experimentation this doesn't seem to be the case. 
If this does prove to be the case, is this something we can do anything about? I think it depends on how the TPM ‘pass through’ works with the QEMU implementation, and if there’s any chance of controlling the data that’s given to the TPM. If the TPM ‘pass through’ goes through QEMU code then I guess in theory it should be possible to apply a SLAT mapping to the addresses before passing them to the hardware. Then the TPM would be looking at the correct memory and give the correct result. However, if the pass through literally exposes the TPM hardware directly to the VM then there wouldn’t be any opportunity to correct the addresses.  
 
Multiple PCR updates out of sequence. 
The second idea, which I think Russ already touched on in the call, depends on the order that the PCRs are updated. I hadn’t previously understood exactly how the PCRs are updated (sorry for being slow,) but we now understand that the PCR values are reset to zero when the physical machine reboots, and then each update to a PCR is ‘merged’ with the existing value. (Or “extend” in the jargon.) This means that the exact order in which the PCRs are updated is very important – if a PCR is extended multiple times with the same value then this will give a completely different value for the PCR. 
Assuming that Windows in the VM runs the same sequence of updates each time it reboots, but the PCRs are only reset when the physical machine reboots, then this means that Windows will create a different set of values on each restart of the VM. 
We have already tested with a physical reboot after starting BitLocker, but we’re not sure if we’ve also physically rebooted before starting BitLocker, so it’s possible that we’ve rebooted the VM multiple times before enabling BitLocker and started with invalid values.
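
A small sketch of why the order and the number of extends matter, assuming the TPM 1.2 extend rule (new value = SHA1(old value || digest)). The measurement names are placeholders and do not correspond to what Windows really measures; `replay` is a made-up helper.

```python
import hashlib
from typing import Iterable

RESET = bytes(20)  # PCR value after a physical reset


def extend(pcr: bytes, digest: bytes) -> bytes:
    """Fold a 20-byte digest into the current PCR value."""
    return hashlib.sha1(pcr + digest).digest()


def replay(pcr: bytes, measurements: Iterable[bytes]) -> bytes:
    """Extend the PCR with the hash of each measurement, in order."""
    for item in measurements:
        pcr = extend(pcr, hashlib.sha1(item).digest())
    return pcr


# Placeholder measurement sequence for one guest boot.
GUEST_BOOT = [b"stage-1", b"stage-2", b"stage-3"]

after_power_on = replay(RESET, GUEST_BOOT)
# VM rebooted without resetting the host: the same sequence is applied on
# top of the previous values instead of starting from zero.
after_vm_reboot = replay(after_power_on, GUEST_BOOT)
# Host reset followed by a fresh VM boot: values are reproducible again.
after_host_reset = replay(RESET, GUEST_BOOT)
# Same measurements in a different order give a different value.
reordered = replay(RESET, list(reversed(GUEST_BOOT)))

print(after_power_on == after_host_reset)  # True
print(after_power_on == after_vm_reboot)   # False
print(after_power_on == reordered)         # False
```

Only the sequence that starts from a reset PCR is repeatable, which is why a guest rebooted on a host that was not reset ends up with values that anything sealed earlier will not match.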

Comment 44 Marc-Andre Lureau 2019-09-02 08:01:39 UTC
(In reply to Kit Patterson from comment #43)
> 
> DMA/SLAT
> So one idea we had was; is it possible that the TPM pass through is being
> messed up by second layer address translation (SLAT) in the virtual machine
> when it attempts to perform DMA? If the OS running in the VM passes a
> ‘physical’ address range to the TPM, but that address range has actually
> been mapped through a SLAT lookup table to a different physical address,
> then could the TPM end up performing the measure against the wrong bit of
> memory? 

TPMs don't do DMA, afaik.

> If this is the case then we would expect to see completely different random
> values for the PCRs on each boot. However, after further experimentation
> this doesn't seem to be the case. 
> If this does prove to be the case, is this something we can do anything
> about? I think it depends on how the TPM ‘pass through’ works with the QEMU
> implementation, and if there’s any chance of controlling the data that’s
> given to the TPM. If the TPM ‘pass through’ goes through QEMU code then I
> guess in theory it should be possible to apply a SLAT mapping to the
> addresses before passing them to the hardware. Then the TPM would be looking
> at the correct memory and give the correct result. However, if the pass
> through literally exposes the TPM hardware directly to the VM then there
> wouldn’t be any opportunity to correct the addresses.  

(fwiw, I think modifying the TPM command stream would mean in practice implementing something close to a TPM emulator)


> We have already tested with a physical reboot after starting BitLocker, but
> we’re not sure if we’ve also physically rebooted before starting BitLocker,
> so it’s possible that we’ve rebooted the VM multiple times before enabling
> BitLocker and started with invalid values.

That would indeed be wrong. Always use the passthrough-TPM VM after a host reset/power-on (i.e., never use a rebooted VM without resetting the host).

Comment 46 Giuseppe 2019-09-03 19:21:51 UTC
Hi,
there is an interesting update. With Linux IMA disabled in the kernel, it is possible to encrypt the disk with BitLocker using all the default PCRs.
IMA uses PCR 10, which is also used by Windows, so Windows had no exclusive access to it. With IMA disabled, PCRs 8 to 11 are used exclusively by Windows, which allows BitLocker to work correctly.
Do you see any side effects on Linux security from disabling IMA? Or is it possible to configure it to use another PCR bank?
Alternatively, is it possible to disable IMA with a GRUB parameter at boot instead of recompiling the kernel?

Thanks

Comment 47 Giuseppe 2019-09-05 14:08:44 UTC
Created attachment 1611993 [details]
Windows PCRs reader

Hi,
I attach the Windows tool to read the PCR values.

Comment 48 Marc-Andre Lureau 2019-09-06 14:00:46 UTC
Actually the PCR values can be read directly (even when TPM is passed through) from host /sys, like /sys/devices/pnp0/00:05/tpm/tpm0/pcrs (and they look like they match what windows tools read)

Before manage-bde -on C:, and after restart, only PCR12 is changed. And according to the default policy, that shouldn't matter.

Yet, bitlocker fails. So maybe the PCR values are not the only problem.

Could a Windows developer dig into the BitLocker error logs, or tell us where to find them? It seems I am running out of ideas :).

Wrt IMA and PCR10, I am quite surprised that this can break BitLocker, since the value isn't changed by the host after boot. But I am recompiling a custom RHEL 8 kernel with IMA disabled to confirm Giuseppe's findings (since it can't be disabled from the boot cmdline).
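
The host-side check described at the top of this comment can be scripted. The sketch below assumes the sysfs `pcrs` file of a TPM 1.2 device prints one line per register in the form `PCR-NN: XX XX ...` (20 hex bytes). The device path differs between machines, and the helper names are made up for this illustration.

```python
import re
from typing import Dict, Iterable, List

# Assumed line format of the TPM 1.2 sysfs "pcrs" file: "PCR-10: 3A 3F ...".
PCR_LINE = re.compile(r"^PCR-(\d+):\s*((?:[0-9A-Fa-f]{2}\s*)+)$")


def parse_pcrs(text: str) -> Dict[int, bytes]:
    """Map PCR index to value for every line matching the assumed format."""
    pcrs = {}
    for line in text.splitlines():
        match = PCR_LINE.match(line.strip())
        if match:
            hex_digits = "".join(match.group(2).split())
            pcrs[int(match.group(1))] = bytes.fromhex(hex_digits)
    return pcrs


def changed_pcrs(before: Dict[int, bytes], after: Dict[int, bytes]) -> List[int]:
    """PCR indexes whose value differs between two snapshots."""
    return sorted(i for i in before if before[i] != after.get(i))


def nonzero_pcrs(pcrs: Dict[int, bytes],
                 indexes: Iterable[int] = (8, 9, 10, 11)) -> List[int]:
    """Of the given PCRs (default 8 to 11), those already extended."""
    return [i for i in indexes if i in pcrs and any(pcrs[i])]
```

For example, `nonzero_pcrs(parse_pcrs(open("/sys/devices/pnp0/00:05/tpm/tpm0/pcrs").read()))` run on the host before the VM starts would list which of PCRs 8 to 11 the host has already touched, and `changed_pcrs` can compare snapshots taken before `manage-bde -on C:` and after the restart.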

Comment 49 Marc-Andre Lureau 2019-09-06 18:08:47 UTC
I confirm Giuseppe's findings: disabling (removing) IMA from the kernel allows BitLocker to be enabled in the guest. It seems BitLocker doesn't like PCR10 being != 0 when the VM boots.

Comment 50 Marc-Andre Lureau 2019-09-16 10:19:14 UTC
Giuseppe, is there anything I can help you with? Have you tried changing Windows TPM platform validation profile (in the group policy) to exclude pcr 10?

Comment 51 Kit Patterson 2019-09-20 11:07:28 UTC
(Previously sent by email) 
It’s probably useful if I summarise our current understanding of the BitLocker issue so that everyone’s fully informed. 

It seems that there are two issues which are stopping BitLocker working in a VM with a TPM passthrough; 1) Currently we’re running the Microsoft “bootmgr” component twice during start up, and that breaks BitLocker, 2) RHEL is using various PCR values which clash with BitLocker. If we resolve both of these issues then we can enable BitLocker in the VM with the default configuration, which presumably should be secure. 

For the first issue, KAL will change the way that we handle the boot sequence so that MS bootmgr is only run once. (We’ll find a way to boot directly into GRUB without loading bootmgr first.)

For the second issue, we’ve worked around this by recompiling various components to avoid using the PCRs that Windows uses. That includes GRUB to avoid using PCR 8/9 and the Kernel to disable IMA from using PCR 10. Note that even if BitLocker is not configured to use these PCR values it seems that MS bootmgr is extending them and checking the values anyway, causing encryption to fail. 
This works as an experiment, but obviously these recompiled components wouldn’t be supported by Red Hat for production use. So we need to find a solution which Red Hat is happy to support while still avoiding clashing with the BitLocker PCR values.  

What's the best way to proceed?

Comment 52 Kit Patterson 2019-09-20 11:08:44 UTC
Information from Microsoft (Kartikay Sharma <ksharma@microsoft.com>) 

"
Hello All

From our last call I have the answers to the queries as below:-

1.	Does the BOOTMGR perform a check if the PCR 11 is 0 or not during Boot, or it does something different? Please elaborate.
Security of BitLocker in bootmgr does not rely on a functional check of the PCR11 state, of course. Cryptographically, once PCR11 is capped it is no longer possible to unseal BitLocker key for the OS volume using TPM.

2.	How does Windows reseal the PCRs when we are online and PCR 11 is with a Static Value that cannot be changed to 0 unless the PC is rebooted? What component changes the value to 0?
When sealing with TPM code can setup whichever PCR values it wants *for sealing purposes*. Which does not affect real/current PCR values. So BitLocker code in OS can set PCR11 value for sealing purposes to 0.

3.	Can we use SRK \ TPM Owner Auth in any way to alter the PCRs and bypass the TPM Capping to get their solution to work.
There is no way to bypass BitLocker access control mechanism, otherwise we have a security issue which we will need to figure out a way to close.
"

Comment 53 Marc-Andre Lureau 2019-09-23 12:43:59 UTC
Hi

(In reply to Kit Patterson from comment #51)
> (Previously sent by email) 
> For the second issue, we’ve worked around this by recompiling various
> components to avoid using the PCRs that Windows uses. That includes GRUB to
> avoid using PCR 8/9 and the Kernel to disable IMA from using PCR 10. Note
> that even if BitLocker is not configured to use these PCR values it seems
> that MS bootmgr is extending them and checking the values anyway, causing
> encryption to fail. 

oh, that's quite a shame.

> This works as an experiment, but obviously these recompiled components
> wouldn’t be supported by Red Hat for production use. So we need to find a
> solution which Red Hat is happy to support while still avoiding clashing
> with the BitLocker PCR values.  
> 
> What's the best way to proceed?

I would open new bugs to both grub and the kernel to see what could be done, instead of a custom build. I will do that.

Comment 55 Ademar Reis 2020-02-05 23:01:42 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

Comment 56 John Ferlan 2020-07-06 18:52:39 UTC
It's been a while since there's been any update to this bug - can we figure out where things stand and come to some sort of resolution for this bug?

I see from history that the customer case https://access.redhat.com/support/cases/02424672 was updated, but it's erroring out for me to see what the update was (go figure)

The remaining depends on bug has an update from months ago (https://bugzilla.redhat.com/show_bug.cgi?id=1754906#c1), but it doesn't seem there's any planned movement. In fact I wouldn't be surprised to see it CLOSEd as NOTABUG at some point from just reading the response.

The totality of the information available seems to point towards CLOSEing as NOTABUG like bug 1754508.

Comment 57 Kit Patterson 2020-07-08 16:46:02 UTC
We've now managed to develop a viable workaround for the issues covered in this bug, so this bug has become moot. I'm ok to close it. 

For reference: we've used a combination of a UEFI guest (which affects which PCRs are used by Windows) and a customised version of GRUB to ensure that the physical TPM has the correct PCR values when it's passed through to Windows. With this combination BitLocker is usable. 

Thanks for your efforts.

Comment 59 John Ferlan 2020-07-14 15:38:17 UTC
Per recent comments, closing this bug.

Comment 61 Kit Patterson 2020-07-23 10:11:54 UTC
I don't see any further info needed.

