Bug 1858821 - Fail to test HLK "TPM 2.0" for win10 (1903) guest - phase II
Summary: Fail to test HLK "TPM 2.0" for win10 (1903) guest - phase II
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libtpms
Version: 8.1
Hardware: x86_64
OS: Windows
medium
high
Target Milestone: rc
: 8.0
Assignee: Marek Kedzierski
QA Contact: Qinghua Cheng
URL:
Whiteboard:
Depends On: 1744045 1809676 1809778
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-07-20 13:41 UTC by Meirav Dean
Modified: 2021-08-25 01:23 UTC (History)
19 users (show)

Fixed In Version: libtpms-0.7.4-3.20201106git2452a24dab
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1744045
Environment:
Last Closed: 2021-05-25 06:42:26 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Comment 1 Marek Kedzierski 2020-07-20 14:03:05 UTC
The list of currently failing tests (HLK 2004) can be found here:

https://github.com/stefanberger/libtpms/wiki/Testing-of-libtpms-Functionality

Comment 4 Marek Kedzierski 2020-09-02 21:10:09 UTC
Regarding two failing tests from https://github.com/stefanberger/libtpms/wiki/Testing-of-libtpms-Functionality:

ECDSASignValidate	
ECDHSecretAgreementTest

Both tests are failing for the same reason. Here is snippet of code reverse-engineered from failing test
and translated from ASM to C:

//start -------------------------------------------
        NCRYPT_PROV_HANDLE provider;
	NCRYPT_KEY_HANDLE key;
	SECURITY_STATUS status = ERROR_SUCCESS;

	// Open TPM provider
	status = NCryptOpenStorageProvider(&provider,
		MS_PLATFORM_CRYPTO_PROVIDER,
		0);
	
        if (SUCCEEDED(status))
	{
		printf("NCryptOpenStorageProvider opened\n");

		status = NCryptCreatePersistedKey(provider,
			&key,
			BCRYPT_ECDH_ALGORITHM,
			L"ECDH Key (TPM)",
			0,
			NCRYPT_OVERWRITE_KEY_FLAG);

		if (SUCCEEDED(status))
		{
			printf("NCryptCreatePersistedKey successful\n");

			DWORD curveNameLen = (wcslen(BCRYPT_ECC_CURVE_NISTP521) + 1) * 2;

			status = NCryptSetProperty(key,
				BCRYPT_ECC_CURVE_NAME,
				(PBYTE)BCRYPT_ECC_CURVE_NISTP521,
				curveNameLen,
				0);

			if (SUCCEEDED(status))
			{
				printf("NCryptSetProperty successful\n");

				status = NCryptFinalizeKey(key, 0);
				if (SUCCEEDED(status))
				{
					printf("NCryptFinalizeKey successful\n");

					DWORD bufferSize = 0;
					status = NCryptGetProperty(key,
						NCRYPT_ECC_PARAMETERS_PROPERTY,
						NULL,
						0,
						&bufferSize,
						0);
//rest is skipped -------------------------------------------------------------------------------------------------

Summary:

With our vTPM implementation code fails in NCryptGetProperty call with error code 0x80290407 
('TPM_E_PCP_INTERNAL_ERROR 0x80290407 An unexpected internal error has occurred in the Platform Crypto Provider').

When real (hardware) TPM is used, the code fails in NCryptSetProperty() with NTE_NOT_SUPPORTED (error code 0x80090029).

On Hyper-V the code also fails in NCryptSetProperty() with NTE_NOT_SUPPORTED error.

However when HLK tests are run under Hyper-V platform, the code above doesn't return any error and test is passed.
The reasons is that Microsoft provides (among vTPM) also software implementation ('Microsoft Software Key Storage Provider').
When this software provider is used, the code from the snippet works. Apparently HLK uses this provider when this test is run. And 
that is the reason the test is passed.

Comment 5 Stefan Berger 2020-09-02 22:02:53 UTC
Thanks so much Marek!

HLK seems to have a problem with ECC_NIST_P521! None of the other curves enabled for the TPM 2 [ECC_{NIST_P192,NIST_P224,NIST_P256,NIST_P384,BN_P256,BN_P638,SM2_P256}] seems to be a reason for failure. So, here's a patch that solves this issue by hiding support for this particular curve:

From 35f56fd7f8cfd13dc423178c086f79d053d2b076 Mon Sep 17 00:00:00 2001
From: Stefan Berger <stefanb.com>
Date: Wed, 2 Sep 2020 16:28:12 -0400
Subject: [PATCH] tpm2: Hide NIST P521 support to pass HLK 2004 Platform Crypto
 Provider Test

HLK 2004 fails the Platform Crypto Provider Key Storage Provider Test if
ECC_NIST_P521 is enabled in TPM 2. To solve the issue we hide the support for
this algorithm rather than disabling it. [Disabling it would cause issues
with reading previous TPM 2 state files where the algorithm was enabled.]

Signed-off-by: Stefan Berger <stefanb.com>
---
 src/tpm2/crypto/openssl/CryptEccMain.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/tpm2/crypto/openssl/CryptEccMain.c b/src/tpm2/crypto/openssl/CryptEccMain.c
index 655564b..763bb0d 100644
--- a/src/tpm2/crypto/openssl/CryptEccMain.c
+++ b/src/tpm2/crypto/openssl/CryptEccMain.c
@@ -802,6 +802,8 @@ CryptEccIsCurveRuntimeUsable(
                             TPMI_ECC_CURVE curveId
                            )
 {
+    if (curveId == TPM_ECC_NIST_P521) /* make HLK 2004 happy; FIXME: remove once NIST P521 accepted */
+        return FALSE;
     CURVE_INITIALIZED(E, curveId);
     if (E == NULL)
        return FALSE;
--
2.26.2

I have not applied this patch upstream, and I am not sure yet whether I will.

Comment 6 Marek Kedzierski 2020-09-03 10:16:44 UTC
Hi, 

I've merged your patch locally and checked with code reverse engineered test. Now the result
is consistent with Hyper-V (vTPM) and real TPM. 


Thanks, 

Marek

Comment 8 John Ferlan 2020-12-22 19:49:56 UTC
Considering Stefan's comment 5 and a successful test run from comment 6 with a workaround that won't be merged upstream, do you have any objections to closing this bug as WONTFIX?

Comment 9 Marek Kedzierski 2021-01-03 23:48:21 UTC
(In reply to John Ferlan from comment #8)
> Considering Stefan's comment 5 and a successful test run from comment 6 with
> a workaround that won't be merged upstream, do you have any objections to
> closing this bug as WONTFIX?

No objections. The only thing to remember is to apply this patch before running 2004 HLK TPM tests.

Comment 10 Marek Kedzierski 2021-01-05 10:08:20 UTC
(In reply to Marek Kedzierski from comment #9)
> (In reply to John Ferlan from comment #8)
> > Considering Stefan's comment 5 and a successful test run from comment 6 with
> > a workaround that won't be merged upstream, do you have any objections to
> > closing this bug as WONTFIX?
> 
> No objections. The only thing to remember is to apply this patch before
> running 2004 HLK TPM tests.

Update: according to our procedure we can't apply the patch before running tests or so.
A configuration flag or so must be added to libtpms or swtpm that will enable this fix dynamically for TPM HLK tests only.

Comment 11 John Ferlan 2021-01-08 13:18:34 UTC
(In reply to Marek Kedzierski from comment #10)
> (In reply to Marek Kedzierski from comment #9)
> > (In reply to John Ferlan from comment #8)
> > > Considering Stefan's comment 5 and a successful test run from comment 6 with
> > > a workaround that won't be merged upstream, do you have any objections to
> > > closing this bug as WONTFIX?
> > 
> > No objections. The only thing to remember is to apply this patch before
> > running 2004 HLK TPM tests.
> 
> Update: according to our procedure we can't apply the patch before running
> tests or so.
> A configuration flag or so must be added to libtpms or swtpm that will
> enable this fix dynamically for TPM HLK tests only.

That seems to be a feature request... 

Something that hopefully Stefan can chime in on over the relative need / importance or possibility (thus the needinfo).

FWIW: Clearing private flag from comments 8-10 as I'm not sure why private was set when I responded since there's nothing but process...

Comment 12 Stefan Berger 2021-01-08 13:33:35 UTC
Can we make this a compile-time [configuration] option rather than an option for swtpm on the command line?

Comment 13 Stefan Berger 2021-01-08 14:03:47 UTC
To achieve HLK compliance *this time* we needed to deactivate an elliptic curve that for some reason is not supported by Windows or by the HLK test suite. The unfortunate problem is that we find these things out much later than for example the latest libtpms code may be out in Fedora and people using that version could possibly already have keys in the vTPM using this particular NIST curve. So far I never waited for HLK to come out with a test of a particular TPM version. So if we are following the HLK compliance part for RHEL we will necessarily have different behavior of the vTPM for RHEL versus Fedora or others that already picked up the code. From my perspective the RHEL version should use a ./configure ... --enable-hlk-compliance compile-time option , at least for the libtpms-0.7.x versions that RHAV is using. Development is currently for 0.8 that is not out [due to some Linux driver bug that cannot deal with 3k RSA keys]. *Maybe* that we would have an API there, but who is to choose this option then. And do we want to propagate such command line options all the way into libvirt and virt-manager to choose between HLK compliance and non-compliance. I am not sure.

Comment 14 Stefan Berger 2021-01-08 23:23:11 UTC
PR for compile time option: https://github.com/stefanberger/libtpms/pull/174

Comment 15 John Ferlan 2021-01-11 18:16:23 UTC
Change ITR=8.4.0 - at this point 8.3.1 is restricted to fixing bugs without introducing new features. 

For the 8.3.1 testing it'll just have to be a "known" that this test fails, but there may be a solution. 

Whether a future solution is via a build or run time switch...

I'm not sure we should do a build time in our downstream world since how it's built is how it gets shipped, but that's an engineering decision I'll look to avoid.

Comment 16 Stefan Berger 2021-01-11 18:25:57 UTC
Here's the compile-time stop gap measure (single patch): https://github.com/stefanberger/libtpms/commits/hlk_compliant

The dynamic solution goes through libtpms + swtpm + libvirt + ....
- https://github.com/stefanberger/libtpms/commits/dynamic_hlk_compliance
- https://github.com/stefanberger/swtpm/commits/dynamic_hlk_compliance
- https://github.com/stefanberger/libvirt-tpm/commits/hlk-compliant

I would blame HLK for having a bug related to NIST P521 curve. Do we really need to work around bugs in HLK in the layers underneath?

Comment 17 John Ferlan 2021-01-12 17:12:25 UTC
Stefan - thanks for the patches/effort!

I agree with the theory that having patches to work around a problem in some other product doesn't seem right. It's also ripe for having a problem once/if the other problem is fixed. How does one know it isn't fixed?

What's not clear is how one would go about "depending" upon that to be fixed in this context where we have some downstream windows guest based acceptance test. I think we've "proven" that the bug is not in the libptms/swtpm (vTPM for RHEL-AV) code.

It would seem to me that a bug is filed elsewhere and the test harness is altered to not run the specific test until the bug is fixed. How/If that gets (or needs to be) documented is another one of those unknowns.

So Marek back to you on this - is there a Windows bug filed somewhere? I assume against the HLK test.

Still inclined to call this a WONTFIX from a vTPM perspective at this point.

Comment 18 Marek Kedzierski 2021-01-12 20:55:17 UTC
(In reply to John Ferlan from comment #17)
> Stefan - thanks for the patches/effort!
> 
> I agree with the theory that having patches to work around a problem in some
> other product doesn't seem right. It's also ripe for having a problem
> once/if the other problem is fixed. How does one know it isn't fixed?
> 
> What's not clear is how one would go about "depending" upon that to be fixed
> in this context where we have some downstream windows guest based acceptance
> test. I think we've "proven" that the bug is not in the libptms/swtpm (vTPM
> for RHEL-AV) code.
> 
> It would seem to me that a bug is filed elsewhere and the test harness is
> altered to not run the specific test until the bug is fixed. How/If that
> gets (or needs to be) documented is another one of those unknowns.
> 
> So Marek back to you on this - is there a Windows bug filed somewhere? I
> assume against the HLK test.
> 
No, there is not. The problem is that we don't have support from Microsoft in this
case. There is no way to fill a bug. So solving such problems usually leads to more or less 
obscure form of reverse engineering. And applying workarounds leads finally to introducing
ugly "hacks" in a code.

> Still inclined to call this a WONTFIX from a vTPM perspective at this point.

Comment 19 John Ferlan 2021-01-30 14:00:11 UTC
Removing the ITM=8.4.0 since there is nothing that can/should be done from an engineering PoV.

I'll let Marek decide whether to keep for reference on a personal testonly backlog or just close.

Comment 20 Marek Kedzierski 2021-02-15 16:28:10 UTC
Hi, 

I got a contact to Microsoft. I created a ticked on their support line provided for them files etc. for investigation.

Today they provided results of their investigation which I passed to Stefan for analysis.

Stefan quickly created a fix that solves this issue: https://github.com/stefanberger/libtpms/pull/180
This is the fix that doesn't require disabling any functionality or so. 
 
I've merged the fix locally and HLK test doesn't complain anymore.

Thanks, 

Marek

Comment 21 Marc-Andre Lureau 2021-02-15 21:39:44 UTC
thanks guys, Microsoft & Stefan!, I can do the fix backport then.

Comment 28 Stefan Berger 2021-02-18 16:57:11 UTC
Great for the backport and so on!

News from the TCG TPM working group is that Microsoft will fix the test case ... !

Comment 29 Marek Kedzierski 2021-02-18 17:10:28 UTC
(In reply to Stefan Berger from comment #28)
> Great for the backport and so on!
> 
> News from the TCG TPM working group is that Microsoft will fix the test case
> ... !

So this is really a bug in HLK and the whole analysis that MS support did is wrong...

Comment 30 Marc-Andre Lureau 2021-02-18 20:34:32 UTC
(In reply to Marek Kedzierski from comment #29)
> (In reply to Stefan Berger from comment #28)
> > Great for the backport and so on!
> > 
> > News from the TCG TPM working group is that Microsoft will fix the test case
> > ... !
> 
> So this is really a bug in HLK and the whole analysis that MS support did is
> wrong...

Hrmpf.. should I revert the https://github.com/stefanberger/libtpms/pull/180 fix from dist-git?

Btw, where are those informations coming from? is there any public details from MS?

Comment 31 Stefan Berger 2021-02-18 21:25:35 UTC
(In reply to Marc-Andre Lureau from comment #30)
> (In reply to Marek Kedzierski from comment #29)

> > 
> > So this is really a bug in HLK and the whole analysis that MS support did is
> > wrong...
> 
> Hrmpf.. should I revert the https://github.com/stefanberger/libtpms/pull/180
> fix from dist-git?
> 
The patch now returns 66 bytes for the b parameter of the NIST P521 curve and allows the test to pass. Previously the TPM 2 returned 65 bytes and it had removed the leading 0 from the b parameter. I do not think that it is wrong to return leading 0s, so I think we should be able to keep this patch.

> Btw, where are those informations coming from? is there any public details
> from MS?

I had forwarded Marek an email I received from a colleague (Ken Goldman) after today's TPM working group meeting where my posting of the fix was discussed and a Microsoft employee said they were going to fix the test case.

Comment 35 Qinghua Cheng 2021-03-08 02:03:08 UTC
Verified on:


RHEL.8.4.0-av

Host:

kernel: 4.18.0-291.el8.x86_64

qemu-kvm: qemu-kvm-5.2.0-8.module+el8.4.0+10093+e085f1eb.x86_64

libtpms: libtpms-0.7.4-3.20201106git2452a24dab.module+el8.4.0+10093+e085f1eb.x86_64

swtpm: swtpm-0.4.2-1.20201201git2df14e3.module+el8.4.0+9341+96cf2672.x86_64

edk2: edk2-ovmf-20200602gitca407c7246bf-4.el8.noarch

Guest:

Win10: 20h2

HLK 2004 studio: 10.1.19041.1

Two test cases failed:
TPM Auxiliary Test: VerifySpecVersion
EKCertificate Tests: VerifyEkCertisKnownAuthority

Other tests PASS.

Comment 37 errata-xmlrpc 2021-05-25 06:42:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2098


Note You need to log in before you can comment on or make changes to this bug.