Bug 1782546

Summary: No support for TPM1.2 devices
Product: OpenShift Container Platform
Reporter: Scott Dodson <sdodson>
Component: RHCOS
Assignee: Colin Walters <walters>
Status: CLOSED ERRATA
QA Contact: Michael Nguyen <mnguyen>
Severity: urgent
Priority: unspecified
Version: 4.3.0
CC: bbreard, chris.liles, dsanzmor, dustymabe, imcleod, jligon, miabbott, mifiedle, mnguyen, nstielau, pehunt, sdodson, walters
Keywords: TestBlocker
Target Release: 4.3.0
Hardware: Unspecified
OS: Unspecified
Clone Of: 1775388
Last Closed: 2020-01-23 11:18:55 UTC
Bug Depends On: 1775388
Bug Blocks: 1773108, 1776011

Description Scott Dodson 2019-12-11 20:09:46 UTC
+++ This bug was initially created as a clone of Bug #1775388 +++

We're seeing rhcos-encrypt.service fail during Packet.net provisioning. Early analysis suggests some sort of race condition between rhcos-encrypt.service and the opener service around UUIDs.

--- Additional comment from Colin Walters on 2019-11-22 07:34:15 EST ---

Scott, can you give
https://storage.cloud.google.com/walters-scratch/rhcos-walters-luks-43.81.201911212229.0-metal.x86_64.raw.gz
a try?

--- Additional comment from Scott Dodson on 2019-11-22 16:03:10 EST ---

Got to an emergency shell again

# systemctl --failed
  UNIT                   LOAD   ACTIVE SUB    DESCRIPTION                      
● coreos-encrypt.service loaded failed failed CoreOS Firstboot encryption of ro>


:/# journalctl -b -u coreos-encrypt.service --no-pager
-- Logs begin at Fri 2019-11-22 21:00:33 UTC, end at Fri 2019-11-22 21:01:17 UTC. --
Nov 22 21:00:49 localhost systemd[1]: Starting CoreOS Firstboot encryption of root device...
Nov 22 21:00:49 localhost coreos-cryptfs[1655]: coreos-cryptfs: Fetching clevis config
Nov 22 21:00:49 localhost coreos-cryptfs[1655]: coreos-cryptfs: No Clevis config provided
Nov 22 21:00:49 localhost coreos-cryptfs[1655]: coreos-cryptfs: Detected bare metal system (virt none)
Nov 22 21:00:49 localhost coreos-cryptfs[1655]: coreos-cryptfs: Enabling TPM requirement by default
Nov 22 21:00:49 localhost coreos-cryptfs[1655]: coreos-cryptfs: detected pin=tpm2
Nov 22 21:00:49 localhost coreos-cryptfs[1655]: Token 0 is not in use.
Nov 22 21:00:49 localhost systemd[1]: coreos-encrypt.service: Main process exited, code=exited, status=1/FAILURE
Nov 22 21:00:49 localhost systemd[1]: coreos-encrypt.service: Failed with result 'exit-code'.
Nov 22 21:00:49 localhost systemd[1]: Failed to start CoreOS Firstboot encryption of root device.
Nov 22 21:00:49 localhost systemd[1]: coreos-encrypt.service: Triggering OnFailure= dependencies.

--- Additional comment from Colin Walters on 2019-11-22 16:11:33 EST ---

> Nov 22 21:00:49 localhost coreos-cryptfs[1655]: Token 0 is not in use.

Is the operative thing here...

:/# cryptsetup luksDump /dev/disk/by-partlabel/luks_root 
LUKS header information
Version:       	2
Epoch:         	5
Metadata area: 	16384 [bytes]
Keyslots area: 	16744448 [bytes]
UUID:          	00000000-0000-4000-a000-000000000002
Label:         	(no label)
Subsystem:     	(no subsystem)
Flags:       	(no flags)

Data segments:
  0: crypt
	offset: 16777216 [bytes]
	length: (whole device)
	cipher: aes-cbc-essiv:sha256
	sector: 512 [bytes]

Keyslots:
  0: luks2
	Key:        256 bits
	Priority:   normal
	Cipher:     aes-cbc-essiv:sha256
	Cipher key: 256 bits
	PBKDF:      argon2i
	Time cost:  10
	Memory:     1048576
	Threads:    4
	Salt:       3a e4 b6 81 bd f2 7d d2 ee d0 ec ec 94 e8 58 6a 
	            9d cf 45 55 23 76 ca 33 1a 91 75 5b 9b 6b d6 8b 
	AF stripes: 4000
	AF hash:    sha256
	Area offset:32768 [bytes]
	Area length:131072 [bytes]
	Digest ID:  0
Tokens:
Digests:
  0: pbkdf2
	Hash:       sha256
	Iterations: 277107
	Salt:       e8 e6 d7 6e 57 a4 f6 46 fb d0 59 5b 80 2d 69 e0 
	            f1 9b ce 58 9f b4 9d 83 9b 6a 64 74 bb ce 81 e0 
	Digest:     93 a8 db 24 d7 76 ce 2d 8a 51 ce 61 17 3d 14 f1 
	            3e b4 e3 a9 3c 44 1f ca 92 e4 02 e5 e3 0c 8e 3d 
:/# 

Interesting; the disk is encrypted, but with no tokens at all?
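
The "no tokens" observation above can be checked mechanically. The following is a minimal sketch, not part of coreos-cryptfs: it assumes the text layout of `cryptsetup luksDump` shown in this report (an unindented `Tokens:` header followed by indented entries), and the function name `luks2_tokens` and both sample strings are hypothetical.

```python
import re
from typing import List

def luks2_tokens(dump: str) -> List[str]:
    """Return the entries listed under 'Tokens:' in luksDump text output."""
    tokens = []
    in_tokens = False
    for line in dump.splitlines():
        if line.startswith("Tokens:"):
            in_tokens = True
            continue
        if in_tokens:
            # An unindented line (for example "Digests:") starts the next section.
            if line and not line[0].isspace():
                break
            # Token entries look like "  9: coreos"; detail lines such as
            # "Keyslot:  0" do not start with a number and are skipped.
            m = re.match(r"\s+(\d+):\s*(\S+)", line)
            if m:
                tokens.append(f"{m.group(1)}: {m.group(2)}")
    return tokens

# Abbreviated samples modelled on the dumps in this report.
SAMPLE_NO_TOKENS = "Keyslots:\n  0: luks2\n\tDigest ID:  0\nTokens:\nDigests:\n  0: pbkdf2\n"
SAMPLE_WITH_TOKEN = "Tokens:\n  9: coreos\n\tKeyslot:  0\nDigests:\n  0: pbkdf2\n"

if __name__ == "__main__":
    print(luks2_tokens(SAMPLE_NO_TOKENS))
    print(luks2_tokens(SAMPLE_WITH_TOKEN))
```

An empty result corresponds to the state in the dump above: a keyslot exists, but nothing is bound to it through a token.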

--- Additional comment from Scott Dodson on 2019-11-22 17:16:22 EST ---

Current attempt, where I got the serial console set up on first boot.

# systemctl --failed
  UNIT                   LOAD   ACTIVE SUB    DESCRIPTION                      
● coreos-encrypt.service loaded failed failed CoreOS Firstboot encryption of ro>

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

1 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

:/# journalctl -b --no-pager -u coreos-encrypt.service
-- Logs begin at Fri 2019-11-22 22:09:05 UTC, end at Fri 2019-11-22 22:10:30 UTC. --
Nov 22 22:09:20 localhost systemd[1]: Starting CoreOS Firstboot encryption of root device...
Nov 22 22:09:20 localhost coreos-cryptfs[1670]: coreos-cryptfs: Fetching clevis config
Nov 22 22:09:20 localhost coreos-cryptfs[1670]: coreos-cryptfs: No Clevis config provided
Nov 22 22:09:20 localhost coreos-cryptfs[1670]: coreos-cryptfs: Detected bare metal system (virt none)
Nov 22 22:09:20 localhost coreos-cryptfs[1670]: coreos-cryptfs: Enabling TPM requirement by default
Nov 22 22:09:20 localhost coreos-cryptfs[1670]: coreos-cryptfs: detected pin=tpm2
Nov 22 22:09:20 localhost coreos-cryptfs[1670]: coreos-cryptfs: Cleared token LUKS token on /dev/disk/by-partlabel/luks_root
Nov 22 22:09:20 localhost coreos-cryptfs[1670]: coreos-cryptfs: generating new key
Nov 22 22:09:38 localhost coreos-cryptfs[1670]: Reencryption will change: volume key, set cipher to aes-cbc-essiv:sha256.
Nov 22 22:09:38 localhost coreos-cryptfs[1670]: LUKS2 header backup of device /dev/disk/by-partlabel/luks_root created.
Nov 22 22:09:38 localhost coreos-cryptfs[1670]: New LUKS header for device /dev/disk/by-partlabel/luks_root created.
Nov 22 22:09:38 localhost coreos-cryptfs[1670]: Key slot 0 created.
Nov 22 22:09:38 localhost coreos-cryptfs[1670]: Setting LUKS2 offline reencrypt flag on device /dev/disk/by-partlabel/luks_root.
Nov 22 22:09:38 localhost coreos-cryptfs[1670]: Activating temporary device using old LUKS header.
Nov 22 22:09:38 localhost coreos-cryptfs[1670]: Activating temporary device using new LUKS header.
Nov 22 22:09:38 localhost coreos-cryptfs[1670]: Progress:  36.4%, ETA 00:08,  960 MiB written, speed 191.6 MiB/s
Nov 22 22:09:43 localhost coreos-cryptfs[1670]: Progress:  72.7%, ETA 00:03, 1920 MiB written, speed 191.8 MiB/s
Nov 22 22:09:46 localhost coreos-cryptfs[1670]: Finished, time 00:13.761, 2639 MiB written, speed 191.8 MiB/s
Nov 22 22:09:47 localhost coreos-cryptfs[1670]: LUKS2 header on device /dev/disk/by-partlabel/luks_root restored.
Nov 22 22:09:47 localhost coreos-cryptfs[1670]: A TPM2 device with the in-kernel resource manager is needed!
Nov 22 22:09:47 localhost systemd[1]: coreos-encrypt.service: Main process exited, code=exited, status=1/FAILURE
Nov 22 22:09:47 localhost systemd[1]: coreos-encrypt.service: Failed with result 'exit-code'.
Nov 22 22:09:47 localhost systemd[1]: Failed to start CoreOS Firstboot encryption of root device.
Nov 22 22:09:47 localhost systemd[1]: coreos-encrypt.service: Triggering OnFailure= dependencies.

:/# cryptsetup luksDump /dev/disk/by-partlabel/luks_root
LUKS header information
Version:        2
Epoch:          5
Metadata area:  16384 [bytes]
Keyslots area:  16744448 [bytes]
UUID:           00000000-0000-4000-a000-000000000002
Label:          (no label)
Subsystem:      (no subsystem)
Flags:          (no flags)

Data segments:
  0: crypt
        offset: 16777216 [bytes]
        length: (whole device)
        cipher: aes-cbc-essiv:sha256
        sector: 512 [bytes]

Keyslots:
  0: luks2
        Key:        256 bits
        Priority:   normal
        Cipher:     aes-cbc-essiv:sha256
        Cipher key: 256 bits
        PBKDF:      argon2i
        Time cost:  10
        Memory:     1048576
        Threads:    4
        Salt:       1d 90 aa 15 96 08 06 34 e0 07 95 5c 45 65 4f 12 
                    ec 15 9d ca f5 52 e1 ed db 01 29 eb 5c a0 11 a2 
        AF stripes: 4000
        AF hash:    sha256
        Area offset:32768 [bytes]
        Area length:131072 [bytes]
        Digest ID:  0
Tokens:
Digests:
  0: pbkdf2
        Hash:       sha256
        Iterations: 276523
        Salt:       ce fd 46 fb ab a9 94 97 9c 52 a5 37 59 79 49 1e 
                    89 32 a8 b1 b6 96 f6 f4 60 76 a6 6a d1 89 9d 98 
        Digest:     67 66 35 8d 34 0a 62 79 35 80 4e c9 03 33 d6 48 
                    4c 1d 0e f9 32 8a d6 89 27 00 ad 33 be 63 43 ed

--- Additional comment from Abhinav Dahiya on 2019-11-26 14:39:48 EST ---



--- Additional comment from Colin Walters on 2019-12-02 15:13:26 EST ---

We should decide whether we want to support TPM1.2, and if not...let's make the error more obvious at least.

--- Additional comment from Ben Breard on 2019-12-03 18:48:21 EST ---

Realistically we should be supporting TPM 2.0 for clevis, but we still have to boot on hardware with a 1.2 chip.
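
One way to picture the behaviour discussed here is a small decision helper. This is a rough sketch only, not the actual coreos-cryptfs logic: it assumes the TPM version is read from the `tpm_tis` kernel message format that appears in the logs in this report, and the names `tpm_version_from_log` and `choose_root_encryption`, as well as the return values `"tpm2"` and `"none"`, are hypothetical.

```python
import re
from typing import Optional

# Matches kernel messages of the form seen in this report, e.g.
# "tpm_tis 00:05: 1.2 TPM (device-id 0x1, rev-id 1)".
_TPM_LINE = re.compile(r"tpm_\w+ \S+: (\d+\.\d+) TPM \(device-id")

def tpm_version_from_log(log_text: str) -> Optional[str]:
    """Return the first TPM version string found in the log, or None."""
    for line in log_text.splitlines():
        m = _TPM_LINE.search(line)
        if m:
            return m.group(1)
    return None

def choose_root_encryption(tpm_version: Optional[str]) -> str:
    """Hypothetical policy: bind to the TPM only when it is 2.0.

    A 1.2 chip (or no TPM at all) falls back to booting without
    TPM-bound encryption instead of failing the first boot.
    """
    if tpm_version == "2.0":
        return "tpm2"
    return "none"

if __name__ == "__main__":
    line = "kernel: tpm_tis 00:05: 1.2 TPM (device-id 0x1, rev-id 1)"
    version = tpm_version_from_log(line)
    print(version, choose_root_encryption(version))
```

The point of the fallback branch is that an unsupported TPM becomes an explicit, non-fatal outcome rather than a service failure that drops the machine into an emergency shell.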

Comment 1 Scott Dodson 2019-12-13 13:57:53 UTC
*** Bug 1776011 has been marked as a duplicate of this bug. ***

Comment 2 Mike Fiedler 2019-12-13 14:08:09 UTC
Marking as test blocker for bare metal installs based on latest comments in https://bugzilla.redhat.com/show_bug.cgi?id=1776011

Comment 4 Michael Nguyen 2019-12-20 21:18:35 UTC
Verified on 43.81.201912201253.0. Successfully booted RHCOS using swtpm TPM1.2. As expected, the disk is not encrypted.

Red Hat Enterprise Linux CoreOS 43.81.201912201253.0
  Part of OpenShift 4.3, RHCOS is a Kubernetes native operating system
  managed by the Machine Config Operator (`clusteroperator/machine-config`).

WARNING: Direct SSH access to machines is not recommended; instead,
make configuration changes via `machineconfig` objects:
  https://docs.openshift.com/container-platform/4.3/architecture/architecture-rhcos.l

---
[core@ibm-p8-kvm-03-guest-02 ~]$ journalctl --list-boots
-1 b7161a5b5ee64aa2a3371c5652b656e5 Fri 2019-12-20 21:03:21 UTC—Fri 2019-12-20 >
 0 7188b245b2c8464091ad13b5e9454ee3 Fri 2019-12-20 21:10:36 UTC—Fri 2019-12-20 >
[core@ibm-p8-kvm-03-guest-02 ~]$ journalctl -b 0 | grep -i tpm
Dec 20 21:10:36 localhost kernel: tpm_tis 00:05: 1.2 TPM (device-id 0x1, rev-id 1)
[core@ibm-p8-kvm-03-guest-02 ~]$ journalctl -b -1 | grep -i tpm
Dec 20 21:03:21 localhost kernel: tpm_tis 00:05: 1.2 TPM (device-id 0x1, rev-id 1)
[core@ibm-p8-kvm-03-guest-02 ~]$ sudo cryptsetup luksDump /dev/disk/by-partlabel/luks_root
LUKS header information
Version:       	2
Epoch:         	5
Metadata area: 	16384 [bytes]
Keyslots area: 	16744448 [bytes]
UUID:          	00000000-0000-4000-a000-000000000002
Label:         	crypt_rootfs
Subsystem:     	(no subsystem)
Flags:       	(no flags)

Data segments:
  0: crypt
	offset: 16777216 [bytes]
	length: (whole device)
	cipher: cipher_null-ecb
	sector: 512 [bytes]

Keyslots:
  0: luks2
	Key:        256 bits
	Priority:   normal
	Cipher:     cipher_null-ecb
	Cipher key: 256 bits
	PBKDF:      argon2i
	Time cost:  4
	Memory:     524288
	Threads:    1
	Salt:       ca 47 41 ce 4a ea 6c 4d 5e c2 f9 38 6b b4 9e 9a 
	            da 91 5c b8 9a 48 ce 34 40 fd 12 b6 f4 87 a9 1b 
	AF stripes: 4000
	AF hash:    sha256
	Area offset:32768 [bytes]
	Area length:131072 [bytes]
	Digest ID:  0
Tokens:
  9: coreos
	Keyslot:  0
Digests:
  0: pbkdf2
	Hash:       sha256
	Iterations: 233639
	Salt:       83 41 bf 8e d1 24 0e ec 87 06 8e fd d6 8f 28 90 
	            fe 37 20 92 35 98 97 74 7e 2c c1 5e 05 b4 97 88 
	Digest:     83 58 c7 e3 25 bd 08 48 3d fa f0 0c 66 76 2b 30 
	            f4 36 3e 25 dd f0 cf 03 c7 61 1e 87 a0 34 2f 86 
[core@ibm-p8-kvm-03-guest-02 ~]$ lsblk
NAME                         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sr0                           11:0    1 1024M  0 rom  
vda                          252:0    0   16G  0 disk 
|-vda1                       252:1    0  384M  0 part /boot
|-vda2                       252:2    0  127M  0 part /boot/efi
|-vda3                       252:3    0    1M  0 part 
`-vda4                       252:4    0 15.5G  0 part 
  `-coreos-luks-root-nocrypt 253:0    0 15.5G  0 dm   /sysroot
[core@ibm-p8-kvm-03-guest-02 ~]$ 
[core@ibm-p8-kvm-03-guest-02 ~]$ rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
* ostree://55dd68051ff5ba92436b6e5c79bb2d1c9abcf8ca34c0409cbf67fad9290dea5c
                   Version: 43.81.201912201253.0 (2019-12-20T12:58:42Z)
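
The "not encrypted" result in this verification can be read from the `cipher: cipher_null-ecb` line in the data segment (the `coreos-luks-root-nocrypt` device name in `lsblk` points the same way). A minimal sketch of that check follows, assuming the `cryptsetup luksDump` text layout shown in this report; the function names and sample strings are hypothetical.

```python
import re
from typing import List

def data_segment_ciphers(dump: str) -> List[str]:
    """Return the cipher of each entry under 'Data segments:' in luksDump text."""
    ciphers = []
    in_segments = False
    for line in dump.splitlines():
        if line.startswith("Data segments:"):
            in_segments = True
            continue
        if in_segments:
            # An unindented line (for example "Keyslots:") ends the section;
            # blank lines inside the section are ignored.
            if line and not line[0].isspace():
                break
            m = re.match(r"\s+cipher:\s*(\S+)", line)
            if m:
                ciphers.append(m.group(1))
    return ciphers

def is_effectively_unencrypted(dump: str) -> bool:
    """True when every data segment uses a cipher_null cipher."""
    ciphers = data_segment_ciphers(dump)
    return bool(ciphers) and all(c.startswith("cipher_null") for c in ciphers)

# Abbreviated samples modelled on the dumps in this report.
SAMPLE_NULL = "Data segments:\n  0: crypt\n\tcipher: cipher_null-ecb\n\nKeyslots:\n  0: luks2\n"
SAMPLE_AES = "Data segments:\n  0: crypt\n\tcipher: aes-cbc-essiv:sha256\n\nKeyslots:\n  0: luks2\n"

if __name__ == "__main__":
    print(is_effectively_unencrypted(SAMPLE_NULL))
    print(is_effectively_unencrypted(SAMPLE_AES))
```

Only the data segment cipher is inspected; the keyslot `Cipher:` field is a separate setting and is deliberately left out of this check.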

Comment 6 errata-xmlrpc 2020-01-23 11:18:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062