Bug 1947490

Summary: If Clevis is enabled on a LUKS volume managed with Ignition, the system fails to automatically open the LUKS volume on boot
Product: OpenShift Container Platform Reporter: Vinu K <vkochuku>
Component: RHCOS Assignee: Jonathan Lebon <jlebon>
Status: CLOSED ERRATA QA Contact: Michael Nguyen <mnguyen>
Severity: medium Docs Contact:
Priority: high    
Version: 4.7 CC: bbreard, bgilbert, imcleod, jligon, miabbott, nstielau
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The clevis-luks-askpass.path unit was not enabled by default.
Consequence: Non-root LUKS Clevis devices failed to unlock automatically on reboot.
Fix: The clevis-luks-askpass.path unit is now enabled by default.
Result: Non-root LUKS Clevis devices now unlock automatically on reboot.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-27 22:58:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vinu K 2021-04-08 15:37:33 UTC
Description of problem:
The clevis-luks-askpass.path systemd unit is not enabled, so the system fails to automatically open the LUKS volume on boot.

Version-Release number of selected component (if applicable):
v4u7

How reproducible:
Possible

Steps to Reproduce:
1. Create a MachineConfig as follows:
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 98-worker-disk-varlib-lukstpm
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      disks:
        - device: /dev/sdb
          partitions:
            - label: varlib
      luks:
        - clevis:
            custom:
              config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
              needsNetwork: true
              pin: tpm2
          device: /dev/disk/by-partlabel/varlib
          keyFile:
            source: 'data:,passphrase'
          name: varlib
      filesystems:
        - device: /dev/disk/by-id/dm-name-varlib
          format: xfs
          path: /var/lib
---
2. The LUKS volume is not opened on system boot.

3. If we add the systemd units to the MachineConfig as follows, it works:
---
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 98-worker-disk-varlib-lukstpm
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      disks:
        - device: /dev/sdb
          partitions:
            - label: varlib
      luks:
        - clevis:
            custom:
              config: '{"pcr_bank":"sha256","pcr_ids":"7"}'
              needsNetwork: true
              pin: tpm2
          device: /dev/disk/by-partlabel/varlib
          keyFile:
            source: 'data:,passphrase'
          name: varlib
      filesystems:
        - device: /dev/disk/by-id/dm-name-varlib
          format: xfs
          path: /var/lib
    systemd:
      units:
        - enabled: true
          name: clevis-luks-askpass.path
        - contents: |
            [Mount]
            What=/dev/disk/by-id/dm-name-varlib
            Where=/var/lib
            Type=xfs
            [Install]
            WantedBy=local-fs.target
          enabled: true
          name: var-lib.mount
---

Actual results:
The LUKS volume is not opened on system boot.

Expected results:
The clevis-luks-askpass.path systemd unit should be enabled so that the LUKS volume is opened automatically on boot.

Additional info:

Comment 1 Yu Qi Zhang 2021-04-08 18:03:54 UTC
Reassigning to the RHCOS team to take a look. The MCO doesn't directly manage disks and encryption.

Comment 2 Jonathan Lebon 2021-04-09 21:23:16 UTC
Indeed, clevis-luks-askpass.path should be enabled. This was tracked upstream, but we initially hit issues trying to enable it. Those issues are gone now.

Comment 3 Jonathan Lebon 2021-04-13 18:39:32 UTC
This is slated to be fixed in 4.8. For 4.7, since the workaround is trivial and it would require a bootimage bump, we'll leave it alone for now.
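
For 4.7 clusters, the workaround boils down to the first systemd unit entry from step 3 of the description. The following is only a minimal sketch of a standalone MachineConfig that enables the unit: the function name workaround_mc and the MachineConfig name 99-worker-enable-clevis-askpass are hypothetical, and the worker role and Ignition version 3.2.0 are simply carried over from the reproducer. Adjust them for the actual cluster.

```shell
#!/bin/sh
# Sketch: print a MachineConfig that only enables clevis-luks-askpass.path.
# Assumptions (not from the bug report): the helper name "workaround_mc"
# and the MachineConfig name "99-worker-enable-clevis-askpass".
workaround_mc() {
  cat <<'EOF'
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-enable-clevis-askpass
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
        - enabled: true
          name: clevis-luks-askpass.path
EOF
}
```

The output could then be reviewed and piped to the cluster, for example with `workaround_mc | oc apply -f -`. This has not been tested against a 4.7 cluster here.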

Comment 4 Micah Abbott 2021-04-19 14:17:34 UTC
This landed in RHCOS 48.84.202104170900-0; latest OCP 4.8 nightly payloads should include this fix

Comment 6 Michael Nguyen 2021-04-22 16:40:51 UTC
Verified that the preset is enabled on 4.8.0-0.nightly-2021-04-22-061234

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-04-22-061234   True        False         75m     Cluster version is 4.8.0-0.nightly-2021-04-22-061234


$ oc get nodes
NAME                                       STATUS   ROLES    AGE   VERSION
ci-ln-92bm8r2-f76d1-gs5fk-master-0         Ready    master   98m   v1.21.0-rc.0+3ced7a9
ci-ln-92bm8r2-f76d1-gs5fk-master-1         Ready    master   98m   v1.21.0-rc.0+3ced7a9
ci-ln-92bm8r2-f76d1-gs5fk-master-2         Ready    master   99m   v1.21.0-rc.0+3ced7a9
ci-ln-92bm8r2-f76d1-gs5fk-worker-b-mwkll   Ready    worker   91m   v1.21.0-rc.0+3ced7a9
ci-ln-92bm8r2-f76d1-gs5fk-worker-c-m56bz   Ready    worker   92m   v1.21.0-rc.0+3ced7a9
ci-ln-92bm8r2-f76d1-gs5fk-worker-d-r6vzg   Ready    worker   92m   v1.21.0-rc.0+3ced7a9
[mnguyen@pet32 4.8]$ oc debug node/ci-ln-92bm8r2-f76d1-gs5fk-worker-b-mwkll
Starting pod/ci-ln-92bm8r2-f76d1-gs5fk-worker-b-mwkll-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# cd /usr/lib/systemd/system-preset/
sh-4.4# cat 40-coreos.preset 
# Presets here that eventually should live in the generic fedora presets

# console-login-helper-messages - https://github.com/coreos/console-login-helper-messages
enable console-login-helper-messages-gensnippet-os-release.service
enable console-login-helper-messages-gensnippet-ssh-keys.service
# CA certs (probably to add to base fedora eventually)
enable coreos-update-ca-trust.service
# This one is from https://github.com/coreos/ignition-dracut
enable ignition-firstboot-complete.service
# Boot checkin services for cloud providers.
enable afterburn-checkin.service
enable afterburn-firstboot-checkin.service
# Target to write SSH key snippets from cloud providers.
enable afterburn-sshkeys.target
# Update agent
enable zincati.service
# Testing aid
enable coreos-liveiso-success.service
# See bootupd.yaml
enable bootupd.socket
# Enable rtas_errd for ppc64le to discover dynamically attached pci devices - https://bugzilla.redhat.com/show_bug.cgi?id=1811537
# The event for the attached device comes as a diag event.
# Ideally it should have been added as part of base Fedora - but since it was arch specific, it was not added: https://bugzilla.redhat.com/show_bug.cgi?id=1433859
enable rtas_errd.service
enable clevis-luks-askpass.path
sh-4.4# 
Removing debug pod ...
$ oc debug node/ci-ln-92bm8r2-f76d1-gs5fk-worker-b-mwkll -- chroot /host rpm-ostree status
Starting pod/ci-ln-92bm8r2-f76d1-gs5fk-worker-b-mwkll-debug ...
To use host binaries, run `chroot /host`
State: idle
Deployments:
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:61227d143899680daefb2475fc58c8e044fa4d80bf6b6f3af76c2d87567b37c4
              CustomOrigin: Managed by machine-config-operator
                   Version: 48.84.202104220217-0 (2021-04-22T02:20:53Z)

  ostree://328a44d7c259ca1e3ed31ae020f09d922f460be998657a92f684f6760443077b
                   Version: 48.83.202103221318-0 (2021-03-22T13:22:02Z)

Removing debug pod ...
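
The manual inspection of 40-coreos.preset above could also be scripted. A rough sketch, assuming a hypothetical helper named preset_enables: it only checks whether a single preset file contains an "enable <unit>" line, and does not evaluate systemd's preset precedence across multiple files.

```shell
#!/bin/sh
# Sketch: succeed if the given preset file has an "enable <unit>" line.
# "preset_enables" is a hypothetical helper, not part of RHCOS or systemd.
# Usage: preset_enables UNIT PRESET_FILE
preset_enables() {
  # Compare fields exactly so the dots in unit names are not treated as
  # regex wildcards; comment lines start with "#" and never match.
  awk -v unit="$1" '
    $1 == "enable" && $2 == unit { found = 1 }
    END { exit !found }
  ' "$2"
}
```

On a node (after `chroot /host`), this might be called as `preset_enables clevis-luks-askpass.path /usr/lib/systemd/system-preset/40-coreos.preset && echo enabled`.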

Comment 7 Michael Nguyen 2021-04-22 16:42:50 UTC
@vkochuku can you verify that the LUKS volume is automatically unlocked with your configuration now?

Comment 12 errata-xmlrpc 2021-07-27 22:58:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

Comment 13 Red Hat Bugzilla 2023-09-15 01:04:49 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days