Created attachment 1760219
mcd log

Description of problem:
When NBDE is enabled according to the doc https://github.com/openshift/openshift-docs/blob/enterprise-4.7/modules/installation-special-config-encrypt-disk-tang.adoc, the rootfs is too small (3G), which causes the MCD to panic when it tries to rebase the OS (see attached log). The error message hidden by the panic is:

```
# rpm-ostree rebase --experimental /run/mco-machine-os-content/os-content-257500977/srv/repo:c6ccbc4764826ef8ddecf083945a3b0172b015494d7b59c09b7c840045bd4565 --custom-origin-url pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:96be060a7824bed1eae6431f2209457a0263aa7cb70f495a68d23f734ba384d3 --custom-origin-description "Managed by machine-config-operator"
error: Pulling commit c6ccbc4764826ef8ddecf083945a3b0172b015494d7b59c09b7c840045bd4565 from local repo: Writing content object: min-free-space-percent '3%' would be exceeded, at least 5.3 MB requested
```

Version-Release number of selected component (if applicable):
4.8-nightly

How reproducible:
Every time

Steps to Reproduce:
1. Follow the Tang encryption doc
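For readers unfamiliar with the `min-free-space-percent` error above, here is a rough sketch of the arithmetic behind it. This is a simplified model and not ostree's actual implementation: the function names are hypothetical, and the free-space figure in the usage lines is made up for illustration (only the 3G rootfs size, the 3% threshold, and the 5.3 MB request come from this report).

```python
import os


def reserved_bytes(total_bytes, min_free_percent=3):
    """Bytes that must stay free under a min-free-space-percent policy."""
    return total_bytes * min_free_percent // 100


def write_would_be_refused(total_bytes, free_bytes, requested_bytes,
                           min_free_percent=3):
    """Return True if the write would eat into the reserved free space."""
    remaining = free_bytes - requested_bytes
    return remaining < reserved_bytes(total_bytes, min_free_percent)


def check_path(path, requested_bytes, min_free_percent=3):
    """Apply the same check to a mounted filesystem (Unix only)."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize
    free = st.f_bavail * st.f_frsize
    return write_would_be_refused(total, free, requested_bytes,
                                  min_free_percent)


if __name__ == "__main__":
    rootfs = 3 * 1024**3        # 3G rootfs, as in the report
    hypothetical_free = 100_000_000  # illustrative value, not from the log
    print(write_would_be_refused(rootfs, hypothetical_free, 5_300_000))
```

On a 3G filesystem the reserve is only about 97 MB, so a nearly full rootfs refuses even small objects, which is consistent with the rebase failing once the partition is not grown.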
I will look into fixing the panic. In the meantime, reassigning to RHCOS to check whether the script is correct for general use.
Note that I've also opened: https://bugzilla.redhat.com/show_bug.cgi?id=1934174
Ah, sorry, moving back. The panic fix is in https://github.com/openshift/machine-config-operator/pull/2449
Verified on 4.8.0-0.nightly-2021-03-08-092651. The MCD no longer panics when hitting the error. The RHCOS resize BZ mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1934113#c2 is still present, so I was able to capture the error (with no panic), but once that fix is in a build, this error should no longer occur.

Verification steps:
- Create a Tang server
- openshift-install create manifests
- Add the following two files, replacing the Tang server URL and thumbprint with yours

$ cat << EOF > ./99-openshift-worker-tang-encryption.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: worker-tang
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      luks:
        - name: root
          device: /dev/disk/by-partlabel/root
          clevis:
            tang:
              - url: https://tang.example.com
                thumbprint: PLjNyRdGw03zlRoGjQYMahSZGu9
          options: [--cipher, aes-cbc-essiv:sha256]
          wipeVolume: true
      filesystems:
        - device: /dev/mapper/root
          format: xfs
          wipeFilesystem: true
          label: root
  kernelArguments:
    - rd.neednet=1
EOF

$ cat << EOF > ./99-openshift-master-tang-encryption.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: master-tang
  labels:
    machineconfiguration.openshift.io/role: master
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      luks:
        - name: root
          device: /dev/disk/by-partlabel/root
          clevis:
            tang:
              - url: https://tang.example.com
                thumbprint: PLjNyRdGw03zlRoGjQYMahSZGu9
          options: [--cipher, aes-cbc-essiv:sha256]
          wipeVolume: true
      filesystems:
        - device: /dev/mapper/root
          format: xfs
          wipeFilesystem: true
          label: root
  kernelArguments:
    - rd.neednet=1
EOF

- openshift-install create cluster
- Verify there are no panics in the MCD logs

-- Logs begin at Mon 2021-03-08 16:14:27 UTC, end at Mon 2021-03-08 17:09:42 UTC. --
Mar 08 16:24:18 ip-10-0-131-245 systemd[1]: Starting Machine Config Daemon Firstboot...
Mar 08 16:24:18 ip-10-0-131-245 sh[4198]: sed: can't read /etc/yum.repos.d/*.repo: No such file or directory
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.094481 4200 rpm-ostree.go:258] Running captured: rpm-ostree status --json
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.317745 4200 daemon.go:218] Booted osImageURL: (47.83.202102090044-0)
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.319066 4200 update.go:597] Checking Reconcilable for config mco-empty-mc to rendered-master-123e469c42eaa65bce01288e5c7aa6fc
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.319601 4200 update.go:1905] Starting update from mco-empty-mc to rendered-master-123e469c42eaa65bce01288e5c7aa6fc: &{osUpdate:true kargs:true fips:false passwd:false files:false units:false kernelType:false extensions:false}
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.323664 4200 update.go:1220] Updating files
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.324298 4200 update.go:1293] Deleting stale data
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.326720 4200 run.go:18] Running: nice -- ionice -c 3 oc image extract --path /:/run/mco-machine-os-content/os-content-965836767 --registry-config /var/lib/kubelet/config.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41
Mar 08 16:25:16 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:16.601183 4200 update.go:1783] Updating OS to quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41
Mar 08 16:25:16 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:16.601396 4200 rpm-ostree.go:258] Running captured: rpm-ostree status --json
Mar 08 16:25:16 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:16.630625 4200 rpm-ostree.go:184] Current origin is not custom
Mar 08 16:25:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:18.221356 4200 rpm-ostree.go:211] Pivoting to: 48.83.202103080317-0 (5633f70d06713fab5da5b884c1637b2bc6b0de7cc76967e9b7d75fcde315692e)
Mar 08 16:25:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:18.221377 4200 rpm-ostree.go:243] Executing rebase from repo path /run/mco-machine-os-content/os-content-965836767/srv/repo with customImageURL pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41 and checksum 5633f70d06713fab5da5b884c1637b2bc6b0de7cc76967e9b7d75fcde315692e
Mar 08 16:25:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:18.221391 4200 rpm-ostree.go:258] Running captured: rpm-ostree rebase --experimental /run/mco-machine-os-content/os-content-965836767/srv/repo:5633f70d06713fab5da5b884c1637b2bc6b0de7cc76967e9b7d75fcde315692e --custom-origin-url pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41 --custom-origin-description Managed by machine-config-operator
Mar 08 16:25:19 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:19.153797 4200 update.go:1220] Updating files
Mar 08 16:25:19 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:19.154211 4200 update.go:1293] Deleting stale data
Mar 08 16:25:19 ip-10-0-131-245 machine-config-daemon[4200]: error: failed to update OS to quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41 : error running rpm-ostree rebase --experimental /run/mco-machine-os-content/os-content-965836767/srv/repo:5633f70d06713fab5da5b884c1637b2bc6b0de7cc76967e9b7d75fcde315692e --custom-origin-url pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41 --custom-origin-description Managed by machine-config-operator: error: Pulling commit 5633f70d06713fab5da5b884c1637b2bc6b0de7cc76967e9b7d75fcde315692e from local repo: Writing content object: min-free-space-percent '3%' would be exceeded, at least 123.8 MB requested
Mar 08 16:25:19 ip-10-0-131-245 machine-config-daemon[4200]: : exit status 1
Mar 08 16:25:19 ip-10-0-131-245 systemd[1]: machine-config-daemon-firstboot.service: Main process exited, code=exited, status=1/FAILURE
Mar 08 16:25:19 ip-10-0-131-245 systemd[1]: machine-config-daemon-firstboot.service: Failed with result 'exit-code'.
Mar 08 16:25:19 ip-10-0-131-245 systemd[1]: Failed to start Machine Config Daemon Firstboot.
Mar 08 16:25:19 ip-10-0-131-245 systemd[1]: machine-config-daemon-firstboot.service: Consumed 16.260s CPU time
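
The last verification step (no panics in the MCD logs, with the free-space error still reported cleanly) could be checked with a small script along these lines. This is only a sketch: `summarize_mcd_log` is a hypothetical helper, and the panic patterns assume the usual Go runtime output (`panic:` and `goroutine N [running]`), since the exact panic text from the attached log is not quoted in this thread.

```python
import re

# Assumed markers of a Go runtime panic; not taken from the attached log.
PANIC_RE = re.compile(r"\bpanic:|goroutine \d+ \[running\]")

# Matches the rpm-ostree free-space error quoted in this report.
SPACE_RE = re.compile(
    r"min-free-space-percent '(\d+)%' would be exceeded, "
    r"at least ([\d.]+) (\w+) requested"
)


def summarize_mcd_log(text):
    """Report whether the log shows a panic and/or the free-space error."""
    panic_lines = [ln for ln in text.splitlines() if PANIC_RE.search(ln)]
    match = SPACE_RE.search(text)
    return {
        "panicked": bool(panic_lines),
        # (percent, amount, unit) as strings, or None if the error is absent
        "space_error": match.groups() if match else None,
    }


if __name__ == "__main__":
    import sys
    # Example: journalctl -u machine-config-daemon-firstboot | python3 check.py
    print(summarize_mcd_log(sys.stdin.read()))
```

For the log above, this should report no panic while still surfacing the `min-free-space-percent` failure, which matches the verified behavior.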
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438