Bug 1934113
Summary: | mcd panic when there's not enough free disk space | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Yuval Kashtan <ykashtan> |
Component: | Machine Config Operator | Assignee: | Yu Qi Zhang <jerzhang> |
Status: | CLOSED ERRATA | QA Contact: | Michael Nguyen <mnguyen> |
Severity: | high | Priority: | medium |
Version: | 4.8 | CC: | bbreard, imcleod, jerzhang, jligon, nstielau |
Target Release: | 4.8.0 | Doc Type: | No Doc Update |
Hardware: | Unspecified | OS: | Unspecified |
Last Closed: | 2021-07-27 22:49:00 UTC | Type: | Bug |

Description
Yuval Kashtan
2021-03-02 14:28:29 UTC
Will look to fix the panic. In the meantime, reassigning to RHCOS to check whether the script is correct for general use.

Note that I've also opened: https://bugzilla.redhat.com/show_bug.cgi?id=1934174

Ah sorry, moving back. Panic fix in https://github.com/openshift/machine-config-operator/pull/2449

Verified on 4.8.0-0.nightly-2021-03-08-092651. MCD no longer panics when hitting the error. The RHCOS resize bug mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1934113#c2 is still present, so I was able to capture the error (with no panic); once that fix is in a build, this error should no longer occur.

Verification steps:

- Create a Tang server
- openshift-install create manifests
- Add the following two files, replacing the Tang server URL and thumbprint with yours

$ cat << EOF > ./99-openshift-worker-tang-encryption.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: worker-tang
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      luks:
        - name: root
          device: /dev/disk/by-partlabel/root
          clevis:
            tang:
              - url: https://tang.example.com
                thumbprint: PLjNyRdGw03zlRoGjQYMahSZGu9
          options: [--cipher, aes-cbc-essiv:sha256]
          wipeVolume: true
      filesystems:
        - device: /dev/mapper/root
          format: xfs
          wipeFilesystem: true
          label: root
  kernelArguments:
    - rd.neednet=1
EOF

$ cat << EOF > ./99-openshift-master-tang-encryption.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: master-tang
  labels:
    machineconfiguration.openshift.io/role: master
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      luks:
        - name: root
          device: /dev/disk/by-partlabel/root
          clevis:
            tang:
              - url: https://tang.example.com
                thumbprint: PLjNyRdGw03zlRoGjQYMahSZGu9
          options: [--cipher, aes-cbc-essiv:sha256]
          wipeVolume: true
      filesystems:
        - device: /dev/mapper/root
          format: xfs
          wipeFilesystem: true
          label: root
  kernelArguments:
    - rd.neednet=1
EOF

- openshift-install create cluster
- Verify there are no panics in the mcd logs

-- Logs begin at Mon 2021-03-08 16:14:27 UTC, end at Mon 2021-03-08 17:09:42 UTC. --
Mar 08 16:24:18 ip-10-0-131-245 systemd[1]: Starting Machine Config Daemon Firstboot...
Mar 08 16:24:18 ip-10-0-131-245 sh[4198]: sed: can't read /etc/yum.repos.d/*.repo: No such file or directory
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.094481 4200 rpm-ostree.go:258] Running captured: rpm-ostree status --json
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.317745 4200 daemon.go:218] Booted osImageURL: (47.83.202102090044-0)
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.319066 4200 update.go:597] Checking Reconcilable for config mco-empty-mc to rendered-master-123e469c42eaa65bce01288e5c7aa6fc
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.319601 4200 update.go:1905] Starting update from mco-empty-mc to rendered-master-123e469c42eaa65bce01288e5c7aa6fc: &{osUpdate:true kargs:true fips:false passwd:false files:false units:false kernelType:false extensions:false}
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.323664 4200 update.go:1220] Updating files
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.324298 4200 update.go:1293] Deleting stale data
Mar 08 16:24:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:24:18.326720 4200 run.go:18] Running: nice -- ionice -c 3 oc image extract --path /:/run/mco-machine-os-content/os-content-965836767 --registry-config /var/lib/kubelet/config.json quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41
Mar 08 16:25:16 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:16.601183 4200 update.go:1783] Updating OS to quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41
Mar 08 16:25:16 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:16.601396 4200 rpm-ostree.go:258] Running captured: rpm-ostree status --json
Mar 08 16:25:16 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:16.630625 4200 rpm-ostree.go:184] Current origin is not custom
Mar 08 16:25:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:18.221356 4200 rpm-ostree.go:211] Pivoting to: 48.83.202103080317-0 (5633f70d06713fab5da5b884c1637b2bc6b0de7cc76967e9b7d75fcde315692e)
Mar 08 16:25:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:18.221377 4200 rpm-ostree.go:243] Executing rebase from repo path /run/mco-machine-os-content/os-content-965836767/srv/repo with customImageURL pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41 and checksum 5633f70d06713fab5da5b884c1637b2bc6b0de7cc76967e9b7d75fcde315692e
Mar 08 16:25:18 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:18.221391 4200 rpm-ostree.go:258] Running captured: rpm-ostree rebase --experimental /run/mco-machine-os-content/os-content-965836767/srv/repo:5633f70d06713fab5da5b884c1637b2bc6b0de7cc76967e9b7d75fcde315692e --custom-origin-url pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41 --custom-origin-description Managed by machine-config-operator
Mar 08 16:25:19 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:19.153797 4200 update.go:1220] Updating files
Mar 08 16:25:19 ip-10-0-131-245 machine-config-daemon[4200]: I0308 16:25:19.154211 4200 update.go:1293] Deleting stale data
Mar 08 16:25:19 ip-10-0-131-245 machine-config-daemon[4200]: error: failed to update OS to quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41 : error running rpm-ostree rebase --experimental /run/mco-machine-os-content/os-content-965836767/srv/repo:5633f70d06713fab5da5b884c1637b2bc6b0de7cc76967e9b7d75fcde315692e --custom-origin-url pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e6a29805478181c58ee8922085fc919cd19a15617f6e32ca9b0580c086fcfb41 --custom-origin-description Managed by machine-config-operator: error: Pulling commit 5633f70d06713fab5da5b884c1637b2bc6b0de7cc76967e9b7d75fcde315692e from local repo: Writing content object: min-free-space-percent '3%' would be exceeded, at least 123.8 MB requested
Mar 08 16:25:19 ip-10-0-131-245 machine-config-daemon[4200]: : exit status 1
Mar 08 16:25:19 ip-10-0-131-245 systemd[1]: machine-config-daemon-firstboot.service: Main process exited, code=exited, status=1/FAILURE
Mar 08 16:25:19 ip-10-0-131-245 systemd[1]: machine-config-daemon-firstboot.service: Failed with result 'exit-code'.
Mar 08 16:25:19 ip-10-0-131-245 systemd[1]: Failed to start Machine Config Daemon Firstboot.
Mar 08 16:25:19 ip-10-0-131-245 systemd[1]: machine-config-daemon-firstboot.service: Consumed 16.260s CPU time

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438
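
The last verification step (checking the mcd logs for panics) could be scripted roughly as follows. This is only a sketch: the helper name classify_mcd_log is hypothetical and not part of the MCO, and the patterns assume that a Go panic appears in the journal as a "panic: " line or a "goroutine N [" stack header, and that the free-space failure contains the "min-free-space-percent" text seen in the log above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the MCO): classify a Machine Config Daemon
# firstboot log read from stdin.
#   prints "panic"          if a Go panic trace appears in the log
#   prints "min-free-space" if only the ostree free-space error appears
#   prints "clean"          otherwise
classify_mcd_log() {
    log=$(cat)
    # Assumption: Go panics show up as "panic: ..." or "goroutine N [running]:" lines.
    if printf '%s\n' "$log" | grep -Eq 'panic: |goroutine [0-9]+ \['; then
        echo panic
    # Error text taken from the rpm-ostree failure in the log above.
    elif printf '%s\n' "$log" | grep -q 'min-free-space-percent'; then
        echo min-free-space
    else
        echo clean
    fi
}

# Usage on a node (unit name as it appears in the log above):
#   journalctl -u machine-config-daemon-firstboot.service --no-pager | classify_mcd_log
```

With the fix in place, the log captured above would be expected to classify as "min-free-space" rather than "panic", since the rebase still fails but the daemon exits with status 1 instead of crashing.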