Bug 1866702
Summary: | Facing warning messages 'failed to get imageFs info: non-existent label "crio-images"' after upgrading to OCP 4.4.12 | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Asheth <asheth> | |
Component: | Node | Assignee: | Peter Hunt <pehunt> | |
Node sub component: | CRI-O | QA Contact: | Sunil Choudhary <schoudha> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | medium | |||
Priority: | medium | CC: | alchan, aos-bugs, dshumake, harpatil, jhou, jokerman, mharri, oarribas, obulatov, pehunt, rdomnu, weihuang, wzheng | |
Version: | 4.4 | |||
Target Milestone: | --- | |||
Target Release: | 4.6.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1878264 | Environment: | ||
Last Closed: | 2020-10-27 16:25:22 UTC | Type: | Bug | |
Bug Blocks: | 1878264, 1878265 |
Description
Asheth
2020-08-06 07:44:21 UTC
What's the output of `systemctl status crio`? We have seen this problem before when the kubelet comes up before CRI-O does.

*** Bug 1866045 has been marked as a duplicate of this bug. ***

I have not yet had a chance to look at this. I am going to work with my team in the coming sprint to get to the bottom of it.

I got it! When the MCO applies a ContainerRuntimeConfig, it takes the ignition template and populates it with some defaults and the overridden values (in 4.4 and 4.3; this behavior changed slightly in 4.5). CRI-O's default containers/storage options (root, runroot, storage_driver, storage_option) are all commented out by default, because we usually want to inherit them from `/etc/containers/storage.conf`. However, due to limitations in ignition, the newly created crio config prints all options, even ones that are empty (it doesn't know whether an option is supposed to be empty or not). This causes those values to all be empty. CRI-O then serves that information directly on its `/info` endpoint, which cadvisor uses to populate its information about where the crio images are. Thus, if we apply a ctrcfg, crio is lying to cadvisor about where the images are, and cadvisor gets confused and spits out that error.

The solution is to properly inherit the defaults from containers/storage that come from the storage.json. The master version of that PR is attached. Once it's approved, I'll backport it all the way to 4.3.

I've verified that this fixes all cases up through 4.4. I am not sure why a customer is facing it in 4.5.5; I wasn't able to reproduce it, but this may also fix it there.

Technically, this is already fixed in 4.6/4.7, so I'm marking it as modified. I'll clone back to 4.4, where the issue actually occurs.

Verified in version: 4.6.0-0.nightly-2020-09-12-230035. Created a ContainerRuntimeConfig [1] changing the pod PID limit, and found no error messages in the event log, kubelet log, or crio log.
[1] ContainerRuntimeConfig.yaml:

    apiVersion: machineconfiguration.openshift.io/v1
    kind: ContainerRuntimeConfig
    metadata:
      name: set-pids-limit
    spec:
      machineConfigPoolSelector:
        matchLabels:
          custom-crio: high-pid-limit
      containerRuntimeConfig:
        pidsLimit: 4096

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196
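The root cause analysed in this bug (a rendered config that prints every storage option, including unset ones, so that empty values mask the defaults inherited from `/etc/containers/storage.conf`) can be illustrated with a small model. This is only a sketch under stated assumptions, not CRI-O or MCO code: the function names `render_naive` and `effective_config` are hypothetical, and the default values in `STORAGE_CONF_DEFAULTS` are illustrative stand-ins for whatever a node's storage.conf actually contains.

```python
# Hypothetical model of the bug; none of these names come from CRI-O or the MCO.
# Illustrative defaults only: a real node reads them from /etc/containers/storage.conf.
STORAGE_CONF_DEFAULTS = {
    "root": "/var/lib/containers/storage",
    "runroot": "/run/containers/storage",
    "storage_driver": "overlay",
    "storage_option": [],
}


def render_naive(overrides):
    """Mimic a template that prints every option, writing unset ones as empty."""
    return {key: overrides.get(key, "") for key in STORAGE_CONF_DEFAULTS}


def effective_config(rendered, inherit_defaults):
    """Resolve the storage settings the runtime would report to its clients."""
    result = {}
    for key, default in STORAGE_CONF_DEFAULTS.items():
        value = rendered.get(key, "")
        # The fix: an empty value means "not set", so fall back to the default.
        if inherit_defaults and not value:
            value = default
        result[key] = value
    return result


# A ctrcfg that only changes pidsLimit sets no storage options at all.
rendered = render_naive({})
broken = effective_config(rendered, inherit_defaults=False)
fixed = effective_config(rendered, inherit_defaults=True)

print(repr(broken["root"]))  # empty: the image location is reported as unknown
print(repr(fixed["root"]))   # the inherited default is reported instead
```

In this model, the `broken` result corresponds to what cadvisor would see after a ContainerRuntimeConfig is applied on an affected version, and `fixed` corresponds to the behaviour once empty options fall back to the inherited defaults.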