Bug 1961472
Summary: | openshift-marketplace pods in CrashLoopBackOff state after RHACS installed with an SCC with readOnlyFileSystem set to true | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Matt Bargenquast <mbargenq> | |
Component: | OLM | Assignee: | Joe Lanford <jlanford> | |
OLM sub component: | OperatorHub | QA Contact: | Bruno Andrade <bandrade> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | high | |||
Priority: | high | CC: | jcall, jlanford, neilcar | |
Version: | 4.7 | Keywords: | Triaged | |
Target Milestone: | --- | |||
Target Release: | 4.8.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Cause: Catalog registry pods do not have `readOnlyRootFilesystem: false` explicitly set in their securityContext field.
Consequence: If an SCC exists that enforces `readOnlyRootFilesystem: true` and otherwise matches the catalog registry pod securityContext, it will be assigned to the catalog registry pod, causing it to fail in a crash loop.
Fix: Explicitly set `readOnlyRootFilesystem: false` when creating catalog registry pods.
Result: Catalog registry pods are no longer matched to SCCs that enforce a read-only root filesystem, and thus no longer fail.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1962314 | Environment: | |
Last Closed: | 2021-07-27 23:08:46 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1962314 |
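
The Cause and Consequence recorded in the Doc Text can be illustrated with a small model of SCC selection. This is a minimal sketch, not the OpenShift admission code: the `SCC` and `Container` classes, the numeric `restrictiveness` rank, and the SCC named `read-only-example` are all hypothetical stand-ins. It assumes candidates are ordered by priority (nil counted as 0, higher values first), then from most to least restrictive, then by name, and that an SCC enforcing a read-only root filesystem accepts a container that leaves the field unset but rejects one that sets it to false explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SCC:
    """Hypothetical, heavily simplified stand-in for a SecurityContextConstraints."""
    name: str
    read_only_root_filesystem: bool
    restrictiveness: int            # illustrative rank: larger means more restrictive
    priority: Optional[int] = None  # None is treated as 0


@dataclass
class Container:
    """Models only the one securityContext field this bug is about."""
    read_only_root_filesystem: Optional[bool] = None  # None means "not specified"


def admits(scc: SCC, container: Container) -> bool:
    """Return True if the SCC accepts the container's read-only-root setting."""
    if not scc.read_only_root_filesystem:
        return True
    # An enforcing SCC accepts "unset" (and defaults it to true) or an explicit
    # true, but not an explicit false.
    return container.read_only_root_filesystem is not False


def select_scc(sccs: List[SCC], container: Container) -> Optional[str]:
    """Return the name of the first SCC, in the assumed order, that admits the container."""
    ordered = sorted(
        sccs,
        key=lambda s: (-(s.priority or 0), -s.restrictiveness, s.name),
    )
    for scc in ordered:
        if admits(scc, container):
            return scc.name
    return None


SCCS = [
    SCC("restricted", read_only_root_filesystem=False, restrictiveness=1),
    SCC("read-only-example", read_only_root_filesystem=True, restrictiveness=2),
]

# Field left unset: the read-only SCC matches first.
print(select_scc(SCCS, Container()))
# Field explicitly false: the read-only SCC is skipped.
print(select_scc(SCCS, Container(read_only_root_filesystem=False)))
```

With the field unset the sketch selects `read-only-example`; with an explicit `False` it falls through to `restricted`, which is the effect the Fix relies on.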
Description
Matt Bargenquast
2021-05-18 04:02:18 UTC
> When evaluating SCCs, the admission controller runs through them by priority. A priority of 'nil' is equal to a priority of 0, which is the highest priority; therefore, all of these SCCs are at the top of the list. From there, they're evaluated from most restrictive to least restrictive until an SCC matches the requests in the pod's SecurityContext and applies the first one that matches.
>
> The root cause here is that the API server's security context specifies it needs to be privileged but it does _not_ specify that it needs a read/write root file system. So, if the StackRox SCC is in place, that's the most restrictive, priority 0 SCC and it gets applied. Later, when the API server tries to write something, it fails and bad things happen.

It appears that we need to explicitly set `securityContext.readOnlyRootFilesystem` to false to avoid matching to the StackRox SCC. This seems somewhat unexpected to me since the default value of `securityContext.readOnlyRootFilesystem` is false, so it seems unnecessary to set it explicitly, but it is very possible that I don't understand the background and reasoning.

Looks like this is the pod that needs to be updated with an explicit container securityContext:
https://github.com/operator-framework/operator-lifecycle-manager/blob/15790a8a2f07fe65a3dbf5a45a54d35e20f2cce9/pkg/controller/registry/reconciler/reconciler.go#L94

OCP Version: 4.8.0-0.nightly-2021-05-21-101954
OLM version: 0.17.0
git commit: ca1f0b69c3e2eb06ab4e62517fe5bd11e59a3239

1) Confirmed that catalog pod has attribute readOnlyRootFilesystem set to false

    oc get pods redhat-operators-xl5jb -n openshift-marketplace -o yaml
    spec:
      containers:
      - image: registry.redhat.io/redhat/redhat-operator-index:v4.8
        imagePullPolicy: Always
    [...]
        securityContext:
          capabilities:
            drop:
            - MKNOD
          readOnlyRootFilesystem: false

2) Installed the Advanced Cluster Management for Kubernetes Operator

    oc get csv -n open-cluster-management
    NAME                                 DISPLAY                                      VERSION   REPLACES                             PHASE
    advanced-cluster-management.v2.2.3   Advanced Cluster Management for Kubernetes   2.2.3     advanced-cluster-management.v2.2.2   Succeeded

3) Checked that the catalog pods are healthy:

    oc get pods -n openshift-marketplace
    NAME                                                              READY   STATUS      RESTARTS   AGE
    14bbd46d68f3ddd50b9328cee6854a36807ef784dac2bded9cc20638fbv7f5f   0/1     Completed   0          5m51s
    certified-operators-jcpbp                                         1/1     Running     0          49m
    community-operators-5qt64                                         1/1     Running     0          49m
    marketplace-operator-99db68d8d-czzwm                              1/1     Running     0          52m
    redhat-marketplace-5q5vk                                          1/1     Running     0          49m
    redhat-operators-xl5jb                                            1/1     Running     0          49m

LGTM, marking as VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:2438
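
The manual check in verification step 1 could also be scripted. The following is a rough sketch under stated assumptions: it expects a pod manifest as a Python dict (for example, parsed from `oc get pod <name> -n openshift-marketplace -o json`), the helper name `containers_missing_explicit_rw_root` is hypothetical, and the example manifest is hand-written from the fields shown in the report, with the container name `registry-server` assumed purely for illustration.

```python
import json


def containers_missing_explicit_rw_root(pod: dict) -> list:
    """Return names of containers that do not set readOnlyRootFilesystem to false explicitly."""
    missing = []
    for container in pod.get("spec", {}).get("containers", []):
        security_context = container.get("securityContext") or {}
        # An absent field, or a value of true, both count as "not explicitly false".
        if security_context.get("readOnlyRootFilesystem") is not False:
            missing.append(container.get("name", "<unnamed>"))
    return missing


# Hand-written, trimmed example; the container name is an assumption.
EXAMPLE_POD = json.loads("""
{
  "spec": {
    "containers": [
      {
        "name": "registry-server",
        "image": "registry.redhat.io/redhat/redhat-operator-index:v4.8",
        "imagePullPolicy": "Always",
        "securityContext": {
          "capabilities": {"drop": ["MKNOD"]},
          "readOnlyRootFilesystem": false
        }
      }
    ]
  }
}
""")

print(containers_missing_explicit_rw_root(EXAMPLE_POD))
```

An empty list means every container in the manifest carries the explicit setting; any names returned identify containers that could still be matched to an SCC enforcing a read-only root filesystem.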