Description of problem:

OCP upgrade failing.

Version-Release number of the following components:

oc version
Client Version: 4.8.0-202108312109.p0.git.0d10c3f.assembly.stream-0d10c3f
Server Version: 4.10.13
Kubernetes Version: v1.23.5+b463d71

How reproducible:

Always

Steps to Reproduce:

1. Create the following SCC (which has `readOnlyRootFilesystem: true`):

~~~
cat << EOF | oc create -f -
allowHostDirVolumePlugin: true
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: true
allowPrivilegedContainer: true
allowedCapabilities: []
apiVersion: security.openshift.io/v1
defaultAddCapabilities: []
fsGroup:
  type: MustRunAs
groups: []
kind: SecurityContextConstraints
metadata:
  annotations:
    meta.helm.sh/release-name: azure-arc
    meta.helm.sh/release-namespace: default
  labels:
    app.kubernetes.io/managed-by: Helm
  name: kube-aad-proxy-scc
priority: null
readOnlyRootFilesystem: true
requiredDropCapabilities: []
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
users:
- system:serviceaccount:azure-arc:azure-arc-kube-aad-proxy-sa
volumes:
- configMap
- hostPath
- secret
EOF
~~~

2. oc adm upgrade --to=4.10.20

Actual results:

The SCC kube-aad-proxy-scc, which has readOnlyRootFilesystem: true, is injected into the pod version-4.10.20-smvt9-6vqwc, causing it to fail.

~~~
# oc get po -n openshift-cluster-version
NAME                                        READY   STATUS    RESTARTS   AGE
cluster-version-operator-6b5c8ff5c8-4bmxx   1/1     Running   0          33m
version-4.10.20-smvt9-6vqwc                 0/1     Error     0          10s

# oc logs version-4.10.20-smvt9-6vqwc -n openshift-cluster-version
mv: cannot remove '/manifests/0000_00_cluster-version-operator_00_namespace.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_01_adminack_configmap.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_01_admingate_configmap.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_01_clusteroperator.crd.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_01_clusterversion.crd.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_02_roles.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_03_deployment.yaml': Read-only file system
mv: cannot remove '/manifests/0000_90_cluster-version-operator_00_prometheusrole.yaml': Read-only file system
mv: cannot remove '/manifests/0000_90_cluster-version-operator_01_prometheusrolebinding.yaml': Read-only file system
mv: cannot remove '/manifests/0000_90_cluster-version-operator_02_servicemonitor.yaml': Read-only file system
mv: cannot remove '/manifests/0001_00_cluster-version-operator_03_service.yaml': Read-only file system
~~~

Expected results:

Pod version-4.10.20-smvt9-6vqwc should run fine.

Additional info:

I don't know why, but the SCC kube-aad-proxy-scc is injected into pod version-4.10.20-smvt9-6vqwc:

~~~
apiVersion: v1 kind: Pod metadata: annotations: k8s.v1.cni.cncf.io/network-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.129.0.70" ], "default": true, "dns": {} }] k8s.v1.cni.cncf.io/networks-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.129.0.70" ], "default": true, "dns": {} }] openshift.io/scc: kube-aad-proxy-scc ### HERE creationTimestamp: "2022-07-25T16:47:39Z" generateName: version-4.10.20-5xqtv- labels: controller-uid: 
ba707bbe-1825-4f80-89ce-f6bf2301a812 job-name: version-4.10.20-5xqtv name: version-4.10.20-5xqtv-9gcwk namespace: openshift-cluster-version ownerReferences: - apiVersion: batch/v1 blockOwnerDeletion: true controller: true kind: Job name: version-4.10.20-5xqtv uid: ba707bbe-1825-4f80-89ce-f6bf2301a812 resourceVersion: "40040" uid: 0d668d3d-7452-463f-a421-4dfee9c89c23 spec: containers: - args: - -c - mkdir -p /etc/cvo/updatepayloads/KsrCX7X9QbtoXkW3TkPcww && mv /manifests /etc/cvo/updatepayloads/KsrCX7X9QbtoXkW3TkPcww/manifests && mkdir -p /etc/cvo/updatepayloads/KsrCX7X9QbtoXkW3TkPcww && mv /release-manifests /etc/cvo/updatepayloads/KsrCX7X9QbtoXkW3TkPcww/release-manifests command: - /bin/sh image: quay.io/openshift-release-dev/ocp-release@sha256:b89ada9261a1b257012469e90d7d4839d0d2f99654f5ce76394fa3f06522b600 imagePullPolicy: IfNotPresent name: payload resources: requests: cpu: 10m ephemeral-storage: 2Mi memory: 50Mi securityContext: privileged: true readOnlyRootFilesystem: true terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /etc/cvo/updatepayloads name: payloads - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-fwblb readOnly: true dnsPolicy: ClusterFirst enableServiceLinks: true imagePullSecrets: - name: default-dockercfg-smmf4 nodeName: ip-10-0-215-206.eu-central-1.compute.internal nodeSelector: node-role.kubernetes.io/master: "" preemptionPolicy: PreemptLowerPriority priority: 1000000000 priorityClassName: openshift-user-critical restartPolicy: OnFailure schedulerName: default-scheduler securityContext: fsGroup: 1000030000 seLinuxOptions: level: s0:c6,c0 serviceAccount: default serviceAccountName: default terminationGracePeriodSeconds: 30 tolerations: - key: node-role.kubernetes.io/master - effect: NoExecute key: node.kubernetes.io/not-ready operator: Exists tolerationSeconds: 300 - effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists tolerationSeconds: 300 - effect: NoSchedule key: node.kubernetes.io/memory-pressure operator: Exists volumes: - hostPath: path: /etc/cvo/updatepayloads type: "" name: payloads - name: kube-api-access-fwblb projected: defaultMode: 420 sources: - serviceAccountToken: expirationSeconds: 3607 path: token - configMap: items: - key: ca.crt path: ca.crt name: kube-root-ca.crt - downwardAPI: items: - fieldRef: apiVersion: v1 fieldPath: metadata.namespace path: namespace - configMap: items: - key: service-ca.crt path: service-ca.crt name: openshift-service-ca.crt status: conditions: - lastProbeTime: null lastTransitionTime: "2022-07-25T16:47:39Z" status: "True" type: Initialized - lastProbeTime: null lastTransitionTime: "2022-07-25T16:47:39Z" message: 'containers with unready status: [payload]' reason: ContainersNotReady status: "False" type: Ready - lastProbeTime: null lastTransitionTime: "2022-07-25T16:47:39Z" message: 'containers with unready status: [payload]' reason: ContainersNotReady status: "False" type: ContainersReady - lastProbeTime: null lastTransitionTime: "2022-07-25T16:47:39Z" status: "True" type: PodScheduled containerStatuses: - containerID: cri-o://ac6f6a5d8925620f1a2835a50fe26ea02d35e3a5c2d033015f38fde5206daf8c image: quay.io/openshift-release-dev/ocp-release@sha256:b89ada9261a1b257012469e90d7d4839d0d2f99654f5ce76394fa3f06522b600 imageID: quay.io/openshift-release-dev/ocp-release@sha256:b89ada9261a1b257012469e90d7d4839d0d2f99654f5ce76394fa3f06522b600 lastState: terminated: containerID: 
cri-o://fdac85e975eb00a3abd08e18061ae3673a857769ddfc87ca94a3527a8c7b83f3 exitCode: 1 finishedAt: "2022-07-25T16:47:42Z" reason: Error startedAt: "2022-07-25T16:47:42Z" name: payload ready: false restartCount: 2 started: false state: terminated: containerID: cri-o://ac6f6a5d8925620f1a2835a50fe26ea02d35e3a5c2d033015f38fde5206daf8c exitCode: 1 finishedAt: "2022-07-25T16:47:56Z" reason: Error startedAt: "2022-07-25T16:47:56Z" hostIP: 10.0.215.206 phase: Running podIP: 10.129.0.70 podIPs: - ip: 10.129.0.70 qosClass: Burstable startTime: "2022-07-25T16:47:39Z" ~~~
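For anyone triaging a similar failure, a quick way to confirm this mode (a sketch; the pod name below is the one from this report, substitute your own failing version-* pod) is to check which SCC admitted the pod and whether the payload container ended up with a read-only root filesystem:

~~~
# Which SCC admitted the payload pod (here: kube-aad-proxy-scc).
oc -n openshift-cluster-version get pod version-4.10.20-smvt9-6vqwc -o yaml | grep 'openshift.io/scc'

# Whether the payload container was mutated to a read-only root filesystem.
oc -n openshift-cluster-version get pod version-4.10.20-smvt9-6vqwc -o yaml | grep readOnlyRootFilesystem
~~~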
Associating a restrictive SCC with the version-... pods is not a supported operation. We have [1] tracking an RFE for clearer reporting when this happens, but ideally folks fix impacted customers by fixing whatever component is making these SCC associations.

But from [2]:

> When the complete set of available SCCs are determined they are ordered by:
>
> 1. Highest priority first, nil is considered a 0 priority
> 2. If priorities are equal, the SCCs will be sorted from most restrictive to least restrictive
> 3. If both priorities and restrictions are equal the SCCs will be sorted by name

And from [3]:

> The set of SCCs that admission uses to authorize a pod are determined by the user identity and groups that the user belongs to. Additionally, if the pod specifies a service account, the set of allowable SCCs includes any constraints accessible to the service account.
>
> Admission uses the following approach to create the final security context for the pod:
>
> 1. Retrieve all SCCs available for use.
> 2. Generate field values for security context settings that were not specified on the request.
> 3. Validate the final settings against the available constraints.

Your kube-aad-proxy-scc has 'priority: null', which should put it at the bottom of the ranking of relevant-to-this-pod SCCs. I'll poke around and see what the default SCCs look like...

[1]: https://issues.redhat.com/browse/OTA-680
[2]: https://docs.openshift.com/container-platform/4.10/authentication/managing-security-context-constraints.html#scc-prioritization_configuring-internal-oauth
[3]: https://docs.openshift.com/container-platform/4.10/authentication/managing-security-context-constraints.html#admission_configuring-internal-oauth
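To see how that ranking plays out on a live cluster, something along these lines should report which SCC admission would pick for the version pod's service account (a sketch; 'version-pod.yaml' is a hypothetical file holding the failing pod's spec):

~~~
# Dump the failing payload pod spec to a file (file name is hypothetical).
oc -n openshift-cluster-version get pod version-4.10.20-smvt9-6vqwc -o yaml > version-pod.yaml

# Ask SCC admission which constraint would be used for that pod when it is
# created under the namespace's "default" service account.
oc adm policy scc-subject-review -z default -n openshift-cluster-version -f version-pod.yaml
~~~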
Also likely relevant, 4.10 both grew pod-security.kubernetes.io/* labels [1] and cleared the openshift.io/run-level label [2].

$ git --no-pager log --oneline -3 origin/release-4.10 -- install/0000_00_cluster-version-operator_00_namespace.yaml
539e9449 (origin/pr/623) Fix run-level label to empty string.
f58dd1c5 (origin/pr/686) install: Add description annotations to manifests
6e5e23e3 (origin/pr/668) podsecurity: enforce privileged for openshift-cluster-version namespace

None of those were in 4.9:

$ git --no-pager log --oneline -1 origin/release-4.9 -- install/0000_00_cluster-version-operator_00_namespace.yaml
70097361 (origin/pr/543) Add management workload annotations

And all of them landed in 4.10 via master (so they're in 4.10 before it GAed, and in 4.11 and later too):

$ git --no-pager log --oneline -4 origin/master -- install/0000_00_cluster-version-operator_00_namespace.yaml
539e9449 (origin/pr/623) Fix run-level label to empty string.

[1]: https://github.com/openshift/cluster-version-operator/pull/668
[2]: https://github.com/openshift/cluster-version-operator/pull/623
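For reference, after those commits the namespace manifest should carry roughly the following metadata (a paraphrased sketch based on the PR titles above, not the verbatim manifest):

~~~
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-cluster-version
  labels:
    # Added in 4.10 via openshift/cluster-version-operator#668
    pod-security.kubernetes.io/enforce: privileged
    # Cleared to the empty string in 4.10 via openshift/cluster-version-operator#623
    openshift.io/run-level: ""
~~~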
Hi Trevor,

> Associating a restrictive SCC with the version-... pods is not a supported operation.

This happens automatically; I don't know how, but the version-... pod is created with this SCC associated. It is not something under our control, and that is the motivation for opening this bug: to investigate why this SCC is associated with the version-... pod.

I hope that clarifies the situation. I remain available in case you have any other questions, thank you!
*** Bug 2108631 has been marked as a duplicate of this bug. ***
Reproducing it from the QE side:

1. Install a 4.10 cluster
2. Create the SCC as mentioned in the Description
3. Upgrade the cluster

# oc adm upgrade --to-latest
# oc adm upgrade
info: An upgrade is in progress. Working towards 4.10.13: 60 of 771 done (7% complete)

ReleaseAccepted=False

  Reason: RetrievePayload
  Message: Retrieving payload failed version="4.10.23" image="quay.io/openshift-release-dev/ocp-release@sha256:e40e49d722cb36a95fa1c03002942b967ccbd7d68de10e003f0baa69abad457b" failure=Unable to download and prepare the update: deadline exceeded, reason: "DeadlineExceeded", message: "Job was active longer than specified deadline"

Upstream is unset, so the cluster will use an appropriate default.
Channel: stable-4.10 (available channels: candidate-4.10, candidate-4.11, eus-4.10, fast-4.10, stable-4.10)

# oc get pod/version-4.10.23-bqwql-mhcgc -n openshift-cluster-version -oyaml | grep scc
openshift.io/scc: kube-aad-proxy-scc

# oc logs pod/version-4.10.23-bqwql-mhcgc -n openshift-cluster-version
mv: cannot remove '/manifests/0000_00_cluster-version-operator_00_namespace.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_01_adminack_configmap.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_01_admingate_configmap.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_01_clusteroperator.crd.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_01_clusterversion.crd.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_02_roles.yaml': Read-only file system
mv: cannot remove '/manifests/0000_00_cluster-version-operator_03_deployment.yaml': Read-only file system
mv: cannot remove '/manifests/0000_90_cluster-version-operator_00_prometheusrole.yaml': Read-only file system
mv: cannot remove '/manifests/0000_90_cluster-version-operator_01_prometheusrolebinding.yaml': Read-only file system
mv: cannot remove '/manifests/0000_90_cluster-version-operator_02_servicemonitor.yaml': Read-only file system
mv: cannot remove '/manifests/0001_00_cluster-version-operator_03_service.yaml': Read-only file system

Okay, it's reproduced.
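If it helps other reproducers, the failed retrieval can be inspected like this (a sketch; the job name is derived from the pod name above and will differ per attempt):

~~~
# The CVO runs payload retrieval as a Job; list it and its pod.
oc -n openshift-cluster-version get jobs,pods

# Why the job gave up (DeadlineExceeded after the repeated mv failures).
oc -n openshift-cluster-version describe job version-4.10.23-bqwql

# The CVO's view of the attempt (ReleaseAccepted=False / RetrievePayload above).
oc adm upgrade
~~~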
I am not sure how the version-4.10.20-smvt9-6vqwc pod got created in the openshift-cluster-version namespace. Also, why do customers need the SCC (which caused the issue)?
> I am not sure how the version-4.10.20-smvt9-6vqwc pod got created in the openshift-cluster-version namespace.

The CVO creates this pod as part of the upgrade process. I was not aware of this. But I am still not clear why the SCC was attached to it.
I'm still a bit fuzzy about the corners (which of the changes from comment 2, or both, made this an issue in 4.10+?), but Oscar's point at [1] was enough for me to open a pull request. If I'm still misunderstanding something about the motivation, wording suggestions for the commit message are welcome :)

[1]: https://github.com/openshift/cluster-openshift-apiserver-operator/pull/437/files
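For context on what a fix in the pod spec looks like: the failing payload container in the description has 'privileged: true' but gets 'readOnlyRootFilesystem: true' injected by the SCC, so the change amounts to setting that field explicitly instead of leaving it for admission to mutate. Roughly (a sketch of the relevant container fragment, not necessarily the exact change in the linked pull request):

~~~
securityContext:
  privileged: true
  # Set explicitly, so an SCC with readOnlyRootFilesystem: true can no longer
  # flip the payload containers onto a read-only root filesystem.
  readOnlyRootFilesystem: false
~~~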
And it's pretty clear that this is a new-in-4.10 issue, so blocker- and we'll keep shipping releases until this fix goes out (hopefully soon).
Verifying on 4.12.0-0.nightly-2022-07-31-185642 1. Install a cluster with 4.12.0-0.nightly-2022-07-31-185642 payload 2. Create SCC # cat << EOF | oc create -f - > allowHostDirVolumePlugin: true > allowHostIPC: false > allowHostNetwork: false > allowHostPID: false > allowHostPorts: false > allowPrivilegeEscalation: true > allowPrivilegedContainer: true > allowedCapabilities: [] > apiVersion: security.openshift.io/v1 > defaultAddCapabilities: [] > fsGroup: > type: MustRunAs > groups: [] > kind: SecurityContextConstraints > metadata: > annotations: > meta.helm.sh/release-name: azure-arc > meta.helm.sh/release-namespace: default > labels: > app.kubernetes.io/managed-by: Helm > name: kube-aad-proxy-scc > priority: null > readOnlyRootFilesystem: true > requiredDropCapabilities: [] > runAsUser: > type: RunAsAny > seLinuxContext: > type: MustRunAs > supplementalGroups: > type: RunAsAny > users: > - system:serviceaccount:azure-arc:azure-arc-kube-aad-proxy-sa > volumes: > - configMap > - hostPath > - secret > EOF securitycontextconstraints.security.openshift.io/kube-aad-proxy-scc created 3. Check all the SCC # oc get scc NAME PRIV CAPS SELINUX RUNASUSER FSGROUP SUPGROUP PRIORITY READONLYROOTFS VOLUMES anyuid false <no value> MustRunAs RunAsAny RunAsAny RunAsAny 10 false ["configMap","downwardAPI","emptyDir","persistentVolumeClaim","projected","secret"] hostaccess false <no value> MustRunAs MustRunAsRange MustRunAs RunAsAny <no value> false ["configMap","downwardAPI","emptyDir","hostPath","persistentVolumeClaim","projected","secret"] hostmount-anyuid false <no value> MustRunAs RunAsAny RunAsAny RunAsAny <no value> false ["configMap","downwardAPI","emptyDir","hostPath","nfs","persistentVolumeClaim","projected","secret"] hostnetwork false <no value> MustRunAs MustRunAsRange MustRunAs MustRunAs <no value> false ["configMap","downwardAPI","emptyDir","persistentVolumeClaim","projected","secret"] hostnetwork-v2 false ["NET_BIND_SERVICE"] MustRunAs MustRunAsRange MustRunAs MustRunAs <no value> false ["configMap","downwardAPI","emptyDir","persistentVolumeClaim","projected","secret"] kube-aad-proxy-scc true [] MustRunAs RunAsAny MustRunAs RunAsAny <no value> true ["configMap","hostPath","secret"] machine-api-termination-handler false <no value> MustRunAs RunAsAny MustRunAs MustRunAs <no value> false ["downwardAPI","hostPath"] node-exporter true <no value> RunAsAny RunAsAny RunAsAny RunAsAny <no value> false ["*"] nonroot false <no value> MustRunAs MustRunAsNonRoot RunAsAny RunAsAny <no value> false ["configMap","downwardAPI","emptyDir","persistentVolumeClaim","projected","secret"] nonroot-v2 false ["NET_BIND_SERVICE"] MustRunAs MustRunAsNonRoot RunAsAny RunAsAny <no value> false ["configMap","downwardAPI","emptyDir","persistentVolumeClaim","projected","secret"] privileged true ["*"] RunAsAny RunAsAny RunAsAny RunAsAny <no value> false ["*"] restricted false <no value> MustRunAs MustRunAsRange MustRunAs RunAsAny <no value> false ["configMap","downwardAPI","emptyDir","persistentVolumeClaim","projected","secret"] restricted-v2 false ["NET_BIND_SERVICE"] MustRunAs MustRunAsRange MustRunAs RunAsAny <no value> false ["configMap","downwardAPI","emptyDir","persistentVolumeClaim","projected","secret"] 3. 
Upgrade the cluster # oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 --allow-explicit-upgrade --force warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures. Requesting update to release image registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 version pod is completed. # oc get po -n openshift-cluster-version NAME READY STATUS RESTARTS AGE cluster-version-operator-5c4695fb4b-5mj7f 1/1 Terminating 0 35m version--zfh5q-gp5kb 0/1 Completed 0 17s node-exporter is selected and readOnlyRootFilesystem is set to false. # oc get pod/version--zfh5q-gp5kb -n openshift-cluster-version -oyaml apiVersion: v1 kind: Pod metadata: annotations: k8s.v1.cni.cncf.io/network-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.129.0.49" ], "default": true, "dns": {} }] k8s.v1.cni.cncf.io/networks-status: |- [{ "name": "openshift-sdn", "interface": "eth0", "ips": [ "10.129.0.49" ], "default": true, "dns": {} }] openshift.io/scc: node-exporter #### node-exporter is selected creationTimestamp: "2022-08-01T03:20:13Z" generateName: version--zfh5q- labels: controller-uid: 6300536b-1975-4716-9e81-7dfa32d0cbb8 job-name: version--zfh5q name: version--zfh5q-gp5kb namespace: openshift-cluster-version ownerReferences: - apiVersion: batch/v1 blockOwnerDeletion: true controller: true kind: Job name: version--zfh5q uid: 6300536b-1975-4716-9e81-7dfa32d0cbb8 resourceVersion: "34008" uid: 0d6daee6-2b07-4cf1-915b-a2a664620d8a spec: containers: - command: - mv - /etc/cvo/updatepayloads/jyGyVbpNDo9kcnOEanKBeg-cvtb6 - /etc/cvo/updatepayloads/jyGyVbpNDo9kcnOEanKBeg image: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 imagePullPolicy: IfNotPresent name: rename-to-final-location resources: requests: cpu: 10m ephemeral-storage: 2Mi memory: 50Mi securityContext: privileged: true readOnlyRootFilesystem: false #### It's explicitly false terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /etc/cvo/updatepayloads name: payloads - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-zpfsh readOnly: true dnsPolicy: ClusterFirst enableServiceLinks: true imagePullSecrets: - name: default-dockercfg-nv6r4 initContainers: - command: - sh - -c - rm -fR ./* image: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 imagePullPolicy: IfNotPresent name: cleanup resources: requests: cpu: 10m ephemeral-storage: 2Mi memory: 50Mi securityContext: privileged: true readOnlyRootFilesystem: false terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /etc/cvo/updatepayloads name: payloads - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-zpfsh readOnly: true workingDir: /etc/cvo/updatepayloads/ - command: - mkdir - /etc/cvo/updatepayloads/jyGyVbpNDo9kcnOEanKBeg-cvtb6 image: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 imagePullPolicy: IfNotPresent name: make-temporary-directory resources: requests: cpu: 10m ephemeral-storage: 2Mi memory: 50Mi 
securityContext: privileged: true readOnlyRootFilesystem: false terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /etc/cvo/updatepayloads name: payloads - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-zpfsh readOnly: true - command: - mv - /manifests - /etc/cvo/updatepayloads/jyGyVbpNDo9kcnOEanKBeg-cvtb6/manifests image: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 imagePullPolicy: IfNotPresent name: move-operator-manifests-to-temporary-directory resources: requests: cpu: 10m ephemeral-storage: 2Mi memory: 50Mi securityContext: privileged: true readOnlyRootFilesystem: false terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /etc/cvo/updatepayloads name: payloads - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-zpfsh readOnly: true - command: - mv - /release-manifests - /etc/cvo/updatepayloads/jyGyVbpNDo9kcnOEanKBeg-cvtb6/release-manifests image: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 imagePullPolicy: IfNotPresent name: move-release-manifests-to-temporary-directory resources: requests: cpu: 10m ephemeral-storage: 2Mi memory: 50Mi securityContext: privileged: true readOnlyRootFilesystem: false terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /etc/cvo/updatepayloads name: payloads - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-zpfsh readOnly: true nodeName: yanyang-0801a-49hnl-master-1.c.openshift-qe.internal nodeSelector: node-role.kubernetes.io/master: "" preemptionPolicy: PreemptLowerPriority priority: 1000000000 priorityClassName: openshift-user-critical restartPolicy: OnFailure schedulerName: default-scheduler securityContext: {} serviceAccount: default serviceAccountName: default terminationGracePeriodSeconds: 30 tolerations: - key: node-role.kubernetes.io/master - effect: NoExecute key: node.kubernetes.io/not-ready operator: Exists tolerationSeconds: 300 - effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists tolerationSeconds: 300 - effect: NoSchedule key: node.kubernetes.io/memory-pressure operator: Exists volumes: - hostPath: path: /etc/cvo/updatepayloads type: "" name: payloads - name: kube-api-access-zpfsh projected: defaultMode: 420 sources: - serviceAccountToken: expirationSeconds: 3607 path: token - configMap: items: - key: ca.crt path: ca.crt name: kube-root-ca.crt - downwardAPI: items: - fieldRef: apiVersion: v1 fieldPath: metadata.namespace path: namespace - configMap: items: - key: service-ca.crt path: service-ca.crt name: openshift-service-ca.crt status: conditions: - lastProbeTime: null lastTransitionTime: "2022-08-01T03:20:21Z" reason: PodCompleted status: "True" type: Initialized - lastProbeTime: null lastTransitionTime: "2022-08-01T03:20:13Z" reason: PodCompleted status: "False" type: Ready - lastProbeTime: null lastTransitionTime: "2022-08-01T03:20:13Z" reason: PodCompleted status: "False" type: ContainersReady - lastProbeTime: null lastTransitionTime: "2022-08-01T03:20:13Z" status: "True" type: PodScheduled containerStatuses: - containerID: cri-o://fa45c5aec28ab203d12e4103c7b9664e2bc2b097c67216f4a4b72c0acb2f1cf1 image: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 imageID: 
registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 lastState: {} name: rename-to-final-location ready: false restartCount: 0 started: false state: terminated: containerID: cri-o://fa45c5aec28ab203d12e4103c7b9664e2bc2b097c67216f4a4b72c0acb2f1cf1 exitCode: 0 finishedAt: "2022-08-01T03:20:21Z" reason: Completed startedAt: "2022-08-01T03:20:21Z" hostIP: 10.0.0.3 initContainerStatuses: - containerID: cri-o://8dc3e78b1a1511ac00c487a5a3cca65acd11a4e7fc4e7bff64c885a1a47ba7fe image: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 imageID: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 lastState: {} name: cleanup ready: true restartCount: 0 state: terminated: containerID: cri-o://8dc3e78b1a1511ac00c487a5a3cca65acd11a4e7fc4e7bff64c885a1a47ba7fe exitCode: 0 finishedAt: "2022-08-01T03:20:18Z" reason: Completed startedAt: "2022-08-01T03:20:18Z" - containerID: cri-o://9824f8a3b742ad8c208c6efe257adae4808d3ce5bb6312e1b40cfccf2ad33209 image: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 imageID: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 lastState: {} name: make-temporary-directory ready: true restartCount: 0 state: terminated: containerID: cri-o://9824f8a3b742ad8c208c6efe257adae4808d3ce5bb6312e1b40cfccf2ad33209 exitCode: 0 finishedAt: "2022-08-01T03:20:18Z" reason: Completed startedAt: "2022-08-01T03:20:18Z" - containerID: cri-o://0019d96fc44b5b8fa0456ce88e1f278b2e06224158693246050e26ce77764d17 image: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 imageID: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 lastState: {} name: move-operator-manifests-to-temporary-directory ready: true restartCount: 0 state: terminated: containerID: cri-o://0019d96fc44b5b8fa0456ce88e1f278b2e06224158693246050e26ce77764d17 exitCode: 0 finishedAt: "2022-08-01T03:20:19Z" reason: Completed startedAt: "2022-08-01T03:20:19Z" - containerID: cri-o://1858651805c9fb7216622a0270ec3de38a1cd83810b83afcf50becc063ac1041 image: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 imageID: registry.ci.openshift.org/ocp/release@sha256:36a78e1b1d004f1acc8ddc9ce5e5294a09719426ad977e7ec296779446380fd4 lastState: {} name: move-release-manifests-to-temporary-directory ready: true restartCount: 0 state: terminated: containerID: cri-o://1858651805c9fb7216622a0270ec3de38a1cd83810b83afcf50becc063ac1041 exitCode: 0 finishedAt: "2022-08-01T03:20:21Z" reason: Completed startedAt: "2022-08-01T03:20:20Z" phase: Succeeded podIP: 10.129.0.49 podIPs: - ip: 10.129.0.49 qosClass: Burstable startTime: "2022-08-01T03:20:13Z" Upgrade is successful. # oc adm upgrade warning: Cannot display available updates: Reason: VersionNotFound Message: Unable to retrieve available updates: currently reconciling cluster version 4.12.0-0.nightly-2022-07-31-235028 not found in the "stable-4.11" channel Cluster version is 4.12.0-0.nightly-2022-07-31-235028 Upstream is unset, so the cluster will use an appropriate default. Channel: stable-4.11 All cos are happy. 
# oc get co NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE authentication 4.12.0-0.nightly-2022-07-31-235028 True False False 68m baremetal 4.12.0-0.nightly-2022-07-31-235028 True False False 82m cloud-controller-manager 4.12.0-0.nightly-2022-07-31-235028 True False False 85m cloud-credential 4.12.0-0.nightly-2022-07-31-235028 True False False 85m cluster-autoscaler 4.12.0-0.nightly-2022-07-31-235028 True False False 82m config-operator 4.12.0-0.nightly-2022-07-31-235028 True False False 83m console 4.12.0-0.nightly-2022-07-31-235028 True False False 73m csi-snapshot-controller 4.12.0-0.nightly-2022-07-31-235028 True False False 83m dns 4.12.0-0.nightly-2022-07-31-235028 True False False 82m etcd 4.12.0-0.nightly-2022-07-31-235028 True False False 81m image-registry 4.12.0-0.nightly-2022-07-31-235028 True False False 75m ingress 4.12.0-0.nightly-2022-07-31-235028 True False False 74m insights 4.12.0-0.nightly-2022-07-31-235028 True False False 76m kube-apiserver 4.12.0-0.nightly-2022-07-31-235028 True False False 78m kube-controller-manager 4.12.0-0.nightly-2022-07-31-235028 True False False 80m kube-scheduler 4.12.0-0.nightly-2022-07-31-235028 True False False 79m kube-storage-version-migrator 4.12.0-0.nightly-2022-07-31-235028 True False False 83m machine-api 4.12.0-0.nightly-2022-07-31-235028 True False False 76m machine-approver 4.12.0-0.nightly-2022-07-31-235028 True False False 82m machine-config 4.12.0-0.nightly-2022-07-31-235028 True False False 76m marketplace 4.12.0-0.nightly-2022-07-31-235028 True False False 82m monitoring 4.12.0-0.nightly-2022-07-31-235028 True False False 73m network 4.12.0-0.nightly-2022-07-31-235028 True False False 84m node-tuning 4.12.0-0.nightly-2022-07-31-235028 True False False 40m openshift-apiserver 4.12.0-0.nightly-2022-07-31-235028 True False False 76m openshift-controller-manager 4.12.0-0.nightly-2022-07-31-235028 True False False 79m openshift-samples 4.12.0-0.nightly-2022-07-31-235028 True False False 43m operator-lifecycle-manager 4.12.0-0.nightly-2022-07-31-235028 True False False 83m operator-lifecycle-manager-catalog 4.12.0-0.nightly-2022-07-31-235028 True False False 83m operator-lifecycle-manager-packageserver 4.12.0-0.nightly-2022-07-31-235028 True False False 76m service-ca 4.12.0-0.nightly-2022-07-31-235028 True False False 83m storage 4.12.0-0.nightly-2022-07-31-235028 True False False 82m Looks good to me. Moving it to verified state.
As an easy workaround for the issue, remove the SCC for the period when the CVO's version-* pods are being deployed, until they are running. You can recreate the SCC afterwards.
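Concretely, that can look like the following (a sketch; the target version and the SCC definition file are placeholders for whatever applies to your cluster):

~~~
# Remove the offending SCC while the version-* payload pod is created.
oc delete scc kube-aad-proxy-scc

# Start or resume the update, then wait for the payload pod to complete.
oc adm upgrade --to=4.10.20
oc -n openshift-cluster-version get pods -w

# Recreate the SCC afterwards from its original definition
# (e.g. the YAML in the description, or the Helm chart that owns it).
oc create -f kube-aad-proxy-scc.yaml
~~~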
Are there any plans to get a fix into OCP 4.10?
If you click "Show advanced fields", you can see: Blocks: 2114602 Heading back to the 4.11.z bug 2114602, you can see it shipped in 4.11.1 [1]. And there's also an exciting transition to Jira for 4.10.z [2]. Following along, [3] has a comment linking to [4]. And clicking through to [4] (no convenient inline version numbers in Jira errata link comments yet), we can see that the fix shipped in 4.10.30. Folks can use comment 16's mitigation to get themselves out to release with 'readOnlyRootFilesystem: false', but after that, no further workarounds should be required. [1]: https://bugzilla.redhat.com/show_bug.cgi?id=2114602#c9 [2]: https://bugzilla.redhat.com/show_bug.cgi?id=2114602#c7 [3]: https://issues.redhat.com//browse/OCPBUGS-233 [4]: https://access.redhat.com/errata/RHSA-2022:6133
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:7399
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days