We're really close to 4.11 GA, and this is a 4.10 issue, not a new-in-4.11 issue, so I'm punting it out to 4.11.z.
Verified before the PR merged.

1. Install a cluster using cluster-bot:

# oc get clusterversion
NAME      VERSION                                                   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.ci.test-2022-08-03-082604-ci-ln-i9hxzi2-latest   True        False         24m     Cluster version is 4.11.0-0.ci.test-2022-08-03-082604-ci-ln-i9hxzi2-latest

2. Create the custom SCC:

# cat << EOF | oc create -f -
> allowHostDirVolumePlugin: true
> allowHostIPC: false
> allowHostNetwork: false
> allowHostPID: false
> allowHostPorts: false
> allowPrivilegeEscalation: true
> allowPrivilegedContainer: true
> allowedCapabilities: []
> apiVersion: security.openshift.io/v1
> defaultAddCapabilities: []
> fsGroup:
>   type: MustRunAs
> groups: []
> kind: SecurityContextConstraints
> metadata:
>   annotations:
>     meta.helm.sh/release-name: azure-arc
>     meta.helm.sh/release-namespace: default
>   labels:
>     app.kubernetes.io/managed-by: Helm
>   name: kube-aad-proxy-scc
> priority: null
> readOnlyRootFilesystem: true
> requiredDropCapabilities: []
> runAsUser:
>   type: RunAsAny
> seLinuxContext:
>   type: MustRunAs
> supplementalGroups:
>   type: RunAsAny
> users:
> - system:serviceaccount:azure-arc:azure-arc-kube-aad-proxy-sa
> volumes:
> - configMap
> - hostPath
> - secret
> EOF
securitycontextconstraints.security.openshift.io/kube-aad-proxy-scc created

3. Upgrade the cluster:

# oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release@sha256:2050173e8113faae2faadcff7b77346dab996705a68f2384fd5a2674c6e2a2ff --force --allow-explicit-upgrade
warning: The requested upgrade image is not one of the available updates. You have used --allow-explicit-upgrade for the update to proceed anyway.
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Requesting update to release image registry.ci.openshift.org/ocp/release@sha256:2050173e8113faae2faadcff7b77346dab996705a68f2384fd5a2674c6e2a2ff

# oc get all
NAME                                            READY   STATUS      RESTARTS   AGE
pod/cluster-version-operator-68d8868586-gxgl5   1/1     Running     0          6s
pod/version--n4dqx-2vl6d                        0/1     Completed   0          23s

NAME                               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/cluster-version-operator   ClusterIP   172.30.205.94   <none>        9099/TCP   59m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cluster-version-operator   1/1     1            1           58m

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/cluster-version-operator-5c8fd57fc8   0         0         0       58m
replicaset.apps/cluster-version-operator-68d8868586   1         1         1       6s

NAME                       COMPLETIONS   DURATION   AGE
job.batch/version--n4dqx   1/1           14s        23s

# oc get pod/version--n4dqx-2vl6d -oyaml | grep scc
    openshift.io/scc: node-exporter

# oc adm upgrade
info: An upgrade is in progress. Working towards 4.12.0-0.nightly-2022-08-01-151317: 105 of 802 done (13% complete)

warning: Cannot display available updates:
  Reason: NoChannel
  Message: The update channel has not been configured.

The version pod was admitted under the node-exporter SCC rather than the custom one, and the upgrade is proceeding. Looks good to me.
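For anyone repeating this verification, a small sketch of the same check with jsonpath instead of grepping the full pod YAML. The pod name below is the one from this particular run (substitute your own), and the assumption here is that the version pod runs in the openshift-cluster-version namespace; dots inside the annotation key must be escaped in the jsonpath expression:

```shell
# Read the SCC admission annotation directly from the version pod.
# Pod name is specific to this run; replace it with the current version pod.
oc -n openshift-cluster-version get pod version--n4dqx-2vl6d \
  -o jsonpath='{.metadata.annotations.openshift\.io/scc}{"\n"}'
```

On a fixed cluster this should print a default SCC such as node-exporter rather than the custom kube-aad-proxy-scc.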
Hello,

We have a new comment on case 03272159:

---

This cluster has had this SCC installed since the initial install (4.8.13, back in November 2021), and has successfully upgraded without any issues through 4.8, 4.9, and to 4.10, until now. As the RCA has discovered, this is clearly a 4.10 bug, and a workaround or fix should be provided to enable the customer to upgrade — not just when 4.12 lands, but also so that they can keep the platform up to date through 4.10 and 4.11.

Adjusting the custom SCC is not a viable workaround. This SCC has been deployed and configured by a third-party application (MS Azure Sentinel), which is used for monitoring of their Azure-bound clusters, so a change would need to be raised with Microsoft. As their product continues to work as expected (and uses a least-privilege pattern that we recommend), it will be difficult to convince the customer to adjust it to work around a bug on our part. It would also mean that for every update they would have to manually modify the SCC and roll the change back afterwards. This is not a good experience. I am raising an ACE to get more eyes on this issue.

---

I see that the priority and severity are high, so is it possible to backport this to 4.10?

Many thanks in advance,
Adrián.
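To make concrete what "manually update the SCC and roll the change back afterwards" would look like, here is a hedged sketch of that pattern. The field being loosened (readOnlyRootFilesystem) is an assumption based on the RCA discussed in this bug, not a confirmed supported workaround, and because the SCC is Helm-managed by Azure Arc, any edit may be reverted by the third-party tooling:

```shell
# Hypothetical "adjust and roll back" workaround; not recommended, shown
# only to illustrate the operational burden the customer would face.

# Before upgrading: back up the SCC, then loosen the field assumed to
# conflict with the version pod.
oc get scc kube-aad-proxy-scc -o yaml > kube-aad-proxy-scc.backup.yaml
oc patch scc kube-aad-proxy-scc --type=json \
  -p '[{"op": "replace", "path": "/readOnlyRootFilesystem", "value": false}]'

# ...run the upgrade...

# Afterwards: restore the original definition from the backup.
oc replace -f kube-aad-proxy-scc.backup.yaml
```

This would have to be repeated for every z-stream update until a fixed version is running, which is exactly why the customer considers it unworkable.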
Based on comment #2, moving this to the VERIFIED state.
We're considering a 4.10.z backport in [1].

[1]: https://issues.redhat.com/browse/OCPBUGS-233
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.11.1 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:6103