Created attachment 1785179
update-history

Description of problem:

The openshift-kube-storage-version-migrator container failed to start. Running `oc describe` on the pod shows:

- Successfully assigned openshift-kube-storage-version-migrator/migrator-5f77bc7f9-nqknj to 10.112.78.51
- Successfully pulled image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:6c9edfc5c399afece4c206182974b0bb2b525b20a2886a36a91173355edaf954"
- Error: container has runAsNonRoot and image will run as root

This is because, when it was deployed, the deployment ran under an SCC that someone in the team had created. Upon further digging, it appears that the pod was unable to run under this SCC because it failed the runAsUser security context check:

"Pods which have specified neither runAsNonRoot nor runAsUser settings will be mutated to set runAsNonRoot=true, thus requiring a defined non-zero numeric USER directive in the container."

This would imply there is no "defined non-zero numeric USER directive in the container". So either the deployment/pod specification needs to state that the container must run as root (uid=0), in which case the correct SCC should then be selected, or the container needs to provide a "non-zero numeric USER directive in the container".

Version-Release number of selected component (if applicable):

Our cluster has been impacted since somewhere between 4.5.24 and 4.6.23.

How reproducible:

Create an SCC with priority >10 that permits "non-root" containers, then deploy the migrator.

Steps to Reproduce:
1.
2.
3.

Actual results:

Pod never launched.

Expected results:

Pod and upgrade should run without fault.

Additional info:

The attachment shows that our upgrades have been impacted since 4.5.24; however, this could be due to the creation date of the SCC.
Generally: do you have a case number so we can get the must-gather? That would be helpful in debugging.

> Create an SCC priority >10 that permits "non-root" containers then deploy the migrator.

Did you create an SCC overriding the default restricted SCC?
Hi, I don't have a case number as this isn't a client project. The SCC was a copy of "nonroot" but with priority 11. Matt
I believe this was the SCC used:

apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  annotations:
    kubernetes.io/description: "This policy allows pods to run with any UID and GID except root and prevents access to the host."
    "helm.sh/hook": pre-install
  name: {{ .Release.Name }}-nonroot-scc
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegedContainer: false
allowPrivilegeEscalation: false
allowedCapabilities: []
allowedFlexVolumes: []
allowedUnsafeSysctls: []
defaultAddCapabilities: []
defaultAllowPrivilegeEscalation: false
forbiddenSysctls:
  - "*"
fsGroup:
  type: RunAsAny
readOnlyRootFilesystem: false
requiredDropCapabilities:
- ALL
runAsUser:
  type: MustRunAsNonRoot
# This can be customized for your host machine
seLinuxContext:
  type: RunAsAny
  # seLinuxOptions:
  #   level:
  #   user:
  #   role:
  #   type:
supplementalGroups:
  type: RunAsAny
# This can be customized for your host machine
volumes:
- configMap
- downwardAPI
- emptyDir
- persistentVolumeClaim
- projected
- secret
# If you want a priority on your SCC -- set for a value more than 0
priority: 11
users:
- system:serviceaccount:{{ .Release.Namespace }}:{{ $fullName }}
Thank you, I can reproduce this on a fresh cluster by:

1. creating a copy of the built-in nonroot SCC with priority 11 set, and
2. simply deleting the existing openshift-kube-storage-version-migrator/migrator pod to force redeployment.

The new pod gets assigned to the nonroot-copy SCC:

```
$ kubectl -n openshift-kube-storage-version-migrator get pod migrator-788699c75c-tpgfx -o yaml | grep scc
    openshift.io/scc: nonroot-copy
```

I need to find out why this deployment is affected specifically.
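For context on why a priority of 11 matters here: as far as I understand the documented SCC ordering, the SCCs available to a pod are tried highest priority first (an unset priority counts as 0), then most restrictive first, then by name. The sketch below only models that ordering. The `Scc` class and the `restrictiveness` scores are hypothetical stand-ins, and it does not model which SCCs a given service account is actually allowed to use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Scc:
    name: str
    priority: Optional[int] = None  # unset priority is treated as 0
    restrictiveness: int = 0        # hypothetical score: lower means more restrictive


def evaluation_order(sccs: List[Scc]) -> List[Scc]:
    """Sort SCCs: highest priority first, then most restrictive, then by name."""
    return sorted(sccs, key=lambda s: (-(s.priority or 0), s.restrictiveness, s.name))


# Built-in defaults as I understand them (anyuid ships with priority 10),
# plus the copy of nonroot from this report with priority 11.
candidates = [
    Scc("restricted", priority=None, restrictiveness=0),
    Scc("nonroot", priority=None, restrictiveness=1),
    Scc("anyuid", priority=10, restrictiveness=2),
    Scc("nonroot-copy", priority=11, restrictiveness=1),
]

order = [s.name for s in evaluation_order(candidates)]
print(order)  # ['nonroot-copy', 'anyuid', 'restricted', 'nonroot']
```

If the pod is allowed to use nonroot-copy at all, it is tried before restricted, which would match the annotation seen above.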
Great. Did the pod fail to start with the same error message? The thing I don't understand is why the serviceAccount of migrator is being picked up by this SCC, given that we specified `users` on it.
Yes, it's the same error. I need to investigate: with a clean copy that has `users: []`, the pod picks that one up as well. Assigning this to work on it this sprint.
I still need to dive deeper here, but the migrator (and the operator too) are matched against the elevated nonroot SCC. This causes their pod spec to be modified, with `runAsNonRoot: true` being added to the pod's security context. Since no stable user ID is specified, the pods fail to launch. Other workloads are not affected because they already have a nonroot user specified in the Docker image (a random example is the openshift-marketplace/certified-operators pod). I still want to understand why the OpenShift workload is being matched against the elevated SCC at all. Ideally, user SCC changes should not be able to mutate the existing core payload.
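To make the failure mode concrete, here is a rough sketch of the non-root check as I understand it. It is modelled loosely on what the kubelet does and is not its actual code. The function name and parameters are hypothetical, and only the "will run as root" message is taken from this report; the other messages are illustrative.

```python
from typing import Optional

RUNS_AS_ROOT = "container has runAsNonRoot and image will run as root"


def verify_run_as_non_root(run_as_non_root: bool,
                           run_as_user: Optional[int],
                           image_user: str) -> Optional[str]:
    """Return an error string if the container must be rejected, else None.

    image_user is the USER value from the image config ("" when unset).
    """
    if not run_as_non_root:
        return None  # nothing to enforce
    if run_as_user is not None:
        # An explicit runAsUser in the security context wins over the image USER.
        if run_as_user == 0:
            return "runAsUser 0 conflicts with runAsNonRoot"  # illustrative wording
        return None
    # No runAsUser set: fall back to the image's USER (may be "user:group").
    user = image_user.split(":", 1)[0]
    if user == "":
        return RUNS_AS_ROOT  # no USER directive means the image runs as root
    if not (user.isascii() and user.isdigit()):
        # A user name cannot be verified as non-root without resolving it.
        return f"image has non-numeric user ({user}), cannot verify non-root"  # illustrative
    if int(user) == 0:
        return RUNS_AS_ROOT
    return None


# The migrator case: SCC adds runAsNonRoot=true, no runAsUser, image has no USER.
print(verify_run_as_non_root(True, None, ""))
# An image with a numeric non-zero USER passes the same check.
print(verify_run_as_non_root(True, None, "1001"))
```

Under these assumptions the first call returns the error from the report, and the second returns `None`, which is why images that already declare a numeric non-root user are unaffected.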
Tested in fresh cluster 4.8.0-0.nightly-2021-06-06-164529; the new deployment pod status is as expected.

1. Create a nonroot SCC with priority 11 set, using the YAML in comment 3 (replace the SCC name value with test-nonroot-scc and the users value with serviceaccount:default).

2. Delete the existing openshift-kube-storage-version-migrator/migrator pod:

   $ oc delete pod migrator-55f7fdd8c8-5shks -n openshift-kube-storage-version-migrator
   pod "migrator-55f7fdd8c8-5shks" deleted

3. Check the redeployed pod's status:

   $ oc get pods -n openshift-kube-storage-version-migrator
   NAME                        READY   STATUS    RESTARTS   AGE
   migrator-55f7fdd8c8-zgrjr   1/1     Running   0          19s

4. Check the SCC assigned to the new pod:

   $ oc get pods migrator-55f7fdd8c8-zgrjr -n openshift-kube-storage-version-migrator -o yaml | grep scc
       openshift.io/scc: test-nonroot-scc
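If the last step needs to be scripted rather than grepped, something like the following could work. It is a minimal sketch that assumes the pod was fetched with `-o json` instead of `-o yaml`; the `assigned_scc` helper is hypothetical and the sample document is a trimmed stand-in using the names from the steps above.

```python
import json
from typing import Optional


def assigned_scc(pod_json: str) -> Optional[str]:
    """Return the value of the openshift.io/scc annotation, or None if absent."""
    pod = json.loads(pod_json)
    annotations = pod.get("metadata", {}).get("annotations") or {}
    return annotations.get("openshift.io/scc")


# Trimmed stand-in for the output of:
#   oc get pod migrator-55f7fdd8c8-zgrjr -n openshift-kube-storage-version-migrator -o json
sample = json.dumps({
    "metadata": {
        "name": "migrator-55f7fdd8c8-zgrjr",
        "namespace": "openshift-kube-storage-version-migrator",
        "annotations": {"openshift.io/scc": "test-nonroot-scc"},
    }
})

print(assigned_scc(sample))  # test-nonroot-scc
```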
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438