Description of problem:

Root cause of https://bugzilla.redhat.com/show_bug.cgi?id=1381336

On pod update, SCC admission ensures that the pod can be validated by one of the SCCs the updating user has access to. If the validating SCC requires mutating the pod in order to validate it (for example, forcing a uid into the pod's security context), the pod will pass SCC admission but fail update validation, since pod specs are not allowed to mutate on update.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Create a project as a normal user:
   oc login -u bob -p password
   oc new-project bob-project
2. Create a pod as a privileged user, and verify it was admitted with "openshift.io/scc: anyuid":
   oc login -u system:admin
   oc create -f pod.yaml
   oc get pod/testpod -o yaml
3. Label the pod as a normal user:
   oc login -u bob -p password
   oc label pod/testpod foo=bar

pod.yaml:
---
apiVersion: v1
kind: Pod
metadata:
  name: testpod
spec:
  containers:
  - command: ["sh", "-c", "sleep 100000"]
    image: busybox
    imagePullPolicy: Always
    name: busybox

Actual results:
Error about the pod spec not being mutable

Expected results:
Error related to the pod spec's security context

Additional info:
Initial prototype of omitting mutating SCCs when validating a pod update is at https://github.com/liggitt/origin/commits/scc-error

To fix this, and to position us for the admission phase splits that are coming, we should really separate SCC evaluation into three checks:
- does the pod spec require anything the SCC disallows (no mutation, just a check)
- apply SCC defaults (force uid fields, etc.)
- validate a pod spec (no mutation, just validation)
*** Bug 1383408 has been marked as a duplicate of this bug. ***
We really need to split out these operations for an SCC:
1. does SCC x forbid pod y?
2. does SCC x allow pod y?
3. use SCC x to default pod y

When creating, we would:
1. find SCCs the user is allowed to use
2. filter to ones that do not forbid the pod
3. select the prioritized SCC
4. use the prioritized SCC to default the pod
5. verify the prioritized SCC allows the defaulted pod

When updating, we would:
1. find SCCs the user is allowed to use
2. if any allow the pod as-is, succeed
3. otherwise, reject
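The proposed flows can be sketched as pseudocode. Note this is illustrative only: the `scc` objects and their `forbids`/`allows`/`apply_defaults` methods are hypothetical stand-ins, not the actual openshift/origin API.

```python
# Illustrative sketch of the proposed admission flows; the SCC helpers
# (forbids/allows/apply_defaults) are hypothetical, not origin's real API.

def admit_create(pod, user_sccs):
    # keep only SCCs that do not outright forbid the pod (pure check, no mutation)
    candidates = [scc for scc in user_sccs if not scc.forbids(pod)]
    # try candidates from highest to lowest priority
    for scc in sorted(candidates, key=lambda s: s.priority, reverse=True):
        defaulted = scc.apply_defaults(pod)  # e.g. force uid fields
        if scc.allows(defaulted):            # pure validation, no mutation
            return defaulted
    raise PermissionError(
        "unable to validate against any security context constraint")

def admit_update(pod, user_sccs):
    # on update the pod spec may not mutate, so skip defaulting entirely:
    # succeed only if some usable SCC allows the pod exactly as-is
    if any(scc.allows(pod) for scc in user_sccs):
        return pod
    raise PermissionError(
        "unable to validate against any security context constraint")
```

The key difference from the current behavior is that the update path never applies defaults, so it can never produce a disallowed spec mutation; it either accepts the pod as-is or rejects it with a clear error.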
How important is this in terms of deadline? Should we increase the priority/severity?
I have seen at least two instances of this being reported, and in some cases it is preventing folks from using normal functions of OpenShift, so I'm in favor of trying to address this in the next release.
> I have seen at least 2 instances of this being reported and it is preventing folks from using normal functions of openshift in some cases so I'm in favor of trying to address this in the next release.

In most cases, fixing this would just give those folks a better error message ("you cannot update this pod because you don't have an SCC that allows it").

In rare cases, they do have access to an SCC that would allow the update, but the mutation aspect is selecting a more restrictive one and attempting to mutate the pod to match it. Fixing the bug would allow them to edit the pod.
> Expected results:
> Error related to the podspec security context

Which of the SCCs is violated when we're modifying labels in our example? I don't see any.

> when creating, we would:
> 1. find sccs the user is allowed to use
> 2. filter to ones that do not forbid the pod
> 3. select the prioritized scc
> 4. use the prioritized scc to default the pod
> 5. verify the prioritized scc allows the defaulted pod
>
> when updating, we would:
> 1. find sccs the user is allowed to use
> 2. if any allow the pod as-is, succeed
> 3. otherwise, reject

Could you explain why we aren't using prioritized SCCs when updating a pod?
It is not that the SCC is violated; it is that the user who performs the update may have access to different SCCs than were originally used. That can trigger a change to the pod spec based on defaulting rules, which results in the update being rejected.
If the original pod was created by a user with access to anyuid, then it would have been allowed to be persisted without setting a runAsUser in the pod's security context. If a user without access to anyuid then tries to annotate it, a more restrictive SCC (like "restricted") is chosen, which tries to force runAsUser to a specific value. That is a disallowed mutation.
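To make the mutation concrete, here is a sketch of the pod spec before and after the restricted SCC's defaulting; the runAsUser value is illustrative (the real value comes from the project's annotated uid range):

```yaml
# Sketch only; the runAsUser value below is illustrative.
# Pod as persisted under the anyuid SCC: no runAsUser is set.
spec:
  containers:
  - name: busybox
    image: busybox
---
# What the "restricted" SCC's defaulting would produce on update:
spec:
  containers:
  - name: busybox
    image: busybox
    securityContext:
      runAsUser: 1000080000  # forced default; spec changes on update are rejected
```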
*** Bug 1482824 has been marked as a duplicate of this bug. ***
*** Bug 1508027 has been marked as a duplicate of this bug. ***
@Jordan,
This PR (https://github.com/openshift/origin/pull/16934) will fix this bug. But because it has a couple of unresolved comments and is not merged yet, QE wouldn't be able to test it properly for the 3.7 release, hence I moved the target to 3.8.
This worked for me to work around this:
```
ssh <MASTER_1>
oc login -u system:admin
oc adm policy add-scc-to-user privileged system:serviceaccount:default:registry
oc edit scc/priority  <set priority to 100>
oc adm --config=/etc/origin/master/admin.kubeconfig migrate storage --include=* --confirm
```
One workaround for the upgrade playbooks is to add the following to your hosts file:

[all:vars]
openshift_upgrade_pre_storage_migration_enabled==true
openshift_upgrade_pre_storage_migration_fatal==true
openshift_upgrade_post_storage_migration_enabled==true
openshift_upgrade_post_storage_migration_fatal==false

A gotcha we found is that, depending on where you define these, they *can* be interpreted as strings rather than booleans; that is why the '=' is doubled. I did submit a PR; I can't remember if it was backported.
On case 01998377, the customer used the following procedure:

oc adm policy add-scc-to-user privileged system:serviceaccount:core-rhmap:default
oc adm policy remove-scc-from-user anyuid-with-chroot system:serviceaccount:core-rhmap:default
oc edit scc privileged  --> <set priority to 100>
oc rollout latest millicore

But they have some questions:

After our migration, can we roll back and set the priority to null with these ocp commands?

oc adm policy add-scc-to-user anyuid-with-chroot system:serviceaccount:core-rhmap:default
oc adm policy remove-scc-from-user privileged system:serviceaccount:core-rhmap:default
oc edit scc privileged  --> <set priority to null>
oc rollout latest millicore

Is there any impact if we leave these settings as they are?
> 3.8 is not a release (publicly), as a result, re-setting target release to 3.9 the correct public release.

Confirmed. Also, the aforementioned PR was merged only in 3.9:

$ git tag --contains 239f50bd3322afef38c8582cc809f65b5092db93
v3.9.0-alpha.2
Verified.

# openshift version
openshift v3.9.0-0.53.0
kubernetes v1.9.1+a0ce1bc657
etcd 3.2.8

$ oc label pod/testpod foo=bar
Error from server (Forbidden): pods "testpod" is forbidden: unable to validate against any security context constraint: []
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489
> customer would like to know if in order to upgrade then from 3.7 (to 3.8) -> 3.9 they could remove those lines from their inventory or they should be still needed?

Scott, could you tell us: when we upgrade storage, do the pre- and post-migrations run with a binary from the new version or with the old one?

> I think they're still needed since the errata includes updates for 3.9 version.

This fix should already be included in 3.9 (https://bugzilla.redhat.com/show_bug.cgi?id=1383707#c51), but I need to ensure that the migration will use a binary from the new version.
Slava, pre always uses the currently installed version; post runs after both the API and client have been upgraded to the target version. A 3.7 to 3.9 upgrade includes two sets of those migrations, though, so you'd get: pre on 3.7, post on 3.8, pre on 3.8, post on 3.9.
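The ordering described above can be sketched as follows; this is not openshift-ansible code, just an illustration of the sequence for a multi-hop upgrade:

```python
# Illustrative sketch of storage-migration ordering for a multi-hop upgrade;
# not openshift-ansible code, just the sequence described above.

def migration_phases(hops):
    """hops: list of (from_version, to_version) upgrade steps."""
    phases = []
    for src, dst in hops:
        phases.append(f"pre-migration on {src}")   # runs the currently installed binary
        phases.append(f"post-migration on {dst}")  # runs after API/client reach the target
    return phases

print(migration_phases([("3.7", "3.8"), ("3.8", "3.9")]))
# → ['pre-migration on 3.7', 'post-migration on 3.8',
#    'pre-migration on 3.8', 'post-migration on 3.9']
```

This is why a 3.7 -> 3.9 upgrade cannot skip the migration settings for the intermediate 3.8 hop: one full pre/post pair runs per hop.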
As per my understanding this would still be needed while upgrading _at least_ from 3.7 to 3.8, but not from 3.8 to 3.9, is that correct?
Yes, you should not skip the migration that happens while running 3.8 so therefore this would need to be fixed in 3.8.
(In reply to Scott Dodson from comment #63)
> Yes, you should not skip the migration that happens while running 3.8 so
> therefore this would need to be fixed in 3.8.

Hi Scott,

This errata, https://access.redhat.com/errata/RHBA-2018:0489, has released the fix for version 3.9. What needs to be fixed in 3.8? Are you saying that the errata needs to be backported to 3.8, or are you referring to the following Ansible variables?

openshift_upgrade_pre_storage_migration_fatal==true
openshift_upgrade_post_storage_migration_fatal==false

Are these variables required to be in place during an upgrade from 3.8 to 3.9?

- Ameya
(In reply to Joel Rosental R. from comment #62)
> As per my understanding this would still be needed while upgrading _at
> least_ from 3.7 to 3.8, but not from 3.8 to 3.9, is that correct?

No, I think you still need this workaround during the 3.8 -> 3.9 upgrade as well.
(In reply to Ameya Sathe from comment #64) > Are these variables required to be in place during upgrade from 3.8 to 3.9 ? Yes, they are still required.