Description of problem:
If a cluster admin changes any default SCC, cluster upgrade is prevented and we see the following message in `version`:

  - lastTransitionTime: "2020-03-11T06:05:31Z"
    message: 'Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid hostmount-anyuid privileged]'
    reason: DefaultSecurityContextConstraints_Mutated

This message is not helpful; it does not instruct the admin on how to resolve the issue.

How reproducible:
Always

Steps to Reproduce:
1. Take a 4.3 nightly cluster
2. Change any default SCC that ships with OpenShift

Actual results:
Upgradeable is set to False with the above error message.

Expected results:
Upgradeable is set to False with a brief message describing how a cluster admin can clear the changes if desired, a command that shows a diff, and a link to documentation on how to configure access without changing the default SCCs, so the problem can be avoided in the future.
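For reference, a quick sketch of where this condition can be inspected (the jsonpath filters below are just one way to do it, not part of the original report):

$ oc get clusteroperator kube-apiserver \
    -o jsonpath='{.status.conditions[?(@.type=="Upgradeable")].message}{"\n"}'
# the same condition is mirrored into the ClusterVersion object named "version"
$ oc get clusterversion version \
    -o jsonpath='{.status.conditions[?(@.type=="Upgradeable")].message}{"\n"}'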
This behavior applies to 4.3 only. From 4.4 onward, CVO will stomp any changes made to the default SCCs.
This is not an issue in 4.4, since CVO manages the default SCCs. It's not reproducible in 4.4, but QE can mutate any default SCC and validate that:
- this bug is not present, i.e. no DefaultSecurityContextConstraints_Mutated in the `Upgradeable` condition
- CVO stomps the changes made to the default SCC
The update is blocked just because an admin has added accounts/service accounts to the mentioned SCCs. The version operator doesn't distinguish whether the SCC itself was actually modified or not. This is imho wrong. Adding an account to any SCC should be no reason to block the update, nor should memberships be removed automatically. This would break e.g. storage provisioners which rely on privileged operations. To me this check needs to be relaxed to check only the relevant fields of the SCCs.
Hi sople, armin.kunaschik,

In 4.4, any changes to a default SCC will be stomped by CVO. So if we allowed an upgrade (from 4.3) with mutated SCCs, CVO would stomp those changes once the 4.4 upgrade is underway. We are preventing the upgrade in 4.3 so that the admin has a chance to fix the issue first. Otherwise customers would complain that the upgrade has broken their applications (which were supported by the mutated SCCs in 4.3).

A workaround for "adding an account to any SCC" is to use RBAC to give a user access to the default SCC. This way the customer can avoid changing the default SCC.

Hope this clarifies the situation; please let us know if you have any additional questions. Thanks!
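As a minimal sketch of the RBAC approach (the role name "use-privileged-scc", namespace "my-namespace" and service account "myserviceaccount" are illustrative, not from this report):

# grant "use" on the privileged SCC via RBAC instead of editing the SCC's users list
$ oc create clusterrole use-privileged-scc \
    --verb=use \
    --resource=securitycontextconstraints.security.openshift.io \
    --resource-name=privileged
# bind it only in the namespace where the workload runs
$ oc create rolebinding use-privileged-scc \
    --clusterrole=use-privileged-scc \
    --serviceaccount=my-namespace:myserviceaccount \
    -n my-namespace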
First: this is an issue with 4.3 and therefore needs to be fixed in 4.3!

Second: you introduce big compatibility issues with such a change. It is documented nowhere that adding accounts to e.g. the privileged SCC is forbidden. There is even plenty of documentation that advises using the default SCCs to achieve e.g. privileged containers, for example:
https://docs.openshift.com/container-platform/3.11/admin_guide/manage_scc.html#grant-access-to-the-privileged-scc
https://docs.openshift.com/container-platform/4.1/cli_reference/administrator-cli-commands.html#policy

The command "oc adm policy add-scc-to-user privileged -z myserviceaccount" as described in the above links adds(!) myserviceaccount to the privileged SCC. This has been used by OpenShift admins since the beginning!

It is ok to check every definition of an SCC, but NOT the members. You cannot change this in the middle of a release without telling anybody about it!
Hitting this when trying to upgrade from 4.3.8 -> 4.3.9 after installing NetApp Trident dynamic storage provisioner. It adds itself to the privileged scc.
It's not just Trident. I also ran into this when I tried to upgrade a cluster with Trident installed. There is a lot of (commercial) software from Red Hat partners which requires membership in the privileged SCC. Monitoring applications like Dynatrace, log collection software like Splunk collectors, etc. all require that their service accounts be added to the privileged SCC.
Created trident issue here: https://github.com/NetApp/trident/issues/374
Also keep in mind that during upgrades the added software might be required to keep working, so removing the added service account from the SCC is not an option. Imagine the Splunk example, where disabling it would mean that no audit trails are collected during the upgrade phase. This would make upgrades impossible in various environments. Adding a service account to an SCC is not mutating the SCC. What must be ensured is that a) the privileges are not changed and b) none of the built-in (service) accounts are removed from the SCC.
I have a cluster that is already on 4.3.9 and am running into this error as well when trying to upgrade to 4.3.10. Thus, this error is lurking around in 4.3.8+ and affects 4.3.9 as well. Here are the logs from my cluster-version-operator pod:

I0408 15:13:34.220636 1 sync_worker.go:471] Running sync 4.3.10 (force=false) on generation 72 in state Updating at attempt 32
I0408 15:13:34.220700 1 sync_worker.go:477] Loading payload
I0408 15:13:34.267678 1 payload.go:210] Loading updatepayload from "/etc/cvo/updatepayloads/Wu01xRb7K7Vz9hQJYPhGjg"
E0408 15:13:34.560105 1 precondition.go:49] Precondition "ClusterVersionUpgradeable" failed: Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid]
E0408 15:13:34.560228 1 sync_worker.go:329] unable to synchronize image (waiting 2m52.525702462s): Precondition "ClusterVersionUpgradeable" failed because of "DefaultSecurityContextConstraints_Mutated": Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid]
This workaround did the trick while upgrading from 4.3.8 to 4.3.9:

* Remove all users from the privileged SCC except:
  - system:admin
  - system:serviceaccount:openshift-infra:build-controller (this might differ on your cluster)
* Start the update
* Add the removed users back a few moments later, once the control plane update is in progress

The update will finish without problems (see the command sketch below).
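A rough sketch of those steps as commands (the "trident" service account and namespace are just examples; check which entries were actually added on your cluster first):

$ oc get scc privileged -o jsonpath='{.users}{"\n"}'
$ oc adm policy remove-scc-from-user privileged -z trident -n trident   # example entry
$ oc adm upgrade --to=4.3.9
# once the control plane update is underway, add the entry back:
$ oc adm policy add-scc-to-user privileged -z trident -n trident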
In my case I had a modification in anyuid, not in privileged, to be able to run Oracle12g. The other issue is that I'm using nfs-provisioner, which modifies hostmount-anyuid. In this case I don't think the workaround will work, since I'm using NFS backing storage for the whole cluster (including the registry).
@luca: It depends. I'm using Trident with NFS as backing storage and it was working, but probably only because it ran just a few seconds without the necessary SCC and there were no pod restarts or PVC creations. As always: it's a workaround and might not work on every cluster :-)
Nothing to check here for QE. Moving to VERIFIED.
I removed the 2 users added by nfs-provisioner in hostmount-anyuid but I'm still getting the same error in the cluster-version-operator log:

1 precondition.go:49] Precondition "ClusterVersionUpgradeable" failed: Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [hostmount-anyuid]
1 sync_worker.go:329] unable to synchronize image (waiting 2m52.525702462s): Precondition "ClusterVersionUpgradeable" failed because of "DefaultSecurityContextConstraints_Mutated": Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [hostmount-anyuid]
Hello, I just wanted to say that we are using OpenShift in a corporate environment and are affected by this at the moment. We are using Trident.
Ran into the same problem when upgrading from OpenShift 4.3.9 to OpenShift 4.3.10:

>> 4.3.10 cannot be applied: It may not be safe to apply this update

The procedure mentioned in the Red Hat solution 4972291 works. This is not nice, but until this has been fixed by development, one has to live with it for better or worse. I simply cloned the privileged SCC and removed the user I had added for "playground" purposes from the privileged SCC; all users belonging to the OpenShift universe should not be removed. The cluster-version-operator pod is recreated and the upgrade to version 4.3.10 starts.
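Roughly what that looks like as commands (a sketch only; the clone name "privileged-custom" and the removed user "my-playground-user" are examples, and you should follow the linked solution for the exact fields to strip):

$ oc get scc privileged -o yaml > privileged-custom.yaml
# edit privileged-custom.yaml: change metadata.name to "privileged-custom",
# drop metadata.uid/resourceVersion/creationTimestamp, keep the extra users there
$ oc create -f privileged-custom.yaml
# then remove the extra user from the default SCC itself
$ oc adm policy remove-scc-from-user privileged my-playground-user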
I noticed a potential bug today. When running the update via the web console, my updates would fail (even after resetting the SCCs). However, when updating using the "oc adm upgrade --to=4.3.10" command, it would work. I've seen this in both my dev and stage environments now. I will try on my prod env as well.
I tried to figure out how to achieve the same functionality (privileged SCC) with RBAC, but failed. I found https://github.com/openshift/pipelines-catalog/issues/9, but if this is the intended solution, then it actually decreases security. Can somebody point me/us to the advised way of doing things with the same level of security in later OCP versions?
Please ignore my last comment #25. The described procedure is working.
Please help me to restore the default SCC, I'm a novice in OpenShift administration :(

I have a bare-metal OpenShift 4.3.10 and am trying to upgrade to 4.3.13. I did a "oc adm policy add-scc-to-group anyuid system:authenticated" to reuse a Docker image and now I'm getting the upgrade error. I'm looking at the documentation to undo the operation I did, but it's not so clear: could it be a simple "remove-scc-from-group"?

oc adm policy remove-scc-from-group anyuid system:authenticated

Thanks for any support,
Carlo
For OCP 4.3, the default privileged SCC looks like below; you should leave only the following users:

$ oc get scc privileged -o json | jq .users
[
  "system:admin",
  "system:serviceaccount:openshift-infra:build-controller"
]
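For the anyuid case specifically, a sketch of undoing the group change and checking the result (assuming nothing else in that SCC was modified; on a default cluster the groups list should, to my knowledge, contain only system:cluster-admins):

$ oc adm policy remove-scc-from-group anyuid system:authenticated
$ oc get scc anyuid -o json | jq '.users, .groups'
# expect an empty users list and only "system:cluster-admins" in groups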
(In reply to Luca from comment #19)
> I removed the 2 users added by nfs-provisioner in hostmount-anyuid but I'm
> still getting the same error in the operator version log:
>
> 1 precondition.go:49] Precondition "ClusterVersionUpgradeable" failed:
> Cluster operator kube-apiserver cannot be upgraded:
> DefaultSecurityContextConstraintsUpgradeable: Default
> SecurityContextConstraints object(s) have mutated [hostmount-anyuid]
> 1 sync_worker.go:329] unable to synchronize image (waiting 2m52.525702462s):
> Precondition "ClusterVersionUpgradeable" failed because of
> "DefaultSecurityContextConstraints_Mutated": Cluster operator kube-apiserver
> cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default
> SecurityContextConstraints object(s) have mutated [hostmount-anyuid]

Two comments to my own comment:
- the checks to make sure there is no conflict in the SCC configuration are not instant, so it may take a minute for the condition to clear
- the node-exporter SCC is recreated automatically (as expected), so no issue there
Hi vjaypurk, https://access.redhat.com/solutions/4972291
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409