+++ This bug was initially created as a clone of Bug #1827336 +++

+++ This bug was initially created as a clone of Bug #1827335 +++

Description of problem:
In OpenShift 4.3.14 we reverted DefaultSecurityContextConstraints_Mutated: we removed the controller that sets Upgradeable to False if any default SCC has been mutated. But on an affected cluster (pre-4.3.14) that already has user-modified default SCCs, the stale condition does not get removed after the upgrade.

Version-Release number of selected component (if applicable):
OpenShift 4.3.14

How reproducible:
Always

Steps to Reproduce:
1. Install OCP v4.3.13.
2. Trigger Upgradeable=False by mutating the default SCCs:

$ oc patch scc privileged --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user"}]'
$ oc patch scc anyuid --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user"}]'

# ./oc get scc privileged -o json|jq .users
[
  "system:admin",
  "system:serviceaccount:openshift-infra:build-controller",
  "e2e-user"
]

3. Upgrade along the 4.3.13 -> 4.3.14 path:

$ oc adm upgrade --to=4.3.14
Updating to 4.3.14

$ oc get clusterversion version -o json|jq .status.conditions[-1]
{
  "lastTransitionTime": "2020-04-23T04:07:33Z",
  "message": "Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid privileged]",
  "reason": "DefaultSecurityContextConstraints_Mutated",
  "status": "False",
  "type": "Upgradeable"
}

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.14    True        False         34m     Cluster version is 4.3.14

The changes to the default SCCs are still there:

$ oc get scc privileged -o json | jq .users
[
  "system:admin",
  "system:serviceaccount:openshift-infra:build-controller",
  "e2e-user"
]

$ oc get scc anyuid -o json | jq .users
[
  "e2e-user"
]

Actual results:
$ oc get clusterversion version -o json|jq .status.conditions[-1]
{
  "lastTransitionTime": "2020-04-23T04:07:33Z",
  "message": "Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid privileged]",
  "reason": "DefaultSecurityContextConstraints_Mutated",
  "status": "False",
  "type": "Upgradeable"
}

Expected results:
The "Upgradeable" condition of clusterversion/version should not have the reason DefaultSecurityContextConstraints_Mutated.

Additional info:
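A minimal sketch of how the "Actual results" check could be automated, assuming the conditions array has already been fetched (for example from `oc get clusterversion version -o json`) and loaded into Python. The helper names `find_condition` and `has_stale_scc_condition` are hypothetical, not part of any OpenShift tooling; the sample data is copied from the report above.

```python
# Reason string taken from the condition shown in this report.
STALE_REASON = "DefaultSecurityContextConstraints_Mutated"


def find_condition(conditions, cond_type):
    """Return the first condition dict with the given type, or None."""
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond
    return None


def has_stale_scc_condition(conditions):
    """True if Upgradeable is False because of the mutated-SCC reason."""
    cond = find_condition(conditions, "Upgradeable")
    return (
        cond is not None
        and cond.get("status") == "False"
        and cond.get("reason") == STALE_REASON
    )


# Condition copied from the ClusterVersion output in the description.
sample_conditions = [
    {
        "lastTransitionTime": "2020-04-23T04:07:33Z",
        "message": (
            "Cluster operator kube-apiserver cannot be upgraded: "
            "DefaultSecurityContextConstraintsUpgradeable: Default "
            "SecurityContextConstraints object(s) have mutated "
            "[anyuid privileged]"
        ),
        "reason": "DefaultSecurityContextConstraints_Mutated",
        "status": "False",
        "type": "Upgradeable",
    }
]

if __name__ == "__main__":
    print(has_stale_scc_condition(sample_conditions))
```

Looking the condition up by type rather than by position avoids depending on `conditions[-1]` happening to be the Upgradeable entry.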
This 4.3.z bug should only depend on the 4.4 bug. The 4.4 bug depends on the 4.5 bug, so we do not need or want a direct 4.3 -> 4.5 link.
For QA, we believe the test would be:

1. Install 4.3.8-4.3.12
2. Modify a default SCC (say add a user to it)
3. Update to 4.3.13 with --force
4. Update to 4.3.15 without --force
5. Update to 4.4 without --force (this will fail)

1. Install 4.3.8-4.3.12
2. Modify a default SCC
3. Update to 4.3.13 with --force
4. Update to 4.3.16 without --force
5. Update to 4.4 without --force (this should work)
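The expectation behind the two scenarios can be written down as a small predicate. This is only a sketch: it assumes, based on the test plan above, that 4.3.16 is the first 4.3.z release that clears the stale condition, and it deliberately refuses to answer for other minor versions because the plan says nothing about them. The names `parse_version`, `FIRST_FIXED_4_3` and `expect_condition_cleared` are hypothetical.

```python
# Assumption from the test plan: 4.3.16 is the first 4.3.z with the fix.
FIRST_FIXED_4_3 = (4, 3, 16)


def parse_version(version):
    """Parse 'X.Y.Z' (ignoring any '-suffix') into a tuple of ints."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def expect_condition_cleared(version):
    """Whether updating to this 4.3.z release should clear the stale condition."""
    parsed = parse_version(version)
    if parsed[:2] != (4, 3):
        raise ValueError("only 4.3.z releases are covered by this sketch")
    return parsed >= FIRST_FIXED_4_3


if __name__ == "__main__":
    for candidate in ("4.3.15", "4.3.16", "4.3.17"):
        print(candidate, expect_condition_cleared(candidate))
```

Comparing integer tuples avoids the string-comparison trap where "4.3.9" would sort after "4.3.16".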
> 4. Update to 4.3.15 without --force

If you do this via --to, you'll have to change your channel to candidate-4.4 first (4.4 to also set up for the next step's 4.4 RC attempt):

$ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/channel", "value": "candidate-4.4"}]'

> 4. Update to 4.3.16 without --force

4.3.16 is getting built right now, so it should exist by the time you run this. Otherwise you may be able to use a recent nightly [1].

> 5. Update to 4.4 without --force (this should work)

If 4.3.16 is built and in candidate-4.4 when you test, this will just be '--to 4.3.16'. If you test earlier, you'll have to use '--allow-explicit-upgrade --to-image $BY_DIGEST_PULLSPEC'.

[1]: https://openshift-release.svc.ci.openshift.org/#4.3.0-0.nightly
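For anyone scripting the channel switch, here is a small sketch that builds the same JSON patch document as the `oc patch` command above instead of hand-quoting it. It only constructs the argument list and does not contact a cluster; `channel_patch` and `patch_command` are hypothetical helper names.

```python
import json
import shlex


def channel_patch(channel):
    """Build the JSON patch body that sets spec.channel."""
    return json.dumps(
        [{"op": "add", "path": "/spec/channel", "value": channel}]
    )


def patch_command(channel):
    """Argument list mirroring the `oc patch` invocation in this comment."""
    return [
        "oc", "patch", "clusterversion", "version",
        "--type", "json",
        "-p", channel_patch(channel),
    ]


if __name__ == "__main__":
    # shlex.join quotes the patch body so the line can be pasted into a shell.
    print(shlex.join(patch_command("candidate-4.4")))
```

Passing the list to `subprocess.run` directly would avoid shell quoting altogether.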
https://openshift-release.svc.ci.openshift.org/releasestream/4.3.0-0.nightly/release/4.3.0-0.nightly-2020-04-23-225015 is an accepted nightly which can stand in for 4.3.16 in these tests.
4.1.16 is out [1]. Once it gets accepted (hopefully in the next hour), we'll drop it into candidate-4.4 [2]. [1]: https://openshift-release.svc.ci.openshift.org/releasestream/4-stable/release/4.3.16 [2]: https://github.com/openshift/cincinnati-graph-data/pull/205
(In reply to W. Trevor King from comment #7)
> 4.1.16 is out [1]. Once it gets accepted (hopefully in the next hour),

I think this should be 4.3.16?

> we'll drop it into candidate-4.4 [2].
>
> [1]: https://openshift-release.svc.ci.openshift.org/releasestream/4-stable/release/4.3.16
> [2]: https://github.com/openshift/cincinnati-graph-data/pull/205
(In reply to Eric Paris from comment #2)
> For QA, we believe the test would be
> 1. Install 4.3.8-4.3.12
> 2. Modify a default SCC (say add a user to it)
> 3. Update to 4.3.13 with --force

Done as expected.

$ oc adm upgrade --to=4.3.13 --force=true
Updating to 4.3.13

> 4. Update to 4.3.15 without --force

$ oc adm upgrade
Cluster version is 4.3.13

Updates:

VERSION   IMAGE
4.3.14    registry.svc.ci.openshift.org/ocp/release@sha256:751ec35a2777a629b77615c6cc50d14cc278557fa4247342945a9dbcf3fc746b

Unable to upgrade to 4.3.15 directly, first to 4.3.14, then 4.3.15.

$ oc adm upgrade --to=4.3.14
Updating to 4.3.14

$ oc get clusterversion version
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.14    True        False         24m     Cluster version is 4.3.14

$ oc adm upgrade --to=4.3.15
Updating to 4.3.15

This completed as expected. Afterwards, v4.3.15 was dropped from the upgrade path.

> 5. Update to 4.4 without --force (this will fail)

Set the correct channel for OCP 4.4 by using the web console.

$ oc adm upgrade
Cluster version is 4.3.15

Updates:

VERSION       IMAGE
...
4.4.0-rc.11   registry.svc.ci.openshift.org/ocp/release@sha256:6f09986c2c878f9675afcf9ee5d4720cf8ec0b9b832ab9400dd8df98cd2d6f07

$ oc adm upgrade --to=4.4.0-rc.11
Updating to 4.4.0-rc.11

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.15    True        True          54s     Working towards 4.4.0-rc.11: 11% complete

This step is not as expected.

--------------------------------

> 1. Install 4.3.8-4.3.12
> 2. Modify a default SCC
> 3. Update to 4.3.13 with --force
> 4. Update to 4.3.16 without --force
> 5. Update to 4.4 without --force (this should work)

This case cannot proceed, since 4.3.16 is unavailable. For details, see https://coreos.slack.com/archives/CJARLA942/p1587719406162100
As discussed this is an upgrade blocker for 4.3.Z stream
Removing the upgradeblocker as the upgrade works fine.
> Unable to upgrade to 4.3.15 directly, first to 4.3.14, then 4.3.15.

Did you switch into candidate-4.4? Using [1]:

$ CHANNEL=candidate-4.4 ~/src/openshift/cincinnati/hack/available-updates.sh 4.3.13
4.3.14       quay.io/openshift-release-dev/ocp-release@sha256:751ec35a2777a629b77615c6cc50d14cc278557fa4247342945a9dbcf3fc746b  https://access.redhat.com/errata/RHBA-2020:1529
4.3.15       quay.io/openshift-release-dev/ocp-release@sha256:0e9642d28c12f5f54c1ab0fffbfd866daa6179a900e745a935f17f8e6e1e28fc  https://access.redhat.com/errata/RHBA-2020:1529
4.3.17       quay.io/openshift-release-dev/ocp-release@sha256:1bc57b872cb878d8cfa43da4da30726d8367f8439934cd35797bde5fbaa76f15  https://access.redhat.com/errata/RHBA-2020:1529
4.4.0-rc.10  quay.io/openshift-release-dev/ocp-release@sha256:565b5ddcfebaeb83489570c28bdbc1b47a11f2b26a29b6b8f453d6fc10f068e9  https://access.redhat.com/errata/RHBA-2020:0581
4.4.0-rc.11  quay.io/openshift-release-dev/ocp-release@sha256:6f09986c2c878f9675afcf9ee5d4720cf8ec0b9b832ab9400dd8df98cd2d6f07  https://access.redhat.com/errata/RHBA-2020:0581
4.4.0-rc.9   quay.io/openshift-release-dev/ocp-release@sha256:f3342423306f95a524357dd71d832dec6274fb46d560696d9df0a3af40dd7820  https://access.redhat.com/errata/RHBA-2020:0581

so you should have been able to go straight from 4.3.13 -> 4.3.15 (and now from 4.3.13 -> 4.3.17).

> $ oc get clusterversion
> NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
> version   4.3.15    True        True          54s     Working towards 4.4.0-rc.11: 11% complete
>
> This step is not as expected.

I dunno what happened here. It would be good to double-check your ClusterVersion conditions before launching the update. I have some CVO PRs up around bug 1827166 to add logging that will help us understand why an update that we expect preconditions to block fails to get blocked, although you have to follow the original CVO's logs to collect those messages (examples in my comments in that bug), and my PRs haven't landed in any branches yet. But... any lack-of-block here would be a CVO bug like bug 1827166.

To verify the kube-apiserver-operator change, you should just look at:

$ oc get -o json clusteroperator kube-apiserver | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + " " + .status + " " + .message'

and see that 4.3.15 does not clear the Upgradeable=False condition, while continuing on to 4.3.17 will clear that condition. Then we can mark this as VERIFIED and file any follow-up bugs against the CVO about anything that is not making sense there.

[1]: https://github.com/openshift/cincinnati/blob/master/hack/available-updates.sh
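For test scripts that already hold the ClusterOperator JSON in memory, the jq filter above can be approximated in Python as follows. This is a sketch under the assumption that the object has the usual `status.conditions` list; `format_conditions` is a hypothetical name. Missing fields are rendered as empty strings, which is intended to mirror how jq treats adding `null` to a string.

```python
def format_conditions(cluster_operator):
    """One 'time type status message' line per condition, like the jq filter."""
    lines = []
    conditions = cluster_operator.get("status", {}).get("conditions", [])
    for cond in conditions:
        fields = [
            cond.get("lastTransitionTime") or "",
            cond.get("type") or "",
            cond.get("status") or "",
            cond.get("message") or "",  # absent on some conditions
        ]
        lines.append(" ".join(fields))
    return lines


# Hypothetical in-memory object; the condition values are taken from
# kube-apiserver output quoted elsewhere in this bug.
sample_operator = {
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "2020-04-26T09:01:37Z",
                "type": "Upgradeable",
                "status": "False",
                "message": (
                    "DefaultSecurityContextConstraintsUpgradeable: Default "
                    "SecurityContextConstraints object(s) have mutated "
                    "[anyuid privileged]"
                ),
            },
            {
                "lastTransitionTime": "2020-04-26T10:03:31Z",
                "type": "Upgradeable",
                "status": "True",
                "reason": "AsExpected",
            },
        ]
    }
}

if __name__ == "__main__":
    for line in format_conditions(sample_operator):
        print(line)
```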
Tried with the latest 4.3.17. Some steps behaved as expected and some did not; details below.

1. Install 4.3.9

2. Modify a default SCC

$ oc patch scc privileged --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user1"}]'
$ oc patch scc anyuid --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user1"}]'

Confirmed the changes:

$ oc get scc privileged -o json | jq .users
[
  "system:admin",
  "system:serviceaccount:openshift-infra:build-controller",
  "e2e-user1"
]

$ oc get scc anyuid -o json | jq .users
[
  "e2e-user1"
]

Checking the Upgradeable status:

$ oc get co kube-apiserver -o json | jq -r '.status.conditions[] | select(.type == "Upgradeable")'
{
  "lastTransitionTime": "2020-04-26T09:01:37Z",
  "message": "DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid privileged]",
  "reason": "DefaultSecurityContextConstraints_Mutated",
  "status": "False",
  "type": "Upgradeable"
}

Checking the kube-apiserver ClusterOperator conditions:

$ oc get -o json clusteroperator kube-apiserver | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + " " + .status + " " + .message'
2020-04-26T08:45:23Z Degraded False NodeControllerDegraded: All master nodes are ready
2020-04-26T08:49:00Z Progressing False Progressing: 3 nodes are at revision 6
2020-04-26T08:36:47Z Available True Available: 3 nodes are active; 3 nodes are at revision 6
2020-04-26T09:01:37Z Upgradeable False DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid privileged]

3. Update to 4.3.13 with --force // as expected.
$ oc adm upgrade
Cluster version is 4.3.9

Updates:

VERSION   IMAGE
4.3.13    quay.io/openshift-release-dev/ocp-release@sha256:e1ebc7295248a8394afb8d8d918060a7cc3de12c491283b317b80b26deedfe61

Change the channel to candidate-4.4 first:

$ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/channel", "value": "candidate-4.4"}]'
clusterversion.config.openshift.io/version patched

$ oc adm upgrade
Cluster version is 4.3.9

Updates:

VERSION       IMAGE
4.3.13        quay.io/openshift-release-dev/ocp-release@sha256:e1ebc7295248a8394afb8d8d918060a7cc3de12c491283b317b80b26deedfe61
4.3.14        quay.io/openshift-release-dev/ocp-release@sha256:751ec35a2777a629b77615c6cc50d14cc278557fa4247342945a9dbcf3fc746b
4.3.15        quay.io/openshift-release-dev/ocp-release@sha256:0e9642d28c12f5f54c1ab0fffbfd866daa6179a900e745a935f17f8e6e1e28fc
4.3.17        quay.io/openshift-release-dev/ocp-release@sha256:1bc57b872cb878d8cfa43da4da30726d8367f8439934cd35797bde5fbaa76f15
4.4.0-rc.6    quay.io/openshift-release-dev/ocp-release@sha256:2532227a868fca11a0cb7563232a26ab9a682d8ee1bb72fd416c4e7789d7ce11
4.4.0-rc.7    quay.io/openshift-release-dev/ocp-release@sha256:df3b7a74c8590a932c00fd9b1ef6c1fb2a0bfd1c3643b78ae378cadee3258c03
4.4.0-rc.8    quay.io/openshift-release-dev/ocp-release@sha256:1d1254b27532ceefabef4b94d46a65baa4876de47e09f2f7c26c138691413889
4.4.0-rc.9    quay.io/openshift-release-dev/ocp-release@sha256:f3342423306f95a524357dd71d832dec6274fb46d560696d9df0a3af40dd7820
4.4.0-rc.10   quay.io/openshift-release-dev/ocp-release@sha256:565b5ddcfebaeb83489570c28bdbc1b47a11f2b26a29b6b8f453d6fc10f068e9
4.4.0-rc.11   quay.io/openshift-release-dev/ocp-release@sha256:6f09986c2c878f9675afcf9ee5d4720cf8ec0b9b832ab9400dd8df98cd2d6f07

Trying to upgrade to 4.3.13 without --force gets stuck:
$ oc adm upgrade --to=4.3.13
Updating to 4.3.13

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.9     True        True          3m58s   Unable to apply 4.3.13: it may not be safe to apply this update

Clear the update field to roll back to 4.3.9:

$ oc adm upgrade --clear=true
Cleared the update field, still at 4.3.13

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.9     True        False         48s     Cluster version is 4.3.9

Then upgrade with --force:

$ oc adm upgrade --to=4.3.13 --force=true
Updating to 4.3.13

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.13    True        False         23s     Cluster version is 4.3.13

4. Update to 4.3.17 without --force // as expected.

$ oc adm upgrade --to=4.3.17
Updating to 4.3.17

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.17    True        False         13s     Cluster version is 4.3.17

$ oc get co kube-apiserver -o json | jq -r '.status.conditions[] | select(.type == "Upgradeable")'
{
  "lastTransitionTime": "2020-04-26T10:03:31Z",
  "reason": "AsExpected",
  "status": "True",
  "type": "Upgradeable"
}

Add another user to change the default SCCs again:

$ oc patch scc privileged --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user2"}]'
securitycontextconstraints.security.openshift.io/privileged patched

$ oc patch scc anyuid --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user2"}]'
securitycontextconstraints.security.openshift.io/anyuid patched

$ oc get co kube-apiserver -o json | jq -r '.status.conditions[] | select(.type == "Upgradeable")'
{
  "lastTransitionTime": "2020-04-26T10:03:31Z",
  "reason": "AsExpected",
  "status": "True",
  "type": "Upgradeable"
}

We can see that 4.3.17 removed the stale DefaultSecurityContextConstraints_Mutated condition as expected, and that mutating the default SCCs again does not bring it back.

5. Update to 4.4 without --force (this should work) // not as expected.

$ oc adm upgrade
Cluster version is 4.3.17

No updates available.
You may force an upgrade to a specific release image, but doing so may not be supported and result in downtime or data loss.

There is no available upgrade path from 4.3.17, so we have to do the following:

$ oc patch clusterversion/version --patch '{"spec":{"upstream":"https://openshift-release.svc.ci.openshift.org/graph"}}' --type=merge
clusterversion.config.openshift.io/version patched

$ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/channel", "value": "stable-4.4"}]'
clusterversion.config.openshift.io/version patched

After that, some nightly builds are available for upgrade:

$ oc adm upgrade
Cluster version is 4.3.17

Updates:

VERSION                             IMAGE
...
4.4.0-0.nightly-2020-04-26-070343   registry.svc.ci.openshift.org/ocp/release@sha256:61064e1a780a55b20bafc89e4936c317d4ac4c6fee8759b05ea0e408d3b0e7af

Tried to upgrade without --force; it does not work as expected:

$ oc adm upgrade --to=4.4.0-0.nightly-2020-04-26-070343
Updating to 4.4.0-0.nightly-2020-04-26-070343

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.17    True        True          2m6s    Unable to apply 4.4.0-0.nightly-2020-04-26-070343: the image may not be safe to use

Had to roll back to 4.3.17 and retry with --force:

$ oc adm upgrade --clear=true
Cleared the update field, still at 4.4.0-0.nightly-2020-04-26-070343

$ oc adm upgrade --to=4.4.0-0.nightly-2020-04-26-070343 --force=true
Updating to 4.4.0-0.nightly-2020-04-26-070343

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-04-26-070343   True        False         6m51s   Cluster version is 4.4.0-0.nightly-2020-04-26-070343

Anyway, 4.3.17 works as expected. As for the problem with the upgrade path to 4.4, I suppose it is a release build problem.
> 5. Update to 4.4 without --force (this should work) // not as expected.
> $ oc adm upgrade
> Cluster version is 4.3.17
>
> No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and result in downtime or data loss.

Because there are not yet any 4.4 releases that include 4.3.17 as an update source. So this is working as expected.

> $ oc get clusterversion
> NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
> version   4.3.17    True        True          2m6s    Unable to apply 4.4.0-0.nightly-2020-04-26-070343: the image may not be safe to use

Because 4.3.17 only trusts official RH keys, and nightlies are not signed by those keys (or at all). This would have worked if you'd used '--allow-explicit-upgrade --to-image $BY_DIGEST_PULLSPEC' to go to a 4.4 nightly.
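As a rough illustration of the two invocation shapes mentioned in this thread, the sketch below picks `--to` when the target appears in the cluster's available updates and otherwise falls back to `--allow-explicit-upgrade --to-image` with a by-digest pullspec. It only builds an argument list; `upgrade_command` is a hypothetical helper, and whether a given target actually passes the CVO's preconditions is outside what this code checks.

```python
def upgrade_command(target_version, available_updates, pullspec=None):
    """Build an `oc adm upgrade` argument list for the given target.

    available_updates: versions listed by `oc adm upgrade` for the
    current channel. pullspec: by-digest release image, used only when
    the target is not among the available updates.
    """
    if target_version in available_updates:
        return ["oc", "adm", "upgrade", "--to", target_version]
    if not pullspec or "@sha256:" not in pullspec:
        raise ValueError(
            "target is not an available update; a by-digest pullspec is required"
        )
    return [
        "oc", "adm", "upgrade",
        "--allow-explicit-upgrade",
        "--to-image", pullspec,
    ]


if __name__ == "__main__":
    # Versions taken from the candidate-4.4 listing quoted in this bug.
    print(upgrade_command("4.3.17", ["4.3.14", "4.3.15", "4.3.17"]))
```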
Thanks for the explanation, W. Trevor; it cleared up my confusion.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:1529