Bug 1827337 - Remove stale condition DefaultSecurityContextConstraints_Mutated
Summary: Remove stale condition DefaultSecurityContextConstraints_Mutated
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.3.z
Assignee: Abu Kashem
QA Contact: Ke Wang
URL:
Whiteboard:
Depends On: 1827336
Blocks:
Reported: 2020-04-23 16:52 UTC by Abu Kashem
Modified: 2020-04-30 01:28 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1827336
Environment:
Last Closed: 2020-04-30 01:28:30 UTC
Target Upstream Version:
Embargoed:



Links
System ID | Private | Priority | Status | Summary | Last Updated
Github openshift cluster-kube-apiserver-operator pull 838 | 0 | None | closed | Bug 1827337: Remove stale condition DefaultSecurityContextConstraintsUpgradeable | 2020-09-04 09:05:59 UTC
Red Hat Product Errata RHBA-2020:1529 | 0 | None | None | None | 2020-04-30 01:28:38 UTC

Description Abu Kashem 2020-04-23 16:52:36 UTC
+++ This bug was initially created as a clone of Bug #1827336 +++

+++ This bug was initially created as a clone of Bug #1827335 +++

Description of problem:
In OpenShift 4.3.14 we have reverted DefaultSecurityContextConstraints_Mutated. We removed the controller that sets Upgradeable to False if any default SCC has been mutated.

But on an affected cluster (pre 4.3.14) that already has user-modified default SCCs the stale condition does not get removed after upgrade.



Version-Release number of selected component (if applicable):
OpenShift 4.3.14

How reproducible:
Always

Steps to Reproduce:
1. Install OCP v4.3.13.
2. Trigger Upgradeable=False by mutating a default SCC.
Change the default SCCs:
$ oc patch scc privileged --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user"}]'
$ oc patch scc anyuid --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user"}]'
    
# ./oc get scc privileged -o json|jq .users
[
  "system:admin",
  "system:serviceaccount:openshift-infra:build-controller",
  "e2e-user"
]

3. Upgrade along the 4.3.13 -> 4.3.14 path.
$ oc adm upgrade --to=4.3.14
Updating to 4.3.14

$ oc get clusterversion version -o json|jq .status.conditions[-1]
{
  "lastTransitionTime": "2020-04-23T04:07:33Z",
  "message": "Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid privileged]",
  "reason": "DefaultSecurityContextConstraints_Mutated",
  "status": "False",
  "type": "Upgradeable"
}

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.14    True        False         34m     Cluster version is 4.3.14

Checking the default SCCs, the changes are still there.

$ oc get scc privileged -o json | jq .users
[
  "system:admin",
  "system:serviceaccount:openshift-infra:build-controller",
  "e2e-user"
]

$ oc get scc anyuid -o json | jq .users
[
  "e2e-user"
]

Actual results:
$ oc get clusterversion version -o json|jq .status.conditions[-1]
{
  "lastTransitionTime": "2020-04-23T04:07:33Z",
  "message": "Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid privileged]",
  "reason": "DefaultSecurityContextConstraints_Mutated",
  "status": "False",
  "type": "Upgradeable"
}

Expected results:
"Upgradeable" condition of clusterversion/version should not have DefaultSecurityContextConstraints_Mutated.
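
One way to check the expected result above, as a minimal sketch. It assumes jq is installed, and has_stale_condition is a hypothetical helper name, not part of oc. The helper reads ClusterVersion JSON on stdin and succeeds only if an Upgradeable condition still carries the DefaultSecurityContextConstraints_Mutated reason.

```shell
# Hypothetical helper: exit 0 if the stale reason is still present on an
# Upgradeable condition in the JSON read from stdin, exit 1 otherwise.
has_stale_condition() {
  jq -e '
    [.status.conditions[]?
     | select(.type == "Upgradeable"
              and .reason == "DefaultSecurityContextConstraints_Mutated")]
    | length > 0' >/dev/null
}

# On a live cluster (not run here):
#   oc get clusterversion version -o json | has_stale_condition \
#     && echo "stale condition still present"
```

On a release that carries the fix, the helper would be expected to exit non-zero once the condition has been removed.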

Additional info:

Comment 1 W. Trevor King 2020-04-23 17:02:30 UTC
This 4.3.z bug should only depend on the 4.4 bug.  The 4.4 bug depends on the 4.5 bug, so we do not need or want a direct 4.3 -> 4.5 link.

Comment 2 Eric Paris 2020-04-23 21:46:22 UTC
For QA, we believe the test would be
1. Install 4.3.8-4.3.12
2. Modify a default SCC (say add a user to it)
3. Update to 4.3.13 with --force
4. Update to 4.3.15 without --force
5. Update to 4.4 without --force (this will fail)


1. Install 4.3.8-4.3.12
2. Modify a default SCC
3. Update to 4.3.13 with --force
4. Update to 4.3.16 without --force
5. Update to 4.4 without --force (this should work)

Comment 3 W. Trevor King 2020-04-23 21:55:37 UTC
> 4. Update to 4.3.15 without --force

If you do this via --to, you'll have to change your channel to candidate-4.4 first (4.4 to also set up for the next step's 4.4 RC attempt):

  $ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/channel", "value": "candidate-4.4"}]'

> 4. Update to 4.3.16 without --force

4.3.16 is getting built right now, so it should exist by the time you run this.  Otherwise you may be able to use a recent nightly [1].

> 5. Update to 4.4 without --force (this should work)

If 4.3.16 is built and in candidate-4.4 when you test, this will just be '--to 4.3.16'.  If you test earlier, you'll have to use '--allow-explicit-upgrade --to-image $BY_DIGEST_PULLSPEC'.

[1]: https://openshift-release.svc.ci.openshift.org/#4.3.0-0.nightly

Comment 6 W. Trevor King 2020-04-24 01:00:12 UTC
https://openshift-release.svc.ci.openshift.org/releasestream/4.3.0-0.nightly/release/4.3.0-0.nightly-2020-04-23-225015 is an accepted nightly which can stand in for 4.1.16 in these tests.

Comment 7 W. Trevor King 2020-04-24 03:32:33 UTC
4.1.16 is out [1].  Once it gets accepted (hopefully in the next hour), we'll drop it into candidate-4.4 [2].

[1]: https://openshift-release.svc.ci.openshift.org/releasestream/4-stable/release/4.3.16
[2]: https://github.com/openshift/cincinnati-graph-data/pull/205

Comment 8 Ke Wang 2020-04-24 10:00:15 UTC
(In reply to W. Trevor King from comment #7)
> 4.1.16 is out [1].  Once it gets accepted (hopefully in the next hour),
I think this should be 4.3.16?

> we'll drop it into candidate-4.4 [2].
> 
> [1]:
> https://openshift-release.svc.ci.openshift.org/releasestream/4-stable/
> release/4.3.16
> [2]: https://github.com/openshift/cincinnati-graph-data/pull/205

Comment 9 Ke Wang 2020-04-24 11:01:27 UTC
(In reply to Eric Paris from comment #2)
> For QA, we believe the test would be
> 1. Install 4.3.8-4.3.12
> 2. Modify a default SCC (say add a user to it)
> 3. Update to 4.3.13 with --force

Done as expected.
$ oc adm upgrade --to=4.3.13 --force=true
Updating to 4.3.13


> 4. Update to 4.3.15 without --force

$ oc adm upgrade
Cluster version is 4.3.13

Updates:

VERSION IMAGE
4.3.14  registry.svc.ci.openshift.org/ocp/release@sha256:751ec35a2777a629b77615c6cc50d14cc278557fa4247342945a9dbcf3fc746b

Unable to upgrade to 4.3.15 directly, first to 4.3.14, then 4.3.15.

$ oc adm upgrade --to=4.3.14
Updating to 4.3.14

$  oc get clusterversion version
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.14    True        False         24m     Cluster version is 4.3.14

$ oc adm upgrade --to=4.3.15
Updating to 4.3.15

This completed as expected. Afterwards, v4.3.15 was dropped from the upgrade path.


> 5. Update to 4.4 without --force (this will fail)

Set the correct channel for OCP 4.4 by using the web console
$ oc adm upgrade
Cluster version is 4.3.15

Updates:

VERSION                           IMAGE
...
4.4.0-rc.11                       registry.svc.ci.openshift.org/ocp/release@sha256:6f09986c2c878f9675afcf9ee5d4720cf8ec0b9b832ab9400dd8df98cd2d6f07

$ oc adm upgrade --to=4.4.0-rc.11
Updating to 4.4.0-rc.11

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.15    True        True          54s     Working towards 4.4.0-rc.11: 11% complete

This step is not as expected.


--------------------------------
> 
> 
> 1. Install 4.3.8-4.3.12
> 2. Modify a default SCC
> 3. Update to 4.3.13 with --force
> 4. Update to 4.3.16 without --force
> 5. Update to 4.4 without --force (this should work)

This case is unable to proceed, since 4.3.16 is unavailable. For details see https://coreos.slack.com/archives/CJARLA942/p1587719406162100

Comment 10 Lalatendu Mohanty 2020-04-24 11:12:04 UTC
As discussed, this is an upgrade blocker for the 4.3.z stream.

Comment 11 Lalatendu Mohanty 2020-04-24 16:22:13 UTC
Removing the upgradeblocker as the upgrade works fine.

Comment 12 W. Trevor King 2020-04-25 05:01:15 UTC
> Unable to upgrade to 4.3.15 directly, first to 4.3.14, then 4.3.15.

Did you switch into candidate-4.4?  Using [1]:

$ CHANNEL=candidate-4.4 ~/src/openshift/cincinnati/hack/available-updates.sh 4.3.13
4.3.14	quay.io/openshift-release-dev/ocp-release@sha256:751ec35a2777a629b77615c6cc50d14cc278557fa4247342945a9dbcf3fc746b	https://access.redhat.com/errata/RHBA-2020:1529
4.3.15	quay.io/openshift-release-dev/ocp-release@sha256:0e9642d28c12f5f54c1ab0fffbfd866daa6179a900e745a935f17f8e6e1e28fc	https://access.redhat.com/errata/RHBA-2020:1529
4.3.17	quay.io/openshift-release-dev/ocp-release@sha256:1bc57b872cb878d8cfa43da4da30726d8367f8439934cd35797bde5fbaa76f15	https://access.redhat.com/errata/RHBA-2020:1529
4.4.0-rc.10	quay.io/openshift-release-dev/ocp-release@sha256:565b5ddcfebaeb83489570c28bdbc1b47a11f2b26a29b6b8f453d6fc10f068e9	https://access.redhat.com/errata/RHBA-2020:0581
4.4.0-rc.11	quay.io/openshift-release-dev/ocp-release@sha256:6f09986c2c878f9675afcf9ee5d4720cf8ec0b9b832ab9400dd8df98cd2d6f07	https://access.redhat.com/errata/RHBA-2020:0581
4.4.0-rc.9	quay.io/openshift-release-dev/ocp-release@sha256:f3342423306f95a524357dd71d832dec6274fb46d560696d9df0a3af40dd7820	https://access.redhat.com/errata/RHBA-2020:0581

so you should have been able to go straight from 4.3.13 -> 4.3.15 (and now from 4.3.13 -> 4.3.17).

> $ oc get clusterversion
> NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
> version   4.3.15    True        True          54s     Working towards 4.4.0-rc.11: 11% complete
>
> This step is not as expected.

I dunno what happened here.  Would be good to double-check your ClusterVersion conditions before launching the update.  I have some CVO PRs up around bug 1827166 to add logging that will help understand why an update that we expect preconditions to block fails to get blocked, although you have to follow the original CVO's logs to collect those messages (examples in my comments in that bug), and my PRs haven't landed in any branches yet.  But... any lack-of-block here would be a CVO bug like bug 1827166.  To verify the kube-apiserver-operator change, you should just look at:

  $ oc get -o json clusteroperator kube-apiserver | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + " " + .status + " " + .message'

and see that 4.3.15 does not clear the Upgradeable=False condition while continuing on to 4.3.17 will clear that condition.  Then we can mark this as VERIFIED and file any follow-up bugs against the CVO about anything that is not making sense there.

[1]: https://github.com/openshift/cincinnati/blob/master/hack/available-updates.sh
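
To watch for the condition clearing after an update, a polling sketch along these lines may help. retry_until and upgradeable_is_true are hypothetical helper names, the attempt count and interval are arbitrary, and the check assumes that a cleared condition reports Upgradeable=True.

```shell
# retry_until ATTEMPTS INTERVAL CMD [ARGS...]
# Runs CMD until it succeeds, sleeping INTERVAL seconds between attempts.
# The body is a subshell so the loop variables do not leak to the caller.
retry_until() (
  attempts=$1
  interval=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      exit 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
  exit 1
)

# Assumption: a cleared condition reports Upgradeable=True.
upgradeable_is_true() {
  status=$(oc get clusteroperator kube-apiserver \
    -o jsonpath='{.status.conditions[?(@.type=="Upgradeable")].status}')
  [ "$status" = "True" ]
}

# Example (needs a live cluster): poll every 10 seconds, up to 30 times.
#   retry_until 30 10 upgradeable_is_true
```

If the helper times out, the jq one-liner above still shows the full condition list for closer inspection.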

Comment 13 Ke Wang 2020-04-26 12:40:16 UTC
Tried with the latest 4.3.17. Some things worked as expected and some did not; details below.


1. Install 4.3.9
   

2. Modify a default SCC
$ oc patch scc privileged --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user1"}]'
$ oc patch scc anyuid --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user1"}]'

Confirmed changes,
$ oc get scc privileged -o json | jq .users
[
  "system:admin",
  "system:serviceaccount:openshift-infra:build-controller",
  "e2e-user1"
]
$ oc get scc anyuid -o json | jq .users
[
  "e2e-user1"
]

Checking Upgradeable status,
$ oc get co kube-apiserver -o json | jq -r '.status.conditions[] | select(.type == "Upgradeable")'
{
  "lastTransitionTime": "2020-04-26T09:01:37Z",
  "message": "DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid privileged]",
  "reason": "DefaultSecurityContextConstraints_Mutated",
  "status": "False",
  "type": "Upgradeable"
}

Checking the kube-apiserver clusteroperator conditions,
$ oc get -o json clusteroperator kube-apiserver | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + " " + .status + " " + .message'
2020-04-26T08:45:23Z Degraded False NodeControllerDegraded: All master nodes are ready
2020-04-26T08:49:00Z Progressing False Progressing: 3 nodes are at revision 6
2020-04-26T08:36:47Z Available True Available: 3 nodes are active; 3 nodes are at revision 6
2020-04-26T09:01:37Z Upgradeable False DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid privileged]


3. Update to 4.3.13 with --force  //as expected.
$ oc adm upgrade
Cluster version is 4.3.9

Updates:

VERSION IMAGE
4.3.13  quay.io/openshift-release-dev/ocp-release@sha256:e1ebc7295248a8394afb8d8d918060a7cc3de12c491283b317b80b26deedfe61

Change channel to candidate-4.4 first:
$ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/channel", "value": "candidate-4.4"}]'
clusterversion.config.openshift.io/version patched

$ oc adm upgrade
Cluster version is 4.3.9

Updates:

VERSION     IMAGE
4.3.13      quay.io/openshift-release-dev/ocp-release@sha256:e1ebc7295248a8394afb8d8d918060a7cc3de12c491283b317b80b26deedfe61
4.3.14      quay.io/openshift-release-dev/ocp-release@sha256:751ec35a2777a629b77615c6cc50d14cc278557fa4247342945a9dbcf3fc746b
4.3.15      quay.io/openshift-release-dev/ocp-release@sha256:0e9642d28c12f5f54c1ab0fffbfd866daa6179a900e745a935f17f8e6e1e28fc
4.3.17      quay.io/openshift-release-dev/ocp-release@sha256:1bc57b872cb878d8cfa43da4da30726d8367f8439934cd35797bde5fbaa76f15
4.4.0-rc.6  quay.io/openshift-release-dev/ocp-release@sha256:2532227a868fca11a0cb7563232a26ab9a682d8ee1bb72fd416c4e7789d7ce11
4.4.0-rc.7  quay.io/openshift-release-dev/ocp-release@sha256:df3b7a74c8590a932c00fd9b1ef6c1fb2a0bfd1c3643b78ae378cadee3258c03
4.4.0-rc.8  quay.io/openshift-release-dev/ocp-release@sha256:1d1254b27532ceefabef4b94d46a65baa4876de47e09f2f7c26c138691413889
4.4.0-rc.9  quay.io/openshift-release-dev/ocp-release@sha256:f3342423306f95a524357dd71d832dec6274fb46d560696d9df0a3af40dd7820
4.4.0-rc.10 quay.io/openshift-release-dev/ocp-release@sha256:565b5ddcfebaeb83489570c28bdbc1b47a11f2b26a29b6b8f453d6fc10f068e9
4.4.0-rc.11 quay.io/openshift-release-dev/ocp-release@sha256:6f09986c2c878f9675afcf9ee5d4720cf8ec0b9b832ab9400dd8df98cd2d6f07

Tried upgrading to 4.3.13 without --force; it gets stuck.
$ oc adm upgrade --to=4.3.13
Updating to 4.3.13

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.9     True        True          3m58s   Unable to apply 4.3.13: it may not be safe to apply this update

Clear the upgrade field, roll back to 4.3.9,
$ oc adm upgrade --clear=true
Cleared the update field, still at 4.3.13

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.9     True        False         48s     Cluster version is 4.3.9
 
Then upgrade with --force,
$ oc adm upgrade --to=4.3.13 --force=true
Updating to 4.3.13

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.13    True        False         23s     Cluster version is 4.3.13


4. Update to 4.3.17 without --force //as expected.
$ oc adm upgrade --to=4.3.17
Updating to 4.3.17

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.17    True        False         13s     Cluster version is 4.3.17

$ oc get co kube-apiserver -o json | jq -r '.status.conditions[] | select(.type == "Upgradeable")'
{
  "lastTransitionTime": "2020-04-26T10:03:31Z",
  "reason": "AsExpected",
  "status": "True",
  "type": "Upgradeable"
}

Add user to change default SCC again,
$  oc patch scc privileged --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user2"}]'
securitycontextconstraints.security.openshift.io/privileged patched

$ oc patch scc anyuid --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user2"}]'
securitycontextconstraints.security.openshift.io/anyuid patched

$ oc get co kube-apiserver -o json | jq -r '.status.conditions[] | select(.type == "Upgradeable")'
{
  "lastTransitionTime": "2020-04-26T10:03:31Z",
  "reason": "AsExpected",
  "status": "True",
  "type": "Upgradeable"
}

We can see that 4.3.17 removed the stale condition DefaultSecurityContextConstraints_Mutated, as expected.


5. Update to 4.4 without --force (this should work)  // not as expected.
$  oc adm upgrade
Cluster version is 4.3.17

No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and result in downtime or data loss.


There is no available upgrade path from 4.3.17, so we have to do the following,
$  oc patch clusterversion/version --patch '{"spec":{"upstream":"https://openshift-release.svc.ci.openshift.org/graph"}}' --type=merge
clusterversion.config.openshift.io/version patched

$ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/channel", "value": "stable-4.4"}]'
clusterversion.config.openshift.io/version patched

After that, there are some available nightly builds for upgrade,
$  oc adm upgrade
Cluster version is 4.3.17

Updates:

VERSION                           IMAGE
...
4.4.0-0.nightly-2020-04-26-070343 registry.svc.ci.openshift.org/ocp/release@sha256:61064e1a780a55b20bafc89e4936c317d4ac4c6fee8759b05ea0e408d3b0e7af

Tried to upgrade without --force; it does not work as expected,
$  oc adm upgrade --to=4.4.0-0.nightly-2020-04-26-070343
Updating to 4.4.0-0.nightly-2020-04-26-070343

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.17    True        True          2m6s    Unable to apply 4.4.0-0.nightly-2020-04-26-070343: the image may not be safe to use

Have to roll back to 4.3.17,
$ oc adm upgrade --clear=true
Cleared the update field, still at 4.4.0-0.nightly-2020-04-26-070343

$  oc adm upgrade --to=4.4.0-0.nightly-2020-04-26-070343 --force=true
Updating to 4.4.0-0.nightly-2020-04-26-070343

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-04-26-070343   True        False         6m51s   Cluster version is 4.4.0-0.nightly-2020-04-26-070343

Anyway, 4.3.17 works as expected. I suppose the problem with the upgrade path to 4.4 is a build release problem.

Comment 14 W. Trevor King 2020-04-26 16:12:54 UTC
> 5. Update to 4.4 without --force (this should work)  // not as expected.
> $  oc adm upgrade
> Cluster version is 4.3.17
>
> No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and result in downtime or data loss.

Because there are not yet any 4.4 releases that include 4.3.17 as an update source.  So this is working as expected.

> $ oc get clusterversion
> NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
> version   4.3.17    True        True          2m6s    Unable to apply 4.4.0-0.nightly-2020-04-26-070343: the image may not be safe to use

Because 4.3.17 only trusts official RH keys, and nightlies are not signed by those keys (or at all).  This would have worked if you'd used '--allow-explicit-upgrade --to-image $BY_DIGEST_PULLSPEC' to go to a 4.4 nightly.
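
A small sketch of the explicit-upgrade invocation described above. explicit_upgrade_cmd is a hypothetical helper that only prints the command after checking that the pullspec is by digest; the example pullspec is the nightly listed in comment 13.

```shell
# Hypothetical helper: print (do not run) the explicit-upgrade command for a
# by-digest release pullspec, and reject anything that is not by digest.
explicit_upgrade_cmd() {
  case "$1" in
    *@sha256:*)
      printf 'oc adm upgrade --allow-explicit-upgrade --to-image %s\n' "$1"
      ;;
    *)
      echo "expected a by-digest pullspec (name@sha256:digest)" >&2
      return 1
      ;;
  esac
}

explicit_upgrade_cmd registry.svc.ci.openshift.org/ocp/release@sha256:61064e1a780a55b20bafc89e4936c317d4ac4c6fee8759b05ea0e408d3b0e7af
```

Printing rather than running keeps the sketch safe to try; the printed command can be reviewed before it is executed against a cluster.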

Comment 15 Ke Wang 2020-04-27 01:29:45 UTC
Thanks for the explanation, W. Trevor; it cleared up my confusion.

Comment 17 errata-xmlrpc 2020-04-30 01:28:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1529

