Bug 1827336

Summary: Remove stale condition DefaultSecurityContextConstraints_Mutated
Product: OpenShift Container Platform
Reporter: Abu Kashem <akashem>
Component: kube-apiserver
Assignee: Abu Kashem <akashem>
Status: CLOSED ERRATA
QA Contact: Xingxing Xia <xxia>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 4.3.z
CC: aos-bugs, mfojtik, wking, xxia, zyu
Target Milestone: ---   
Target Release: 4.4.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1827335
: 1827337
Environment:
Last Closed: 2020-05-04 11:50:03 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1827335    
Bug Blocks: 1827337    

Description Abu Kashem 2020-04-23 16:51:31 UTC
+++ This bug was initially created as a clone of Bug #1827335 +++

Description of problem:
In OpenShift 4.3.14 we have reverted DefaultSecurityContextConstraints_Mutated. We removed the controller that sets Upgradeable to False if any default SCC has been mutated.

However, on an affected cluster (pre-4.3.14) whose default SCCs have already been modified by a user, the stale condition is not removed after the upgrade.



Version-Release number of selected component (if applicable):
OpenShift 4.3.14

How reproducible:
Always

Steps to Reproduce:
1. Install OCP v4.3.13.
2. Trigger Upgradeable=False by mutating the default SCCs:
$ oc patch scc privileged --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user"}]'
$ oc patch scc anyuid --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user"}]'
    
# ./oc get scc privileged -o json|jq .users
[
  "system:admin",
  "system:serviceaccount:openshift-infra:build-controller",
  "e2e-user"
]

3. Upgrade along the 4.3.13 -> 4.3.14 path.
$ oc adm upgrade --to=4.3.14
Updating to 4.3.14

$ oc get clusterversion version -o json|jq .status.conditions[-1]
{
  "lastTransitionTime": "2020-04-23T04:07:33Z",
  "message": "Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid privileged]",
  "reason": "DefaultSecurityContextConstraints_Mutated",
  "status": "False",
  "type": "Upgradeable"
}

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.14    True        False         34m     Cluster version is 4.3.14

Checking the modified default SCCs shows the changes are still there.

$ oc get scc privileged -o json | jq .users
[
  "system:admin",
  "system:serviceaccount:openshift-infra:build-controller",
  "e2e-user"
]

$ oc get scc anyuid -o json | jq .users
[
  "e2e-user"
]

Actual results:
$ oc get clusterversion version -o json|jq .status.conditions[-1]
{
  "lastTransitionTime": "2020-04-23T04:07:33Z",
  "message": "Cluster operator kube-apiserver cannot be upgraded: DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid privileged]",
  "reason": "DefaultSecurityContextConstraints_Mutated",
  "status": "False",
  "type": "Upgradeable"
}

Expected results:
"Upgradeable" condition of clusterversion/version should not have DefaultSecurityContextConstraints_Mutated.

Additional info:
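
A rough sketch of how the check described above could be scripted. The function name `has_stale_condition` is hypothetical and not part of `oc`. It does a plain text match on whatever JSON is piped in rather than a real JSON parse, so treat it as a quick smoke test under that assumption, not a definitive check.

```shell
# Hypothetical helper (not part of oc): reads JSON text on stdin and succeeds
# if any condition still carries the reverted reason string.
# This is a plain text match, not a JSON parse.
has_stale_condition() {
  grep -q '"reason":[[:space:]]*"DefaultSecurityContextConstraints_Mutated"'
}

# Possible usage against a live cluster (commented out so the snippet has no side effects):
# if oc get clusterversion version -o json | has_stale_condition; then
#   echo "stale condition still present"
# fi
```

The same function should also work on the output of `oc get co kube-apiserver -o json`, since the reason string is identical there.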

Comment 1 Abu Kashem 2020-04-23 19:10:46 UTC
Hi xxia,
One way to test this in 4.4 is to follow the steps below:

1. install ocp v4.3.13 
2. trigger upgradeable=false by mutating default scc 

(This ensures that the DefaultSecurityContextConstraints_Mutated condition is set.)



3. force upgrade to 4.4

(Here, the expectation is that the stale condition is removed as soon as the upgraded kube-apiserver operator starts up.)
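
The expectation in step 3 could be checked with a small polling loop, sketched below. `wait_until` and `upgradeable_is_true` are hypothetical names introduced for illustration, and the timeout and interval values in the usage comment are arbitrary. The `oc` and `jq` invocation mirrors the commands used elsewhere in this bug.

```shell
# Hypothetical helper: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds; gives up once TIMEOUT_SECONDS have elapsed.
wait_until() {
  wu_timeout=$1
  wu_interval=$2
  shift 2
  wu_elapsed=0
  while ! "$@"; do
    if [ "$wu_elapsed" -ge "$wu_timeout" ]; then
      return 1
    fi
    sleep "$wu_interval"
    wu_elapsed=$((wu_elapsed + wu_interval))
  done
  return 0
}

# Hypothetical check: succeeds when the kube-apiserver operator reports
# Upgradeable=True. jq -e turns the boolean result into an exit status.
upgradeable_is_true() {
  oc get co kube-apiserver -o json |
    jq -e '.status.conditions[] | select(.type == "Upgradeable") | .status == "True"' >/dev/null
}

# Possible usage (values are arbitrary):
# wait_until 600 10 upgradeable_is_true
```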

Comment 2 Scott Dodson 2020-04-24 00:18:08 UTC
Per https://coreos.slack.com/archives/CB48XQ4KZ/p1587681707027700?thread_ts=1587666197.493100&cid=CB48XQ4KZ this is not merging until next week and is not a 4.4.0 blocker. Moving target release.

Comment 6 Xingxing Xia 2020-04-26 11:02:16 UTC
Installed fresh 4.3.13 env. Then:
$ oc patch scc privileged --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user"}]'
$ oc patch scc anyuid --type json -p '[{"op": "add", "path": "/users/-", "value": "e2e-user"}]'
Check:
$ oc get co kube-apiserver -o json | jq -r '.status.conditions[] | select(.type == "Upgradeable")'
Saw Upgradeable is False:
{
  "lastTransitionTime": "2020-04-26T09:07:40Z",
  "message": "DefaultSecurityContextConstraintsUpgradeable: Default SecurityContextConstraints object(s) have mutated [anyuid privileged]",
  "reason": "DefaultSecurityContextConstraints_Mutated",
  "status": "False",
  "type": "Upgradeable"
}

Upgrade to 4.4:
$ oc adm upgrade --to-image registry.svc.ci.openshift.org/ocp/release:4.4.0-0.nightly-2020-04-26-045325 --allow-explicit-upgrade --force
After the upgrade completed, checked again; Upgradeable changed to True:
$ oc get co kube-apiserver -o json | jq -r '.status.conditions[] | select(.type == "Upgradeable")'
{
  "lastTransitionTime": "2020-04-26T09:30:49Z",
  "reason": "AsExpected",
  "status": "True",
  "type": "Upgradeable"
}

Double check:
$ oc get scc anyuid privileged -o json | jq '.items[].users'
[
  "e2e-user"
]
[
  "system:admin",
  "system:serviceaccount:openshift-infra:build-controller",
  "e2e-user"
]
The change is still there, not stomped like in bug 1823934. Checked basic cluster status (pods/nodes/co's); all are normal.
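
For repeat runs, the "not stomped" check above could be made mechanical by saving the SCC users before the upgrade and comparing afterwards. This is only a sketch: `snapshot_matches` and the file names are hypothetical, and it assumes both snapshots are produced by the same `oc ... | jq` command so that formatting differences do not show up as changes.

```shell
# Hypothetical helper: snapshot_matches BEFORE_FILE AFTER_FILE
# Prints "unchanged" and succeeds if the two files are byte-identical;
# otherwise prints "changed" plus a diff and returns 1.
snapshot_matches() {
  if cmp -s "$1" "$2"; then
    echo "unchanged"
  else
    echo "changed"
    diff "$1" "$2" || true
    return 1
  fi
}

# Possible usage (file names are arbitrary):
# oc get scc anyuid privileged -o json | jq '.items[].users' > users-before.json
# ... run the upgrade ...
# oc get scc anyuid privileged -o json | jq '.items[].users' > users-after.json
# snapshot_matches users-before.json users-after.json
```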

Comment 8 errata-xmlrpc 2020-05-04 11:50:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581