Bug 2111979

Summary: openshift-controller-manager-operator NS runlevel needs to be set to emptystring
Product: OpenShift Container Platform Reporter: Ben Parees <bparees>
Component: openshift-controller-manager Assignee: jawed <jkhelil>
openshift-controller-manager sub component: controller-manager QA Contact: Jitendar Singh <jitsingh>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: high CC: cdaley, jitsingh, jkhelil, wking
Version: 4.12 Keywords: Reopened
Target Milestone: ---   
Target Release: 4.12.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2023-01-17 19:53:39 UTC Type: Bug

Description Ben Parees 2022-07-28 14:38:52 UTC
Description of problem:
openshift-controller-manager-operator NS runlevel needs to be set to empty string.


For the full history of this issue, see:
https://docs.google.com/document/d/16DrsqtrtZUxtl4H0wiyxtvduIWqJZkFgC4Jrc-80Pss/edit#

Currently the NS label is not set at all:
https://github.com/openshift/cluster-openshift-controller-manager-operator/blob/dd6088c019de81574cdedb4973c420ce8d7d2b7a/manifests/00_namespace.yaml#L10

but historically it was set to "0":
https://github.com/openshift/cluster-openshift-controller-manager-operator/blob/d58076f2f1a9839a5887ddd7aeb4ad163b4ada8b/manifests/00_namespace.yaml#L7

Because the CVO doesn't remove labels that have been deleted from a manifest, clusters that started at an older version like 4.3.18 will still have this label present on the NS, which results in different SCC admission behavior.

To ensure we are consistent in our configurations, the NS label should instead be explicitly set to "". That way the CVO will apply the setting and overwrite the runlevel 0 setting.  See https://github.com/openshift/machine-api-operator/pull/1031/files#diff-9dfafd72c269693396fc68a6b83e130c3606cb7fb2d0f7bc17c3f9c7ac159d7fR13 for an example.
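
To illustrate why omitting the label is not enough, here is a minimal Python sketch of the behavior described above. It is a simplified model, not the CVO's actual code: `apply_manifest_labels` is a hypothetical helper that assumes manifest labels are merged over the live ones and that keys missing from the manifest are never deleted.

```python
RUN_LEVEL = "openshift.io/run-level"


def apply_manifest_labels(live_labels, manifest_labels):
    """Hypothetical model: overlay manifest labels on the live labels.

    Keys that are absent from the manifest are left untouched, which is
    the "CVO doesn't remove deleted labels" behavior from this report.
    """
    merged = dict(live_labels)
    merged.update(manifest_labels)
    return merged


# Namespace on a cluster installed from an old release that set run-level "0".
old_cluster = {RUN_LEVEL: "0"}

# Manifest that simply omits the label: the stale "0" survives.
label_omitted = apply_manifest_labels(old_cluster, {})

# Manifest that explicitly sets the label to "": the stale value is overwritten.
label_emptied = apply_manifest_labels(old_cluster, {RUN_LEVEL: ""})

print(label_omitted)
print(label_emptied)
```

Under this model, only the manifest that carries the explicit empty value brings an upgraded cluster in line with a freshly installed one.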


To verify the fix, we need to start from a 4.3.18 cluster and upgrade it to 4.12 successfully.  If the 4.12 upgrade is successful (openshift controller manager+operator don't fail due to SCC admission or CRIO container start issues), then the fix can be considered verified.


Version-Release number of selected component (if applicable):
4.12

How reproducible:
always

Steps to Reproduce:
1. Install a cluster from version 4.3.18
2. Upgrade the cluster all the way to 4.12
3a. Prior to this change, the cluster will successfully upgrade but the NS will still be labeled as openshift.io/run-level: "0" 
3b. After this change, the cluster will successfully upgrade but the NS will be labeled as openshift.io/run-level: ""
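
The check in steps 3a/3b could be scripted along these lines. This is only a sketch: `run_level_state` is a hypothetical helper, and it assumes the namespace JSON was obtained separately, for example with `oc get namespace openshift-controller-manager-operator -o json`. The sample documents below are hand-written stand-ins, not output captured from a real cluster.

```python
import json

RUN_LEVEL = "openshift.io/run-level"


def run_level_state(namespace_json):
    """Classify the run-level label of a namespace given its JSON document.

    Returns "absent" if the label is missing, "empty" if it is set to "",
    and "set:<value>" for any other value.
    """
    ns = json.loads(namespace_json)
    labels = ns.get("metadata", {}).get("labels") or {}
    if RUN_LEVEL not in labels:
        return "absent"
    if labels[RUN_LEVEL] == "":
        return "empty"
    return "set:" + labels[RUN_LEVEL]


def make_sample(labels):
    """Build a minimal, hand-written namespace document for illustration."""
    return json.dumps({
        "metadata": {
            "name": "openshift-controller-manager-operator",
            "labels": labels,
        }
    })


before_fix = run_level_state(make_sample({RUN_LEVEL: "0"}))  # step 3a
after_fix = run_level_state(make_sample({RUN_LEVEL: ""}))    # step 3b

print(before_fix)
print(after_fix)
```

Distinguishing "absent" from "empty" matters here, because a fresh install without the label and an upgraded cluster with the explicit empty value are the two states this bug aims to treat as equivalent.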
  

Actual results:
openshift-controller-manager-operator NS runlevel label is set to 0

Expected results:
runlevel label is set to empty string

Comment 2 jawed 2022-11-04 07:00:26 UTC
@cdaley this was fixed here https://github.com/openshift/cluster-openshift-controller-manager-operator/pull/248

The PR is not actually referencing the right BZ; it is referencing https://bugzilla.redhat.com/show_bug.cgi?id=2110629
Initially it was about removing the runlevel label, but the right fix was to set it to nil, which the PR is actually doing.

I will close the current one as a duplicate of 2110629 if you don't mind.

Comment 3 jawed 2022-11-04 07:02:06 UTC

*** This bug has been marked as a duplicate of bug 2110629 ***

Comment 4 Ben Parees 2022-11-04 14:09:15 UTC
Please read the bug description again.  There are two different NSes in question here, the other bug fixed a different NS.

You still need to fix this one:
https://github.com/openshift/cluster-openshift-controller-manager-operator/blob/master/manifests/00_namespace.yaml

because once upon a time it set this:
https://github.com/openshift/cluster-openshift-controller-manager-operator/blob/d58076f2f1a9839a5887ddd7aeb4ad163b4ada8b/manifests/00_namespace.yaml#L7

which means any clusters that installed that older version and have since upgraded will still have that label set.

Comment 5 jawed 2022-11-04 15:21:07 UTC
thank you @bparees for spotting this
fixing this

Comment 10 errata-xmlrpc 2023-01-17 19:53:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399

Comment 11 Red Hat Bugzilla 2023-09-18 04:43:14 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days