Bug 1973525

Summary: machine-config-operator: remove runlevel from kni-infra namespace
Product: OpenShift Container Platform Reporter: Mark Cooper <mcooper>
Component: Machine Config Operator    Assignee: MCO Team <team-mco>
Sub Component: Machine Config Operator    QA Contact: Rio Liu <rioliu>
Status: CLOSED ERRATA Docs Contact:
Severity: low    
Priority: low CC: aos-bugs, bnemec, mkrejci, shardy, vlaad, ykashtan
Version: 4.8    Keywords: Triaged
Target Milestone: ---   
Target Release: 4.9.0   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:    Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-10-29 15:18:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Mark Cooper 2021-06-18 05:33:28 UTC
Description of problem:

The machine-config-operator sets a run-level on the openshift-kni-infra namespace:
```
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-kni-infra
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    openshift.io/node-selector: ""
    workload.openshift.io/allowed: "management"
  labels:
    name: openshift-kni-infra
    openshift.io/run-level: "1"
```

https://github.com/openshift/machine-config-operator/blob/3f6db243f2d6651720daa658e8edc830f47dd184/install/0000_80_machine-config-operator_00_namespace.yaml#L40

We're looking to get this removed: setting this run-level means no SCC is applied to any pod within that namespace. After talking to the kni-deployment team it seems the run-level is not necessary, so it would be beneficial to remove it.
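
A quick way to observe the effect (a sketch; output will vary per cluster): pods admitted while the run-level label is set carry no openshift.io/scc annotation, so listing the annotation per pod should come back blank:
```
# List each pod alongside its SCC annotation; while run-level "1" is set,
# the annotation column is expected to be empty.
oc get pods -n openshift-kni-infra \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.openshift\.io/scc}{"\n"}{end}'
```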


Steps to Reproduce:
1. launch a cluster on baremetal


Actual results:
# oc get ns openshift-kni-infra -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    ...
  labels:
    kubernetes.io/metadata.name: openshift-kni-infra
    name: openshift-kni-infra
    openshift.io/run-level: "1"

# oc get pod coredns-dhcp-55-209.lab.eng.tlv2.redhat.com -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/config.hash: 72a238ca06e504b512ee3b6c41a3c657
    kubernetes.io/config.mirror: 72a238ca06e504b512ee3b6c41a3c657
    kubernetes.io/config.seen: "2021-06-18T04:51:12.558621639Z"
    kubernetes.io/config.source: file


Expected results:

# oc get ns openshift-kni-infra -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    ...
  labels:
    kubernetes.io/metadata.name: openshift-kni-infra
    name: openshift-kni-infra

# oc get pod coredns-cnfdt19.lab.eng.tlv2.redhat.com -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/config.hash: 3d16798cfd20239af2e4113b2e76f7fd
    kubernetes.io/config.mirror: 3d16798cfd20239af2e4113b2e76f7fd
    kubernetes.io/config.seen: "2021-06-16T06:25:51.658009420Z"
    kubernetes.io/config.source: file
    openshift.io/scc: privileged


Additional info:

As the change goes under the Machine Config Operator but affects the kni-infra namespace, I'm not sure where the best spot for this really is. Obviously the privileged SCC is not ideal, but it's a first step and still better than no SCC. It might belong under OpenStack on OpenShift?

I'm also not sure what testing is involved here; some basic testing suggests that everything starts OK without the run-level and appears to function, though whether it functions as required or intended I can't say. If anything uses a PV or an inline host mount, the pods in that namespace will need to indicate the user they operate as.
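
For illustration, once SCC admission applies, a pod with a host mount might need to declare its user explicitly via a securityContext; a minimal hypothetical sketch (pod name, image, and UID are made up for the example):
```
apiVersion: v1
kind: Pod
metadata:
  name: example-host-mount        # hypothetical pod, for illustration only
  namespace: openshift-kni-infra
spec:
  containers:
  - name: example
    image: registry.example.com/example:latest   # placeholder image
    securityContext:
      runAsUser: 0                # illustrative UID; states the user explicitly
    volumeMounts:
    - name: host-etc
      mountPath: /host/etc
  volumes:
  - name: host-etc
    hostPath:
      path: /etc
```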

Comment 1 Mark Cooper 2021-06-18 05:34:17 UTC
FYI @shardy @bnemec

Comment 2 Mark Cooper 2021-06-18 05:51:09 UTC
Also found this https://bugzilla.redhat.com/show_bug.cgi?id=1753067 

It seems this was originally introduced to stop OOM errors from occurring, which it would, since the run-level removes all limits. But if the underlying issue was just that the namespace didn't exist, then removing the run-level should be OK too.

Comment 4 Ben Nemec 2021-06-18 20:55:16 UTC
I've pushed a patch to remove the runlevel from our namespaces. In my local testing it worked fine without it.
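
The change amounts to dropping the label from the namespace manifest; an illustrative diff (the actual patch may differ and touch more namespaces):
```
   labels:
     name: openshift-kni-infra
-    openshift.io/run-level: "1"
```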

Comment 5 Mark Cooper 2021-08-06 04:46:01 UTC
The only thing we may need to double-check: while some pods work without issue, if the service accounts for each namespace don't have the required permissions (i.e. privileged), the pods will fail admission.

KNI looks happy, and based on the GitHub PR the tests for vSphere, OpenStack, and oVirt are all looking happy too. Although, to me, it's not immediately obvious which components we expect to be installed into those related namespaces.
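
Should a service account turn out to need elevated permissions once SCCs apply, the usual remedy is to bind it to an SCC explicitly; a sketch (the service account name is a placeholder):
```
# Grant the privileged SCC to a namespace service account (placeholder name).
oc adm policy add-scc-to-user privileged -z <service-account> -n openshift-kni-infra
```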

Comment 9 errata-xmlrpc 2021-11-01 01:28:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

Comment 10 Red Hat Bugzilla 2023-09-15 01:10:05 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.