Bug 1973525 - machine-config-operator: remove runlevel from kni-infra namespace [NEEDINFO]
Summary: machine-config-operator: remove runlevel from kni-infra namespace
Keywords:
Status: VERIFIED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.8
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.9.0
Assignee: Ben Nemec
QA Contact: Rio Liu
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-06-18 05:33 UTC by Mark Cooper
Modified: 2021-09-07 09:06 UTC (History)
3 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
mcooper: needinfo? (shardy)




Links
System ID Private Priority Status Summary Last Updated
Github openshift machine-config-operator pull 2627 0 None open Bug 1973525: [on-prem] Drop runlevel from infra namespaces 2021-06-18 20:55:43 UTC

Description Mark Cooper 2021-06-18 05:33:28 UTC
Description of problem:

In the machine-config-operator it is setting a runlevel for the openshift-kni-infra:
```
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-kni-infra
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    openshift.io/node-selector: ""
    workload.openshift.io/allowed: "management"
  labels:
    name: openshift-kni-infra
    openshift.io/run-level: "1"
```

https://github.com/openshift/machine-config-operator/blob/3f6db243f2d6651720daa658e8edc830f47dd184/install/0000_80_machine-config-operator_00_namespace.yaml#L40

We're looking to get this removed: with this run-level set, no SCC is applied to any pod within that namespace. After talking to the kni-deployment team it seems that the runlevel is not necessary, so it would be beneficial to remove it.
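The proposed fix would amount to dropping just the run-level label from the namespace manifest, leaving something like this (a sketch of the expected result, not the exact PR content):
```
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-kni-infra
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    openshift.io/node-selector: ""
    workload.openshift.io/allowed: "management"
  labels:
    name: openshift-kni-infra
    # openshift.io/run-level: "1"  <- removed, so SCC admission applies again
```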


Steps to Reproduce:
1. launch a cluster on baremetal


Actual results:
# oc get ns openshift-kni-infra -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    ...
  labels:
    kubernetes.io/metadata.name: openshift-kni-infra
    name: openshift-kni-infra
    openshift.io/run-level: "1"

# oc get pod coredns-dhcp-55-209.lab.eng.tlv2.redhat.com -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/config.hash: 72a238ca06e504b512ee3b6c41a3c657
    kubernetes.io/config.mirror: 72a238ca06e504b512ee3b6c41a3c657
    kubernetes.io/config.seen: "2021-06-18T04:51:12.558621639Z"
    kubernetes.io/config.source: file


Expected results:

# oc get ns openshift-kni-infra -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    ...
  labels:
    kubernetes.io/metadata.name: openshift-kni-infra
    name: openshift-kni-infra

# oc get pod coredns-cnfdt19.lab.eng.tlv2.redhat.com -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/config.hash: 3d16798cfd20239af2e4113b2e76f7fd
    kubernetes.io/config.mirror: 3d16798cfd20239af2e4113b2e76f7fd
    kubernetes.io/config.seen: "2021-06-16T06:25:51.658009420Z"
    kubernetes.io/config.source: file
    openshift.io/scc: privileged


Additional info:

As the change goes under the machine-config-operator but affects the kni-infra namespace, I'm not sure where the best spot for this is. Obviously the privileged SCC is not ideal, but it's a first step and still better than no SCC at all. Might this belong under OpenStack on OpenShift?

Also not sure what testing is involved here; some basic testing suggests that the cluster starts OK without the runlevel and seems to function. Whether it functions exactly as required or intended, I don't know. If anything is using a PV or an inline host mount, then the pods in that namespace will need to indicate the user they're operating as.
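If that turns out to be needed, it would mean the affected pods declaring an explicit securityContext; a hedged sketch (the pod name, UID, image, and paths here are all illustrative, not from the actual manifests):
```
apiVersion: v1
kind: Pod
metadata:
  name: example-host-mount        # illustrative name
  namespace: openshift-kni-infra
spec:
  securityContext:
    runAsUser: 1000               # pod declares the UID it operates as
    runAsNonRoot: true
  containers:
  - name: example
    image: example.invalid/image:latest   # placeholder image
    volumeMounts:
    - name: host-data
      mountPath: /data
  volumes:
  - name: host-data
    hostPath:
      path: /var/lib/example      # illustrative host mount
```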

Comment 1 Mark Cooper 2021-06-18 05:34:17 UTC
FYI @shardy@redhat.com @bnemec@redhat.com

Comment 2 Mark Cooper 2021-06-18 05:51:09 UTC
Also found this https://bugzilla.redhat.com/show_bug.cgi?id=1753067 

It seems this was originally introduced to stop OOM errors from occurring, which it would, because the runlevel removes all limits. But if the underlying issue was just that the namespace didn't exist yet, then removing the runlevel should be OK too.

Comment 4 Ben Nemec 2021-06-18 20:55:16 UTC
I've pushed a patch to remove the runlevel from our namespaces. In my local testing it worked fine without.

Comment 5 Mark Cooper 2021-08-06 04:46:01 UTC
Only thing we may need to double-check: while some pods work without issue, if the SAs for each namespace don't have the required permissions (i.e. privileged), then the pods will fail to admit.

KNI looks happy, and based off the GitHub PR the tests for vSphere, OpenStack, and oVirt are all looking happy too. Although to me, it's not immediately obvious what components we expect to install into those related namespaces.
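If any of those namespaces does turn out to need the privileged SCC once the run-level is gone, access can be granted explicitly via RBAC rather than via the run-level. A sketch (the role name is illustrative, and "default" stands in for whichever SA actually runs the pods):
```
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: use-privileged-scc        # illustrative name
  namespace: openshift-kni-infra
rules:
- apiGroups: ["security.openshift.io"]
  resources: ["securitycontextconstraints"]
  resourceNames: ["privileged"]
  verbs: ["use"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: use-privileged-scc
  namespace: openshift-kni-infra
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: use-privileged-scc
subjects:
- kind: ServiceAccount
  name: default                   # whichever SA the pods use
  namespace: openshift-kni-infra
```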

