Description of problem:

The machine-config-operator sets a run-level on the openshift-kni-infra namespace:

```
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-kni-infra
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    openshift.io/node-selector: ""
    workload.openshift.io/allowed: "management"
  labels:
    name: openshift-kni-infra
    openshift.io/run-level: "1"
```

https://github.com/openshift/machine-config-operator/blob/3f6db243f2d6651720daa658e8edc830f47dd184/install/0000_80_machine-config-operator_00_namespace.yaml#L40

We're looking to get this removed, because with this run-level set no SCC is applied to any pod within that namespace. After talking to the kni-deployment team it seems that the run-level is not necessary, so it would be beneficial to remove it.

Steps to Reproduce:
1. Launch a cluster on baremetal.

Actual results:

```
# oc get ns openshift-kni-infra -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    ...
  labels:
    kubernetes.io/metadata.name: openshift-kni-infra
    name: openshift-kni-infra
    openshift.io/run-level: "1"

# oc get pod coredns-dhcp-55-209.lab.eng.tlv2.redhat.com -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/config.hash: 72a238ca06e504b512ee3b6c41a3c657
    kubernetes.io/config.mirror: 72a238ca06e504b512ee3b6c41a3c657
    kubernetes.io/config.seen: "2021-06-18T04:51:12.558621639Z"
    kubernetes.io/config.source: file
```

Expected results:

```
# oc get ns openshift-kni-infra -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    ...
  labels:
    kubernetes.io/metadata.name: openshift-kni-infra
    name: openshift-kni-infra

# oc get pod coredns-cnfdt19.lab.eng.tlv2.redhat.com -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/config.hash: 3d16798cfd20239af2e4113b2e76f7fd
    kubernetes.io/config.mirror: 3d16798cfd20239af2e4113b2e76f7fd
    kubernetes.io/config.seen: "2021-06-16T06:25:51.658009420Z"
    kubernetes.io/config.source: file
    openshift.io/scc: privileged
```

Additional info:

As the change goes under the machine-config-operator but affects the kni-infra namespace, it's not clear where the best spot for this really is. Obviously the privileged SCC is not ideal, but it's a first step and still better than no SCC at all. Might it belong under OpenStack on OpenShift? Also not sure what testing is involved here; some basic testing suggests that everything starts OK without the run-level and seems to function. Whether it functions as required or intended, no idea though. If anything uses a PV or an inline host mount, the pods in that namespace will need to indicate the user they're operating as.
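To illustrate why the label matters: as I understand it, the SCC admission plugin exempts namespaces whose run-level is at or below a threshold, so pods in them are admitted with no SCC at all. The sketch below is a rough illustration of that behavior, not the actual apiserver code; the threshold constant and function name are my own for the example.

```python
# Rough illustration (NOT actual OpenShift source) of why a low run-level
# label on a namespace bypasses SCC admission: the admission plugin skips
# namespaces whose openshift.io/run-level is at or below a threshold.

HIGHEST_EXEMPT_RUN_LEVEL = 1  # assumed threshold, for illustration only

def scc_admission_applies(namespace_labels: dict) -> bool:
    """Return True if SCC admission should run for pods in this namespace."""
    run_level = namespace_labels.get("openshift.io/run-level")
    if run_level is None:
        return True  # no run-level label: SCC admission applies normally
    try:
        return int(run_level) > HIGHEST_EXEMPT_RUN_LEVEL
    except ValueError:
        return True  # unparsable label: fail safe, apply SCC

# With the run-level label, pods in the namespace get no SCC:
print(scc_admission_applies({"openshift.io/run-level": "1"}))  # False
# Without it, pods are admitted under an SCC (e.g. openshift.io/scc: privileged):
print(scc_admission_applies({"name": "openshift-kni-infra"}))  # True
```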
FYI @shardy @bnemec
Also found this: https://bugzilla.redhat.com/show_bug.cgi?id=1753067 It seems the run-level was originally introduced to stop OOM errors from occurring, which it would, because all limits would be removed. But if the issue was just that the namespace didn't exist, then removing the run-level should be OK too.
I've pushed a patch to remove the run-level from our namespaces. In my local testing everything worked fine without it.
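For reference, with the label dropped, the namespace definition would look something like this (a sketch based on the manifest quoted in the description, not the exact patch):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-kni-infra
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
    openshift.io/node-selector: ""
    workload.openshift.io/allowed: "management"
  labels:
    name: openshift-kni-infra
    # openshift.io/run-level: "1" removed, so SCC admission applies to pods here
```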
One thing we may need to double-check: while some pods work without issue, if the SAs for each namespace don't have the required permissions (i.e. privileged), then the pods will fail to admit. KNI looks happy, and based on the GitHub PR the tests for vSphere, OpenStack, and oVirt all look happy too. Although, to me it's not immediately obvious what components we expect to install into those related namespaces.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days