Bug 1806915

Summary: openshift-service-ca: Some core components are in openshift.io/run-level 1 and are bypassing SCC, but should not be
Product: OpenShift Container Platform
Reporter: Stefan Schimanski <sttts>
Component: apiserver-auth
Assignee: Standa Laznicka <slaznick>
Status: CLOSED ERRATA
QA Contact: scheng
Severity: medium
Priority: medium
Version: 4.4
CC: aos-bugs, ccoleman, eparis, jialiu, jokerman, mfojtik, nhale, nstielau, sfowler, wsun, xiyuan, xtian, xxia
Target Milestone: ---
Keywords: Reopened
Target Release: 4.7.0
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Doc Text:
Cause: The namespace openshift-service-ca was labelled with "openshift.io/run-level: 1".
Consequence: The pods inside this namespace would run with extra privileges.
Fix: Since the label is no longer necessary to avoid a circular dependency between components, it was removed.
Result: The service-ca pods had their privileges scoped down.
Clone Of: 1805488
Last Closed: 2021-02-24 15:10:53 UTC
Bug Blocks: 1805488, 1966621
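
For illustration, the run-level label described in the Doc Text sits on the namespace object itself. A minimal sketch of the pre-fix namespace metadata (abridged, not the full manifest as shipped):

```yaml
# Sketch of the openshift-service-ca namespace before the fix.
# The run-level label below is what caused SCC admission to be
# bypassed for pods in this namespace; the fix removes it.
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-service-ca
  labels:
    openshift.io/run-level: "1"   # removed by the fix
```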

Comment 1 Standa Laznicka 2020-03-06 09:01:42 UTC
When trying to fix this issue for service-ca operator and controller, some dependency loops were identified that prevent the removal of the run-level label from the operator's and operand's namespaces.

Originally, it was observed that the DNS operator has a compulsory mount of the serving certificate provided by the service-ca controller. This prevented etcd from running, which in turn caused failures of the kube-apiserver deployment after bootstrap; that, in turn, caused the cluster-policy-controller (which is not part of the bootstrap control plane) to fail to connect to the API (it connects to localhost and thus will not use the bootstrap-control-plane kube-apiserver). This was fixed by removing the etcd dependency on DNS in https://github.com/openshift/cluster-etcd-operator/pull/233.

The cluster-policy-controller is unfortunately still dependent on the openshift-apiserver, which provides the rangeallocations.security.openshift.io resources needed by the namespace-security-allocation-controller (part of cluster-policy-controller). Without the namespace-security-allocation-controller running and annotating the namespaces with the annotations needed for SCC admission, the service-ca operator and controller cannot run with any SCC other than privileged, which would be a poor fix for the issue. Note that the openshift-apiserver cannot run without service-ca already running, which creates yet another dependency loop.
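
For context, the annotations the namespace-security-allocation-controller writes are the `openshift.io/sa.scc.*` keys that SCC admission reads when assigning a non-privileged SCC. A hedged sketch of an annotated namespace (the values shown are illustrative, not taken from any particular cluster):

```yaml
# Annotations written by the namespace-security-allocation-controller;
# SCC admission needs these to assign UID/MCS ranges for non-privileged
# SCCs such as "restricted". Values are illustrative examples.
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-service-ca
  annotations:
    openshift.io/sa.scc.uid-range: 1000280000/10000
    openshift.io/sa.scc.supplemental-groups: 1000280000/10000
    openshift.io/sa.scc.mcs: s0:c16,c10
```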

A solution to the problem would be to move the rangeallocations.security.openshift.io resource to a CRD so that the controller can work even before openshift-apiserver starts, allowing any payload to use a proper SCC. I don't think a move to CRDs would be wise at this point in the 4.4 development cycle.
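
To make the proposal concrete: serving rangeallocations.security.openshift.io as a CRD would mean the kube-apiserver handles the resource directly, with no dependency on openshift-apiserver. A hypothetical sketch of such a CRD (field names follow the existing security.openshift.io/v1 RangeAllocation type; the schema here is abridged and not an actual proposed manifest):

```yaml
# Hypothetical CRD sketch for moving RangeAllocation out of
# openshift-apiserver. The "range" and "data" fields mirror the
# existing RangeAllocation type.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: rangeallocations.security.openshift.io
spec:
  group: security.openshift.io
  scope: Cluster
  names:
    kind: RangeAllocation
    plural: rangeallocations
    singular: rangeallocation
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            range:
              type: string   # e.g. "1000000000-1999999999/10000"
            data:
              type: string
              format: byte   # allocation bitmap
```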

Comment 3 Stefan Schimanski 2020-03-12 15:30:12 UTC
Reopened and moved to 4.5.

Comment 4 Stefan Schimanski 2020-03-12 15:30:31 UTC
Reopened and moved to 4.5.

Comment 5 Standa Laznicka 2020-05-19 15:13:46 UTC
No progress on this in 4.5 (mirroring changes to the operator bug: https://bugzilla.redhat.com/show_bug.cgi?id=1806917#c3)

Comment 19 errata-xmlrpc 2021-02-24 15:10:53 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633