Bug 1959294 - openshift-operator-lifecycle-manager:olm-operator-serviceaccount should not rely on external networking for health check
Summary: openshift-operator-lifecycle-manager:olm-operator-serviceaccount should not rely on external networking for health check
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: tflannag
QA Contact: Jian Zhang
URL:
Whiteboard:
Depends On: 1959285 1959290 1959291 1959292 1959293
Blocks:
 
Reported: 2021-05-11 08:24 UTC by Rom Freiman
Modified: 2021-07-27 23:08 UTC
10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1959293
Environment:
Last Closed: 2021-07-27 23:07:53 UTC
Target Upstream Version:
Embargoed:




Links
Github openshift/operator-framework-olm pull 77 (open): Bug 1953977: Add tolerant delegating auth config for PackageServer (last updated 2021-05-12 17:13:43 UTC)
Red Hat Product Errata RHSA-2021:2438 (last updated 2021-07-27 23:08:11 UTC)

Description Rom Freiman 2021-05-11 08:24:14 UTC
+++ This bug was initially created as a clone of Bug #1959293 +++

+++ This bug was initially created as a clone of Bug #1959292 +++

+++ This bug was initially created as a clone of Bug #1959291 +++

+++ This bug was initially created as a clone of Bug #1959290 +++

+++ This bug was initially created as a clone of Bug #1959285 +++

Apparently, openshift-operator-lifecycle-manager:olm-operator-serviceaccount depends on SubjectAccessReview (SAR) calls as part of its health check, which causes the olm-operator pod to be restarted during a kube-apiserver rollout in single-node OpenShift (SNO).
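
To see why the health check depends on the apiserver, it helps to inspect the probe configuration of the olm-operator deployment. A minimal sketch, assuming the deployment is named olm-operator in the openshift-operator-lifecycle-manager namespace:

oc -n openshift-operator-lifecycle-manager get deployment olm-operator \
  -o jsonpath='{.spec.template.spec.containers[0].livenessProbe}{"\n"}{.spec.template.spec.containers[0].readinessProbe}{"\n"}'

If the probed endpoint performs a delegated SubjectAccessReview against kube-apiserver, any brief apiserver outage during a rollout makes the probe fail and the kubelet restarts the pod.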


How reproducible:

Using cluster-bot:
1. launch nightly aws,single-node
2. Update audit log verbosity to: AllRequestBodies
3. Wait for api rollout (oc get kubeapiserver -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{"\n"}{.message}{"\n"}')
4. Reboot the node to clean up the caches (oc debug node/ip-10-0-136-254.ec2.internal)
5. Wait
6. Grep the audit log: 

oc adm node-logs ip-10-0-128-254.ec2.internal --path=kube-apiserver/audit.log | grep -i health | grep -i subjectaccessreviews | grep -v Unhealth > rbac.log
cat rbac.log  | jq . -C | less -r | grep 'username' | sort | uniq
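
To attribute the SubjectAccessReview traffic more precisely, the same audit stream can be reduced to the verb and request URI for that service account. A sketch, assuming the standard Kubernetes audit event fields (user.username, verb, requestURI) and the node name from step 6:

oc adm node-logs ip-10-0-128-254.ec2.internal --path=kube-apiserver/audit.log \
  | jq -c 'select(.user.username == "system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount")
           | {verb, requestURI, userAgent}'

Requests to /apis/authorization.k8s.io/v1/subjectaccessreviews that line up with probe intervals tie the restarts to the health check rather than to regular reconcile traffic.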



Actual results:
~/work/installer [master]> cat rbac.log  | jq . -C | less -r | grep 'username' | sort | uniq
    "username": "system:serviceaccount:oopenshift-operator-lifecycle-manager:olm-operator-serviceaccount"

Expected results:
The olm-operator service account should not appear in the health-check-related SubjectAccessReview audit entries.

Additional info:
This affects SNO stability during kube-apiserver rollouts (certificate rotation).
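
The impact can be confirmed from restart counts and probe-failure events around a kube-apiserver rollout. A sketch, assuming the olm-operator pods carry the app=olm-operator label:

oc -n openshift-operator-lifecycle-manager get pods -l app=olm-operator
oc -n openshift-operator-lifecycle-manager get events --field-selector reason=Unhealthy

A rising RESTARTS count on the olm-operator pod together with Unhealthy probe events during the rollout window matches the behavior described above.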

Comment 9 errata-xmlrpc 2021-07-27 23:07:53 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

