Bug 1809296

Summary: operator panic during gcp install
Product: OpenShift Container Platform Reporter: Luke Meyer <lmeyer>
Component: kube-scheduler Assignee: Maciej Szulik <maszulik>
Status: CLOSED ERRATA QA Contact: RamaKasturi <knarra>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.3.0 CC: aos-bugs, kewang, lmohanty, mfojtik, pcameron, scuppett, sttts, wking, xxia
Target Milestone: --- Keywords: Reopened, Upgrades
Target Release: 4.3.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1774212 Environment:
Last Closed: 2020-03-10 23:54:20 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1774212, 1803739, 1803742    
Bug Blocks: 1781286    

Comment 6 Ke Wang 2020-03-05 03:24:05 UTC
Verified with OCP build: 4.3.0-0.nightly-2020-03-03-144847

1. Checked whether the bug-related PR has already been merged into the current release.
$ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.3.0-0.nightly-2020-03-03-144847 | grep 'kube-scheduler-operator'
  cluster-kube-scheduler-operator               https://github.com/openshift/cluster-kube-scheduler-operator               f50817dfacbd8a90ffa9a5ba53cfdbf27bcc243a
  
$ git checkout -b 4.3.0-0.nightly-2020-03-03-144847 f50817df
  
$ git log --pretty="%h %an %cd - %s" f50817df | grep bump-library-go-43
f50817df OpenShift Merge Robot Thu Feb 20 02:52:39 2020 +0100 - Merge pull request #210 from mfojtik/bump-library-go-43

The PR is already included in this build.
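
The lookup in step 1 could be scripted roughly as follows. This is only a sketch: `component_commit` is a hypothetical helper name, not part of `oc`, and it assumes the three-column row layout (component, repository URL, commit) shown in the output above. Unlike a plain `grep`, it matches the component name exactly.

```shell
# Hypothetical helper (not part of oc): print the commit hash listed for one
# component in `oc adm release info --commits` output read from stdin.
component_commit() {
  # Assumed row layout: <component> <repository URL> <commit>
  # awk ignores the leading indentation and splits on runs of whitespace.
  awk -v name="$1" '$1 == name { print $3 }'
}

# Example usage (needs oc and pull access to the release image):
#   oc adm release info --commits "$RELEASE_IMAGE" \
#     | component_commit cluster-kube-scheduler-operator
```

The printed hash can then be passed to `git checkout` or `git log` as in the commands above.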

2. Installed the OCP build on GCP and checked whether a panic appears in the kube-apiserver-operator log.

Following https://bugzilla.redhat.com/show_bug.cgi?id=1774212#c0, ran these checks:
$ oc logs -n openshift-kube-apiserver-operator kube-apiserver-operator-6b98fdd948-l9qhh > kube-apiserver-operator.log

$ grep -n -E 'Starting BackingResourceController|Starting UnsupportedConfigOverridesController' kube-apiserver-operator.log
28:I0304 11:30:47.647446       1 backing_resource_controller.go:138] Starting BackingResourceController
29:I0304 11:30:47.647465       1 unsupportedconfigoverrides_controller.go:151] Starting UnsupportedConfigOverridesController

$ grep -n 'panic' ./kube-apiserver-operator.log

No panic was found.

Comment 7 Ke Wang 2020-03-05 09:57:09 UTC
Checked the openshift-kube-scheduler-operator logs as well; the results are as expected:

$ oc logs -n openshift-kube-scheduler-operator openshift-kube-scheduler-operator-6796b47dbd-lkz5p > openshift-kube-scheduler-operator.log

$ grep -n -E 'Starting BackingResourceController|Starting UnsupportedConfigOverridesController'  ./openshift-kube-scheduler-operator.log
21:I0305 09:01:17.639833       1 backing_resource_controller.go:138] Starting BackingResourceController
22:I0305 09:01:17.639874       1 unsupportedconfigoverrides_controller.go:151] Starting UnsupportedConfigOverridesController

$ grep -n 'panic' ./openshift-kube-scheduler-operator.log

No panic was found.
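
The same log check was run by hand against both operators in comments 6 and 7. A minimal sketch of how it could be wrapped for reuse is below. `check_operator_log` is a hypothetical helper name, and the pass criteria (both controllers logged as started, no line containing "panic") are taken from the grep commands above rather than from any official test.

```shell
# Hypothetical helper: succeed only if the saved operator log shows both
# controllers starting and contains no line mentioning "panic".
check_operator_log() {
  log="$1"
  # Both start-up messages must be present (also fails if the file is missing).
  grep -q 'Starting BackingResourceController' "$log" || return 1
  grep -q 'Starting UnsupportedConfigOverridesController' "$log" || return 1
  # Any line mentioning "panic" counts as a failure.
  if grep -q 'panic' "$log"; then
    return 1
  fi
  return 0
}

# Example usage, after saving the log with `oc logs ... > operator.log`:
#   check_operator_log operator.log && echo "no panic, controllers started"
```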

Comment 9 errata-xmlrpc 2020-03-10 23:54:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0676

Comment 10 W. Trevor King 2020-03-20 22:07:32 UTC
Tracking down when the fix went out:

$ for X in 2 3 5 7; do echo -n "4.3.${X} "; oc adm release info --commits "quay.io/openshift-release-dev/ocp-release:4.3.${X}-x86_64" | grep cluster-kube-scheduler-operator; done
4.3.2   cluster-kube-scheduler-operator               https://github.com/openshift/cluster-kube-scheduler-operator               da32b6a109a7729678c889935ff310809263077a
4.3.3   cluster-kube-scheduler-operator               https://github.com/openshift/cluster-kube-scheduler-operator               da32b6a109a7729678c889935ff310809263077a
4.3.5   cluster-kube-scheduler-operator               https://github.com/openshift/cluster-kube-scheduler-operator               f50817dfacbd8a90ffa9a5ba53cfdbf27bcc243a
4.3.7   cluster-kube-scheduler-operator               https://github.com/openshift/cluster-kube-scheduler-operator               ffb17cb15fd1833f6118ddb3835972ca3b443f22

So the fix [1] went out in 4.3.5.

[1]: https://github.com/openshift/cluster-kube-scheduler-operator/pull/210#event-3054076693
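
The comparison above was done by eye. A rough sketch of automating it is below, assuming input rows shaped like the loop output (version first, commit last). `first_release_with_commit` is a hypothetical helper name. It only matches the merge commit itself, so a later release built from a newer commit (such as 4.3.7 here) would need an ancestry check in a clone of the repository instead, for example with `git merge-base --is-ancestor`.

```shell
# Hypothetical helper: read rows of the form
#   <version> <component> <repository URL> <commit>
# from stdin and print the first version whose commit starts with the
# given hash prefix.
first_release_with_commit() {
  # index() == 1 means the last field begins with the wanted prefix.
  awk -v want="$1" 'index($NF, want) == 1 { print $1; exit }'
}

# Example usage: pipe the output of the for-loop above into
#   first_release_with_commit f50817df
```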

Comment 11 W. Trevor King 2021-04-05 17:46:44 UTC
Removing UpgradeBlocker from this older bug, to take it out of the suspect queue described in [1].  If you feel this bug still needs to be a suspect, please add the keyword again.

[1]: https://github.com/openshift/enhancements/pull/475