Bug 1894918

Summary: [4.5] Panic output due to timeouts in openshift-apiserver
Product: OpenShift Container Platform
Reporter: Simon Pasquier <spasquie>
Component: openshift-apiserver
Assignee: Lukasz Szaszkiewicz <lszaszki>
Status: CLOSED ERRATA
QA Contact: Xingxing Xia <xxia>
Severity: medium
Docs Contact:
Priority: low
Version: 4.5
CC: aos-bugs, lszaszki, mfojtik, skumari, slaznick, sttts, xxia
Target Milestone: ---
Target Release: 4.5.z
Hardware: Unspecified
OS: Unspecified
Whiteboard: LifecycleReset
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1894916
Environment: Undiagnosed panic detected in pod
Last Closed: 2021-03-11 06:55:27 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1894916
Bug Blocks:

Description Simon Pasquier 2020-11-05 12:34:17 UTC
+++ This bug was initially created as a clone of Bug #1894916 +++

+++ This bug was initially created as a clone of Bug #1885644 +++

test:
Undiagnosed panic detected in pod 

is failing frequently in CI, see search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=Undiagnosed+panic+detected+in+pod


Several OVN upgrade jobs from 4.5 to 4.6 failed.

One of the failing jobs: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-ovn-upgrade-4.5-stable-to-4.6-ci/1313309253259235328

Error message:
pods/openshift-apiserver_apiserver-7d87777d99-lx9f4_openshift-apiserver.log.gz:E1006 04:09:10.949495       1 runtime.go:78] Observed a panic: &errors.errorString{s:"killing connection/stream because serving request timed out and response had been started"} (killing connection/stream because serving request timed out and response had been started)
pods/openshift-apiserver_apiserver-7d87777d99-lx9f4_openshift-apiserver.log.gz:E1006 04:09:11.910858       1 runtime.go:78] Observed a panic: &errors.errorString{s:"killing connection/stream because serving request timed out and response had been started"} (killing connection/stream because serving request timed out and response had been started)
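
For triage, it can help to separate these timeout-induced entries from other "Observed a panic" lines. The following is a minimal Python sketch, not the logic of the CI test itself: the helper name classify_panic and the regular expression are assumptions made for illustration, and the expected line layout (artifact path, klog-style header, source location, panic detail) is inferred only from the two log lines quoted above.

```python
import re

# Message copied verbatim from the log lines quoted in this report.
TIMEOUT_MSG = (
    "killing connection/stream because serving request timed out "
    "and response had been started"
)

# Assumed layout, inferred from the quoted lines only:
#   <artifact path>:<level><MMDD> <HH:MM:SS.micros> <pid> <file>:<line>] Observed a panic: <detail>
PANIC_RE = re.compile(
    r"^(?P<artifact>\S+?):(?P<level>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+\d+ "
    r"(?P<source>[\w.]+:\d+)\] Observed a panic: (?P<detail>.*)$"
)


def classify_panic(line):
    """Return a dict describing a panic log line, or None if the line does not match.

    Hypothetical helper: 'kind' is "timeout" when the detail contains the
    request-timeout message, and "other" for any other observed panic.
    """
    match = PANIC_RE.match(line.strip())
    if match is None:
        return None
    kind = "timeout" if TIMEOUT_MSG in match.group("detail") else "other"
    return {
        "artifact": match.group("artifact"),
        "time": match.group("time"),
        "source": match.group("source"),
        "kind": kind,
    }
```

Applied to either line above, the helper would report kind "timeout" and source "runtime.go:78"; lines that do not follow the assumed layout return None rather than being guessed at.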

--- Additional comment from Stefan Schimanski on 2020-10-07 15:09:02 UTC ---

This is not a classical panic. We have a fix in kube-apiserver to make it prettier.

--- Additional comment from Lukasz Szaszkiewicz on 2020-10-23 08:08:57 UTC ---

I have opened a WIP PR, https://github.com/kubernetes/kubernetes/pull/95002, but haven't had time to finish it.

--- Additional comment from Simon Pasquier on 2020-11-05 12:33:17 UTC ---

I've cloned this bug so it gets linked to failing tests in Sippy.

Comment 1 Lukasz Szaszkiewicz 2020-11-16 09:44:11 UTC
The upstream PR merged last week. I'm in the process of backporting it to the earlier version.

Comment 2 Michal Fojtik 2020-12-16 10:10:09 UTC
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.

Comment 3 Simon Pasquier 2020-12-16 10:24:27 UTC
This bug is for 4.5 and depends on bug 1894916 (4.6), which is still in the POST state.

Comment 4 Michal Fojtik 2020-12-16 11:10:23 UTC
The LifecycleStale keyword was removed because the bug got commented on recently.
The bug assignee was notified.

Comment 5 Lukasz Szaszkiewicz 2021-01-15 10:19:36 UTC
The PRs have already been reviewed and tagged. They will be merged after https://bugzilla.redhat.com/show_bug.cgi?id=1894916 is verified.

Comment 6 Lukasz Szaszkiewicz 2021-02-05 13:35:03 UTC
Waiting for https://github.com/openshift/openshift-apiserver/pull/163 to be merged.

Comment 8 Xingxing Xia 2021-02-23 10:42:44 UTC
Verified in 4.5.0-0.nightly-2021-02-22-141205 using the steps from bug 1885644#c8; got the same results.

Comment 11 errata-xmlrpc 2021-03-11 06:55:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5.34 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0714