Bug 1894918 - [4.5] Panic output due to timeouts in openshift-apiserver
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: openshift-apiserver
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Target Release: 4.5.z
Assignee: Lukasz Szaszkiewicz
QA Contact: Xingxing Xia
URL:
Whiteboard: LifecycleReset
Depends On: 1894916
Blocks:
Reported: 2020-11-05 12:34 UTC by Simon Pasquier
Modified: 2021-03-11 06:55 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1894916
Environment: Undiagnosed panic detected in pod
Last Closed: 2021-03-11 06:55:27 UTC
Target Upstream Version:


Links
System ID Private Priority Status Summary Last Updated
Github openshift oauth-server pull 65 0 None closed Bug 1894918: Panic output due to timeouts in oauth-apiserver 2021-02-10 07:56:00 UTC
Github openshift origin pull 25751 0 None closed Bug 1894918: Panic output due to timeouts in kube-apiserver 2021-02-10 07:55:59 UTC
Red Hat Product Errata RHBA-2021:0714 0 None None None 2021-03-11 06:55:35 UTC

Description Simon Pasquier 2020-11-05 12:34:17 UTC
+++ This bug was initially created as a clone of Bug #1894916 +++

+++ This bug was initially created as a clone of Bug #1885644 +++

test:
Undiagnosed panic detected in pod 

is failing frequently in CI, see search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=Undiagnosed+panic+detected+in+pod


Several OVN upgrade jobs from 4.5 to 4.6 failed.

One of the failing jobs: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-ovn-upgrade-4.5-stable-to-4.6-ci/1313309253259235328

Error message:
pods/openshift-apiserver_apiserver-7d87777d99-lx9f4_openshift-apiserver.log.gz:E1006 04:09:10.949495       1 runtime.go:78] Observed a panic: &errors.errorString{s:"killing connection/stream because serving request timed out and response had been started"} (killing connection/stream because serving request timed out and response had been started)
pods/openshift-apiserver_apiserver-7d87777d99-lx9f4_openshift-apiserver.log.gz:E1006 04:09:11.910858       1 runtime.go:78] Observed a panic: &errors.errorString{s:"killing connection/stream because serving request timed out and response had been started"} (killing connection/stream because serving request timed out and response had been started)

--- Additional comment from Stefan Schimanski on 2020-10-07 15:09:02 UTC ---

This is not a classical panic. We have a fix in kube-apiserver to make it prettier.
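
To illustrate why this is not a classical panic: the logged panic value is an error carrying the timeout message, which suggests the panic is raised deliberately to abort a request that has timed out after the response was started, rather than to signal a crash. The Go sketch below shows one way a recovery point could tell the two cases apart. It is an illustration under assumptions, not the kube-apiserver or openshift-apiserver code: `errTimeoutAbort` and `classify` are hypothetical names, and only the message text is taken from the log above.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeoutAbort is a hypothetical sentinel; only its message text
// comes from the log lines quoted in this bug.
var errTimeoutAbort = errors.New("killing connection/stream because serving request timed out and response had been started")

// classify runs fn, recovers any panic, and returns a short description
// of what happened. A panic carrying the timeout sentinel is reported as
// a timeout; anything else is reported as a genuine panic.
func classify(fn func()) (result string) {
	defer func() {
		r := recover()
		if r == nil {
			return // fn returned normally, keep the "ok" result
		}
		if err, ok := r.(error); ok && errors.Is(err, errTimeoutAbort) {
			result = "timeout: " + err.Error()
			return
		}
		result = fmt.Sprintf("panic: %v", r)
	}()
	fn()
	return "ok"
}

func main() {
	fmt.Println(classify(func() {}))
	fmt.Println(classify(func() { panic(errTimeoutAbort) }))
	fmt.Println(classify(func() { panic("boom") }))
}
```

With a distinction like this, a timed-out request could be logged as a plain message without the "Observed a panic" wording that trips the CI check.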

--- Additional comment from Lukasz Szaszkiewicz on 2020-10-23 08:08:57 UTC ---

I have opened a WIP PR, https://github.com/kubernetes/kubernetes/pull/95002, but haven't had time to finish it.

--- Additional comment from Simon Pasquier on 2020-11-05 12:33:17 UTC ---

I've cloned this bug so it gets linked to failing tests in Sippy.

Comment 1 Lukasz Szaszkiewicz 2020-11-16 09:44:11 UTC
The upstream PR merged last week. I'm in the process of backporting it to the earlier version.

Comment 2 Michal Fojtik 2020-12-16 10:10:09 UTC
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.

Comment 3 Simon Pasquier 2020-12-16 10:24:27 UTC
This bug is for 4.5 and it depends on bug 1894916 (4.6) which is still in POST state.

Comment 4 Michal Fojtik 2020-12-16 11:10:23 UTC
The LifecycleStale keyword was removed because the bug got commented on recently.
The bug assignee was notified.

Comment 5 Lukasz Szaszkiewicz 2021-01-15 10:19:36 UTC
The PRs have already been reviewed and tagged. They will be merged after https://bugzilla.redhat.com/show_bug.cgi?id=1894916 is verified.

Comment 6 Lukasz Szaszkiewicz 2021-02-05 13:35:03 UTC
Waiting for https://github.com/openshift/openshift-apiserver/pull/163 to be merged.

Comment 8 Xingxing Xia 2021-02-23 10:42:44 UTC
Verified in 4.5.0-0.nightly-2021-02-22-141205 using the steps from bug 1885644#c8; got the same results.

Comment 11 errata-xmlrpc 2021-03-11 06:55:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5.34 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0714

