Bug 1884192 - kuryr-controller stuck if connection to K8s API dies silently
Summary: kuryr-controller stuck if connection to K8s API dies silently
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.5.z
Assignee: Luis Tomas Bolivar
QA Contact: GenadiC
URL:
Whiteboard:
Depends On: 1884139
Blocks:
Reported: 2020-10-01 10:03 UTC by Luis Tomas Bolivar
Modified: 2020-10-19 14:55 UTC (History)
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1884139
Environment:
Last Closed: 2020-10-19 14:55:01 UTC
Target Upstream Version:
Embargoed:


Attachments
Kuryr controller log (9.71 KB, text/plain)
2020-10-14 10:33 UTC, Itzik Brown


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift kuryr-kubernetes pull 360 | 0 | None | closed | Bug 1884192: Set read timeout for any request in K8sClient | 2020-10-28 19:16:40 UTC
Red Hat Product Errata | RHBA-2020:4228 | 0 | None | None | None | 2020-10-19 14:55:17 UTC

Description Luis Tomas Bolivar 2020-10-01 10:03:09 UTC
+++ This bug was initially created as a clone of Bug #1884139 +++

Kuryr components often contact the K8s API through a load balancer (e.g. Octavia LB in DevStack deployments, HAProxy in OpenShift), and we've often seen these load balancers drop connections silently, leaving our requests hanging forever. This was fixed in `K8sClient.watch` by setting a read timeout there, which helped a lot, but we now see the same hang with other requests that don't have a read timeout set.
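
To illustrate the failure mode, here is a minimal sketch using only the Python standard library. It is not the actual kuryr-kubernetes patch: `start_silent_server` and `get_with_timeout` are hypothetical names, and the silent server only stands in for a load balancer that accepts a connection and then never answers. Without a timeout the client would block indefinitely in `getresponse()`; with one, the read fails after a bounded wait and the caller can retry.

```python
import http.client
import socket
import threading


def start_silent_server():
    """Listen on an ephemeral local port, accept connections, never reply.

    Hypothetical stand-in for a load balancer that drops traffic silently.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    held = []  # keep accepted sockets open so the client never sees a close

    def accept_loop():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                return  # listening socket was closed
            held.append(conn)

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv, srv.getsockname()[1]


def get_with_timeout(host, port, path, timeout):
    """Issue a GET; return the status code, or None if the socket timed out."""
    # The timeout applies to every blocking socket operation on this
    # connection, including the read that waits for the response.
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    except socket.timeout:
        return None
    finally:
        conn.close()
```

Calling `get_with_timeout("127.0.0.1", port, "/api", 0.5)` against the silent server returns `None` after roughly half a second instead of hanging. If the client is built on the `requests` library (an assumption here, the report does not say), the equivalent is its `timeout` argument, which accepts a `(connect, read)` tuple.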

Comment 3 Itzik Brown 2020-10-14 10:31:17 UTC
4.5.0-0.nightly-2020-10-10-030038
RHOS-16.1-RHEL-8-20201005.n.0
Attached the log from kuryr controller. Verified it times out.

Comment 4 Itzik Brown 2020-10-14 10:33:00 UTC
Created attachment 1721444
Kuryr controller log

Comment 6 errata-xmlrpc 2020-10-19 14:55:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5.15 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4228

