Bug 1949348 - On GCP, load balancers report kube-apiserver fails its /readyz check 50% of the time, causing load balancer backend churn and disruptions to apiservers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-apiserver
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.7.z
Assignee: Antonio Ojea
QA Contact: Ke Wang
URL:
Whiteboard: tag-ci
Duplicates: 1966595
Depends On: 1925698
Blocks: 1930457
Reported: 2021-04-14 04:09 UTC by OpenShift BugZilla Robot
Modified: 2021-06-29 04:20 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Google Cloud Loadbalancer healthcheckers leave stale conntrack entries on the hosts.
Consequence: Stale conntrack entries cause network interruptions to the apiserver traffic using the GCP loadbalancers.
Fix: Don't allow healthcheck traffic to loop through the host.
Result: No network disruption against the apiserver.
Clone Of:
Environment:
Last Closed: 2021-06-29 04:19:47 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 2526 | 0 | None | open | [release-4.7] Bug 1949348: not allow healthcheck traffic to loop through the node | 2021-05-25 13:02:14 UTC
Github | openshift machine-config-operator pull 2593 | 0 | None | open | [release-4.7] Bug 1966595: gcp-routes should wait until network is stopped | 2021-06-03 22:45:13 UTC
Red Hat Bugzilla | 1949348 | 1 | unspecified | CLOSED | On GCP, load balancers report kube-apiserver fails its /readyz check 50% of the time, causing load balancer backend chur... | 2021-06-29 04:20:13 UTC
Red Hat Product Errata | RHBA-2021:2502 | 0 | None | None | None | 2021-06-29 04:20:13 UTC

Internal Links: 1949348

Comment 9 Antonio Ojea 2021-06-03 22:46:39 UTC
*** Bug 1966595 has been marked as a duplicate of this bug. ***

Comment 13 Ke Wang 2021-06-04 09:16:15 UTC
This bug's PR is dev-approved but not yet merged, so I followed issue DPTP-660 to do pre-merge verification (the QE pre-merge verification goal of issue OCPQE-815), using the bot to launch a cluster with the open PR. For the verification steps, see Comment #6 and Comment #7. The bug is therefore pre-merge verified. After the PR is merged, the bot will move the bug to VERIFIED automatically; if that does not work, I will move it manually.

Comment 14 Siddharth Sharma 2021-06-04 18:38:47 UTC
This bug will be shipped as part of the next z-stream release, 4.7.15, on June 14th, as 4.7.14 was dropped due to a regression: https://bugzilla.redhat.com/show_bug.cgi?id=1967614

Comment 18 Ke Wang 2021-06-15 06:40:42 UTC
The PR has landed in the 4.7.0-0.nightly-2021-06-12-151209 nightly release, and the bug was verified pre-merge (see Comment #6 and Comment #7), but the bot apparently did not move it to VERIFIED. Hence I am manually moving it to the appropriate state.

Comment 19 OpenShift Automated Release Tooling 2021-06-17 12:29:08 UTC
OpenShift engineering has decided not to ship Red Hat OpenShift Container Platform 4.7.17 due to a regression: https://bugzilla.redhat.com/show_bug.cgi?id=1973006. All the fixes that were part of 4.7.17 will now be part of 4.7.18 and are planned to be available in the candidate channel on June 23, 2021 and in the fast channel on June 28th.

Comment 23 errata-xmlrpc 2021-06-29 04:19:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.7.18 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2502

