Bug 2004632
Summary: | When LE takes a large amount of time, multiple whereabouts are seen
---|---
Product: | OpenShift Container Platform
Component: | Networking
Networking sub component: | multus
Reporter: | Martin Kennelly <mkennell>
Assignee: | Douglas Smith <dosmith>
QA Contact: | Weibin Liang <weliang>
Status: | CLOSED ERRATA
Severity: | high
Priority: | urgent
CC: | bbennett
Version: | 4.10
Target Release: | 4.10.0
Hardware: | All
OS: | All
Type: | Bug
Last Closed: | 2022-03-10 16:10:53 UTC
Bug Blocks: | 2009493
Description
Martin Kennelly
2021-09-15 17:04:06 UTC
I think I see the issue in the code, in pkg/storage/kubernetes/ipam.go:

    go func() {
        defer wg.Done()
        ctx, cancel := context.WithCancel(context.Background())
        res := make(chan error)

        go func() {
            logging.Debugf("Started leader election")
            le.Run(ctx)
            logging.Debugf("Finished leader election")
            res <- nil
        }()

Leader election (LE) never ends and needs a timeout. The context should be created with context.WithTimeout() rather than context.WithCancel().

Here is a possible implementation of a fix: https://github.com/k8snetworkplumbingwg/whereabouts/pull/142

Following the steps in the description, all the pods get unique IP addresses from two WB instances. Tested in 4.10.0-0.nightly-2021-10-15-025303:

    [weliang@weliang whereabouts-stopwatch]$ oc get pods | grep test | awk '{print $1}' | xargs -I {} oc exec -t {} -- ip a | grep "inet 10.10" | awk '{print $2}' | sort | uniq | wc -l
    406
    [weliang@weliang whereabouts-stopwatch]$ oc get pod | grep Running | wc -l
    406

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056