Bug 1550266
Summary: | SDN fails to clear NodeNetworkUnavailable node condition on GCP | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Ravi Sankar <rpenta> |
Component: | Networking | Assignee: | Ravi Sankar <rpenta> |
Status: | CLOSED ERRATA | QA Contact: | Meng Bo <bmeng> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 3.10.0 | CC: | aos-bugs, bbennett, hongli |
Target Milestone: | --- | Keywords: | NeedsTestCase |
Target Release: | 3.10.0 | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause: On GCP, the SDN master could sometimes fail to clear the NodeNetworkUnavailable node condition.
Consequence: The node cannot take pod traffic for an extended period, until the NodeNetworkUnavailable condition is removed.
Fix: Fixed the bug in clearing the NodeNetworkUnavailable condition.
Result: The node should be able to handle pod traffic as expected.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2018-07-30 19:10:04 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Ravi Sankar
2018-02-28 21:33:06 UTC
Commit pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/308bb2e8f4f0a198e92993f9ec7a8f5d8ca7e349

Merge pull request #18758 from pravisankar/fix-clear-nodenetwork

Automatic merge from submit-queue.

Bug 1550266 - Fix clearInitialNodeNetworkUnavailableCondition() in sdn master

This change fixes these 2 issues:
- Currently, clearing the NodeNetworkUnavailable node condition only works if we are successful in updating the node status during the first iteration. Subsequent retries will not work because:
  1. knode != node
  2. node.Status is updated in memory
  3. UpdateNodeStatus(knode)
  Step (3) will have no effect, because in step (2) node.Status is updated but not knode.Status.
- The Node object passed to this method is a pointer to an item in the informer cache, and it should not be modified directly.

Avoid NodeNetworkUnavailable condition check for every node status update
- We know that kubelet sets the NodeNetworkUnavailable condition when the node is created/registered with the api server.
- So we only need to call clearInitialNodeNetworkUnavailableCondition() for the first time and not during subsequent node status update events.

No issue found during regression test on GCP with v3.10.0-0.54.0.
OS: Red Hat Enterprise Linux Server release 7.5 (Maipo)
kernel: Linux qe-310-crio-master-etcd-1 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816
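
The retry problem described in the commit message can be illustrated with a small standalone Go sketch. This is not the code from openshift/origin: the Node and Condition types, the fakeAPI server, and the clearNetworkUnavailable function are all hypothetical stand-ins written for illustration, with no Kubernetes client dependency. It assumes the general shape of the fix is (a) work on a copy rather than the informer cache entry, and (b) on every retry, modify and send the same freshly fetched object.

```go
package main

import (
	"errors"
	"fmt"
)

// Condition and Node are simplified, hypothetical stand-ins for the
// Kubernetes node types; they are not the real API structs.
type Condition struct {
	Type   string
	Status string // "True" or "False"
}

type Node struct {
	Name            string
	ResourceVersion int
	Conditions      []Condition
}

// deepCopy returns a copy that shares no slice storage with n.
func (n *Node) deepCopy() *Node {
	c := *n
	c.Conditions = append([]Condition(nil), n.Conditions...)
	return &c
}

var errConflict = errors.New("conflict: stale resource version")

// fakeAPI is a hypothetical in-memory API server that rejects status
// updates carrying a stale resource version.
type fakeAPI struct{ stored *Node }

func (a *fakeAPI) Get(name string) (*Node, error) { return a.stored.deepCopy(), nil }

func (a *fakeAPI) UpdateStatus(n *Node) error {
	if n.ResourceVersion != a.stored.ResourceVersion {
		return errConflict
	}
	a.stored = n.deepCopy()
	a.stored.ResourceVersion++
	return nil
}

const networkUnavailable = "NetworkUnavailable"

// clearNetworkUnavailable sets the NetworkUnavailable condition to "False".
// The cached node is never mutated, and on each retry the object that is
// modified is the same object that is sent to the server.
func clearNetworkUnavailable(api *fakeAPI, cached *Node, maxRetries int) error {
	knode := cached.deepCopy() // never touch the informer cache entry
	for i := 0; i < maxRetries; i++ {
		if i > 0 {
			fresh, err := api.Get(cached.Name) // refetch after a conflict
			if err != nil {
				return err
			}
			knode = fresh
		}
		changed := false
		for j := range knode.Conditions {
			c := &knode.Conditions[j]
			if c.Type == networkUnavailable && c.Status == "True" {
				c.Status = "False"
				changed = true
			}
		}
		if !changed {
			return nil // nothing to clear
		}
		err := api.UpdateStatus(knode)
		if err == nil {
			return nil
		}
		if !errors.Is(err, errConflict) {
			return err
		}
	}
	return fmt.Errorf("could not clear %s on node %s after %d attempts",
		networkUnavailable, cached.Name, maxRetries)
}

func main() {
	cond := []Condition{{Type: networkUnavailable, Status: "True"}}
	api := &fakeAPI{stored: &Node{Name: "node-1", ResourceVersion: 2, Conditions: cond}}
	// The cached copy is stale, so the first update attempt conflicts.
	cached := &Node{Name: "node-1", ResourceVersion: 1,
		Conditions: []Condition{{Type: networkUnavailable, Status: "True"}}}

	err := clearNetworkUnavailable(api, cached, 3)
	fmt.Println("error:", err)
	fmt.Println("server status:", api.stored.Conditions[0].Status)
	fmt.Println("cached status:", cached.Conditions[0].Status)
}
```

In this sketch the first attempt fails with a conflict because the cached resource version is stale; the retry refetches the node, edits that refetched object, and submits it, so the update takes effect while the cached object stays unmodified.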