Bug 1601813

Summary: [3.9] Could not find an allocated subnet for node
Product: OpenShift Container Platform Reporter: Weihua Meng <wmeng>
Component: Node Assignee: Avesh Agarwal <avagarwa>
Status: CLOSED ERRATA QA Contact: Weihua Meng <wmeng>
Severity: high Docs Contact:
Priority: high    
Version: 3.9.0 CC: akostadi, aos-bugs, decarr, dma, ghuang, hongli, jchaloup, jialiu, jokerman, mifiedle, mmccomas, nraghava, shlao, sjenning, weshi, wmeng
Target Milestone: --- Keywords: OpsBlocker, Regression
Target Release: 3.9.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
The recently added cloudResourceSyncManager continuously fetches node addresses from the cloud provider, and kubelet obtains node addresses from it. At node registration or kubelet start, kubelet fetches node addresses from cloudResourceSyncManager in a blocking loop. The problem was that cloudResourceSyncManager had not yet been started when kubelet first fetched node addresses from it, so kubelet got stuck in the blocking loop and never returned. This caused node failures at the network level, and no node could be registered. Because kubelet blocked this early, cloudResourceSyncManager also never got a chance to start. Solution: cloudResourceSyncManager is now started early in the kubelet startup process, so kubelet no longer blocks on it and cloudResourceSyncManager is always started.
Story Points: ---
Clone Of: 1601378 Environment:
Last Closed: 2018-08-09 22:13:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Comment 1 Derek Carr 2018-07-19 18:23:32 UTC
3.9 PR merged:
https://github.com/openshift/ose/pull/1363

Comment 3 Weihua Meng 2018-07-26 06:34:00 UTC
Fixed.

openshift v3.9.38
kubernetes v1.9.1+a0ce1bc657

Kernel Version: 3.10.0-862.9.1.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.5 (Maipo)

no build attached to errata

change status to modified.

Comment 4 Weihua Meng 2018-08-01 09:23:10 UTC
fixed.

atomic-openshift-3.9.40-1.git.0.0c9824a.el7.x86_64

openshift v3.9.40

Kernel Version: 3.10.0-862.9.1.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.5 (Maipo)

Comment 6 errata-xmlrpc 2018-08-09 22:13:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2335