Bug 1357474 - All nodes are unable to schedule pods because the NetworkUnavailable condition is True in a GCE environment
Summary: All nodes are unable to schedule pods because the NetworkUnavailable condition is True in a GCE environment
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 3.3.1
Assignee: Dan Williams
QA Contact: Meng Bo
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-07-18 10:05 UTC by Wenqi He
Modified: 2016-09-27 09:40 UTC
CC List: 14 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-27 09:40:27 UTC
Target Upstream Version:




Links:
Red Hat Product Errata RHBA-2016:1933 (normal, SHIPPED_LIVE): Red Hat OpenShift Container Platform 3.3 Release Advisory - last updated 2016-09-27 13:24:36 UTC
Origin (Github) 10545 - last updated 2016-08-22 14:05:26 UTC

Description Wenqi He 2016-07-18 10:05:28 UTC
Description of problem:
After installing OpenShift on GCE, pods cannot be scheduled; they stay pending with FailedScheduling events.

Version-Release number of selected component (if applicable):
openshift v3.3.0.6
kubernetes v1.3.0+57fb9ac
etcd 2.3.0+git


How reproducible:
Always

Steps to Reproduce:
1. Install OpenShift with Ansible on GCE with 2 nodes.
2. Check the registry and router pods as admin in the default project.

Actual results:
The registry and router pods stay in Pending status due to FailedScheduling.

oc get nodes
qe-wehe-master-1                 Ready,SchedulingDisabled   6h
qe-wehe-node-registry-router-1   Ready                      6h

oc get pods
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-6-deploy   0/1       Pending   0          6h
router-1-deploy            0/1       Pending   0          6h

oc describe pods 
  FirstSeen	LastSeen	Count	From			SubobjectPath	Type		Reason			Message
  ---------	--------	-----	----			-------------	--------	------			-------
  6h		19s		1365	{default-scheduler }			Warning		FailedScheduling	no nodes available to schedule pods


Expected results:
The pods can be scheduled and run on the nodes.


Additional info:
To check the node info:
journalctl -u atomic-openshift-master | grep qe-wehe-node-registry-router-1

Jul 18 04:45:17 qe-wehe-master-1 atomic-openshift-master[23171]: I0718 04:45:17.164555   23171 nodecontroller.go:821] Node qe-wehe-node-registry-router-1 ReadyCondition updated. Updating timestamp: {Capacity:map[alpha.kubernetes.io/nvidia-gpu:{i:{value:0 scale:0} d:{Dec:<nil>} s:0 Format:DecimalSI} cpu:{i:{value:1 scale:0} d:{Dec:<nil>} s:1 Format:DecimalSI} memory:{i:{value:3711508480 scale:0} d:{Dec:<nil>} s: Format:BinarySI} pods:{i:{value:110 scale:0} d:{Dec:<nil>} s:110 Format:DecimalSI}] Allocatable:map[alpha.kubernetes.io/nvidia-gpu:{i:{value:0 scale:0} d:{Dec:<nil>} s:0 Format:DecimalSI} cpu:{i:{value:1 scale:0} d:{Dec:<nil>} s:1 Format:DecimalSI} memory:{i:{value:3711508480 scale:0} d:{Dec:<nil>} s: Format:BinarySI} pods:{i:{value:110 scale:0} d:{Dec:<nil>} s:110 Format:DecimalSI}] Phase: Conditions:[{Type:NetworkUnavailable Status:True LastHeartbeatTime:{Time:0001-01-01 00:00:00 +0000 UTC} LastTransitionTime:{Time:2016-07-17 21:57:35 -0400 EDT} Reason:NoRouteCreated Message:Node created without a route} {Type:OutOfDisk Status:False LastHeartbeatTime:{Time:2016-07-18 04:45:04 -0400 EDT} LastTransitionTime:{Time:2016-07-17 21:57:36 -0400 EDT} Reason:KubeletHasSufficientDisk Message:kubelet has sufficient disk space available} {Type:MemoryPressure Status:False LastHeartbeatTime:{Time:2016-07-18 04:45:04 -0400 EDT} LastTransitionTime:{Time:2016-07-17 21:57:36 -0400 EDT} Reason:KubeletHasSufficientMemory Message:kubelet has sufficient memory available} {Type:Ready Status:True LastHeartbeatTime:{Time:2016-07-18 04:45:04 -0400 EDT} LastTransitionTime:{Time:2016-07-17 21:57:36 -0400 EDT} Reason:KubeletReady Message:kubelet is posting ready status}] Addresses:[{Type:InternalIP Address:10.240.0.4} {Type:ExternalIP Address:104.197.105.156}] DaemonEndpoints:{KubeletEndpoint:{Port:10250}} NodeInfo:{MachineID:4093bf66a4a4444886ac88feb9f56896 SystemUUID:452EC365-F247-2419-CF0D-E07E92D50793 BootID:29c8f130-5496-41f8-8cba-3649fa60fceb KernelVersion:3.10.0-327.el7.x86_64 OSImage:Red Hat Enterprise Linux Server 7.2 (Maipo) ContainerRuntimeVersion:docker://1.10.3 KubeletVersion:v1.3.0+5
Jul 18 04:45:19 qe-wehe-master-1 atomic-openshift-master[23171]: I0718 04:45:19.468769   23171 factory.go:448] Ignoring node qe-wehe-node-registry-router-1 with NetworkUnavailable condition status True
Jul 18 04:45:19 qe-wehe-master-1 atomic-openshift-master[23171]: I0718 04:45:19.468777   23171 listers.go:160] Node qe-wehe-node-registry-router-1 matches none of the conditions
Jul 18 04:45:20 qe-wehe-master-1 atomic-openshift-master[23171]: I0718 04:45:20.390718   23171 factory.go:448] Ignoring node qe-wehe-node-registry-router-1 with NetworkUnavailable condition status True
Jul 18 04:45:20 qe-wehe-master-1 atomic-openshift-master[23171]: I0718 04:45:20.390724   23171 listers.go:160] Node qe-wehe-node-registry-router-1 matches none of the conditions
Jul 18 04:45:21 qe-wehe-master-1 atomic-openshift-master[23171]: I0718 04:45:21.393346   23171 factory.go:448] Ignoring node qe-wehe-node-registry-router-1 with NetworkUnavailable condition status True
Jul 18 04:45:21 qe-wehe-master-1 atomic-openshift-master[23171]: I0718 04:45:21.393355   23171 listers.go:160] Node qe-wehe-node-registry-router-1 matches none of the conditions
Jul 18 04:45:23 qe-wehe-master-1 atomic-openshift-master[23171]: I0718 04:45:23.396600   23171 factory.go:448] Ignoring node qe-wehe-node-registry-router-1 with NetworkUnavailable condition status True
Jul 18 04:45:23 qe-wehe-master-1 atomic-openshift-master[23171]: I0718 04:45:23.396605   23171 listers.go:160] Node qe-wehe-node-registry-router-1 matches none of the conditions
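
The NetworkUnavailable condition can also be inspected directly on the node object (one possible check, assuming cluster-admin access):

oc get node qe-wehe-node-registry-router-1 -o yaml

and looking at the NetworkUnavailable entry under status.conditions.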

Comment 2 Ben Bennett 2016-07-20 14:20:58 UTC
This looks related to https://github.com/kubernetes/kubernetes/issues/26983

Comment 3 Dan Williams 2016-07-20 17:32:34 UTC
atomic-openshift 3.3.0.6 already includes https://github.com/kubernetes/kubernetes/pull/27525 which was the supposed fix for the GCE thing.  So either that fix is incomplete or this is a different problem.

Comment 4 Dan Williams 2016-07-20 17:47:46 UTC
Kube 26983 and 27525 actually only apply to non-GCE setups, so they don't appear to be relevant here.

Instead, this looks more like:

https://github.com/kubernetes/kubernetes/issues/27994
https://github.com/kubernetes/kubernetes/issues/27071

  - lastHeartbeatTime: null
    lastTransitionTime: 2016-07-19T05:30:08Z
    message: Node created without a route
    reason: NoRouteCreated
    status: "True"
    type: NetworkUnavailable

Wenqi He, are you sure the GCE routes to your nodes are created and correct?  The route controller should create them eventually, but it does so asynchronously.  So if no routes to the nodes have been created after 10 or 15 minutes or so, then perhaps your permissions are wrong and the routes cannot be created.

Can you look in your master-controller logs for "Could not create route" error messages?
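
For example (a rough check, not an exact recipe: the controllers may run in a separate atomic-openshift-master-controllers unit depending on the installation, and gcloud must be configured for the project):

gcloud compute routes list
journalctl -u atomic-openshift-master | grep -i "could not create route"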

Comment 6 Liang Xia 2016-07-28 08:24:07 UTC
Raising the priority since this is blocking testing on GCE.

Comment 10 Dan Williams 2016-08-03 21:49:16 UTC
Theory:

On startup for GCE, kubelet runs this code:

	// Initially, set NodeNetworkUnavailable to true.
	if kl.providerRequiresNetworkingConfiguration() {
		node.Status.Conditions = append(node.Status.Conditions, api.NodeCondition{
			Type:               api.NodeNetworkUnavailable,
			Status:             api.ConditionTrue,
			Reason:             "NoRouteCreated",
			Message:            "Node created without a route",
			LastTransitionTime: unversioned.NewTime(kl.clock.Now()),
		})
	}

which sets NodeNetworkUnavailable on the node in some cases.  This is determined by:

func (kl *Kubelet) providerRequiresNetworkingConfiguration() bool {
	if kl.cloud == nil || kl.cloud.ProviderName() != "gce" || kl.flannelExperimentalOverlay {
		return false
	}
	_, supported := kl.cloud.Routes()
	return supported
}

And in the case of GCE (1) the early return false does not trigger and (2) routes exist for the cluster.  Thus this function returns 'true' and the node gets NodeNetworkUnavailable set on it.

This condition is supposed to be cleared by the route controller.  But the route controller has this code:

func (rc *RouteController) reconcile(nodes []api.Node, routes []*cloudprovider.Route) error {
	... <snip>
	for _, node := range nodes {
		// Skip if the node hasn't been assigned a CIDR yet.
		if node.Spec.PodCIDR == "" {
			continue
		}
		// Check if we have a route for this node w/ the correct CIDR.
		r := routeMap[node.Name]

Thus if the node does not have a PodCIDR, it will never get the NodeNetworkUnavailable condition cleared.

openshift-sdn does not use PodCIDR; instead it uses HostSubnet resources.  Thus, when running openshift-sdn on GCE, this condition is expected to occur.  AWS is not affected because the kubelet code in providerRequiresNetworkingConfiguration() exempts the node from the initial network-unavailable condition.  This is an expectation mismatch between upstream Kubernetes cloud support and openshift-sdn cloud support.
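
To confirm the mismatch on an affected cluster, one possible check (exact output varies with the installation) is to compare the SDN's HostSubnet records with the node spec; with openshift-sdn the node's spec.podCIDR stays empty:

oc get hostsubnets
oc get node qe-wehe-node-registry-router-1 -o yaml | grep podCIDR

For illustration only, the general shape of a fix at the network-plugin level would be for the plugin to clear the condition itself once node networking is set up, roughly the way the route controller does after creating a route.  The sketch below uses the modern client-go API rather than the 1.3-era package layout; the helper name and the Reason string are made up for the example, and it is not necessarily how the actual fix is implemented:

// Illustrative sketch only: how a network plugin could clear the
// NodeNetworkUnavailable condition once node networking is ready,
// roughly mirroring what the route controller does after creating a route.
package sketch

import (
	"context"
	"time"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// clearNetworkUnavailable (hypothetical helper) sets the node's
// NetworkUnavailable condition to False.
func clearNetworkUnavailable(ctx context.Context, client kubernetes.Interface, nodeName string) error {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}

	cond := v1.NodeCondition{
		Type:               v1.NodeNetworkUnavailable,
		Status:             v1.ConditionFalse,
		Reason:             "RouteCreated", // reason string chosen for illustration
		Message:            "Node networking is configured by the SDN plugin",
		LastTransitionTime: metav1.NewTime(time.Now()),
	}

	// Replace an existing NetworkUnavailable condition, or append one.
	replaced := false
	for i := range node.Status.Conditions {
		if node.Status.Conditions[i].Type == v1.NodeNetworkUnavailable {
			node.Status.Conditions[i] = cond
			replaced = true
			break
		}
	}
	if !replaced {
		node.Status.Conditions = append(node.Status.Conditions, cond)
	}

	_, err = client.CoreV1().Nodes().UpdateStatus(ctx, node, metav1.UpdateOptions{})
	return err
}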

Comment 11 Dan Williams 2016-08-16 22:46:30 UTC
Possible fix in https://github.com/openshift/origin/pull/10471.

Is there any way you can try building that fix in?  Or would it work better if you had some OpenShift RPMs that you could use to manually update the master?

Comment 12 Wenqi He 2016-08-17 02:25:54 UTC
Discussed with the QE production team; it is very hard for QE to build and deploy OCP with only a PR fix provided. Maybe we can ask @sdodson (sdodson@redhat.com) to help here. Thanks.

Comment 13 Dan Williams 2016-08-17 14:28:17 UTC
(In reply to Wenqi He from comment #12)
> Discussed with the QE production team; it is very hard for QE to build and
> deploy OCP with only a PR fix provided. Maybe we can ask @sdodson
> (sdodson@redhat.com) to help here. Thanks.

I can build a new set of atomic-openshift RPMs for you with the candidate fix included if you can tell me the RPM name/version/release of what you are currently running.  Then you could use these RPMs for the deployment and testing.  Would that work?

Comment 16 Wenqi He 2016-08-18 03:13:52 UTC
(In reply to Dan Williams from comment #13)
> (In reply to Wenqi He from comment #12)
> > Discussed with the QE production team; it is very hard for QE to build and
> > deploy OCP with only a PR fix provided. Maybe we can ask @sdodson
> > (sdodson@redhat.com) to help here. Thanks.
> 
> I can build a new set of atomic-openshift RPMs for you with the candidate
> fix included if you can tell me the RPM name/version/release of what you are
> currently running.  Then you could use these RPMs for the deployment and
> testing.  Would that work?

I have installed a new GCE environment and updated the RPMs to include this fix, but unfortunately the issue still reproduces: all the pods are in Pending status with FailedScheduling.

Comment 24 Dan Williams 2016-08-19 18:41:35 UTC
Updated github PR: https://github.com/openshift/origin/pull/10545

Comment 27 Wenqi He 2016-08-25 10:58:20 UTC
I have tested this on containerized OCP with the version below; this problem is fixed.
openshift v3.3.0.25+d2ac65e-dirty
kubernetes v1.3.0+507d3a7
etcd 2.3.0+git

I have not had a chance to verify it on an RPM installation; we hit a blocking issue today and will try to verify it tomorrow.

Comment 28 Wenqi He 2016-08-26 00:04:45 UTC
I have verified this on the version below; this bug is fixed:
openshift v3.3.0.25+d2ac65e-dirty
kubernetes v1.3.0+507d3a7
etcd 2.3.0+git

[root@qe-wehe-master-1 ~]# oc get pods
NAME              READY     STATUS    RESTARTS   AGE
hello-openshift   1/1       Running   0          8h

Comment 30 errata-xmlrpc 2016-09-27 09:40:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1933

