Bug 1502866 - 3.6 Nodes will not start with 3.7 master
Summary: 3.6 Nodes will not start with 3.7 master
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: unspecified
Target Milestone: ---
Target Release: 3.7.0
Assignee: Dan Winship
QA Contact: Meng Bo
URL:
Whiteboard:
Duplicates: 1499040
Depends On:
Blocks:
 
Reported: 2017-10-16 21:26 UTC by Eric Paris
Modified: 2018-12-29 06:44 UTC
CC List: 6 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-28 22:17:21 UTC
Target Upstream Version:
Embargoed:


Attachments
oc get clusternetwork default -o yaml (373 bytes, text/plain)
2017-10-16 21:29 UTC, Eric Paris
journal of failed node start (5.13 KB, text/plain)
2017-10-16 21:29 UTC, Eric Paris
default clusternetwork after manual intervention (416 bytes, text/plain)
2017-10-16 21:42 UTC, Eric Paris


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:3188 0 normal SHIPPED_LIVE Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update 2017-11-29 02:34:54 UTC

Description Eric Paris 2017-10-16 21:26:00 UTC
The nodes crashloop over and over:

error: SDN node startup failed: failed to get network information: failed to parse ClusterNetwork CIDR : invalid CIDR address:

danw indicated he sees the problem.

Comment 1 Eric Paris 2017-10-16 21:27:23 UTC
17:26 < danw> "oc edit clusternetwork default" and set the "network" and "hostsubnetlength" fields to match the values in clusterNetworks
17:26 < danw> it *might* re-break it every time you restart master though

Comment 2 Eric Paris 2017-10-16 21:29:08 UTC
Created attachment 1339473 [details]
oc get clusternetwork default -o yaml

Comment 3 Eric Paris 2017-10-16 21:29:36 UTC
Created attachment 1339474 [details]
journal of failed node start

Comment 4 Eric Paris 2017-10-16 21:42:56 UTC
Created attachment 1339478 [details]
default clusternetwork after manual intervention

Note the trailing 'network' and 'hostsubnetlength' fields, which now match the CIDR.
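The manual intervention in this comment (copying the first clusterNetworks entry back into the legacy fields so a 3.6 node can parse them) can be sketched in Go. The struct and function names below are illustrative only, not the real origin types:

```go
package main

import "fmt"

// ClusterNetworkEntry mirrors one entry of the clusterNetworks list
// shown in the attachment; this is a simplified stand-in type.
type ClusterNetworkEntry struct {
	CIDR             string
	HostSubnetLength uint32
}

// ClusterNetwork is a simplified stand-in for the real object: 3.6 nodes
// read only the legacy Network/HostSubnetLength fields.
type ClusterNetwork struct {
	Network          string
	HostSubnetLength uint32
	ClusterNetworks  []ClusterNetworkEntry
}

// backfillLegacyFields copies the first clusterNetworks entry into the
// legacy fields when they are unset, matching the manual "oc edit" fix.
func backfillLegacyFields(cn *ClusterNetwork) {
	if cn.Network == "" && len(cn.ClusterNetworks) > 0 {
		cn.Network = cn.ClusterNetworks[0].CIDR
		cn.HostSubnetLength = cn.ClusterNetworks[0].HostSubnetLength
	}
}

func main() {
	cn := &ClusterNetwork{
		ClusterNetworks: []ClusterNetworkEntry{
			{CIDR: "10.128.0.0/14", HostSubnetLength: 9},
		},
	}
	backfillLegacyFields(cn)
	fmt.Println(cn.Network, cn.HostSubnetLength) // 10.128.0.0/14 9
}
```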

Comment 5 Dan Winship 2017-10-16 22:29:55 UTC
https://github.com/openshift/origin/pull/16897

Comment 6 Ravi Sankar 2017-10-17 00:49:05 UTC
*** Bug 1499040 has been marked as a duplicate of this bug. ***

Comment 8 Yan Du 2017-10-24 04:32:42 UTC
Set up a 3.6 env:
openshift v3.6.173.0.56
kubernetes v1.6.1+5115d708d7

Upgraded master to 3.7:
openshift v3.7.0-0.176.0
kubernetes v1.7.6+a08f5eeb62

3.6 nodes could start normally with the 3.7 master.

Comment 9 Yan Du 2017-10-24 04:36:45 UTC
And the clusternetwork works well too.
# oc get node
NAME                                               STATUS                     AGE       VERSION
host-8-xxx.host.centralci.eng.rdu2.redhat.com   Ready,SchedulingDisabled   2h        v1.7.6+a08f5eeb62
host-8-yyy.host.centralci.eng.rdu2.redhat.com   Ready                      2h        v1.6.1+5115d708d7
# oc get clusternetwork
NAME      CLUSTER NETWORKS   SERVICE NETWORK   PLUGIN NAME
default   10.128.0.0/14:9    172.30.0.0/16     redhat/openshift-ovs-subnet

Comment 12 errata-xmlrpc 2017-11-28 22:17:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

