Bug 1508445 - [3.6] failed to start SDN plugin controller when Network CIDRS are invalid.
Summary: [3.6] failed to start SDN plugin controller when Network CIDRS are invalid.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.6.z
Assignee: Ben Bennett
QA Contact: Meng Bo
URL:
Whiteboard:
Depends On: 1506017
Blocks:
Reported: 2017-11-01 12:42 UTC by Ben Bennett
Modified: 2017-12-14 21:02 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: 3.6 rejected certain invalid master-config.yaml values that 3.5 silently accepted.
Consequence: When upgrading from 3.5 to 3.6, the master would fail to start if the clusterNetworkCIDR or serviceNetworkCIDR value in master-config.yaml was "invalid" (e.g., "172.30.1.1/16" instead of "172.30.0.0/16").
Fix: 3.6 now accepts the same invalid values that 3.5 accepted, but logs a warning about them.
Result: Upgrades now work, and the admin is notified about the incorrect config values.
Clone Of: 1506017
Environment:
Last Closed: 2017-12-14 21:02:32 UTC
Target Upstream Version:



Links
System ID Priority Status Summary Last Updated
Github openshift ose pull 918 None None None 2017-11-01 12:43:07 UTC
Origin (Github) 17076 None None None 2017-11-01 12:42:15 UTC
Red Hat Product Errata RHBA-2017:3438 normal SHIPPED_LIVE OpenShift Container Platform 3.6 and 3.5 bug fix and enhancement update 2017-12-15 01:58:11 UTC

Description Ben Bennett 2017-11-01 12:42:15 UTC
+++ This bug was initially created as a clone of Bug #1506017 +++

Description of problem:

# In your master-config file:
clusterNetworkCIDR: 10.1.0.0/13
serviceNetworkCIDR: 172.30.0.0/13

# clusternetwork object is created with this:
network: 10.0.0.0/13
ServiceNetwork: 172.24.0.0/13
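
The difference between the configured and stored values comes from masking off the host bits of a /13 prefix. A minimal sketch of that arithmetic follows; the helper name mask_to_network is hypothetical and is not part of OpenShift, which is written in Go.

```python
import ipaddress


def mask_to_network(cidr: str) -> str:
    """Zero the host bits of an IPv4 CIDR string and return the canonical form.

    Hypothetical helper for illustration only; not OpenShift code.
    """
    addr_text, prefix_text = cidr.split("/")
    prefix = int(prefix_text)
    addr = int(ipaddress.IPv4Address(addr_text))
    # Build a 32-bit mask with `prefix` leading one bits (0xFFF80000 for /13).
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    # Keep only the network bits.
    network = addr & mask
    return f"{ipaddress.IPv4Address(network)}/{prefix}"


# A /13 keeps the first octet plus the top 5 bits of the second octet.
# Second octet 1  = 0b00000001 -> 0b00000000 = 0
# Second octet 30 = 0b00011110 -> 0b00011000 = 24
print(mask_to_network("10.1.0.0/13"))
print(mask_to_network("172.30.0.0/13"))
```

With the values from this report, the two calls print 10.0.0.0/13 and 172.24.0.0/13, matching the clusternetwork object.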


Version-Release number of selected component (if applicable):
3.6 

How reproducible:
100% 

Steps to Reproduce:
1. Install a 3.5 cluster with Ansible host values of:

osm_cluster_network_cidr=10.1.0.0/13
openshift_portal_net=172.30.0.0/13

2. After the install, the network is set to:

# clusternetwork object is created with this:
network: 10.0.0.0/13
ServiceNetwork: 172.24.0.0/13

3. Upgrade to 3.6 with the same Ansible host values:

osm_cluster_network_cidr=10.1.0.0/13
openshift_portal_net=172.30.0.0/13


Actual results:

The controller fails to start because of the values set in master-config.yaml:

atomic-openshift-master-controllers[111528]: E1019 12:17:26.599325  111528 common.go:46] Configured clusterNetworkCIDR value "10.1.0.0/13" is invalid; treating it as "10.0.0.0/13"

atomic-openshift-master-controllers[111528]: E1019 12:17:26.599336  111528 common.go:54] Configured serviceNetworkCIDR value "172.30.0.0/13" is invalid; treating it as "172.24.0.0/13"

atomic-openshift-master-controllers[111528]: F1019 12:17:26.612560  111528 start_master.go:776] Error starting "openshift.io/sdn" (failed to start SDN plugin controller: cannot change clusterNetworkCIDR to a value that does not include the existing network.)

Expected results:

The controller should start, because the configured values use the same netmask as the existing networks.
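
One way to read this expectation: once host bits are masked off, the configured network does include the existing one, so an inclusion check should pass. The sketch below illustrates that comparison. It is an assumption about the intended behavior, not the actual OpenShift check, and includes_existing is a hypothetical name.

```python
import ipaddress


def includes_existing(configured: str, existing: str) -> bool:
    """Return True if the configured CIDR, once normalized, covers the existing network.

    Hypothetical illustration; not taken from the OpenShift source.
    """
    # strict=False masks off host bits instead of raising ValueError.
    new_net = ipaddress.ip_network(configured, strict=False)
    old_net = ipaddress.ip_network(existing, strict=False)
    return old_net.subnet_of(new_net)


# Values from this report: configured value vs. stored clusternetwork value.
print(includes_existing("10.1.0.0/13", "10.0.0.0/13"))
print(includes_existing("172.30.0.0/13", "172.24.0.0/13"))
# A genuinely different range (made up for contrast) should still be rejected.
print(includes_existing("10.128.0.0/14", "10.0.0.0/13"))
```

Normalizing both sides before comparing keeps the check strict about real network changes while tolerating values that differ only in host bits.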

--- Additional comment from Dan Winship on 2017-10-31 16:02:22 EDT ---

(Note: In 3.7 this is fixed by the combination of https://github.com/openshift/origin/pull/17076 and https://github.com/openshift/origin/pull/17117.)

--- Additional comment from Dan Winship on 2017-10-31 17:35:02 EDT ---

https://github.com/openshift/ose/pull/918

Comment 2 Yan Du 2017-12-05 07:47:30 UTC
After upgrading OCP v3.5 to v3.6.173.0.83 with the parameters:
osm_cluster_network_cidr=10.1.0.0/13
openshift_portal_net=172.30.0.0/13

Both atomic-openshift-master and the node work well after the upgrade finishes, and the warning is logged when an invalid network CIDR is used:
Dec 05 01:16:12 host-8-241-24.host.centralci.eng.rdu2.redhat.com atomic-openshift-master[26892]: I1205 01:16:12.213132   26892 subnets.go:97] Created HostSubnet host-8-241-24.host.centralci.eng.rdu2.redhat.com (host: "host-8-241-24.host.centralci.eng.rdu2.redhat.com", ip: "10.8.241.24", subnet: "10.1.0.0/23")
Dec 05 01:49:42 host-8-241-24.host.centralci.eng.rdu2.redhat.com atomic-openshift-master[1612]: E1205 01:49:42.561393    1612 common.go:46] Configured clusterNetworkCIDR value "10.1.0.0/13" is invalid; treating it as "10.0.0.0/13"
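
The accept-and-warn behavior verified above can be sketched as follows. This is an illustrative Python approximation, not the Go code from the fix; the function name parse_cidr_leniently and the logger name are hypothetical, and only the message wording is taken from the log line in this comment.

```python
import ipaddress
import logging

log = logging.getLogger("cidr-check")  # hypothetical logger name


def parse_cidr_leniently(name: str, value: str):
    """Parse a CIDR; if host bits are set, warn and return the masked network.

    Hypothetical sketch of the accept-and-warn behavior, not OpenShift code.
    """
    try:
        # strict=True raises ValueError when host bits are set.
        return ipaddress.ip_network(value, strict=True)
    except ValueError:
        # Retry leniently; a truly malformed value raises ValueError again here.
        network = ipaddress.ip_network(value, strict=False)
        log.warning(
            'Configured %s value "%s" is invalid; treating it as "%s"',
            name, value, network,
        )
        return network


logging.basicConfig(level=logging.WARNING)
print(parse_cidr_leniently("clusterNetworkCIDR", "10.1.0.0/13"))
print(parse_cidr_leniently("serviceNetworkCIDR", "172.30.0.0/16"))
```

The first call logs a warning and returns the masked network; the second value is already canonical, so it is returned without a warning.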

Comment 5 errata-xmlrpc 2017-12-14 21:02:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3438

