Bug 1539187 - Node startup should flush stale ovs rules when hostsubnetlength changes on restart
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.9.0
Assignee: Jacob Tanenbaum
QA Contact: Meng Bo
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-01-26 20:54 UTC by Robert Bost
Modified: 2018-07-11 07:24 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-03-28 14:23:11 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Origin (Github) 18634 0 None None None 2018-02-15 20:59:41 UTC
Red Hat Product Errata RHBA-2018:0489 0 None None None 2018-03-28 14:23:35 UTC

Description Robert Bost 2018-01-26 20:54:46 UTC
Description of problem:
Customer report:

"Upon upgrading to OpenShift 3.7, our pod IP network became unavailable across nodes. This was debugged to the point that OpenShift was handing out colliding hostsubnet values. For example, some hosts may have been given a 10.1.5.0/24 while others already had the 10.1.4.0/23 range (these two subnets collide)."

OpenShift should not allow two hostsubnet ranges to collide. 
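The collision in the customer report can be checked mechanically. The following is a minimal sketch using Python's standard ipaddress module, not OpenShift's own allocation logic; the helper name find_overlaps and the third subnet (10.1.6.0/23) are assumptions added only for illustration.

```python
import ipaddress
from itertools import combinations


def find_overlaps(cidrs):
    """Return every pair of CIDR strings whose address ranges intersect."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [(str(a), str(b)) for a, b in combinations(nets, 2) if a.overlaps(b)]


# The two subnets quoted in the report, plus a hypothetical
# non-colliding /23 for contrast.
hostsubnets = ["10.1.4.0/23", "10.1.5.0/24", "10.1.6.0/23"]
collisions = find_overlaps(hostsubnets)
print(collisions)
```

10.1.4.0/23 spans 10.1.4.0 through 10.1.5.255, so it fully contains 10.1.5.0/24, which is why the two ranges are reported as colliding.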


Version-Release number of selected component (if applicable): 3.7


Expected results:
"I expect to see Openshift not give colliding subnet values if the master services can be configured in a way to hand out different subnet lengths."

Comment 1 Jacob Tanenbaum 2018-01-31 20:24:45 UTC
Could you post the master-config.yaml file?

Comment 9 Jacob Tanenbaum 2018-02-02 21:24:26 UTC
We want to allow the master to change the network if something gets messed up. However, that change is not currently reflected in the node SDN setup rules, and it should be.

Comment 10 openshift-github-bot 2018-02-24 12:25:03 UTC
Commit pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/ffc83819c44440e4e1b30aa34a2ce41e3aab8e75
Correctly flush stale ovs rules on Node startup

Currently, when OpenShift creates a new OVS bridge, it does so using

ovs-vsctl --if-exists del-br br0 -- add-br br0 -- set Bridge br0 fail-mode=secure protocols=OpenFlow13

which, while it does delete the bridge, does not clear the flows attached to it. Splitting bridge creation into two steps (deleting the old bridge, then creating the new one) correctly deletes any stale OVS flows.
Bug 1539187
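
As a rough illustration of the two-step sequence described in the commit message, here is a sketch in Python. It is not the code from the origin commit (which is written in Go); the function names bridge_setup_commands and recreate_bridge and the injectable run parameter are assumptions made for this example. The ovs-vsctl arguments themselves are taken from the commit message.

```python
import subprocess


def bridge_setup_commands(bridge="br0"):
    """Build the two separate ovs-vsctl invocations as argv lists."""
    # Step 1: delete the old bridge in its own ovs-vsctl call.
    delete_cmd = ["ovs-vsctl", "--if-exists", "del-br", bridge]
    # Step 2: create the bridge and configure it in a second call.
    create_cmd = [
        "ovs-vsctl", "add-br", bridge,
        "--", "set", "Bridge", bridge,
        "fail-mode=secure", "protocols=OpenFlow13",
    ]
    return [delete_cmd, create_cmd]


def recreate_bridge(bridge="br0", run=subprocess.check_call):
    """Run the delete and create steps in order; `run` can be swapped out."""
    for cmd in bridge_setup_commands(bridge):
        run(cmd)
```

Passing a stub as run (for example a list's append method) lets the command sequence be inspected on a machine without Open vSwitch installed.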

Comment 12 Hongan Li 2018-03-05 08:02:07 UTC
Verified in OpenShift v3.9.2: on node startup, OVS setup now deletes br0 and then creates a new one, as shown below.

I0305 07:44:52.501637   14512 ovs.go:145] Executing: ovs-vsctl --if-exists del-br br0
I0305 07:44:52.577332   14512 ovs.go:145] Executing: ovs-vsctl add-br br0 -- set Bridge br0 fail-mode=secure protocols=OpenFlow13

Comment 15 errata-xmlrpc 2018-03-28 14:23:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489

