Bug 1546170 - [3.7] missing node-to-node OVS flows
Summary: [3.7] missing node-to-node OVS flows
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.7.z
Assignee: Ben Bennett
QA Contact: Meng Bo
URL:
Whiteboard:
Depends On: 1544903 1546169 1547599
Blocks:
Reported: 2018-02-16 13:44 UTC by Dan Winship
Modified: 2018-04-05 09:39 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: In some (as-yet-undetermined) circumstances, nodes were apparently receiving a duplicate out-of-order HostSubnet "deleted" event from the master.
Consequence: When processing the duplicate event, the node could end up deleting OVS flows corresponding to an active node, causing pods on the two nodes to be unable to communicate with each other. (This was most noticeable when it happened to a node hosting the registry.)
Fix: The HostSubnet event-processing code will now notice that the event is a duplicate and ignore it.
Result: OVS flows are not deleted, and pods can communicate.
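The duplicate-event check described in the Doc Text might be sketched roughly as follows. This is a minimal illustration in Go and not the code from the actual patch: the HostSubnet struct, the subnetTracker type, the handle method, and the choice of comparing object UIDs to detect a stale event are all assumptions made for the example, and the flows map merely stands in for real OVS flow programming.

```go
package main

import "fmt"

// HostSubnet is a simplified, hypothetical stand-in for the real API object.
type HostSubnet struct {
	UID    string // unique per object instance; changes when a node is re-created
	Host   string // node name
	HostIP string // node IP address
}

// EventType mirrors the kind of watch event delivered by the master.
type EventType string

const (
	Added   EventType = "ADDED"
	Deleted EventType = "DELETED"
)

// subnetTracker remembers which HostSubnet is currently active for each host.
// The flows map is a placeholder for the OVS flows installed per node IP.
type subnetTracker struct {
	active map[string]HostSubnet // keyed by host name
	flows  map[string]bool       // keyed by node IP
}

func newSubnetTracker() *subnetTracker {
	return &subnetTracker{
		active: make(map[string]HostSubnet),
		flows:  make(map[string]bool),
	}
}

// handle applies one event and reports whether any flows were changed.
// A "deleted" event whose UID does not match the active HostSubnet for
// that host is treated as a duplicate or stale event and is ignored.
func (t *subnetTracker) handle(ev EventType, hs HostSubnet) bool {
	cur, known := t.active[hs.Host]
	switch ev {
	case Added:
		if known && cur.UID == hs.UID {
			return false // already tracking this exact object
		}
		if known {
			delete(t.flows, cur.HostIP) // replace flows of the older object
		}
		t.active[hs.Host] = hs
		t.flows[hs.HostIP] = true
		return true
	case Deleted:
		if !known || cur.UID != hs.UID {
			return false // duplicate or out-of-order delete: leave flows alone
		}
		delete(t.active, hs.Host)
		delete(t.flows, hs.HostIP)
		return true
	}
	return false
}

func main() {
	tr := newSubnetTracker()
	old := HostSubnet{UID: "uid-1", Host: "node-a", HostIP: "10.0.0.2"}
	cur := HostSubnet{UID: "uid-2", Host: "node-a", HostIP: "10.0.0.2"}

	tr.handle(Added, old)
	tr.handle(Deleted, old)
	tr.handle(Added, cur)
	changed := tr.handle(Deleted, old) // replayed delete for the old object

	fmt.Println("replayed delete changed flows:", changed)
	fmt.Println("flows for 10.0.0.2 still installed:", tr.flows[cur.HostIP])
}
```

In this sketch the replayed delete for the old object is a no-op, so the flows for the active node stay in place. Keying the comparison on the UID rather than on the host name or IP is a design choice of the example: it lets a delete for a stale object be distinguished from a delete for the current one even when both share the same name and address.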
Clone Of: 1546169
Environment:
Last Closed: 2018-04-05 09:38:31 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Origin (Github)         18617           0        None      None    None     2018-02-16 13:44:29 UTC
Red Hat Product Errata  RHBA-2018:0636  0        None      None    None     2018-04-05 09:39:21 UTC

Comment 1 Dan Winship 2018-02-16 13:46:15 UTC
https://github.com/openshift/ose/pull/1073

Comment 3 Meng Bo 2018-03-12 10:34:08 UTC
Tested on 3.7.38, there is no replay of hostsubnet delete.

Comment 4 Dan Winship 2018-03-12 14:13:14 UTC
(In reply to Meng Bo from comment #3)
> Tested on 3.7.38, there is no replay of hostsubnet delete.

That's not what the patch fixes. The patch attempts to make sure that if a "replayed hostsubnet delete" occurs, we do the right thing. But we don't know how to actually cause a "replayed hostsubnet delete" (or even whether that really is the right description of what's occurring), so the fix can't really be QA'ed at this point (other than to make sure that it doesn't break anything else).

The real test will be when we get this fix deployed to Online, and we see if the node-to-node routing problem goes away.

Comment 8 errata-xmlrpc 2018-04-05 09:38:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0636

