Bug 1682955

Summary: Network Policy Plugin does not clean up flows from deleted namespaces
Product: OpenShift Container Platform
Reporter: Stuart Auchterlonie <sauchter>
Component: Networking
Assignee: Dan Winship <danw>
Status: CLOSED ERRATA
QA Contact: zhaozhanqi <zzhao>
Severity: high
Priority: unspecified
Version: 3.10.0
CC: aos-bugs, danw, wsun, zzhao
Target Release: 4.1.0
Doc Type: Bug Fix
Doc Text:
    Cause: Multiple bugs.
    Consequence: When deleting a Namespace that contained NetworkPolicies, some of the OVS rules pertaining to those NetworkPolicies did not get cleaned up right away. These rules were harmless in themselves (they would not cause any packets to be mistakenly accepted or rejected), and a periodic resync operation eventually removed them, but it was still confusing to have them left behind.
    Fix: All OVS flows associated with a Namespace are now deleted properly when that Namespace is deleted.
    Result: No stale OVS flows.
Clones: 1686025, 1686026, 1686029, 1686030
Last Closed: 2019-06-04 10:44:27 UTC
Type: Bug

Description Stuart Auchterlonie 2019-02-25 22:35:39 UTC
Description of problem:

Network Policy plugin flows are not cleaned up for namespaces
which have already been deleted.

Version-Release number of selected component (if applicable):

3.10, and also with the test packages from
https://bugzilla.redhat.com/show_bug.cgi?id=1656805#c26
(from the "both fixes" branch)

How reproducible:

High

Steps to Reproduce:
1. Create a namespace.
2. Deploy a pod in it.
3. Find the worker node where the pod was deployed.
4. On that worker, count the network-policy flows:
   ovs-ofctl -O OpenFlow13 dump-flows br0 | grep "table=80" | grep "priority=50" | wc -l
5. Delete the namespace.
6. Check the flows on the worker again (a consolidated sketch follows below).
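
Put end to end, a minimal sketch of the reproducer; the namespace name and pod definition (np-test, pod.yaml) are placeholders to adapt to your environment:

oc create namespace np-test          # hypothetical namespace name
oc create -f pod.yaml -n np-test     # any pod definition will do
oc get pods -n np-test -o wide       # note the NODE column

# On that worker node, count the network-policy flows:
ovs-ofctl -O OpenFlow13 dump-flows br0 | grep "table=80" | grep "priority=50" | wc -l

oc delete namespace np-test

# On the worker again; with the bug present, the count does not drop:
ovs-ofctl -O OpenFlow13 dump-flows br0 | grep "table=80" | grep "priority=50" | wc -l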

Actual results:

The flow count keeps growing: stale priority=50 rules in table 80 remain even after their namespace has been deleted.


Expected results:

The Network Policy plugin should clean up the rules related to
deleted namespaces.

Additional info:

Comment 3 Dan Winship 2019-02-26 15:30:41 UTC
As a workaround, you can run

  oc delete networkpolicies -n ${NAMESPACE_NAME} --all

*before* deleting the namespace.
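
Put together (with ${NAMESPACE_NAME} standing in for the namespace being removed):

oc delete networkpolicies -n ${NAMESPACE_NAME} --all
oc delete namespace ${NAMESPACE_NAME}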

Comment 9 zhaozhanqi 2019-03-18 05:28:36 UTC
Verified this bug on 4.0.0-0.nightly-2019-03-15-063749

The following scenarios work well:

1. Create a namespace. Create some pods and NetworkPolicies in that namespace. Confirm with "ovs-ofctl -O OpenFlow13 dump-flows br0 table=80" on a node that there are flows referencing that namespace (via "reg0" in the OVS flow referring to the NetID of the NetNamespace associated with the Namespace). Now delete the namespace and wait for it to disappear from "oc get namespaces". All of the OVS flows referencing that namespace should now be gone.
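
One way to run this check from the command line; np-test is a placeholder namespace, and the NetID is read from its NetNamespace object before the deletion:

NETID=$(oc get netnamespace np-test -o jsonpath='{.netid}')

# Table 80 flows for the namespace carry its NetID in reg0 (shown in hex):
ovs-ofctl -O OpenFlow13 dump-flows br0 table=80 | grep "reg0=0x$(printf '%x' $NETID)"

oc delete namespace np-test

# Once the namespace is gone, the same grep should print nothing.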

2. Create a namespace. Create some pods and NetworkPolicies in that namespace. Kill the SDN pods and wait for them to restart. Delete the namespace. Confirm that the OVS flows are deleted just like in the above case.

3. Create a namespace. Create two pods and a "default deny" NetworkPolicy in that namespace (a sketch of both policies follows below) and confirm that the pods can't talk to each other. Kill the SDN pods and wait for them to restart. Add a NetworkPolicy to the namespace to allow communication between the two pods. (Don't create any new pods or re-label the existing pods after restarting the SDN; *only* add a NetworkPolicy.) Confirm that new OVS flows have been created and the pods can talk.
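
A sketch of the two policies scenario 3 refers to, written as heredocs so everything stays on the command line; the namespace and policy names are illustrative:

# "Default deny": an empty podSelector with no ingress rules blocks all
# inbound traffic to every pod in the namespace.
cat <<EOF | oc create -n np-test -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
EOF

# After the SDN pods restart, allow pod-to-pod traffic within the
# namespace without creating or re-labeling any pods:
cat <<EOF | oc create -n np-test -f -
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector: {}
EOF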

Comment 11 errata-xmlrpc 2019-06-04 10:44:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758