Bug 1752636 - Networkpolicy resources not getting applied on update
Summary: Networkpolicy resources not getting applied on update
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.3.0
Assignee: Dan Winship
QA Contact: zhaozhanqi
Depends On:
Blocks: 1758232 1758233 1758235
Reported: 2019-09-16 18:46 UTC by rvanderp
Modified: 2020-04-14 19:13 UTC
CC List: 24 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: In clusters with many namespaces (and especially, with lots of namespace creation/deletion) and lots of NetworkPolicies that select namespaces, OpenShift might take a very long time to apply the NetworkPolicy rules for newly-created namespaces. Consequence: When a namespace was created, it might take an hour or more before it was correctly accessible to/from other namespaces. Fix: Improvements were made to the Namespace and NetworkPolicy handling code. Result: NetworkPolicies should be applied promptly to newly-created namespaces.
Clone Of:
Cloned To: 1758232 1758233 1758235
Last Closed: 2020-01-23 11:05:53 UTC
Target Upstream Version:

Attachments

System ID | Status | Summary | Last Updated
Github openshift/sdn pull 36 | closed | Bug 1752636: networkpolicy: add a namespaceSelector cache | 2021-01-22 08:54:42 UTC
Github openshift/sdn pull 42 | closed | further NetworkPolicy caching fixes | 2021-01-22 08:54:43 UTC
Red Hat Knowledge Base (Solution) 4984841 | | | 2020-04-14 19:13:50 UTC
Red Hat Product Errata RHBA-2020:0062 | | | 2020-01-23 11:06:22 UTC

Description rvanderp 2019-09-16 18:46:30 UTC
Description of problem:
When network policy resources are changed/applied, the network policy is not implemented until the ovs and sdn pods are deleted.  When the pods are recreated, the configured policy is active.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Apply updated network policy
2. Test that updated policy is in place

Actual results:
NetworkPolicy updates are not reflected until ovs/sdn pods are deleted

Expected results:
NetworkPolicy updates should be reflected without intervention

Additional info:
Attached are sdn pod logs:

1. we do see a number of error messages such as the following:

E0916 08:59:15.072394   13237 proxier.go:1321] Failed to execute iptables-restore: exit status 4 (Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
I0916 08:59:15.072441   13237 proxier.go:1323] Closing local ports after iptables-restore failure

2. There are also a number of errors related to not being able to find the VNID:

E0916 08:59:43.269814   13237 networkpolicy.go:471] Could not find VNID for NetworkPolicy the-namespace-e57c2e1050d14b5f9aacaa908c39b04f/allow-from-global-namespaces

Comment 1 Ricardo Carrillo Cruz 2019-09-17 07:29:01 UTC

Can you please paste 'oc get netnamespaces'?
Do you know if this cluster was initially created as Multitenant, then moved to NetworkPolicy?

Comment 11 Ricardo Carrillo Cruz 2019-09-18 11:44:53 UTC
Yeah, thanks, that's what I was after.
I wanted to know if any flows were being created at all. Since there are flows in table 80, something had to be
slowing things down, given that the flow-sync Go code in the SDN's networkpolicy plugin syncs every second.

Comment 13 Borja Aranda 2019-09-18 12:14:25 UTC
Created attachment 1616210 [details]
AFTER deleting netpols

Comment 14 Borja Aranda 2019-09-18 12:14:51 UTC
Created attachment 1616211 [details]
BEFORE deleting netpols

Comment 49 Dan Winship 2019-09-21 00:43:56 UTC
OK, so further debugging showed that every call to networkpolicy.go:handleAddOrUpdateNamespace() was taking 2 seconds to run.

This turns out to be because they have a huge number of namespaces, and every one of them has an "allow from default namespace" policy (as created by the multitenant-to-networkpolicy migration script, and as recommended to make routers work in 3.11).

The problem is that networkpolicy.go doesn't make any effort to recognize that these are all the same policy. So every time you add a new Namespace, it sees that there are 10,000 (or whatever) NetworkPolicies with namespaceSelectors, and tests each one against the new Namespace to see if it's a match or not.

The fix is to reorganize that code to keep only a single copy of each NamespaceSelector, and apply its matches to every policy that uses that selector. This may not be easy to do.

(I don't think there are any good workarounds until there is a fix: getting rid of the allow-from-default policies would be disruptive, in that it would break routers and some other stuff.)
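The deduplication described above can be sketched roughly as follows. This is a simplified, hypothetical model (matchLabels-only selectors, plain string keys; the real data structures in openshift/sdn differ): each unique namespaceSelector is stored once, keyed by a canonical string, so a new Namespace is tested against each distinct selector rather than against every policy.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Labels is a simplified stand-in for a namespaceSelector's matchLabels.
type Labels map[string]string

// key canonicalizes a selector so identical selectors share one cache entry.
func key(sel Labels) string {
	parts := make([]string, 0, len(sel))
	for k, v := range sel {
		parts = append(parts, k+"="+v)
	}
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

// selectorCache groups policies by their (deduplicated) namespaceSelector.
type selectorCache struct {
	selectors map[string]Labels   // canonical key -> selector
	policies  map[string][]string // canonical key -> policies using it
}

func newSelectorCache() *selectorCache {
	return &selectorCache{
		selectors: map[string]Labels{},
		policies:  map[string][]string{},
	}
}

func (c *selectorCache) addPolicy(policy string, sel Labels) {
	k := key(sel)
	c.selectors[k] = sel
	c.policies[k] = append(c.policies[k], policy)
}

// matches reports whether nsLabels satisfy every term of sel.
func matches(sel, nsLabels Labels) bool {
	for k, v := range sel {
		if nsLabels[k] != v {
			return false
		}
	}
	return true
}

// policiesMatching evaluates each *unique* selector once against the
// namespace, instead of once per policy.
func (c *selectorCache) policiesMatching(nsLabels Labels) []string {
	var out []string
	for k, sel := range c.selectors {
		if matches(sel, nsLabels) {
			out = append(out, c.policies[k]...)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	c := newSelectorCache()
	// 10,000 identical allow-from-default policies collapse into one entry.
	for i := 0; i < 10000; i++ {
		c.addPolicy(fmt.Sprintf("ns%d/allow-from-default", i),
			Labels{"name": "default"})
	}
	fmt.Println("unique selectors:", len(c.selectors))
	fmt.Println("matching policies:", len(c.policiesMatching(Labels{"name": "default"})))
}
```

With this shape, adding a namespace costs one selector evaluation per unique selector rather than one per policy, which is the difference between 1 and 10,000 evaluations in the cluster described above.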

Comment 54 Casey Callendrello 2019-09-23 11:16:05 UTC
(In reply to Dan Winship from comment #49)
> The fix is to reorganize that code to keep only a single copy of each
> NamespaceSelector, and apply its matches to every policy that uses that
> selector. This may not be easy to do.

It would be interesting to maintain an inverted index of which network policies are interested in which label selectors. That should be a pretty good shortcut.

We would have to be clever with the deleted-label case, though.
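One way to handle the deleted-label case is to remember which selectors each namespace matched on the previous sync, and diff the old and new match sets whenever the namespace's labels change. A rough sketch, again with simplified hypothetical types (not the actual openshift/sdn code):

```go
package main

import "fmt"

// Labels is a simplified stand-in for namespace labels / matchLabels.
type Labels map[string]string

func matches(sel, ns Labels) bool {
	for k, v := range sel {
		if ns[k] != v {
			return false
		}
	}
	return true
}

// invertedIndex remembers which selectors a namespace matched last time,
// so a label change (including a deleted label) shows up as a diff.
type invertedIndex struct {
	selectors map[string]Labels          // selector key -> selector
	prev      map[string]map[string]bool // namespace -> selector keys matched
}

// updateNamespace recomputes the namespace's matches and returns which
// selector keys were gained and which were lost.
func (ix *invertedIndex) updateNamespace(ns string, labels Labels) (gained, lost []string) {
	cur := map[string]bool{}
	for key, sel := range ix.selectors {
		if matches(sel, labels) {
			cur[key] = true
		}
	}
	for key := range cur {
		if !ix.prev[ns][key] {
			gained = append(gained, key)
		}
	}
	for key := range ix.prev[ns] {
		if !cur[key] {
			lost = append(lost, key) // the deleted-label case
		}
	}
	ix.prev[ns] = cur
	return gained, lost
}

func main() {
	ix := &invertedIndex{
		selectors: map[string]Labels{"name=default": {"name": "default"}},
		prev:      map[string]map[string]bool{},
	}
	g, l := ix.updateNamespace("ns1", Labels{"name": "default"})
	fmt.Println("gained:", g, "lost:", l)
	// Removing the label must surface as a lost match.
	g, l = ix.updateNamespace("ns1", Labels{})
	fmt.Println("gained:", g, "lost:", l)
}
```

Only the selectors in the "lost" set need their flows torn down, so a relabel touches just the affected policies rather than forcing a full resync.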

Comment 55 Dan Winship 2019-09-23 15:36:44 UTC
Posted a patch against master; additional clones of this bug will be created for backports.

Comment 65 Anurag saxena 2019-10-01 15:26:50 UTC
Sure Dan Winship, will keep an eye on the merge. Thanks!

Comment 69 Dan Winship 2019-10-03 16:21:05 UTC
Not sure why this didn't get automatically moved to ON_QA before (are we not QA'ing 4.3 bugs yet?), but this needs to be officially VERIFIED before the backports can proceed.

Comment 71 Anurag saxena 2019-10-04 00:13:57 UTC
Verifying based on comment 68 and today's scale test observations, as follows:

1) A 3-master, 4-worker cluster was brought up on 4.3 CI build 4.3.0-0.ci-2019-10-02-102400 (there are no nightlies yet)

2) 5000 projects were created, each containing an `allow-from-default-namespace` policy

3) 950 pods were created randomly among the 5000 projects to observe OVS flows (we are bound by 250 pods/node)

4) OVS table=80 flows across the workers total around 974, which seems good:

$ oc exec ovs-fpj9s -- ovs-ofctl dump-flows br0 -O openflow13 | grep table=80 | wc -l  
$ oc exec ovs-t49cn -- ovs-ofctl dump-flows br0 -O openflow13 | grep table=80 | wc -l
$ oc exec ovs-vzp6c -- ovs-ofctl dump-flows br0 -O openflow13 | grep table=80 | wc -l
$ oc exec ovs-xzp5d -- ovs-ofctl dump-flows br0 -O openflow13 | grep table=80 | wc -l

5) After a 6-hour longevity run, the OVS flow totals remain the same

6) Network policy updates across the projects are also working without any sdn/ovs pod restarts.

Will verify again on 3.11 once backported. Thanks. I guess we need a bug to be opened for 3.11.z as well.

Comment 72 Borja Aranda 2019-10-04 08:26:34 UTC
I changed the version of this bz to 4.3.

The 3.11 bz clone is https://bugzilla.redhat.com/show_bug.cgi?id=1758235

Comment 73 Dan Winship 2019-10-04 12:02:19 UTC
(In reply to Borja from comment #72)
> I changed the version of this bz to 4.3.

"Version" is the version the bug was reported it; "Target Release" is the version it's being fixed in

Comment 77 errata-xmlrpc 2020-01-23 11:05:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

