Bug 1752636
Summary: NetworkPolicy resources not getting applied on update
Product: OpenShift Container Platform
Component: Networking
Networking sub component: openshift-sdn
Version: 3.11.0
Target Release: 4.3.0
Reporter: rvanderp
Assignee: Dan Winship <danw>
QA Contact: zhaozhanqi <zzhao>
Status: CLOSED ERRATA
Severity: urgent
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2020-01-23 11:05:53 UTC
CC: anusaxen, cdc, danw, dsafford, dyocum, erich, farandac, gparente, jack.ottofaro, jdesousa, mfiedler, mifiedle, misalunk, openshift-bugs-escalate, palonsor, piqin, rhowe, ricarril, rkshirsa, scuppett, sreber, tsmetana, wabouham, weliang
Clones: 1758232, 1758233, 1758235 (view as bug list)
Bug Blocks: 1758232, 1758233, 1758235

Doc Type: Bug Fix
Doc Text:
Cause: In clusters with many namespaces (especially with frequent namespace creation and deletion) and many NetworkPolicies that select namespaces, OpenShift could take a very long time to apply the NetworkPolicy rules for newly created namespaces.
Consequence: When a namespace was created, it could take an hour or more before it was correctly accessible to and from other namespaces.
Fix: Improvements were made to the Namespace and NetworkPolicy handling code.
Result: NetworkPolicies are now applied promptly to newly created namespaces.
Description
rvanderp
2019-09-16 18:46:30 UTC
Hi. Can you please paste the output of `oc get netnamespaces`? Do you know if this cluster was initially created as Multitenant and then moved to NetworkPolicy?

Yeah, thanks, that's what I was after. I wanted to know whether the flows were being created at all. Since there are flows in table 80, something had to be slowing it down, because the flow-sync code in the SDN's networkpolicy handling syncs every second.

Created attachment 1616210 [details]
AFTER deleting netpols

Created attachment 1616211 [details]
BEFORE deleting netpols
OK, so further debugging showed that every call to networkpolicy.go:handleAddOrUpdateNamespace() was taking 2 seconds to run. This turns out to be because they have a huge number of namespaces, and every one of them has an "allow from default namespace" policy (as created by the multitenant-to-networkpolicy migration script, and as recommended to make routers work in 3.11). The problem is that networkpolicy.go makes no effort to recognize that these are all the same policy. So every time a new Namespace is added, it sees that there are 10,000 (or however many) NetworkPolicies with namespaceSelectors, and tests each one against the new Namespace to see whether it matches.

The fix is to reorganize that code to keep only a single copy of each NamespaceSelector, and apply its matches to every policy that uses that selector. This may not be easy to do. (I don't think there are any good workarounds until there is a fix: getting rid of the allow-from-default policies would be disruptive, in that it would break routers and some other things.)

(In reply to Dan Winship from comment #49)
> The fix is to reorganize that code to keep only a single copy of each
> NamespaceSelector, and apply its matches to every policy that uses that
> selector. This may not be easy to do.

It would be interesting to maintain an inverted index of which network policies are interested in which label selectors. That should be a pretty good shortcut. We would have to be clever with the deleted-label case, though.

Posted a patch against master; additional clones of this bug will be created for backports.

Sure Dan Winship, will keep an eye on the merge. Thanks!

Not sure why this didn't get automatically moved to ON_QA before (are we not QA'ing 4.3 bugs yet?), but this needs to be officially VERIFIED before the backports can proceed.
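The single-copy-per-selector idea described above can be sketched as follows. This is a minimal standalone illustration, not the actual patch: the types and function names are hypothetical, only matchLabels-style selectors are modeled, and the real code operates on Kubernetes LabelSelector objects and OVS flows. The point is that identical namespaceSelectors collapse into one index entry, so a new namespace costs one selector evaluation per distinct selector instead of one per policy.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// policy stands in for a NetworkPolicy whose peer rule uses a namespaceSelector.
type policy struct {
	name     string
	selector map[string]string // matchLabels only, for illustration
}

// selectorKey canonicalizes a selector so that identical selectors
// (e.g. 10,000 copies of "name=default") collapse to one index entry.
func selectorKey(sel map[string]string) string {
	keys := make([]string, 0, len(sel))
	for k := range sel {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+sel[k])
	}
	return strings.Join(parts, ",")
}

// matchesLabels reports whether nsLabels satisfies every matchLabels entry.
func matchesLabels(sel, nsLabels map[string]string) bool {
	for k, v := range sel {
		if nsLabels[k] != v {
			return false
		}
	}
	return true
}

// matchNewNamespace evaluates each *distinct* selector once against the new
// namespace's labels, then fans the result out to every policy sharing that
// selector. It returns the number of selector evaluations performed and the
// number of policies that matched.
func matchNewNamespace(policies []policy, nsLabels map[string]string) (evaluations, matched int) {
	byKey := map[string][]int{}                // canonical selector -> policy indices
	selByKey := map[string]map[string]string{} // canonical selector -> one representative
	for i, p := range policies {
		k := selectorKey(p.selector)
		byKey[k] = append(byKey[k], i)
		selByKey[k] = p.selector
	}
	for k, sel := range selByKey {
		evaluations++
		if matchesLabels(sel, nsLabels) {
			matched += len(byKey[k])
		}
	}
	return evaluations, matched
}

func main() {
	// 10,000 policies, all carrying the identical allow-from-default selector,
	// mimicking what the multitenant-to-networkpolicy migration produces.
	policies := make([]policy, 0, 10000)
	for i := 0; i < 10000; i++ {
		policies = append(policies, policy{
			name:     fmt.Sprintf("allow-from-default-%d", i),
			selector: map[string]string{"name": "default"},
		})
	}
	evals, matched := matchNewNamespace(policies, map[string]string{"name": "default"})
	fmt.Printf("selector evaluations: %d, policies matched: %d\n", evals, matched)
	// → selector evaluations: 1, policies matched: 10000
}
```

The tricky case the comment flags (the "deleted-label case") is not handled here: when a namespace's labels change or a label is removed, previously matched entries in the index have to be invalidated, not just re-added.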
Verifying based on comment 68 and today's scale-test observations, as follows.

Steps:
1) A 3-master, 4-worker cluster was brought up on 4.3 CI build 4.3.0-0.ci-2019-10-02-102400 (there are no nightlies yet).
2) 5000 projects, each containing an `allow-from-default-namespace` policy, were created.
3) 950 pods were created randomly among the 5000 projects to observe OVS flows (we are bound by 250 pods/node).
4) OVS table=80 flows across the workers total around 974, which looks good:

$ oc exec ovs-fpj9s -- ovs-ofctl dump-flows br0 -O openflow13 | grep table=80 | wc -l
245
$ oc exec ovs-t49cn -- ovs-ofctl dump-flows br0 -O openflow13 | grep table=80 | wc -l
241
$ oc exec ovs-vzp6c -- ovs-ofctl dump-flows br0 -O openflow13 | grep table=80 | wc -l
244
$ oc exec ovs-xzp5d -- ovs-ofctl dump-flows br0 -O openflow13 | grep table=80 | wc -l
244

5) After a 6-hour longevity run, the OVS flow totals remain the same.
6) Network policy updates across the projects are also working, without any sdn/ovs pod restarts.

Will verify again on 3.11 once backported. Thanks.

I guess we need a bug to be opened for 3.11.z as well.

I changed the version of this bz to 4.3. The 3.11 bz clone is https://bugzilla.redhat.com/show_bug.cgi?id=1758235

(In reply to Borja from comment #72)
> I changed the version of this bz to 4.3.

"Version" is the version the bug was reported in; "Target Release" is the version it's being fixed in.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062