Bug 1371971

Summary: Need a minimum counterpart to iptablesSyncPeriod
Product: OpenShift Container Platform
Component: Networking
Version: 3.3.0
Hardware: x86_64
OS: Linux
Reporter: Mike Fiedler <mifiedle>
Assignee: Timothy St. Clair <tstclair>
QA Contact: Meng Bo <bmeng>
CC: agrimm, aos-bugs, bbennett, jeder, mifiedle, tstclair
Status: CLOSED DUPLICATE
Severity: medium
Priority: high
Target Milestone: ---
Target Release: ---
Whiteboard: aos-scalability-34
Type: Bug
Last Closed: 2016-10-28 16:02:29 UTC
Bug Blocks: 1303130

Description Mike Fiedler 2016-08-31 14:57:26 UTC
Description of problem:

On large clusters that are changing quickly (new endpoints, pods, services, etc.), iptables can consume a full core on every node. This was seen when scaling the CNCF cluster up to 300 nodes and 5K projects. Each project had 3 deployments, 3 services, 4 pods, and more.

iptablesSyncPeriod provides an upper bound (default 30s) on how long iptables can go between syncs, but when many changes are occurring there is no lower bound on how often syncs can fire, which can lead to a state where iptables consumes a full core on the nodes.

This bz requests a new parameter that sets a lower bound on the sync interval, i.e. something greater than "whenever changes occur". Propose defaulting it to 0 but allowing it to be raised on large, dynamic clusters.
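
For illustration, here is a minimal Go sketch (not the actual kube-proxy implementation) of how such a lower bound could work alongside the existing upper bound: change-driven syncs are coalesced so that rule rewrites never happen more often than a hypothetical minSyncPeriod, while a periodic resync still enforces the iptablesSyncPeriod upper bound. The names syncPeriod, minSyncPeriod, and syncProxyRules are illustrative only.

package main

import (
	"fmt"
	"time"
)

type proxier struct {
	syncPeriod    time.Duration // upper bound: full resync at least this often (iptablesSyncPeriod)
	minSyncPeriod time.Duration // proposed lower bound: never rewrite rules more often than this
	changes       chan struct{} // service/endpoint change notifications
	lastSync      time.Time
}

func (p *proxier) syncProxyRules() {
	p.lastSync = time.Now()
	fmt.Println("rewriting iptables rules at", p.lastSync.Format(time.RFC3339))
}

func (p *proxier) loop() {
	resync := time.NewTicker(p.syncPeriod)
	defer resync.Stop()
	for {
		select {
		case <-p.changes:
			// Coalesce bursts: if the last rewrite was too recent, wait out the
			// remainder of minSyncPeriod while absorbing further notifications.
			if wait := p.minSyncPeriod - time.Since(p.lastSync); wait > 0 {
				t := time.NewTimer(wait)
				for done := false; !done; {
					select {
					case <-p.changes: // fold additional churn into this sync
					case <-t.C:
						done = true
					}
				}
			}
			p.syncProxyRules()
		case <-resync.C:
			p.syncProxyRules() // periodic full resync (upper bound)
		}
	}
}

func main() {
	p := &proxier{syncPeriod: 30 * time.Second, minSyncPeriod: 2 * time.Second, changes: make(chan struct{}, 100)}
	go p.loop()
	for i := 0; i < 50; i++ { // simulate a burst of endpoint churn
		p.changes <- struct{}{}
		time.Sleep(50 * time.Millisecond)
	}
	time.Sleep(3 * time.Second)
}

With the lower bound set to a few seconds, a burst of endpoint changes costs one rule rewrite per interval instead of one per change, while leaving the default of 0 preserves today's behavior.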

Comment 2 Mike Fiedler 2016-09-01 12:43:11 UTC
I saw it during cluster load-up and during project deletion. Pretty sure it was in conjunction with create/delete activity.

Comment 3 Timothy St. Clair 2016-09-01 12:53:47 UTC
OK, it should be batching on 30-second intervals. We'll need to dig.

Comment 4 Timothy St. Clair 2016-09-28 21:17:40 UTC
During a large load, or bulk rectification, the issue is that continuous service/endpoint updates are computationally expensive because the rules are constantly being regenerated. What is more disconcerting is that any high degree of endpoint churn triggers broadcasts to every node in the cluster, causing each of them to refresh its tables.

We may need to modify this to be a bulk time-windowed operation.
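
As a rough sketch of what a bulk, time-windowed operation could look like (illustrative Go, not the upstream proxier code), pending service/endpoint updates can be accumulated for a fixed window and applied as a single batch, so N changes within a window cost one rule rewrite instead of N. The endpointUpdate type, applyBatch, and the window parameter are assumptions made for this example.

package main

import (
	"fmt"
	"time"
)

type endpointUpdate struct {
	service   string
	endpoints []string
}

func applyBatch(pending map[string][]string) {
	// In a real proxier this would regenerate and restore the iptables rules once.
	fmt.Printf("applying %d coalesced service updates\n", len(pending))
}

func batcher(updates <-chan endpointUpdate, window time.Duration) {
	pending := map[string][]string{} // latest endpoints per service win
	flush := time.NewTicker(window)
	defer flush.Stop()
	for {
		select {
		case u := <-updates:
			pending[u.service] = u.endpoints // coalesce repeated updates to the same service
		case <-flush.C:
			if len(pending) > 0 {
				applyBatch(pending)
				pending = map[string][]string{}
			}
		}
	}
}

func main() {
	updates := make(chan endpointUpdate, 1000)
	go batcher(updates, time.Second)
	for i := 0; i < 200; i++ { // simulate heavy endpoint churn across a few services
		updates <- endpointUpdate{service: fmt.Sprintf("svc-%d", i%5), endpoints: []string{"10.0.0.1:8080"}}
		time.Sleep(10 * time.Millisecond)
	}
	time.Sleep(2 * time.Second)
}

The same windowing would also bound how often each node has to refresh its tables when endpoints churn heavily.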

Comment 5 Timothy St. Clair 2016-09-28 21:43:06 UTC
xref: https://github.com/kubernetes/kubernetes/issues/33693

Comment 6 Timothy St. Clair 2016-10-21 22:00:26 UTC
Working on some fixes - https://github.com/kubernetes/kubernetes/pull/35334