Bug 1371971

Summary: Need a minimum counterpart to iptablesSyncPeriod
Product: OpenShift Container Platform
Component: Networking
Version: 3.3.0
Hardware: x86_64
OS: Linux
Reporter: Mike Fiedler <mifiedle>
Assignee: Timothy St. Clair <tstclair>
QA Contact: Meng Bo <bmeng>
CC: agrimm, aos-bugs, bbennett, jeder, mifiedle, tstclair
Status: CLOSED DUPLICATE
Severity: medium
Priority: high
Target Milestone: ---
Target Release: ---
Whiteboard: aos-scalability-34
Type: Bug
Last Closed: 2016-10-28 16:02:29 UTC
Bug Blocks: 1303130

Description Mike Fiedler 2016-08-31 14:57:26 UTC
Description of problem:

On large clusters that are changing quickly (new endpoints, pods, services, etc.), iptables can consume a full core on every node. This was seen when scaling the CNCF cluster up to 300 nodes and 5K projects. Each project had 3 deployments, 3 services, 4 pods, and more.

iptablesSyncPeriod provides an upper bound (default 30s) on how long iptables can go between syncs, but when many changes are occurring there is no lower bound on how often syncs can fire, which can lead to a state where iptables consumes a full core on the nodes.

This bz requests a new parameter that sets a lower bound on the sync interval, i.e. something greater than "whenever changes occur". Propose defaulting it to 0 but allowing it to be raised on large, dynamic clusters.
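
For illustration, here is a minimal Go sketch (not the actual kube-proxy implementation) of how such a lower bound could work alongside the existing upper bound: change-driven syncs are coalesced so that rule rewrites never happen more often than a hypothetical minSyncPeriod, while a periodic resync still enforces the iptablesSyncPeriod upper bound. The names syncPeriod, minSyncPeriod, and syncProxyRules are illustrative only.

package main

import (
	"fmt"
	"time"
)

type proxier struct {
	syncPeriod    time.Duration // upper bound: full resync at least this often (iptablesSyncPeriod)
	minSyncPeriod time.Duration // proposed lower bound: never rewrite rules more often than this
	changes       chan struct{} // service/endpoint change notifications
	lastSync      time.Time
}

func (p *proxier) syncProxyRules() {
	p.lastSync = time.Now()
	fmt.Println("rewriting iptables rules at", p.lastSync.Format(time.RFC3339))
}

func (p *proxier) loop() {
	resync := time.NewTicker(p.syncPeriod)
	defer resync.Stop()
	for {
		select {
		case <-p.changes:
			// Coalesce bursts: if the last rewrite was too recent, wait out the
			// remainder of minSyncPeriod while absorbing further notifications.
			if wait := p.minSyncPeriod - time.Since(p.lastSync); wait > 0 {
				t := time.NewTimer(wait)
				for done := false; !done; {
					select {
					case <-p.changes: // fold additional churn into this sync
					case <-t.C:
						done = true
					}
				}
			}
			p.syncProxyRules()
		case <-resync.C:
			p.syncProxyRules() // periodic full resync (upper bound)
		}
	}
}

func main() {
	p := &proxier{syncPeriod: 30 * time.Second, minSyncPeriod: 2 * time.Second, changes: make(chan struct{}, 100)}
	go p.loop()
	for i := 0; i < 50; i++ { // simulate a burst of endpoint churn
		p.changes <- struct{}{}
		time.Sleep(50 * time.Millisecond)
	}
	time.Sleep(3 * time.Second)
}

With the lower bound set to a few seconds, a burst of endpoint changes costs one rule rewrite per interval instead of one per change, while leaving the default of 0 preserves today's behavior.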

Comment 2 Mike Fiedler 2016-09-01 12:43:11 UTC
I saw it during cluster load-up and during project deletion. Pretty sure it was in conjunction with create/delete activity.

Comment 3 Timothy St. Clair 2016-09-01 12:53:47 UTC
OK, it should be batching on 30-second intervals. We'll need to dig.

Comment 4 Timothy St. Clair 2016-09-28 21:17:40 UTC
During a large load, or bulk rectification, the issue is that continuous service/endpoint updates are computationally expensive because the rules are constantly being regenerated. What is more disconcerting is that any high degree of endpoint churn triggers broadcasts to every node in the cluster, causing each of them to refresh its tables.

We may need to modify this to be a bulk time-windowed operation.
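
As a rough sketch of what a bulk, time-windowed operation could look like (illustrative Go, not the upstream proxier code), pending service/endpoint updates can be accumulated for a fixed window and applied as a single batch, so N changes within a window cost one rule rewrite instead of N. The endpointUpdate type, applyBatch, and the window parameter are assumptions made for this example.

package main

import (
	"fmt"
	"time"
)

type endpointUpdate struct {
	service   string
	endpoints []string
}

func applyBatch(pending map[string][]string) {
	// In a real proxier this would regenerate and restore the iptables rules once.
	fmt.Printf("applying %d coalesced service updates\n", len(pending))
}

func batcher(updates <-chan endpointUpdate, window time.Duration) {
	pending := map[string][]string{} // latest endpoints per service win
	flush := time.NewTicker(window)
	defer flush.Stop()
	for {
		select {
		case u := <-updates:
			pending[u.service] = u.endpoints // coalesce repeated updates to the same service
		case <-flush.C:
			if len(pending) > 0 {
				applyBatch(pending)
				pending = map[string][]string{}
			}
		}
	}
}

func main() {
	updates := make(chan endpointUpdate, 1000)
	go batcher(updates, time.Second)
	for i := 0; i < 200; i++ { // simulate heavy endpoint churn across a few services
		updates <- endpointUpdate{service: fmt.Sprintf("svc-%d", i%5), endpoints: []string{"10.0.0.1:8080"}}
		time.Sleep(10 * time.Millisecond)
	}
	time.Sleep(2 * time.Second)
}

The same windowing would also bound how often each node has to refresh its tables when endpoints churn heavily.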

Comment 5 Timothy St. Clair 2016-09-28 21:43:06 UTC
xref: https://github.com/kubernetes/kubernetes/issues/33693

Comment 6 Timothy St. Clair 2016-10-21 22:00:26 UTC
Working on some fixes - https://github.com/kubernetes/kubernetes/pull/35334