Bug 1506396
Summary: | Increase iptables-restore timeout | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Eric Paris <eparis> |
Component: | Networking | Assignee: | Rajat Chopra <rchopra> |
Status: | CLOSED ERRATA | QA Contact: | Meng Bo <bmeng> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 3.7.0 | CC: | aos-bugs, bbennett, danw, smunilla, xtian |
Target Milestone: | --- | ||
Target Release: | 3.7.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Enhancement | |
Doc Text: |
Feature:
This is a 'tuning' change rather than an enhancement.
Reason:
The previous iptables-restore wait value wasn't suitable if two operations were done at the same time.
Result:
A better wait time for iptables operations, so that they finish neatly, in order, and without failures.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2017-11-28 22:19:38 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Eric Paris
2017-10-25 21:45:14 UTC
@eparis: Is that what we want? https://github.com/openshift/origin/pull/17062

dcbw had indicated that some of the time that iptables-restore takes is just parsing the very large number of rules. Unfortunately, it looks like it grabs the lock *before* parsing, rather than *after*, so it stays locked longer than it needs to. We should fix that.

We can file a RHEL BZ for that, I guess. I'll do so. But seeing 2.4s (not waiting for the lock), I think a 5s timeout makes sense.

Actually, it looks like fixing it would be pretty hard, so maybe don't bother.

There is a hardcoded string in iptables.go:
https://github.com/rajatchopra/kubernetes/blob/c5740a37379aa4905c9505082212610a1ac022c6/pkg/util/iptables/iptables.go#L595
As a result, the OpenShift node log always shows:
Nov 07 15:01:18 ose-node1.bmeng.local atomic-openshift-node[97540]: I1107 15:01:18.899238 97540 iptables.go:371] running iptables-restore [--wait=2 --noflush --counters]

Thanks, Meng Bo. Kube PR to correct that: https://github.com/kubernetes/kubernetes/pull/55248
Will backport shortly.

The Origin PR is https://github.com/openshift/origin/pull/17222

Please test it on build 3.7.4-1 or a newer version.

Nov 09 18:58:53 ose-node2.bmeng.local atomic-openshift-node[25845]: I1109 18:58:52.989607 25845 iptables.go:371] running iptables-restore [-w5 --noflush --counters]
Verified on ocp v3.7.4-1.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2017:3188
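
For illustration only: the hardcoded-string problem reported in this thread (the node log showing `--wait=2` regardless of the intended timeout) can be avoided by building the wait flag from a single value and formatting the log message from the same argument slice. The sketch below is not the code from the referenced PRs. The names `restoreWaitSeconds`, `restoreArgs`, and `logLine` are hypothetical, and the 5-second value and the `-w5 --noflush --counters` form are taken from the discussion and the verified log line above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// restoreWaitSeconds is a hypothetical constant; 5s is the timeout
// proposed in this bug, not a value read from any real source file.
const restoreWaitSeconds = 5

// restoreArgs builds the iptables-restore argument list. The wait flag is
// derived from waitSeconds instead of being written as a literal string.
func restoreArgs(waitSeconds int, extra ...string) []string {
	args := []string{"-w" + strconv.Itoa(waitSeconds)}
	return append(args, extra...)
}

// logLine formats the log message from the same slice that would be handed
// to the command, so the log cannot drift from what is actually run.
func logLine(args []string) string {
	return fmt.Sprintf("running iptables-restore [%s]", strings.Join(args, " "))
}

func main() {
	args := restoreArgs(restoreWaitSeconds, "--noflush", "--counters")
	fmt.Println(logLine(args))
	// prints: running iptables-restore [-w5 --noflush --counters]
}
```

Changing the timeout then means changing one constant, and the logged arguments follow automatically.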