Bug 1795416
Summary: | iptables sync sometimes taking too long | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Hugo Cisneiros (Eitch) <hcisneir> |
Component: | Networking | Assignee: | Jacob Tanenbaum <jtanenba> |
Networking sub component: | openshift-sdn | QA Contact: | zhaozhanqi <zzhao> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | urgent | ||
Priority: | urgent | CC: | aconstan, aivaras.laimikis, alchan, anbhat, andbartl, apurty, bbennett, bfurtado, ckoep, danw, dyocum, erich, fpan, gbravi, jcrumple, jdesousa, jnordell, jtanenba, lstanton, openshift-bugs-escalate, osousa, rkhan, rsandu, sandeep.agarwal2, sdodson |
Version: | 3.11.0 | ||
Target Milestone: | --- | ||
Target Release: | 3.11.z | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-09-16 07:46:49 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Hugo Cisneiros (Eitch)
2020-01-27 22:03:41 UTC
> Tried to use the latest sdn image version (registry.redhat.io/openshift3/ose-node:v3.11.161) and it got worse:

You can't just mix and match pieces.

> iptablesSyncPeriod: 1m
> iptables-min-sync-period: 90s

Contrary to what some docs say, you really shouldn't change min-sync-period, and you can freely raise iptablesSyncPeriod arbitrarily high. Try bumping it to "1h" and see if that helps.

v3.11.59 is quite old. There have been many performance fixes since then. If a much higher iptablesSyncPeriod doesn't fix things, I would recommend that the customer upgrade to a more recent release.

*** Bug 1801744 has been marked as a duplicate of this bug. ***

@mike, could you help reproduce this issue and verify this bug?

*** Bug 1835440 has been marked as a duplicate of this bug. ***

We have been observing this issue with OSE v3.11.88 as well. We even updated the nodes to 3.11.200, but the issue is still the same.

@zhanqi I don't think this bug requires a scalability environment to verify, just a cluster with hundreds of services. We will not have any large-scale 3.11 cluster again.

@anbhat Any advice on how to verify this bug?

@jtanenba Could you give some advice on verifying this performance issue? Do we need a cluster with many nodes or hundreds of services?

I tried to come up with a way to test the 4.3 backport, but it didn't work: https://bugzilla.redhat.com/show_bug.cgi?id=1801737#c4. It's possible that that test _would_ cause problems with un-patched 3.11, though, because the 3.11 code had more performance problems to begin with than the original 4.3 code. It's less important to test that the performance problems are fixed, and more important to make sure that the PR didn't break any other service / kube-proxy / iptables functionality. I assume plenty of tests related to that will get run as part of the 3.11.z release process.

Thanks, Dan. Regression testing of service-related functionality on version 3.11.286 found no issues. Moving this to VERIFIED.
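The tuning advice above can be sketched against a 3.11 node-config.yaml. This is a hedged illustration, not a verified configuration: the exact file path and surrounding fields vary by cluster, and only the two settings discussed in the thread are shown.

```yaml
# /etc/origin/node/node-config.yaml (excerpt; other fields omitted)
# Raise the periodic full-resync interval. Per the comment above, an
# arbitrarily high value is considered safe here because Service and
# Endpoints changes still trigger syncs on their own.
iptablesSyncPeriod: 1h

# Do NOT override iptables-min-sync-period (shown commented out only
# to illustrate what to avoid); leave it at its default.
# proxyArguments:
#   iptables-min-sync-period:
#   - "90s"
```

After editing the file, the node service (atomic-openshift-node in 3.11) must be restarted for the change to take effect.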
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 3.11.286 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3695

*** Bug 1932651 has been marked as a duplicate of this bug. ***