Bug 1372824
Summary: | Sporadic failures connecting to the cluster registry using the service IP | ||
---|---|---|---|
Product: | OpenShift Online | Reporter: | Thomas Wiest <twiest> |
Component: | Networking | Assignee: | Dan Williams <dcbw> |
Status: | CLOSED DUPLICATE | QA Contact: | Meng Bo <bmeng> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 3.x | CC: | agrimm, aloughla, anli, aos-bugs, bbennett, dcbw, eparis, sspeiche, sten, sukulkar, tstclair, twiest, wcohen |
Last Closed: | 2016-09-15 22:08:02 UTC | Type: | Bug |
Bug Blocks: | 1303130 |
Description
Thomas Wiest
2016-09-02 19:24:28 UTC
On 29 Aug, we changed the openshift-node iptablesSyncPeriod from 5s (the shipped default) to 300s. Since that change, we can't reliably reproduce this issue.

Sounds like a combination of:
https://bugzilla.redhat.com/show_bug.cgi?id=1367199
https://bugzilla.redhat.com/show_bug.cgi?id=1362661

It would be worthwhile to see how iptables-restore is spending that time, to determine whether there are hotspots in the code and whether something in the iptables-restore process could be made more efficient. Flame graphs (http://www.brendangregg.com/flamegraphs.html) render the stack backtraces from "perf record" data as a diagram showing which functions, and which of their child functions, the processor is spending time in. To get user-space function names in the analysis, use the following command to install the associated debuginfo for iptables-restore before running the experiments:

# debuginfo-install iptables

I'm not sure what else OpenShift networking can do here right now, given that we have a fix to decrease the contention (installer defaults in 1367199) and there are issues in the kernel too (1362661). Should I dupe this issue to one of those, re-assign to iptables, or close?

IMHO this is a dupe, so let's close this one.

*** This bug has been marked as a duplicate of bug 1362661 ***
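
For anyone following up on the profiling suggestion in the comments above, here is a minimal sketch of how the "perf record" data could be turned into a flame graph. It is not taken from this bug. It assumes perf is installed, that the FlameGraph scripts (stackcollapse-perf.pl and flamegraph.pl from Brendan Gregg's FlameGraph repository) are checked out under a directory named by the hypothetical FLAMEGRAPH_DIR variable, and that it is run as root. The function name, the 60-second default, and the output file names are illustrative only.

```shell
#!/bin/sh
# Sketch only: sample all CPUs for a while, then render a flame graph
# restricted to iptables-restore stacks.
# Assumption: FLAMEGRAPH_DIR points at a checkout of the FlameGraph scripts.
FLAMEGRAPH_DIR="${FLAMEGRAPH_DIR:-./FlameGraph}"

capture_flamegraph() {
    duration="${1:-60}"                  # seconds to sample (illustrative default)
    out="${2:-iptables-restore.svg}"     # output SVG (illustrative name)

    # iptables-restore is short-lived, so sample system-wide (-a) with call
    # graphs (-g) at 99 Hz instead of attaching to a single PID.
    perf record -F 99 -a -g -o perf.data -- sleep "$duration" || return 1

    # Fold the stacks, keep those whose process name starts with
    # "iptables-rest" (the kernel truncates long process names, so a prefix
    # match is used), and render the result as an SVG.
    perf script -i perf.data \
        | "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" \
        | grep '^iptables-rest' \
        | "$FLAMEGRAPH_DIR/flamegraph.pl" > "$out"
}

# Example invocation (run as root on an affected node):
#   capture_flamegraph 60 iptables-restore.svg
```

Sampling for longer than one iptablesSyncPeriod should make it more likely that at least one iptables-restore run is captured. With the debuginfo package installed as described above, the user-space frames should show function names rather than raw addresses.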