| Summary: | Latency on iptables rules update after atomic-openshift-node service restart | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Miheer Salunke <misalunk> |
| Component: | Networking | Assignee: | Dan Winship <danw> |
| Status: | CLOSED ERRATA | QA Contact: | Meng Bo <bmeng> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 3.2.1 | CC: | aos-bugs, bbennett, eparis, hongli, ndordet, tdawson |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 3.2.1 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
Cause: OpenShift nodes initialized some of their data structures incorrectly at startup.
Consequence: After restarting a node, pods on that node would be unable to access some service IP addresses until a change was made to that service or a resync occurred.
Fix: The buggy initialization code was fixed.
Result: All services should be accessible as expected after restarting a node.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-04-04 14:28:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
|
Description
Miheer Salunke
2016-09-28 20:23:49 UTC
Ok, so the problem is at: https://github.com/openshift/origin/blob/v1.2.0/Godeps/_workspace/src/github.com/openshift/openshift-sdn/plugins/osdn/registry.go#L133

We need to point to the list item itself rather than to the loop variable (pod). Otherwise we are just pointing to that one variable, and we end up using the same pointer for all items.

A little more info:
- This will happen any time the atomic-openshift-node software is restarted
- It will self-correct after 5-10 minutes when the data structures refresh; only the initialization at startup is incorrect
- This is resolved in 3.3 because the way all of this is tracked was completely redone

Miheer: Can you open a new bug for the new issue they are seeing with 3.3? It is different from the one they originally hit (on 3.2).

Dropping the priority since it self-corrects and is fixed in 3.3.

This is fixed in 3.3. There is a PR ready for 3.2, but a merge was rejected because the urgency seemed low.

@Ben Sir: Opened https://bugzilla.redhat.com/show_bug.cgi?id=1389451

Re-opened.

Pull request for fix: https://github.com/openshift/ose/pull/641

Verified in OCP 3.2.1.28; the issue has been fixed. After the atomic-openshift-node service is restarted, the iptables rules (KUBE-SERVICES chain) are correct within about 15 seconds.
[root@host-8-175-119 ~]# openshift version
openshift v3.2.1.28
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5
[root@host-8-175-119 ~]#
[root@host-8-175-119 ~]# systemctl restart atomic-openshift-node
[root@host-8-175-119 ~]#
[root@host-8-175-119 ~]# iptables -L KUBE-SERVICES
Chain KUBE-SERVICES (1 references)
target prot opt source destination
REJECT tcp -- anywhere 172.30.147.75 /* install-test/cakephp-mysql-example:web has no endpoints */ tcp dpt:webcache reject-with icmp-port-unreachable
[root@host-8-175-119 ~]#
[root@host-8-175-119 ~]#
[root@host-8-175-119 ~]#
[root@host-8-175-119 ~]# iptables -L KUBE-SERVICES
Chain KUBE-SERVICES (1 references)
target prot opt source destination
[root@host-8-175-119 ~]#

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0865 |
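
For readers unfamiliar with the pointer problem described in the first comment, here is a minimal Go sketch of that class of bug. It is not the code from registry.go: the Pod type and the collectBuggy and collectFixed functions are hypothetical names made up for illustration. Before Go 1.22, a "for _, pod := range pods" loop reused one variable across iterations, so taking its address gave every entry the same pointer. The sketch declares that shared variable explicitly so it behaves the same way on any Go version.

```go
package main

import "fmt"

// Pod is a hypothetical stand-in for the objects a node lists at startup.
type Pod struct {
	Name string
	IP   string
}

// collectBuggy stores the address of one shared variable for every item.
// Go versions before 1.22 behaved this way for "for _, pod := range pods";
// the variable is declared outside the loop here to make the sharing explicit.
func collectBuggy(pods []Pod) []*Pod {
	var out []*Pod
	var pod Pod // single variable reused by every iteration
	for i := range pods {
		pod = pods[i]
		out = append(out, &pod) // always the same address
	}
	return out
}

// collectFixed points at the list item itself, so each entry is distinct.
func collectFixed(pods []Pod) []*Pod {
	out := make([]*Pod, 0, len(pods))
	for i := range pods {
		out = append(out, &pods[i])
	}
	return out
}

func main() {
	pods := []Pod{
		{Name: "web-1", IP: "10.1.0.2"},
		{Name: "db-1", IP: "10.1.0.3"},
	}
	for _, p := range collectBuggy(pods) {
		fmt.Println("buggy:", p.Name, p.IP)
	}
	for _, p := range collectFixed(pods) {
		fmt.Println("fixed:", p.Name, p.IP)
	}
}
```

In the buggy variant every stored pointer refers to the last item copied into the shared variable, which is consistent with the symptom reported here: state built at startup is wrong until a later refresh rebuilds it.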