Bug 1365176
| Summary: | Duplicate addresses shown under oc describe endpoints when configure ipfailover | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Weibin Liang <weliang> |
| Component: | Networking | Assignee: | Maru Newby <mnewby> |
| Status: | CLOSED ERRATA | QA Contact: | Meng Bo <bmeng> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 3.2.0 | CC: | aos-bugs, bbennett, hongli, mnewby, tdawson |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
Cause: When ipfailover was configured for the router, keepalived pods were being labeled with the selector of the router service.
Consequence: The router service was selecting both router pods and keepalived pods. Since both types of pods use host networking by default, their IP addresses would be the same if deployed to the same hosts and the service would appear to be selecting duplicate endpoints.
Fix: The keepalived pods are now given a label that is distinct from that applied to the router pods.
Result: The router service no longer displays duplicate IP addresses when ipfailover is configured.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2016-09-27 09:42:48 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
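The cause given in the Doc Text above can be illustrated with a small model of label-selector matching. This is a sketch under stated assumptions, not the product's implementation: the placement of the router pods on 10.18.41.62 and 10.18.41.70 is inferred from the duplicated addresses in the report, and the post-fix keepalived label (`ipfailover=ipf-har`) is a hypothetical stand-in for whatever distinct label the fix applies.

```python
# Illustrative model only: host placement and the post-fix label are
# assumptions, not values taken from the product.
SERVICE_SELECTOR = {"infra": "ha-router"}

# Assumed placement of the 2 router pods (inferred from the duplicated addresses).
ROUTER_HOSTS = ["10.18.41.62", "10.18.41.70"]
# The 4 keepalived pods, one per schedulable node.
KEEPALIVED_HOSTS = ["10.18.41.142", "10.18.41.61", "10.18.41.62", "10.18.41.70"]


def build_pods(keepalived_labels):
    """Build host-networked pods; each pod's IP is the IP of its node."""
    pods = [
        {"kind": "router", "ip": ip, "labels": {"infra": "ha-router"}}
        for ip in ROUTER_HOSTS
    ]
    pods += [
        {"kind": "keepalived", "ip": ip, "labels": dict(keepalived_labels)}
        for ip in KEEPALIVED_HOSTS
    ]
    return pods


def matches(labels, selector):
    """True when the labels contain every key/value pair of the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


def endpoint_addresses(pods, selector):
    """Collect the IPs of all selected pods, duplicates included."""
    return sorted(pod["ip"] for pod in pods if matches(pod["labels"], selector))


# Before the fix: keepalived pods carry the router service's selector label.
before = endpoint_addresses(build_pods({"infra": "ha-router"}), SERVICE_SELECTOR)
# After the fix: keepalived pods carry a distinct (here hypothetical) label.
after = endpoint_addresses(build_pods({"ipfailover": "ipf-har"}), SERVICE_SELECTOR)

print(",".join(before))
print(",".join(after))
```

In this model the service selects six addresses while the keepalived pods share the selector label, and only the two router hosts once the labels differ.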
The keepalived pods are being labeled with the value supplied as the selector, which means they are selected by the router service (infra=ha-router in this case) for inclusion in the endpoints that it is targeting. I can't see a good reason for the keepalived pods being labeled in this way.

Submitted a fix on github.

PR merged to Origin.

This has been merged into ose and is in OSE v3.3.0.22 or newer.

Checked on ose v3.3.0.23.
Issue has been fixed.
[root@ose-master ~]# oc get po
NAME READY STATUS RESTARTS AGE
ha-router-1-ocaop 1/1 Running 0 2m
ha-router-1-xnd6p 1/1 Running 0 2m
ipf-red-1-awwg4 1/1 Running 0 1m
ipf-red-1-cf5u9 1/1 Running 0 1m
[root@ose-master ~]# oc get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ha-router 172.30.212.199 <none> 80/TCP,443/TCP,1936/TCP 2m
kubernetes 172.30.0.1 <none> 443/TCP,53/UDP,53/TCP 3m
[root@ose-master ~]# oc get endpoints
NAME ENDPOINTS AGE
ha-router 10.66.140.165:443,10.66.141.94:443,10.66.140.165:1936 + 3 more... 2m
kubernetes 10.66.140.11:8443,10.66.140.11:8053,10.66.140.11:8053 3m
[root@ose-master ~]# oc describe endpoints ha-router
Name: ha-router
Namespace: default
Labels: router=ha-router
Subsets:
Addresses: 10.66.140.165,10.66.141.94
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
443-tcp 443 TCP
1936-tcp 1936 TCP
80-tcp 80 TCP
No events.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2016:1933
Description of problem:
Two duplicate addresses (10.18.41.62 and 10.18.41.70) are shown under oc describe endpoints when ipfailover is configured.
Addresses: 10.18.41.142,10.18.41.61,10.18.41.62,10.18.41.62,10.18.41.70,10.18.41.70

Version-Release number of selected component (if applicable):
[root@dhcp-41-74 ~]# oc version
oc v3.2.1.9-1-g2265530
kubernetes v1.2.0-36-g4a3f9c5
[root@dhcp-41-74 ~]# cat /etc/system-release
Red Hat Enterprise Linux Server release 7.2 (Maipo)

How reproducible:
Easy to reproduce; just follow the steps below.

Steps to Reproduce:
oc label nodes dhcp-41-142.bos.redhat.com "infra=ha-router"
oc label nodes dhcp-41-70.bos.redhat.com "infra=ha-router"
oc label nodes dhcp-41-62.bos.redhat.com "infra=ha-router"
oc label nodes dhcp-41-61.bos.redhat.com "infra=ha-router"
oc label nodes dhcp-41-74.bos.redhat.com "infra=ha-router"
oc get nodes --selector="infra=ha-router"
oc delete project pro-ipfailover
oc project default
sleep 20
oc new-project pro-ipfailover
oc create serviceaccount harp -n pro-ipfailover
oadm policy add-scc-to-user privileged system:serviceaccount:pro-ipfailover:harp
oadm router ha-router --replicas=2 --selector="infra=ha-router" --labels="infra=ha-router" \
    --service-account=harp
[root@dhcp-41-74 ~]# oc get nodes
NAME STATUS AGE
dhcp-41-142.bos.redhat.com Ready 2d
dhcp-41-61.bos.redhat.com Ready 2d
dhcp-41-62.bos.redhat.com Ready 2d
dhcp-41-70.bos.redhat.com Ready 2d
dhcp-41-74.bos.redhat.com Ready,SchedulingDisabled 2d
oadm ipfailover ipf-har --replicas=4 --watch-port=80 --selector="infra=ha-router" \
    --virtual-ips="10.245.2.201-205" --credentials=/etc/origin/master/openshift-router.kubeconfig --service-account=harp --create
[root@dhcp-41-74 ~]# oc get pods
NAME READY STATUS RESTARTS AGE
ha-router-1-uf6vo 1/1 Running 0 6m
ha-router-1-vmdf4 1/1 Running 0 6m
ipf-har-1-8eb5y 1/1 Running 0 1m
ipf-har-1-h05i1 1/1 Running 0 1m
ipf-har-1-hn6kh 1/1 Running 0 1m
ipf-har-1-x7qn2 1/1 Running 0 1m
[root@dhcp-41-74 ~]# oc get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ha-router 172.30.135.52 <none> 80/TCP,443/TCP,1936/TCP 7m
[root@dhcp-41-74 ~]# oc get endpoints
NAME ENDPOINTS AGE
ha-router 10.18.41.142:80,10.18.41.61:80,10.18.41.62:80 + 15 more... 7m
[root@dhcp-41-74 ~]# oc describe endpoints ha-router
Name: ha-router
Namespace: pro-ipfailover
Labels: infra=ha-router
Subsets:
Addresses: 10.18.41.142,10.18.41.61,10.18.41.62,10.18.41.62,10.18.41.70,10.18.41.70
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
80-tcp 80 TCP
443-tcp 443 TCP
1936-tcp 1936 TCP
No events.

Actual results:
Addresses: 10.18.41.142,10.18.41.61,10.18.41.62,10.18.41.62,10.18.41.70,10.18.41.70

Expected results:
Addresses: 10.18.41.142,10.18.41.61,10.18.41.62,10.18.41.70

Additional info:
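The comparison of actual and expected results above can also be checked mechanically. The following is a minimal sketch, assuming the Addresses value has been copied out of the oc describe endpoints output as a comma-separated string; `duplicate_addresses` is a hypothetical helper name, not part of any OpenShift tooling.

```python
from collections import Counter


def duplicate_addresses(addresses_field):
    """Return the addresses that occur more than once in a comma-separated list."""
    addresses = [item.strip() for item in addresses_field.split(",") if item.strip()]
    counts = Counter(addresses)
    return sorted(address for address, count in counts.items() if count > 1)


# Values copied from the "Actual results" and "Expected results" fields.
actual = "10.18.41.142,10.18.41.61,10.18.41.62,10.18.41.62,10.18.41.70,10.18.41.70"
expected = "10.18.41.142,10.18.41.61,10.18.41.62,10.18.41.70"

print(duplicate_addresses(actual))    # the two duplicated addresses
print(duplicate_addresses(expected))  # empty list: no duplicates
```

An empty result for a given Addresses line indicates that no address is listed twice.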