Bug 1595291
| Summary: | [Backport 3.7] Egress Router HTTP Proxy cannot reach the node which router pod runs | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Birol Bilgin <bbilgin> |
| Component: | Networking | Assignee: | Dan Winship <danw> |
| Status: | CLOSED ERRATA | QA Contact: | Meng Bo <bmeng> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 3.7.0 | CC: | aos-bugs, bbennett, bmeng, cdc, dmoessne, pasik, zzhao |
| Target Milestone: | --- | | |
| Target Release: | 3.7.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Story Points: | --- | | |
| Clone Of: | | | |
| : | 1698136 | Environment: | |
| Last Closed: | 2019-06-11 19:50:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Doc Text:
Cause: The way that egress routers are set up made it impossible for an egress router pod to connect to the public IP address of the node it was hosted on.
Consequence: If an egress pod was configured to use its node as a name server via /etc/resolv.conf, it would be unable to do DNS resolution.
Fix: Traffic from an egress router pod to its node is now routed via the SDN tunnel instead of trying to send it via the egress interface.
Result: Egress routers can now connect to their node's IP, and egress router DNS should always work, regardless of configuration.

Description
Birol Bilgin
2018-06-26 14:00:54 UTC
Tested on ocp v3.7.108 and egress router image ose-egress-router:v3.7.108 d5b8e14f9ec6

The issue is not fixed. The egress router pod cannot reach the host's eth0 IP and cannot reach the local dnsmasq service.

The route of the egress router pod:

    # ip r
    default via 10.66.141.254 dev macvlan0
    10.66.140.97 via 10.128.0.1 dev eth0
    10.66.141.254 dev macvlan0 scope link
    10.128.0.0/23 dev eth0 proto kernel scope link src 10.128.0.152
    10.128.0.0/14 dev eth0
    172.30.0.0/16 via 10.128.0.1 dev eth0
    224.0.0.0/4 dev eth0

    # ping 10.66.140.97
    PING 10.66.140.97 (10.66.140.97) 56(84) bytes of data.
    From 10.66.140.200 icmp_seq=1 Destination Host Unreachable
    From 10.66.140.200 icmp_seq=2 Destination Host Unreachable
    From 10.66.140.200 icmp_seq=3 Destination Host Unreachable
    From 10.66.140.200 icmp_seq=4 Destination Host Unreachable

    # iptables-save
    # Generated by iptables-save v1.4.21 on Mon Apr 1 15:45:53 2019
    *filter
    :INPUT ACCEPT [24:1937]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [20:1878]
    COMMIT
    # Completed on Mon Apr 1 15:45:53 2019
    # Generated by iptables-save v1.4.21 on Mon Apr 1 15:45:53 2019
    *nat
    :PREROUTING ACCEPT [15:1639]
    :INPUT ACCEPT [0:0]
    :OUTPUT ACCEPT [3:252]
    :POSTROUTING ACCEPT [3:252]
    -A PREROUTING -i eth0 -j DNAT --to-destination 61.135.218.25
    -A POSTROUTING -j SNAT --to-source 10.66.140.200
    COMMIT
    # Completed on Mon Apr 1 15:45:53 2019

Note: 10.66.140.97 is the node IP.

It works for me... can I get access to this cluster? (Or another cluster demonstrating the bug.)

Alternatively, can you get "iptables-save -c" (at the node level, not inside the router pod) and OVS dump-flows output, both before and after trying the ping test in the pod?

Created attachment 1553794: iptables_and_openflow

The attachment contains the requested dumps. I will also send you a separate mail with the cluster info if you'd like to have a look at it.
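
For anyone re-checking the routing side, here is a minimal sketch that does a longest-prefix lookup over the `ip r` output quoted in the description and reports which device the pod would pick for a given destination. The helper names (`parse_routes`, `device_for`) and the `DESCRIPTION_ROUTES` constant are made up for illustration. It only models a single routing table and ignores metrics and policy routing, so treat it as a rough aid rather than a replacement for checking on the pod itself.

```python
import ipaddress

# Route table copied from the description (egress router pod on v3.7.108).
DESCRIPTION_ROUTES = """\
default via 10.66.141.254 dev macvlan0
10.66.140.97 via 10.128.0.1 dev eth0
10.66.141.254 dev macvlan0 scope link
10.128.0.0/23 dev eth0 proto kernel scope link src 10.128.0.152
10.128.0.0/14 dev eth0
172.30.0.0/16 via 10.128.0.1 dev eth0
224.0.0.0/4 dev eth0
"""


def parse_routes(ip_route_output):
    """Return (network, device) pairs from `ip route` style text."""
    routes = []
    for line in ip_route_output.strip().splitlines():
        fields = line.split()
        if not fields or "dev" not in fields:
            continue
        # "default" is shorthand for the 0.0.0.0/0 catch-all route.
        dest = "0.0.0.0/0" if fields[0] == "default" else fields[0]
        # A bare address such as 10.66.140.97 is parsed as a /32 host route.
        network = ipaddress.ip_network(dest, strict=False)
        device = fields[fields.index("dev") + 1]
        routes.append((network, device))
    return routes


def device_for(ip_route_output, target):
    """Pick the device of the most specific route that contains target."""
    addr = ipaddress.ip_address(target)
    matches = [(net, dev) for net, dev in parse_routes(ip_route_output) if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


if __name__ == "__main__":
    # Node IP from the description.
    print(device_for(DESCRIPTION_ROUTES, "10.66.140.97"))
    # DNAT destination from the description, reached through the default route.
    print(device_for(DESCRIPTION_ROUTES, "61.135.218.25"))
```

With the table from the description, the node IP 10.66.140.97 resolves to eth0 via its host route, while the DNAT destination falls through to the default route on macvlan0.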
> Tested on ocp v3.7.108 and egress router image ose-egress-router:v3.7.108 d5b8e14f9ec6
Oh, right; it's not fixed for you because you're using the 3.7 egress-router image whereas I was using the :latest one. We need to backport a fix to the egress-router itself.
So FTR, note that RHBA-2019:0617 does contain the openshift-sdn side of this bugfix; it's just missing the fixed egress-router image. But if your egress routers use the "openshift/origin-egress-router" (:latest) image rather than "ose-egress-router", then it will work.

OK, https://github.com/openshift/ose/pull/1520 contains the rest of the fix.

Verified this bug on v3.7.118: the egress router pod can access the node on which it is located.

    sh-4.2# ip route
    default via 10.0.77.254 dev macvlan0
    10.0.76.163 via 10.129.0.1 dev eth0
    10.0.77.254 dev macvlan0 scope link

    sh-4.2# ping 10.0.76.163
    PING 10.0.76.163 (10.0.76.163) 56(84) bytes of data.
    64 bytes from 10.0.76.163: icmp_seq=1 ttl=64 time=0.402 ms
    64 bytes from 10.0.76.163: icmp_seq=2 ttl=64 time=0.113 m

    sh-4.2# iptables-save | grep POSTROUTING
    :POSTROUTING ACCEPT [2:168]
    -A POSTROUTING -o macvlan0 -j SNAT --to-source 10.0.76.100

Note: 10.0.76.100 is the egress IP.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1302
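
Comparing the two iptables-save outputs quoted in this report, the visible difference in the nat table is that the v3.7.118 SNAT rule is limited to `-o macvlan0`, while the v3.7.108 rule matches every outgoing interface. Whether that is the whole of the image-side fix is not stated in this report. The sketch below just makes the difference easy to spot when re-verifying. The function name `snat_rules` and the two constants are hypothetical, and the parsing is a plain whitespace split that only understands the rule shapes shown here.

```python
# POSTROUTING rules copied from the two iptables-save outputs in this report.
RULE_3_7_108 = "-A POSTROUTING -j SNAT --to-source 10.66.140.200"
RULE_3_7_118 = "-A POSTROUTING -o macvlan0 -j SNAT --to-source 10.0.76.100"


def snat_rules(iptables_save_output):
    """List POSTROUTING SNAT rules with their output interface and source IP."""
    rules = []
    for line in iptables_save_output.splitlines():
        tokens = line.split()
        # Skip chain policies, comments, and rules for other chains or targets.
        if tokens[:2] != ["-A", "POSTROUTING"] or "SNAT" not in tokens:
            continue
        out_iface = tokens[tokens.index("-o") + 1] if "-o" in tokens else None
        to_source = (
            tokens[tokens.index("--to-source") + 1]
            if "--to-source" in tokens
            else None
        )
        rules.append({"out_iface": out_iface, "to_source": to_source})
    return rules


def unscoped_snat(iptables_save_output):
    """Return SNAT rules that are not restricted to one outgoing interface."""
    return [r for r in snat_rules(iptables_save_output) if r["out_iface"] is None]


if __name__ == "__main__":
    for label, text in (("v3.7.108", RULE_3_7_108), ("v3.7.118", RULE_3_7_118)):
        print(label, snat_rules(text), "unscoped:", len(unscoped_snat(text)))
```

Feeding it the full output of `iptables-save` from inside the egress router pod works the same way, since every line that is not a POSTROUTING SNAT rule is skipped.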