Bug 1626387
| Summary: | [OVN] container cannot access the dns server | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | zhaozhanqi <zzhao> |
| Component: | Networking | Assignee: | Casey Callendrello <cdc> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | zhaozhanqi <zzhao> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.11.0 | CC: | aos-bugs, bbennett, weliang, wmeng |
| Target Milestone: | --- | | |
| Target Release: | 4.2.0 | | |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-06-18 15:24:43 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description (zhaozhanqi, 2018-09-07 08:12:43 UTC)
Recreated test on my 3 node cluster.

```shell
# oc get no
NAME                                        STATUS    ROLES          AGE   VERSION
wsfd-netdev22.ntdv.lab.eng.bos.redhat.com   Ready     infra,master   1d    v1.11.0+d4cacc0
wsfd-netdev28.ntdv.lab.eng.bos.redhat.com   Ready     compute        1d    v1.11.0+d4cacc0
wsfd-netdev35.ntdv.lab.eng.bos.redhat.com   Ready     compute        1d    v1.11.0+d4cacc0
# oc new-project test1
# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/list_for_pods.json
replicationcontroller/test-rc created
service/test-service created
# oc get all
NAME                READY   STATUS    RESTARTS   AGE
pod/test-rc-992wn   1/1     Running   0          29s
pod/test-rc-cg4t9   1/1     Running   0          29s
NAME                            DESIRED   CURRENT   READY   AGE
replicationcontroller/test-rc   2         2         2       29s
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
service/test-service   ClusterIP   172.30.219.98   <none>        27017/TCP   29s
# oc get po -o wide
NAME            READY   STATUS    RESTARTS   AGE   IP           NODE                                        NOMINATED NODE
test-rc-992wn   1/1     Running   0          1m    10.128.2.5   wsfd-netdev35.ntdv.lab.eng.bos.redhat.com   <none>
test-rc-cg4t9   1/1     Running   0          1m    10.128.1.4   wsfd-netdev28.ntdv.lab.eng.bos.redhat.com   <none>
# curl 10.128.2.5:8080
Hello OpenShift!
# ping 10.128.2.5
PING 10.128.2.5 (10.128.2.5) 56(84) bytes of data.
64 bytes from 10.128.2.5: icmp_seq=1 ttl=63 time=2.18 ms
...
# oc rsh test-rc-992wn
/ $ curl 10.128.1.4:8080
Hello OpenShift!
/ $ ping 10.128.1.4
PING 10.128.1.4 (10.128.1.4) 56(84) bytes of data.
64 bytes from 10.128.1.4: icmp_seq=1 ttl=63 time=2.48 ms
...
/ $ ping www.google.com
PING www.google.com (172.217.15.68) 56(84) bytes of data.
64 bytes from iad23s63-in-f4.1e100.net (172.217.15.68): icmp_seq=1 ttl=41 time=38.6 ms
...
/ $ ping kubernetes.default.svc
ping: unknown host kubernetes.default.svc
/ $ exit
# curl 172.30.219.98:27017
Hello OpenShift!
```

There is no route for kubernetes.default.svc.

This is working on my cluster. Spent some time with Weibin. There are significant differences between his clusters on AWS and my lab cluster.
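The failing lookup hinges on search-domain expansion: `kubernetes.default.svc` is a short name that the pod's resolver can only complete by appending the search domains from `/etc/resolv.conf` and querying the cluster DNS server. A minimal sketch of that expansion, where the search list is an assumption (a typical OpenShift 3.11 pod in namespace `test1`), not taken from this bug's environment:

```shell
#!/bin/sh
# Illustrative only: simulate the resolver's search-domain expansion for
# the short name the pod tried to ping. The search list is assumed, not
# captured from the bug's cluster.
name="kubernetes.default.svc"
search="test1.svc.cluster.local svc.cluster.local cluster.local"

candidates=""
for domain in $search; do
  fqdn="${name}.${domain}"
  candidates="$candidates $fqdn"
  echo "resolver tries: $fqdn"
done
# Each candidate is sent to the nameserver listed in /etc/resolv.conf;
# when no cluster DNS server is reachable, every query fails, so ping
# reports "unknown host kubernetes.default.svc".
```

This is why external names (www.google.com) still resolve on a broken cluster while service names do not: the external query succeeds through upstream DNS, but the `cluster.local` candidates need the cluster DNS service that OVN never wired up.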
The above problem occurs on his lab cluster when he installs with OVN, but not when he installs SDN. There is a route certificate problem on his cluster and not on mine. He occasionally sees an ovnkube panic; I am not sure whether it has the same cause each time or matches what I have seen. Analysis is in progress; more investigation is needed.

Attached a detailed test log covering a DNS test that passes on an SDN cluster and fails on an OVN cluster.

Created attachment 1487896 [details]
Testing logs

Please ignore the above attachment 1487896 [details]; see the new test logs in the new attachment.
With openshift-sdn (OVS), after creating the new pod/service the node adds iptables rules for port 53:
```shell
[root@ip-172-18-10-166 ec2-user]# iptables-save | grep 53
-A KUBE-SEP-DKYVUOI2CXZAJVDR -p tcp -m comment --comment "default/kubernetes:dns-tcp" -m tcp -j DNAT --to-destination 172.18.7.59:8053
-A KUBE-SEP-KDT7ZLRJZTMDLVLE -p udp -m comment --comment "default/kubernetes:dns" -m udp -j DNAT --to-destination 172.18.7.59:8053
-A KUBE-SERVICES ! -s 10.128.0.0/14 -d 172.30.0.1/32 -p tcp -m comment --comment "default/kubernetes:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 172.30.0.1/32 -p tcp -m comment --comment "default/kubernetes:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-BA6I5HTZKAAAJT56
-A KUBE-SERVICES ! -s 10.128.0.0/14 -d 172.30.126.53/32 -p tcp -m comment --comment "default/router:1936-tcp cluster IP" -m tcp --dport 1936 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 172.30.126.53/32 -p tcp -m comment --comment "default/router:1936-tcp cluster IP" -m tcp --dport 1936 -j KUBE-SVC-4JCRTMMYZAAYMIJ2
-A KUBE-SERVICES ! -s 10.128.0.0/14 -d 172.30.0.1/32 -p udp -m comment --comment "default/kubernetes:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 172.30.0.1/32 -p udp -m comment --comment "default/kubernetes:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-3VQ6B3MLH7E2SZT4
-A KUBE-SERVICES ! -s 10.128.0.0/14 -d 172.30.126.53/32 -p tcp -m comment --comment "default/router:80-tcp cluster IP" -m tcp --dport 80 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 172.30.126.53/32 -p tcp -m comment --comment "default/router:80-tcp cluster IP" -m tcp --dport 80 -j KUBE-SVC-GQKZAHCS5DTMHUQ6
-A KUBE-SERVICES ! -s 10.128.0.0/14 -d 172.30.126.53/32 -p tcp -m comment --comment "default/router:443-tcp cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 172.30.126.53/32 -p tcp -m comment --comment "default/router:443-tcp cluster IP" -m tcp --dport 443 -j KUBE-SVC-IKV43KYNCXS2W7KZ
```
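These rules form kube-proxy's usual chain for service traffic: `KUBE-SERVICES` matches the cluster IP and port and jumps to a per-service `KUBE-SVC-*` chain, which (via an endpoint-selection rule not shown in this capture) jumps to a `KUBE-SEP-*` chain that DNATs to the real backend. A rough sketch of tracing that chain from the captured rules; the parsing is illustrative, not part of any tool:

```shell
#!/bin/sh
# Trace the UDP/53 path for the cluster DNS service using two rules
# quoted from the capture above. KUBE-SERVICES matches the cluster IP,
# jumps to a KUBE-SVC chain, and a KUBE-SEP chain finally DNATs to the
# master's DNS listener on port 8053.
rules='-A KUBE-SERVICES -d 172.30.0.1/32 -p udp -m comment --comment "default/kubernetes:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-3VQ6B3MLH7E2SZT4
-A KUBE-SEP-KDT7ZLRJZTMDLVLE -p udp -m comment --comment "default/kubernetes:dns" -m udp -j DNAT --to-destination 172.18.7.59:8053'

# Which per-service chain handles UDP queries to 172.30.0.1:53?
svc_chain=$(printf '%s\n' "$rules" | grep 'dport 53' | sed 's/.*-j //')
# Where does the endpoint chain ultimately send the packet?
dnat_target=$(printf '%s\n' "$rules" | grep 'DNAT' | sed 's/.*--to-destination //')

echo "service chain: $svc_chain"    # KUBE-SVC-3VQ6B3MLH7E2SZT4
echo "DNAT target:   $dnat_target"  # 172.18.7.59:8053
```

So on an SDN node, a pod's query to the service IP 172.30.0.1:53 is rewritten to the master's DNS server; the OVN capture below shows this rewrite simply does not exist there.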
With OVN, the corresponding port 53 rules are never added:
```shell
[root@ip-172-18-7-211 ec2-user]# iptables-save | grep 53
# Generated by iptables-save v1.4.21 on Tue Oct 16 13:53:49 2018
# Completed on Tue Oct 16 13:53:49 2018
# Generated by iptables-save v1.4.21 on Tue Oct 16 13:53:49 2018
:OUTPUT ACCEPT [5313:492235]
# Completed on Tue Oct 16 13:53:49 2018
```
Created attachment 1494514 [details]
iptables rules from OVS and OVN
Yes, for openshift-sdn the node DNATs all DNS requests to skydns on the master via:

```shell
-A KUBE-SEP-DKYVUOI2CXZAJVDR -p tcp -m comment --comment "default/kubernetes:dns-tcp" -m tcp -j DNAT --to-destination 172.18.7.59:8053
-A KUBE-SEP-KDT7ZLRJZTMDLVLE -p udp -m comment --comment "default/kubernetes:dns" -m udp -j DNAT --to-destination 172.18.7.59:8053
```

For OVN, I am not sure whether it uses the same mechanism.

https://github.com/openvswitch/ovn-kubernetes/pull/456 added an iptables rule to permit traffic from pods to the external network.

Tested the above PR; the container can now ping an outside hostname (yahoo.com).

It seems PR 456 only fixed the public DNS issue. What about internal DNS, e.g. `ping kubernetes.default.svc`?

The PR does only fix the public DNS issue. There is ongoing discussion about whether to support internal cluster DNS in the dev preview and, if we do, which DNS solution to use. OpenShift 4.0 will use CoreDNS.

This is a network edge issue. Please reassign.

This is no longer an issue in 4.x.
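The comparison between the two captures boils down to a single check: does `iptables-save` on the node contain a DNAT rule for the `kubernetes:dns` service? A hedged sketch of that check, fed with canned excerpts from the captures in this bug (on a real node you would pipe in live `iptables-save` output instead):

```shell
#!/bin/sh
# Sketch of the diagnostic the thread converges on: an SDN node has a
# kubernetes:dns DNAT rule, an OVN node (before the fix) does not.
# Input here is canned excerpts from the captures above, not live state.
has_dns_dnat() {
  grep -q 'kubernetes:dns.*-j DNAT'
}

sdn_save='-A KUBE-SEP-KDT7ZLRJZTMDLVLE -p udp -m comment --comment "default/kubernetes:dns" -m udp -j DNAT --to-destination 172.18.7.59:8053'
ovn_save='# Generated by iptables-save v1.4.21
:OUTPUT ACCEPT [5313:492235]'

printf '%s\n' "$sdn_save" | has_dns_dnat && sdn=ok || sdn=missing
printf '%s\n' "$ovn_save" | has_dns_dnat && ovn=ok || ovn=missing

echo "sdn node: $sdn"  # ok
echo "ovn node: $ovn"  # missing
```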