Bug 1568095

Summary: Extra DNS queries sent out when running curl $svc_name from pods
Product: OpenShift Container Platform
Reporter: Weibin Liang <weliang>
Component: Networking
Assignee: Ben Bennett <bbennett>
Status: CLOSED NOTABUG
QA Contact: Meng Bo <bmeng>
Severity: medium
Priority: unspecified
Version: 3.10.0
CC: aos-bugs, weliang
Target Milestone: ---
Target Release: 3.11.0
Hardware: Unspecified
OS: Linux
Last Closed: 2018-05-01 19:36:05 UTC
Type: Bug
Attachments:
Description: dns packets (Flags: none)

Description Weibin Liang 2018-04-16 18:40:54 UTC
Created attachment 1422673
dns packets

Description of problem:
Comparing the DNS query packets sent when running curl $svc_name:8080 from a node and from a pod, the pod sends four more query packets than the node.

Version-Release number of selected component (if applicable):
oc v3.10.0-0.15.0

How reproducible:
Every time

Steps to Reproduce:
## In the Node:
[root@host-172-16-120-10 ~]# curl red-service.p1.svc.cluster.local.:8080
Hello Red Pod-1 Example
[root@host-172-16-120-10 ~]# cat /etc/resolv.conf 
# nameserver updated by /etc/NetworkManager/dispatcher.d/99-origin-dns.sh
# Generated by NetworkManager
search cluster.local openstacklocal
nameserver 172.16.120.10

One type of DNS query is sent out, for red-service.p1.svc.cluster.local.

## In the master:
[root@host-172-16-120-135 dnsmasq.d]# oc get pods -o wide
NAME         READY     STATUS    RESTARTS   AGE       IP            NODE
blue-pod-1   1/1       Running   0          3h        10.129.0.16   172.16.120.10
red-pod-1    1/1       Running   0          3h        10.129.0.17   172.16.120.10
[root@host-172-16-120-135 dnsmasq.d]# oc rsh blue-pod-1 
/ # cat /etc/resolv.conf 
nameserver 172.16.120.10
search p1.svc.cluster.local svc.cluster.local cluster.local openstacklocal
options ndots:5
/ # curl red-service.p1.svc.cluster.local.:8080
Hello Red Pod-1 Example

Five types of DNS queries are sent out, for:
red-service.p1.svc.cluster.local.p1.svc.cluster.local.
red-service.p1.svc.cluster.local.svc.cluster.local.
red-service.p1.svc.cluster.local.cluster.local.
red-service.p1.svc.cluster.local.openstacklocal.
red-service.p1.svc.cluster.local.

Actual results:
Five types of DNS queries are sent out from the pod.

Expected results:
Only one DNS query is sent out, for red-service.p1.svc.cluster.local.
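
The gap between the actual and expected results can be reproduced with a small model of search-list expansion. The Python sketch below is illustrative only: `candidate_queries` and its `honor_trailing_dot` flag are hypothetical names, not part of any resolver library, and the ordering rule (search suffixes first when the name has fewer than `ndots` dots) is the one described in resolv.conf(5). With the trailing dot honored, the model yields the single expected query; with it ignored, it yields the five names listed above.

```python
def candidate_queries(name, search, ndots=1, honor_trailing_dot=True):
    """Return the names a stub resolver would try, in order (simplified model)."""
    bare = name.rstrip(".")
    absolute = bare + "."
    if honor_trailing_dot and name.endswith("."):
        # A trailing dot marks the name as absolute: the search list is skipped.
        return [absolute]
    expanded = [f"{bare}.{domain}." for domain in search]
    if bare.count(".") >= ndots:
        # Enough dots: try the name as-is first, then the search list.
        return [absolute] + expanded
    # Too few dots: walk the search list first, then try the name as-is.
    return expanded + [absolute]


# Values taken from the pod's /etc/resolv.conf in this report.
pod_search = ["p1.svc.cluster.local", "svc.cluster.local",
              "cluster.local", "openstacklocal"]
name = "red-service.p1.svc.cluster.local."

expected = candidate_queries(name, pod_search, ndots=5)
observed = candidate_queries(name, pod_search, ndots=5, honor_trailing_dot=False)
print(expected)
print(observed)
```

A real resolver stops at the first name that resolves, so the model lists the longest possible sequence rather than what is always sent.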

Additional info:
A test log of the captured DNS packets is attached.

Comment 1 Ben Bennett 2018-04-16 19:23:44 UTC
Weibin, can you set:

options ndots:5

in the node's resolv.conf and see if it tries to resolve using the search path?

I am not sure why the trailing . is being ignored.  That should mean that it is an absolute name and the search path should not be used.  But it clearly is... this may be a resolver bug :-(

Comment 2 Weibin Liang 2018-04-16 19:43:02 UTC
Ben, when I disable options ndots:5 in the pod's /etc/resolv.conf, only one type of DNS query is sent out: red-service.p1.svc.cluster.local.

[root@host-172-16-120-135 dnsmasq.d]# oc rsh blue-pod-1 
/ # cat /etc/resolv.conf 
nameserver 172.16.120.10
search p1.svc.cluster.local svc.cluster.local cluster.local openstacklocal
#options ndots:5

I get the same wrong results when I set options ndots:5 in the node's resolv.conf.
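
Commenting out the `options` line returns the resolver to its default of `ndots:1` (per resolv.conf(5)), so a name with four dots is tried as-is before the search list. A minimal Python sketch of how the two pod configurations differ; `parse_resolv_conf` is a hypothetical helper written for this illustration and handles only the directives that appear in this report:

```python
def parse_resolv_conf(text):
    """Parse nameserver, search and options ndots:N from resolv.conf text."""
    conf = {"nameservers": [], "search": [], "ndots": 1}  # ndots defaults to 1
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        keyword, *args = line.split()
        if keyword == "nameserver" and args:
            conf["nameservers"].append(args[0])
        elif keyword == "search":
            conf["search"] = args
        elif keyword == "options":
            for opt in args:
                if opt.startswith("ndots:"):
                    conf["ndots"] = int(opt.split(":", 1)[1])
    return conf


WITH_NDOTS = """\
nameserver 172.16.120.10
search p1.svc.cluster.local svc.cluster.local cluster.local openstacklocal
options ndots:5
"""
WITHOUT_NDOTS = WITH_NDOTS.replace("options ndots:5", "#options ndots:5")

name = "red-service.p1.svc.cluster.local"  # four dots
for label, text in (("ndots:5", WITH_NDOTS), ("commented out", WITHOUT_NDOTS)):
    conf = parse_resolv_conf(text)
    search_first = name.count(".") < conf["ndots"]
    order = "search list first" if search_first else "name as-is first"
    print(f"{label}: ndots={conf['ndots']}, {order}")
```

With the name tried as-is first and resolving immediately, no search-list queries would be needed, which is consistent with the single query reported here.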

Comment 3 Weibin Liang 2018-04-16 20:11:44 UTC
curl returns the same results with and without the trailing dot (local:8080 and local.:8080):


[root@host-172-16-120-135 dnsmasq.d]# oc rsh red-pod-1 curl blue-service.p1.svc.cluster.local:8080
Hello Blue Pod-2 Example
[root@host-172-16-120-135 dnsmasq.d]# oc rsh red-pod-1 curl blue-service.p1.svc.cluster.local.:8080
Hello Blue Pod-2 Example
[root@host-172-16-120-135 dnsmasq.d]#

Comment 4 Ben Bennett 2018-05-01 19:29:34 UTC
Can you try the same test please with:
  getent hosts blue-service.p1.svc.cluster.local

And
  getent hosts blue-service.p1.svc.cluster.local.

And then:
  getent hosts google.com
  getent hosts google.com.

I'm not sure whether curl or the resolver is misbehaving.  I'm pretty sure it is not us.
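
The same comparison can be scripted. The sketch below is a rough Python counterpart to the getent checks: `socket.getaddrinfo` goes through the system resolver, but whether it behaves identically to `getent hosts` in a given image is an assumption. `resolve_ipv4`, `compare_trailing_dot` and the `lookup` parameter are hypothetical names introduced for illustration.

```python
import socket


def resolve_ipv4(name):
    """Return the sorted IPv4 addresses the system resolver gives for name."""
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def compare_trailing_dot(name, lookup=resolve_ipv4):
    """Look up name without and with a trailing dot and return both answers."""
    bare = name.rstrip(".")
    results = {}
    for candidate in (bare, bare + "."):
        try:
            results[candidate] = lookup(candidate)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results[candidate] = f"lookup failed: {exc}"
    return results
```

Calling `compare_trailing_dot("blue-service.p1.svc.cluster.local")` and `compare_trailing_dot("google.com")` from inside the pod would show whether both forms return the same addresses. The script only sees the answers, so the number of queries on the wire still has to be checked separately with a packet capture.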

Comment 5 Weibin Liang 2018-05-01 19:36:05 UTC
[root@weliang-rhel75-master .ssh]# getent ahostsv4 google.com
172.217.13.238  STREAM google.com
172.217.13.238  DGRAM  
172.217.13.238  RAW    
[root@weliang-rhel75-master .ssh]# getent ahostsv4 google.com.
172.217.13.238  STREAM google.com
172.217.13.238  DGRAM  
172.217.13.238  RAW    
[root@weliang-rhel75-master .ssh]# ping google.com
PING google.com (172.217.13.238) 56(84) bytes of data.
64 bytes from iad23s61-in-f14.1e100.net (172.217.13.238): icmp_seq=1 ttl=46 time=33.3 ms
64 bytes from iad23s61-in-f14.1e100.net (172.217.13.238): icmp_seq=2 ttl=46 time=33.3 ms
^C
--- google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 33.329/33.338/33.348/0.182 ms
[root@weliang-rhel75-master .ssh]# ping google.com.
PING google.com (172.217.13.238) 56(84) bytes of data.
64 bytes from iad23s61-in-f14.1e100.net (172.217.13.238): icmp_seq=1 ttl=46 time=33.2 ms
64 bytes from iad23s61-in-f14.1e100.net (172.217.13.238): icmp_seq=2 ttl=46 time=33.3 ms
^C
--- google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 33.219/33.283/33.347/0.064 ms
[root@weliang-rhel75-master .ssh]#