Bug 1577886 - Wrong nameserver in pod on Atomic Host install
Summary: Wrong nameserver in pod on Atomic Host install
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.10.0
Assignee: Scott Dodson
QA Contact: Weihua Meng
URL:
Whiteboard:
Duplicates: 1580981
Depends On:
Blocks:
 
Reported: 2018-05-14 11:30 UTC by Weihua Meng
Modified: 2018-07-30 19:15 UTC
CC: 8 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-30 19:15:23 UTC
Target Upstream Version:
Embargoed:




Links
Red Hat Product Errata RHBA-2018:1816 (last updated 2018-07-30 19:15:53 UTC)

Description Weihua Meng 2018-05-14 11:30:31 UTC
Description of problem:
Pods get the wrong nameserver (0.0.0.0) on an Atomic Host install.
An RPM-based install works fine.

Version-Release number of the following components:
openshift-ansible-3.10.0-0.38.0

How reproducible:
Always

Steps to Reproduce:
1. Install OCP on Atomic Host
2. Check the pod's DNS config

Actual results:
# oc rsh registry-console-1-phxd9
sh-4.2$ cat /etc/resolv.conf 
nameserver 0.0.0.0
search default.svc.cluster.local svc.cluster.local cluster.local openstacklocal
options ndots:5

Expected results:
The nameserver should be the node host's IP.

NOTE: the kubelet was started with --cluster-dns=0.0.0.0:

[root@host-172-16-120-85 ~]# ps -ef | grep cluster-dns
root     19897 19885  2 08:19 ?        00:04:09 /usr/bin/hyperkube kubelet --v=5 --address=0.0.0.0 --allow-privileged=true --anonymous-auth=true --authentication-token-webhook=true --authentication-token-webhook-cache-ttl=5m --authorization-mode=Webhook --authorization-webhook-cache-authorized-ttl=5m --authorization-webhook-cache-unauthorized-ttl=5m --bootstrap-kubeconfig=/etc/origin/node/bootstrap.kubeconfig --cadvisor-port=0 --cert-dir=/etc/origin/node/certificates --cgroup-driver=systemd --client-ca-file=/etc/origin/node/client-ca.crt --cloud-config=/etc/origin/cloudprovider/.conf --cloud-provider= --cluster-dns=0.0.0.0 --cluster-domain=cluster.local --container-runtime-endpoint=/var/run/dockershim.sock --containerized=true --enable-controller-attach-detach=true --experimental-dockershim-root-directory=/var/lib/dockershim --fail-swap-on=false --feature-gates=RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true --file-check-frequency=0s --healthz-bind-address= --healthz-port=0 --host-ipc-sources=api --host-ipc-sources=file --host-network-sources=api --host-network-sources=file --host-pid-sources=api --host-pid-sources=file --hostname-override= --http-check-frequency=0s --image-service-endpoint=/var/run/dockershim.sock --iptables-masquerade-bit=0 --kubeconfig=/etc/origin/node/node.kubeconfig --max-pods=250 --network-plugin=cni --node-ip= --node-labels=node-role.kubernetes.io/master=true --pod-infra-container-image=registry.reg-aws.openshift.com:443/openshift3/ose-pod:v3.10 --pod-manifest-path=/etc/origin/node/pods --port=10250 --read-only-port=0 --register-node=true --root-dir=/var/lib/origin/openshift.local.volumes --rotate-certificates=true --tls-cert-file= --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA --tls-cipher-suites=TLS_RSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_RSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_RSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_RSA_WITH_AES_256_CBC_SHA --tls-min-version=VersionTLS12 --tls-private-key-file=

[root@host-172-16-120-85 ~]# cat /etc/origin/node/node-config.yaml 
kind: NodeConfig
apiVersion: v1
authConfig:
  authenticationCacheSize: 1000
  authenticationCacheTTL: 5m
  authorizationCacheSize: 1000
  authorizationCacheTTL: 5m
dnsBindAddress: "127.0.0.1:53"
dnsDomain: cluster.local
dnsIP: 0.0.0.0
dnsNameservers: null
dnsRecursiveResolvConf: /etc/origin/node/resolv.conf
dockerConfig:
  dockerShimRootDirectory: /var/lib/dockershim
  dockerShimSocket: /var/run/dockershim.sock
  execHandlerName: native
enableUnidling: true
imageConfig:
  format: "registry.reg-aws.openshift.com:443/openshift3/ose-${component}:v3.10"
  latest: false
iptablesSyncPeriod: 30s
kubeletArguments:
  pod-manifest-path:
  - /etc/origin/node/pods
  bootstrap-kubeconfig:
  - /etc/origin/node/bootstrap.kubeconfig
  feature-gates:
  - RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true
  rotate-certificates:
  - "true"
  cert-dir:
  - /etc/origin/node/certificates
  cloud-config:
  - /etc/origin/cloudprovider/.conf
  cloud-provider:
  - 
  node-labels: 
  - "node-role.kubernetes.io/master=true"
  enable-controller-attach-detach:
  - 'true'
masterClientConnectionOverrides:
  acceptContentTypes: application/vnd.kubernetes.protobuf,application/json
  burst: 40
  contentType: application/vnd.kubernetes.protobuf
  qps: 20
masterKubeConfig: node.kubeconfig
networkConfig:
  mtu: 1450
  networkPluginName: redhat/openshift-ovs-subnet
servingInfo:
  bindAddress: 0.0.0.0:10250
  bindNetwork: tcp4
  clientCA: client-ca.crt
volumeConfig:
  localQuota:
    perFSGroup: null
volumeDirectory: /var/lib/origin/openshift.local.volumes

Comment 2 Scott Dodson 2018-05-14 19:36:17 UTC
Looks like `openshift start node --write-flags` is not generating the proper --cluster-dns flag when run inside a system container. There's no meaningful diff between your node-config.yaml and mine, and I get a valid --cluster-dns on an rpm-based install.

# grep dns /etc/origin/node/node-config.yaml 
dnsBindAddress: 127.0.0.1:53
dnsDomain: cluster.local
dnsIP: 0.0.0.0
dnsNameservers: null
dnsRecursiveResolvConf: /etc/origin/node/resolv.conf


# /usr/bin/openshift start node --write-flags --config=/etc/origin/node/node-config.yaml
I0514 15:34:28.457817   31933 feature_gate.go:190] feature gates: map[RotateKubeletClientCertificate:true RotateKubeletServerCertificate:true]
--address=0.0.0.0 --allow-privileged=true --anonymous-auth=true --authentication-token-webhook=true --authentication-token-webhook-cache-ttl=5m --authorization-mode=Webhook --authorization-webhook-cache-authorized-ttl=5m --authorization-webhook-cache-unauthorized-ttl=5m --bootstrap-kubeconfig=/etc/origin/node/bootstrap.kubeconfig --cadvisor-port=0 --cert-dir=/etc/origin/node/certificates --cgroup-driver=systemd --client-ca-file=/etc/origin/node/client-ca.crt --cloud-config=/etc/origin/cloudprovider/.conf --cloud-provider= --cluster-dns=192.168.122.118 --cluster-domain=cluster.local --container-runtime-endpoint=/var/run/dockershim.sock --containerized=false --enable-controller-attach-detach=true --experimental-dockershim-root-directory=/var/lib/dockershim --fail-swap-on=false --feature-gates=RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true --file-check-frequency=0s --healthz-bind-address= --healthz-port=0 --host-ipc-sources=api --host-ipc-sources=file --host-network-sources=api --host-network-sources=file --host-pid-sources=api --host-pid-sources=file --hostname-override= --http-check-frequency=0s --image-service-endpoint=/var/run/dockershim.sock --iptables-masquerade-bit=0 --kubeconfig=/etc/origin/node/node.kubeconfig --max-pods=250 --network-plugin=cni --node-ip= --node-labels=node-role.kubernetes.io/compute=true --pod-infra-container-image=registry.reg-aws.openshift.com/openshift3/ose-pod:v3.10.0-0.41.0 --pod-manifest-path=/etc/origin/node/pods --port=10250 --read-only-port=0 --register-node=true --root-dir=/var/lib/origin/openshift.local.volumes --rotate-certificates=true --tls-cert-file= --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA --tls-cipher-suites=TLS_RSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_RSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_RSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_RSA_WITH_AES_256_CBC_SHA --tls-min-version=VersionTLS12 --tls-private-key-file=

Comment 3 Seth Jennings 2018-05-14 20:57:12 UTC
https://github.com/openshift/origin/blob/master/pkg/cmd/server/start/start_node.go#L241-L250

https://github.com/openshift/origin/blob/master/pkg/cmd/util/ip.go#L11-L34

If dnsIP is 0.0.0.0 in node-config.yaml, the --cluster-dns flag is set to the host's IP by looking for the first up, non-loopback interface and taking the first IPv4 address on it. If this lookup fails, the --cluster-dns flag remains 0.0.0.0, as observed in this bug.
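
For illustration, a rough shell analogue of that lookup (this is not the actual Go code in ip.go, just a sketch assuming a typical Linux host with iproute2):

# Take the first interface that is up and not loopback, then its first
# IPv4 address; fall back to 0.0.0.0 if nothing qualifies.
iface=$(ip -o link show up | awk -F': ' '$2 != "lo" {print $2; exit}' | cut -d@ -f1)
hostip=$(ip -o -4 addr show dev "$iface" 2>/dev/null | awk '{print $4; exit}' | cut -d/ -f1)
echo "--cluster-dns=${hostip:-0.0.0.0}"

If that runs somewhere without the host's view of the interfaces, the 0.0.0.0 fallback is what ends up in --cluster-dns.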

Comment 4 Giuseppe Scrivano 2018-05-15 20:35:22 UTC
The difference seems to be in the way the kubelet configuration is generated on the host versus in the container:

https://github.com/openshift/origin/blob/master/images/node/scripts/openshift-node#L17

https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_node/files/openshift-node#L17

It looks like the two scripts generate different values for --cluster-dns. The version in the origin repository, which is used by the system container, generates --cluster-dns=0.0.0.0, while the version in openshift-ansible generates --cluster-dns=<host IP>.
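
For reference, a minimal sketch of what both wrapper scripts do at the cited line (a hypothetical reconstruction, not the verbatim script contents):

#!/bin/sh
# Render the kubelet flags from node-config.yaml, then exec the kubelet
# with them; flags is left unquoted on purpose so it word-splits into
# separate arguments.
set -eu
flags=$(/usr/bin/openshift start node --write-flags --config=/etc/origin/node/node-config.yaml)
exec /usr/bin/hyperkube kubelet ${flags}

The origin copy runs inside the node system container while the openshift-ansible copy runs on the host, so the same --write-flags invocation can render a different --cluster-dns in each environment.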

Comment 5 Scott Dodson 2018-05-15 21:27:03 UTC
Thanks Giuseppe!

https://github.com/openshift/origin/pull/19727

Comment 7 Scott Dodson 2018-05-17 12:45:36 UTC
The PRs from comment 6 were merged; the PR in comment 5 was dropped, as it wasn't the right solution.

Comment 8 Wei Sun 2018-05-22 01:26:30 UTC
The PRs from comment 6 have been merged into v3.10.0-0.50.0, please check.

Comment 9 Weihua Meng 2018-05-22 02:39:53 UTC
Fixed.
openshift-ansible-3.10.0-0.50.0.git.0.bd68ade.el7

# oc rsh docker-registry-1-l5s6q
sh-4.2$ cat /etc/resolv.conf 
nameserver 10.240.0.33
search default.svc.cluster.local svc.cluster.local cluster.local c.openshift-gce-devel.internal google.internal
options ndots:5
sh-4.2$ curl www.redhat.com
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="https://www.redhat.com/">here</a>.</p>
</body></html>
sh-4.2$ 
sh-4.2$ 
sh-4.2$ exit
exit

[root@qe-wmengah310-master-etcd-1 ~]# ps -ef | grep cluster-dns
root     10780  7042  0 02:35 pts/0    00:00:00 grep --color=auto cluster-dns
root     23526 23515  4 02:02 ?        00:01:25 /usr/bin/hyperkube kubelet --v=5 --address=0.0.0.0 --allow-privileged=true --anonymous-auth=true --authentication-token-webhook=true --authentication-token-webhook-cache-ttl=5m --authorization-mode=Webhook --authorization-webhook-cache-authorized-ttl=5m --authorization-webhook-cache-unauthorized-ttl=5m --bootstrap-kubeconfig=/etc/origin/node/bootstrap.kubeconfig --cadvisor-port=0 --cert-dir=/etc/origin/node/certificates --cgroup-driver=systemd --client-ca-file=/etc/origin/node/client-ca.crt --cloud-config=/etc/origin/cloudprovider/gce.conf --cloud-provider=gce --cluster-dns=10.240.0.32 --cluster-domain=cluster.local --container-runtime-endpoint=/var/run/dockershim.sock --containerized=true --enable-controller-attach-detach=true --experimental-dockershim-root-directory=/var/lib/dockershim --fail-swap-on=false --feature-gates=RotateKubeletClientCertificate=true,RotateKubeletServerCertificate=true --file-check-frequency=0s --healthz-bind-address= --healthz-port=0 --host-ipc-sources=api --host-ipc-sources=file --host-network-sources=api --host-network-sources=file --host-pid-sources=api --host-pid-sources=file --hostname-override= --http-check-frequency=0s --image-service-endpoint=/var/run/dockershim.sock --iptables-masquerade-bit=0 --kubeconfig=/etc/origin/node/node.kubeconfig --max-pods=250 --network-plugin=cni --node-ip= --node-labels=node-role.kubernetes.io/master=true --pod-infra-container-image=registry.reg-aws.openshift.com:443/openshift3/ose-pod:v3.10.0-0.50.0 --pod-manifest-path=/etc/origin/node/pods --port=10250 --read-only-port=0 --register-node=true --root-dir=/var/lib/origin/openshift.local.volumes --rotate-certificates=true --tls-cert-file= --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA --tls-cipher-suites=TLS_RSA_WITH_AES_128_GCM_SHA256 --tls-cipher-suites=TLS_RSA_WITH_AES_256_GCM_SHA384 --tls-cipher-suites=TLS_RSA_WITH_AES_128_CBC_SHA --tls-cipher-suites=TLS_RSA_WITH_AES_256_CBC_SHA --tls-min-version=VersionTLS12 --tls-private-key-file=

Comment 10 Scott Dodson 2018-05-22 19:15:00 UTC
*** Bug 1580981 has been marked as a duplicate of this bug. ***

Comment 11 Scott Dodson 2018-05-22 19:16:09 UTC
Fixed in openshift-ansible and OCP 3.10.0-0.48.0 or later. Both components must be at that version or newer.
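
For anyone verifying, a quick way to check the versions in play (standard commands; on Atomic Host the openshift-ansible check happens on the install host rather than the node):

# on the install host
rpm -q openshift-ansible
# against the cluster
oc version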

Comment 13 errata-xmlrpc 2018-07-30 19:15:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816

