Description of problem:
Similar to Bug 1481366. See that bug for extra info.

The host has 2 NICs (in this case we have 3 NICs, but only 2 matter for this bug). The 1st NIC is the primary one on the host, connected to the outside world. The 2nd NIC is dedicated to internal OpenShift traffic.

In the inventory, the hosts are configured as follows to have a separate IP for OpenShift-specific traffic (172.17.11.4 is on the 2nd NIC):

hp60ds-4.o.internal openshift_node_labels="{'region': 'bagl', 'zone': 'default'}" openshift_ip=172.17.11.4 openshift_hostname=hp60ds-4.o.internal openshift_dns_ip=172.17.11.4

Version-Release number of the following components:
rpm -q openshift-ansible
openshift-ansible-3.6.173.0.21-2.git.0.44a4038.el7.noarch

rpm -q ansible
ansible-2.3.2.0-2.el7.noarch

ansible --version
ansible 2.3.2.0
  config file = /home/ocp_deployment/OpenShiftCluster-v3/ansible.cfg
  configured module search path = Default w/o overrides
  python version = 2.7.5 (default, May 3 2017, 07:55:04) [GCC 4.8.5 20150623 (Red Hat 4.8.5-14)]

How reproducible: always.

Steps to Reproduce:
1.
2.
3.

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated

2017-09-14 17:52:41,826 p=25774 u=root | TASK [openshift_node : Install Node package] *************************************************************************************************************************************
2017-09-14 17:52:51,207 p=25774 u=root | fatal: [hp60ds-3.o.internal]: FAILED! => {
    "changed": true,
    "failed": true,
    "rc": 1,
    "results": [
        "Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-\n : manager, versionlock\nResolving Dependencies\n--> Running transaction check\n---> Package atomic-openshift-node.x86_64 0:3.6.173.0.21-1.git.0.f95b0e7.el7 will be installed\n---> Package tuned-profiles-atomic-openshift-node.x86_64 0:3.6.173.0.21-1.git.0.f95b0e7.el7 will be installed\n--> Finished Dependency Resolution\n\nDependencies Resolved\n\n================================================================================\n Package\n Arch Version Repository Size\n================================================================================\nInstalling:\n atomic-openshift-node\n x86_64 3.6.173.0.21-1.git.0.f95b0e7.el7 rhel-7-server-ose-3.6-rpms 717 k\n tuned-profiles-atomic-openshift-node\n x86_64 3.6.173.0.21-1.git.0.f95b0e7.el7 rhel-7-server-ose-3.6-rpms 721 k\n\nTransaction Summary\n================================================================================\nInstall 2 Packages\n\nTotal download size: 1.4 M\nInstalled size: 14 k\nDownloading packages:\n"
    ]
}

MSG:

https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/rhgs-server/3.1/os/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: cdn.redhat.com; Unknown error"
Trying other mirror.
https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/fast-datapath/os/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: cdn.redhat.com; Unknown error"
Trying other mirror.
https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/extras/os/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: cdn.redhat.com; Unknown error"
Trying other mirror.
https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/ose/3.6/os/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: cdn.redhat.com; Unknown error"
Trying other mirror.
https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/os/repodata/repomd.xml: [Errno 14] curl#6 - "Could not resolve host: cdn.redhat.com; Unknown error"
Trying other mirror.
https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/ose/3.6/os/Packages/atomic-openshift-node-3.6.173.0.21-1.git.0.f95b0e7.el7.x86_64.rpm: [Errno 14] curl#6 - "Could not resolve host: cdn.redhat.com; Unknown error"
Trying other mirror.
https://cdn.redhat.com/content/dist/rhel/server/7/7Server/x86_64/ose/3.6/os/Packages/tuned-profiles-atomic-openshift-node-3.6.173.0.21-1.git.0.f95b0e7.el7.x86_64.rpm: [Errno 14] curl#6 - "Could not resolve host: cdn.redhat.com; Unknown error"
Trying other mirror.

Error downloading packages:
  atomic-openshift-node-3.6.173.0.21-1.git.0.f95b0e7.el7.x86_64: [Errno 256] No more mirrors to try.
  tuned-profiles-atomic-openshift-node-3.6.173.0.21-1.git.0.f95b0e7.el7.x86_64: [Errno 256] No more mirrors to try.

---
As a result, on any one of the nodes I have:

# cat /etc/dnsmasq.d/origin-dns.conf
no-resolv
domain-needed
no-negcache
max-cache-ttl=1
enable-dbus
bind-interfaces
listen-address=172.17.0.58

where 172.17.0.58 is the IP for OpenShift-specific traffic.

Expected results:
1) installation completes successfully
2) hosts outside of the cluster can be resolved from the hosts within the cluster

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

I've worked around the issue by adding an extra listen address for dnsmasq:

listen-address={{ ansible_default_ipv4.address }}

to /usr/share/ansible/openshift-ansible/roles/openshift_certificate_expiry/examples/playbooks/roles/openshift_node_dnsmasq/templates/origin-dns.conf.j2
Anton, which IP address is in /etc/resolv.conf? I imagine it is not the value of openshift_ip but the address of the default interface? I'm working on a refactor of the dispatcher script so that, instead of listening on a specific address, it listens on all interfaces other than lo. I imagine that would address the issue.
Scott, yes. /etc/resolv.conf has one `nameserver <ip>` entry, where <ip> is the IP of the node's default interface. `openshift_ip` for that node is defined in the inventory; it is the IP of the 2nd NIC.
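The mismatch described above (resolv.conf points at the default-interface IP, while dnsmasq only listens on the openshift_ip address) can be checked mechanically. The following is a minimal sketch, not part of openshift-ansible: the function names are hypothetical, 10.0.0.4 is a made-up default-interface address, and 172.17.11.4 is the openshift_ip from the inventory in the report. It only compares the text of the two files and does not probe the network.

```python
from typing import List


def parse_nameservers(resolv_conf: str) -> List[str]:
    """Return the addresses on 'nameserver <ip>' lines of resolv.conf text."""
    servers = []
    for line in resolv_conf.splitlines():
        fields = line.split()
        # Comment lines start with '#', so fields[0] is never 'nameserver' there.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def parse_listen_addresses(dnsmasq_conf: str) -> List[str]:
    """Return the values of 'listen-address=' lines of dnsmasq config text."""
    addresses = []
    for line in dnsmasq_conf.splitlines():
        line = line.strip()
        if line.startswith("listen-address="):
            addresses.append(line.split("=", 1)[1].strip())
    return addresses


def unanswered_nameservers(resolv_conf: str, dnsmasq_conf: str) -> List[str]:
    """Nameservers from resolv.conf that dnsmasq is not configured to listen on."""
    listening = set(parse_listen_addresses(dnsmasq_conf))
    return [ns for ns in parse_nameservers(resolv_conf) if ns not in listening]


# Illustrative inputs (assumptions, see the note above).
RESOLV_CONF = "search cluster.local\nnameserver 10.0.0.4\n"
BROKEN_DNSMASQ = "bind-interfaces\nlisten-address=172.17.11.4\n"
WORKAROUND_DNSMASQ = BROKEN_DNSMASQ + "listen-address=10.0.0.4\n"

if __name__ == "__main__":
    print(unanswered_nameservers(RESOLV_CONF, BROKEN_DNSMASQ))
    print(unanswered_nameservers(RESOLV_CONF, WORKAROUND_DNSMASQ))
```

A non-empty result means the node's resolver is pointed at an address dnsmasq is not bound to, which would be consistent with the "Could not resolve host" errors in the report. Adding the second listen-address, as in the reporter's workaround, empties the list.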
https://github.com/openshift/openshift-ansible/pull/5778 should fix this in 3.7
https://github.com/openshift/openshift-ansible/pull/5891 was merged to address this
https://github.com/openshift/openshift-ansible/pull/5953 is probably also necessary to avoid a race at startup
In 3.7.0-0.189.0
Verified in openshift-ansible-3.7.0-0.190.0.git.0.129e91a.el7.noarch.rpm

1) ##Spin up instances with two NICs:
# ip addr |grep eth |grep inet
    inet 172.16.120.98/24 brd 172.16.120.255 scope global dynamic eth0
    inet 192.168.33.3/24 brd 192.168.33.255 scope global dynamic eth1

# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.16.120.1    0.0.0.0         UG    100    0        0 eth0
10.128.0.0      0.0.0.0         255.252.0.0     U     0      0        0 tun0
169.254.169.254 192.168.33.1    255.255.255.255 UGH   100    0        0 eth1
172.16.120.0    0.0.0.0         255.255.255.0   U     100    0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
172.30.0.0      0.0.0.0         255.255.0.0     U     0      0        0 tun0
192.168.33.0    0.0.0.0         255.255.255.0   U     100    0        0 eth1

2) ##Trigger installation against two NICs of the instances:
# cat inventory_hosts
<--snip-->
[masters]
host-8-240-252.host.centralci.eng.rdu2.redhat.com

[nodes]
host-8-240-252.host.centralci.eng.rdu2.redhat.com openshift_node_labels="{'role': 'node'}" openshift_ip=192.168.33.3 openshift_dns_ip=192.168.33.3
host-8-241-27.host.centralci.eng.rdu2.redhat.com openshift_node_labels="{'role': 'node','registry': 'enabled','router': 'enabled'}" openshift_ip=192.168.33.5 openshift_dns_ip=192.168.33.5

[etcd]
host-8-241-126.host.centralci.eng.rdu2.redhat.com

[nfs]
host-8-240-252.host.centralci.eng.rdu2.redhat.com
<--snip-->

3) ##Check the configurations
# cat /etc/resolv.conf
# nameserver updated by /etc/NetworkManager/dispatcher.d/99-origin-dns.sh
# Generated by NetworkManager
search openstacklocal cluster.local
# NOTE: the libc resolver may not support more than 3 nameservers.
# The nameservers listed below may not be recognized.
nameserver 172.16.120.98

# cat /etc/dnsmasq.d/origin-dns.conf
no-resolv
domain-needed
no-negcache
max-cache-ttl=1
enable-dbus
dns-forward-max=5000
cache-size=5000
bind-dynamic
except-interface=lo
# End of config

# cat /etc/dnsmasq.d/origin-upstream-dns.conf
server=172.16.120.11
server=172.16.120.2
server=172.16.120.3

4) ##S2I build successfully
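The config check in step 3 can also be expressed as a small script. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the rule it encodes (no `listen-address` or `interface` pinning, `bind-dynamic` present, `lo` excluded) is one reading of the fixed config shown above, not an official openshift-ansible test. The two sample configs are copied from this report.

```python
from typing import Dict, List


def parse_dnsmasq_options(conf: str) -> Dict[str, List[str]]:
    """Map each dnsmasq option name to the list of values it is given."""
    options: Dict[str, List[str]] = {}
    for raw in conf.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        options.setdefault(key.strip(), []).append(value.strip())
    return options


def listens_on_all_but_loopback(conf: str) -> bool:
    """True if the config is not pinned to specific addresses or interfaces,
    uses bind-dynamic, and excludes only the loopback interface by name."""
    opts = parse_dnsmasq_options(conf)
    pinned = "listen-address" in opts or "interface" in opts
    return (
        not pinned
        and "bind-dynamic" in opts
        and "lo" in opts.get("except-interface", [])
    )


# origin-dns.conf as reported in the original description (3.6).
ORIGINAL_CONF = """\
no-resolv
domain-needed
no-negcache
max-cache-ttl=1
enable-dbus
bind-interfaces
listen-address=172.17.0.58
"""

# origin-dns.conf as shown in the verification above (3.7).
VERIFIED_CONF = """\
no-resolv
domain-needed
no-negcache
max-cache-ttl=1
enable-dbus
dns-forward-max=5000
cache-size=5000
bind-dynamic
except-interface=lo
# End of config
"""

if __name__ == "__main__":
    print("original:", listens_on_all_but_loopback(ORIGINAL_CONF))
    print("verified:", listens_on_all_but_loopback(VERIFIED_CONF))
```

The check is purely textual. It does not confirm that dnsmasq is actually running or answering on the default-interface address, so it complements rather than replaces the S2I build in step 4.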
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188