Description of problem:
When installing OCP using flannel and a second network interface for container traffic, the /etc/resolv.conf file copied into the pods lists a nameserver IP that the pods cannot reach.
Version-Release number of selected component (if applicable):
features: Basic-Auth GSSAPI Kerberos SPNEGO
Create an OCP cluster as the reference architecture using flannel (so eth0 a.b.c.d and eth1 w.x.y.z).
The /etc/resolv.conf in the pod shows the eth0 address, which comes from the "listen-address=a.b.c.d" setting in the /etc/dnsmasq.d/origin-dns.conf file, and that address is not reachable from the pod.
Steps to Reproduce:
1. Create the OCP cluster
2. Connect to any pod running in the cluster
3. Check /etc/resolv.conf
4. Try to resolve anything from that DNS
$ oc rsh docker-registry-1-14tm0
sh-4.2$ curl canihazip.com
curl: (6) Could not resolve host: canihazip.com; Unknown error
sh-4.2$ cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local openstacklocal edu.flannel.com
Resolve the DNS entry
Adding the following iptables rules solves the issue
# iptables -A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 53 -j ACCEPT
# iptables -A OS_FIREWALL_ALLOW -p udp -m state --state NEW -m udp --dport 53 -j ACCEPT
But I think there are two issues:
* Listen address should include the eth1 ip in the dnsmasq configuration
* The resolv.conf copied to the pod should point to that ip (I think this parameter is dnsIP in the node-config.yaml)
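A minimal sketch of what those two changes might look like. It only prints the settings and does not touch any file. The addresses are placeholders from the documentation ranges, the helper name render_dns_settings is made up for illustration, and dnsIP as the node-config.yaml key is the assumption stated in the second point, not something verified here.

```shell
#!/bin/sh
# Sketch only: prints the proposed settings, modifies nothing.
# $1 = eth0 address (current listen-address), $2 = eth1 address (container traffic).
render_dns_settings() {
    eth0_ip="$1"
    eth1_ip="$2"
    # dnsmasq allows listen-address to be given more than once.
    printf '# /etc/dnsmasq.d/origin-dns.conf\n'
    printf 'listen-address=%s\n' "$eth0_ip"
    printf 'listen-address=%s\n' "$eth1_ip"
    # Assumed key: dnsIP in node-config.yaml, pointing pods at the eth1 address.
    printf '# node-config.yaml\n'
    printf 'dnsIP: %s\n' "$eth1_ip"
}

# Placeholder addresses; substitute the real eth0/eth1 IPs of the node.
render_dns_settings "${ETH0_IP:-192.0.2.10}" "${ETH1_IP:-198.51.100.10}"
```

dnsmasq and the node service would presumably need a restart after such a change before new pods pick up the different resolv.conf.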
Just a side note: I do not think iptables rules would be a good fix at the end of the day, keeping in mind https://docs.openshift.com/container-platform/latest/admin_guide/iptables.html#iptables-service, which makes any iptables rules ephemeral, i.e. valid for the current node boot only (IIUC). Perhaps the better fix would be to fix the dnsmasq config and/or to better document the iptables persistence caveats for OpenShift, or to provide poor users like me some help with translating iptables rules for firewalld.
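For the firewalld translation, a rough equivalent of the two port 53 rules might look like the sketch below. It is a dry run by default (commands are printed, not executed), it assumes firewalld's default zone covers the interface the pods use, and the function name open_dns_port is made up for illustration.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# To apply them, run as root with RUN set to the empty string (RUN= ./script.sh).
RUN="${RUN-echo}"

open_dns_port() {
    # Permanent rules survive reboots, unlike rules added with plain iptables -A.
    for proto in tcp udp; do
        $RUN firewall-cmd --permanent --add-port="53/${proto}"
    done
    # Load the permanent configuration into the running firewall.
    $RUN firewall-cmd --reload
}

open_dns_port
```

This opens port 53 in the zone rather than appending to the OS_FIREWALL_ALLOW chain, so it is not a line-for-line translation of the iptables rules above.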
Proposed openshift-ansible fix https://github.com/openshift/openshift-ansible/pull/5560
As I understand it, the os_firewall_manage_iptables provider works fine for simple rules and can handle this case fully, hence the proposed fix is based on iptables rules.
I'm not sure, though, how to handle the advanced flannel configuration steps described in https://bugzilla.redhat.com/show_bug.cgi?id=1490960, like masquerade rules. But that's another story.
I think the DNS issue happens because the nodes have just one network interface as in the reference architecture the DNS iptables rules are not needed.
This requires an enhancement in Ansible so that dnsIP is overridable by flannel. Alternatively the fact 'ansible_default_ipv4' can be set to the desired interface's IP address either by the playbook or by changing the default route on the host.
(i.e. the interface that will route 18.104.22.168)
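To see which source address the host would pick for a given destination, and so which interface the default-route based fact is likely to follow, something like the sketch below could be used. The helper route_src is made up for illustration, DEST is a placeholder for whatever destination is of interest, and the sample line in the comment only imitates the usual shape of "ip route get" output.

```shell
#!/bin/sh
# Print the address that follows "src" in `ip route get` style output on stdin.
route_src() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Usage on a node (DEST is a placeholder destination address):
#   ip -4 route get "$DEST" | route_src
# With an input line shaped like
#   203.0.113.1 via 192.0.2.1 dev eth0 src 192.0.2.10 uid 0
# the function prints the field after "src".
```

If the printed address belongs to eth0, changing the default route or overriding the fact in the playbook would be the two options described above.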
Workaround fix will be as per this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1493955
That means the fixes for that bug will also cover this bug's problems.
*** This bug has been marked as a duplicate of bug 1493955 ***