1490820 – listen-address in dnsmasq when using flannel unreachable by pods


Summary: listen-address in dnsmasq when using flannel unreachable by pods

Keywords:
Status:	CLOSED DUPLICATE of bug 1493955
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	3.6.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	urgent
Target Milestone:	---
Target Release:	3.8.0
Assignee:	Rajat Chopra
QA Contact:	Meng Bo
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2017-09-12 10:15 UTC by Eduardo Minguez
Modified:	2017-12-07 14:12 UTC
CC List:	7 users
Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-12-07 14:12:56 UTC
Target Upstream Version:
Embargoed:


Description Eduardo Minguez 2017-09-12 10:15:26 UTC

Description of problem:
When installing OCP with flannel and a second network interface for container traffic, the /etc/resolv.conf file copied into the pods lists a nameserver IP that the pods cannot reach.

Version-Release number of selected component (if applicable):
oc v3.6.173.0.21
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ocp.edu.flannel.com:8443
openshift v3.6.173.0.21
kubernetes v1.6.1+5115d708d7

How reproducible:
Create an OCP cluster following the reference architecture, using flannel (so each node has eth0 with IP a.b.c.d and eth1 with IP w.x.y.z).
The /etc/resolv.conf in the pod shows the eth0 address, which comes from the "listen-address=a.b.c.d" setting in the /etc/dnsmasq.d/origin-dns.conf file, and that address is not reachable from the pod.


Steps to Reproduce:
1. Create the OCP cluster
2. Connect to any pod running in the cluster
3. Check /etc/resolv.conf
4. Try to resolve anything from that DNS

Actual results:
$ oc rsh docker-registry-1-14tm0
sh-4.2$ curl canihazip.com
curl: (6) Could not resolve host: canihazip.com; Unknown error
sh-4.2$ cat /etc/resolv.conf 
nameserver 10.19.115.248
search default.svc.cluster.local svc.cluster.local cluster.local openstacklocal edu.flannel.com
options ndots:5
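
To confirm that the nameserver itself is unreachable (rather than the upstream resolver failing), it can help to query it directly from inside the pod. The following is a minimal sketch, not part of the original report: the helper name first_nameserver is hypothetical, and the example assumes `dig` is present in the pod image, which is often not the case.

```shell
# Print the first nameserver listed in a resolv.conf-style file.
# Usage: first_nameserver [FILE]   (defaults to /etc/resolv.conf)
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Example, run inside the pod (assumes `dig` is installed there):
#   dig +time=2 +tries=1 @"$(first_nameserver)" canihazip.com
# A timeout here points at the nameserver address being unreachable.
```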

Expected results:
Resolve the DNS entry

Additional info:
Adding the following iptables rules solves the issue:
# iptables -A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 53 -j ACCEPT
# iptables -A OS_FIREWALL_ALLOW -p udp -m state --state NEW -m udp --dport 53 -j ACCEPT

But I think there are two issues:

* The listen address in the dnsmasq configuration should include the eth1 IP
* The resolv.conf copied to the pod should point to that IP (I think this parameter is dnsIP in the node-config.yaml)
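
As an illustration of the first point, the listen-address line for the second interface could be derived from the interface's current address. This is only a sketch under assumptions: the helper name listen_address_line is hypothetical, eth1 is taken from this report as the container-traffic interface, and it is assumed that dnsmasq accepts an additional listen-address line alongside the existing one. It does not reflect how openshift-ansible generates origin-dns.conf.

```shell
# Read `ip -o -4 addr show dev <iface>` output on stdin and print a
# dnsmasq listen-address line for the first IPv4 address found.
listen_address_line() {
    awk '$3 == "inet" { split($4, a, "/"); print "listen-address=" a[1]; exit }'
}

# Example on a node (prints the line; it does not edit any file):
#   ip -o -4 addr show dev eth1 | listen_address_line
```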

Comment 1 Bogdan Dobrelya 2017-09-27 15:21:07 UTC

Just a side note: I do not think iptables rules would be a good fix at the end of the day, keeping in mind https://docs.openshift.com/container-platform/latest/admin_guide/iptables.html#iptables-service, which makes any iptables rules ephemeral, lasting only for the current node boot (IIUC). Perhaps a better fix would be to correct the dnsmasq config and/or to better document the iptables persistence caveats for OpenShift, or to give poor users like me some help with translating iptables rules for firewalld.
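
For reference, a rough firewalld counterpart of the two port 53 rules from the description might look like the sketch below. It is an assumption, not a tested fix for this bug: it opens the port in firewalld's default zone rather than in the OS_FIREWALL_ALLOW chain, and the helper name dns_firewalld_commands is hypothetical. The function only prints the commands so they can be reviewed first.

```shell
# Print firewall-cmd invocations that open DNS (port 53, TCP and UDP)
# persistently in the default zone, followed by a reload.
dns_firewalld_commands() {
    for proto in tcp udp; do
        printf 'firewall-cmd --permanent --add-port=53/%s\n' "$proto"
    done
    printf 'firewall-cmd --reload\n'
}

# Review the output, then apply as root:
#   dns_firewalld_commands | sh
```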

Comment 2 Bogdan Dobrelya 2017-09-27 16:12:39 UTC

Proposed openshift-ansible fix https://github.com/openshift/openshift-ansible/pull/5560

Comment 3 Bogdan Dobrelya 2017-09-27 16:24:46 UTC

As I understand it, the os_firewall_manage_iptables provider works fine for simple rules and can handle this case fully; hence the proposed fix is based on iptables rules.

That said, I'm not sure how to handle the advanced flannel configuration steps described in https://bugzilla.redhat.com/show_bug.cgi?id=1490960, such as masquerade rules. But that's another story.

Comment 4 Eduardo Minguez 2017-10-02 09:32:21 UTC

I think the DNS issue happens because, in the reference architecture, the nodes have just one network interface, so the DNS iptables rules are not needed there.

Comment 7 Rajat Chopra 2017-10-23 15:01:58 UTC

This requires an enhancement in Ansible so that dnsIP can be overridden when flannel is used. Alternatively, the fact 'ansible_default_ipv4' can be set to the desired interface's IP address, either by the playbook or by changing the default route on the host (i.e. the interface that will route 8.8.8.8).
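
To see which source address a host currently uses to route 8.8.8.8, the output of `ip route get` can be inspected. The sketch below is illustrative only; the helper name route_src is hypothetical, and it simply extracts the field following "src" from that output.

```shell
# Read `ip route get <addr>` output on stdin and print the source
# address the kernel would use (the token following "src").
route_src() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Example on a node:
#   ip route get 8.8.8.8 | route_src
# If this prints the eth0 address, pods are handed that address as
# their nameserver, as described in this report.
```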

Comment 8 Rajat Chopra 2017-10-23 15:21:10 UTC

The workaround will be as per this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1493955
That is, the fixes for that bug will also cover the problems in this one.

Comment 9 Ben Bennett 2017-12-07 14:12:56 UTC


*** This bug has been marked as a duplicate of bug 1493955 ***
