Bug 1128208 - docker io not using proper DNS
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: docker-io
Version: 20
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Lokesh Mandvekar
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 1119849
Blocks:
Reported: 2014-08-08 15:00 UTC by Bill C. Riemers
Modified: 2015-02-07 04:40 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1119849
Environment:
Last Closed: 2015-02-07 04:40:05 UTC



Description Bill C. Riemers 2014-08-08 15:00:03 UTC
The resolution for Bug #1119849 introduced a new problem.

I have a dockerfile that uses the command:

git clone http://gitolite.corp.redhat.com/cgit/it-sales/sfjavasuite.git/

Up until the most recent update the Dockerfile worked as expected.  Now it fails with a hostname-not-found error.

It seems that, as part of the update, docker now uses the fixed DNS servers 8.8.8.8 and 8.8.4.4, which is of course inappropriate for anyone inside a private network.   In some cases it is even considered a security risk to have DNS lookups leak to a public DNS server, as it gives outside users information about the private network.

It is possible to update the docker options to work around the problem.   But of course the DNS servers are obtained by DHCP, so it would require restarting docker-io with new settings every time a new network connection is established...

Likewise, another workaround is a set of iptables rules to override all DNS lookups, but again this introduces its own set of problems.

And of course, I don't want to assume everyone who will use my Dockerfile has updated their workstations and servers with whatever hack solution I use...

Reproduce steps:

1. docker run fedora cat /etc/resolv.conf

Expected results:

The DNS settings equivalent to the host, which in my case are:

$ cat /etc/resolv.conf
# Generated by NetworkManager
domain docbill.info
search docbill.info
nameserver 127.0.0.1
nameserver 172.31.253.1
nameserver 172.31.252.1
# NOTE: the libc resolver may not support more than 3 nameservers.
# The nameservers listed below may not be recognized.
nameserver 10.11.255.155
nameserver 10.11.255.156
nameserver 10.5.26.21
nameserver 10.7.142.20
nameserver fe80::beae:c5ff:fee8:b5e%em1
nameserver fe80::4216:7eff:feea:a5b8%em1
nameserver 2001:470:1d:8a2::1


Actual results:

$ docker run fedora cat /etc/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4
search docbill.info


Note: I'm not sure how the previous docker-io version got the 127.0.0.1 correct.  But somehow it figured out that was an instruction to use the dnsmasq instance on my laptop.

Bill

Comment 1 Lars Kellogg-Stedman 2014-08-08 16:23:14 UTC
> It is possible to update the docker options to work around the
> problem.   But of course the DNS servers are obtained by DHCP, so it
> would require restarting docker-io with new settings every time a new
> network connection is established...

You could just configure docker dns to point at the docker bridge
address (172.17.42.1), and then run a dnsmasq instance attached to
that bridge.

You can add the appropriate --dns option to /etc/sysconfig/docker.

With this in place, dnsmasq will take care of forwarding requests appropriately, and you will not have to restart either docker or dnsmasq in the event that your system resolver configuration changes.

Also, note that if docker tried to use 127.0.0.1 as a nameserver, it would fail (because inside a docker container, 127.0.0.1 maps to the container, not to the host).
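
A minimal sketch of that setting, assuming the Fedora package passes an OPTIONS variable from /etc/sysconfig/docker to the daemon (the variable name is an assumption about the packaged file; the --dns flag and the bridge address are the ones mentioned above):

```sh
# /etc/sysconfig/docker (sketch; OPTIONS is assumed to be the variable the service reads)
# Point every container at a resolver listening on the docker bridge address.
OPTIONS='--selinux-enabled --dns=172.17.42.1'
```

This only helps once a resolver such as dnsmasq actually answers on 172.17.42.1, and the docker service has to be restarted for the new option to take effect.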

Comment 2 Bill C. Riemers 2014-08-08 17:18:03 UTC
Yes.  Both are viable workarounds.  But this is not consistent with docker io just working everywhere.   Not only would I have to distribute the Dockerfile, but also setup instructions for the server/workstation.   Lord help them if the user needs to run two different containers that don't have the same server setup instructions, or if they don't have the admin privileges to change the configuration.

BTW. As I said, I'm not certain how the previous version of docker was managing to work successfully, as I was doing none of these workarounds.  It just worked as expected.  I'll probably revert to the previous version later to try and figure that part out.

Comment 3 Bill C. Riemers 2014-08-08 17:20:06 UTC
Yuck.  It turns out that isn't the only place where my docker is failing.  Later it fails with maven2 builds where it tries using repositories like:

http://mvn.re.redhat.com/maven2

Comment 4 Bill C. Riemers 2014-08-08 17:45:16 UTC
Darn. "yum downgrade docker-io" won't work because the previous version is no longer in the official repository.   For comparison I tried on a server with a different version of docker-io:

-bash-4.1$ rpm -qa |grep docker-io
docker-io-1.0.0-3.el6.x86_64
-bash-4.1$ sudo docker run fedora cat /etc/resolv.conf
[sudo] password for briemers: 
; search domain for devlab.redhat.com on sfa-docker.devlab.redhat.com
search devlab.redhat.com redhat.com 

nameserver 10.7.142.20
nameserver 10.7.142.21
options timeout:1
-bash-4.1$ cat /etc/resolv.conf
; search domain for devlab.redhat.com on sfa-docker.devlab.redhat.com
search devlab.redhat.com redhat.com 

nameserver 10.7.142.20
nameserver 10.7.142.21
options timeout:1

So it looks like the 1.0.0-3 version simply copied /etc/resolv.conf directly.  I don't really know what the 1.0.0-8 version did, but whatever it did it also worked on my laptop.   Even the 1.0.0-3 version behavior is preferable to the 1.0.0-9 version, in that at least that only requires special docker-io configuration when one is running their own dns server.  In most cases, the 1.0.0-3 version should simply work out of the box, as expected.

Bill

Comment 5 Lokesh Mandvekar 2014-08-08 17:57:09 UTC
Bill, what's the latest version of docker you have/had? docker 1.1.2-2 on rawhide seems to use the same /etc/resolv.conf as my host.

Looks like 1.1.2-2 has landed in f20-stable

Comment 6 Bill C. Riemers 2014-08-08 18:07:37 UTC
It looks like using --dns=172.17.42.1 breaks DNS lookups altogether.  The reason is that by default dnsmasq only listens on the lo interface.   Certainly I can add the docker0 interface to the list of interfaces in my dnsmasq.conf file.  The problem is that if the dnsmasq service starts first, which it usually does, the docker0 interface does not exist yet...

Comment 7 Bill C. Riemers 2014-08-08 18:15:46 UTC
I just pulled 1.1.2-2 from updates-testing.   I still see the exact same issue.  I edited out the 127.0.0.1 from my /etc/resolv.conf file just in case that was causing the problem, but still the same issue.

Comment 8 Bill C. Riemers 2014-08-08 18:50:48 UTC
It looks like the problem is that there is logic in docker such that, if the /etc/resolv.conf file contains 127.0.0.1, it ignores all the DNS entries and uses the Google addresses instead.  It is a bit tricky to reproduce the exact tests, because NetworkManager will overwrite changes to /etc/resolv.conf almost instantly...

So what I had to do:

cp /etc/resolv.conf /etc/resolv.conf.save
vi /etc/resolv.conf.save
chattr +i /etc/resolv.conf.save
mount --bind /etc/resolv.conf.save /etc/resolv.conf
service docker restart
docker run fedora cat /etc/resolv.conf


Here is the really fun part.  The new logic is only evaluated based on what is in /etc/resolv.conf when the docker daemon is started or restarted.   So if later I do:

umount /etc/resolv.conf
docker run fedora cat /etc/resolv.conf

I will find docker happily picks up my resolv.conf file with the 127.0.0.1 to use inside the container...

At boot time, when my docker first starts, my /etc/resolv.conf contains:

domain docbill.info
search docbill.info
nameserver 172.31.252.1
nameserver 172.31.253.1

The docker daemon happily starts in the mode where it will copy the /etc/resolv.conf file.   Later, when I connect via VPN and NetworkManager completely rewrites my /etc/resolv.conf, containers continue to pick up my /etc/resolv.conf file.

In this case though I had done a "yum update -y" after my most recent reboot while connected to VPN.   So when docker updated, it restarted the daemon.  The daemon saw my /etc/resolv.conf contained 127.0.0.1 and so it decided to ignore it.

All of this is pretty long and complicated.  So I'll write up simple reproduce instructions.
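
The decision described in this comment can be modeled roughly as follows. This is a sketch of the observed behavior only, not docker's source; the function name container_nameservers is hypothetical. The bug is that docker made this decision once, against whatever /etc/resolv.conf contained when the daemon started.

```sh
#!/bin/sh
# Hypothetical model of the observed behavior, not docker's actual code.
# Reads a resolv.conf on stdin and prints the nameserver lines a container would get.
container_nameservers() {
    awk '
        $1 == "nameserver" {
            lines[n++] = $0
            if ($2 == "127.0.0.1") loopback = 1
        }
        END {
            if (loopback) {
                # One loopback entry discards the whole list in favor of public DNS.
                print "nameserver 8.8.8.8"
                print "nameserver 8.8.4.4"
            } else {
                for (i = 0; i < n; i++) print lines[i]
            }
        }
    '
}
```

Feeding it a file with 127.0.0.1 plus other servers yields only the two Google addresses, which matches the "Actual results" in the description.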

Comment 9 Bill C. Riemers 2014-08-08 19:16:01 UTC
I'm still really at a loss, though, as to how this was working for me before. As you have pointed out, the 127.0.0.1 entry would essentially be ignored, and the rest of my nameserver list was in the wrong order to successfully resolve the DNS lookups that worked before but are now failing...

I've been unsuccessful getting it working again.  The IP address 172.17.42.1 simply does not work as a DNS server, even when I tell dnsmasq to listen on all interfaces.  Simply specifying a list of DNS servers doesn't cut it, as I need to use one set of DNS servers for my home network, and another for the VPN connection.

Comment 10 Lars Kellogg-Stedman 2014-08-08 19:54:06 UTC
For what it's worth, I have dnsmasq listening on the docker bridge like this:

  listen-address=172.17.42.1
  bind-interfaces

And my docker daemon running like this:

  /usr/bin/docker -d -H fd:// --selinux-enabled --dns=172.17.42.1

Which gets me /etc/resolv.conf in the container that looks like this:

  nameserver 172.17.42.1

...and it all Just Works.  You can fix the startup ordering issue by:

- Enabling the docker *service* unit (rather than the *socket*):

      systemctl enable docker

- Giving your dnsmasq service an After= dependency on the docker service, so that it starts once the docker0 bridge exists.  There are various ways of doing that; you could create /etc/systemd/system/dnsmasq.service.d/docker.conf with the contents:

    [Unit]
    After=docker.service
    Requires=docker.service

If you have dnsmasq listening on 172.17.42.1 and it's not working, it's possible you may need to tweak iptables to permit access.
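
The drop-in above could be created with something like the following sketch. The function name write_dnsmasq_dropin and its directory argument are hypothetical, added only so the step can be tried against a scratch directory before touching /etc/systemd/system.

```sh
#!/bin/sh
# Sketch: write a drop-in that orders dnsmasq after docker.
# $1 is the systemd unit directory (an assumption for testability;
# on a real host it would be /etc/systemd/system).
write_dnsmasq_dropin() {
    unit_dir=$1
    mkdir -p "$unit_dir/dnsmasq.service.d" || return 1
    cat > "$unit_dir/dnsmasq.service.d/docker.conf" <<'EOF'
[Unit]
After=docker.service
Requires=docker.service
EOF
}

# Real use, as root, followed by a reload so systemd notices the new file:
#   write_dnsmasq_dropin /etc/systemd/system
#   systemctl daemon-reload
```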

Comment 11 Bill C. Riemers 2014-08-08 20:00:06 UTC
Alright, workaround in place.  I just needed to add accept rules to iptables for port 53, e.g.:

# iptables -I INPUT -p udp --dport 53 -j ACCEPT
# iptables -I INPUT -p tcp --dport 53 -j ACCEPT


Now that I understand the bug, let me give fresh reproduction instructions.

Preparation:
# echo "nameserver 127.0.0.1" > /etc/resolv.conf.localhost
# chattr +i /etc/resolv.conf.localhost
# echo "nameserver 208.67.222.222" > /etc/resolv.conf.opendns
# echo "nameserver 208.67.220.220" >> /etc/resolv.conf.opendns
# chattr +i /etc/resolv.conf.opendns
# (cat /etc/resolv.conf.localhost /etc/resolv.conf.opendns) > /etc/resolv.conf.mixed
# chattr +i /etc/resolv.conf.mixed

Tests:

1. Opendns:

# mount --bind /etc/resolv.conf.opendns /etc/resolv.conf
# service docker restart
# docker run fedora grep nameserver /etc/resolv.conf
# umount /etc/resolv.conf

Result:
nameserver 208.67.222.222
nameserver 208.67.220.220

Expected:
nameserver 208.67.222.222
nameserver 208.67.220.220

Desired:
nameserver 208.67.222.222
nameserver 208.67.220.220

2. Localhost:

# mount --bind /etc/resolv.conf.localhost /etc/resolv.conf
# service docker restart
# docker run fedora grep nameserver /etc/resolv.conf
# umount /etc/resolv.conf

Result:
nameserver 8.8.8.8
nameserver 8.8.4.4

Expected:
nameserver 8.8.8.8
nameserver 8.8.4.4

Desired:
nameserver 172.17.42.1
nameserver 8.8.8.8
nameserver 8.8.4.4

3. Mixed:

# mount --bind /etc/resolv.conf.mixed /etc/resolv.conf
# service docker restart
# docker run fedora grep nameserver /etc/resolv.conf
# umount /etc/resolv.conf

Result:
nameserver 8.8.8.8
nameserver 8.8.4.4

Expected:
nameserver 8.8.8.8
nameserver 8.8.4.4

Desired:
nameserver 172.17.42.1
nameserver 208.67.222.222
nameserver 208.67.220.220

4. Localhost then opendns:
# mount --bind /etc/resolv.conf.localhost /etc/resolv.conf
# service docker restart
# umount /etc/resolv.conf
# mount --bind /etc/resolv.conf.opendns /etc/resolv.conf
# docker run fedora grep nameserver /etc/resolv.conf
# umount /etc/resolv.conf

Result:
nameserver 8.8.8.8
nameserver 8.8.4.4

Expected:
nameserver 208.67.222.222
nameserver 208.67.220.220

Desired:
nameserver 208.67.222.222
nameserver 208.67.220.220

5. Localhost then mixed:

# mount --bind /etc/resolv.conf.localhost /etc/resolv.conf
# service docker restart
# umount /etc/resolv.conf
# mount --bind /etc/resolv.conf.mixed /etc/resolv.conf
# docker run fedora grep nameserver /etc/resolv.conf
# umount /etc/resolv.conf

Result:
nameserver 8.8.8.8
nameserver 8.8.4.4

Expected:
nameserver 8.8.8.8
nameserver 8.8.4.4

Desired:
nameserver 172.17.42.1
nameserver 208.67.222.222
nameserver 208.67.220.220

6. Opendns then localhost

# mount --bind /etc/resolv.conf.opendns /etc/resolv.conf
# service docker restart
# umount /etc/resolv.conf
# mount --bind /etc/resolv.conf.localhost /etc/resolv.conf
# docker run fedora grep nameserver /etc/resolv.conf
# umount /etc/resolv.conf

Result:
nameserver 127.0.0.1

Expected:
nameserver 8.8.8.8
nameserver 8.8.4.4

Desired:
nameserver 172.17.42.1
nameserver 8.8.8.8
nameserver 8.8.4.4

7. Opendns then mixed

# mount --bind /etc/resolv.conf.opendns /etc/resolv.conf
# service docker restart
# umount /etc/resolv.conf
# mount --bind /etc/resolv.conf.mixed /etc/resolv.conf
# docker run fedora grep nameserver /etc/resolv.conf
# umount /etc/resolv.conf

Result:
nameserver 127.0.0.1
nameserver 208.67.222.222
nameserver 208.67.220.220

Expected:
nameserver 8.8.8.8
nameserver 8.8.4.4

Desired:
nameserver 172.17.42.1
nameserver 208.67.222.222
nameserver 208.67.220.220


In a nutshell, the current rules fail to work in that they depend on what is in /etc/resolv.conf at the time the docker daemon is started, not at the time the file is referenced.  That is why the tests fail to produce the expected results.

In many cases, the expected results are not the desired results.  The desired result is that docker containers just work.  In practice that simply won't happen by substituting public DNS for the whole content of the resolv.conf file. Something that might work is substituting 172.17.42.1 for 127.0.0.1, and possibly appending the public DNS values if that is the only entry.  The only reason to append the public DNS values is that the user needs to configure their local DNS server to accept requests from docker.  (Although presumably this could be done automatically with a set of firewall rules.)   While the public DNS behavior is broken, it is sufficient for some docker containers to work.  So in a worst-case scenario it is reasonable to append those values.
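
The "Desired" rows in the tests above can be expressed as a small filter. This is only a sketch of the proposal in this comment, not anything docker implements; the function name desired_nameservers and its optional bridge-address argument are hypothetical.

```sh
#!/bin/sh
# Sketch of the proposed rule: replace 127.0.0.1 with the docker bridge address,
# and append public DNS only when the loopback entry was the sole nameserver.
# Reads a resolv.conf on stdin; $1 optionally overrides the bridge address.
desired_nameservers() {
    bridge=${1:-172.17.42.1}
    awk -v bridge="$bridge" '
        $1 == "nameserver" {
            count++
            if ($2 == "127.0.0.1") { print "nameserver " bridge; swapped++ } else { print }
        }
        END {
            if (count == 1 && swapped == 1) {
                print "nameserver 8.8.8.8"
                print "nameserver 8.8.4.4"
            }
        }
    '
}
```

For example, `desired_nameservers < /etc/resolv.conf.mixed` would print the bridge address followed by the two OpenDNS servers, while a resolv.conf without 127.0.0.1 passes through with its nameserver lines unchanged.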

Comment 12 Bill C. Riemers 2014-08-27 14:37:06 UTC
Just correcting my previous typo and restricting the rule to just the docker subnet:

# iptables -I INPUT -s 172.17.0.0/16 -p udp --dport 53 -j ACCEPT
# iptables -I INPUT -s 172.17.0.0/16 -p tcp --dport 53 -j ACCEPT
# service iptables save

Comment 13 Daniel Walsh 2014-09-15 19:51:02 UTC
I just put out a pull request for this.

https://github.com/docker/docker/pull/8047

Basically it removes nameserver 127.0.0.1 or nameserver 127.0.1.1.

And if there are still nameserver lines in /etc/resolv.conf they will get added to the container.

We will see what upstream thinks.

Comment 14 Bill C. Riemers 2014-09-17 15:32:07 UTC
Hmmm.  Seems like an odd choice of IP addresses.   Loopback is normally 127.0.0.0/8, so really one could use virtually any loopback address, such as 127.5.11.253, and have the same problem.

A better strategy would be to use iptables to redirect loopback port 53 in the container to port 53 on the host.  However, I don't know if the way docker does containers would allow that.
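
To illustrate the point about the whole 127.0.0.0/8 range, here is a sketch of a filter that drops every IPv4 loopback nameserver rather than two specific addresses. It is not the code from the pull request; the function name strip_loopback_nameservers is hypothetical, and IPv6 loopback (::1) is deliberately left out.

```sh
#!/bin/sh
# Sketch: remove nameserver lines whose address lies anywhere in 127.0.0.0/8.
# Reads a resolv.conf on stdin and prints every other line unchanged.
strip_loopback_nameservers() {
    awk '
        $1 == "nameserver" && $2 ~ /^127\.[0-9]+\.[0-9]+\.[0-9]+$/ { next }
        { print }
    '
}
```

With this rule 127.0.0.1, 127.0.1.1 and 127.5.11.253 are all removed, while an address such as 172.127.0.1 is kept because the pattern is anchored at the first octet.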

Comment 15 Daniel Walsh 2014-09-19 19:33:14 UTC
How easy is it to write those rules?

Comment 16 Bill C. Riemers 2014-09-19 21:02:10 UTC
Writing the rules, fairly trivial:

In the container:

iptables -t nat -I PREROUTING -p tcp -d 127.0.0.1/8 --dport 53 -j DNAT --to-destination 172.17.42.1:53
iptables -t nat -I PREROUTING -p udp -d 127.0.0.1/8 --dport 53 -j DNAT --to-destination 172.17.42.1:53

On the host:

iptables -t nat -I PREROUTING -p tcp -d 172.17.42.1 --dport 53 -j DNAT --to-destination 127.0.0.1:53
iptables -t nat -I PREROUTING -p udp -d 172.17.42.1 --dport 53 -j DNAT --to-destination 127.0.0.1:53

Presumably one also needs to set the kernel flags that allow loopback traffic to be routed...

The challenge here is that the firewall rules would need to be set up without the container doing any iptables calls.  Presumably one would also want a configuration option to selectively enable or disable this feature, as otherwise it would be impossible for a container to run its own DNS server.   I think right now it is possible, just difficult, because you can't update the /etc/resolv.conf file to use the container's DNS server...

I doubt docker-io has the infrastructure to specify custom firewall rules for the container, so a brand new framework would need to be added.   That is probably well beyond what should be done to address this particular issue.

I think the patch you found will actually address 99% of the use cases for a niche condition that only affects 1% of the users.   So I would personally consider anything beyond the patch you found as an enhancement.  It is probably higher priority to address issues like being able to resize /dev/shm than it is to work on niche loopback DNS scenarios.

As with most docker-io problems, there are workarounds; it is just a question of whether the workaround is sufficient short term and long term.


Bill

Comment 17 Bill C. Riemers 2014-10-08 20:57:47 UTC
I've been having a bit of trouble with this workaround.  Every time I reboot I have to manually run the following commands:

sudo systemctl restart dnsmasq
sudo systemctl restart docker
sudo systemctl restart iptables


Everything starts up running, but it seems not consistently in the right sequence.   Usually my iptables rules are not loaded at all, and sometimes my docker containers do not have access to dnsmasq.   Today I had the fun time that docker seemed to start before all the networking so none of my containers could even ping ipv4 addresses.

Restarting all the services manually seems to work, but this seems to indicate that services are starting in a random sequence in relation to docker.

Bill

Comment 18 Bill C. Riemers 2014-10-08 21:01:18 UTC
I'm not actually sure what the correct sequence should be.   It seems like both dnsmasq and iptables need to start after docker so the docker0 interface is defined.   But if they start after, that would result in the docker service itself not having access to the network...

Comment 19 Daniel Walsh 2014-10-09 12:00:02 UTC
https://github.com/docker/docker/pull/8457 has been merged into docker.  I believe it will be in docker-1.3; otherwise we should carry the patch until docker-1.4.

Basically this will check the resolv.conf at docker run time rather than at start of the daemon, which should fix the nameserver 8.8.8.8 problem.

The other problems you discuss here should maybe be opened as separate bugzillas, or even as issues with docker, since I am not sure what the correct fix should be.

Comment 20 Bill C. Riemers 2014-10-09 12:21:59 UTC
That seems quite reasonable.

Comment 21 Daniel Walsh 2014-11-20 19:35:14 UTC
Fixed in docker-1.3.1

