Bug 1622543 - [f29 regression] networking / IP routing is broken in docker/podman containers
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 29
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-08-27 12:31 UTC by Martin Pitt
Modified: 2018-10-02 09:21 UTC
CC List: 31 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-10-02 09:21:13 UTC
Type: Bug
Embargoed:


Attachments
journal from the f29 host (328.45 KB, text/plain)
2018-08-27 12:31 UTC, Martin Pitt

Description Martin Pitt 2018-08-27 12:31:18 UTC
Created attachment 1478942
journal from the f29 host

Description of problem: Containers on Fedora 29 hosts cannot reach the internet. For example:

# docker run -it --rm docker.io/fedora:27 bash
[root@eca50f7c7af7 /]# dnf -v -y update
DNF version: 2.7.5
cachedir: /var/cache/dnf
Cannot download 'https://mirrors.fedoraproject.org/metalink?repo=updates-released-f27&arch=x86_64': Cannot prepare internal mirrorlist: Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=updates-released-f27&arch=x86_64 [Could not resolve host: mirrors.fedoraproject.org].
Error: Failed to synchronize cache for repo 'updates'

# getent ahosts fedoraproject.org || echo FAIL
FAIL

It's not really a DNS problem; IP routing is broken in general:

# curl https://140.211.169.206  
curl: (7) Failed to connect to 140.211.169.206 port 443: No route to host

The same error occurs in docker.io/fedora:28, so this isn't specific to an f27 container.

The default containers for f27 and f28 don't have the "ip" tool, which makes debugging a bit nasty. The interface exists at least, but I think it is down:

# cat /sys/class/net/eth0/link_mode 
0
# cat /sys/class/net/eth0/carrier   
1
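Without the "ip" tool, the same sysfs attributes can be read in one go. The following is only a minimal sketch: `read_iface_state` and its `sysfs_root` parameter are hypothetical names chosen for illustration, and it assumes a Python interpreter is available in the container. Besides the two files shown above it also reads `operstate`, which was not checked in this report. Per the kernel's sysfs-class-net documentation, a `carrier` value of 1 means the physical link is up, so the values above suggest the interface itself is not down.

```python
from pathlib import Path


def read_iface_state(iface, sysfs_root="/sys/class/net"):
    """Return selected sysfs attributes of a network interface as strings.

    Attributes that cannot be read (missing file, or the kernel refusing
    the read, as it does for 'carrier' on an administratively down
    interface) are reported as None.
    """
    base = Path(sysfs_root) / iface
    state = {}
    for attr in ("operstate", "carrier", "link_mode"):
        try:
            state[attr] = (base / attr).read_text().strip()
        except OSError:
            state[attr] = None
    return state


if __name__ == "__main__":
    # 'eth0' is the interface name seen inside the container in this report.
    for key, value in read_iface_state("eth0").items():
        print(f"{key}: {value}")
```

An `operstate` of "up" together with `carrier` 1 would point away from a link problem and towards routing or forwarding on the host.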

# cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0:    4082      43    0    0    0     0          0         0     3902      49    0    0    0     0       0          0
    lo:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0

# cat /proc/net/route 
Iface	Destination	Gateway 	Flags	RefCnt	Use	Metric	Mask		MTU	Window	IRTT                                                       
eth0	00000000	010011AC	0003	0	0	0	00000000	0	0	0                                                                               
eth0	000011AC	00000000	0001	0	0	0	0000FFFF	0	0	0   
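The hex fields in /proc/net/route are hard to read by eye. The sketch below decodes them, assuming a little-endian host (x86_64, as in this report) and an available Python interpreter; `hex_to_ipv4`, `parse_routes` and `SAMPLE` are hypothetical names used only for illustration. `SAMPLE` reproduces the two route lines above with whitespace normalized.

```python
import socket
import struct

# The two routes from the container, whitespace normalized.
SAMPLE = """\
Iface Destination Gateway Flags RefCnt Use Metric Mask MTU Window IRTT
eth0 00000000 010011AC 0003 0 0 0 00000000 0 0 0
eth0 000011AC 00000000 0001 0 0 0 0000FFFF 0 0 0
"""

RTF_UP = 0x0001       # route is usable
RTF_GATEWAY = 0x0002  # destination is reached via a gateway


def hex_to_ipv4(hex_field):
    """Decode one hex address field, assuming a little-endian host."""
    return socket.inet_ntoa(struct.pack("<I", int(hex_field, 16)))


def parse_routes(text):
    """Parse /proc/net/route content into a list of dicts."""
    routes = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 8:
            continue
        routes.append({
            "iface": fields[0],
            "destination": hex_to_ipv4(fields[1]),
            "gateway": hex_to_ipv4(fields[2]),
            "flags": int(fields[3], 16),
            "mask": hex_to_ipv4(fields[7]),
        })
    return routes


if __name__ == "__main__":
    for route in parse_routes(SAMPLE):
        print(route)
```

Decoded this way, the table contains a default route via gateway 172.17.0.1 (flags RTF_UP and RTF_GATEWAY) and an on-link route for 172.17.0.0 with mask 255.255.0.0, so the routing table inside the container looks complete.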


Version-Release number of selected component (if applicable):

docker-1.13.1-62.git9cb56fd.fc29.x86_64
kernel-4.18.5-300.fc29.x86_64

Networking works fine on the F29 host itself: resolving fedoraproject.org, running dnf update, and so on all succeed.

# nmcli d
DEVICE   TYPE      STATE         CONNECTION  
eth0     ethernet  connected     System eth0 
docker0  bridge    connected     docker0     

How reproducible: Always

I have attached the journal from the host, in case that's helpful.

Comment 1 Martin Pitt 2018-08-27 12:47:02 UTC
For comparison I tried this with

    podman run -it --rm docker.io/fedora:27 bash

and this has exactly the same problem.

Comment 2 Tomas Tomecek 2018-08-30 07:02:41 UTC
I am on rawhide and have exactly the same version of docker as you, and both podman and docker containers have a working internet connection:


$ podman run --rm -ti registry.fedoraproject.org/fedora:28 bash                                                                                                                                         
[root@78b00d06c06a /]# getent hosts redhat.com
10.4.204.55     redhat.com

[root@78b00d06c06a /]# curl --head -L https://redhat.com                                                                                                                                                          
HTTP/1.0 301 Moved Permanently

HTTP/1.0 301 Moved Permanently

HTTP/1.0 200 OK


$ docker run --rm -ti registry.fedoraproject.org/fedora:28 bash
[root@f4297b113c13 /]# getent hosts redhat.com
10.4.204.55     redhat.com

[root@f4297b113c13 /]# getent ahosts redhat.com
10.4.204.55     STREAM redhat.com  
10.4.204.55     DGRAM
10.4.204.55     RAW

[root@f4297b113c13 /]# curl --head -L https://redhat.com/
HTTP/1.0 301 Moved Permanently

HTTP/1.0 301 Moved Permanently

HTTP/1.0 200 OK


I use this software:

docker-1.13.1-62.git9cb56fd.fc29.x86_64
podman-0.8.3-4.dev.git3d55721.fc30.x86_64

Linux oat 4.17.18-200.fc28.x86_64 #1 SMP Wed Aug 22 19:08:07 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

I can see that podman is using containernetworking-plugins-0.7.3-1.fc30.x86_64 but docker is not. Does that mean the problem is actually lower in the stack: kernel, glibc, or NetworkManager?

Comment 3 Bohuslav "Slavek" Kabrda 2018-08-30 07:07:05 UTC
I've also run into this problem and managed to work around it by booting the kernel from fc28. The fc29 kernel that doesn't work for me is kernel-4.18.5-300.fc29.x86_64, so the bug is probably there. I'm now running kernel-4.17.19-200.fc28.x86_64 and everything works fine with no other packages changed.

Comment 4 Martin Pitt 2018-08-30 08:04:54 UTC
Thanks. Indeed, Tomas is also running kernel 4.17. Does that mean that rawhide has an older kernel than F29? Or did you simply not upgrade to the latest one yet?

Based on all of our observations, this looks like a kernel regression in 4.18.5, or possibly some userspace component that needs to be adjusted to changes in that kernel. So I'll reassign to kernel for the time being.

Comment 5 Laura Abbott 2018-08-30 16:44:28 UTC
Is this at all related to https://bugzilla.redhat.com/show_bug.cgi?id=1623868 ? Failing that, bisection is going to be the fastest way to figure this out.

Comment 6 Laura Abbott 2018-10-01 21:28:21 UTC
Did this get resolved with later 4.18.x stable kernels?

Comment 7 Martin Pitt 2018-10-02 09:21:13 UTC
Confirmed, container networking works fine again in current Fedora 29. Thanks!

