Description of problem:
Sometimes when I use docker I get an error message about an unregistered netdevice.

[343948.870623] docker0: port 1(veth0f10bc6) entered disabled state
[344088.137146] unregister_netdevice: waiting for lo to become free. Usage count = 1
[344098.176868] unregister_netdevice: waiting for lo to become free. Usage count = 1
[344179.258056] docker0: port 1(vethfaf0cca) entered blocking state

Version-Release number of selected component (if applicable):
docker-1.10.3-26.git1ecb834.fc24.x86_64
kernel-4.7.2-201.fc24.x86_64

How reproducible:
happens occasionally

Steps to Reproduce:
1. start/stop docker container
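For anyone who wants to check their own logs for the same symptom, here is a minimal C sketch that recognises the stall message quoted above and extracts the interface name and usage count. The function name `parse_unregister_stall` and the 32-byte name buffer are illustrative assumptions, not part of any existing tool; the message format is taken only from the log lines in this report.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define STALL_IFNAME_MAX 32 /* assumed buffer size, larger than typical interface names */

/*
 * Hypothetical helper: returns 1 and fills ifname/usage_count when `line`
 * contains an "unregister_netdevice: waiting for <if> to become free" message,
 * otherwise returns 0 and leaves the outputs untouched.
 */
int parse_unregister_stall(const char *line, char *ifname, size_t ifname_len,
                           int *usage_count)
{
    static const char marker[] = "unregister_netdevice: waiting for ";
    char name[STALL_IFNAME_MAX];
    int count;
    const char *p = strstr(line, marker);

    if (p == NULL)
        return 0;
    p += sizeof(marker) - 1; /* skip past the marker text */

    /* Expect: "<ifname> to become free. Usage count = <n>" */
    if (sscanf(p, "%31s to become free. Usage count = %d", name, &count) != 2)
        return 0;

    snprintf(ifname, ifname_len, "%s", name);
    *usage_count = count;
    return 1;
}
```

One way to use it is to read `dmesg` output line by line with `fgets()` and call the helper on each line, counting how often each interface is reported.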
This was corrected upstream, as per the kernel changelog
https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.22

I request this fix to be backported to RHEL7.2 3.10 kernel.

commit 8b18e0e49804ad6d481482a6663b18d99510fdfe
Author: Wei Yongjun <weiyongjun1>
Date:   Mon Sep 5 16:06:31 2016 +0800

    ipv6: addrconf: fix dev refcont leak when DAD failed

    commit 751eb6b6042a596b0080967c1a529a9fe98dac1d upstream.

    In general, when DAD detected IPv6 duplicate address, ifp->state
    will be set to INET6_IFADDR_STATE_ERRDAD and DAD is stopped by a
    delayed work, the call tree should be like this:

    ndisc_recv_ns
      -> addrconf_dad_failure        <- missing ifp put
         -> addrconf_mod_dad_work
           -> schedule addrconf_dad_work()
             -> addrconf_dad_stop()  <- missing ifp hold before call it

    addrconf_dad_failure() called with ifp refcont holding but not put.
    addrconf_dad_work() call addrconf_dad_stop() without extra holding
    refcount. This will not cause any issue normally.

    But the race between addrconf_dad_failure() and addrconf_dad_work()
    may cause ifp refcount leak and netdevice can not be unregister,
    dmesg show the following messages:

    IPv6: eth0: IPv6 duplicate address fe80::XX:XXXX:XXXX:XX detected!
    ...
    unregister_netdevice: waiting for eth0 to become free. Usage count = 1

    Fixes: c15b1ccadb32 ("ipv6: move DAD and addrconf_verify processing to workqueue")
    Signed-off-by: Wei Yongjun <weiyo...>
    Signed-off-by: David S. Miller <da...>
    Signed-off-by: Greg Kroah-Hartman <gre...>

---
 net/ipv6/addrconf.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1898,6 +1898,7 @@ errdad:
 	spin_unlock_bh(&ifp->lock);

 	addrconf_mod_dad_work(ifp, 0);
+	in6_ifa_put(ifp);
 }

 /* Join to solicited addr multicast group.
@@ -3609,6 +3610,7 @@ static void addrconf_dad_work(struct wor
 		addrconf_dad_begin(ifp);
 		goto out;
 	} else if (action == DAD_ABORT) {
+		in6_ifa_hold(ifp);
 		addrconf_dad_stop(ifp, 1);
 		goto out;
 	}
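To make the reference accounting in the commit message easier to follow, here is a toy userspace model in C. It is not kernel code. All `toy_*` names are hypothetical, and three things are assumptions that the commit message does not state: that queuing the work takes a reference only when no work is already pending, that addrconf_dad_stop() and the end of addrconf_dad_work() each drop one reference, and that the race is two DAD failures arriving before the delayed work runs. The `patched` flag switches on the two lines the diff adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the address object; refcnt counts references above the baseline. */
struct toy_ifaddr {
    int refcnt;
    bool work_pending; /* models a queued addrconf_dad_work() */
};

static void toy_hold(struct toy_ifaddr *ifp) { ifp->refcnt++; }
static void toy_put(struct toy_ifaddr *ifp)  { ifp->refcnt--; }

/* Assumption: queuing takes a reference only if no work is pending yet. */
static void toy_mod_dad_work(struct toy_ifaddr *ifp)
{
    if (!ifp->work_pending)
        toy_hold(ifp);
    ifp->work_pending = true;
}

/* Models addrconf_dad_failure(): entered with one reference held by the caller. */
static void toy_dad_failure(struct toy_ifaddr *ifp, bool patched)
{
    toy_mod_dad_work(ifp);
    if (patched)
        toy_put(ifp); /* first added line: in6_ifa_put(ifp) */
}

/* Models the caller (ndisc_recv_ns in the call tree) looking up the address. */
static void toy_recv_ns(struct toy_ifaddr *ifp, bool patched)
{
    toy_hold(ifp); /* reference taken by the lookup */
    toy_dad_failure(ifp, patched);
}

/* Models the DAD_ABORT branch of addrconf_dad_work(). */
static void toy_dad_work(struct toy_ifaddr *ifp, bool patched)
{
    ifp->work_pending = false;
    if (patched)
        toy_hold(ifp); /* second added line: in6_ifa_hold(ifp) */
    toy_put(ifp);      /* assumption: addrconf_dad_stop() drops one reference */
    toy_put(ifp);      /* assumption: the work drops its own reference on exit */
}

/* Runs `failures` DAD failures before the work executes; returns leftover references. */
int toy_leaked_refs(int failures, bool patched)
{
    struct toy_ifaddr ifp = { 0, false };
    int i;

    for (i = 0; i < failures; i++)
        toy_recv_ns(&ifp, patched);
    toy_dad_work(&ifp, patched);
    return ifp.refcnt;
}
```

In this model a single failure balances with or without the patch, which matches "This will not cause any issue normally". Two failures before the work runs leave one reference behind in the unpatched variant and none in the patched one, because each function then releases exactly what it was handed.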
(In reply to Bernardo Donadio from comment #1)
> This was corrected upstream, as per the kernel changelog
> https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.22
>
> I request this fix to be backported to RHEL7.2 3.10 kernel.

Please use the RHEL process to request fixes and backports. The Fedora and RHEL bugzilla products are separate and are handled entirely differently.
I'm seeing this issue with a Fedora 24 kernel, so it is not fixed by 8b18e0e49804ad6d481482a6663b18d99510fdfe. Also note that in my case it concerns the lo interface.
Josh, comment #2, sorry. I had RHEL in mind (since my production systems are also affected by this bug), but I meant the current Fedora kernel (I'm seeing the issue on my F24 desktop). Too much coffee I guess...

Stefan, comment #3, this fix was applied to the currently supported branches of the kernel by Linus. However, since it is recent, there's a good chance that it hasn't reached Fedora 24 stable yet. I will verify it as soon as I have a bit of spare time.

In the meantime, there's quite a bit of discussion on this issue and how it affects docker at the following link: https://github.com/docker/docker/issues/5618
I also hit this issue on the newest kernel with docker:

[Sun Jan 15 10:53:44 2017] unregister_netdevice: waiting for lo to become free. Usage count = 1
[Sun Jan 15 10:54:07 2017] unregister_netdevice: waiting for lo to become free. Usage count = 1
[Sun Jan 15 10:54:17 2017] unregister_netdevice: waiting for lo to become free. Usage count = 1

kernel: 3.10.0-514.2.2.el7.x86_64
docker: 1.12.5
Any hope of seeing this backport come to CentOS 7/Atomic? I'm also facing this problem from time to time. It appears randomly, on the very latest version of Atomic.
*********** MASS BUG UPDATE **************

We apologize for the inconvenience. There are a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 24 kernel bugs.

Fedora 24 has now been rebased to 4.10.9-100.fc24. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 26, and are still experiencing this issue, please change the version to Fedora 26.

If you experience different issues, please open a new bug report for those.
*********** MASS BUG UPDATE **************

This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 2 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.