Bug 880394

Summary:	unregister_netdevice: waiting for lo to become free. Usage count = 1
Product:	[Fedora] Fedora	Reporter:	Davide Repetto <red>
Component:	kernel	Assignee:	Rashid Khan <rkhan>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	17	CC:	cristi.falcas, gansalmon, itamar, jeder, jogreene, jonathan, kernel-maint, madhu.chinakonda, michele, mleitner, red, rkhan, rocketraman, tengo01+rh, thadden, un.f.m.o.j, vclinton
Target Milestone:	---	Keywords:	Reopened
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	FC20 with kernel 3.14.4-200.fc20.x86_64	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2014-05-23 14:26:45 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Davide Repetto 2012-11-26 21:54:26 UTC

Description of problem:
=======================
Shutting down an lxc client sometimes causes a message:
unregister_netdevice: waiting for lo to become free. Usage count = 1
After that the host machine cannot reboot as it keeps waiting.
On my machines a power cycle is required as not even sysrq-b will force the reboot.

Version-Release number of selected component (if applicable):
=============================================================
I experienced it with PAE kernels: 3.6.5-1 3.6.6-1
Today I'll install 3.6.7-4. So I cannot yet tell if it has the problem.

How reproducible:
=================
Not consistently. To me though it happens most often. 
Someone has written a c program to toggle consistently.

Additional info:
================
This is the same problem and some patch have been created:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1021471

Upstream kernel seems to have a solution too:
https://bugzilla.kernel.org/show_bug.cgi?id=47181

Comment 1 Davide Repetto 2012-11-26 21:56:44 UTC

the same happens (less often) with kernel 3.6.6-1.fc16.x86_64

Comment 2 Joachim von Thadden 2013-02-07 09:33:25 UTC

This problem seems to be active since a longer time and has supposedly been fixed in a kernel version but reappeared.

It is easily reproducible:
- login to container
- scp some data (probably more than 50 MB) from your local network into the container
- halt the container with: halt

My kernel is
[root@jo-laptop ~]# uname -a
Linux jo-laptop.athome.de 3.7.3-101.fc17.x86_64 #1 SMP Fri Jan 18 17:40:57 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

But the problem is there since a looooooooong time. I see this since I use LXC heavily, at least since Fedora 16 and also on RedHat 6. So many older kernels are affected, too.

There are already bug reports in Ubuntu and there should have been a fix in 3.5.0, but the problem reoccurred with 3.6. This is the report with the former fix:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1065434

And this is the report with the reappearance of the bug (if it has ever been solved...):
https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1039196

Comment 3 Davide Repetto 2013-02-23 00:34:24 UTC

Solved on new kernels

Comment 4 Jeremy Eder 2013-08-23 18:09:12 UTC

Still occurs on 3.10.7-200.fc19.x86_64

Comment 6 Michele Baldessari 2013-11-20 11:25:59 UTC

Jeremy,

keep an eye on this one:
http://marc.info/?l=linux-netdev&m=138491716317270&w=2

Looks relevant.

hth,
Michele

Comment 7 Michele Baldessari 2013-11-26 21:15:40 UTC

This potential fix has hit upstream (v3.13):
commit dcdfdf56b4a6c9437fc37dbc9cee94a788f9b0c4
Author: Alexei Starovoitov <ast>
Date:   Tue Nov 19 19:12:34 2013 -0800

    ipv4: fix race in concurrent ip_route_input_slow()
    
    CPUs can ask for local route via ip_route_input_noref() concurrently.
    if nh_rth_input is not cached yet, CPUs will proceed to allocate
    equivalent DSTs on 'lo' and then will try to cache them in nh_rth_input
    via rt_cache_route()
    Most of the time they succeed, but on occasion the following two lines:
        orig = *p;
        prev = cmpxchg(p, orig, rt);
    in rt_cache_route() do race and one of the cpus fails to complete cmpxchg.
    But ip_route_input_slow() doesn't check the return code of rt_cache_route(),
    so dst is leaking. dst_destroy() is never called and 'lo' device
    refcnt doesn't go to zero, which can be seen in the logs as:
        unregister_netdevice: waiting for lo to become free. Usage count = 1
    Adding mdelay() between above two lines makes it easily reproducible.
    Fix it similar to nh_pcpu_rth_output case.
    
    Fixes: d2d68ba9fe8b ("ipv4: Cache input routes in fib_info nexthops.")
    Signed-off-by: Alexei Starovoitov <ast>
    Signed-off-by: David S. Miller <davem>

The regression was introduced in v3.6 so it would match the data from this BZ
quite well.

Comment 8 John Greene 2014-05-19 21:27:45 UTC

Have you tried the later kernel? Is it still an issue?

Comment 9 John Greene 2014-05-23 14:26:45 UTC

The patch you describe is in 3.12, and looks promising.  Please give that try and reopen this is your issue persists.  Happy to help, but I need some replies.

I'll assume this is no longer a problem and close this.

Comment 10 Davide Repetto 2014-05-28 11:19:49 UTC

At least it seems to be fixed in FC20 with kernel 3.14.4-200.fc20.x86_64.

Comment 11 John Greene 2014-05-28 14:06:32 UTC

Thanks Davide.  Have updated the Bug with this info.

Comment 12 Vaughn Clinton 2015-07-04 09:34:13 UTC

I'm still seeing this bug whenever I restart/reboot a container instance.  I'm running Centos 7 with the most recent updates.  I'm running LXC commands and have not attempted this with docker:


[22990.726061] virbr0: port 2(veth8XFXRX) entered disabled state
[23000.827016] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23010.939016] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23021.048019] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23031.163009] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23041.278016] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23051.387018] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23061.493018] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23071.599020] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23081.714021] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23091.842022] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23101.969029] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23112.078016] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23122.197027] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23132.306016] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23142.412020] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23152.515019] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23160.721096] INFO: task lxc-start:3051 blocked for more than 120 seconds.
[23160.721103] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[23160.721106] lxc-start       D ffff88025fdd3680     0  3051   2955 0x00000084
[23160.721111]  ffff880093367d30 0000000000000086 ffff8800969b6660 ffff880093367fd8
[23160.721121]  ffff880093367fd8 ffff880093367fd8 ffff8800969b6660 ffffffff819e18a0
[23160.721124]  ffffffff819e18a4 ffff8800969b6660 00000000ffffffff ffffffff819e18a8
[23160.721127] Call Trace:
[23160.721138]  [<ffffffff8160a899>] schedule_preempt_disabled+0x29/0x70
[23160.721142]  [<ffffffff816085e5>] __mutex_lock_slowpath+0xc5/0x1c0
[23160.721144]  [<ffffffff81607a4f>] mutex_lock+0x1f/0x2f
[23160.721149]  [<ffffffff814f4cf1>] copy_net_ns+0x71/0x130
[23160.721154]  [<ffffffff8109c469>] create_new_namespaces+0xf9/0x180
[23160.721157]  [<ffffffff8109c60e>] copy_namespaces+0x8e/0xd0
[23160.721161]  [<ffffffff8106c6d5>] copy_process.part.25+0x925/0x14d0
[23160.721164]  [<ffffffff8106d43c>] do_fork+0xbc/0x350
[23160.721169]  [<ffffffff811138fd>] ? call_rcu_sched+0x1d/0x20
[23160.721172]  [<ffffffff8109df9f>] ? __put_cred+0x3f/0x50
[23160.721175]  [<ffffffff8106d756>] SyS_clone+0x16/0x20
[23160.721179]  [<ffffffff81614559>] stub_clone+0x69/0x90
[23160.721181]  [<ffffffff81614209>] ? system_call_fastpath+0x16/0x1b
[23162.624019] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23172.739022] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23182.860022] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23192.972031] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23203.087029] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23213.199028] unregister_netdevice: waiting for lo to become free. Usage count = 1
[23223.308020] unregister_netdevice: waiting for lo to become free. Usage count = 1

----------- Kernel Version ------------------------

Centos 7 Kernel version: 3.10.0-229.7.2.el7.x86_64

----------- CPU Info ------------------------------

cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
stepping	: 6
microcode	: 0x60f
cpu MHz		: 2327.288
cache size	: 6144 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm dtherm tpr_shadow vnmi flexpriority
bogomips	: 4654.57
clflush size	: 64
cache_alignment	: 64
address sizes	: 38 bits physical, 48 bits virtual
power management:


Host Virbr0 and L0 interfaces:

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 0  (Local Loopback)
        RX packets 10  bytes 1076 (1.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 10  bytes 1076 (1.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

virbr0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 192.168.122.1  netmask 255.255.255.0  broadcast 192.168.122.255
        ether 52:54:00:a0:86:0e  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         192.168.1.1     0.0.0.0         UG    100    0        0 enp4s0f0
192.168.1.0     0.0.0.0         255.255.255.0   U     100    0        0 enp4s0f0
192.168.122.0   0.0.0.0         255.255.255.0   U     0      0        0 virbr0



Let me know what additional information is required.  I can't seem to find a pattern to be able to reproduce this on demand.  I've tried heavy network IO and I'm still unable to consistently been able to reproduce the above behavior on demand.

Vaughn

Comment 13 Raman Gupta 2016-02-05 04:46:58 UTC

Just saw this on 4.3.3-303.fc23.x86_64.

Comment 14 un.f.m.o.j 2016-07-08 06:28:28 UTC

Experiencing it on 4.5.7-200.fc23.x86_64

Comment 15 Marcelo Ricardo Leitner 2016-07-08 16:28:51 UTC

(In reply to un.f.m.o.j from comment #14)
> Experiencing it on 4.5.7-200.fc23.x86_64

Please report a new bug then. And if possible, give a run with a 4.6 kernel, though I'm not sure it's available for F23 already.