Bug 567540 - unregister_netdevice: waiting for veth5 to become free when I remove netloop
Summary: unregister_netdevice: waiting for veth5 to become free when I remove netloop
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Laszlo Ersek
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 514490
Reported: 2010-02-23 08:18 UTC by masanari iida
Modified: 2011-07-21 10:26 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-07-21 10:26:45 UTC
Target Upstream Version:


Attachments
render netloop permanent (971 bytes, patch)
2010-12-09 12:42 UTC, Laszlo Ersek
no flags


Links
System: Red Hat Product Errata
ID: RHSA-2011:1065
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.7 kernel security and bug fix update
Last Updated: 2011-07-21 09:21:37 UTC

Description masanari iida 2010-02-23 08:18:19 UTC
Description of problem:
When I try to remove the netloop module from the kernel, the following message
appears repeatedly, and I can no longer operate the OS.

host kernel: unregister_netdevice: waiting for veth5 to become free. Usage count = 1

Version-Release number of selected component (if applicable):
RHEL5 (2.6.18-164.11.1.el5xen)


How reproducible:
Always

Steps to Reproduce:
1. Install xen kernel.
   This system has no guest OS at the moment.

2. Configure bonding for DOM-0.
   I don't know whether this is needed to reproduce the problem.

3. Log in to the host OS as the superuser.

4. # rmmod netloop

  
Actual results:
host kernel: unregister_netdevice: waiting for veth5 to become free. Usage count = 1
Message from syslogd@ at Tue Feb 23 16:48:40 2010 ...
host last message repeated 3 times
Message from syslogd@ at Tue Feb 23 16:49:41 2010 ...
host last message repeated 6 times
Message from syslogd@ at Tue Feb 23 16:50:43 2010 ...
host last message repeated 6 times
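
For anyone who wants to spot this condition in a syslog, here is a minimal sketch that pulls the device name and usage count out of the kernel message shown above. The function name `parse_wait` is made up for illustration, and the pattern is based only on the message text quoted in this report.

```python
import re

# Pattern modelled on the message quoted in this report; other kernels
# may word it differently.
WAIT_RE = re.compile(
    r"unregister_netdevice: waiting for (?P<dev>\S+) to become free\. "
    r"Usage count = (?P<count>\d+)"
)


def parse_wait(line):
    """Return (device, usage_count) if the line carries the message, else None."""
    m = WAIT_RE.search(line)
    if m is None:
        return None
    return m.group("dev"), int(m.group("count"))


print(parse_wait(
    "host kernel: unregister_netdevice: waiting for veth5 to become free. "
    "Usage count = 1"
))  # -> ('veth5', 1)
```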

Expected results:
Display "ERROR: Module netloop is in use"
and return to the prompt quickly.

Additional info:

Comment 1 Laszlo Ersek 2010-12-08 14:41:01 UTC
peth_X --- xenbr_X --- vif0.X --- eth_X (dom0)
   \                    netloop    netloop
    \
  iptables FORWARD
      \
       \
      virbr_Y --------- vifU.Y --- eth_Y (domU)
                        netback    netfront

I was able to reproduce the bug on a 2.6.18-233.el5xen host, with the same message. Network access to the host died (understandably), but the console still works.

To me it doesn't seem very useful to try to remove netloop. Based on the result you expected, I assume you don't really want netloop removed, but rather want correct error checking. I think the netloop module should be made permanent, like netback is.

# uname -r
2.6.18-233.el5xen

# lsmod | egrep 'netbk|netloop'
netloop                40001  0 
netbk                 130305  0 [permanent]

I'll try to dig up an upstream commit to this effect, or if I find none, I'll try to write it myself.
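
As a rough aid for checking this from a script, the sketch below scans lsmod-style text for the `[permanent]` marker on a given module. The helper name `is_permanent` is hypothetical, and the sample text is just the two lines quoted above; it assumes the marker shows up in the "Used by" column, possibly alongside comma-separated users.

```python
def is_permanent(lsmod_text, name):
    """Return True if lsmod-style output marks module `name` as [permanent]."""
    for line in lsmod_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            # Everything after name, size and use count is the "Used by"
            # column; split it on commas in case the marker follows users.
            used_by = ",".join(fields[3:]).split(",")
            return "[permanent]" in used_by
    return False  # module not listed at all


SAMPLE = """\
netloop                40001  0 
netbk                 130305  0 [permanent]
"""

print(is_permanent(SAMPLE, "netloop"))  # -> False
print(is_permanent(SAMPLE, "netbk"))    # -> True
```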

Comment 2 Laszlo Ersek 2010-12-08 17:03:53 UTC
A Debian bug was reported for this behavior in 2007:

    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=425703

It was not fixed, just closed when the netloop code disappeared from the Debian kernel. The issue also popped up in a Fedora-xen thread:

    http://www.redhat.com/archives/fedora-xen/2007-April/msg00074.html

Daniel P. Berrange stated at that time, "This is basically a limitation of the netloop module - its not written to allow its removal once loaded." That's a good hint for me to make netloop permanent, even though netloop looks as if it wanted to support removal (see clean_loopback()).

(I presume the removal fails *exactly* because the two halves of the loop are linked to each other, and whichever half you wanted to remove first, it has the other half referring to it -- there's a loop in the dependency graph. To make it removable, I guess one should code the teardown as a precise reversal of the setup -- stop the transmission on both sides first, atomically, then dissociate the halves, then bring down both halves. I believe this might not fit the current netdevice attitude.)
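
To illustrate the presumption above, here is a toy user-space model, not kernel code. The class `Dev`, the helpers, and the pairing of "veth5" with "vif0.5" are assumptions made up for the sketch. It only shows why a mutual reference keeps the usage count at 1, and how a teardown that reverses the setup would let both halves go.

```python
class Dev:
    """Toy stand-in for one half of the loop, with a usage count."""

    def __init__(self, name):
        self.name = name
        self.refcnt = 0
        self.peer = None
        self.running = True


def link(a, b):
    # Setup: each half keeps a reference on the other, forming a cycle.
    a.peer, b.peer = b, a
    a.refcnt += 1
    b.refcnt += 1


def try_unregister(dev):
    # Unregistration can only finish once nothing refers to the device.
    if dev.refcnt != 0:
        return "waiting for %s to become free. Usage count = %d" % (
            dev.name, dev.refcnt)
    return "removed " + dev.name


def ordered_teardown(a, b):
    # 1. Stop transmission on both sides.
    a.running = b.running = False
    # 2. Dissociate the halves, dropping the mutual references.
    a.peer = b.peer = None
    a.refcnt -= 1
    b.refcnt -= 1
    # 3. Bring down both halves.
    return [try_unregister(a), try_unregister(b)]


veth, vif = Dev("veth5"), Dev("vif0.5")
link(veth, vif)
print(try_unregister(veth))         # still referenced by its peer
print(ordered_teardown(veth, vif))  # both halves can now be released
```

In the model, removing one half on its own never succeeds, because the other half still holds its reference. Whether the real netdevice code could be restructured this way is a separate question.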

The removal code in netloop is "blamed" on this commit:

    commit 23a376dfe8864bfb6e410675b7ec227e1fc27fb8
    Author: Dave Jones <davej@redhat.com>
    Date:   Sat Oct 15 00:00:00 2005 -0400

	xen:

	The Xen patch

	  * linux-2.6 /mnt/ro/repos/hg/linux-2.6
	changeset:   36267:7fd51353a76f
	  * linux-2.6-xen-fedora /mnt/ro/repos/hg/linux-2.6-xen-fedora
	changeset:   36192:656cc38a840d
	  * xen-3.0.3-testing /mnt/ro/repos/hg/xen-3.0.3-testing
	changeset:   11633:000aa9510e55
	  * linux-2.6-xen-3.0.3 /mnt/ro/repos/hg/linux-2.6-xen-3.0.3
	changeset:   22908:55fbb4a85ac3

I was unable to trace those.

There was also this patch on xen-devel:

    http://lists.xensource.com/archives/html/xen-devel/2006-02/msg01033.html

merged as

    http://xenbits.xensource.com/xen-unstable.hg?rev/271cb04a4f2b

Comment 3 Laszlo Ersek 2010-12-09 12:42:47 UTC
Created attachment 467736 [details]
render netloop permanent

Comment 4 Laszlo Ersek 2010-12-09 12:44:25 UTC
Sent patch in comment 3 to upstream too:

  http://lists.xensource.com/archives/html/xen-devel/2010-12/msg00541.html

Comment 5 Laszlo Ersek 2010-12-10 16:14:54 UTC
Upstream c/s 1058/f3d9d0deec4e:

  http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/f3d9d0deec4e

Comment 7 RHEL Program Management 2011-02-01 16:52:18 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 9 Jarod Wilson 2011-02-09 14:55:09 UTC
The fix is in kernel-2.6.18-243.el5.
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 12 Jinxin Zheng 2011-05-18 07:36:28 UTC
I reproduced this on kernel-xen -238, using the steps provided in the main description. The host prints the following message and hangs upon `rmmod netloop`:

host kernel: unregister_netdevice: waiting for vethN to become free. Usage count = 1

On -261 the netloop module is made permanent:

# lsmod | grep netloop
netloop                40001  0 [permanent]

and removing the module produces an error message without a hang:

# rmmod netloop
ERROR: Removing 'netloop': Device or resource busy

So I'm setting this to VERIFIED.

Comment 13 errata-xmlrpc 2011-07-21 10:26:45 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html

