Bug 839384 - kernel hangs up in netpoll
Status: CLOSED DUPLICATE of bug 769734
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Assigned To: Rashid Khan
QA Contact: Red Hat Kernel QE team
Duplicates: 839381
Reported: 2012-07-11 15:20 EDT by Andrew Vagin
Modified: 2012-08-08 03:53 EDT
Doc Type: Bug Fix
Last Closed: 2012-08-08 03:53:27 EDT
Type: Bug

Attachments
console.log (37.16 KB, text/x-log)
2012-07-11 15:20 EDT, Andrew Vagin

Description Andrew Vagin 2012-07-11 15:20:10 EDT
Created attachment 597656

Description of problem:
The non-debug kernel hangs; the debug kernel reports the following problem:
<3>BUG: sleeping function called from invalid context at kernel/mutex.c:287
<0>BUG: spinlock recursion on CPU#0, brctl/4715 (Not tainted)
<0> lock: ffffffffa02d3000, .magic: dead4ead, .owner: brctl/4715, .owner_cpu: 

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Set up netconsole on eth0
2. Create veth devices
   # ip link add type veth
   # ip link set up dev veth1
   # ip link set up dev veth0
3. Create bridge and add veth1 to it
   # brctl addbr br0
   # brctl addif br0 veth1
4. Add eth0 to br0
   # brctl addif br0 eth0
Actual results:
The kernel hangs.

Expected results:
The kernel should not hang.
Comment 2 Andrew Vagin 2012-07-11 15:36:50 EDT
The mainstream kernel doesn't hang in this case and reports the following messages in the log:
netconsole: network logging stopped on interface eth0 as it is joining a master device
device eth0 entered promiscuous mode
br0: port 2(eth0) entered forwarding state
br0: port 2(eth0) entered forwarding state
Comment 3 Andrew Vagin 2012-07-12 03:59:32 EDT
Looks like the following commit should be back-ported.

commit 13f172ff26563995049abe73f6eeba828de3c09d
Author: Neil Horman <nhorman@tuxdriver.com>
Date:   Fri Apr 22 08:10:59 2011 +0000

    netconsole: fix deadlock when removing net driver that netconsole is using (v2)

    A deadlock was reported to me recently that occured when netconsole was being
    used in a virtual guest.  If the virtio_net driver was removed while netconsole
    was setup to use an interface that was driven by that driver, the guest
    deadlocked.  No backtrace was provided because netconsole was the only console
    configured, but it became clear pretty quickly what the problem was.  In
    netconsole_netdev_event, if we get an unregister event, we call
    __netpoll_cleanup with the target_list_lock held and irqs disabled.
    __netpoll_cleanup can, if pending netpoll packets are waiting call
    cancel_delayed_work_sync, which is a sleeping path.  the might_sleep call in
    that path gets triggered, causing a console warning to be issued.  The
    netconsole write handler of course tries to take the target_list_lock again,
    which we already hold, causing deadlock.

    The fix is pretty striaghtforward.  Simply drop the target_list_lock and
    re-enable irqs prior to calling __netpoll_cleanup, the re-acquire the lock, and
    restart the loop.  Confirmed by myself to fix the problem reported.

    Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
    CC: "David S. Miller" <davem@davemloft.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
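
For readers who want to see the locking pattern from the commit message in isolation, here is a rough user-space sketch in C. It is not the kernel code and not the actual patch. Everything in it is a hypothetical stand-in: toy_lock replaces the real spinlock (and reports recursion instead of spinning), console_write() plays the role of the netconsole write handler, cleanup_target() plays the role of __netpoll_cleanup, and the targets array replaces the real target list. Only the name target_list_lock is borrowed from the commit message. unregister_locked() models the reported problem (cleanup called with the lock held), and unregister_unlocked() models the described fix (drop the lock, clean up, re-acquire, restart the loop).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy lock: reports recursion instead of spinning forever,
 * loosely like the debug kernel's "spinlock recursion" report. */
struct toy_lock { int held; };

static int toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        return -1;          /* a real spinlock would deadlock here */
    l->held = 1;
    return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/* Hypothetical stand-in for a netconsole target bound to a device. */
struct target {
    const char *dev;
    int enabled;
};

static struct toy_lock target_list_lock;
static struct target targets[] = {
    { "eth0", 1 }, { "eth1", 1 }, { "eth0", 1 },
};
#define NTARGETS (sizeof(targets) / sizeof(targets[0]))

/* Stand-in for the console write handler: it needs the list lock. */
static int console_write(void)
{
    if (toy_lock_acquire(&target_list_lock) != 0)
        return -1;
    toy_lock_release(&target_list_lock);
    return 0;
}

/* Stand-in for a cleanup path that ends up printing a console warning. */
static int cleanup_target(struct target *t)
{
    t->enabled = 0;
    return console_write();
}

/* Problem pattern: cleanup runs with the lock held.
 * Returns -1 when the write path hits lock recursion. */
static int unregister_locked(const char *dev)
{
    int rc = 0;

    if (toy_lock_acquire(&target_list_lock) != 0)
        return -1;
    for (size_t i = 0; i < NTARGETS && rc == 0; i++) {
        if (targets[i].enabled && strcmp(targets[i].dev, dev) == 0)
            rc = cleanup_target(&targets[i]);
    }
    toy_lock_release(&target_list_lock);
    return rc;
}

/* Fix pattern: drop the lock around cleanup, re-acquire it, and restart
 * the scan. Returns the number of targets cleaned, or -1 on recursion. */
static int unregister_unlocked(const char *dev)
{
    int cleaned = 0;

    if (toy_lock_acquire(&target_list_lock) != 0)
        return -1;
restart:
    for (size_t i = 0; i < NTARGETS; i++) {
        struct target *t = &targets[i];

        if (!t->enabled || strcmp(t->dev, dev) != 0)
            continue;
        toy_lock_release(&target_list_lock);
        if (cleanup_target(t) != 0)
            return -1;
        if (toy_lock_acquire(&target_list_lock) != 0)
            return -1;
        cleaned++;
        goto restart;       /* the list may have changed while unlocked */
    }
    toy_lock_release(&target_list_lock);
    return cleaned;
}
```

In this sketch, calling unregister_locked("eth1") reports recursion, because the write path tries to take a lock the caller already holds. unregister_unlocked("eth0") completes and cleans both eth0 targets. The restart after re-acquiring the lock is needed because the list is not protected while the lock is dropped.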
Comment 4 Rashid Khan 2012-07-26 16:26:54 EDT
*** Bug 839381 has been marked as a duplicate of this bug. ***
Comment 5 Cong Wang 2012-08-08 03:53:27 EDT

*** This bug has been marked as a duplicate of bug 769734 ***
