Bug 839384
Summary: | kernel hangs up in netpoll | |
---|---|---|---
Product: | Red Hat Enterprise Linux 6 | Reporter: | Andrew Vagin <avagin>
Component: | kernel | Assignee: | Rashid Khan <rkhan>
Status: | CLOSED DUPLICATE | QA Contact: | Red Hat Kernel QE team <kernel-qe>
Severity: | unspecified | Docs Contact: |
Priority: | unspecified | |
Version: | 6.3 | CC: | amwang, kdube, khorenko, manuel, rkhan, tgraf
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2012-08-08 07:53:27 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
The mainstream kernel does not hang in this case and reports the following messages in the log:

    netconsole: network logging stopped on interface eth0 as it is joining a master device
    device eth0 entered promiscuous mode
    br0: port 2(eth0) entered forwarding state
    br0: port 2(eth0) entered forwarding state

It looks like the following commit should be backported:

    commit 13f172ff26563995049abe73f6eeba828de3c09d
    Author: Neil Horman <nhorman>
    Date:   Fri Apr 22 08:10:59 2011 +0000

        netconsole: fix deadlock when removing net driver that netconsole is using (v2)

        A deadlock was reported to me recently that occured when netconsole was being
        used in a virtual guest. If the virtio_net driver was removed while netconsole
        was setup to use an interface that was driven by that driver, the guest
        deadlocked. No backtrace was provided because netconsole was the only console
        configured, but it became clear pretty quickly what the problem was. In
        netconsole_netdev_event, if we get an unregister event, we call
        __netpoll_cleanup with the target_list_lock held and irqs disabled.
        __netpoll_cleanup can, if pending netpoll packets are waiting call
        cancel_delayed_work_sync, which is a sleeping path. the might_sleep call in
        that path gets triggered, causing a console warning to be issued. The
        netconsole write handler of course tries to take the target_list_lock again,
        which we already hold, causing deadlock.

        The fix is pretty striaghtforward. Simply drop the target_list_lock and
        re-enable irqs prior to calling __netpoll_cleanup, the re-acquire the lock, and
        restart the loop. Confirmed by myself to fix the problem reported.

        Signed-off-by: Neil Horman <nhorman>
        CC: "David S. Miller" <davem>
        Signed-off-by: David S. Miller <davem>

*** Bug 839381 has been marked as a duplicate of this bug. ***

*** This bug has been marked as a duplicate of bug 769734 ***
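The locking pattern described in the commit message quoted above (drop the list lock before a call that may sleep and log, re-acquire it afterwards, then restart the scan) can be illustrated with a small userspace sketch. This is not the kernel patch: a pthread mutex and a plain flag stand in for target_list_lock with irqs disabled, and struct target, console_write, cleanup_target and handle_unregister are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MAX_TARGETS 4

/* Hypothetical stand-in for a netconsole target; not the kernel's struct. */
struct target {
    const char *dev_name;
    int attached;               /* 1 while the target still uses its device */
};

static struct target targets[MAX_TARGETS] = {
    { "eth0", 1 }, { "eth1", 1 }, { "eth0", 1 }, { "eth2", 1 },
};

/* Plays the role of target_list_lock (a spinlock in the kernel). */
static pthread_mutex_t target_list_lock = PTHREAD_MUTEX_INITIALIZER;
static int lock_held;           /* single-threaded bookkeeping for the demo */
static int messages_written;

static void list_lock(void)
{
    pthread_mutex_lock(&target_list_lock);
    lock_held = 1;
}

static void list_unlock(void)
{
    lock_held = 0;
    pthread_mutex_unlock(&target_list_lock);
}

/* Analogue of the console write handler: it takes the list lock itself. */
static void console_write(const char *msg)
{
    (void)msg;
    assert(!lock_held);         /* a caller holding the lock would self-deadlock */
    list_lock();
    messages_written++;
    list_unlock();
}

/* Analogue of a cleanup path that may sleep and emit a console message. */
static void cleanup_target(struct target *t)
{
    console_write("cleaning up target");
    t->attached = 0;
}

/* Detach every target bound to dev_name; returns how many were detached. */
static int handle_unregister(const char *dev_name)
{
    int detached = 0;
    int i;

    list_lock();
restart:
    for (i = 0; i < MAX_TARGETS; i++) {
        struct target *t = &targets[i];

        if (!t->attached || strcmp(t->dev_name, dev_name) != 0)
            continue;
        list_unlock();          /* drop the lock before the sleeping call */
        cleanup_target(t);
        list_lock();            /* re-acquire, then rescan from the start */
        detached++;
        goto restart;
    }
    list_unlock();
    return detached;
}
```

With the sample table, a call to handle_unregister("eth0") detaches the two eth0 entries and leaves the others untouched. Calling cleanup_target without the list_unlock() first would trip the assertion in console_write, which is the userspace analogue of the spinlock recursion reported by the debug kernel. The rescan terminates because each restart follows a target being marked detached, so it can never match again.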
Created attachment 597656 console.log

Description of problem:
The non-debug kernel hangs; the debug kernel reports a problem:

    <3>BUG: sleeping function called from invalid context at kernel/mutex.c:287
    <0>BUG: spinlock recursion on CPU#0, brctl/4715 (Not tainted)
    <0> lock: ffffffffa02d3000, .magic: dead4ead, .owner: brctl/4715, .owner_cpu:

Version-Release number of selected component (if applicable):
2.6.32-279.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set up netconsole on eth0
2. Create veth devices
    # ip link add type veth
    # ip link set up dev veth1
    # ip link set up dev veth0
3. Create a bridge and add veth1 to it
    # brctl addbr br0
    # brctl addif br0 veth1
4. Add eth0 to br0
    # brctl addif br0 eth0

Actual results:
The kernel hangs.

Expected results:
The kernel should not hang.