Bug 308971 - rapid block device plug / unplug leads to kernel crash and/or soft lockup
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Hardware: All Linux
Priority: low, Severity: low
Assigned To: Xen Maintainance List
QA Contact: Martin Jenner
Depends On:
Blocks: 426031
Reported: 2007-09-27 09:32 EDT by Ian Campbell
Modified: 2008-05-21 10:56 EDT
Fixed In Version: RHBA-2008-0314
Doc Type: Bug Fix
Last Closed: 2008-05-21 10:56:27 EDT

Attachments
linux-2.6.180-xen 217:c1c57fea77e9 ported to 2.6.18-53.el5 (926 bytes, application/octet-stream)
2007-12-14 04:17 EST, Ian Campbell
linux-2.6.18-xen 217:c1c57fea77e9 ported to 2.6.9-67.EL (926 bytes, patch)
2007-12-14 04:18 EST, Ian Campbell

Description Ian Campbell 2007-09-27 09:32:13 EDT
Description of problem:
Rapidly plugging and unplugging a block device eventually leads to a kernel
crash, caused by a double free, when the device is unplugged before it is
fully established.

This affects RHEL5 and RHEL4u5. The fix is up at
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/c1c57fea77e9 and/or

The fix depends on 
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/11483a00c017 and/or
http://xenbits.xensource.com/kernels/rhel4x.hg?rev/156e3eaca552 which are
attached to #247265 but only applied to RHEL5, I think.

Version-Release number of selected component (if applicable):
2.6.18-8.el5 and 2.6.9-55.EL

How reproducible:
A simple loop which attaches and detaches a block device as quick as possible is
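
A minimal sketch of such a loop, for illustration only. The report does not
say which toolstack or script was used; this assumes the xm toolstack and its
block-attach / block-detach subcommands. The function names and the domain and
device arguments are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence

# A runner takes one command line and returns its exit status.
Runner = Callable[[Sequence[str]], int]


def build_commands(domain: str, backend: str, frontend: str) -> List[List[str]]:
    """Return the attach/detach command pair for one loop iteration."""
    # Assumption: xm toolstack; "w" attaches the device read-write.
    attach = ["xm", "block-attach", domain, backend, frontend, "w"]
    # Assumption: the toolstack accepts the frontend name as the device id;
    # if it does not, pass the numeric device id instead.
    detach = ["xm", "block-detach", domain, frontend]
    return [attach, detach]


def run_xm(cmd: Sequence[str]) -> int:
    """Run one command and return its exit status."""
    return subprocess.call(list(cmd))


def stress(domain: str, backend: str, frontend: str,
           iterations: int = 1000, runner: Runner = run_xm) -> int:
    """Attach and detach the device back to back, with no delay in between.

    Failures are counted rather than raised, because a detach that races
    with a half-finished attach may legitimately fail.
    Returns the number of commands that exited non-zero.
    """
    failures = 0
    for _ in range(iterations):
        for cmd in build_commands(domain, backend, frontend):
            if runner(cmd) != 0:
                failures += 1
    return failures
```

For example, stress("rhel5-guest", "phy:/dev/vg0/test", "xvdb") would be run
in dom0 while watching the guest console; the guest name and backing device
here are placeholders. The runner argument exists only so the loop can be
exercised without a Xen host.
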
Actual results:
The backtrace on 2.6.9-55.EL:

 [<c043ea67>] softlockup_tick+0x98/0xa6
 [<c0408b7d>] timer_interrupt+0x504/0x557
 [<c043ec9b>] handle_IRQ_event+0x27/0x51
 [<c043ed58>] __do_IRQ+0x93/0xe8
 [<c040672b>] do_IRQ+0x93/0xae
 [<c053a045>] evtchn_do_upcall+0x64/0x9b
 [<c0404ec5>] hypervisor_callback+0x3d/0x48
 [<c045007b>] __pte_alloc+0x1e1/0x21a
 [<c05f4e18>] _spin_lock+0x7/0xf
 [<c05f4179>] __mutex_lock_slowpath+0x19/0x74
 [<c04d36d0>] __next_cpu+0x12/0x21
 [<c045e2ee>] kfree+0xe/0x77
 [<c05f41e3>] .text.lock.mutex+0xf/0x14
 [<c04d3e3d>] kobject_cleanup+0x4a/0x5e
 [<c04c6467>] elevator_exit+0xe/0x32
 [<c04c931c>] blk_cleanup_queue+0x2d/0x36
 [<d1029e6d>] xlvbd_del+0x4d/0x56 [xenblk]
 [<d10293f4>] blkfront_closing+0x45/0x4f [xenblk]
 [<d1029e1b>] blkif_release+0x34/0x39 [xenblk]
 [<c0468cb4>] __blkdev_put+0x50/0x123
 [<c04950d8>] register_disk+0x117/0x155
 [<c04cc39c>] add_disk+0x2b/0x36
 [<c04cbd9c>] exact_match+0x0/0x4
 [<c04cc248>] exact_lock+0x0/0xd
 [<d1029d7f>] backend_changed+0xf3/0x153 [xenblk]
 [<c053fc58>] otherend_changed+0x74/0x79
 [<c053e442>] xenwatch_handle_callback+0x12/0x44
 [<c053ee3a>] xenwatch_thread+0xf1/0x107
 [<c042cbe1>] autoremove_wake_function+0x0/0x2d
 [<c053ed49>] xenwatch_thread+0x0/0x107
 [<c042cb15>] kthread+0xc0/0xeb
 [<c042ca55>] kthread+0x0/0xeb
 [<c04029b5>] kernel_thread_helper+0x5/0xb
Comment 1 Ian Campbell 2007-12-14 04:17:42 EST
Created attachment 288811
linux-2.6.180-xen 217:c1c57fea77e9 ported to 2.6.18-53.el5
Comment 2 Ian Campbell 2007-12-14 04:18:14 EST
Created attachment 288821
linux-2.6.18-xen 217:c1c57fea77e9 ported to 2.6.9-67.EL
Comment 3 Don Dutile 2007-12-14 16:07:23 EST
Simple fix for corner case; working on rhel5 patch & test right now.
Comment 4 Don Dutile 2007-12-17 16:30:05 EST
Posted patch to 5.2.
Comment 5 Bill Burns 2007-12-19 13:54:41 EST
Setting dev ack.
Comment 7 Don Zickus 2007-12-21 15:17:58 EST
in 2.6.18-62.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 10 errata-xmlrpc 2008-05-21 10:56:27 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

