Bug 426031 - rapid block device plug / unplug leads to kernel crash and/or soft lockup
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel-xen
Version: 4.6
Hardware: All Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Don Dutile
QA Contact: Martin Jenner
Depends On: 308971
Reported: 2007-12-17 16:25 EST by Don Dutile
Modified: 2008-07-24 15:23 EDT
Fixed In Version: RHSA-2008-0665
Doc Type: Bug Fix
Last Closed: 2008-07-24 15:23:30 EDT
Description Don Dutile 2007-12-17 16:25:42 EST
+++ This bug was initially created as a clone of Bug #308971 +++

Description of problem:
Rapidly plugging and unplugging a block device eventually leads to a kernel
crash, caused by a double free, when the device is unplugged before it is fully
established.

This affects RHEL5 and RHEL4u5. The fix is available at
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/c1c57fea77e9 and/or
http://xenbits.xensource.com/kernels/rhel4x.hg?rev/5bccd45a081d

The fix depends on 
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/11483a00c017 and/or
http://xenbits.xensource.com/kernels/rhel4x.hg?rev/156e3eaca552 which are
attached to #247265 but only applied to RHEL5, I think.

Version-Release number of selected component (if applicable):
2.6.18-8.el5 and 2.6.9-55.EL

How reproducible:
Always, eventually. A simple loop that attaches and detaches a block device as
quickly as possible is sufficient.
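
The report does not include the loop itself, so the following is only a minimal sketch of what such a reproducer might look like when driven from dom0 with the `xm` toolstack. The guest name, backing file, and frontend device passed in are hypothetical, and passing the frontend device name to `xm block-detach` is an assumption: some `xm` versions may require the numeric device ID instead.

```python
import subprocess


def cycle_commands(domain, backend, frontend, mode="w"):
    """Return the attach and detach argv lists for one plug/unplug cycle."""
    attach = ["xm", "block-attach", domain, backend, frontend, mode]
    # Assumption: block-detach accepts the frontend device name as the
    # device identifier; a numeric device ID may be needed instead.
    detach = ["xm", "block-detach", domain, frontend]
    return attach, detach


def run_cycles(count, domain, backend, frontend, runner=subprocess.call):
    """Attach and detach the device `count` times with no delay in between.

    `runner` takes an argv list and returns an exit status. It is injectable
    so the loop can be dry-run on a machine without Xen installed.
    Returns the number of commands that exited non-zero.
    """
    attach, detach = cycle_commands(domain, backend, frontend)
    failures = 0
    for _ in range(count):
        for argv in (attach, detach):
            if runner(argv) != 0:
                failures += 1
    return failures
```

On a dom0, something like `run_cycles(1000, "rhel4-guest", "file:/var/tmp/scratch.img", "xvdb")` would issue the attach and detach back to back, which is the pattern the report describes. Failed commands are counted rather than treated as fatal, because a detach that races with an incomplete attach may legitimately fail on some iterations.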
  
Actual results:
The backtrace on 2.6.9-55.EL:

 [<c043ea67>] softlockup_tick+0x98/0xa6
 [<c0408b7d>] timer_interrupt+0x504/0x557
 [<c043ec9b>] handle_IRQ_event+0x27/0x51
 [<c043ed58>] __do_IRQ+0x93/0xe8
 [<c040672b>] do_IRQ+0x93/0xae
 [<c053a045>] evtchn_do_upcall+0x64/0x9b
 [<c0404ec5>] hypervisor_callback+0x3d/0x48
 [<c045007b>] __pte_alloc+0x1e1/0x21a
 [<c05f4e18>] _spin_lock+0x7/0xf
 [<c05f4179>] __mutex_lock_slowpath+0x19/0x74
 [<c04d36d0>] __next_cpu+0x12/0x21
 [<c045e2ee>] kfree+0xe/0x77
 [<c05f41e3>] .text.lock.mutex+0xf/0x14
 [<c04d3e3d>] kobject_cleanup+0x4a/0x5e
 [<c04c6467>] elevator_exit+0xe/0x32
 [<c04c931c>] blk_cleanup_queue+0x2d/0x36
 [<d1029e6d>] xlvbd_del+0x4d/0x56 [xenblk]
 [<d10293f4>] blkfront_closing+0x45/0x4f [xenblk]
 [<d1029e1b>] blkif_release+0x34/0x39 [xenblk]
 [<c0468cb4>] __blkdev_put+0x50/0x123
 [<c04950d8>] register_disk+0x117/0x155
 [<c04cc39c>] add_disk+0x2b/0x36
 [<c04cbd9c>] exact_match+0x0/0x4
 [<c04cc248>] exact_lock+0x0/0xd
 [<d1029d7f>] backend_changed+0xf3/0x153 [xenblk]
 [<c053fc58>] otherend_changed+0x74/0x79
 [<c053e442>] xenwatch_handle_callback+0x12/0x44
 [<c053ee3a>] xenwatch_thread+0xf1/0x107
 [<c042cbe1>] autoremove_wake_function+0x0/0x2d
 [<c053ed49>] xenwatch_thread+0x0/0x107
 [<c042cb15>] kthread+0xc0/0xeb
 [<c042ca55>] kthread+0x0/0xeb
 [<c04029b5>] kernel_thread_helper+0x5/0xb

-- Additional comment from ijc@hellion.org.uk on 2007-12-14 04:17 EST --
Created an attachment (id=288811)
linux-2.6.18-xen 217:c1c57fea77e9 ported to 2.6.18-53.el5


-- Additional comment from ijc@hellion.org.uk on 2007-12-14 04:18 EST --
Created an attachment (id=288821)
linux-2.6.18-xen 217:c1c57fea77e9 ported to 2.6.9-67.EL


-- Additional comment from ddutile@redhat.com on 2007-12-14 16:07 EST --
Simple fix for corner case; working on rhel5 patch & test right now.
Comment 1 Don Dutile 2007-12-17 16:35:47 EST
Posted a patch to 4.7.
Comment 2 Bill Burns 2008-01-04 14:16:44 EST
Setting dev ack for Don Dutile.
Comment 3 Vivek Goyal 2008-02-28 18:29:33 EST
Committed in 68.15.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Comment 6 errata-xmlrpc 2008-07-24 15:23:30 EDT
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2008-0665.html
