Bug 426031 - rapid block device plug / unplug leads to kernel crash and/or soft lockup
Summary: rapid block device plug / unplug leads to kernel crash and/or soft lockup
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel-xen   
Version: 4.6
Hardware: All Linux
Target Milestone: rc
Assignee: Don Dutile
QA Contact: Martin Jenner
Depends On: 308971
Reported: 2007-12-17 21:25 UTC by Don Dutile
Modified: 2008-07-24 19:23 UTC (History)

Fixed In Version: RHSA-2008-0665
Doc Type: Bug Fix
Last Closed: 2008-07-24 19:23:30 UTC


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2008:0665 normal SHIPPED_LIVE Moderate: Updated kernel packages for Red Hat Enterprise Linux 4.7 2008-07-24 16:41:06 UTC

Description Don Dutile 2007-12-17 21:25:42 UTC
+++ This bug was initially created as a clone of Bug #308971 +++

Description of problem:
Rapidly plugging and unplugging a block device eventually leads to a kernel crash.
The crash is caused by a double free when the device is unplugged before it is
fully established.

This affects RHEL5 and RHEL4u5. The fix is up at
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/c1c57fea77e9 and/or

The fix depends on 
http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/11483a00c017 and/or
http://xenbits.xensource.com/kernels/rhel4x.hg?rev/156e3eaca552 which are
attached to #247265 but only applied to RHEL5, I think.

Version-Release number of selected component (if applicable):
2.6.18-8.el5 and 2.6.9-55.EL

How reproducible:
A simple loop which attaches and detaches a block device as quickly as possible is
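
A minimal sketch of such an attach/detach loop, for illustration only. It is not the reporter's actual test. It assumes the `xm` toolstack is in use on dom0; the guest name, backend device and frontend device are hypothetical placeholders, and the device identifier form accepted by `xm block-detach` may differ between Xen versions.

```python
import subprocess
from typing import Callable, List, Sequence

# A runner takes an argv list and returns the command's exit status.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run one command on dom0 and return its exit status."""
    return subprocess.call(list(argv))


def build_cycle(domain: str, backend: str, frontend: str,
                mode: str = "w") -> List[List[str]]:
    """Return the attach and detach commands for one plug/unplug cycle.

    All argument values are supplied by the caller; nothing here comes
    from the bug report itself.
    """
    attach = ["xm", "block-attach", domain, backend, frontend, mode]
    detach = ["xm", "block-detach", domain, frontend]
    return [attach, detach]


def stress(domain: str, backend: str, frontend: str, iterations: int,
           runner: Runner = run_command) -> List[List[str]]:
    """Attach and detach the device back to back, with no delay between
    the two commands, so the detach can arrive before the frontend has
    finished setting the device up. Exit statuses are ignored on purpose:
    a failed detach should not stop the loop.
    """
    issued: List[List[str]] = []
    for _ in range(iterations):
        for argv in build_cycle(domain, backend, frontend):
            runner(argv)
            issued.append(argv)
    return issued
```

Called as, for example, `stress("guest1", "phy:/dev/vg0/scratch", "xvdb", 1000)` (all three names hypothetical), this issues 1000 plug/unplug cycles. Passing a different `runner` allows a dry run that only records the commands.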
Actual results:
The backtrace on 2.6.9-55.EL:

 [<c043ea67>] softlockup_tick+0x98/0xa6
 [<c0408b7d>] timer_interrupt+0x504/0x557
 [<c043ec9b>] handle_IRQ_event+0x27/0x51
 [<c043ed58>] __do_IRQ+0x93/0xe8
 [<c040672b>] do_IRQ+0x93/0xae
 [<c053a045>] evtchn_do_upcall+0x64/0x9b
 [<c0404ec5>] hypervisor_callback+0x3d/0x48
 [<c045007b>] __pte_alloc+0x1e1/0x21a
 [<c05f4e18>] _spin_lock+0x7/0xf
 [<c05f4179>] __mutex_lock_slowpath+0x19/0x74
 [<c04d36d0>] __next_cpu+0x12/0x21
 [<c045e2ee>] kfree+0xe/0x77
 [<c05f41e3>] .text.lock.mutex+0xf/0x14
 [<c04d3e3d>] kobject_cleanup+0x4a/0x5e
 [<c04c6467>] elevator_exit+0xe/0x32
 [<c04c931c>] blk_cleanup_queue+0x2d/0x36
 [<d1029e6d>] xlvbd_del+0x4d/0x56 [xenblk]
 [<d10293f4>] blkfront_closing+0x45/0x4f [xenblk]
 [<d1029e1b>] blkif_release+0x34/0x39 [xenblk]
 [<c0468cb4>] __blkdev_put+0x50/0x123
 [<c04950d8>] register_disk+0x117/0x155
 [<c04cc39c>] add_disk+0x2b/0x36
 [<c04cbd9c>] exact_match+0x0/0x4
 [<c04cc248>] exact_lock+0x0/0xd
 [<d1029d7f>] backend_changed+0xf3/0x153 [xenblk]
 [<c053fc58>] otherend_changed+0x74/0x79
 [<c053e442>] xenwatch_handle_callback+0x12/0x44
 [<c053ee3a>] xenwatch_thread+0xf1/0x107
 [<c042cbe1>] autoremove_wake_function+0x0/0x2d
 [<c053ed49>] xenwatch_thread+0x0/0x107
 [<c042cb15>] kthread+0xc0/0xeb
 [<c042ca55>] kthread+0x0/0xeb
 [<c04029b5>] kernel_thread_helper+0x5/0xb

-- Additional comment from ijc@hellion.org.uk on 2007-12-14 04:17 EST --
Created an attachment (id=288811)
linux-2.6.18-xen 217:c1c57fea77e9 ported to 2.6.18-53.el5

-- Additional comment from ijc@hellion.org.uk on 2007-12-14 04:18 EST --
Created an attachment (id=288821)
linux-2.6.18-xen 217:c1c57fea77e9 ported to 2.6.9-67.EL

-- Additional comment from ddutile@redhat.com on 2007-12-14 16:07 EST --
Simple fix for corner case; working on rhel5 patch & test right now.

Comment 1 Don Dutile 2007-12-17 21:35:47 UTC
Posted a patch to 4.7.

Comment 2 Bill Burns 2008-01-04 19:16:44 UTC
Setting dev ack for Don Dutile.

Comment 3 Vivek Goyal 2008-02-28 23:29:33 UTC
Committed in 68.15.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 6 errata-xmlrpc 2008-07-24 19:23:30 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

