Bug 162759
Summary: | System occasionally experienced system hangs. | |
---|---|---|---
Product: | Red Hat Enterprise Linux 4 | Reporter: | Jeff Burke <jburke>
Component: | kernel | Assignee: | Doug Ledford <dledford>
Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 4.0 | CC: | ccb
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | RHSA-2006-0132 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2006-03-07 19:16:47 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 168429 | |
Description (Jeff Burke, 2005-07-08 13:22:30 UTC)
Update from a couple of days ago: Jeff Burke gathered the sysrq-t output from another 2.6.9-16 deadlock, in which dozens of tasks were blocked in I/O transactions in mddev_lock_uninterruptible(). There is no vmcore for this one, but the *recursive* mddev_lock_uninterruptible() caller appears to be the md1_raid1 task, which first takes the semaphore in md_handle_safemode() and then, via the sync_page_io() route, calls bio_alloc(GFP_KERNEL), leading to the deadlocking mddev_lock_uninterruptible() call:

```
md1_raid1     D 0000010037e108d8     0   273      1   274   250 (L-TLB)
000001013f781588 0000000000000046 7974742f7665642f 0000001900000073
       000001000a3167f0 0000000000000073 000001000565d840 0000000300000246
       000001013f64c7f0 0000000000001eec
Call Trace:
 <ffffffff80178420>{__find_get_block_slow+255}  <ffffffff80302183>{__down+147}
 <ffffffff80132ef9>{default_wake_function+0}    <ffffffff8030375f>{__down_failed+53}
 <ffffffff8029ce28>{.text.lock.md+155}                  <= mddev_lock_uninterruptible()
 <ffffffffa00629b5>{:raid1:make_request+622}    <ffffffff8024959e>{generic_make_request+355}
 <ffffffff8013478e>{autoremove_wake_function+0} <ffffffff802496aa>{submit_bio+247}
 <ffffffff8017b224>{bio_alloc+288}              <ffffffff80179156>{submit_bh+255}
 <ffffffff8017a02d>{__block_write_full_page+440} <ffffffffa007f40a>{:ext3:ext3_get_block+0}
 <ffffffffa007db46>{:ext3:ext3_ordered_writepage+245} <ffffffff80163553>{shrink_zone+3095}
 <ffffffff801cdf24>{avc_has_perm+70}            <ffffffff80163b3d>{try_to_free_pages+303}
 <ffffffff8015c4c6>{__alloc_pages+596}          <ffffffff8015c657>{__get_free_pages+11}
 <ffffffff8015f4ec>{kmem_getpages+36}           <ffffffff8015fc81>{cache_alloc_refill+609}
 <ffffffff8015f9bf>{kmem_cache_alloc+90}        <ffffffff8015b049>{mempool_alloc+186}
 <ffffffff8013478e>{autoremove_wake_function+0} <ffffffff80134797>{autoremove_wake_function+9}
 <ffffffff8013478e>{autoremove_wake_function+0} <ffffffff8017b120>{bio_alloc+28}
 <ffffffff80297a89>{sync_page_io+48}            <ffffffff802991b0>{md_update_sb+263}
 <ffffffff80131cd9>{finish_task_switch+55}      <ffffffffa0062ffc>{:raid1:raid1d+0}
 <ffffffff8029c237>{md_handle_safemode+244}             <= mddev_lock_uninterruptible()
 <ffffffffa0063022>{:raid1:raid1d+38}           <ffffffff80302ee8>{thread_return+110}
 <ffffffffa0062ffc>{:raid1:raid1d+0}            <ffffffff80299c91>{md_thread+392}
 <ffffffff8013478e>{autoremove_wake_function+0} <ffffffff80131cd9>{finish_task_switch+55}
 <ffffffff8013478e>{autoremove_wake_function+0} <ffffffff80131d28>{schedule_tail+11}
 <ffffffff80110ca3>{child_rip+8}                <ffffffffa0062ffc>{:raid1:raid1d+0}
 <ffffffff80299b09>{md_thread+0}                <ffffffff80110c9b>{child_rip+0}
```

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0132.html
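As background for the trace above, the following is a minimal sketch of the lock-recursion pattern it shows. It is not the RHEL 4 md/raid1 source: mddev_sem and both helper functions are illustrative names, and the calls used are the 2.6.9-era semaphore and bio APIs. The pattern is that a GFP_KERNEL allocation made while the mddev semaphore is held can enter direct reclaim, reclaim writes dirty pages back through the same RAID1 array, and that write path blocks trying to take the semaphore a second time. The conventional way to avoid this class of self-deadlock is to allocate with GFP_NOIO (or from a pre-filled mempool) on paths that already hold I/O locks; whether that is exactly what the erratum changed is not stated in this report.

```c
/*
 * Illustrative only -- not the actual md/raid1 code.  Shows how a
 * GFP_KERNEL allocation made under the mddev semaphore can deadlock
 * against the array's own write path via direct reclaim.
 */
#include <asm/semaphore.h>      /* 2.6.9-era location; later kernels use <linux/semaphore.h> */
#include <linux/bio.h>
#include <linux/gfp.h>

static DECLARE_MUTEX(mddev_sem);        /* stand-in for the per-mddev lock */

/* Every write submitted to the array goes through here. */
static void array_make_request(void)
{
	down(&mddev_sem);               /* mddev_lock_uninterruptible() in the trace */
	/* ... map and queue the write ... */
	up(&mddev_sem);
}

/* Superblock update path, entered with the same lock already held. */
static void array_update_sb(void)
{
	struct bio *bio;

	down(&mddev_sem);               /* taken via md_handle_safemode() in the trace */

	/*
	 * BUG: GFP_KERNEL lets the allocator start I/O.  Under memory
	 * pressure it goes try_to_free_pages() -> shrink_zone(), writes a
	 * dirty ext3 page back through this same array, lands in
	 * array_make_request(), and sleeps forever on mddev_sem.
	 * GFP_NOIO would forbid the allocator from issuing I/O here.
	 */
	bio = bio_alloc(GFP_KERNEL, 1);

	/* ... sync_page_io() writes the superblock ... */

	bio_put(bio);
	up(&mddev_sem);
}
```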