Bug 185447 - kernel dm: flush queued bios if suspend is interrupted
Summary: kernel dm: flush queued bios if suspend is interrupted
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel   
Version: 4.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Alasdair Kergon
QA Contact: Brian Brock
URL:
Whiteboard:
Keywords:
Depends On:
Blocks: 181409
 
Reported: 2006-03-14 21:27 UTC by Alasdair Kergon
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-08-10 22:41:21 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Regression test for this bug (2.54 KB, application/x-shellscript)
2006-03-20 20:26 UTC, Jun'ichi NOMURA


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2006:0575
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 4
Last Updated: 2006-08-10 04:00:00 UTC

Description Alasdair Kergon 2006-03-14 21:27:20 UTC
If dm_suspend() is cancelled, bios already added
to the deferred list need to be submitted.
Otherwise they remain 'in limbo' until there's a dm_resume().
 
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-Off-By: Alasdair G Kergon <agk@redhat.com>
 
Index: linux-2.6.16-rc5/drivers/md/dm.c
===================================================================
--- linux-2.6.16-rc5.orig/drivers/md/dm.c       2006-03-12 21:56:04.000000000 +0000
+++ linux-2.6.16-rc5/drivers/md/dm.c    2006-03-12 21:58:05.000000000 +0000
@@ -1093,6 +1093,7 @@ int dm_suspend(struct mapped_device *md,
 {
        struct dm_table *map = NULL;
        DECLARE_WAITQUEUE(wait, current);
+       struct bio *def;
        int r = -EINVAL;
  
        down(&md->suspend_lock);
@@ -1152,9 +1153,11 @@ int dm_suspend(struct mapped_device *md,
        /* were we interrupted ? */
        r = -EINTR;
        if (atomic_read(&md->pending)) {
+               clear_bit(DMF_BLOCK_IO, &md->flags);
+               def = bio_list_get(&md->deferred);
+               __flush_deferred_io(md, def);
                up_write(&md->io_lock);
                unlock_fs(md);
-               clear_bit(DMF_BLOCK_IO, &md->flags);
                goto out;
        }
        up_write(&md->io_lock);
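
The effect of the change above can be modelled in user space. The following is a minimal sketch only, not kernel code: struct toy_bio, struct toy_dev and every toy_* function are hypothetical stand-ins invented for illustration, and locking is left out entirely. It models the outcome described in this report: on an interrupted suspend the blocking flag is cleared, and with the fix the deferred list is detached and resubmitted instead of being left queued until a later resume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued I/O request (not the kernel's struct bio). */
struct toy_bio {
	int id;
	bool submitted;
	struct toy_bio *next;
};

/* Hypothetical stand-in for the mapped device state touched by the patch. */
struct toy_dev {
	bool block_io;                /* plays the role of DMF_BLOCK_IO */
	struct toy_bio *head, *tail;  /* plays the role of md->deferred */
};

/* Submit at once when I/O is not blocked; otherwise append to the deferred list. */
static void toy_submit(struct toy_dev *dev, struct toy_bio *bio)
{
	bio->next = NULL;
	if (!dev->block_io) {
		bio->submitted = true;
		return;
	}
	if (dev->tail)
		dev->tail->next = bio;
	else
		dev->head = bio;
	dev->tail = bio;
}

/* Detach the whole deferred list and leave it empty. */
static struct toy_bio *toy_deferred_get(struct toy_dev *dev)
{
	struct toy_bio *list = dev->head;

	dev->head = dev->tail = NULL;
	return list;
}

/* Resubmit every bio on a detached list; returns how many were flushed. */
static int toy_flush(struct toy_dev *dev, struct toy_bio *list)
{
	int flushed = 0;

	/* The flag must already be clear, or the bios would be re-queued. */
	assert(!dev->block_io);
	while (list) {
		struct toy_bio *next = list->next;

		toy_submit(dev, list);
		flushed++;
		list = next;
	}
	return flushed;
}

/*
 * Model of the "were we interrupted?" branch.
 * with_fix == false: the flag is cleared but queued bios stay in limbo.
 * with_fix == true:  the flag is cleared, then the deferred list is flushed.
 */
static int toy_suspend_interrupted(struct toy_dev *dev, bool with_fix)
{
	dev->block_io = false;
	if (!with_fix)
		return 0;
	return toy_flush(dev, toy_deferred_get(dev));
}
```

Queueing two bios while block_io is set and then calling toy_suspend_interrupted() with with_fix set to false leaves both unsubmitted on the list, which corresponds to the stall described in this report. With with_fix set to true, both are submitted and the list ends up empty.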

Comment 2 Jun'ichi NOMURA 2006-03-20 20:26:38 UTC
Created attachment 126361
Regression test for this bug

This test checks for the bug.
It prints "PASS" if the bug is fixed,
and "FAIL" on RHEL4 U3 (2.6.9-34.EL).

I ran the test on 2.6.9-34.EL with the proposed patch applied
and confirmed that the bug is fixed.

Comment 3 Jun'ichi NOMURA 2006-03-20 20:30:14 UTC
If someone hits this bug, any process doing I/O on the map will stall.
The crash utility will show a backtrace like the one below:

crash> bt 6694
PID: 6694   TASK: e00000011ecb8000  CPU: 7   COMMAND: "dd"
 #0 [BSP:e00000011ecb93a0] context_switch at a000000100068500
 #1 [BSP:e00000011ecb9288] schedule at a000000100586fc0
 #2 [BSP:e00000011ecb9268] io_schedule at a000000100589830
 #3 [BSP:e00000011ecb9238] __wait_on_buffer at a000000100123010
 #4 [BSP:e00000011ecb91a8] __block_prepare_write at a00000010012b6f0
 #5 [BSP:e00000011ecb9170] block_prepare_write at a00000010012be40
 #6 [BSP:e00000011ecb9138] blkdev_prepare_write at a000000100134810
 #7 [BSP:e00000011ecb9060] generic_file_buffered_write at a0000001000d3520
 #8 [BSP:e00000011ecb8ff0] __generic_file_aio_write_nolock at a0000001000d4500
 #9 [BSP:e00000011ecb8fa0] generic_file_aio_write_nolock at a0000001000d4de0
#10 [BSP:e00000011ecb8f68] generic_file_write_nolock at a0000001000d5220
#11 [BSP:e00000011ecb8f30] blkdev_file_write at a000000100136f50
#12 [BSP:e00000011ecb8ee0] vfs_write at a000000100120090
#13 [BSP:e00000011ecb8e68] sys_write at a0000001001202b0
#14 [BSP:e00000011ecb8e68] ia64_ret_from_syscall at a00000010000f3e0

The workaround is to suspend the problematic map again and then resume it.


Comment 4 Jason Baron 2006-03-28 17:51:59 UTC
Committed in stream U4, build 34.9. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 8 Red Hat Bugzilla 2006-08-10 22:41:21 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html


