Bug 187249 - [RHEL4 U3] dm-mirror: read stalls if all mirrors failed
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: Alasdair Kergon
Depends On:
Blocks: 181409 186476
Reported: 2006-03-29 10:28 EST by Jun'ichi NOMURA
Modified: 2007-11-30 17:07 EST (History)

Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Last Closed: 2006-08-10 18:58:12 EDT

Attachments
Testcase: Try to read from failed mirror device (578 bytes, application/x-shellscript)
2006-03-29 10:36 EST, Jun'ichi NOMURA
pass correct size to bio_endio (746 bytes, patch)
2006-03-29 10:42 EST, Jun'ichi NOMURA
Fix error message (729 bytes, patch)
2006-03-29 11:38 EST, Jun'ichi NOMURA

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2006:0575 normal SHIPPED_LIVE Important: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 4 2006-08-10 00:00:00 EDT

Description Jun'ichi NOMURA 2006-03-29 10:28:20 EST
Description of problem:
  [RHEL4 U3] dm-mirror: read stalls if all mirrors failed

Version-Release number of selected component:

How reproducible:

Steps to Reproduce:
  1. Create a dm-mirror device
  2. Fail all underlying devices
  3. Read from the device

Actual results:
  Read will stall.

Expected results:
  Read should fail.

Hardware info:
  No dependency on hardware.

Additional Info:
  This bug may be one of the causes of BZ#185751.

  The stall is caused by calling bio_endio with a size of 0.
  Fixing this problem reveals another bug.
  The attached set of patches will fix the problem.
Comment 1 Jun'ichi NOMURA 2006-03-29 10:36:54 EST
Created attachment 126995
Testcase: Try to read from failed mirror device

# sh mirror-read-fail-test.sh
0 256 mirror core 1 16 2  /dev/mapper/err1 0 /dev/mapper/err2 0
Read from failed mirror.
If you don't see 'PASS', the test fails.

<If the bug is not fixed, we'll stop here>

dd: reading `/dev/mapper/error-mirror': Input/output error
0+0 records in
0+0 records out
Comment 2 Jun'ichi NOMURA 2006-03-29 10:39:49 EST
Typical backtrace of the stalled process:

crash> bt 22493
PID: 22493  TASK: 101aeda57f0       CPU: 1   COMMAND: "dd"
 #0 [10164b65af8] schedule at ffffffff80304a85
 #1 [10164b65bd0] io_schedule at ffffffff803053ef
 #2 [10164b65bf0] __lock_page at ffffffff80159215
 #3 [10164b65c70] find_get_page at ffffffff8015929c
 #4 [10164b65c90] do_generic_mapping_read at ffffffff80159771
 #5 [10164b65d90] __generic_file_aio_read at ffffffff8015b53c
 #6 [10164b65e10] generic_file_read at ffffffff8015b6d7
 #7 [10164b65f10] vfs_read at ffffffff80177a83
 #8 [10164b65f40] sys_read at ffffffff80177cda
 #9 [10164b65f80] system_call at ffffffff801101c6
Comment 3 Jun'ichi NOMURA 2006-03-29 10:42:05 EST
Created attachment 126996
pass correct size to bio_endio

Complete the failed bio with the correct size.
Otherwise, reads to the all-failed mirror never complete.
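
For illustration only, here is a small user-space model of why the size
passed to bio_endio matters. It is not the attached patch and not kernel
code. It assumes the 2.6.9-era behaviour in which bio_endio(bio,
bytes_done, error) subtracts bytes_done from bi_size and the read
completion handler returns early while bi_size is still nonzero, leaving
the page locked. All model_* names and reader_stalls are hypothetical.

```c
#include <assert.h>

#define MODEL_EIO 5 /* stand-in for the kernel's EIO */

/* Hypothetical, heavily simplified stand-in for struct bio. */
struct model_bio {
    unsigned int bi_size; /* bytes still outstanding */
    int page_locked;      /* 1 while a reader would sleep in __lock_page */
    int io_error;         /* error seen by the reader once woken */
    int (*bi_end_io)(struct model_bio *, unsigned int, int);
};

/* Read completion handler: treats a nonzero bi_size as "not done yet"
 * and returns without unlocking the page. */
int model_read_end_io(struct model_bio *bio, unsigned int bytes_done,
                      int error)
{
    (void)bytes_done;
    if (bio->bi_size)
        return 1;         /* partial completion: reader keeps waiting */
    bio->io_error = error;
    bio->page_locked = 0; /* reader wakes up and sees the error */
    return 0;
}

/* Model of the three-argument bio_endio: account for the completed
 * bytes, then call the completion handler. */
void model_bio_endio(struct model_bio *bio, unsigned int bytes_done,
                     int error)
{
    if (bytes_done > bio->bi_size)
        bytes_done = bio->bi_size;
    bio->bi_size -= bytes_done;
    if (bio->bi_end_io)
        bio->bi_end_io(bio, bytes_done, error);
}

/* Fail a 4096-byte read, passing bytes_passed as the completed size.
 * Returns 1 if the reader would still be blocked afterwards. */
int reader_stalls(unsigned int bytes_passed)
{
    struct model_bio bio = { 4096, 1, 0, model_read_end_io };

    model_bio_endio(&bio, bytes_passed, -MODEL_EIO);
    assert(bio.page_locked || bio.io_error == -MODEL_EIO);
    return bio.page_locked;
}
```

In this model, reader_stalls(0) reports a stall, which corresponds to the
dd process sleeping in __lock_page, while reader_stalls(4096) lets the
reader wake up and receive the I/O error.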
Comment 4 Jun'ichi NOMURA 2006-03-29 11:38:43 EST
Created attachment 127002
Fix error message

When the failed bio completes, we'll see the following
kernel message:
  Out of memory causing inability to retry read.

The patch tries to fix the message.
Comment 5 Jonathan Earl Brassow 2006-04-01 12:36:52 EST

The attachment in comment #1 should be added to the mirror test suite.
Comment 6 Jason Baron 2006-04-28 13:29:09 EDT
Committed in stream U4 build 34.26. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/
Comment 10 Red Hat Bugzilla 2006-08-10 18:58:16 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

