Bug 187249 - [RHEL4 U3] dm-mirror: read stalls if all mirrors failed
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Alasdair Kergon
Depends On:
Blocks: 181409 186476
Reported: 2006-03-29 10:28 EST by Jun'ichi NOMURA
Modified: 2007-11-30 17:07 EST

Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Last Closed: 2006-08-10 18:58:12 EDT


Attachments
Testcase: Try to read from failed mirror device (578 bytes, application/x-shellscript)
2006-03-29 10:36 EST, Jun'ichi NOMURA
pass correct size to bio_endio (746 bytes, patch)
2006-03-29 10:42 EST, Jun'ichi NOMURA
Fix error message (729 bytes, patch)
2006-03-29 11:38 EST, Jun'ichi NOMURA

Description Jun'ichi NOMURA 2006-03-29 10:28:20 EST
Description of problem:
  [RHEL4 U3] dm-mirror: read stalls if all mirrors failed

Version-Release number of selected component:
  kernel-2.6.9-34.EL

How reproducible:
  Always

Steps to Reproduce:
  1. Create a dm-mirror device
  2. Fail all underlying devices
  3. Read from the device

Actual results:
  Read will stall.

Expected results:
  Read should fail.

Hardware info:
  No hardware dependency.

Additional Info:
  This bug may be one of the causes of BZ#185751.

  The stall is caused by calling bio_endio with a size of 0.
  Fixing this problem reveals another bug.
  The attached set of patches fixes both problems.
Comment 1 Jun'ichi NOMURA 2006-03-29 10:36:54 EST
Created attachment 126995
Testcase: Try to read from failed mirror device

# sh mirror-read-fail-test.sh
0 256 mirror core 1 16 2  /dev/mapper/err1 0 /dev/mapper/err2 0
Read from failed mirror.
If you don't see 'PASS', the test fails.

<If the bug is not fixed, the read stalls here>

dd: reading `/dev/mapper/error-mirror': Input/output error
0+0 records in
0+0 records out
PASS
Comment 2 Jun'ichi NOMURA 2006-03-29 10:39:49 EST
Typical backtrace of the stalled process:

crash> bt 22493
PID: 22493  TASK: 101aeda57f0       CPU: 1   COMMAND: "dd"
 #0 [10164b65af8] schedule at ffffffff80304a85
 #1 [10164b65bd0] io_schedule at ffffffff803053ef
 #2 [10164b65bf0] __lock_page at ffffffff80159215
 #3 [10164b65c70] find_get_page at ffffffff8015929c
 #4 [10164b65c90] do_generic_mapping_read at ffffffff80159771
 #5 [10164b65d90] __generic_file_aio_read at ffffffff8015b53c
 #6 [10164b65e10] generic_file_read at ffffffff8015b6d7
 #7 [10164b65f10] vfs_read at ffffffff80177a83
 #8 [10164b65f40] sys_read at ffffffff80177cda
 #9 [10164b65f80] system_call at ffffffff801101c6
Comment 3 Jun'ichi NOMURA 2006-03-29 10:42:05 EST
Created attachment 126996
pass correct size to bio_endio

Complete the failed bio with correct size.
Otherwise, reads to the all-failed mirror never complete.
Comment 4 Jun'ichi NOMURA 2006-03-29 11:38:43 EST
Created attachment 127002
Fix error message

When the failed bio completes, we'll see the following
kernel message:
  Out of memory causing inability to retry read.

The message is misleading in this case, so the patch corrects it.
Comment 5 Jonathan Earl Brassow 2006-04-01 12:36:52 EST
Corey,

The attachment in comment #1 should be added to the mirror test suite.
Comment 6 Jason Baron 2006-04-28 13:29:09 EDT
Committed in stream U4, build 34.26. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/
Comment 10 Red Hat Bugzilla 2006-08-10 18:58:16 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html
