Bug 185754 - [RHEL4 U3] kernel dm mirror: unrelated mirror devices stall if any log device fails
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Alasdair Kergon
Blocks: 181409 186476
Reported: 2006-03-17 11:52 EST by Kiyoshi Ueda
Modified: 2013-04-02 19:51 EDT
Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Last Closed: 2006-08-10 18:45:58 EDT


Description Kiyoshi Ueda 2006-03-17 11:52:36 EST
Description of problem:
If a log device fails, *ALL* mirror devices stall.
("ALL" includes other mirror devices which don't use the failed
 log device.)


Version-Release number of selected component:
kernel-2.6.9-34.EL


How reproducible:
Always


Steps to Reproduce:
 1. Prepare some PVs (more than 5) and create 2 VGs from them.
    Example)
      - /dev/sda, /dev/sdb, /dev/sdc, /dev/sdd, /dev/sde, /dev/sdf as PVs
      - vg0 contains 3 PVs, /dev/sda, /dev/sdb, /dev/sdc
      - vg1 contains 3 PVs, /dev/sdd, /dev/sde, /dev/sdf
 2. Create a mirror LV on each VG and activate it.
      # lvcreate -L 12M -n lv0 -m 1 vg0
      # lvcreate -L 12M -n lv1 -m 1 vg1
 3. Issue I/Os to the mirror LVs continuously.
      # while true; do
      > dd if=/dev/zero of=/dev/mapper/vg0-lv0 bs=512 count=1 >& /dev/null
      > dd if=/dev/zero of=/dev/mapper/vg1-lv1 bs=512 count=1 >& /dev/null
      > done
 4. Disconnect one of the PVs used as the log device of one of the mirror LVs.
    Example) If /dev/sdc is used as the log device of vg0-lv0:
      # echo offline > /sys/block/sdc/device/state
 5. Check whether I/Os to vg1-lv1 are still processed.
      # iostat 1
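
For step 4, the PV holding the mirror log can be looked up instead of
assumed. The following is only a minimal sketch: it assumes the log sub-LV
is named <lv>_mlog and that "lvs -a" prints hidden LVs in square brackets.
The helper name mlog_pv is hypothetical and not part of LVM.

```shell
# Hypothetical helper: print the PV backing the mirror log of LV "$1".
# Reads the output of `lvs --noheadings -a -o lv_name,devices <vg>` on stdin.
# Assumes the log sub-LV appears as "[<lv>_mlog]" with a single device.
mlog_pv() {
    awk -v want="[${1}_mlog]" '
        $1 == want {
            sub(/\(.*/, "", $2)   # strip the "(extent)" suffix, e.g. "(0)"
            print $2
        }
    '
}
```

Usage with the names from the example above:
lvs --noheadings -a -o lv_name,devices vg0 | mlog_pv lv0
The printed device (for instance /dev/sdc) is the one to set offline.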


Actual results:
I/Os to vg1-lv1 are not processed.


Expected results:
I/Os to vg1-lv1 are processed, because all PVs used by vg1-lv1
are fine.


Additional info:
This problem seems to be in kmirrord.
kmirrord blocks in disk_flush() if the log update fails.
A backtrace of kmirrord is shown below.

-----------------------------------------------------------------------
crash> bt 2115
PID: 2115   TASK: 101aff8a030       CPU: 3   COMMAND: "kmirrord"
 #0 [101ac01bb58] schedule at ffffffff80304a85
 #1 [101ac01bc30] wait_for_completion at ffffffff80304cbd
 #2 [101ac01bc90] dm_table_event at ffffffffa00ea343
 #3 [101ac01bcb0] disk_flush at ffffffffa01019ce
 #4 [101ac01bcd0] do_work at ffffffffa0102ce5
 #5 [101ac01bd10] move_tasks at ffffffff8013257f
 #6 [101ac01bda0] thread_return at ffffffff80304add
 #7 [101ac01be70] worker_thread at ffffffff80146e1e
 #8 [101ac01bf20] kthread at ffffffff8014aa93
 #9 [101ac01bf50] kernel_thread at ffffffff80110e17
crash>
-----------------------------------------------------------------------
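
The stall of unrelated mirrors is what one would expect if a single
kmirrord worker services every mirror set in turn; that is an assumption
drawn from the backtrace above, not something confirmed in this report.
The toy model below is not kernel code. The function name service_mirrors
and the "name:state" argument format are made up for illustration.

```shell
# Toy model (assumption): one worker walks all mirror sets sequentially.
# Each argument is "name:log_state", where log_state is "ok" or "failed".
service_mirrors() {
    for entry in "$@"; do
        name=${entry%%:*}
        log_state=${entry#*:}
        if [ "$log_state" = failed ]; then
            # Models the worker waiting forever in the log flush:
            # it never reaches the mirror sets that follow.
            echo "blocked on $name"
            return 1
        fi
        echo "served $name"
    done
}
```

Calling it as "service_mirrors vg0-lv0:failed vg1-lv1:ok" reports the block
on vg0-lv0 and never serves vg1-lv1, even though the vg1-lv1 log is fine.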
Comment 1 Kiyoshi Ueda 2006-03-23 16:11:17 EST
Additional info:
I'd like to point out that this is a kernel issue, not a dmeventd issue.
To reproduce the kernel issue, the following setting is needed
before Step 1 of the reproduction steps.

  0. Modify /etc/lvm/lvm.conf as below so that dmeventd is not launched.
        dmeventd {
            mirror_library = "none"
        }

If this step isn't done, dmeventd may handle the log device failure.
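
Before running the reproduction, it may help to confirm the setting is
really in place. The following is a rough sketch only: the helper name
mirror_library_setting is hypothetical, and it assumes the dmeventd
section is laid out as in the fragment above, with one setting per line.

```shell
# Hypothetical helper: print the mirror_library value found inside the
# dmeventd { ... } section of an lvm.conf read from stdin.
# Commented-out lines (starting with "#") are ignored.
mirror_library_setting() {
    awk '
        /^[ \t]*dmeventd[ \t]*[{]/ { in_sec = 1; next }
        in_sec && /[}]/            { in_sec = 0 }
        in_sec && /^[ \t]*mirror_library[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")   # drop everything up to the "="
            gsub(/"/, "")              # drop the surrounding quotes
            print
        }
    '
}
```

Usage: mirror_library_setting < /etc/lvm/lvm.conf
With the step 0 setting applied, this should print "none".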
Comment 2 Jonathan Earl Brassow 2006-03-23 16:57:36 EST
Without the changes I've been working on, log failures are not handled by the userspace code.
Comment 5 Jason Baron 2006-05-09 13:07:33 EDT
Committed in stream U4 build 34.26. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/
Comment 8 Red Hat Bugzilla 2006-08-10 18:45:58 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html
