Bug 555197 - dm-raid1: fix data lost at mirror log failure
Summary: dm-raid1: fix data lost at mirror log failure
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.5
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Mikuláš Patočka
QA Contact: Gris Ge
URL:
Whiteboard:
Depends On:
Blocks: 557937 640580
 
Reported: 2010-01-14 00:03 UTC by Takahiro Yasui
Modified: 2014-07-25 05:08 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 557937 (view as bug list)
Environment:
Last Closed: 2011-01-13 20:59:31 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch for 2.6.18-182.el5 kernel (1.18 KB, text/plain)
2010-01-14 17:11 UTC, Takahiro Yasui


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2011:0017 | 0 | normal | SHIPPED_LIVE | Important: Red Hat Enterprise Linux 5.6 kernel security and bug fix update | 2011-01-13 10:37:42 UTC

Description Takahiro Yasui 2010-01-14 00:03:59 UTC
Description of problem:
  dm-raid1: fix data lost at mirror log failure

Version-Release number of selected component (if applicable):
  2.6.18-182.el5

How reproducible:
  See the following steps.

Steps to Reproduce:
  1. Create a two-way mirror without the "block_on_error" option
    # dmsetup table
    vg00-lv00_mimage_1: 0 24576 linear 8:48 384
    vg00-lv00_mimage_0: 0 24576 linear 8:32 384
    vg00-lv00_mlog: 0 8192 linear 8:64 384
    vg00-lv00: 0 24576 mirror disk 2 253:0 1024 2 253:1 0 253:2 0

  2. Disable the device assigned to the mirror log
    # echo offline > /sys/block/<dev>/device/state

  3. Write I/O to the mirror device
    # dd if=/dev/zero of=/dev/mapper/vg00-lv00 bs=4096 count=1 oflag=sync
    1+0 records in
    1+0 records out
    4096 bytes (4.1 kB) copied, 0.000557289 seconds, 7.3 MB/s
    *** Write I/O successfully finished ***

  4. Check status of the mirror device
    # dmsetup status
    vg00-lv00_mimage_1: 0 24576 linear
    vg00-lv00_mimage_0: 0 24576 linear
    vg00-lv00_mlog: 0 8192 linear
    vg00-lv00: 0 24576 mirror 2 253:1 253:2 24/24 1 AA 3 disk 253:0 D
    *** mirror log is marked as "D" ***

Actual results:
  A write I/O finishes successfully.

Expected results:
  A write I/O is blocked and doesn't return when the mirror's log device
  is marked as failed (i.e. the dmsetup status command shows the "D" state
  for the log device).

Additional info:
  This issue was reported on dm-devel.
  https://www.redhat.com/archives/dm-devel/2009-December/msg00211.html
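
The "D" flag from step 4 can also be checked mechanically. The following is a minimal Python sketch and not part of the original report. The field order is an assumption inferred from the sample status lines above, and the function name parse_mirror_status is hypothetical.

```python
def parse_mirror_status(line):
    """Parse one `dmsetup status` line; return None for non-mirror targets.

    Assumption: the field order is inferred from the sample lines in this
    report (legs, sync ratio, leg health, then log arguments).
    """
    name, _, rest = line.partition(":")
    fields = rest.split()
    if len(fields) < 4 or fields[2] != "mirror":
        return None
    nr_legs = int(fields[3])
    pos = 4 + nr_legs                      # index of the "24/24" sync field
    log_fields = fields[pos + 3:]          # e.g. ["3", "disk", "253:0", "D"]
    is_disk_log = len(log_fields) >= 4 and log_fields[1] == "disk"
    return {
        "name": name.strip(),
        "legs": fields[4:pos],
        "sync": fields[pos],
        "leg_health": fields[pos + 2],     # one character per leg, e.g. "AA"
        "log_health": log_fields[3] if is_disk_log else None,
    }


status = parse_mirror_status(
    "vg00-lv00: 0 24576 mirror 2 253:1 253:2 24/24 1 AA 3 disk 253:0 D"
)
print(status["log_health"])
```

A log_health value of "D" corresponds to the failed log device shown in step 4. Linear targets such as vg00-lv00_mlog return None.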

Comment 1 Takahiro Yasui 2010-01-14 00:15:29 UTC
(In reply to comment #0)
>   3. Write I/O to the mirror device
>     # dd if=/dev/zero of=/dev/mapper/vg00-lv00 bs=4096 count=1 oflag=sync
>     1+0 records in
>     1+0 records out
>     4096 bytes (4.1 kB) copied, 0.000557289 seconds, 7.3 MB/s
>     *** Write I/O successfully finished ***

In reproduction step 3, no I/O is sent to the mirror legs, but the dd command finishes successfully. This causes data loss. The code sequence is:

do_mirror()
  do_writes()
    * bios are put into ms->failures when ms->log_failure is set.
  do_failures()
    * Bios in ms->failures are processed by bio_endio(bio, bio->bi_size, 0).
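
The sequence above can be illustrated with a toy Python model. This is only a sketch of the control flow as described in this comment, not kernel code: the class ToyMirror, its submit() helper, and the dictionary-based legs are all hypothetical, and real bios complete asynchronously.

```python
from collections import deque


class ToyMirror:
    """Toy model of the write path described above (not kernel code)."""

    def __init__(self, nr_legs=2):
        self.legs = [dict() for _ in range(nr_legs)]  # sector -> data
        self.log_failure = False
        self.writes = deque()
        self.failures = deque()
        self.completed = []        # (sector, error) pairs, 0 means success

    def submit(self, sector, data):
        self.writes.append((sector, data))

    def do_writes(self):
        while self.writes:
            sector, data = self.writes.popleft()
            if self.log_failure:
                # The bio is diverted and never reaches the legs.
                self.failures.append((sector, data))
                continue
            for leg in self.legs:
                leg[sector] = data
            self.completed.append((sector, 0))

    def do_failures(self):
        while self.failures:
            sector, _data = self.failures.popleft()
            # Mirrors bio_endio(bio, bio->bi_size, 0): success is reported.
            self.completed.append((sector, 0))

    def do_mirror(self):
        self.do_writes()
        self.do_failures()


mirror = ToyMirror()
mirror.log_failure = True
mirror.submit(0, b"payload")
mirror.do_mirror()
print(mirror.completed, mirror.legs)
```

In this model the write is reported as successful while both legs stay empty, which is the data loss described above.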

Comment 2 Takahiro Yasui 2010-01-14 17:11:50 UTC
Created attachment 383725
Patch for 2.6.18-182.el5 kernel

Comment 4 Mikuláš Patočka 2010-01-14 18:00:46 UTC
Taka: I think we shouldn't hold bios when "block_on_error" isn't specified. "block_on_error" means that dmeventd isn't running and holding any bios in this case could deadlock the whole system. I'd simply pass the write to both legs if the log failed...
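
The policy suggested in this comment can be written as a small decision function. This is a hedged Python sketch only: the function name and the returned action strings are hypothetical, and holding bios when "block_on_error" is set is an assumption read from the comment rather than a statement about the final patch.

```python
def choose_write_action(log_failed, block_on_error):
    """Hypothetical policy for a single write bio, per the suggestion above."""
    if not log_failed:
        return "write_to_legs"   # normal path
    if block_on_error:
        return "hold"            # assumes userspace will repair and release
    return "write_to_legs"       # nothing would release a held bio


for log_failed in (False, True):
    for block_on_error in (False, True):
        print(log_failed, block_on_error,
              choose_write_action(log_failed, block_on_error))
```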

Comment 5 Takahiro Yasui 2010-01-14 18:35:16 UTC
The RHEL5.4 kernel keeps bios in ms->failures while ms->log_failure == 1, and the current implementation never resets ms->log_failure. Therefore, the behavior is the same as in RHEL5.4. If this behavior is not correct, then the behavior of RHEL5.4 isn't correct, either.

Comment 6 Takahiro Yasui 2010-01-14 22:53:53 UTC
Correction: the RHEL5.4 kernel can reset ms->log_failure.

do_writes()
        ms->log_failure = rh_flush(&ms->rh);

So the 2.6.18-182.el5 kernel behaves differently from RHEL5.4. In RHEL5.4, bios in the failures list can still be processed if ms->log_failure is reset.

On the other hand, the 2.6.18-182.el5 kernel doesn't reset ms->log_failure once it is set. In this case, should bios be completed with -EIO if ms->log_failure is set?

Comment 8 Ludek Smid 2010-03-11 12:18:27 UTC
Since it is too late to address this issue in RHEL 5.5, it has been proposed for RHEL 5.6.  Contact your support representative if you need to escalate this issue.

Comment 11 Mikuláš Patočka 2010-07-13 11:16:46 UTC
Yes, it needs backporting. The appropriate commit is 5528d17de1cf1462f285c40ccaf8e0d0e4c64dc0 in 2.6.33.

Comment 13 Jarod Wilson 2010-09-21 20:59:20 UTC
in kernel-2.6.18-223.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 15 Gris Ge 2010-12-06 07:57:48 UTC
This is the dmsetup setup and status when the log device is offline:
===================================================
[root@VM1-RHEL5-Dev ~]# dmsetup status
test_mirror: 0 24576 mirror 2 252:3 252:4 24/24 1 AA 3 disk 252:2 D
mimage1: 0 24576 linear
mimage0: 0 24576 linear
mlog: 0 24576 linear
[root@VM1-RHEL5-Dev ~]# dmsetup table
test_mirror: 0 24576 mirror disk 2 252:2 1024 2 252:3 0 252:4 0
mimage1: 0 24576 linear 3:64 384
mimage0: 0 24576 linear 3:0 384
mlog: 0 24576 linear 8:0 384
===================================================


On BOTH kernel 2.6.18-194.el5 and 2.6.18-233.el5, the dd command returns success without any error.
I only got a kernel error:
sd 0:0:0:1: rejecting I/O to offline device


Mikulas,
Can you check the patch? It seems it doesn't fix the issue.

Comment 16 Milan Broz 2010-12-06 09:06:49 UTC
Do you run dd on the mirror device with the "oflag=sync" flag?
Is dmeventd disabled there for the test (so that lvm will not try to recover)?

Comment 17 Gris Ge 2010-12-09 03:48:59 UTC
Milan,

The dd command is run with the "oflag=sync" flag.
mimage0, mimage1 and mlog are not LVs. Each is just a linear mapping onto a disk,
like this:
0 24576 linear /dev/sda 384

I ran dmeventd with the '-ddd' option for debugging, but no error other than "rejecting I/O" appeared in /var/log/messages.

Do I need to create dm-raid0 on LV?

Comment 18 Barry Donahue 2010-12-15 16:45:25 UTC
Do we have the info needed to retest this? RE: comment 17.

Comment 19 Jonathan Earl Brassow 2010-12-15 17:17:36 UTC
Steps to verify bug fix:

Step 1: Create the unsupported setup
       [Note: You can't create the unsupported setup by any
        means available through LVM - you must do it by hand.]
~> echo "0 1024 mirror disk 2 <devA> 1024 2 <devB> 0 <devC> 0" | \
        dmsetup create mirror

Step 2: Clear mirror device
~> dd if=/dev/zero of=/dev/mapper/mirror

Step 3: Disable log device
~> echo offline > /sys/block/<devA>/device/state

Step 4: Write to mirror
~> dd if=/dev/urandom of=/dev/mapper/mirror

Step 5: Verify the write took place
## Check contents of /dev/mapper/mirror: non-zero means success
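
The final verification step can be scripted. The following Python sketch is an illustration only and was not part of the original instructions. The function names are hypothetical, and the device path in the usage note is the one created in Step 1.

```python
def stream_has_nonzero(stream, chunk_size=1 << 20):
    """Return True if any byte read from the binary stream is non-zero."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return False          # reached the end, everything was zero
        if any(chunk):            # iterating bytes yields ints; 0 is falsy
            return True


def device_has_nonzero(path):
    """Open a file or block device read-only and scan it for non-zero data."""
    with open(path, "rb") as dev:
        return stream_has_nonzero(dev)
```

Calling device_has_nonzero("/dev/mapper/mirror") as root after the write to the mirror should return True if the random data reached the mirror, and False if the device still holds only zeros.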


Note that the discussion of this bug has moved on from the original.  If this bug has turned into a complaint about the operation of an unsupported configuration, then that is not really something that can be fixed.  (Although I have in the past presented patches to completely disable this unsupported configuration - which is still a possibility.)

Comment 20 Gris Ge 2010-12-16 10:21:48 UTC

On RHEL 5.5 GA kernel-2.6.18-194.el5 :
1. Connect 3 iscsi disks. (/dev/sda /dev/sdb /dev/sdc)
2. echo "0 1024 mirror disk 2 /dev/sda 1024 2 /dev/sdb 0 /dev/sdc 0" | dmsetup create mirror
3. dd if=/dev/zero  of=/dev/mapper/mirror
4. echo offline > /sys/block/sda/device/state
5. dd if=/dev/urandom of=/dev/mapper/mirror oflag=sync
6. dmsetup status
   mirror: 0 1024 mirror 2 8:16 8:32 1/1 1 AA 3 disk 8:0 D
7. hexdump /dev/mapper/mirror
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0080000

/var/log/messages got:
Dec 16 17:56:35 VMC2 kernel: sd 0:0:0:1: rejecting I/O to offline device

On kernel-2.6.18-236.el5:

The dd command finished without error. I/O does go to the device even though the log device is offline.
Same /var/log/messages error:
Dec 16 17:56:35 VMC2 kernel: sd 0:0:0:1: rejecting I/O to offline device


So we now let I/O through instead of failing it when the log device goes offline.
If that is what we expect, I think this bug has been fixed.

Comment 24 errata-xmlrpc 2011-01-13 20:59:31 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html

