Description of problem:
dm-raid1: fix data lost at mirror log failure

Version-Release number of selected component (if applicable):
2.6.18-182.el5

How reproducible:
See the following steps.

Steps to Reproduce:
1. Create a two-way mirror without the "block_on_error" option.
   # dmsetup table
   vg00-lv00_mimage_1: 0 24576 linear 8:48 384
   vg00-lv00_mimage_0: 0 24576 linear 8:32 384
   vg00-lv00_mlog: 0 8192 linear 8:64 384
   vg00-lv00: 0 24576 mirror disk 2 253:0 1024 2 253:1 0 253:2 0
2. Disable the device assigned to the mirror log.
   # echo offline > /sys/block/<dev>/device/state
3. Write I/O to the mirror device.
   # dd if=/dev/zero of=/dev/mapper/vg00-lv00 bs=4096 count=1 oflag=sync
   1+0 records in
   1+0 records out
   4096 bytes (4.1 kB) copied, 0.000557289 seconds, 7.3 MB/s
   *** Write I/O successfully finished ***
4. Check the status of the mirror device.
   # dmsetup status
   vg00-lv00_mimage_1: 0 24576 linear
   vg00-lv00_mimage_0: 0 24576 linear
   vg00-lv00_mlog: 0 8192 linear
   vg00-lv00: 0 24576 mirror 2 253:1 253:2 24/24 1 AA 3 disk 253:0 D
   *** mirror log is marked as "D" ***

Actual results:
A write I/O finishes successfully.

Expected results:
A write I/O is blocked and doesn't return when the log device of the mirror is marked as failed (i.e. the dmsetup status command shows the "D" state for the log device).

Additional info:
This issue was reported on dm-devel.
https://www.redhat.com/archives/dm-devel/2009-December/msg00211.html
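The status line in step 4 can also be checked from a script. The following is a minimal Python sketch that splits a mirror status line into its fields. It assumes the field layout seen in the output above (mirror count, leg devices, sync ratio, a constant field, leg health characters, then the log arguments). The function name `parse_mirror_status` and the dictionary keys are illustrative only and are not part of dmsetup.

```python
def parse_mirror_status(line):
    """Parse one mirror line of `dmsetup status` output.

    Assumed layout (taken from the output in this report):
    <name>: <start> <len> mirror <n> <leg dev>*n <sync>/<total> <k>
            <leg health> <log argc> <log type> [<log dev> <log health>]
    """
    name, rest = line.split(":", 1)
    fields = rest.split()
    if fields[2] != "mirror":
        raise ValueError("not a mirror target: %r" % line)

    nr_mirrors = int(fields[3])
    legs = fields[4:4 + nr_mirrors]
    pos = 4 + nr_mirrors

    in_sync, total = (int(x) for x in fields[pos].split("/"))
    leg_health = fields[pos + 2]            # one character per leg, e.g. "AA"
    log_argc = int(fields[pos + 3])
    log_args = fields[pos + 4:pos + 4 + log_argc]

    return {
        "name": name.strip(),
        "legs": legs,
        "in_sync": in_sync,
        "total": total,
        "leg_health": leg_health,
        "log_type": log_args[0],
        # Only a disk log reports a device and a health character here.
        "log_health": log_args[-1] if len(log_args) == 3 else None,
    }


status = parse_mirror_status(
    "vg00-lv00: 0 24576 mirror 2 253:1 253:2 24/24 1 AA 3 disk 253:0 D"
)
print(status["log_health"])   # "D" means the log device is marked as failed
```

With the line from step 4, both legs report "A" while the log reports "D", which is exactly the state in which the write should not have completed successfully.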
(In reply to comment #0)
> 3. Write I/O to the mirror device
> # dd if=/dev/zero of=/dev/mapper/vg00-lv00 bs=4096 count=1 oflag=sync
> 1+0 records in
> 1+0 records out
> 4096 bytes (4.1 kB) copied, 0.000557289 seconds, 7.3 MB/s
> *** Write I/O successfully finished ***

In reproduction step 3, no I/O is sent to the mirror legs, but the dd command finishes successfully. This causes data loss.

The code sequence is:
do_mirror()
  do_writes()
    * Bios are put into ms->failures when ms->log_failure is set.
  do_failures()
    * Bios in ms->failures are completed by bio_endio(bio, bio->bi_size, 0), i.e. with error 0 (success).
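The sequence above can be illustrated with a toy model. This is a simplified Python sketch, not the kernel implementation: the classes `Bio` and `ToyMirror` are hypothetical, and only the queue names (`writes`, `failures`) and the `log_failure` flag follow the description in this comment.

```python
from collections import deque


class Bio:
    def __init__(self, data):
        self.data = data
        self.status = None   # None = pending, 0 = success, negative = error


class ToyMirror:
    """Toy model of the write path described above (assumption-based)."""

    def __init__(self, nr_legs=2):
        self.legs = [[] for _ in range(nr_legs)]   # data that reached each leg
        self.writes = deque()
        self.failures = deque()
        self.log_failure = False

    def do_writes(self):
        while self.writes:
            bio = self.writes.popleft()
            if self.log_failure:
                # Log is marked failed: the bio is diverted, no leg is written.
                self.failures.append(bio)
            else:
                for leg in self.legs:
                    leg.append(bio.data)
                bio.status = 0

    def do_failures(self):
        while self.failures:
            bio = self.failures.popleft()
            # Models bio_endio(bio, bio->bi_size, 0): completed as success.
            bio.status = 0

    def do_mirror(self):
        self.do_writes()
        self.do_failures()


mirror = ToyMirror()
mirror.log_failure = True
bio = Bio(b"payload")
mirror.writes.append(bio)
mirror.do_mirror()
print(bio.status, mirror.legs)   # success is reported, yet both legs are empty
```

In this model the caller sees a successful completion although nothing was written to either leg, which matches the behavior observed with dd in step 3.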
Created attachment 383725
Patch for 2.6.18-182.el5 kernel
Taka: I think we shouldn't hold bios when "block_on_error" isn't specified. Without "block_on_error", dmeventd isn't running, and holding any bios in this case could deadlock the whole system. I'd simply pass the write to both legs if the log failed...
The RHEL5.4 kernel keeps bios in ms->failures while ms->log_failure == 1, and the current implementation never resets ms->log_failure. Therefore, the behavior is the same as in RHEL5.4. If this behavior is incorrect, then the RHEL5.4 behavior is incorrect as well.
Correction. The RHEL5.4 kernel can reset ms->log_failure:

do_writes()
  ms->log_failure = rh_flush(&ms->rh);

So the 2.6.18-182.el5 kernel behaves differently from RHEL5.4. In RHEL5.4, bios in the failures list can still be processed if ms->log_failure is reset. The 2.6.18-182.el5 kernel, on the other hand, never resets ms->log_failure once it is set. In that case, should bios be completed with -EIO when ms->log_failure is set?
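The difference between the two kernels can be sketched as follows. This is a simplified Python model, not kernel code: `flush_results` stands in for successive return values of rh_flush() (0 for success, 1 as a placeholder for any error), both function names are hypothetical, and the "sticky" update is an assumption derived only from the statement that the flag is never reset once set.

```python
def log_failure_reassigned(flush_results):
    """RHEL5.4 style: the flag is overwritten on every flush."""
    log_failure = 0
    history = []
    for result in flush_results:
        log_failure = result      # ms->log_failure = rh_flush(&ms->rh);
        history.append(log_failure)
    return history


def log_failure_sticky(flush_results):
    """2.6.18-182.el5 style as described: once set, never reset (assumed form)."""
    log_failure = 0
    history = []
    for result in flush_results:
        if result:
            log_failure = 1
        history.append(log_failure)
    return history


# The log fails on the second flush and works again on the third.
results = [0, 1, 0]
print(log_failure_reassigned(results))
print(log_failure_sticky(results))
```

With a reassigned flag, bios parked in the failures list get another chance once a later flush succeeds. With a sticky flag they never do, which is why the question of returning -EIO arises.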
Since it is too late to address this issue in RHEL 5.5, it has been proposed for RHEL 5.6. Contact your support representative if you need to escalate this issue.
Yes, it needs backporting. The appropriate commit is 5528d17de1cf1462f285c40ccaf8e0d0e4c64dc0 in 2.6.33.
in kernel-2.6.18-223.el5

You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.
This is the dmsetup setup and status when the log device is offline:
===================================================
[root@VM1-RHEL5-Dev ~]# dmsetup status
test_mirror: 0 24576 mirror 2 252:3 252:4 24/24 1 AA 3 disk 252:2 D
mimage1: 0 24576 linear
mimage0: 0 24576 linear
mlog: 0 24576 linear
[root@VM1-RHEL5-Dev ~]# dmsetup table
test_mirror: 0 24576 mirror disk 2 252:2 1024 2 252:3 0 252:4 0
mimage1: 0 24576 linear 3:64 384
mimage0: 0 24576 linear 3:0 384
mlog: 0 24576 linear 8:0 384
===================================================
On both kernel 2.6.18-194.el5 and 2.6.18-233.el5, the dd command returns success without any error. The only message is a kernel error:
sd 0:0:0:1: rejecting I/O to offline device

Mikulas, can you check the patch? It doesn't seem to fix the issue.
Did you run dd on the mirror device with the "oflag=sync" flag? Is dmeventd disabled for the test (so that LVM will not try to recover)?
Milan, the dd command was run with the "oflag=sync" flag.
mimage0, mimage1 and mlog are not LVs. Each is just a linear mapping of a disk, like this:
0 24576 linear /dev/sda 384

I ran dmeventd with the '-ddd' option for debugging, but no extra error besides "rejecting I/O" appeared in /var/log/messages.
Do I need to create dm-raid0 on LV?
Do we have the info needed to retest this? RE: comment 17.
Steps to verify bug fix:

Step 1: Create the unsupported setup
[Note: You can't create the unsupported setup by any means available through LVM - you must do it by hand.]
~> echo "0 1024 mirror disk 2 <devA> 1024 2 <devB> 0 <devC> 0" | \
   dmsetup create mirror

Step 2: Clear mirror device
~> dd if=/dev/zero of=/dev/mapper/mirror

Step 3: Disable log device
~> echo offline > /sys/block/<devA>/device/state

Step 4: Write to mirror
~> dd if=/dev/urandom of=/dev/mapper/mirror

Step 5: Verify the write took place
## Check contents of /dev/mapper/mirror: non-zero means success

Note that the discussion of this bug has moved on from the original. If this bug has turned into a complaint about the operation of an unsupported configuration, then that is not really something that can be fixed. (Although I have in the past presented patches to completely disable this unsupported configuration - which is still a possibility.)
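The final verification step (non-zero contents mean success) can be scripted. Below is a minimal Python sketch, assuming the device node is readable by the user running it. The helper name `has_nonzero_data` is illustrative, and the path in the usage note is the one from the steps above.

```python
def has_nonzero_data(path, chunk_size=64 * 1024):
    """Return True if any byte read from `path` is non-zero."""
    with open(path, "rb") as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:
                # End of device or file reached without seeing non-zero data.
                return False
            if chunk.count(0) != len(chunk):
                return True
```

For example, `has_nonzero_data("/dev/mapper/mirror")` returning True after the urandom write would indicate that the write reached the mirror, while False would indicate that the device still holds only the zeros written when it was cleared.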
On RHEL 5.5 GA kernel-2.6.18-194.el5:
1. Connect 3 iSCSI disks (/dev/sda, /dev/sdb, /dev/sdc).
2. echo "0 1024 mirror disk 2 /dev/sda 1024 2 /dev/sdb 0 /dev/sdc 0" | dmsetup create mirror
3. dd if=/dev/zero of=/dev/mapper/mirror
4. echo offline > /sys/block/sda/device/state
5. dd if=/dev/urandom of=/dev/mapper/mirror oflag=sync
6. dmsetup status
   mirror: 0 1024 mirror 2 8:16 8:32 1/1 1 AA 3 disk 8:0 D
7. hexdump /dev/mapper/mirror
   0000000 0000 0000 0000 0000 0000 0000 0000 0000
   *
   0080000

/var/log/messages got:
Dec 16 17:56:35 VMC2 kernel: sd 0:0:0:1: rejecting I/O to offline device

On kernel-2.6.18-236.el5:
The dd command finished without error. I/O does go to the devices even though the log device is offline.
Same /var/log/messages error:
Dec 16 17:56:35 VMC2 kernel: sd 0:0:0:1: rejecting I/O to offline device

So we now let I/O through instead of failing it when the log device goes offline. If that is what we expect, I think this bug has been fixed.
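A note on reading the hexdump output in step 7: hexdump prints offsets in hexadecimal and collapses repeated identical lines into `*`, so a single line of zeros followed by `*` and the end offset means the whole device is zero. The end offset can be cross-checked against the table length, since device-mapper tables count 512-byte sectors. A small Python sketch of that arithmetic (the function names are illustrative):

```python
SECTOR_SIZE = 512  # device-mapper table lengths are counted in 512-byte sectors


def device_size_bytes(table_sectors):
    """Size in bytes of a mapping whose table length is `table_sectors`."""
    return table_sectors * SECTOR_SIZE


def hexdump_end_offset(table_sectors):
    """Final offset, as a 7-digit hex string, for a device of this size."""
    return format(device_size_bytes(table_sectors), "07x")


print(device_size_bytes(1024))    # 524288
print(hexdump_end_offset(1024))   # 0080000
```

The 1024-sector mirror is 524288 bytes, i.e. 0x80000, which matches the final offset shown. On the 194 kernel the whole device therefore still reads as zeros after the urandom write.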
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-0017.html