Bug 825323 - mirror image + mirror log image device failures on multiple mirrors leads to I/O errors
mirror image + mirror log image device failures on multiple mirrors leads to ...
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2 (Show other bugs)
6.3
Unspecified Unspecified
urgent Severity urgent
: rc
: 6.4
Assigned To: Jonathan Earl Brassow
Corey Marthaler
:
Depends On:
Blocks: 732124 886216
  Show dependency treegraph
 
Reported: 2012-05-25 12:50 EDT by Corey Marthaler
Modified: 2013-02-21 03:10 EST (History)
13 users (show)

See Also:
Fixed In Version: lvm2-2.02.98-4.el6
Doc Type: Bug Fix
Doc Text:
A mirror logical volume can itself have a mirrored log device. When a device in an image of the mirror and its log failed at the same time, it was possible for I/O errors to appear on the mirror LV when they should have been handled. That is, the kernel would not absorb the I/O errors from the failed device by relying on the remaining device. The cause was found to be that the mirror was not suspended for repair using the 'noflush' flag. This flag allows the kernel to requeue I/O requests that need to be retried. Because the kernel was not allowed to requeue the requests, it had no choice but to return the I/O as errored. This issue has been corrected and the mirror is now properly suspended with the 'noflush' flag.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-02-21 03:10:24 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 219543 None None None Never

  None (edit)
Description Corey Marthaler 2012-05-25 12:50:16 EDT
Description of problem:
./helter_skelter -o taft-01 -l /home/msp/cmarthal/work/sts/sts-root -r /usr/tests/sts-rhel6.3 -f

================================================================================
Iteration 1.9 started at Thu May 24 19:41:30 CDT 2012
================================================================================
Scenario kill_pri_log_and_pri_leg_2_legs_2_logs: Kill primary leg and primary log of synced 2 leg redundant log mirror(s)

********* Mirror hash info for this scenario *********
* names:              syncd_pri_leg_pri_log_2legs_2logs_1 syncd_pri_leg_pri_log_2legs_2logs_2
* sync:               1
* striped:            0
* leg devices:        /dev/sdc1 /dev/sdb1
* log devices:        /dev/sdd1 /dev/sde1
* no MDA devices:     
* failpv(s):          /dev/sdc1 /dev/sdd1
* failnode(s):        taft-01
* leg fault policy:   remove
* log fault policy:   remove
******************************************************

Creating mirror(s) on taft-01...
taft-01: lvcreate --mirrorlog mirrored -m 1 -n syncd_pri_leg_pri_log_2legs_2logs_1 -L 500M helter_skelter /dev/sdc1:0-1000 /dev/sdb1:0-1000 /dev/sdd1:0-150 /dev/sde1:0-150
taft-01: lvcreate --mirrorlog mirrored -m 1 -n syncd_pri_leg_pri_log_2legs_2logs_2 -L 500M helter_skelter /dev/sdc1:0-1000 /dev/sdb1:0-1000 /dev/sdd1:0-150 /dev/sde1:0-150

Mirror Structure(s):
  LV                                                  Attr     LSize   Copy%  Devices                                                                                                  
  syncd_pri_leg_pri_log_2legs_2logs_1                 mwi-a-m- 500.00m  41.60 syncd_pri_leg_pri_log_2legs_2logs_1_mimage_0(0),syncd_pri_leg_pri_log_2legs_2logs_1_mimage_1(0)          
  [syncd_pri_leg_pri_log_2legs_2logs_1_mimage_0]      Iwi-aom- 500.00m        /dev/sdc1(0)                                                                                             
  [syncd_pri_leg_pri_log_2legs_2logs_1_mimage_1]      Iwi-aom- 500.00m        /dev/sdb1(0)                                                                                             
  [syncd_pri_leg_pri_log_2legs_2logs_1_mlog]          mwi-aom-   4.00m 100.00 syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_0(0),syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_1(0)
  [syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_0] iwi-aom-   4.00m        /dev/sdd1(0)                                                                                             
  [syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_1] iwi-aom-   4.00m        /dev/sde1(0)                                                                                             
  syncd_pri_leg_pri_log_2legs_2logs_2                 mwi-a-m- 500.00m  12.00 syncd_pri_leg_pri_log_2legs_2logs_2_mimage_0(0),syncd_pri_leg_pri_log_2legs_2logs_2_mimage_1(0)          
  [syncd_pri_leg_pri_log_2legs_2logs_2_mimage_0]      Iwi-aom- 500.00m        /dev/sdc1(125)                                                                                           
  [syncd_pri_leg_pri_log_2legs_2logs_2_mimage_1]      Iwi-aom- 500.00m        /dev/sdb1(125)                                                                                           
  [syncd_pri_leg_pri_log_2legs_2logs_2_mlog]          mwi-aom-   4.00m 100.00 syncd_pri_leg_pri_log_2legs_2logs_2_mlog_mimage_0(0),syncd_pri_leg_pri_log_2legs_2logs_2_mlog_mimage_1(0)
  [syncd_pri_leg_pri_log_2legs_2logs_2_mlog_mimage_0] iwi-aom-   4.00m        /dev/sdd1(1)                                                                                             
  [syncd_pri_leg_pri_log_2legs_2logs_2_mlog_mimage_1] iwi-aom-   4.00m        /dev/sde1(1)                                                                                             

PV=/dev/sdd1
        syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_0: 2
        syncd_pri_leg_pri_log_2legs_2logs_2_mlog_mimage_0: 2
PV=/dev/sdc1
        syncd_pri_leg_pri_log_2legs_2logs_1_mimage_0: 6
        syncd_pri_leg_pri_log_2legs_2logs_2_mimage_0: 6
PV=/dev/sdd1
        syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_0: 2
        syncd_pri_leg_pri_log_2legs_2logs_2_mlog_mimage_0: 2
PV=/dev/sdc1
        syncd_pri_leg_pri_log_2legs_2logs_1_mimage_0: 6
        syncd_pri_leg_pri_log_2legs_2logs_2_mimage_0: 6

Waiting until all mirror|raid volumes become fully syncd...
   0/2 mirror(s) are fully synced: ( 93.50% 63.80% )
   2/2 mirror(s) are fully synced: ( 100.00% 100.00% )

Creating ext on top of mirror(s) on taft-01...
mke2fs 1.41.12 (17-May-2010)
mke2fs 1.41.12 (17-May-2010)
Mounting mirrored ext filesystems on taft-01...

Writing verification files (checkit) to mirror(s) on...
        ---- taft-01 ----

Sleeping 10 seconds to get some outsanding EXT I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- taft-01 ----

Disabling device sdc on taft-01
Disabling device sdd on taft-01

<fail name="taft-01_syncd_pri_leg_pri_log_2legs_2logs_2" pid="3434" time="Thu May 24 19:42:43 2012" type="cmd" duration="30" ec="1" />
ALL STOP!
<stop name="taft-01_syncd_pri_leg_pri_log_2legs_2logs_1" pid="3432" time="Thu May 24 19:42:43 2012" type="cmd" duration="30" ec="127" />

Attempting I/O to cause mirror down conversion(s) on taft-01
dd: opening `/mnt/syncd_pri_leg_pri_log_2legs_2logs_1/ddfile': Invalid argument
couldn't write to syncd_pri_leg_pri_log_2legs_2logs_1

FI_engine: recover() method failed


[root@taft-01 ~]# lvs -a -o +devices
  Couldn't find device with uuid Lli1l4-KrjS-PaCy-Z3EA-XHCf-r2n3-xBNBuW.
  Couldn't find device with uuid tflrpd-Ssf0-My84-XYhp-20b6-oKzP-8nSLd4.
  LV                                  VG             Attr     LSize   Pool Origin Data%  Move Log Copy%  Convert Devices         
  syncd_pri_leg_pri_log_2legs_2logs_1 helter_skelter -wi-ao-- 500.00m                                            /dev/sdb1(0)    
  syncd_pri_leg_pri_log_2legs_2logs_2 helter_skelter -wi-ao-- 500.00m                                            /dev/sdb1(125)  

[root@taft-01 ~]# mount
/dev/mapper/helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1 on /mnt/syncd_pri_leg_pri_log_2legs_2logs_1 type ext3 (rw)
/dev/mapper/helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_2 on /mnt/syncd_pri_leg_pri_log_2legs_2logs_2 type ext3 (rw)


[root@taft-01 ~]# touch /mnt/syncd_pri_leg_pri_log_2legs_2logs_1/foobar
touch: setting times of `/mnt/syncd_pri_leg_pri_log_2legs_2logs_1/foobar': No such file or directory

[root@taft-01 ~]# touch /mnt/syncd_pri_leg_pri_log_2legs_2logs_2/foobar
touch: setting times of `/mnt/syncd_pri_leg_pri_log_2legs_2logs_2/foobar': No such file or directory

[root@taft-01 ~]# dd if=/dev/zero of=/mnt/syncd_pri_leg_pri_log_2legs_2logs_1/ddfile count=1
dd: opening `/mnt/syncd_pri_leg_pri_log_2legs_2logs_1/ddfile': Invalid argument


Version-Release number of selected component (if applicable):
2.6.32-274.el6.x86_64
lvm2-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
lvm2-libs-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
lvm2-cluster-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
udev-147-2.41.el6    BUILT: Thu Mar  1 13:01:08 CST 2012
device-mapper-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-libs-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-event-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-event-libs-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
cmirror-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
Comment 1 Corey Marthaler 2012-05-25 12:55:58 EDT
May 24 19:43:29 taft-01 qarshd[21121]: Running cmdline: echo offline > /sys/block/sdc/device/state &
May 24 19:43:29 taft-01 qarshd[21124]: Running cmdline: echo offline > /sys/block/sdd/device/state &
May 24 19:43:31 taft-01 lvm[3045]: Primary mirror device 253:9 has failed (D).
May 24 19:43:31 taft-01 lvm[3045]: Device failure in helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_2_mlog.
May 24 19:43:33 taft-01 kernel: EXT3-fs: ext3_journal_dirty_data: aborting transaction: IO failure in ext3_journal_dirty_data
May 24 19:43:33 taft-01 kernel: device-mapper: raid1: log postsuspend failed
May 24 19:43:33 taft-01 kernel: EXT3-fs (dm-14): error in ext3_ordered_write_end: IO failure
May 24 19:43:33 taft-01 kernel: EXT3-fs (dm-14): previous I/O error to superblock detected
May 24 19:43:33 taft-01 kernel: JBD: Detected IO errors while flushing file data on dm-14
May 24 19:43:33 taft-01 lvm[3045]: /dev/helter_skelter/syncd_pri_leg_pri_log_2legs_2logs_2: read failed after 0 of 1024 at 524222464: Input/output error
May 24 19:43:35 taft-01 kernel: device-mapper: raid1: Mirror read failed from 253:6. Trying alternative device.
May 24 19:43:37 taft-01 lvm[3045]: Mirror status: 1 of 2 images failed.
May 24 19:43:37 taft-01 lvm[3045]: Mirror log status: 1 of 2 images failed.
May 24 19:43:37 taft-01 lvm[3045]: Repair of mirrored device helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_2_mlog finished successfully.
May 24 19:43:37 taft-01 lvm[3045]: Primary mirror device 253:3 has failed (D).
May 24 19:43:37 taft-01 lvm[3045]: Device failure in helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog.
May 24 19:43:37 taft-01 lvm[3045]: dm_task_run failed, errno = 6, No such device or address
May 24 19:43:37 taft-01 lvm[3045]: helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_2_mlog disappeared, detaching
May 24 19:43:37 taft-01 lvm[3045]: No longer monitoring mirror device helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_2_mlog for events.
May 24 19:43:39 taft-01 lvm[3045]: Internal error: Performing unsafe table load while 4 device(s) are known to be suspended:  (253:7)
May 24 19:43:39 taft-01 kernel: EXT3-fs: ext3_journal_dirty_data: aborting transaction: IO failure in ext3_journal_dirty_data
May 24 19:43:39 taft-01 kernel: device-mapper: raid1: log postsuspend failed
May 24 19:43:39 taft-01 kernel: EXT3-fs (dm-8): error in ext3_ordered_write_end: IO failure
May 24 19:43:39 taft-01 kernel: EXT3-fs (dm-8): previous I/O error to superblock detected
May 24 19:43:39 taft-01 kernel: JBD: Detected IO errors while flushing file data on dm-8
May 24 19:43:39 taft-01 lvm[3045]: /dev/helter_skelter/syncd_pri_leg_pri_log_2legs_2logs_1: read failed after 0 of 1024 at 524222464: Input/output error
May 24 19:43:41 taft-01 lvm[3045]: Mirror status: 1 of 2 images failed.
May 24 19:43:41 taft-01 lvm[3045]: Mirror log status: 1 of 2 images failed.
May 24 19:43:41 taft-01 lvm[3045]: Repair of mirrored device helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog finished successfully.
May 24 19:43:41 taft-01 lvm[3045]: Log device 253:11 has failed (D).
May 24 19:43:41 taft-01 lvm[3045]: Device failure in helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_2.
May 24 19:43:41 taft-01 lvm[3045]: device-mapper: waitevent ioctl on  failed: No such device or address
May 24 19:43:41 taft-01 lvm[3045]: dm_task_run failed, errno = 22, Invalid argument
May 24 19:43:41 taft-01 lvm[3045]: dm_task_run failed, errno = 6, No such device or address
May 24 19:43:41 taft-01 lvm[3045]: helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog disappeared, detaching
May 24 19:43:41 taft-01 lvm[3045]: No longer monitoring mirror device helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog for events.
May 24 19:43:41 taft-01 lvm[3045]: Couldn't find device with uuid Lli1l4-KrjS-PaCy-Z3EA-XHCf-r2n3-xBNBuW.
May 24 19:43:41 taft-01 lvm[3045]: Couldn't find device with uuid tflrpd-Ssf0-My84-XYhp-20b6-oKzP-8nSLd4.
May 24 19:43:41 taft-01 lvm[3045]: Repair of mirrored device helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_2 finished successfully.
May 24 19:43:41 taft-01 lvm[3045]: Primary mirror device 253:6 read failed.
May 24 19:43:41 taft-01 lvm[3045]: helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1 is now in-sync.
May 24 19:43:41 taft-01 lvm[3045]: helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_2 has unmirrored portion.
May 24 19:43:41 taft-01 lvm[3045]: helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1 has unmirrored portion.
May 24 19:43:41 taft-01 qarshd[21228]: Running cmdline: dd if=/dev/zero of=/mnt/syncd_pri_leg_pri_log_2legs_2logs_1/ddfile count=10 bs=4M
Comment 2 Corey Marthaler 2012-05-25 15:22:04 EDT
This is 100% reproducable when there are 2 mirrors but not at all when there is just one mirror.
Comment 3 Corey Marthaler 2012-05-30 13:24:38 EDT
This problem only happens on 2 legged mirrors. The mirror image + mirror log image device failure scenarios pass on 3 legged mirrors.
Comment 4 RHEL Product and Program Management 2012-07-10 04:34:03 EDT
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.
Comment 5 RHEL Product and Program Management 2012-07-10 19:58:39 EDT
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.
Comment 7 Jonathan Earl Brassow 2012-11-13 18:13:51 EST
test name is: kill_pri_log_and_pri_leg_2_legs_2_logs
Comment 8 Jonathan Earl Brassow 2012-11-14 16:01:01 EST
Fix checked in upstream (pending release is 2.02.99)

commit 54c73b7723713f43413584d59ca0bdd42c1d8241
Author: Jonathan Brassow <jbrassow@redhat.com>
Date:   Wed Nov 14 14:58:47 2012 -0600

    mirror:  Mirrored log should be fixed before mirror when double fault occu
    
    This patch is intended to fix bug 825323 - FS turns read-only during a dou
    fault of a mirror leg and mirrored log's leg at the same time.  It only
    affects a 2-way mirror with a mirrored log.  3+-way mirrors and mirrors
    without a mirrored log are not affected.
    
    The problem resulted from the fact that the top level mirror was not
    using 'noflush' when suspending before its "down-convert".  When a
    mirror image fails, the bios are queue until a suspend is recieved.  If
    it is a 'noflush' suspend, the bios can be safely requeued in the DM
    core.  If 'noflush' is not used, the bios must be pushed through the
    target and if a device is failed for a mirror, that means issuing an
    error.  When an error is received by a file system, it results in it
    turning read-only (depending on the FS).
    
    Part of the problem was is due to the nature of the stacking involved in
    using a mirror as a mirror's log.  When an image in each fail, the top
    level mirror stalls because it is waiting for a log flush.  The other
    stalls waiting for corrective action.  When the repair command is issued,
    the entire stacked arrangement is collapsed to a linear LV.  The log
    flush then fails (somewhat uncleanly) and the top-level mirror is suspende
    without 'noflush' because it is a linear device.
    
    This patch allows the log to be repaired first, which in turn allows the
    top-level mirror's log flush to complete cleanly.  The top-level mirror
    is then secondarily reduced to a linear device - at which time this mirror
    is suspended properly with 'noflush'.
Comment 9 Jonathan Earl Brassow 2012-11-14 16:03:32 EST
Unit tests:
- passes LVM test suite
- passes STS test helter_skelter:kill_pri_log_and_pri_leg_2_legs_2_logs, it did not pass a single iteration of this test before the patch.
...
================================================================================
Iteration 35.1 started at Wed Nov 14 15:02:09 CST 2012
================================================================================
Scenario kill_pri_log_and_pri_leg_2_legs_2_logs: Kill primary leg and primary log of synced 2 leg redundant log mirror(s)
...
Comment 13 Corey Marthaler 2012-12-06 18:52:45 EST
Fix verified in the latest rpms.

2.6.32-343.el6.x86_64
lvm2-2.02.98-4.el6    BUILT: Wed Dec  5 08:35:04 CST 2012
lvm2-libs-2.02.98-4.el6    BUILT: Wed Dec  5 08:35:04 CST 2012
lvm2-cluster-2.02.98-4.el6    BUILT: Wed Dec  5 08:35:04 CST 2012
udev-147-2.43.el6    BUILT: Thu Oct 11 05:59:38 CDT 2012
device-mapper-1.02.77-4.el6    BUILT: Wed Dec  5 08:35:04 CST 2012
device-mapper-libs-1.02.77-4.el6    BUILT: Wed Dec  5 08:35:04 CST 2012
device-mapper-event-1.02.77-4.el6    BUILT: Wed Dec  5 08:35:04 CST 2012
device-mapper-event-libs-1.02.77-4.el6    BUILT: Wed Dec  5 08:35:04 CST 2012
cmirror-2.02.98-4.el6    BUILT: Wed Dec  5 08:35:04 CST 2012


...
================================================================================
Iteration 10.1 started at Thu Dec  6 17:38:12 CST 2012
================================================================================
Scenario kill_pri_log_and_pri_leg_2_legs_2_logs: Kill primary leg and primary log of synced 2 leg redundant log mirror(s)
...
Comment 15 errata-xmlrpc 2013-02-21 03:10:24 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0501.html

Note You need to log in before you can comment on or make changes to this bug.