Bug 734252 - problem up converting striped mirror after any leg device failure
Summary: problem up converting striped mirror after any leg device failure
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jonathan Earl Brassow
QA Contact: Corey Marthaler
URL:
Whiteboard:
Depends On:
Blocks: 743047
 
Reported: 2011-08-29 21:44 UTC by Corey Marthaler
Modified: 2011-12-06 17:02 UTC (History)

Fixed In Version: lvm2-2.02.87-3.el6
Doc Type: Bug Fix
Doc Text:
Do not document.
Clone Of:
Environment:
Last Closed: 2011-12-06 17:02:53 UTC


Links
System: Red Hat Product Errata
ID: RHBA-2011:1522
Priority: normal
Status: SHIPPED_LIVE
Summary: lvm2 bug fix and enhancement update
Last Updated: 2011-12-06 00:50:10 UTC

Description Corey Marthaler 2011-08-29 21:44:20 UTC
Description of problem:
With both the leg and log fault policies set to "allocate", this mirror should have remained a 2-way mirror with a disk log after the primary leg failure. Instead, it ended up as an unmirrored volume with no mirror images and no log.


Scenario: Kill primary leg of striped 2 leg mirror(s)

********* Mirror hash info for this scenario *********
* names:              striped_primary_2legs_1
* sync:               1
* striped:            1
* leg devices:        /dev/sdh1 /dev/sdf1 /dev/sdg1 /dev/sdc1
* log devices:        /dev/sde1
* no MDA devices:     
* failpv(s):          /dev/sdh1
* failnode(s):        taft-01
* leg fault policy:   allocate
* log fault policy:   allocate
******************************************************

Creating mirror(s) on taft-01...
taft-01: lvcreate -m 1 -i 2 -n striped_primary_2legs_1 -L 300M helter_skelter /dev/sdh1:0-1000 /dev/sdf1:0-1000 /dev/sdg1:0-1000 /dev/sdc1:0-1000 /dev/sde1:0-150
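For reference, the geometry requested by the lvcreate command above can be checked with a little arithmetic. This is only a sketch and not part of the test harness: the function names are made up for illustration, and it assumes the LVM default extent size of 4 MiB, since the report does not state the extent size of helter_skelter.

```python
import math

def pvs_needed(mirrors, stripes, disk_logs=1):
    """PVs used by 'lvcreate -m <mirrors> -i <stripes>' with a disk log."""
    images = mirrors + 1              # -m 1 adds one copy, giving 2 images
    return images * stripes + disk_logs

def rounded_size_mib(requested_mib, stripes, extent_mib=4):
    """Round a requested size up to whole extents, then to a stripe multiple."""
    extents = math.ceil(requested_mib / extent_mib)
    extents = math.ceil(extents / stripes) * stripes
    return extents * extent_mib

print(pvs_needed(mirrors=1, stripes=2))     # 5
print(rounded_size_mib(300, stripes=2))     # 304
```

Under that assumption the numbers line up with the report: four leg devices plus one log device are passed on the command line, and the 300M request shows up as 304.00m in the later lvs output.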

PV=/dev/sdh1
        striped_primary_2legs_1_mimage_0: 5.1
PV=/dev/sdh1
        striped_primary_2legs_1_mimage_0: 5.1

Waiting until all mirrors become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )

Creating ext on top of mirror(s) on taft-01...
mke2fs 1.41.12 (17-May-2010)
Mounting mirrored ext filesystems on taft-01...

Writing verification files (checkit) to mirror(s) on...
        ---- taft-01 ----

Sleeping 10 seconds to get some outsanding EXT I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- taft-01 ----

Disabling device sdh on taft-01

Attempting I/O to cause mirror down conversion(s) on taft-01
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.481984 s, 87.0 MB/s
Verifying current sanity of lvm after the failure
  Couldn't find device with uuid 3z2rTL-iIxv-Az84-cXhO-nIdq-dVaD-FNttSk.
Verifying FAILED device /dev/sdh1 is *NOT* in the volume(s)
  Couldn't find device with uuid 3z2rTL-iIxv-Az84-cXhO-nIdq-dVaD-FNttSk.
olog: 1
Verifying LOG device(s) /dev/sde1 *ARE* in the mirror(s)
  Couldn't find device with uuid 3z2rTL-iIxv-Az84-cXhO-nIdq-dVaD-FNttSk.
log device /dev/sde1 should still be present on taft-01



[root@taft-01 ~]# lvs -a -o +devices
  Couldn't find device with uuid 3z2rTL-iIxv-Az84-cXhO-nIdq-dVaD-FNttSk.
  LV                      VG             Attr   LSize   Log Copy%  Devices
  striped_primary_2legs_1 helter_skelter -wi-ao 304.00m            /dev/sdg1(0),/dev/sdc1(0)
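The lvs line above can be read mechanically. The following is a rough sketch, not an LVM tool: the helper name is hypothetical, and it relies only on the first lv_attr character being "m" (or "M") for a mirrored LV.

```python
def parse_lvs_line(line):
    """Split one 'lvs -a -o +devices' row with empty Log and Copy% columns."""
    fields = line.split()
    name, vg, attr, size = fields[:4]
    devices = fields[-1].split(",")
    return {
        "name": name,
        "vg": vg,
        "attr": attr,
        "size": size,
        "devices": devices,
        "mirrored": attr[:1] in ("m", "M"),   # first attr char is volume type
    }

row = parse_lvs_line(
    "  striped_primary_2legs_1 helter_skelter -wi-ao 304.00m"
    "            /dev/sdg1(0),/dev/sdc1(0)"
)
print(row["mirrored"])   # False
print(row["devices"])    # ['/dev/sdg1(0)', '/dev/sdc1(0)']
```

The row reports no mirror type and only the two surviving stripe devices, which is the symptom described in this report: the log device /dev/sde1 and a replacement leg are both absent.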



Aug 29 15:14:28 taft-01 lvm[3013]: Monitoring mirror device helter_skelter-striped_primary_2legs_1 for events.
Aug 29 15:14:35 taft-01 lvm[3013]: helter_skelter-striped_primary_2legs_1 is now in-sync.
Aug 29 15:15:17 taft-01 lvm[3013]: Primary mirror device 253:3 has failed (D).
Aug 29 15:15:17 taft-01 lvm[3013]: Device failure in helter_skelter-striped_primary_2legs_1.
Aug 29 15:15:18 taft-01 lvm[3013]: /dev/sdh1: read failed after 0 of 512 at 145669554176: Input/output error
Aug 29 15:15:18 taft-01 lvm[3013]: /dev/sdh1: read failed after 0 of 512 at 145669664768: Input/output error
Aug 29 15:15:18 taft-01 lvm[3013]: /dev/sdh1: read failed after 0 of 512 at 0: Input/output error
Aug 29 15:15:18 taft-01 lvm[3013]: /dev/sdh1: read failed after 0 of 512 at 4096: Input/output error
Aug 29 15:15:18 taft-01 lvm[3013]: /dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
Aug 29 15:15:18 taft-01 lvm[3013]: Couldn't find device with uuid 3z2rTL-iIxv-Az84-cXhO-nIdq-dVaD-FNttSk.
Aug 29 15:15:22 taft-01 lvm[3013]: Mirror status: 1 of 2 images failed.
Aug 29 15:15:22 taft-01 lvm[3013]: Trying to up-convert to 2 images, 1 logs.
Aug 29 15:15:23 taft-01 lvm[3013]: LV striped_primary_2legs_1: segment 1 log LV striped_primary_2legs_1_mlog is not a mirror log or a RAID image
Aug 29 15:15:23 taft-01 lvm[3013]: Internal error: LV segments corrupted in striped_primary_2legs_1.
Aug 29 15:15:23 taft-01 lvm[3013]: Trying to up-convert to 2 images, 0 logs.
Aug 29 15:15:24 taft-01 lvm[3013]: WARNING: Failed to replace 1 of 1 logs in volume striped_primary_2legs_1
Aug 29 15:15:24 taft-01 lvm[3013]: Repair of mirrored LV helter_skelter/striped_primary_2legs_1 finished successfully.
Aug 29 15:15:25 taft-01 lvm[3013]: No longer monitoring mirror device helter_skelter-striped_primary_2legs_1 for events.
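When triaging logs like the one above, it can help to pull out just the up-convert attempts and any internal errors. The snippet below is a small illustrative sketch; the function name and the regular expression are assumptions based only on the message wording seen in this report.

```python
import re

LOG = """\
Aug 29 15:15:22 taft-01 lvm[3013]: Mirror status: 1 of 2 images failed.
Aug 29 15:15:22 taft-01 lvm[3013]: Trying to up-convert to 2 images, 1 logs.
Aug 29 15:15:23 taft-01 lvm[3013]: Internal error: LV segments corrupted in striped_primary_2legs_1.
Aug 29 15:15:24 taft-01 lvm[3013]: Trying to up-convert to 2 images, 0 logs.
Aug 29 15:15:24 taft-01 lvm[3013]: WARNING: Failed to replace 1 of 1 logs in volume striped_primary_2legs_1
"""

ATTEMPT = re.compile(r"Trying to up-convert to (\d+) images, (\d+) logs\.")

def summarize(lines):
    """Return (up-convert attempts as (images, logs), internal error texts)."""
    attempts, errors = [], []
    for line in lines:
        match = ATTEMPT.search(line)
        if match:
            attempts.append((int(match.group(1)), int(match.group(2))))
        elif "Internal error:" in line:
            errors.append(line.split("Internal error:", 1)[1].strip())
    return attempts, errors

attempts, errors = summarize(LOG.splitlines())
print(attempts)
print(errors)
```

For this excerpt the summary shows two attempts, first with one log and then with none, with the internal error in between. The final "finished successfully" message in the full log is therefore misleading.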


Version-Release number of selected component (if applicable):
2.6.32-191.el6.x86_64

lvm2-2.02.87-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
lvm2-libs-2.02.87-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
lvm2-cluster-2.02.87-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
udev-147-2.37.el6    BUILT: Wed Aug 10 07:48:15 CDT 2011
device-mapper-1.02.66-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
device-mapper-libs-1.02.66-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
device-mapper-event-1.02.66-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
device-mapper-event-libs-1.02.66-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011
cmirror-2.02.87-1.el6    BUILT: Fri Aug 12 06:11:57 CDT 2011

Comment 1 Corey Marthaler 2011-08-30 17:03:35 UTC
I reproduced this as well when failing the secondary leg device.

Aug 29 16:48:18 taft-01 lvm[3013]: helter_skelter-striped_secondary_2legs_1 is now in-sync.
Aug 29 16:48:59 taft-01 lvm[3013]: Secondary mirror device 253:4 has failed (D).
Aug 29 16:48:59 taft-01 lvm[3013]: Device failure in helter_skelter-striped_secondary_2legs_1.
Aug 29 16:48:59 taft-01 lvm[3013]: /dev/sdc1: read failed after 0 of 512 at 145669554176: Input/output error
Aug 29 16:48:59 taft-01 lvm[3013]: /dev/sdc1: read failed after 0 of 512 at 145669664768: Input/output error
Aug 29 16:48:59 taft-01 lvm[3013]: /dev/sdc1: read failed after 0 of 512 at 0: Input/output error
Aug 29 16:48:59 taft-01 lvm[3013]: /dev/sdc1: read failed after 0 of 512 at 4096: Input/output error
Aug 29 16:48:59 taft-01 lvm[3013]: /dev/sdc1: read failed after 0 of 2048 at 0: Input/output error
Aug 29 16:49:00 taft-01 lvm[3013]: Couldn't find device with uuid 152PWT-mZDJ-mcbs-fT5N-Ga3s-MaAf-jzuZAu.
Aug 29 16:49:04 taft-01 lvm[3013]: Mirror status: 1 of 2 images failed.
Aug 29 16:49:04 taft-01 lvm[3013]: Trying to up-convert to 2 images, 1 logs.
Aug 29 16:49:05 taft-01 lvm[3013]: LV striped_secondary_2legs_1: segment 1 log LV striped_secondary_2legs_1_mlog is not a mirror log or a RAID image
Aug 29 16:49:05 taft-01 lvm[3013]: Internal error: LV segments corrupted in striped_secondary_2legs_1.
Aug 29 16:49:05 taft-01 lvm[3013]: Trying to up-convert to 2 images, 0 logs.
Aug 29 16:49:06 taft-01 lvm[3013]: WARNING: Failed to replace 1 of 1 logs in volume striped_secondary_2legs_1
Aug 29 16:49:06 taft-01 lvm[3013]: Repair of mirrored LV helter_skelter/striped_secondary_2legs_1 finished successfully.
Aug 29 16:49:08 taft-01 lvm[3013]: No longer monitoring mirror device helter_skelter-striped_secondary_2legs_1 for events.

Comment 2 Jonathan Earl Brassow 2011-09-14 02:47:44 UTC
Fix for bug 734252 - problem up converting striped mirror after image failure

lv_mirror_count was not able to handle mirrors of stripes properly.  When a
failed device is removed, the MIRRORED status flag is cleared from the LV
conditionally, based on the result of lv_mirror_count.  However, lv_mirror_count
trusted the MIRRORED flag, assuming any LV carrying it must be mirrored.  It
would happily return first_seg(lv)->area_count as the number of mirrors, but
once a mirrored striped LV had been reduced to a simple striped LV, area_count
was the number of /stripes/, not the number of /mirrors/.  lv_mirror_count
therefore returned a result higher than 1, the MIRRORED flag was not cleared,
and the LV failed to be up-converted properly in lvconvert_mirrors_aux.
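
The failure mode described above can be illustrated with a toy model. This is a simplified sketch in Python, not the LVM source (which is C): the classes, the flag value, and the segment-type check in mirror_count_fixed are all assumptions made for illustration, and the real fix in the commits may be implemented differently.

```python
from dataclasses import dataclass
from typing import List

MIRRORED = 0x1  # hypothetical flag value, for illustration only

@dataclass
class Segment:
    segtype: str      # "mirror" or "striped"
    area_count: int   # images for a mirror segment, stripes for a striped one

@dataclass
class LV:
    name: str
    status: int
    segments: List[Segment]

def mirror_count_buggy(lv):
    """Trusts the MIRRORED flag and reports area_count as the mirror count."""
    if not (lv.status & MIRRORED):
        return 1
    return lv.segments[0].area_count

def mirror_count_fixed(lv):
    """Also checks that the first segment really is a mirror segment."""
    if not (lv.status & MIRRORED):
        return 1
    seg = lv.segments[0]
    return seg.area_count if seg.segtype == "mirror" else 1

def clear_flag_if_unmirrored(lv, count_fn):
    """Clear MIRRORED only when the count says a single image remains."""
    if count_fn(lv) == 1:
        lv.status &= ~MIRRORED

# State after the failed leg is removed: flag still set, LV now 2 stripes.
lv = LV("striped_primary_2legs_1", MIRRORED, [Segment("striped", 2)])
print(mirror_count_buggy(lv))   # 2
print(mirror_count_fixed(lv))   # 1
```

In this model the buggy count returns the stripe count (2), so the flag is never cleared and the LV is left in an inconsistent state for the later up-convert. The guarded count returns 1, so the flag is cleared as intended.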

Fix checked in upstream in Version 2.02.89.

Comment 4 Jonathan Earl Brassow 2011-09-14 04:58:53 UTC
git commit IDs:
2c9cf3b73bc0d655e8521d1e440592ece129aa8c -- original fix
0168b2fe118dfc5a973c7b3a4185945c60da755b -- follow-up correction

Comment 6 Corey Marthaler 2011-09-16 16:31:09 UTC
I didn't see any "Internal error: LV segments corrupted" issues with striped mirror device failure testing on the latest scratch built rpms. This does appear to be fixed.

2.6.32-195.el6.x86_64

lvm2-2.02.87-2.1.el6    BUILT: Wed Sep 14 09:44:16 CDT 2011
lvm2-libs-2.02.87-2.1.el6    BUILT: Wed Sep 14 09:44:16 CDT 2011
lvm2-cluster-2.02.87-2.1.el6    BUILT: Wed Sep 14 09:44:16 CDT 2011
udev-147-2.38.el6    BUILT: Fri Sep  9 16:25:50 CDT 2011
device-mapper-1.02.66-2.1.el6    BUILT: Wed Sep 14 09:44:16 CDT 2011
device-mapper-libs-1.02.66-2.1.el6    BUILT: Wed Sep 14 09:44:16 CDT 2011
device-mapper-event-1.02.66-2.1.el6    BUILT: Wed Sep 14 09:44:16 CDT 2011
device-mapper-event-libs-1.02.66-2.1.el6    BUILT: Wed Sep 14 09:44:16 CDT 2011
cmirror-2.02.87-2.1.el6    BUILT: Wed Sep 14 09:44:16 CDT 2011

Comment 8 Corey Marthaler 2011-09-28 18:21:58 UTC
Fix verified in the latest rpms.

2.6.32-198.el6.x86_64

lvm2-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
lvm2-libs-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
lvm2-cluster-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
udev-147-2.38.el6    BUILT: Fri Sep  9 16:25:50 CDT 2011
device-mapper-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
device-mapper-libs-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
device-mapper-event-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
device-mapper-event-libs-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
cmirror-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011

Comment 9 Peter Rajnoha 2011-10-27 09:02:11 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Do not document.

Comment 10 errata-xmlrpc 2011-12-06 17:02:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1522.html

