Bug 825026

Summary: RFE: get rid of the -missing devices during partial allocation failure scenarios
Product: Red Hat Enterprise Linux 6
Reporter: Corey Marthaler <cmarthal>
Component: lvm2
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED NOTABUG
QA Contact: Corey Marthaler <cmarthal>
Severity: unspecified
Priority: unspecified
Version: 6.3
CC: agk, dwysocha, heinzm, jbrassow, msnitzer, prajnoha, prockai, thornber, zkabelac
Target Milestone: rc
Keywords: FutureFeature
Hardware: Unspecified
OS: Unspecified
Doc Type: Enhancement
Last Closed: 2012-07-30 22:55:12 UTC
Type: Bug

Description Corey Marthaler 2012-05-24 20:47:24 UTC
Description of problem:

./black_bird -o taft-01 -l /home/msp/cmarthal/work/sts/sts-root -r /usr/tests/sts-rhel6.3 -f -i 2 -e kill_multiple_synced_raid6_4legs

Scenario kill_multiple_synced_raid6_4legs: Kill multiple legs of synced 4 leg raid6 volume(s)

********* RAID hash info for this scenario *********
* names:              synced_multiple_raid6_4legs_1
* sync:               1
* type:               raid6
* -m |-i value:       4
* leg devices:        /dev/sdf1 /dev/sdh1 /dev/sdb1 /dev/sdg1 /dev/sdd1 /dev/sde1
* failpv(s):          /dev/sdf1 /dev/sde1
* failnode(s):        taft-01
* additional snap:    /dev/sdh1
* raid fault policy:   warn
******************************************************

Creating raids(s) on taft-01...
taft-01: lvcreate --type raid6 -i 4 -n synced_multiple_raid6_4legs_1 -L 500M black_bird /dev/sdf1:0-1000 /dev/sdh1:0-1000 /dev/sdb1:0-1000 /dev/sdg1:0-1000 /dev/sdd1:0-1000 /dev/sde1:0-1000

RAID Structure(s):
  LV                                       Attr     LSize   Copy%  Devices
  synced_multiple_raid6_4legs_1            rwi-a-r- 512.00m        synced_multiple_raid6_4legs_1_rimage_0(0),synced_multiple_raid6_4legs_1_rimage_1(0),synced_multiple_raid6_4legs_1_rimage_2(0),synced_multiple_raid6_4legs_1_rimage_3(0),synced_multiple_raid6_4legs_1_rimage_4(0),synced_multiple_raid6_4legs_1_rimage_5(0)
  [synced_multiple_raid6_4legs_1_rimage_0] Iwi-aor- 128.00m        /dev/sdf1(1)
  [synced_multiple_raid6_4legs_1_rimage_1] Iwi-aor- 128.00m        /dev/sdh1(1)
  [synced_multiple_raid6_4legs_1_rimage_2] Iwi-aor- 128.00m        /dev/sdb1(1)
  [synced_multiple_raid6_4legs_1_rimage_3] Iwi-aor- 128.00m        /dev/sdg1(1)
  [synced_multiple_raid6_4legs_1_rimage_4] Iwi-aor- 128.00m        /dev/sdd1(1)
  [synced_multiple_raid6_4legs_1_rimage_5] Iwi-aor- 128.00m        /dev/sde1(1)
  [synced_multiple_raid6_4legs_1_rmeta_0]  ewi-aor-   4.00m        /dev/sdf1(0)
  [synced_multiple_raid6_4legs_1_rmeta_1]  ewi-aor-   4.00m        /dev/sdh1(0)
  [synced_multiple_raid6_4legs_1_rmeta_2]  ewi-aor-   4.00m        /dev/sdb1(0)
  [synced_multiple_raid6_4legs_1_rmeta_3]  ewi-aor-   4.00m        /dev/sdg1(0)
  [synced_multiple_raid6_4legs_1_rmeta_4]  ewi-aor-   4.00m        /dev/sdd1(0)
  [synced_multiple_raid6_4legs_1_rmeta_5]  ewi-aor-   4.00m        /dev/sde1(0)

* NOTE: not enough available devices for allocation fault polices to fully work *
(well technically, since we have 1, some allocation should work)

PV=/dev/sde1
        synced_multiple_raid6_4legs_1_rimage_5: 2
        synced_multiple_raid6_4legs_1_rmeta_5: 2
PV=/dev/sdf1
        synced_multiple_raid6_4legs_1_rimage_0: 2
        synced_multiple_raid6_4legs_1_rmeta_0: 2

Creating ext on top of mirror(s) on taft-01...
mke2fs 1.41.12 (17-May-2010)
Mounting mirrored ext filesystems on taft-01...

Creating a snapshot volume of each of the raids
Writing verification files (checkit) to mirror(s) on...
        ---- taft-01 ----

Sleeping 10 seconds to get some outsanding EXT I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- taft-01 ----

Disabling device sdf on taft-01
Disabling device sde on taft-01

Attempting I/O to cause mirror down conversion(s) on taft-01
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.244412 s, 172 MB/s

Verifying current sanity of lvm after the failure

RAID Structure(s):
  /dev/sde1: read failed after 0 of 512 at 145669554176: Input/output error
  /dev/sdf1: read failed after 0 of 512 at 145669554176: Input/output error
  LV                                       Attr     LSize   Copy%  Devices
  bb_snap1                                 swi-a-s- 252.00m        /dev/sdh1(33)
  synced_multiple_raid6_4legs_1            owi-aor- 512.00m        synced_multiple_raid6_4legs_1_rimage_0(0),synced_multiple_raid6_4legs_1_rimage_1(0),synced_multiple_raid6_4legs_1_rimage_2(0),synced_multiple_raid6_4legs_1_rimage_3(0),synced_multiple_raid6_4legs_1_rimage_4(0),synced_multiple_raid6_4legs_1_rimage_5(0)
  [synced_multiple_raid6_4legs_1_rimage_0] iwi-aor- 128.00m        unknown device(1)
  [synced_multiple_raid6_4legs_1_rimage_1] iwi-aor- 128.00m        /dev/sdh1(1)
  [synced_multiple_raid6_4legs_1_rimage_2] iwi-aor- 128.00m        /dev/sdb1(1)
  [synced_multiple_raid6_4legs_1_rimage_3] iwi-aor- 128.00m        /dev/sdg1(1)
  [synced_multiple_raid6_4legs_1_rimage_4] iwi-aor- 128.00m        /dev/sdd1(1)
  [synced_multiple_raid6_4legs_1_rimage_5] iwi-aor- 128.00m        unknown device(1)
  [synced_multiple_raid6_4legs_1_rmeta_0]  ewi-aor-   4.00m        unknown device(0)
  [synced_multiple_raid6_4legs_1_rmeta_1]  ewi-aor-   4.00m        /dev/sdh1(0)
  [synced_multiple_raid6_4legs_1_rmeta_2]  ewi-aor-   4.00m        /dev/sdb1(0)
  [synced_multiple_raid6_4legs_1_rmeta_3]  ewi-aor-   4.00m        /dev/sdg1(0)
  [synced_multiple_raid6_4legs_1_rmeta_4]  ewi-aor-   4.00m        /dev/sdd1(0)
  [synced_multiple_raid6_4legs_1_rmeta_5]  ewi-aor-   4.00m        unknown device(0)

Verifying FAILED device /dev/sdf1 is *NOT* in the volume(s)
Verifying FAILED device /dev/sde1 is *NOT* in the volume(s)
Verifying IMAGE device /dev/sdh1 *IS* in the volume(s)
Verifying IMAGE device /dev/sdb1 *IS* in the volume(s)
Verifying IMAGE device /dev/sdg1 *IS* in the volume(s)
Verifying IMAGE device /dev/sdd1 *IS* in the volume(s)
verify the rimage/rmeta dm devices remain after the failures
Checking EXISTENCE and STATE of synced_multiple_raid6_4legs_1_rimage_5 on:  taft-01
Checking EXISTENCE and STATE of synced_multiple_raid6_4legs_1_rmeta_5 on:  taft-01
Checking EXISTENCE and STATE of synced_multiple_raid6_4legs_1_rimage_0 on:  taft-01
Checking EXISTENCE and STATE of synced_multiple_raid6_4legs_1_rmeta_0 on:  taft-01

Verify the raid image order is what's expected based on raid fault policy
EXPECTED LEG ORDER: unknown /dev/sdh1 /dev/sdb1 /dev/sdg1 /dev/sdd1 unknown
ACTUAL LEG ORDER: unknown /dev/sdh1 /dev/sdb1 /dev/sdg1 /dev/sdd1 unknown
Fault policy is warn, manually repairing failed raid volumes
taft-01: 'lvconvert --yes --repair black_bird/synced_multiple_raid6_4legs_1'
  /dev/sde1: read failed after 0 of 512 at 145669554176: Input/output error
  /dev/sdf1: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid eaOgwu-bNGK-lHqQ-vzWz-LWyB-dDDA-sBNwHA.
  Couldn't find device with uuid OFfTdo-63aD-tgBe-kLRi-0Pa9-lAzd-i3RXEW.
  Insufficient suitable allocatable extents for logical volume : 66 more required
  Failed to allocate replacement images for black_bird/synced_multiple_raid6_4legs_1
  Attempting replacement of 1 devices instead of 2
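
Note on the "66 more required" figure: it is consistent with the space needed for two full replacement legs, assuming a 4 MiB extent size. The report does not state the extent size; 4 MiB is the LVM default and fits the 4.00m rmeta sub-LVs each starting at extent 0 with the rimage at extent 1. A minimal sketch of that arithmetic, using hypothetical helper names:

```python
import math

# Assumption: 4 MiB physical extents (LVM default; not stated in the report).
EXTENT_MIB = 4


def extents_for_leg(rimage_mib, rmeta_mib, extent_mib=EXTENT_MIB):
    """Extents needed to hold one RAID leg: its rimage plus its rmeta sub-LV."""
    return math.ceil(rimage_mib / extent_mib) + math.ceil(rmeta_mib / extent_mib)


# Sizes taken from the lvs output above: 128.00m rimage, 4.00m rmeta.
per_leg = extents_for_leg(128, 4)
failed_legs = 2  # /dev/sdf1 and /dev/sde1
print(per_leg, failed_legs * per_leg)  # prints "33 66"
```

Under that assumption each leg needs 32 extents for its rimage plus 1 for its rmeta, so replacing both failed legs needs 66 extents, which matches the number in the error message.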

Waiting until all mirror|raid volumes become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )

Verifying files (checkit) on mirror(s) on...
        ---- taft-01 ----
checkit starting with:
VERIFY
Verify XIOR Stream: /tmp/checkit_synced_multiple_raid6_4legs_1
Working dir:        /mnt/synced_multiple_raid6_4legs_1/checkit

Enabling device sdf on taft-01
Enabling device sde on taft-01

Checking for leftover '-missing_0_0' or 'unknown devices'
we dont know yet if this '-missing' device should still exist, maybe it'll go away on it's own
[FAIL]


[root@taft-01 ~]# dmsetup ls | grep missing
black_bird-synced_multiple_raid6_4legs_1_rmeta_0-missing_0_0    (253:20)
black_bird-synced_multiple_raid6_4legs_1_rimage_0-missing_0_0   (253:19)
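
The leftover check above amounts to scanning the `dmsetup ls` listing for names ending in '-missing_N_M'. A rough sketch of such a check follows; it is not the test harness's actual code. The function name is hypothetical, and the pattern only assumes the name and (major:minor) layout shown in this report, also tolerating a "(major, minor)" form.

```python
import re

# Assumed line layout, taken from the output in this report:
#   <dm-name>   (<major>:<minor>)
MISSING_RE = re.compile(
    r"^(?P<name>\S+-missing_\d+_\d+)\s+\((?P<major>\d+)[:,]\s*(?P<minor>\d+)\)$"
)


def find_missing_devices(dmsetup_ls_output):
    """Return (name, major, minor) for every '-missing_N_M' dm device listed."""
    found = []
    for line in dmsetup_ls_output.splitlines():
        match = MISSING_RE.match(line.strip())
        if match:
            found.append(
                (match.group("name"), int(match.group("major")), int(match.group("minor")))
            )
    return found


sample = """\
black_bird-synced_multiple_raid6_4legs_1_rmeta_0-missing_0_0    (253:20)
black_bird-synced_multiple_raid6_4legs_1_rimage_0-missing_0_0   (253:19)
black_bird-bb_snap1     (253:18)
"""
# The bb_snap1 line and its minor number are hypothetical, added only to show
# that ordinary devices are ignored.
for name, major, minor in find_missing_devices(sample):
    print(name, major, minor)
```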


Version-Release number of selected component (if applicable):
2.6.32-274.el6.x86_64
lvm2-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
lvm2-libs-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
lvm2-cluster-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
udev-147-2.41.el6    BUILT: Thu Mar  1 13:01:08 CST 2012
device-mapper-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-libs-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-event-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-event-libs-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
cmirror-2.02.95-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012


How reproducible:
Every time

Comment 2 RHEL Program Management 2012-07-10 08:34:08 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 3 RHEL Program Management 2012-07-10 23:58:34 UTC
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 4 Jonathan Earl Brassow 2012-07-30 22:55:12 UTC
In this case, the '-missing' devices are a necessary part of the solution.  Note that the RAID images haven't been replaced with anything else - nor have they been removed.  Therefore, these specific images are, in fact, sitting on '-missing' devices.
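
To see which images are in that state, one option is to look for sub-LVs whose Devices column reads "unknown device" in the lvs output quoted in the description. A minimal sketch, assuming that column layout (bracketed sub-LV name, Attr, LSize, empty Copy%, then Devices); the function name is hypothetical:

```python
def sublvs_on_unknown_device(lvs_output):
    """Return names of hidden sub-LVs whose Devices column is 'unknown device'."""
    result = []
    for line in lvs_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        name = fields[0]
        # Hidden sub-LVs (rimage/rmeta) are shown in square brackets.
        if not (name.startswith("[") and name.endswith("]")):
            continue
        # Assumption: Copy% is empty for these rows, so Devices starts at field 3.
        devices = " ".join(fields[3:])
        if devices.startswith("unknown device"):
            result.append(name[1:-1])
    return result


# Rows copied from the post-failure lvs output in the description.
sample = """\
  [synced_multiple_raid6_4legs_1_rimage_0] iwi-aor- 128.00m        unknown device(1)
  [synced_multiple_raid6_4legs_1_rimage_1] iwi-aor- 128.00m        /dev/sdh1(1)
  [synced_multiple_raid6_4legs_1_rmeta_0]  ewi-aor-   4.00m        unknown device(0)
"""
print(sublvs_on_unknown_device(sample))
```

Any sub-LV reported this way has not been replaced or removed, so by the explanation above it still needs a '-missing' device underneath it.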