Bug 877221

Summary: lvconvert --repair won't reuse physical volumes
Product: Red Hat Enterprise Linux 6
Component: lvm2
Sub component: Mirroring and RAID (RHEL6)
Reporter: benscott
Assignee: Jonathan Earl Brassow <jbrassow>
QA Contact: Cluster QE <mspqa-list>
Status: CLOSED ERRATA
Severity: medium
Priority: high
CC: agk, cmarthal, dwysocha, heinzm, jbrassow, mkarg, msnitzer, nperic, prajnoha, prockai, thornber, zkabelac
Version: 6.5
Target Milestone: beta
Hardware: Unspecified
OS: Unspecified
Fixed In Version: lvm2-2.02.107-2.el6
Doc Type: Bug Fix
Doc Text: No documentation needed.
Type: Bug
Last Closed: 2014-10-14 08:23:39 UTC
Bug Blocks: 960054, 1056252, 1075263

Description benscott 2012-11-16 00:41:47 UTC
Description of problem:

I don't know if this is really a bug, but it seems odd. First I create a RAID1
mirror and then remove one of the underlying physical volume devices. When I
run "lvconvert --repair", it will not rebuild the failed mirror leg on a
physical volume that another segment of the same damaged leg is already using,
even though that PV has plenty of free space left.

Version-Release number of selected component (if applicable):

#lvs --version
  LVM version:     2.02.98(2) (2012-10-15)
  Library version: 1.02.77 (2012-10-15)
  Driver version:  4.23.0

Steps to Reproduce:

#pvs
  PV         VG   Fmt  Attr PSize PFree
  /dev/sdc   vg1  lvm2 a--  5.13g 4.64g
  /dev/sdd   vg1  lvm2 a--  5.13g 4.64g
  /dev/sde   vg1  lvm2 a--  5.13g 4.64g
  /dev/sdg   vg1  lvm2 a--  5.13g 4.64g
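
This layout can be created with a command along the following lines (the
extent ranges and PV ordering here are an illustrative sketch, not captured
from the original session; compare the equivalent command in comment 14):

#lvcreate -m 1 --type raid1 -n lvol0 vg1 -L 1000m /dev/sdc:0-250 /dev/sdd:0-250 /dev/sde:0-250 /dev/sdg:0-250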


A working RAID mirror with two legs and two segments each:

#lvs --all --segments -o +devices
  LV               VG   Attr      #Str Type   SSize    Devices
  lvol0            vg1  rwi-a-r--    2 raid1  1000.00m lvol0_rimage_0(0),lvol0_rimage_1(0)
  [lvol0_rimage_0] vg1  iwi-aor--    1 linear  500.00m /dev/sdc(1)
  [lvol0_rimage_0] vg1  iwi-aor--    1 linear  500.00m /dev/sde(0)
  [lvol0_rimage_1] vg1  iwi-aor--    1 linear  500.00m /dev/sdd(1)
  [lvol0_rimage_1] vg1  iwi-aor--    1 linear  500.00m /dev/sdg(0)
  [lvol0_rmeta_0]  vg1  ewi-aor--    1 linear    2.00m /dev/sdc(0)
  [lvol0_rmeta_1]  vg1  ewi-aor--    1 linear    2.00m /dev/sdd(0)



With a device removed:  

#lvs --all --segments -o +devices
  Couldn't find device with uuid Mzkggh-TSe6-G8pB-4LfY-Atfb-0a4R-p8P5D8.
  LV               VG   Attr      #Str Type   SSize    Devices
  lvol0            vg1  rwi-a-r-p    2 raid1  1000.00m lvol0_rimage_0(0),lvol0_rimage_1(0)
  [lvol0_rimage_0] vg1  iwi-aor-p    1 linear  500.00m /dev/sdc(1)
  [lvol0_rimage_0] vg1  iwi-aor-p    1 linear  500.00m unknown device(0)
  [lvol0_rimage_1] vg1  iwi-aor--    1 linear  500.00m /dev/sdd(1)
  [lvol0_rimage_1] vg1  iwi-aor--    1 linear  500.00m /dev/sdg(0)
  [lvol0_rmeta_0]  vg1  ewi-aor--    1 linear    2.00m /dev/sdc(0)
  [lvol0_rmeta_1]  vg1  ewi-aor--    1 linear    2.00m /dev/sdd(0)


Results:

lvconvert --alloc anywhere  --repair -y vg1/lvol0 /dev/sdc
  Couldn't find device with uuid Mzkggh-TSe6-G8pB-4LfY-Atfb-0a4R-p8P5D8.
  Insufficient suitable allocatable extents for logical volume : 501 more required
  Failed to allocate replacement images for vg1/lvol0
  Failed to replace faulty devices in vg1/lvol0.


Expected results:

The repair asks for 501 more extents, i.e. the full 1000m replacement image
(500 extents at the 2 MiB extent size) plus one metadata extent, roughly
1002 MiB in total. Since /dev/sdc still has 4.64g free, it seems it could be
reused for both segments of the damaged mirror leg.
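
As a sanity check (illustrative commands, not output captured from this
system), the extent size and the free extents on /dev/sdc can be confirmed
with the standard lvm report fields:

#vgs -o vg_name,vg_extent_size,vg_free_count vg1
#pvs -o pv_name,pv_size,pv_free,pv_pe_count,pv_pe_alloc_count /dev/sdc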

Comment 10 Jonathan Earl Brassow 2014-06-20 15:04:54 UTC
While waiting for the code to be able to do this automatically, here are the steps to do what you want manually:

1) Remove failed device (removes the failed device from the VG)
# vgreduce --removemissing --force <vg>

2) Down-convert the LV (removes the empty slot from the RAID LV)
# lvconvert -m -1 <vg>/<lv>

3) Up-convert the LV (use available VG space to allocate another RAID image)
# lvconvert --type raid1 -m 1 <vg>/<lv>


Example:
  Couldn't find device with uuid ObYo59-GiW9-QS50-VIGT-dYu8-E3Gt-8Hhhou.
  LV            Attr       Cpy%Sync Devices                      
  lv            rwi-a-r-p- 100.00   lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] iwi-aor-p-          unknown device(1)            
  [lv_rimage_0] iwi-aor-p-          /dev/sdd1(0)                 
  [lv_rimage_1] iwi-aor---          /dev/sdc1(1)                 
  [lv_rimage_1] iwi-aor---          /dev/sde1(0)                 
  [lv_rmeta_0]  ewi-aor-p-          unknown device(0)            
  [lv_rmeta_1]  ewi-aor---          /dev/sdc1(0)                 
[root@bp-01 lvm2]# vgreduce --removemissing --force vg
  /dev/sdb1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdb1: read failed after 0 of 512 at 898381381632: Input/output error
  /dev/sdb1: read failed after 0 of 512 at 898381488128: Input/output error
  /dev/sdb1: read failed after 0 of 512 at 0: Input/output error
  /dev/sdb1: read failed after 0 of 512 at 4096: Input/output error
  Couldn't find device with uuid ObYo59-GiW9-QS50-VIGT-dYu8-E3Gt-8Hhhou.
  Wrote out consistent volume group vg
[root@bp-01 lvm2]# devices vg
  /dev/sdb1: open failed: No such device or address
  LV            Attr       Cpy%Sync Devices                      
  lv            rwi-a-r-r- 100.00   lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vwi-aor-r-                                       
  [lv_rimage_1] iwi-aor---          /dev/sdc1(1)                 
  [lv_rimage_1] iwi-aor---          /dev/sde1(0)                 
  [lv_rmeta_0]  ewi-aor-r-                                       
  [lv_rmeta_1]  ewi-aor---          /dev/sdc1(0)                 
[root@bp-01 lvm2]# lvconvert -m 0 vg/lv
  /dev/sdb1: open failed: No such device or address
[root@bp-01 lvm2]# devices vg
  /dev/sdb1: open failed: No such device or address
  LV   Attr       Cpy%Sync Devices     
  lv   -wi-a-----          /dev/sdc1(1)
  lv   -wi-a-----          /dev/sde1(0)
[root@bp-01 lvm2]# lvconvert --type raid1 -m 1 vg/lv
  /dev/sdb1: open failed: No such device or address
[root@bp-01 lvm2]# devices vg
  /dev/sdb1: open failed: No such device or address
  LV            Attr       Cpy%Sync Devices                      
  lv            rwi-a-r--- 100.00   lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] iwi-aor---          /dev/sdc1(1)                 
  [lv_rimage_0] iwi-aor---          /dev/sde1(0)                 
  [lv_rimage_1] iwi-aor---          /dev/sdd1(1)                 
  [lv_rmeta_0]  ewi-aor---          /dev/sdc1(0)                 
  [lv_rmeta_1]  ewi-aor---          /dev/sdd1(0)

Comment 11 Jonathan Earl Brassow 2014-06-26 03:33:53 UTC
3 upstream check-ins are needed to fix this bug (plus an extra one, a test, that would otherwise cause merge conflicts):

commit ed3c2537b82be4e326a53c7e3e6d5eccdd833800
Author: Jonathan Brassow <jbrassow>
Date:   Wed Jun 25 22:26:06 2014 -0500

    raid: Allow repair to reuse PVs from same image that suffered a PV failure
    
    When repairing RAID LVs that have multiple PVs per image, allow
    replacement images to be reallocated from the PVs that have not
    failed in the image if there is sufficient space.
    
    This allows for scenarios where a 2-way RAID1 is spread across 4 PVs,
    where each image lives on two PVs but doesn't use the entire space
    on any of them.  If one PV fails and there is sufficient space on the
    remaining PV in the image, the image can be reallocated on just the
    remaining PV.

commit 7028fd31a0f2d2234ffdd1b94ea6ae6128ca9362
Author: Jonathan Brassow <jbrassow>
Date:   Wed Jun 25 22:04:58 2014 -0500

    misc: after releasing a PV segment, merge it with any adjacent free space
    
    Previously, the seg_pvs used to track free and allocated space were left
    in place after 'release_pv_segment' was called to free space from an LV.
    Now, an attempt is made to combine any adjacent seg_pvs that also track
    free space.  Usually, this doesn't provide much benefit, but in a case
    where one command might free some space and then do an allocation, it
    can make a difference.  One such case is during a repair of a RAID LV,
    where one PV of a multi-PV image fails.  This new behavior is used when
    the replacement image can be allocated from the remaining space of the
    PV that did not fail.  (First the entire image with the failed PV is
    removed.  Then the image is reallocated from the remaining PVs.)

commit b35fb0b15af1d87693be286f0630e95622056a77
Author: Jonathan Brassow <jbrassow>
Date:   Wed Jun 25 21:20:41 2014 -0500

    raid/misc: Allow creation of parallel areas by LV vs segment
    
    I've changed build_parallel_areas_from_lv to take a new parameter
    that allows the caller to build parallel areas by LV vs by segment.
    Previously, the function created a list of parallel areas for each
    segment in the given LV.  When it came time for allocation, the
    parallel areas were honored on a segment basis.  This was problematic
    for RAID because any new RAID image must avoid being placed on any
    PVs used by other images in the RAID.  For example, if we have a
    linear LV that has half its space on one PV and half on another, we
    do not want an up-convert to use either of those PVs.  It should
    especially not wind up with the following, where the first portion
    of one LV is paired up with the second portion of the other:
    ------PV1-------  ------PV2-------
    [ 2of2 image_1 ]  [ 1of2 image_1 ]
    [ 1of2 image_0 ]  [ 2of2 image_0 ]
    ----------------  ----------------
    Previously, it was possible for this to happen.  The change makes
    it so that the returned parallel areas list contains one "super"
    segment (seg_pvs) with a list of all the PVs from every actual
    segment in the given LV and covering the entire logical extent range.
    
    This change allows RAID conversions to function properly when there
    are existing images that contain multiple segments that span more
    than one PV.

commit 1f1675b059d65768524398791b2e505b7dfe2497
Author: Jonathan Brassow <jbrassow>
Date:   Sat Jun 21 15:33:52 2014 -0500

    test:  Test addition to show incorrect allocator behavior
    
    If a RAID LV has images that are spread across more than one PV
    and you allocate a new image that requires more than one PV,
    parallel_areas is only honored for one segment.  This commit
    adds a test for this condition.

Comment 12 Jonathan Earl Brassow 2014-06-26 03:36:10 UTC
*** Bug 1113180 has been marked as a duplicate of this bug. ***

Comment 14 Corey Marthaler 2014-07-16 22:42:45 UTC
This appears to work with the latest rpms. Marking verified.

2.6.32-485.el6.x86_64
lvm2-2.02.107-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
lvm2-libs-2.02.107-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
lvm2-cluster-2.02.107-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
udev-147-2.55.el6    BUILT: Wed Jun 18 06:30:21 CDT 2014
device-mapper-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-libs-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-event-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-event-libs-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-persistent-data-0.3.2-1.el6    BUILT: Fri Apr  4 08:43:06 CDT 2014



[root@host-002 ~]# lvcreate -m 1 --type raid1 -n lvol0 vg1 -L 1000m /dev/sda1:0-200 /dev/sdb1:0-200 /dev/sdc1:0-200 /dev/sdd1:0-200
  Logical volume "lvol0" created

# A working RAID mirror with two legs and two segments each:

[root@host-002 ~]# lvs --all --segments -o +devices
  LV               VG   Attr       #Str Type   SSize    Devices
  lvol0            vg1  rwi-a-r---    2 raid1  1000.00m lvol0_rimage_0(0),lvol0_rimage_1(0)
  [lvol0_rimage_0] vg1  Iwi-aor---    1 linear  800.00m /dev/sda1(1)
  [lvol0_rimage_0] vg1  Iwi-aor---    1 linear  200.00m /dev/sdc1(0)
  [lvol0_rimage_1] vg1  Iwi-aor---    1 linear  800.00m /dev/sdb1(1)
  [lvol0_rimage_1] vg1  Iwi-aor---    1 linear  200.00m /dev/sdd1(0)
  [lvol0_rmeta_0]  vg1  ewi-aor---    1 linear    4.00m /dev/sda1(0)
  [lvol0_rmeta_1]  vg1  ewi-aor---    1 linear    4.00m /dev/sdb1(0)

# With a device removed:

[root@host-002 ~]# lvs -a -o +devices
  /dev/sdc1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 8052408320: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 8052506624: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 4096: Input/output error
  Couldn't find device with uuid 9tpQJq-xTsJ-xSHU-VDfs-36qn-JzIl-O3cDaO.
  LV               VG   Attr       LSize    Cpy%Sync Devices
  lvol0            vg1  rwi-a-r-p- 1000.00m 100.00   lvol0_rimage_0(0),lvol0_rimage_1(0)
  [lvol0_rimage_0] vg1  iwi-aor-p- 1000.00m          /dev/sda1(1)
  [lvol0_rimage_0] vg1  iwi-aor-p- 1000.00m          unknown device(0)
  [lvol0_rimage_1] vg1  iwi-aor--- 1000.00m          /dev/sdb1(1)
  [lvol0_rimage_1] vg1  iwi-aor--- 1000.00m          /dev/sdd1(0)
  [lvol0_rmeta_0]  vg1  ewi-aor-r-    4.00m          /dev/sda1(0)
  [lvol0_rmeta_1]  vg1  ewi-aor---    4.00m          /dev/sdb1(0)

[root@host-002 ~]# pvscan --cache /dev/sdc1

[root@host-002 ~]# lvconvert --alloc anywhere  --repair -y vg1/lvol0 /dev/sda1
  Option --alloc cannot be used with --repair.
  Run `lvconvert --help' for more information.

[root@host-002 ~]# lvconvert --repair -y vg1/lvol0 /dev/sda1
  /dev/sdc1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 8052408320: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 8052506624: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 4096: Input/output error
  Couldn't find device with uuid 9tpQJq-xTsJ-xSHU-VDfs-36qn-JzIl-O3cDaO.
  Insufficient suitable allocatable extents for logical volume : 251 more required
  Faulty devices in vg1/lvol0 successfully replaced.

[root@host-002 ~]# lvs --all --segments -o +devices
  /dev/sdc1: open failed: No such device or address
  Couldn't find device with uuid 9tpQJq-xTsJ-xSHU-VDfs-36qn-JzIl-O3cDaO.
  LV               VG   Attr       #Str Type   SSize    Devices
  lvol0            vg1  rwi-a-r---    2 raid1  1000.00m lvol0_rimage_0(0),lvol0_rimage_1(0)
  [lvol0_rimage_0] vg1  iwi-aor---    1 linear 1000.00m /dev/sda1(2)
  [lvol0_rimage_1] vg1  iwi-aor---    1 linear  800.00m /dev/sdb1(1)
  [lvol0_rimage_1] vg1  iwi-aor---    1 linear  200.00m /dev/sdd1(0)
  [lvol0_rmeta_0]  vg1  ewi-aor---    1 linear    4.00m /dev/sda1(1)
  [lvol0_rmeta_1]  vg1  ewi-aor---    1 linear    4.00m /dev/sdb1(0)

Comment 15 Zdenek Kabelac 2014-07-17 08:54:14 UTC
(In reply to Corey Marthaler from comment #14)
> [root@host-002 ~]# lvconvert --alloc anywhere  --repair -y vg1/lvol0
> /dev/sda1
>   Option --alloc cannot be used with --repair.
>   Run `lvconvert --help' for more information.

This is unfortunately a bug that has slipped in: the --alloc option needs to be allowed with --repair.

Comment 17 errata-xmlrpc 2014-10-14 08:23:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1387.html