Bug 1113180

Summary:          LVM RAID: Allocator causing bad placement of new RAID images
Product:          Red Hat Enterprise Linux 6
Component:        lvm2
Sub component:    Mirroring and RAID (RHEL6)
Version:          6.6
Hardware:         Unspecified
OS:               Unspecified
Status:           CLOSED DUPLICATE
Severity:         unspecified
Priority:         unspecified
Reporter:         Jonathan Earl Brassow <jbrassow>
Assignee:         Alasdair Kergon <agk>
QA Contact:       Cluster QE <mspqa-list>
CC:               agk, heinzm, jbrassow, msnitzer, prajnoha, prockai, zkabelac
Target Milestone: rc
Type:             Bug
Doc Type:         Bug Fix
Last Closed:      2014-06-26 03:36:10 UTC

Attachments:      Results from test suite showing behavior

Description Jonathan Earl Brassow 2014-06-25 14:54:32 UTC
Description of problem:
If a RAID LV has images spread across more than one PV and you allocate a new image that also requires more than one PV, parallel_areas is honored for only one segment; later segments of the new image can be placed on PVs that already hold parts of the existing images.

Steps to Reproduce:
(I will check a test for this into the test suite shortly.)

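# Notes on the script below: 'aux wait_for_sync' and 'not' are lvm2
# test-suite helpers ('not' asserts that the command fails), and
# "$devN:0-1" restricts allocation to physical extents 0-1 of that PV.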
lvcreate --type raid1 -m 1 -l 3 -n $lv1 $vg \
    "$dev1:0-1" "$dev2:0-1" "$dev3:0-1" "$dev4:0-1"
aux wait_for_sync $vg $lv1

# Should not be enough non-overlapping space.
not lvconvert -m +1 $vg/$lv1 \
    "$dev5:0-1" "$dev1" "$dev2" "$dev3" "$dev4"

# Should work due to '--alloc anywhere'
lvconvert -m +1 --alloc anywhere $vg/$lv1 \
    "$dev5:0-1" "$dev1" "$dev2" "$dev3" "$dev4"

lvremove -ff $vg


Actual results:
The 'lvconvert' without '--alloc anywhere' succeeds when it should fail: there is not enough space on $dev5 (the only device without an existing image), so part of the new image is placed on a PV that already holds one.
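For reference, a quick way to see where the new images actually landed (a sketch using standard 'lvs' reporting fields):

 # Show every sub-LV and the devices backing each of its segments.
 lvs -a -o name,devices $vg

With the bug present, this shows the new rimage split across $dev5 and $dev1 rather than the allocation failing.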

Comment 3 Alasdair Kergon 2014-06-25 15:15:43 UTC
Please attach the -vvvv output from the lvconvert that is not doing what you expect.
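For example (a sketch using the names from the reproducer; the debug trace is written to stderr):

 lvconvert -m +1 -vvvv $vg/$lv1 \
     "$dev5:0-1" "$dev1" "$dev2" "$dev3" "$dev4" 2>lvconvert.log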

Comment 4 Jonathan Earl Brassow 2014-06-25 16:21:12 UTC
Created attachment 912147 [details]
Results from test suite showing behavior

Commit 1f1675b059d65768524398791b2e505b7dfe2497 added a test for this bug.

Comment 5 Alasdair Kergon 2014-06-25 16:45:39 UTC
The allocator in that trace is performing as designed: it did not allocate space parallel to the supplied extents.

Before:

 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv1 0:      0      1: LV1_rmeta_0(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv1 1:      1      1: LV1_rimage_0(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv1 2:      2    129: NULL(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv2 0:      0      1: LV1_rmeta_1(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv2 1:      1      1: LV1_rimage_1(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv2 2:      2    129: NULL(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv3 0:      0      2: LV1_rimage_0(1:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv3 1:      2    129: NULL(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv4 0:      0      2: LV1_rimage_1(1:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv4 1:      2    129: NULL(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv5 0:      0    131: NULL(0:0)


Request:
DEBUG: libdm-config.c:954   allocation/mirror_logs_require_separate_pvs not found in config: defaulting to 0
 DEBUG: libdm-config.c:954   allocation/maximise_cling not found in config: defaulting to 1
 DEBUG: metadata/pv_map.c:54   Allowing allocation on @TESTDIR@/dev/mapper/@PREFIX@pv5 start PE 0 length 2
 DEBUG: metadata/pv_map.c:54   Allowing allocation on @TESTDIR@/dev/mapper/@PREFIX@pv1 start PE 2 length 129
 DEBUG: metadata/pv_map.c:54   Allowing allocation on @TESTDIR@/dev/mapper/@PREFIX@pv2 start PE 2 length 129
 DEBUG: metadata/pv_map.c:54   Allowing allocation on @TESTDIR@/dev/mapper/@PREFIX@pv3 start PE 2 length 129
 DEBUG: metadata/pv_map.c:54   Allowing allocation on @TESTDIR@/dev/mapper/@PREFIX@pv4 start PE 2 length 129
 DEBUG: metadata/lv_manip.c:1356   Parallel PVs at LE 0 length 1: @TESTDIR@/dev/mapper/@PREFIX@pv1 @TESTDIR@/dev/mapper/@PREFIX@pv2 
 DEBUG: metadata/lv_manip.c:1356   Parallel PVs at LE 1 length 2: @TESTDIR@/dev/mapper/@PREFIX@pv3 @TESTDIR@/dev/mapper/@PREFIX@pv4 
 DEBUG: metadata/lv_manip.c:2459   Trying allocation using contiguous policy.
 DEBUG: metadata/lv_manip.c:2095   Areas to be sorted and filled sequentially.
 DEBUG: metadata/lv_manip.c:2046   Still need 4 total extents from 518 remaining:
 DEBUG: metadata/lv_manip.c:2049     1 (1 data/0 parity) parallel areas of 3 extents each
 DEBUG: metadata/lv_manip.c:2053     1 metadata area of 1 extents each


After:

 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv1 0:      0      1: LV1_rmeta_0(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv1 1:      1      1: LV1_rimage_0(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv1 2:      2      2: LV1_rimage_2(1:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv1 3:      4    127: NULL(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv2 0:      0      1: LV1_rmeta_1(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv2 1:      1      1: LV1_rimage_1(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv2 2:      2    129: NULL(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv3 0:      0      2: LV1_rimage_0(1:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv3 1:      2    129: NULL(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv4 0:      0      2: LV1_rimage_1(1:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv4 1:      2    129: NULL(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv5 0:      0      1: LV1_rmeta_2(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv5 1:      1      1: LV1_rimage_2(0:0)
 DEBUG: metadata/pv_manip.c:354   @TESTDIR@/dev/mapper/@PREFIX@pv5 2:      2    129: NULL(0:0)

Comment 7 Alasdair Kergon 2014-06-25 17:25:29 UTC
The allocator was told:
- Do not place LE 0 on pv1 or pv2.    It chose pv5.
- Do not place LE 1-2 on pv3 or pv4.  It chose pv1.

Taken segment by segment, both placements are valid; taken together, they leave LE 1-2 of the new image on pv1, which already holds part of rimage_0: exactly the overlap a RAID image must avoid.

Comment 8 Alasdair Kergon 2014-06-25 21:55:37 UTC
Jon is developing a patch that will tell the allocator to avoid the whole of any PV used by any part of the rimage/rmeta devices, to see if that resolves this satisfactorily.

Comment 9 Jonathan Earl Brassow 2014-06-26 03:35:35 UTC
Fixed by upstream commit:
commit b35fb0b15af1d87693be286f0630e95622056a77
Author: Jonathan Brassow <jbrassow>
Date:   Wed Jun 25 21:20:41 2014 -0500

    raid/misc: Allow creation of parallel areas by LV vs segment
    
    I've changed build_parallel_areas_from_lv to take a new parameter
    that allows the caller to build parallel areas by LV vs by segment.
    Previously, the function created a list of parallel areas for each
    segment in the given LV.  When it came time for allocation, the
    parallel areas were honored on a segment basis.  This was problematic
    for RAID because any new RAID image must avoid being placed on any
    PVs used by other images in the RAID.  For example, if we have a
    linear LV that has half its space on one PV and half on another, we
    do not want an up-convert to use either of those PVs.  It should
    especially not wind up with the following, where the first portion
    of one LV is paired up with the second portion of the other:
    ------PV1-------  ------PV2-------
    [ 2of2 image_1 ]  [ 1of2 image_1 ]
    [ 1of2 image_0 ]  [ 2of2 image_0 ]
    ----------------  ----------------
    Previously, it was possible for this to happen.  The change makes
    it so that the returned parallel areas list contains one "super"
    segment (seg_pvs) with a list of all the PVs from every actual
    segment in the given LV and covering the entire logical extent range.
    
    This change allows RAID conversions to function properly when there
    are existing images that contain multiple segments that span more
    than one PV.
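
With this change in place, the reproducer from the description behaves as intended (a sketch; $vg, $lv1 and $devN as defined above):

 # Now rejected: every PV except $dev5 already holds part of an existing
 # image, and the constrained extents on $dev5 cannot hold the whole new one.
 not lvconvert -m +1 $vg/$lv1 \
     "$dev5:0-1" "$dev1" "$dev2" "$dev3" "$dev4"

 # Still succeeds once the allocation policy is relaxed.
 lvconvert -m +1 --alloc anywhere $vg/$lv1 \
     "$dev5:0-1" "$dev1" "$dev2" "$dev3" "$dev4"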


This commit is required for the solution to bug 877221, so this bug will be closed as a duplicate of that bug.

Comment 10 Jonathan Earl Brassow 2014-06-26 03:36:10 UTC

*** This bug has been marked as a duplicate of bug 877221 ***