Bug 989347

Summary: lvextend segfaults in _alloc_parallel_area when trying to extend 3-way striped logical volume
Product: Red Hat Enterprise Linux 6
Reporter: Brad Hubbard <bhubbard>
Component: lvm2
Assignee: Alasdair Kergon <agk>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: urgent
Docs Contact:
Priority: high
Version: 6.4
CC: agk, bhubbard, cmarthal, dwysocha, heinzm, jbrassow, jkurik, jmagrini, jruemker, msnitzer, nperic, prajnoha, prockai, thornber, zkabelac
Target Milestone: rc
Keywords: ZStream
Target Release: 6.5
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: lvm2-2.02.100-1.el6
Doc Type: Bug Fix
Doc Text:
Due to an error in the LVM allocation code, lvm2 attempted to allocate free space contiguous to the existing striped space even when the extension used a different number of stripes. As a result, using the lvextend command to extend a 3-way striped logical volume with fewer stripes (for example, with -i1) caused the lvm2 utility to terminate unexpectedly with a segmentation fault. With this update, LVM no longer attempts contiguous allocation when the number of stripes differs, and lvextend completes the extension without a segmentation fault.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-11-21 23:26:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1016082, 1016083    

Description Brad Hubbard 2013-07-29 06:31:11 UTC
Description of problem:
lvextend segfaults when trying to extend 3-way striped logical volume


Version-Release number of selected component (if applicable):
lvm2-2.02.98-9.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. # pvcreate /dev/sda /dev/sdb /dev/sdc
  Physical volume "/dev/sda" successfully created
  Physical volume "/dev/sdb" successfully created
  Physical volume "/dev/sdc" successfully created
2. # vgcreate vg1 /dev/sda /dev/sdb /dev/sdc
  Volume group "vg1" successfully created
3. # lvcreate -L 10G -n lvol0 -i 3 -I 4096k vg1
  Rounding size (2560 extents) up to stripe boundary size (2562 extents)
  Logical volume "lvol0" created
4. # lvs -o +devices vg1
  LV      VG              Attr      LSize  Pool Origin Data%  Move Log Cpy%Sync Convert Devices                            
  lvol0   vg1             -wi-a---- 10.01g  
5. # lvs
  LV      VG              Attr      LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  lv_root VolGroup        -wi-ao---  13.54g                                             
  lv_swap VolGroup        -wi-ao--- 992.00m                                             
  lvol0   vg1             -wi-a---- 10.01g    
6. # lvextend -i1 -l+100%FREE vg1/lvol0
  Extending logical volume lvol0 to 24.99 GiB
Segmentation fault
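
The rounding reported in step 3 (2560 extents up to 2562) can be checked with a small sketch. It assumes the default 4 MiB extent size, since vgcreate was run without -s; the function names are illustrative and are not taken from lvm2.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a size in MiB to a whole number of extents (rounding up).
 * extent_mib is an assumption: 4 MiB is the default extent size. */
uint32_t mib_to_extents(uint32_t size_mib, uint32_t extent_mib)
{
    assert(extent_mib > 0);
    return (size_mib + extent_mib - 1) / extent_mib;
}

/* Round an extent count up to the next multiple of the stripe count,
 * so that every stripe receives the same number of extents. */
uint32_t round_up_to_stripes(uint32_t extents, uint32_t stripes)
{
    assert(stripes > 0);
    uint32_t rem = extents % stripes;
    return rem ? extents + (stripes - rem) : extents;
}
```

With these assumptions, 10 GiB is 2560 extents, and rounding up to a multiple of 3 gives 2562 extents (854 per stripe), which lvs displays as 10.01g.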

Actual results:
Segmentation fault

Expected results:
No Segmentation fault

Additional info:

This may be related to rhel5 bz 715364.

(gdb) bt
#0  _alloc_parallel_area (vg=<value optimized out>, lv=0xc67ec8, segtype=<value optimized out>, stripes=<value optimized out>, mirrors=<value optimized out>, log_count=4, 
    region_size=0, extents=2673052, allocatable_pvs=0xc67970, alloc=ALLOC_NORMAL, parallel_areas=0x0) at metadata/lv_manip.c:1187
#1  _find_some_parallel_space (vg=<value optimized out>, lv=0xc67ec8, segtype=<value optimized out>, stripes=<value optimized out>, mirrors=<value optimized out>, log_count=4, 
    region_size=0, extents=2673052, allocatable_pvs=0xc67970, alloc=ALLOC_NORMAL, parallel_areas=0x0) at metadata/lv_manip.c:1937
#2  _find_max_parallel_space_for_one_policy (vg=<value optimized out>, lv=0xc67ec8, segtype=<value optimized out>, stripes=<value optimized out>, mirrors=<value optimized out>, 
    log_count=4, region_size=0, extents=2673052, allocatable_pvs=0xc67970, alloc=ALLOC_NORMAL, parallel_areas=0x0) at metadata/lv_manip.c:1999
#3  _allocate (vg=<value optimized out>, lv=0xc67ec8, segtype=<value optimized out>, stripes=<value optimized out>, mirrors=<value optimized out>, log_count=4, region_size=0, 
    extents=2673052, allocatable_pvs=0xc67970, alloc=ALLOC_NORMAL, parallel_areas=0x0) at metadata/lv_manip.c:2119
#4  allocate_extents (vg=<value optimized out>, lv=0xc67ec8, segtype=<value optimized out>, stripes=<value optimized out>, mirrors=<value optimized out>, log_count=4, 
    region_size=0, extents=2673052, allocatable_pvs=0xc67970, alloc=ALLOC_NORMAL, parallel_areas=0x0) at metadata/lv_manip.c:2247
#5  0x000000000046c52b in lv_extend (lv=0xc67ec8, segtype=0xc24790, stripes=1, stripe_size=0, mirrors=0, region_size=0, extents=2673052, thin_pool_name=0x0, 
    allocatable_pvs=0xc67970, alloc=ALLOC_INHERIT) at metadata/lv_manip.c:2725
#6  0x000000000042837d in _lvresize (cmd=<value optimized out>, argc=<value optimized out>, argv=<value optimized out>) at lvresize.c:818
#7  lvresize (cmd=<value optimized out>, argc=<value optimized out>, argv=<value optimized out>) at lvresize.c:904
#8  0x000000000042265c in lvm_run_command (cmd=0xbf70f0, argc=1, argv=0x7ffff5c2e8d0) at lvmcmdline.c:1120
#9  0x00000000004252c8 in lvm2_main (argc=4, argv=0x7ffff5c2e8b8) at lvmcmdline.c:1554
#10 0x000000346901ecdd in __libc_start_main (main=0x43be90 <main>, argc=4, ubp_av=0x7ffff5c2e8b8, init=<value optimized out>, fini=<value optimized out>, 
    rtld_fini=<value optimized out>, stack_end=0x7ffff5c2e8a8) at libc-start.c:226
#11 0x00000000004145b9 in _start ()

(gdb) p $pc
$3 = (void (*)()) 0x46bc99 <allocate_extents+5753>
(gdb) x/i $pc
=> 0x46bc99 <allocate_extents+5753>:    mov    0x0(%rbp),%rax
(gdb) inf reg rbp
rbp            0x0      0x0

(gdb) f
#0  _alloc_parallel_area (vg=<value optimized out>, lv=0xc67ec8, segtype=<value optimized out>, stripes=<value optimized out>, mirrors=<value optimized out>, log_count=4, 
    region_size=0, extents=2673052, allocatable_pvs=0xc67970, alloc=ALLOC_NORMAL, parallel_areas=0x0) at metadata/lv_manip.c:1187
1187                    aa[s].pv = pva->map->pv;

(gdb) p pva
$4 = (struct pv_area *) 0x0
(gdb) p *pva
Cannot access memory at address 0x0

This accounts for the "segfault at 0" in the kernel message below: it is a NULL pointer dereference.

kernel: lvextend[9132]: segfault at 0 ip 000000000046bc99 sp 00007fff8c96f250 error 4 in lvm[400000+e9000]
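
As a rough aid for reading the kernel line above, the following sketch decodes the low bits of the x86 page-fault error code and computes the offset of the faulting instruction inside the mapping. The struct and function names are hypothetical; only the bit meanings and the numbers from the message are relied on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Meaning of the low three bits of the x86 page-fault error code. */
struct fault_info {
    bool protection_violation; /* bit 0: set = protection fault, clear = page not present */
    bool write_access;         /* bit 1: set = write, clear = read */
    bool user_mode;            /* bit 2: set = fault raised in user mode */
};

struct fault_info decode_fault(uint32_t error_code)
{
    struct fault_info info;
    info.protection_violation = (error_code & 0x1u) != 0;
    info.write_access = (error_code & 0x2u) != 0;
    info.user_mode = (error_code & 0x4u) != 0;
    return info;
}

/* Offset of the faulting instruction inside the mapped object,
 * given the "name[base+length]" part of the kernel message. */
uint64_t fault_offset(uint64_t ip, uint64_t map_base, uint64_t map_len)
{
    assert(ip >= map_base && ip - map_base < map_len);
    return ip - map_base;
}
```

For this report, "error 4" decodes to a user-mode read of a page that is not present, which is consistent with reading through a NULL pointer, and ip 0x46bc99 in lvm[400000+e9000] lies at offset 0x6bc99, the same address gdb reports for $pc.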

Comment 1 Brad Hubbard 2013-07-29 06:36:43 UTC
1155                    pva = alloc_state->areas[s + ix_log_skip].pva;
1156                    if (ah->alloc_and_split_meta) {
(gdb) 
1157                            /*
1158                             * The metadata area goes at the front of the allocated
1159                             * space for now, but could easily go at the end (or
1160                             * middle!).
1161                             *
1162                             * Even though we split these two from the same
1163                             * allocation, we store the images at the beginning
1164                             * of the areas array and the metadata at the end.
1165                             */
1166                            s += ah->area_count + ah->parity_count;
(gdb) 
1167                            aa[s].pv = pva->map->pv;
1168                            aa[s].pe = pva->start;
1169                            aa[s].len = ah->log_len;
1170
1171                            log_debug("Allocating parallel metadata area %" PRIu32
1172                                      " on %s start PE %" PRIu32
1173                                      " length %" PRIu32 ".",
1174                                      (s - (ah->area_count + ah->parity_count)),
1175                                      pv_dev_name(aa[s].pv), aa[s].pe,
1176                                      ah->log_len);
(gdb) 
1177
1178                            consume_pv_area(pva, ah->log_len);
1179                            dm_list_add(&ah->alloced_areas[s], &aa[s].list);
1180                            s -= ah->area_count + ah->parity_count;
1181                    }
1182                    aa[s].len = (ah->alloc_and_split_meta) ? len - ah->log_len : len;
1183                    /* Skip empty allocations */
1184                    if (!aa[s].len)
1185                            continue;
1186
(gdb) 
1187                    aa[s].pv = pva->map->pv;

(gdb) p s
$1 = 0
(gdb) p ix_log_skip
$2 = 0
(gdb) p alloc_state->areas[s + ix_log_skip].pva
$3 = (struct pv_area *) 0x0

I'm attaching logs collected with the following command. 

# lvextend -vvvv -i1 -l+100%FREE vg1/lvol0 &>/tmp/lvextend.out

NOTE: this log is not from the same run that produced the core and not even the same system. If matching core file and logs are required let me know and we will provide them.

Comment 4 Alasdair Kergon 2013-07-29 13:00:31 UTC
Reproduced with upstream code.

Comment 5 Alasdair Kergon 2013-07-29 13:57:34 UTC
There is an existing 3-striped allocation.
The code is being asked to extend this with just 1 stripe.

The allocation code is being supplied with the preceding 3 locations of the existing stripes so it can attempt 'contiguous' allocation of each of those stripes if possible, but then it is only being asked to find 1 stripe. It is preparing 3 'slots' to fill with extents. It finds space contiguous to the 2nd existing stripe, so it puts it into the 2nd slot, assuming both the 1st and 3rd would be filled subsequently, but because it only needs 1 stripe it stops searching at that point. Then it unexpectedly finds the 1st slot NULL and crashes.

The bug is that it should not have been asked to attempt allocation contiguous to existing striped space when extending with a different number of stripes.
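
The failure mode described in this comment can be illustrated with a deliberately simplified model. This is not the lvm2 code: struct area, fill_slots and first_empty_slot are hypothetical stand-ins that only mimic the "3 slots prepared, 1 stripe needed" situation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a free area on a PV. */
struct area {
    int pv_index;
};

/*
 * One slot is prepared per existing stripe. candidates[i] is non-NULL
 * when free space contiguous to existing stripe i was found. The search
 * stops as soon as `needed` areas have been placed.
 * Returns the number of areas placed.
 */
size_t fill_slots(struct area *slots[], struct area *candidates[],
                  size_t prev_stripes, size_t needed)
{
    size_t found = 0;

    for (size_t i = 0; i < prev_stripes; i++)
        slots[i] = NULL;

    for (size_t i = 0; i < prev_stripes && found < needed; i++) {
        if (candidates[i]) {
            slots[i] = candidates[i]; /* placed at the index of the matching stripe */
            found++;
        }
    }
    return found;
}

/*
 * The consumer reads slots 0..needed-1. Returns the index of the first
 * NULL slot in that range, or -1 if every slot it will read is filled.
 */
int first_empty_slot(struct area *slots[], size_t needed)
{
    for (size_t i = 0; i < needed; i++)
        if (!slots[i])
            return (int)i;
    return -1;
}
```

With 3 existing stripes, a request for 1 stripe, and contiguous space found only next to the 2nd stripe, fill_slots reports one area found but leaves slot 0 NULL, and slot 0 is exactly the one the consumer reads.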

Comment 7 Alasdair Kergon 2013-07-29 18:54:59 UTC
To test: create and extend LVs, varying the numbers of stripes.

This bug could be hit if the extension needed a smaller number of stripes than the original.  (If a larger number is needed, there is a different bug.)

If the number of stripes varies, the new code makes no attempt to allocate contiguously or clinging (either with or without tags) to the already-allocated extents.  (If more than one round of allocation is needed, later rounds do still try to cling to the layout from earlier rounds.)
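
The rule described in this comment can be sketched as a small decision function. This is a simplified reading of the comment, not the actual patch; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Where, if anywhere, contiguous/cling hints are taken from (hypothetical names). */
enum hint_source {
    HINT_NONE,             /* allocate without reference to earlier extents */
    HINT_EXISTING_SEGMENT, /* follow the already-allocated stripes of the LV */
    HINT_EARLIER_ROUND     /* follow the layout chosen in an earlier round */
};

/*
 * round is 0 for the first round of allocation of an extension.
 * Simplification: rounds after the first always follow the earlier
 * rounds, as the comment states for the case where stripe counts vary.
 */
enum hint_source choose_hint_source(uint32_t existing_stripes,
                                    uint32_t requested_stripes,
                                    unsigned round)
{
    assert(existing_stripes > 0 && requested_stripes > 0);

    if (round > 0)
        return HINT_EARLIER_ROUND;

    /* Only reuse the existing layout when the stripe counts match. */
    return existing_stripes == requested_stripes ? HINT_EXISTING_SEGMENT
                                                 : HINT_NONE;
}
```

Under this model, extending the 3-way striped LV from this report with -i1 gets no hints in the first round, so no slots tied to the old stripes are prepared.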

Comment 22 Nenad Peric 2013-10-09 09:39:40 UTC
lvextend completes the extension without segfaulting.
Tested with:
lvm2-2.02.100-4.el6.x86_64
kernel-2.6.32-421.el6.x86_64

There is a bug with allocation when using -l100%FREE, though; I will open another BZ for that specific case.

Marking this one as VERIFIED.

Comment 25 errata-xmlrpc 2013-11-21 23:26:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1704.html