Bug 502899 - segfault during attempted mirror down convert
Summary: segfault during attempted mirror down convert
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2
Version: 5.4
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Milan Broz
QA Contact: Cluster QE
URL:
Whiteboard:
Duplicates: 502648
Depends On:
Blocks:
 
Reported: 2009-05-27 17:00 UTC by Corey Marthaler
Modified: 2013-03-01 04:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-02 11:56:45 UTC
Target Upstream Version:
Embargoed:


Attachments
log from taft-04 before the segfault (13.03 KB, text/plain)
2009-05-27 18:55 UTC, Corey Marthaler
core dump corresponding to comment #3 (81.72 KB, application/x-gzip)
2009-05-28 18:54 UTC, Nate Straz


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2009:1393 0 normal SHIPPED_LIVE lvm2 bug-fix and enhancement update 2009-09-01 12:00:22 UTC

Description Corey Marthaler 2009-05-27 17:00:12 UTC
Description of problem:
May 26 15:01:20 taft-04 qarshd[7933]: Running cmdline: lvconvert -m 0 helter_skelter/nonsyncd_log_4legs_1
May 26 15:01:20 taft-04 lvm[7244]: No longer monitoring mirror device helter_skelter-nonsyncd_log_4legs_1 for events
May 26 15:01:20 taft-04 kernel: lvconvert[7934]: segfault at 0000000000000010 rip 000000000045008a rsp 00007fff6591cc20 error 

This was while running single machine device failure testing:

Scenario: Kill disk log of non synced 4 leg mirror(s)                           

****** Mirror hash info for this scenario ******
* name:         nonsyncd_log_4legs              
* sync:         0                               
* num mirrors:  1                               
* disklog:      /dev/sdg1                       
* failpv(s):    /dev/sdg1                       
* leg devices:  /dev/sde1 /dev/sdf1 /dev/sdh1 /dev/sdd1
************************************************       

Creating mirror(s) on taft-04...
taft-04: lvcreate -m 3 -n nonsyncd_log_4legs_1 -L 600M helter_skelter /dev/sde1:0-1000 /dev/sdf1:0-1000 /dev/sdh1:0-1000 /dev/sdd1:0-1000 /dev/sdg1:0-150                                                                                                           

Continuing on without fully syncd mirrors, currently at...
        ( 1=2.83% )                                       

Creating ext on top of mirror(s) on taft-04...
mke2fs 1.39 (29-May-2006)                     
Mounting mirrored ext filesystems on taft-04...

Writing verification files (checkit) to mirror(s) on...
        ---- taft-04 ----                              

<start name="taft-04_1" pid="8732" time="Tue May 26 15:00:35 2009" type="cmd" />
Sleeping 10 seconds to get some outsanding EXT I/O locks before the failure     
Verifying files (checkit) on mirror(s) on...                                    
        ---- taft-04 ----                                                       
                       
Disabling device sdg on taft-04

Attempting I/O to cause mirror down conversion(s) on taft-04
10+0 records in                                             
10+0 records out                                            
41943040 bytes (42 MB) copied, 0.151064 seconds, 278 MB/s   
Verifying the down conversion of the failed mirror(s)       
  /dev/sdg1: open failed: No such device or address         
Verifying FAILED device /dev/sdg1 is *NOT* in the volume(s) 
  /dev/sdg1: open failed: No such device or address         
Verifying LOG device /dev/sdg1 is *NOT* in the linear(s)    
  /dev/sdg1: open failed: No such device or address         
Verifying LEG device /dev/sde1 *IS* in the volume(s)        
  /dev/sdg1: open failed: No such device or address         
Verifying LEG device /dev/sdf1 *IS* in the volume(s)        
  /dev/sdg1: open failed: No such device or address
Verifying LEG device /dev/sdh1 *IS* in the volume(s)
  /dev/sdg1: open failed: No such device or address
Verifying LEG device /dev/sdd1 *IS* in the volume(s)
  /dev/sdg1: open failed: No such device or address
Verify the dm devices associated with /dev/sdg1 are no longer present
Verify that the mirror image order remains the same after the down conversion
  /dev/sdg1: open failed: No such device or address
  /dev/sdg1: open failed: No such device or address
  /dev/sdg1: open failed: No such device or address
  /dev/sdg1: open failed: No such device or address

Verifying files (checkit) on mirror(s) on...
        ---- taft-04 ----

Enabling device sdg on taft-04

Recreating PVs /dev/sdg1
  WARNING: Volume group helter_skelter is not consistent
  WARNING: Volume Group helter_skelter is not consistent
  WARNING: Volume group helter_skelter is not consistent
Extending the recreated PVs back into VG helter_skelter
Since we can't yet up convert existing mirrors, down converting to linear(s)
on taft-04 before re-converting back to original mirror(s)
couldn't down convert mirror to linear


Version-Release number of selected component (if applicable):
2.6.18-149.el5
lvm2-2.02.46-2.el5
device-mapper-1.02.32-1.el5


I'll attempt to reproduce and add more info...

Comment 1 Corey Marthaler 2009-05-27 18:55:30 UTC
Created attachment 345664 [details]
log from taft-04 before the segfault

Comment 2 Corey Marthaler 2009-05-27 20:21:15 UTC
Reproduced this. 

May 27 14:52:47 taft-04 qarshd[19692]: Running cmdline: lvconvert -m 0 helter_skelter/nonsyncd_log_4legs_1
May 27 14:52:47 taft-04 lvm[7341]: No longer monitoring mirror device helter_skelter-nonsyncd_log_4legs_1 for events
May 27 14:52:48 taft-04 kernel: lvconvert[19693]: segfault at 0000000000000010 rip 000000000045008a rsp 00007fff93dcb0d0 error 4

Comment 3 Nate Straz 2009-05-28 18:07:28 UTC
I also reproduced this with lvm2-2.02.46-2.el5. I was able to gather a core and produce this backtrace:


Core was generated by `lvconvert -m 0 /dev/mirror_sanity/mirror_2_linear'.
Program terminated with signal 11, Segmentation fault.
[New process 18262]
#0  0x000000000045008a in _lv_read_ahead_single (lv=<value optimized out>,
    data=0x7fff6fc0726c) at metadata/metadata.c:1427
1427                    dev_get_read_ahead(seg_pv(seg, 0)->dev, &seg_read_ahead);
(gdb) bt
#0  0x000000000045008a in _lv_read_ahead_single (lv=<value optimized out>,
    data=0x7fff6fc0726c) at metadata/metadata.c:1427
#1  0x000000000044eb0e in _lv_postorder_visit (lv=0x1c48550,
    fn=0x450050 <_lv_read_ahead_single>, data=0x7fff6fc0726c)
    at metadata/metadata.c:1334
#2  0x000000000044ebf6 in _lv_postorder (lv=0x1c485e8, fn=0x7fff6fc071ec,
    data=0x0) at metadata/metadata.c:1352
#3  0x000000000044ec79 in lv_calculate_readhead (lv=0x1c485e8)
    at metadata/metadata.c:1440
#4  0x000000000042aa14 in _lv_info (cmd=0x1c144b0, lv=0x1c48550,
    with_mknodes=0, info=0x7fff6fc07320, with_open_count=0, with_read_ahead=0,
    by_uuid_only=0) at activate/activate.c:475
#5  0x000000000042ac0b in lv_info (cmd=0x1c485e8, lv=0x7fff6fc071ec, info=0x0,
    with_open_count=-1400596269, with_read_ahead=29670480)
    at activate/activate.c:486
#6  0x000000000042b63f in _lv_activate (cmd=0x1c144b0,
    lvid_s=<value optimized out>, exclusive=0, filter=1)
    at activate/activate.c:1088
#7  0x0000000000469c59 in _file_lock_resource (cmd=0x1c144b0,
    resource=0x7fff6fc085d0 "T0RbQZw61u7Z53rz2kzlsWRAYuhd6i7QEMdSY2hUk5vFROdAvMJhg5khzhbcmNs0", flags=57) at locking/file_locking.c:258
#8  0x0000000000446ef7 in _lock_vol (cmd=0x1c144b0,
    resource=0x7fff6fc085d0 "T0RbQZw61u7Z53rz2kzlsWRAYuhd6i7QEMdSY2hUk5vFROdAvMJhg5khzhbcmNs0", flags=57) at locking/locking.c:349
#9  0x0000000000447330 in lock_vol (cmd=0x1c144b0,
    vol=0x1c3a500 "T0RbQZw61u7Z53rz2kzlsWRAYuhd6i7QEMdSY2hUk5vFROdAvMJhg5khzhbcmNs0", flags=57) at locking/locking.c:401
#10 0x00000000004558c9 in _delete_lv (mirror_lv=0x1c3a2f0, lv=0x1c3a500)
    at metadata/mirror.c:370
#11 0x0000000000456156 in _remove_mirror_images (lv=0x1c3a2f0, num_removed=1,
    removable_pvs=<value optimized out>, remove_log=1, collapse=0,
    removed=0x7fff6fc08814) at metadata/mirror.c:627
#12 0x0000000000456655 in remove_mirror_images (lv=0x1c3a2f0,
    num_mirrors=<value optimized out>, removable_pvs=0x0, remove_log=1)
    at metadata/mirror.c:673
#13 0x0000000000410e22 in lvconvert_single (cmd=0x1c144b0, lv=0x1c3a2f0,
    handle=0x7fff6fc088e0) at lvconvert.c:578
#14 0x00000000004116f5 in lvconvert (cmd=0x1c144b0,
    argc=<value optimized out>, argv=<value optimized out>) at lvconvert.c:876
#15 0x00000000004183aa in lvm_run_command (cmd=0x1c144b0, argc=1,
    argv=0x7fff6fc0ab68) at lvmcmdline.c:1007
#16 0x0000000000418798 in lvm2_main (argc=4, argv=0x7fff6fc0ab68)
    at lvmcmdline.c:1343
#17 0x00002b7face7b994 in __libc_start_main (main=0x42a780 <main>, argc=4,
    ubp_av=0x7fff6fc0ab68, init=<value optimized out>,
    fini=<value optimized out>, rtld_fini=<value optimized out>,
    stack_end=0x7fff6fc0ab58) at libc-start.c:231
#18 0x000000000040da69 in _start ()

Comment 4 Nate Straz 2009-05-28 18:54:36 UTC
Created attachment 345818 [details]
core dump corresponding to comment #3

Here's the core dump from my x86_64 system.

Comment 5 Milan Broz 2009-05-28 21:56:00 UTC
This looks like a bug in the new readahead code. During vgreduce, the failed mirror image is replaced with an error segment, and this segment type always sets the segment's area_count to 0.
We cannot assume that the first area is always present.

Comment 6 Milan Broz 2009-05-29 08:37:10 UTC
Patch sent for review here:
https://www.redhat.com/archives/lvm-devel/2009-May/msg00237.html

Comment 7 Milan Broz 2009-06-01 12:35:59 UTC
*** Bug 502648 has been marked as a duplicate of this bug. ***

Comment 9 Milan Broz 2009-06-01 15:08:58 UTC
Fixed upstream, setting bug to POST for now.

Comment 11 Milan Broz 2009-06-01 19:01:27 UTC
Fixed in lvm2-2_02_46-3_el5.

Comment 14 Corey Marthaler 2009-07-02 19:03:57 UTC
Fix verified in lvm2-2.02.46-8.el5.

Comment 16 errata-xmlrpc 2009-09-02 11:56:45 UTC
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1393.html

