Bug 1441334
| Field | Value |
|---|---|
| Summary | RAID TAKEOVER: "Internal error: pool_free asked to free pointer not in pool" |
| Product | Red Hat Enterprise Linux 7 |
| Component | lvm2 |
| lvm2 sub component | Mirroring and RAID |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | unspecified |
| Version | 7.4 |
| Keywords | Reopened |
| Target Milestone | rc |
| Target Release | --- |
| Hardware | x86_64 |
| OS | Linux |
| Reporter | Corey Marthaler <cmarthal> |
| Assignee | Heinz Mauelshagen <heinzm> |
| QA Contact | cluster-qe <cluster-qe> |
| CC | agk, heinzm, jbrassow, msnitzer, prajnoha, prockai, teigland, zkabelac |
| Fixed In Version | lvm2-2.02.171-2.el7 |
| Last Closed | 2017-08-01 21:52:19 UTC |
| Type | Bug |
Description
Corey Marthaler
2017-04-11 17:02:37 UTC
A pointer to an auto variable is being accessed, according to the backtrace. The caller provides the reference correctly, and no pool free has happened by that point. Assuming unrelated pool corruption is leading to this. Trying to reproduce, but I haven't succeeded so far...

FYI: I've seen this now with both raid0 and raid0_meta -> raid4. Appears fairly easy to repro for me. Does the FS signature matter in the following cases?

# raid0_meta:
[root@host-121 ~]# lvcreate --type raid0_meta -i 2 -n takeover -L 500M centipede2
  Using default stripesize 64.00 KiB.
  Rounding size 500.00 MiB (125 extents) up to stripe boundary size 504.00 MiB (126 extents).
  WARNING: ext4 signature detected on /dev/centipede2/takeover at offset 1080. Wipe it? [y/n]: y
  Wiping ext4 signature on /dev/centipede2/takeover.
  Logical volume "takeover" created.

[root@host-121 ~]# lvs -a -o +devices
  LV                  VG         Attr       LSize   Cpy%Sync Devices
  takeover            centipede2 rwi-a-r--- 504.00m          takeover_rimage_0(0),takeover_rimage_1(0)
  [takeover_rimage_0] centipede2 iwi-aor--- 252.00m          /dev/sdg1(1)
  [takeover_rimage_1] centipede2 iwi-aor--- 252.00m          /dev/sde1(1)
  [takeover_rmeta_0]  centipede2 ewi-aor---   4.00m          /dev/sdg1(0)
  [takeover_rmeta_1]  centipede2 ewi-aor---   4.00m          /dev/sde1(0)

[root@host-121 ~]# lvconvert --yes --type raid4 centipede2/takeover
  Using default stripesize 64.00 KiB.
  Internal error: pool_free asked to free pointer not in pool
  Logical volume centipede2/takeover successfully converted.
Segmentation fault (core dumped)

# raid0:
[root@host-122 ~]# lvcreate --type raid0 -i 2 -n takeover -L 500M centipede2
  Using default stripesize 64.00 KiB.
  Rounding size 500.00 MiB (125 extents) up to stripe boundary size 504.00 MiB (126 extents).
  WARNING: xfs signature detected on /dev/centipede2/takeover at offset 0. Wipe it? [y/n]: y
  Wiping xfs signature on /dev/centipede2/takeover.
  Logical volume "takeover" created.

[root@host-122 ~]# lvconvert --yes --type raid4 centipede2/takeover
  Using default stripesize 64.00 KiB.
  Internal error: pool_free asked to free pointer not in pool
Segmentation fault (core dumped)

Created attachment 1274020 [details]
verbose lvconvert attempt
I believe I was finally able to capture the -vvvv prior to this segfaulting.
Here's the back trace associated with the attached verbose log.
Scenario raid0: Convert Striped raid0 volume
********* Take over hash info for this scenario *********
* from type: raid0
* to type: raid4
* from legs: 2
* to legs: 2
* from region: 0
* to region: 2048.00k
* contiguous: 1
* snapshot: 0
******************************************************
Creating original volume on host-073...
host-073: lvcreate --type raid0 -i 2 -n takeover -L 4G centipede2
Waiting until all mirror|raid volumes become fully syncd...
1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec
Current volume device structure:
LV Attr LSize Cpy%Sync Devices
takeover rwi-a-r--- 4.00g takeover_rimage_0(0),takeover_rimage_1(0)
[takeover_rimage_0] iwi-aor--- 2.00g /dev/sde1(0)
[takeover_rimage_1] iwi-aor--- 2.00g /dev/sdf1(0)
Creating xfs on top of mirror(s) on host-073...
Mounting mirrored xfs filesystems on host-073...
TAKEOVER (with verbose): lvconvert --yes -R 2048.00k --type raid4 centipede2/takeover
sh: line 1: 26654 Segmentation fault (core dumped) lvconvert -vvvv --yes -R 2048.00k --type raid4 centipede2/takeover > /tmp/lvconvert.18843 2>&1
Core was generated by `lvconvert -vvvv --yes -R 2048.00k --type raid4 centipede2/takeover'.
Program terminated with signal 11, Segmentation fault.
#0 0x000055c2995f88bd in _raid_add_target_line (dm=0x55c29b1522f0, mem=<optimized out>, cmd=<optimized out>, target_state=<optimized out>, seg=0x55c29b127550, laopts=<optimized out>,
node=0x55c29b0ad928, len=8388608, pvmove_mirror_count=0x55c29b152308) at raid/raid.c:275
275 uint64_t status = seg_lv(seg, s)->status;
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.168-5.el7.x86_64 elfutils-libs-0.168-5.el7.x86_64 glibc-2.17-192.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libblkid-2.23.2-39.el7.x86_64 libcap-2.22-9.el7.x86_64 libgcc-4.8.5-14.el7.x86_64 libselinux-2.5-11.el7.x86_64 libsepol-2.5-6.el7.x86_64 libuuid-2.23.2-39.el7.x86_64 ncurses-libs-5.9-13.20130511.el7.x86_64 pcre-8.32-17.el7.x86_64 readline-6.2-10.el7.x86_64 systemd-libs-219-35.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0 0x000055c2995f88bd in _raid_add_target_line (dm=0x55c29b1522f0, mem=<optimized out>, cmd=<optimized out>, target_state=<optimized out>, seg=0x55c29b127550, laopts=<optimized out>,
node=0x55c29b0ad928, len=8388608, pvmove_mirror_count=0x55c29b152308) at raid/raid.c:275
#1 0x000055c2995ff200 in _add_target_to_dtree (dm=<optimized out>, dnode=<optimized out>, seg=<optimized out>, laopts=<optimized out>) at activate/dev_manager.c:2512
#2 0x000055c299602153 in _add_segment_to_dtree (layer=0x0, laopts=<optimized out>, seg=<optimized out>, dnode=0x55c29b0ad928, dtree=0x55c29b0ad6f0, dm=0x55c29b1522f0)
at activate/dev_manager.c:2710
#3 _add_new_lv_to_dtree (dm=dm@entry=0x55c29b1522f0, dtree=dtree@entry=0x55c29b0ad6f0, lv=lv@entry=0x55c29b1270e0, laopts=laopts@entry=0x7ffd4157fe80, layer=0x0)
at activate/dev_manager.c:2899
#4 0x000055c299603637 in _tree_action (dm=dm@entry=0x55c29b1522f0, lv=lv@entry=0x55c29b1270e0, laopts=laopts@entry=0x7ffd4157fe80, action=action@entry=PRELOAD)
at activate/dev_manager.c:3147
#5 0x000055c299605d36 in dev_manager_preload (dm=dm@entry=0x55c29b1522f0, lv=lv@entry=0x55c29b1270e0, laopts=laopts@entry=0x7ffd4157fe80,
flush_required=flush_required@entry=0x7ffd4157fe2c) at activate/dev_manager.c:3210
#6 0x000055c29955caf4 in _lv_preload (lv=lv@entry=0x55c29b1270e0, laopts=laopts@entry=0x7ffd4157fe80, flush_required=flush_required@entry=0x7ffd4157fe2c) at activate/activate.c:1414
#7 0x000055c29956166d in _lv_suspend (error_if_not_suspended=0, lv_pre=0x55c29b1270e0, lv=0x55c29b13b100, laopts=0x7ffd4157fe80,
lvid_s=0x7ffd41580fa0 "0LFnCknKYd4plqZOBOIlVLoz5TSyGSEP5JQsqnZhuevg7SYOFxZQcMgOgjsmuycg", cmd=0x55c29b067020) at activate/activate.c:2152
#8 lv_suspend_if_active (cmd=cmd@entry=0x55c29b067020, lvid_s=lvid_s@entry=0x7ffd41580fa0 "0LFnCknKYd4plqZOBOIlVLoz5TSyGSEP5JQsqnZhuevg7SYOFxZQcMgOgjsmuycg",
origin_only=origin_only@entry=0, exclusive=exclusive@entry=0, lv=<optimized out>, lv_pre=<optimized out>) at activate/activate.c:2265
#9 0x000055c29961743d in _file_lock_resource (cmd=0x55c29b067020, resource=0x7ffd41580fa0 "0LFnCknKYd4plqZOBOIlVLoz5TSyGSEP5JQsqnZhuevg7SYOFxZQcMgOgjsmuycg", flags=60,
lv=<optimized out>) at locking/file_locking.c:114
#10 0x000055c299592888 in _lock_vol (cmd=cmd@entry=0x55c29b067020, resource=<optimized out>,
resource@entry=0x7ffd41580fa0 "0LFnCknKYd4plqZOBOIlVLoz5TSyGSEP5JQsqnZhuevg7SYOFxZQcMgOgjsmuycg", flags=flags@entry=60, lv_op=lv_op@entry=LV_SUSPEND, lv=lv@entry=0x55c29b1270e0)
at locking/locking.c:275
#11 0x000055c2995931d3 in lock_vol (cmd=0x55c29b067020, vol=<optimized out>, vol@entry=0x55c29b1270e0 "0LFnCknKYd4plqZOBOIlVLoz5TSyGSEP5JQsqnZhuevg7SYOFxZQcMgOgjsmuycg", flags=60,
lv=lv@entry=0x55c29b1270e0) at locking/locking.c:355
#12 0x000055c29959d17c in _lv_update_and_reload (lv=lv@entry=0x55c29b1270e0, origin_only=origin_only@entry=0) at metadata/lv_manip.c:6371
#13 0x000055c2995a5757 in lv_update_and_reload (lv=lv@entry=0x55c29b1270e0) at metadata/lv_manip.c:6397
#14 0x000055c2995d30b4 in _lv_update_reload_fns_reset_eliminate_lvs (lv=lv@entry=0x55c29b1270e0, origin_only=0, origin_only=0) at metadata/raid_manip.c:567
#15 0x000055c2995d6dfb in _takeover_upconvert_wrapper (lv=0x55c29b1270e0, new_segtype=0x55c29b0a5590, new_image_count=<optimized out>, new_data_copies=2, new_stripe_size=128,
new_region_size=4096, allocate_pvs=0x55c29b126378, force=<optimized out>, yes=<optimized out>, new_stripes=0) at metadata/raid_manip.c:5195
#16 0x000055c2995d9f63 in lv_raid_convert (lv=lv@entry=0x55c29b1270e0, new_segtype=0x55c29b0a5590, yes=1, force=0, new_stripes=0, new_stripe_size_supplied=<optimized out>,
new_stripe_size=128, new_region_size=4096, allocate_pvs=0x55c29b126378) at metadata/raid_manip.c:6072
#17 0x000055c29952515d in _lvconvert_raid (lv=lv@entry=0x55c29b1270e0, lp=lp@entry=0x7ffd41581940) at lvconvert.c:1401
#18 0x000055c299526f5c in _convert_striped_raid (cmd=<optimized out>, lp=0x7ffd41581940, lv=0x55c29b1270e0) at lvconvert.c:1609
#19 _convert_striped (lp=<optimized out>, lv=<optimized out>, cmd=<optimized out>) at lvconvert.c:1676
#20 _lvconvert_raid_types (cmd=cmd@entry=0x55c29b067020, lv=lv@entry=0x55c29b1270e0, lp=lp@entry=0x7ffd41581940) at lvconvert.c:1749
#21 0x000055c29952717a in _lvconvert_raid_types_single (cmd=cmd@entry=0x55c29b067020, lv=0x55c29b1270e0, handle=handle@entry=0x55c29b0b45e8) at lvconvert.c:4241
#22 0x000055c29954bc38 in process_each_lv_in_vg (cmd=cmd@entry=0x55c29b067020, vg=vg@entry=0x55c29b1262a0, arg_lvnames=arg_lvnames@entry=0x7ffd41581820,
---Type <return> to continue, or q <return> to quit---
tags_in=tags_in@entry=0x7ffd415817d0, stop_on_error=stop_on_error@entry=0, handle=handle@entry=0x55c29b0b45e8,
check_single_lv=check_single_lv@entry=0x55c299521230 <_lvconvert_raid_types_check>, process_single_lv=process_single_lv@entry=0x55c299527100 <_lvconvert_raid_types_single>)
at toollib.c:3134
#23 0x000055c29954d084 in _process_lv_vgnameid_list (process_single_lv=0x55c299527100 <_lvconvert_raid_types_single>, check_single_lv=0x55c299521230 <_lvconvert_raid_types_check>,
handle=0x55c29b0b45e8, arg_tags=0x7ffd415817d0, arg_lvnames=0x7ffd415817f0, arg_vgnames=0x7ffd415817e0, vgnameids_to_process=0x7ffd41581810, read_flags=1048576, cmd=0x55c29b067020)
at toollib.c:3629
#24 process_each_lv (cmd=cmd@entry=0x55c29b067020, argc=argc@entry=1, argv=<optimized out>, one_vgname=one_vgname@entry=0x0, one_lvname=one_lvname@entry=0x0,
read_flags=read_flags@entry=1048576, handle=handle@entry=0x55c29b0b45e8, check_single_lv=check_single_lv@entry=0x55c299521230 <_lvconvert_raid_types_check>,
process_single_lv=process_single_lv@entry=0x55c299527100 <_lvconvert_raid_types_single>) at toollib.c:3781
#25 0x000055c2995293b8 in lvconvert_raid_types_cmd (cmd=0x55c29b067020, argc=<optimized out>, argv=<optimized out>) at lvconvert.c:4328
#26 0x000055c2995350c8 in lvm_run_command (cmd=cmd@entry=0x55c29b067020, argc=1, argc@entry=8, argv=0x7ffd41581e10, argv@entry=0x7ffd41581dd8) at lvmcmdline.c:2925
#27 0x000055c299535db3 in lvm2_main (argc=8, argv=0x7ffd41581dd8) at lvmcmdline.c:3454
#28 0x00007fbc5ec02c05 in __libc_start_main () from /lib64/libc.so.6
#29 0x000055c299514e8e in _start ()
The above issue in comment #6 appeared to be an unrelated stack and ended up being filed as bug 1445987.

Created attachment 1274935 [details]
verbose lvconvert attempt

Second attempt to provide the verbose output of this failing lvconvert. This time the stack looks more similar to what's originally in comment #0.

================================================================================
Iteration 26.1 started at Wed Apr 26 20:18:42 CDT 2017
================================================================================
Scenario raid0_meta: Convert Striped raid0_meta volume
********* Take over hash info for this scenario *********
* from type: raid0_meta
* to type: raid4
* from legs: 2
* to legs: 2
* from region: 0
* to region: 65536.00k
* contiguous: 1
* snapshot: 1
******************************************************
Creating original volume on host-073...
host-073: lvcreate --type raid0_meta -i 2 -n takeover -L 4G centipede2
Waiting until all mirror|raid volumes become fully syncd...
1/1 mirror(s) are fully synced: ( 100.00% )
Current volume device structure:
  LV                  Attr       LSize Cpy%Sync Devices
  takeover            rwi-a-r--- 4.00g          takeover_rimage_0(0),takeover_rimage_1(0)
  [takeover_rimage_0] iwi-aor--- 2.00g          /dev/sda1(1)
  [takeover_rimage_1] iwi-aor--- 2.00g          /dev/sdf1(1)
  [takeover_rmeta_0]  ewi-aor--- 4.00m          /dev/sda1(0)
  [takeover_rmeta_1]  ewi-aor--- 4.00m          /dev/sdf1(0)
Creating xfs on top of mirror(s) on host-073...
Mounting mirrored xfs filesystems on host-073...
Writing verification files (checkit) to mirror(s) on...
        ---- host-073 ----
Verifying files (checkit) on mirror(s) on...
        ---- host-073 ----
TAKEOVER (with verbose): lvconvert --yes -R 65536.00k --type raid4 centipede2/takeover
sh: line 1:  6683 Segmentation fault      (core dumped) lvconvert -vvvv --yes -R 65536.00k --type raid4 centipede2/takeover > /tmp/lvconvert.24613 2>&1

Core was generated by `lvconvert -vvvv --yes -R 65536.00k --type raid4 centipede2/takeover'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000557aedcbf66c in _rename_area_lvs (lv=lv@entry=0x557aee9e7230, suffix=suffix@entry=0x0) at metadata/raid_manip.c:4716
4716            if (!(seg_lv(seg, s)->name = _generate_raid_name(lv, sfx[0], s)))
(gdb) bt
#0  0x0000557aedcbf66c in _rename_area_lvs (lv=lv@entry=0x557aee9e7230, suffix=suffix@entry=0x0) at metadata/raid_manip.c:4716
#1  0x0000557aedcc7e1a in _takeover_upconvert_wrapper (lv=0x557aee9e7230, new_segtype=0x557aee965590, new_image_count=<optimized out>, new_data_copies=2, new_stripe_size=128, new_region_size=131072, allocate_pvs=0x557aee9e64c8, force=<optimized out>, yes=<optimized out>, new_stripes=0) at metadata/raid_manip.c:5203
#2  0x0000557aedccaf63 in lv_raid_convert (lv=lv@entry=0x557aee9e7230, new_segtype=0x557aee965590, yes=1, force=0, new_stripes=0, new_stripe_size_supplied=<optimized out>, new_stripe_size=128, new_region_size=131072, allocate_pvs=0x557aee9e64c8) at metadata/raid_manip.c:6072
#3  0x0000557aedc1615d in _lvconvert_raid (lv=lv@entry=0x557aee9e7230, lp=lp@entry=0x7ffc5b913b20) at lvconvert.c:1401
#4  0x0000557aedc17f5c in _convert_striped_raid (cmd=<optimized out>, lp=0x7ffc5b913b20, lv=0x557aee9e7230) at lvconvert.c:1609
#5  _convert_striped (lp=<optimized out>, lv=<optimized out>, cmd=<optimized out>) at lvconvert.c:1676
#6  _lvconvert_raid_types (cmd=cmd@entry=0x557aee927020, lv=lv@entry=0x557aee9e7230, lp=lp@entry=0x7ffc5b913b20) at lvconvert.c:1749
#7  0x0000557aedc1817a in _lvconvert_raid_types_single (cmd=cmd@entry=0x557aee927020, lv=0x557aee9e7230, handle=handle@entry=0x557aee9745e8) at lvconvert.c:4241
#8  0x0000557aedc3cc38 in process_each_lv_in_vg (cmd=cmd@entry=0x557aee927020, vg=vg@entry=0x557aee9e63f0, arg_lvnames=arg_lvnames@entry=0x7ffc5b913a00, tags_in=tags_in@entry=0x7ffc5b9139b0, stop_on_error=stop_on_error@entry=0, handle=handle@entry=0x557aee9745e8, check_single_lv=check_single_lv@entry=0x557aedc12230 <_lvconvert_raid_types_check>, process_single_lv=process_single_lv@entry=0x557aedc18100 <_lvconvert_raid_types_single>) at toollib.c:3134
#9  0x0000557aedc3e084 in _process_lv_vgnameid_list (process_single_lv=0x557aedc18100 <_lvconvert_raid_types_single>, check_single_lv=0x557aedc12230 <_lvconvert_raid_types_check>, handle=0x557aee9745e8, arg_tags=0x7ffc5b9139b0, arg_lvnames=0x7ffc5b9139d0, arg_vgnames=0x7ffc5b9139c0, vgnameids_to_process=0x7ffc5b9139f0, read_flags=1048576, cmd=0x557aee927020) at toollib.c:3629
#10 process_each_lv (cmd=cmd@entry=0x557aee927020, argc=argc@entry=1, argv=<optimized out>, one_vgname=one_vgname@entry=0x0, one_lvname=one_lvname@entry=0x0, read_flags=read_flags@entry=1048576, handle=handle@entry=0x557aee9745e8, check_single_lv=check_single_lv@entry=0x557aedc12230 <_lvconvert_raid_types_check>, process_single_lv=process_single_lv@entry=0x557aedc18100 <_lvconvert_raid_types_single>) at toollib.c:3781
#11 0x0000557aedc1a3b8 in lvconvert_raid_types_cmd (cmd=0x557aee927020, argc=<optimized out>, argv=<optimized out>) at lvconvert.c:4328
#12 0x0000557aedc260c8 in lvm_run_command (cmd=cmd@entry=0x557aee927020, argc=1, argc@entry=8, argv=0x7ffc5b913ff0, argv@entry=0x7ffc5b913fb8) at lvmcmdline.c:2925
#13 0x0000557aedc26db3 in lvm2_main (argc=8, argv=0x7ffc5b913fb8) at lvmcmdline.c:3454
#14 0x00007fda94e42c05 in __libc_start_main () from /lib64/libc.so.6
#15 0x0000557aedc05e8e in _start ()

Can't reproduce this; it must be fixed.

Heinz, I tried to grab a few things from gdb, but saved the core file in case you want to look for more info.
lvm2-2.02.170-2.el7.x86_64
[root@host-126 ~]# lvm version
LVM version: 2.02.170(2)-RHEL7 (2017-04-13)
Library version: 1.02.139-RHEL7 (2017-04-13)
Driver version: 4.35.0
Configuration: ./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-default-dm-run-dir=/run --with-default-run-dir=/run/lvm --with-default-pid-dir=/run --with-default-locking-dir=/run/lock/lvm --with-usrlibdir=/usr/lib64 --enable-lvm1_fallback --enable-fsadm --with-pool=internal --enable-write_install --with-user= --with-group= --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --enable-pkgconfig --enable-applib --enable-cmdlib --enable-dmeventd --enable-blkid_wiping --enable-python2-bindings --with-cluster=internal --with-clvmd=corosync --enable-cmirrord --with-udevdir=/usr/lib/udev/rules.d --enable-udev_sync --with-thin=internal --enable-lvmetad --with-cache=internal --enable-lvmpolld --enable-lockd-dlm --enable-lockd-sanlock --enable-dmfilemapd
Core was generated by `lvconvert --yes -R 2048.00k --type raid4 centipede2/takeover'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000556864b4966c in _rename_area_lvs (lv=lv@entry=0x556865536e20,
suffix=suffix@entry=0x0) at metadata/raid_manip.c:4716
4716 if (!(seg_lv(seg, s)->name = _generate_raid_name(lv, sfx[0], s)))
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.168-5.el7.x86_64 elfutils-libs-0.168-5.el7.x86_64 glibc-2.17-194.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libblkid-2.23.2-39.el7.x86_64 libcap-2.22-9.el7.x86_64 libgcc-4.8.5-14.el7.x86_64 libselinux-2.5-11.el7.x86_64 libsepol-2.5-6.el7.x86_64 libuuid-2.23.2-39.el7.x86_64 ncurses-libs-5.9-13.20130511.el7.x86_64 pcre-8.32-17.el7.x86_64 readline-6.2-10.el7.x86_64 systemd-libs-219-38.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) threads
Undefined command: "threads". Try "help".
(gdb) thread
[Current thread is 1 (Thread 0x7fd577ad8880 (LWP 5921))]
(gdb) bt
#0 0x0000556864b4966c in _rename_area_lvs (lv=lv@entry=0x556865536e20,
suffix=suffix@entry=0x0) at metadata/raid_manip.c:4716
#1 0x0000556864b51e1a in _takeover_upconvert_wrapper (lv=0x556865536e20,
new_segtype=0x5568654b1590, new_image_count=<optimized out>, new_data_copies=2,
new_stripe_size=128, new_region_size=4096, allocate_pvs=0x5568655360b8,
force=<optimized out>, yes=<optimized out>, new_stripes=0) at metadata/raid_manip.c:5203
#2 0x0000556864b54f63 in lv_raid_convert (lv=lv@entry=0x556865536e20,
new_segtype=0x5568654b1590, yes=1, force=0, new_stripes=0,
new_stripe_size_supplied=<optimized out>, new_stripe_size=128, new_region_size=4096,
allocate_pvs=0x5568655360b8) at metadata/raid_manip.c:6072
#3 0x0000556864aa015d in _lvconvert_raid (lv=lv@entry=0x556865536e20,
lp=lp@entry=0x7ffc57c40440) at lvconvert.c:1401
#4 0x0000556864aa1f5c in _convert_striped_raid (cmd=<optimized out>, lp=0x7ffc57c40440,
lv=0x556865536e20) at lvconvert.c:1609
#5 _convert_striped (lp=<optimized out>, lv=<optimized out>, cmd=<optimized out>)
at lvconvert.c:1676
#6 _lvconvert_raid_types (cmd=cmd@entry=0x556865473020, lv=lv@entry=0x556865536e20,
lp=lp@entry=0x7ffc57c40440) at lvconvert.c:1749
#7 0x0000556864aa217a in _lvconvert_raid_types_single (cmd=cmd@entry=0x556865473020,
lv=0x556865536e20, handle=handle@entry=0x5568654c05e8) at lvconvert.c:4241
#8 0x0000556864ac6c38 in process_each_lv_in_vg (cmd=cmd@entry=0x556865473020,
vg=vg@entry=0x556865535fe0, arg_lvnames=arg_lvnames@entry=0x7ffc57c40320,
tags_in=tags_in@entry=0x7ffc57c402d0, stop_on_error=stop_on_error@entry=0,
handle=handle@entry=0x5568654c05e8,
check_single_lv=check_single_lv@entry=0x556864a9c230 <_lvconvert_raid_types_check>,
process_single_lv=process_single_lv@entry=0x556864aa2100 <_lvconvert_raid_types_single>)
at toollib.c:3134
#9 0x0000556864ac8084 in _process_lv_vgnameid_list (
process_single_lv=0x556864aa2100 <_lvconvert_raid_types_single>,
check_single_lv=0x556864a9c230 <_lvconvert_raid_types_check>, handle=0x5568654c05e8,
arg_tags=0x7ffc57c402d0, arg_lvnames=0x7ffc57c402f0, arg_vgnames=0x7ffc57c402e0,
vgnameids_to_process=0x7ffc57c40310, read_flags=1048576, cmd=0x556865473020)
at toollib.c:3629
#10 process_each_lv (cmd=cmd@entry=0x556865473020, argc=argc@entry=1, argv=<optimized out>,
one_vgname=one_vgname@entry=0x0, one_lvname=one_lvname@entry=0x0,
read_flags=read_flags@entry=1048576, handle=handle@entry=0x5568654c05e8,
check_single_lv=check_single_lv@entry=0x556864a9c230 <_lvconvert_raid_types_check>,
process_single_lv=process_single_lv@entry=0x556864aa2100 <_lvconvert_raid_types_single>)
at toollib.c:3781
#11 0x0000556864aa43b8 in lvconvert_raid_types_cmd (cmd=0x556865473020,
argc=<optimized out>, argv=<optimized out>) at lvconvert.c:4328
#12 0x0000556864ab00c8 in lvm_run_command (cmd=cmd@entry=0x556865473020, argc=1,
argc@entry=7, argv=0x7ffc57c40908, argv@entry=0x7ffc57c408d8) at lvmcmdline.c:2925
#13 0x0000556864ab0db3 in lvm2_main (argc=7, argv=0x7ffc57c408d8) at lvmcmdline.c:3454
#14 0x00007fd576853c05 in __libc_start_main () from /lib64/libc.so.6
#15 0x0000556864a8fe8e in _start ()
(gdb) p *lv
$1 = {lvid = {id = {{uuid = "5GrsqfTtYmqxUI8AmdYQHH9ruHwOAQko"}, {
uuid = "QI20xeVM2Kw8qwxS0uAdVfgH79HTp8Bk"}},
s = "5GrsqfTtYmqxUI8AmdYQHH9ruHwOAQkoQI20xeVM2Kw8qwxS0uAdVfgH79HTp8Bk\000\000\000\000\000\000\000"}, name = 0x556865536f78 "takeover", vg = 0x556865535fe0, status = 4294968128,
alloc = ALLOC_INHERIT, profile = 0x0, read_ahead = 4294967295, major = -1, minor = -1,
size = 8503296, le_count = 1038, origin_count = 0, external_count = 0, snapshot_segs = {
n = 0x556865536eb8, p = 0x556865536eb8}, snapshot = 0x0, rdevice = 0x0, rsites = {
n = 0x556865536ed8, p = 0x556865536ed8}, segments = {n = 0x556865537550,
p = 0x556865537550}, tags = {n = 0x556865536ef8, p = 0x556865536ef8},
segs_using_this_lv = {n = 0x556865536f08, p = 0x556865536f08}, indirect_glvs = {
n = 0x556865536f18, p = 0x556865536f18}, this_glv = 0x0, timestamp = 1495234049,
new_lock_args = 0, hostname = 0x556865536f88 "host-126.virt.lab.msp.redhat.com",
lock_args = 0x0}
(gdb) p *seg
$2 = {list = {n = 0x556865536ee8, p = 0x556865536ee8}, lv = 0x556865536e20,
segtype = 0x5568654b1590, le = 0, len = 1038, reshape_len = 0, status = 0,
stripe_size = 128, writebehind = 0, min_recovery_rate = 0, max_recovery_rate = 0,
data_offset = 0, area_count = 3, area_len = 1038, chunk_size = 0, origin = 0x0,
indirect_origin = 0x0, merge_lv = 0x0, cow = 0x0, origin_list = {n = 0x5568655375c8,
p = 0x5568655375c8}, region_size = 4096, data_copies = 2, extents_copied = 0,
log_lv = 0x0, pvmove_source_seg = 0x0, segtype_private = 0x0, tags = {n = 0x556865537600,
p = 0x556865537600}, areas = 0x556865524390, meta_areas = 0x5568655243e0,
metadata_lv = 0x0, transaction_id = 0, zero_new_blocks = THIN_ZERO_UNSELECTED,
discards = THIN_DISCARDS_UNSELECTED, thin_messages = {n = 0x556865537638,
p = 0x556865537638}, external_lv = 0x0, pool_lv = 0x0, device_id = 0,
cache_metadata_format = CACHE_METADATA_FORMAT_UNSELECTED,
cache_mode = CACHE_MODE_UNSELECTED, policy_name = 0x0, policy_settings = 0x0,
cleaner_policy = 0, replicator = 0x0, rlog_lv = 0x0, rlog_type = 0x0,
rdevice_index_highest = 0, rsite_index_highest = 0}
(gdb) p s
$3 = 0
(gdb) p (struct logical_volume *)seg->lv
$4 = (struct logical_volume *) 0x556865536e20
(gdb) p *(struct logical_volume *)seg->lv
$5 = {lvid = {id = {{uuid = "5GrsqfTtYmqxUI8AmdYQHH9ruHwOAQko"}, {
uuid = "QI20xeVM2Kw8qwxS0uAdVfgH79HTp8Bk"}},
s = "5GrsqfTtYmqxUI8AmdYQHH9ruHwOAQkoQI20xeVM2Kw8qwxS0uAdVfgH79HTp8Bk\000\000\000\000\000\000\000"}, name = 0x556865536f78 "takeover", vg = 0x556865535fe0, status = 4294968128,
alloc = ALLOC_INHERIT, profile = 0x0, read_ahead = 4294967295, major = -1, minor = -1,
size = 8503296, le_count = 1038, origin_count = 0, external_count = 0, snapshot_segs = {
n = 0x556865536eb8, p = 0x556865536eb8}, snapshot = 0x0, rdevice = 0x0, rsites = {
n = 0x556865536ed8, p = 0x556865536ed8}, segments = {n = 0x556865537550,
p = 0x556865537550}, tags = {n = 0x556865536ef8, p = 0x556865536ef8},
segs_using_this_lv = {n = 0x556865536f08, p = 0x556865536f08}, indirect_glvs = {
n = 0x556865536f18, p = 0x556865536f18}, this_glv = 0x0, timestamp = 1495234049,
new_lock_args = 0, hostname = 0x556865536f88 "host-126.virt.lab.msp.redhat.com",
lock_args = 0x0}
(gdb) p ((struct logical_volume *)seg->lv)->name
$6 = 0x556865536f78 "takeover"
(gdb) p sfx
$7 = {0x556865555140 "rimage", 0x556865555148 "rmeta"}
There may be several separate problems here.

Firstly, we have some unnecessary dm_pool_frees (in reverse order), which I've removed:
https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=57492a609410c39809960a6f32bf67e1eddb7064

Secondly, I've added some more diagnostics to a set of code paths that appear in one of the traces:
https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=d1ddfc408535b9c4df432273657f952c59f16232

Thirdly, I've tried to fix some faults on some of the error paths triggered while testing the 2nd patch:
https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=fbe7464df5018d8ab85dd312fae8c13796eca59d

(I haven't yet found a way to explain everything on this bug with these patches alone - there could be some additional things to find and fix - but it's worth applying them and retesting.)

Upstream commit 57492a609410c39809960a6f32bf67e1eddb7064

Marking verified in the latest rpms. Running many iterations of this test case was no longer able to cause this issue.
3.10.0-651.el7.x86_64
lvm2-2.02.171-2.el7 BUILT: Wed May 24 09:02:34 CDT 2017
lvm2-libs-2.02.171-2.el7 BUILT: Wed May 24 09:02:34 CDT 2017
lvm2-cluster-2.02.171-2.el7 BUILT: Wed May 24 09:02:34 CDT 2017
device-mapper-1.02.140-2.el7 BUILT: Wed May 24 09:02:34 CDT 2017
device-mapper-libs-1.02.140-2.el7 BUILT: Wed May 24 09:02:34 CDT 2017
device-mapper-event-1.02.140-2.el7 BUILT: Wed May 24 09:02:34 CDT 2017
device-mapper-event-libs-1.02.140-2.el7 BUILT: Wed May 24 09:02:34 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7 BUILT: Mon Mar 27 10:15:46 CDT 2017
================================================================================
Iteration 199.1 started at Fri May 26 00:53:22 CDT 2017
================================================================================
Scenario raid0: Convert Striped raid0 volume
********* Take over hash info for this scenario *********
* from type: raid0
* to type: raid4
* from legs: 2
* to legs: 2
* from region: 0
* to region: 2048.00k
* contiguous: 1
* snapshot: 0
******************************************************
Creating original volume on harding-03...
harding-03: lvcreate --type raid0 -i 2 -n takeover -L 4G centipede2
WARNING: xfs signature detected on /dev/centipede2/takeover at offset 0. Wipe it? [y/n]: [n]
Aborted wiping of xfs.
1 existing signature left on the device.
Waiting until all mirror|raid volumes become fully syncd...
1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec
Current volume device structure:
LV Attr LSize Cpy%Sync Devices
takeover rwi-a-r--- 4.00g takeover_rimage_0(0),takeover_rimage_1(0)
[takeover_rimage_0] iwi-aor--- 2.00g /dev/mapper/mpatha1(0)
[takeover_rimage_1] iwi-aor--- 2.00g /dev/mapper/mpathc1(0)
Creating xfs on top of mirror(s) on harding-03...
warning: device is not properly aligned /dev/centipede2/takeover
Mounting mirrored xfs filesystems on harding-03...
Writing verification files (checkit) to mirror(s) on...
---- harding-03 ----
Sleeping 15 seconds to get some outsanding I/O locks before the failure
Verifying files (checkit) on mirror(s) on...
---- harding-03 ----
TAKEOVER: lvconvert --yes -R 2048.00k --type raid4 centipede2/takeover
Waiting until all mirror|raid volumes become fully syncd...
0/1 mirror(s) are fully synced: ( 27.63% )
0/1 mirror(s) are fully synced: ( 48.46% )
0/1 mirror(s) are fully synced: ( 76.83% )
1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec
Current volume device structure:
LV Attr LSize Cpy%Sync Devices
takeover rwi-aor--- 4.00g 100.00 takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0)
[takeover_rimage_0] iwi-aor--- 2.00g /dev/mapper/mpathb1(1)
[takeover_rimage_1] iwi-aor--- 2.00g /dev/mapper/mpatha1(0)
[takeover_rimage_2] iwi-aor--- 2.00g /dev/mapper/mpathc1(0)
[takeover_rmeta_0] ewi-aor--- 4.00m /dev/mapper/mpathb1(0)
[takeover_rmeta_1] ewi-aor--- 4.00m /dev/mapper/mpatha1(512)
[takeover_rmeta_2] ewi-aor--- 4.00m /dev/mapper/mpathc1(512)
Verifying files (checkit) on mirror(s) on...
---- harding-03 ----
Verifying files (checkit) on mirror(s) on...
---- harding-03 ----
Stopping the io load (collie/xdoio) on mirror(s)
Unmounting xfs and removing mnt point on harding-03...
Deactivating and removing raid(s)
lvchange -an /dev/centipede2/takeover
lvremove -f /dev/centipede2/takeover
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2222