Bug 2175220

Summary: Segmentation fault when executing non-reporting commands with -S|--select and log/command_log_report=1 setting
Product: Fedora
Reporter: Tony Asleson <tasleson>
Component: lvm2
Assignee: Peter Rajnoha <prajnoha>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Priority: unspecified
Version: 39
CC: agk, anprice, bmarzins, bmr, cfeist, heinzm, kzak, lvm-team, mcsontos, msnitzer, prajnoha, teigland, zkabelac
Hardware: All
OS: Linux
Fixed In Version: lvm2-2.03.20-1.fc39
Last Closed: 2023-08-21 16:05:08 UTC
Type: Bug

Description Tony Asleson 2023-03-03 15:21:11 UTC
Description of problem:

Segmentation fault during vgremove.

Version-Release number of selected component (if applicable):
I've reproduced this with:

Current upstream main branch commit b0e75bd356a070423754676872b7b6948913be2e

# lvm version
  LVM version:     2.03.20(2)-git (2023-02-21)
  Library version: 1.02.193-git (2023-02-21)
  Driver version:  4.47.0
  Configuration:   ./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --program-prefix= --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-default-dm-run-dir=/run --with-default-run-dir=/run/lvm --with-default-pid-dir=/run --with-default-locking-dir=/run/lock/lvm --with-usrlibdir=/usr/lib64 --enable-fsadm --enable-write_install --with-user= --with-group= --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --enable-pkgconfig --enable-cmdlib --enable-dmeventd --enable-blkid_wiping --disable-readline --enable-editline --enable-dependency-tracking --with-cluster=internal --enable-cmirrord --with-udevdir=/usr/lib/udev/rules.d --enable-udev_sync --with-thin=internal --with-cache=internal --enable-lvmpolld --enable-lvmlockd-dlm --enable-lvmlockd-dlmcontrol --enable-lvmlockd-sanlock --enable-dbus-service --enable-notify-dbus --enable-dmfilemapd --with-writecache=internal --with-vdo=internal --with-vdo-format=/usr/bin/vdoformat --with-integrity=internal --disable-silent-rules

and with EL9.2

LVM version:     2.03.17(2) (2022-11-10)
Library version: 1.02.187 (2022-11-10)
Driver version:  4.47.0

How reproducible:
100%

Steps to Reproduce:
1. Create duplicate vg name

# vgcreate --devices /dev/sdb foo /dev/sdb && vgcreate --devices /dev/sdc foo /dev/sdc

2. Delete one of the volume groups
# vgs
  WARNING: VG name foo is used by VGs Oysvt4-tBtU-SE5P-VCXu-06K3-YDQS-BBBTTM and jzmVUW-m7Sy-DEg8-BA2Y-6VJV-axqu-aMQ7sc.
  Fix duplicate VG names with vgrename uuid, a device filter, or system IDs.
  VG  #PV #LV #SN Attr   VSize     VFree    
  foo   1   0   0 wz--n- <1024.00g <1024.00g
  foo   1   0   0 wz--n- <1024.00g <1024.00g
[root@lvmdbusd lvmdbusd]# LVM_COMMAND_PROFILE=lvmdbusd /usr/sbin/lvm vgremove -f --select vg_uuid=Oysvt4-tBtU-SE5P-VCXu-06K3-YDQS-BBBTTM --config 'global/notify_dbus=0 log/command_log_selection="log_context!="'
  WARNING: VG name foo is used by VGs Oysvt4-tBtU-SE5P-VCXu-06K3-YDQS-BBBTTM and jzmVUW-m7Sy-DEg8-BA2Y-6VJV-axqu-aMQ7sc.
  Fix duplicate VG names with vgrename uuid, a device filter, or system IDs.
  {
      "log": [
      ]
  }
Segmentation fault (core dumped)
[root@lvmdbusd lvmdbusd]# echo $?
139


Actual results:
Segmentation fault (core dumped), exit code of 139

Expected results:
Success, exit code of 0

Additional info:

It appears that if you leave out the LVM_COMMAND_PROFILE variable and the additional --config from the command line, the command completes successfully.

Comment 1 Tony Asleson 2023-03-03 15:25:06 UTC
When running in the context of a unit test for lvmdbusd, I'm seeing the following output of the vgremove

WARNING: VG name bmcsxkhi_vg_LvMdBuS_TEST is used by VGs YwMqZc-ctna-buri-PsSd-ldyS-dfKd-V82Udp and rBtYAT-SZqh-Jy5Z-I2xv-rgvw-Es04-oBIxnW.
Fix duplicate VG names with vgrename uuid, a device filter, or system IDs.
malloc(): unsorted double linked list corrupted

exit code -6

Comment 2 Tony Asleson 2023-03-03 15:27:20 UTC
(In reply to Tony Asleson from comment #1)
> When running in the context of a unit test for lvmdbusd, I'm seeing the
> following output of the vgremove
> 
> WARNING: VG name bmcsxkhi_vg_LvMdBuS_TEST is used by VGs
> YwMqZc-ctna-buri-PsSd-ldyS-dfKd-V82Udp and
> rBtYAT-SZqh-Jy5Z-I2xv-rgvw-Es04-oBIxnW.
> Fix duplicate VG names with vgrename uuid, a device filter, or system IDs.
> malloc(): unsorted double linked list corrupted
> 
> exit code -6

Disregard the above; this appears to be the python process getting corrupted by the lvm segmentation fault.

Comment 3 Tony Asleson 2023-03-03 15:52:02 UTC
You don't need to have a duplicate VG to reproduce this.  It occurs when simply removing a single VG, e.g.

# vgcreate --devices /dev/sdb foo /dev/sdb
 Volume group "foo" successfully created

# vgs -o all
  Fmt  VG UUID                                VG  Attr   VPerms     Extendable Exported   AutoAct Partial    AllocPol   Clustered  Shared  VSize  VFree  SYS ID System ID LockType VLockArgs Ext   #Ext   Free   MaxLV MaxPV #PV #PV Missing #LV #SN Seq VG Tags VProfile #VMda #VMdaUse VMdaFree  VMdaSize  #VMdaCps 
  lvm2 l5fw99-QJv8-eOfH-w5YR-CIMx-eOXN-vWjxVN foo wz--n- writeable  extendable            enabled            normal                        <2.00t <2.00t                                     4.00m 524287 524287     0     0   1           0   0   0   1                      1        1   508.50k  1020.00k unmanaged

# LVM_COMMAND_PROFILE=lvmdbusd /usr/sbin/lvm vgremove -f --select vg_uuid=l5fw99-QJv8-eOfH-w5YR-CIMx-eOXN-vWjxVN --config 'global/notify_dbus=0 log/command_log_selection="log_context!="'
  {
      "log": [
      ]
  }
Segmentation fault (core dumped)
#

Comment 4 David Teigland 2023-03-06 17:56:33 UTC
It seems to be an effect of combining --select and the profile.  A slightly more direct way to reproduce it is:

vgremove --commandprofile=lvmdbusd --select vg_uuid=<uuid>


#0  0x00007f43a54cb199 in __memset_evex_unaligned_erms () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install device-mapper-event-libs-1.02.187-7.el9.x86_64 device-mapper-libs-1.02.187-7.el9.x86_64 glibc-2.34-60.el9.x86_64 libaio-0.3.111-13.el9.x86_64 libblkid-2.37.4-10.el9.x86_64 libcap-2.48-8.el9.x86_64 libgcc-11.3.1-4.3.el9.x86_64 libgcrypt-1.10.0-9.el9_1.x86_64 libgpg-error-1.42-5.el9.x86_64 libselinux-3.5-0.rc3.1.el9.x86_64 libsepol-3.5-0.rc3.1.el9.x86_64 libzstd-1.5.1-2.el9.x86_64 lz4-libs-1.9.3-5.el9.x86_64 ncurses-libs-6.2-8.20210508.el9.x86_64 readline-8.1-4.el9.x86_64 systemd-libs-252-6.el9.x86_64 xz-libs-5.2.5-8.el9_0.x86_64
(gdb) bt
#0  0x00007f43a54cb199 in __memset_evex_unaligned_erms () from /lib64/libc.so.6
#1  0x00005608fcf027ca in dm_pool_zalloc (p=0x5608fe08c5f0, s=64) at device_mapper/mm/pool.c:79
#2  0x00005608fcef4b79 in _do_report_object (rh=0x5608fe18d6f0, object=0x7ffc27440070, do_output=1, 
    selected=0x0) at device_mapper/libdm-report.c:2017
#3  0x00005608fcef541c in dm_report_object (rh=0x5608fe18d6f0, object=0x7ffc27440070)
    at device_mapper/libdm-report.c:2239
#4  0x00005608fce9b08d in report_cmdlog (handle=0x5608fe18d6f0, type=0x5608fcf6a813 "print", 
    context=0x5608fcf6aa83 "processing", object_type_name=0x5608fcf6aaaa "vg", 
    object_name=0x5608fdf44d20 "test", 
    object_id=0x7ffc27441850 "ZQq4CQ-9Ec6-dIUq-yuBr-kiIF-qtrq-TpfdwU", object_group=0x0, 
    object_group_id=0x0, msg=0x7ffc27440160 "Volume group \"test\" successfully removed", 
    current_errno=0, ret_code=0) at report/report.c:4597
#5  0x00005608fce1701f in _vprint_log (level=4, file=0x5608fcf744b4 "metadata/metadata.c", line=679, 
    dm_errno_or_class=0, format=0x5608fcf74ce8 "Volume group \"%s\" successfully removed", 
    orig_ap=0x7ffc27441648) at log/log.c:607
#6  0x00005608fce178b7 in print_log (level=4, file=0x5608fcf744b4 "metadata/metadata.c", line=679, 
    dm_errno_or_class=0, format=0x5608fcf74ce8 "Volume group \"%s\" successfully removed")
    at log/log.c:776
#7  0x00005608fce4d0a7 in vg_remove_direct (vg=0x5608fe0a4590) at metadata/metadata.c:679
#8  0x00005608fce4d102 in vg_remove (vg=0x5608fe0a4590) at metadata/metadata.c:690
#9  0x00005608fcdb095f in _vgremove_single (cmd=0x5608fdf35fd0, vg_name=0x5608fdf44d20 "test", 
    vg=0x5608fe0a4590, handle=0x5608fdf44d28) at vgremove.c:83
#10 0x00005608fcd9a068 in _process_vgnameid_list (cmd=0x5608fdf35fd0, read_flags=1310720, 
    vgnameids_to_process=0x7ffc27441950, arg_vgnames=0x7ffc27441970, arg_tags=0x7ffc27441980, 
    handle=0x5608fdf44d28, process_single_vg=0x5608fcdb06a0 <_vgremove_single>) at toollib.c:2216
#11 0x00005608fcd9acbc in process_each_vg (cmd=0x5608fdf35fd0, argc=0, argv=0x7ffc27441c98, 
    one_vgname=0x0, use_vgnames=0x0, read_flags=1310720, include_internal=0, handle=0x5608fdf44d28, 
    process_single_vg=0x5608fcdb06a0 <_vgremove_single>) at toollib.c:2526
#12 0x00005608fcdb0ac3 in vgremove (cmd=0x5608fdf35fd0, argc=0, argv=0x7ffc27441c98)
    at vgremove.c:109
#13 0x00005608fcd72038 in lvm_run_command (cmd=0x5608fdf35fd0, argc=0, argv=0x7ffc27441c98)
    at lvmcmdline.c:3317
#14 0x00005608fcd737d4 in lvm2_main (argc=6, argv=0x7ffc27441c68) at lvmcmdline.c:3847
#15 0x00005608fcdb4260 in main (argc=6, argv=0x7ffc27441c68) at lvm.c:23

Comment 5 Peter Rajnoha 2023-03-07 14:38:31 UTC
The issue here was that the command log report was destroyed prematurely when -S|--select was used together with log/command_log_report=1 for a non-reporting command: the vgremove mentioned here, but also other non-reporting commands that accept -S|--select, such as pvremove, lvremove, pvchange, vgchange and lvchange.

Fixed with: https://sourceware.org/git/?p=lvm2.git;a=commit;h=cd14d3fcc0e03136d0cea1ab1a9edff3b8b9dbeb

Comment 6 Fedora Release Engineering 2023-08-16 08:14:54 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 39 development cycle.
Changing version to 39.