Bug 1796524

Summary: pvmove segfault when attempting to move device below writecache internal cwpool_cvol volume
Product: Red Hat Enterprise Linux 8
Reporter: Corey Marthaler <cmarthal>
Component: lvm2
Assignee: LVM and device-mapper development team <lvm-team>
lvm2 sub component: Cache Logical Volumes
QA Contact: cluster-qe <cluster-qe>
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
CC: agk, heinzm, jbrassow, mcsontos, msnitzer, pasik, prajnoha, teigland, zkabelac
Version: 8.2
Flags: pm-rhel: mirror+
Target Milestone: rc
Target Release: 8.0
Hardware: x86_64
OS: Linux
Fixed In Version: lvm2-2.03.08-1.el8
Doc Type: If docs needed, set a value
Last Closed: 2020-04-28 16:58:59 UTC
Type: Bug

Description Corey Marthaler 2020-01-30 16:13:24 UTC
Description of problem:

[root@hayes-02 ~]# lvs -a -o +devices | grep /dev/sdn2
  [cworigin_wcorig]   pv_shuffle_A owi-aoC---  752.00m                                                                         /dev/sdn2(3503)                                                            
  [cwpool_cvol]       pv_shuffle_A Cwi-aoC---  500.00m                                                                         /dev/sdn2(3691)                                                            

# pvmove of the internal writecache origin sub-LV (cworigin_wcorig) succeeds:
[root@hayes-02 ~]# pvmove -n pv_shuffle_B/cworigin_wcorig -v /dev/sdm1 /dev/sdb2
  Archiving volume group "pv_shuffle_B" metadata (seqno 58).
  Creating logical volume pvmove0
  activation/volume_list configuration setting not defined: Checking only host tags for pv_shuffle_B/cworigin.
  Moving 188 extents of logical volume pv_shuffle_B/cworigin_wcorig.
  activation/volume_list configuration setting not defined: Checking only host tags for pv_shuffle_B/cworigin.
  Loading table for pv_shuffle_B-cwpool_cvol (253:75).
  Suppressed pv_shuffle_B-cwpool_cvol (253:75) identical table reload.
  Creating pv_shuffle_B-pvmove0
  Loading table for pv_shuffle_B-pvmove0 (253:39).
  Loading table for pv_shuffle_B-cworigin_wcorig (253:76).
  Loading table for pv_shuffle_B-cworigin (253:77).
  Suppressed pv_shuffle_B-cworigin (253:77) identical table reload.
  Suspending pv_shuffle_B-cworigin (253:77) with device flush
  Suspending pv_shuffle_B-cwpool_cvol (253:75) with device flush
  Suspending pv_shuffle_B-cworigin_wcorig (253:76) with device flush
  Loading table for pv_shuffle_B-cwpool_cvol (253:75).
  Suppressed pv_shuffle_B-cwpool_cvol (253:75) identical table reload.
  Loading table for pv_shuffle_B-cworigin (253:77).
  Suppressed pv_shuffle_B-cworigin (253:77) identical table reload.
  Resuming pv_shuffle_B-pvmove0 (253:39).
  Resuming pv_shuffle_B-cwpool_cvol (253:75).
  Resuming pv_shuffle_B-cworigin_wcorig (253:76).
  Resuming pv_shuffle_B-cworigin (253:77).
  Creating volume group backup "/etc/lvm/backup/pv_shuffle_B" (seqno 59).
  activation/volume_list configuration setting not defined: Checking only host tags for pv_shuffle_B/pvmove0.
  Checking progress before waiting every 15 seconds.
  /dev/sdm1: Moved: 0.53%
  /dev/sdm1: Moved: 100.00%
  Polling finished successfully.

# pvmove of the internal writecache cachevol (cwpool_cvol) segfaults:
[root@hayes-02 ~]# pvmove -n pv_shuffle_B/cwpool_cvol -v /dev/sdm1 /dev/sdb2
  Archiving volume group "pv_shuffle_B" metadata (seqno 61).
  Creating logical volume pvmove0
  activation/volume_list configuration setting not defined: Checking only host tags for pv_shuffle_B/cworigin.
  Moving 125 extents of logical volume pv_shuffle_B/cwpool_cvol.
  activation/volume_list configuration setting not defined: Checking only host tags for pv_shuffle_B/cworigin.
Segmentation fault (core dumped)

[root@hayes-02 ~]# lvs -a -o +devices pv_shuffle_B
  LV                  VG           Attr       LSize    Pool          Origin            Data%  Meta%  Move Log Cpy%Sync Convert Devices                                                                    
  [...]
  cworigin            pv_shuffle_B Cwi-aoC---  752.00m [cwpool_cvol] [cworigin_wcorig]                                         cworigin_wcorig(0)                                                         
  [cworigin_wcorig]   pv_shuffle_B owi-aoC---  752.00m                                                                         /dev/sdb2(685)                                                             
  [cwpool_cvol]       pv_shuffle_B Cwi-aoC---  500.00m                                                                         /dev/sdm1(871)                                                             

Jan 30 10:05:22 hayes-02 kernel: traps: pvmove[30462] general protection fault ip:55cc9150b6e7 sp:7fff2d1633b0 error:0 in lvm[55cc91423000+21d000]
Jan 30 10:05:22 hayes-02 systemd[1]: Created slice system-systemd\x2dcoredump.slice.
Jan 30 10:05:22 hayes-02 systemd[1]: Started Process Core Dump (PID 30469/UID 0).
Jan 30 10:05:22 hayes-02 systemd-coredump[30471]: Process 30462 (pvmove) of user 0 dumped core.#012#012Stack trace of thread 30462:#012#0  0x000055cc9150b6e7 lv_on_pmem (lvm)#012#1  0x000055cc91554f27 _writecache_add_target_line (lvm)#012#2  0x000055cc9154262f _add_target_to_dtree (lvm)#012#3  0x000055cc91545663 _add_new_lv_to_dtree (lvm)#012#4  0x000055cc91547fc7 _tree_action (lvm)#012#5  0x000055cc9154ab3a dev_manager_preload (lvm)#012#6  0x000055cc914a9f71 _lv_preload (lvm)#012#7  0x000055cc914aedb4 lv_suspend_if_active (lvm)#012#8  0x000055cc914eaa95 _lv_update_and_reload (lvm)#012#9  0x000055cc91487675 _pvmove_setup_single (lvm)#012#10 0x000055cc9149ba1a process_each_pv (lvm)#012#11 0x000055cc91488b0d pvmove (lvm)#012#12 0x000055cc9147ab1d lvm_run_command (lvm)#012#13 0x000055cc9147be43 lvm2_main (lvm)#012#14 0x00007fae97f9f6a3 __libc_start_main (libc.so.6)#012#15 0x000055cc91457c3e _start (lvm)
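
The systemd-coredump line above carries its stack trace on a single line, with `#012` standing in for each newline (an octal escape for the control character, as syslog daemons commonly write it). A minimal sketch for making it readable follows; the helper names `decode_syslog_escapes` and `stack_functions` are hypothetical and not part of lvm2 or systemd, and the frame layout `#N  0xADDR func (module)` is assumed from the line in this report.

```python
import re


def decode_syslog_escapes(line):
    """Replace '#ooo' octal escapes (e.g. '#012' for newline) with the real character.

    Caveat: a frame marker with three octal digits (e.g. '#100') would be
    misread as an escape; the trace in this report stops at frame #15.
    """
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), line)


def stack_functions(line):
    """Return the function names of the stack frames, innermost first."""
    decoded = decode_syslog_escapes(line)
    # Frames look like: "#0  0x000055cc9150b6e7 lv_on_pmem (lvm)"
    return re.findall(
        r"^#\d+\s+0x[0-9a-f]+\s+(\S+)\s+\(", decoded, flags=re.MULTILINE
    )


if __name__ == "__main__":
    excerpt = (
        "Process 30462 (pvmove) of user 0 dumped core.#012#012"
        "Stack trace of thread 30462:#012"
        "#0  0x000055cc9150b6e7 lv_on_pmem (lvm)#012"
        "#1  0x000055cc91554f27 _writecache_add_target_line (lvm)"
    )
    print(decode_syslog_escapes(excerpt))
    print(stack_functions(excerpt))
```

Applied to the full log line, this yields the same call chain as the gdb backtrace in comment 1, ending in `lv_on_pmem`.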


Version-Release number of selected component (if applicable):
kernel-4.18.0-173.el8    BUILT: Fri Jan 24 06:02:03 CST 2020
lvm2-2.03.07-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
lvm2-libs-2.03.07-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
lvm2-dbusd-2.03.07-1.el8    BUILT: Mon Dec  2 00:12:23 CST 2019
lvm2-lockd-2.03.07-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
boom-boot-1.0-1.el8    BUILT: Fri Nov 29 05:18:30 CST 2019
device-mapper-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-libs-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-event-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-event-libs-1.02.167-1.el8    BUILT: Mon Dec  2 00:09:32 CST 2019
device-mapper-persistent-data-0.8.5-3.el8    BUILT: Wed Nov 27 07:05:21 CST 2019


How reproducible:
Every time

Comment 1 Corey Marthaler 2020-01-30 21:01:25 UTC
Core was generated by `pvmove -n pv_shuffle_B/cwpool_cvol -v /dev/sdm1 /dev/sdb2'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  lv_on_pmem (lv=0x558bff31c038) at metadata/metadata.c:4459
4459                            if (dev_is_pmem(pv->dev)) {
(gdb) bt
#0  lv_on_pmem (lv=0x558bff31c038) at metadata/metadata.c:4459
#1  0x0000558bfd7e1f27 in _writecache_add_target_line (dm=<optimized out>, mem=0x558bff230120, cmd=<optimized out>, target_state=<optimized out>, seg=0x558bffb6e0e0, laopts=<optimized out>, 
    node=0x558bff1f64c0, len=1540096, pvmove_mirror_count=0x558bffb64078) at writecache/writecache.c:260
#2  0x0000558bfd7cf62f in _add_target_to_dtree (dm=<optimized out>, dnode=<optimized out>, seg=<optimized out>, laopts=<optimized out>) at activate/dev_manager.c:245
#3  0x0000558bfd7d2663 in _add_segment_to_dtree (layer=0x0, laopts=0x7ffd657bfb50, seg=<optimized out>, dnode=0x558bff1f64c0, dtree=0x558bff1f62e0, dm=0x558bffb64060) at activate/dev_manager.c:3104
#4  _add_new_lv_to_dtree (dm=dm@entry=0x558bffb64060, dtree=dtree@entry=0x558bff1f62e0, lv=lv@entry=0x558bff31baf0, laopts=laopts@entry=0x7ffd657bfb50, layer=0x0) at activate/dev_manager.c:3406
#5  0x0000558bfd7d4fc7 in _tree_action (dm=dm@entry=0x558bffb64060, lv=lv@entry=0x558bff31baf0, laopts=laopts@entry=0x7ffd657bfb50, action=action@entry=PRELOAD) at activate/dev_manager.c:3680
#6  0x0000558bfd7d7b3a in dev_manager_preload (dm=dm@entry=0x558bffb64060, lv=lv@entry=0x558bff31baf0, laopts=laopts@entry=0x7ffd657bfb50, flush_required=flush_required@entry=0x7ffd657bfaec)
    at activate/dev_manager.c:3744
#7  0x0000558bfd736f71 in _lv_preload (lv=lv@entry=0x558bff31baf0, laopts=laopts@entry=0x7ffd657bfb50, flush_required=flush_required@entry=0x7ffd657bfaec) at activate/activate.c:1455
#8  0x0000558bfd73bdb4 in _lv_suspend (error_if_not_suspended=0, lvid_s=0x0, lv_pre=0x558bff31baf0, lv=0x558bffb599c0, laopts=0x7ffd657bfb50, cmd=0x558bff0b2600) at activate/activate.c:2143
#9  lv_suspend_if_active (cmd=0x558bff0b2600, lvid_s=lvid_s@entry=0x0, origin_only=origin_only@entry=0, exclusive=exclusive@entry=0, lv=0x558bffb599c0, lv_pre=lv_pre@entry=0x558bff31baf0)
    at activate/activate.c:2276
#10 0x0000558bfd73d3de in suspend_lv (cmd=<optimized out>, lv=lv@entry=0x558bff31baf0) at activate/activate.c:2914
#11 0x0000558bfd777a95 in _lv_update_and_reload (lv=0x558bff31baf0, origin_only=<optimized out>, origin_only@entry=0) at metadata/lv_manip.c:6698
#12 0x0000558bfd780adb in lv_update_and_reload (lv=<optimized out>) at metadata/lv_manip.c:6724
#13 0x0000558bfd714675 in _update_metadata (exclusive=<optimized out>, lvs_changed=<optimized out>, lv_mirr=0x558bffb73090) at pvmove.c:546
#14 _pvmove_setup_single (cmd=cmd@entry=0x558bff0b2600, vg=vg@entry=0x558bff31a160, pv=pv@entry=0x558bff31ab00, handle=handle@entry=0x558bff1a9380) at pvmove.c:703
#15 0x0000558bfd728a1a in _process_pvs_in_vg (process_all_devices=0, process_single_pv=0x558bfd7142d0 <_pvmove_setup_single>, handle=0x558bff1a9380, error_flags=0, skip=0, process_all_pvs=0, 
    arg_tags=0x7ffd657bfe40, arg_devices=0x7ffd657bfe60, all_devices=0x7ffd657bfe80, vg=0x558bff31a160, cmd=0x558bff0b2600) at toollib.c:4275
#16 _process_pvs_in_vgs (process_all_devices=0, process_single_pv=0x558bfd7142d0 <_pvmove_setup_single>, handle=0x558bff1a9380, process_all_pvs=0, arg_tags=0x7ffd657bfe40, arg_devices=0x7ffd657bfe60, 
    all_devices=0x7ffd657bfe80, all_vgnameids=0x7ffd657bfe70, read_flags=1310720, cmd=0x558bff0b2600) at toollib.c:4397
#17 process_each_pv (cmd=cmd@entry=0x558bff0b2600, argc=argc@entry=1, argv=argv@entry=0x7ffd657c02e0, only_this_vgname=only_this_vgname@entry=0x0, all_is_set=all_is_set@entry=0, read_flags=1310720, 
    read_flags@entry=1048576, handle=<optimized out>, process_single_pv=<optimized out>) at toollib.c:4526
#18 0x0000558bfd715b0d in pvmove (cmd=0x558bff0b2600, argc=<optimized out>, argv=0x7ffd657c06a0) at pvmove.c:877
#19 0x0000558bfd707b1d in lvm_run_command (cmd=cmd@entry=0x558bff0b2600, argc=<optimized out>, argc@entry=6, argv=<optimized out>, argv@entry=0x7ffd657c0678) at lvmcmdline.c:3118
#20 0x0000558bfd708e43 in lvm2_main (argc=6, argv=0x7ffd657c0678) at lvmcmdline.c:3648
#21 0x00007f7cb49996a3 in __libc_start_main (main=0x558bfd6e4b70 <main>, argc=6, argv=0x7ffd657c0678, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd657c0668)
    at ../csu/libc-start.c:308
#22 0x0000558bfd6e4c3e in _start () at lvm.c:22

Comment 2 David Teigland 2020-02-03 22:01:48 UTC
Fixed in master; with this commit, pvmove rejects a device used for writecache instead of crashing:
https://sourceware.org/git/?p=lvm2.git;a=commit;h=c1ee6f0eef24a44cc02ec941f560bc17ac61b3d8

# lvs -a foo -o+devices
  LV            VG  Attr       LSize  Pool        Origin        Data%  Meta%  Move Log Cpy%Sync Convert Devices       
  [fast_cvol]   foo Cwi-aoC--- 32.00m                                                                   /dev/sdg(0)   
  main          foo Cwi-a-C--- 64.00m [fast_cvol] [main_wcorig] 0.00                                    main_wcorig(0)
  [main_wcorig] foo owi-aoC--- 64.00m                                                                   /dev/sdd(0)   

# pvmove /dev/sdg /dev/sde
  Unable to pvmove device used for writecache.

# pvmove -n foo/fast_cvol /dev/sdg /dev/sde
  Unable to pvmove device used for writecache.
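
On builds without this fix, a script can check in advance whether a PV holds a cachevol before calling pvmove. The sketch below parses `lvs -a -o +devices` output of the kind shown in this report. It is a heuristic under stated assumptions: the function name `cachevols_on_pv` is hypothetical, cachevols are recognized only by the hidden-LV brackets and the `_cvol` suffix seen here, and the suffix alone does not prove the volume is used by writecache.

```python
import re

# Sample copied from the lvs output in this comment.
SAMPLE_LVS = """\
  LV            VG  Attr       LSize  Pool        Origin        Data%  Meta%  Move Log Cpy%Sync Convert Devices
  [fast_cvol]   foo Cwi-aoC--- 32.00m                                                                   /dev/sdg(0)
  main          foo Cwi-a-C--- 64.00m [fast_cvol] [main_wcorig] 0.00                                    main_wcorig(0)
  [main_wcorig] foo owi-aoC--- 64.00m                                                                   /dev/sdd(0)
"""


def cachevols_on_pv(lvs_output, pv):
    """Return (vg, lv) pairs for hidden '*_cvol' LVs that have a segment on `pv`."""
    hits = []
    for line in lvs_output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        # First column is the LV, second the VG, last the Devices column.
        lv, vg, devices = fields[0], fields[1], fields[-1]
        if not (lv.startswith("[") and lv.endswith("_cvol]")):
            continue
        # Devices look like "/dev/sdg(0)"; drop the "(start extent)" part.
        devs = {re.sub(r"\(\d+\)$", "", d) for d in devices.split(",")}
        if pv in devs:
            hits.append((vg, lv.strip("[]")))
    return hits


if __name__ == "__main__":
    print(cachevols_on_pv(SAMPLE_LVS, "/dev/sdg"))
    print(cachevols_on_pv(SAMPLE_LVS, "/dev/sdd"))
```

A non-empty result for the source PV means pvmove would touch a cachevol, which is the case that crashed in 2.03.07 and is refused after the fix.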

Comment 4 Corey Marthaler 2020-02-21 17:14:39 UTC
Fix verified in the latest rpms.

kernel-4.18.0-179.el8    BUILT: Fri Feb 14 17:03:01 CST 2020
lvm2-2.03.08-1.el8    BUILT: Tue Feb 11 07:40:33 CST 2020
lvm2-libs-2.03.08-1.el8    BUILT: Tue Feb 11 07:40:33 CST 2020
lvm2-dbusd-2.03.08-1.el8    BUILT: Tue Feb 11 07:42:51 CST 2020
lvm2-lockd-2.03.08-1.el8    BUILT: Tue Feb 11 07:40:33 CST 2020
device-mapper-1.02.169-1.el8    BUILT: Tue Feb 11 07:40:33 CST 2020
device-mapper-libs-1.02.169-1.el8    BUILT: Tue Feb 11 07:40:33 CST 2020
device-mapper-event-1.02.169-1.el8    BUILT: Tue Feb 11 07:40:33 CST 2020
device-mapper-event-libs-1.02.169-1.el8    BUILT: Tue Feb 11 07:40:33 CST 2020



[root@hayes-02 ~]# lvs -a -o +devices
  LV                     VG                Attr       LSize Pool                 Origin                 Data%  Meta%  Move Log Cpy%Sync Convert Devices                
  rename_orig_A          writecache_sanity Cwi-a-C--- 4.00g [rename_pool_A_cvol] [rename_orig_A_wcorig] 0.00                                    rename_orig_A_wcorig(0)
  [rename_orig_A_wcorig] writecache_sanity owi-aoC--- 4.00g                                                                                     /dev/sdf1(0)           
  [rename_pool_A_cvol]   writecache_sanity Cwi-aoC--- 4.00g                                                                                     /dev/sdd1(0)           

# Origin can still be pvmoved
[root@hayes-02 ~]# pvmove -n writecache_sanity/rename_orig_A_wcorig -v /dev/sdf1 /dev/sde1
  Archiving volume group "writecache_sanity" metadata (seqno 31).
  Creating logical volume pvmove0
  activation/volume_list configuration setting not defined: Checking only host tags for writecache_sanity/rename_orig_A.
  Moving 1024 extents of logical volume writecache_sanity/rename_orig_A_wcorig.
  activation/volume_list configuration setting not defined: Checking only host tags for writecache_sanity/rename_orig_A.
  Loading table for writecache_sanity-rename_pool_A_cvol (253:0).
  Suppressed writecache_sanity-rename_pool_A_cvol (253:0) identical table reload.
  Creating writecache_sanity-pvmove0
  Loading table for writecache_sanity-pvmove0 (253:3).
  Loading table for writecache_sanity-rename_orig_A_wcorig (253:1).
  Loading table for writecache_sanity-rename_orig_A (253:2).
  Suppressed writecache_sanity-rename_orig_A (253:2) identical table reload.
  Suspending writecache_sanity-rename_orig_A (253:2) with device flush
  Suspending writecache_sanity-rename_pool_A_cvol (253:0) with device flush
  Suspending writecache_sanity-rename_orig_A_wcorig (253:1) with device flush
  Loading table for writecache_sanity-rename_pool_A_cvol (253:0).
  Suppressed writecache_sanity-rename_pool_A_cvol (253:0) identical table reload.
  Loading table for writecache_sanity-rename_orig_A (253:2).
  Suppressed writecache_sanity-rename_orig_A (253:2) identical table reload.
  Resuming writecache_sanity-pvmove0 (253:3).
  Resuming writecache_sanity-rename_pool_A_cvol (253:0).
  Resuming writecache_sanity-rename_orig_A_wcorig (253:1).
  Resuming writecache_sanity-rename_orig_A (253:2).
  Creating volume group backup "/etc/lvm/backup/writecache_sanity" (seqno 32).
  activation/volume_list configuration setting not defined: Checking only host tags for writecache_sanity/pvmove0.
  Checking progress before waiting every 15 seconds.
  /dev/sdf1: Moved: 1.17%
  /dev/sdf1: Moved: 31.93%
  /dev/sdf1: Moved: 62.79%
  /dev/sdf1: Moved: 93.65%
  /dev/sdf1: Moved: 100.00%
  Polling finished successfully.

[root@hayes-02 ~]# lvs -a -o +devices
  LV                     VG                Attr       LSize Pool                 Origin                 Data%  Meta%  Move Log Cpy%Sync Convert Devices                
  rename_orig_A          writecache_sanity Cwi-a-C--- 4.00g [rename_pool_A_cvol] [rename_orig_A_wcorig] 0.00                                    rename_orig_A_wcorig(0)
  [rename_orig_A_wcorig] writecache_sanity owi-aoC--- 4.00g                                                                                     /dev/sde1(0)           
  [rename_pool_A_cvol]   writecache_sanity Cwi-aoC--- 4.00g                                                                                     /dev/sdd1(0)           

# The writecache pool (cachevol) cannot be pvmoved
[root@hayes-02 ~]# pvmove -n writecache_sanity/rename_pool_A_cvol -v /dev/sdd1 /dev/sdf1
  Archiving volume group "writecache_sanity" metadata (seqno 34).
  Creating logical volume pvmove0
  Unable to pvmove device used for writecache.

[root@hayes-02 ~]# pvmove -v /dev/sdd1 /dev/sdf1
  Archiving volume group "writecache_sanity" metadata (seqno 34).
  Creating logical volume pvmove0
  Unable to pvmove device used for writecache.
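
The three outcomes seen in this report (successful move, crash, clean refusal) can be told apart mechanically in a regression test. The sketch below is only an illustration: `classify_pvmove` is a hypothetical helper, the message strings are copied from the transcripts above, and the return code convention (negative when the child is killed by a signal) is that of Python's `subprocess` module, not of pvmove itself.

```python
def classify_pvmove(output, returncode):
    """Classify a pvmove run from its combined output and exit status.

    Returns one of "refused", "crashed", "moved", or "unknown".
    """
    if "Unable to pvmove device used for writecache." in output:
        return "refused"
    # "Segmentation fault (core dumped)" is printed by the shell, so a
    # captured run may only show the signal through a negative return code.
    if returncode < 0 or "Segmentation fault" in output:
        return "crashed"
    if returncode == 0 and "Polling finished successfully." in output:
        return "moved"
    return "unknown"


if __name__ == "__main__":
    print(classify_pvmove("  Unable to pvmove device used for writecache.\n", 5))
    print(classify_pvmove("", -11))
    print(classify_pvmove("  Polling finished successfully.\n", 0))
```

With the fixed packages, the cachevol cases above would be expected to classify as "refused" and the origin case as "moved".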

Comment 6 errata-xmlrpc 2020-04-28 16:58:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:1881