Bug 1481403

Summary: [RFE] improve merge has already been initiated error
Product: Red Hat Enterprise Linux 7
Reporter: Corey Marthaler <cmarthal>
Component: lvm2
lvm2 sub component: Snapshots
Assignee: Zdenek Kabelac <zkabelac>
QA Contact: cluster-qe <cluster-qe>
Status: CLOSED WONTFIX
Docs Contact:
Severity: low
Priority: unspecified
CC: agk, heinzm, jbrassow, msnitzer, prajnoha, zkabelac
Version: 7.4
Keywords: FutureFeature
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1899193
Environment:
Last Closed: 2020-11-18 17:22:04 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1899193

Description Corey Marthaler 2017-08-14 20:03:08 UTC
Description of problem:
Since the "lv_is_merging_cow" attribute is already known at this point, could the error be something more descriptive, like "Merge attempt already initiated for cache_sanity/merge3"?

[root@host-080 ~]# lvconvert --yes --merge cache_sanity/merge3
  Merging of snapshot cache_sanity/merge3 will occur on next activation of cache_sanity/corigin.

[root@host-080 ~]# lvconvert --yes --merge cache_sanity/merge3
  Command on LV cache_sanity/merge3 does not accept LV with properties: lv_is_merging_cow .
  Command not permitted on LV cache_sanity/merge3.
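
For illustration, here is a minimal sketch of the special-case check being asked for (the function name and its exact hook point in the command-rule checking are assumptions on my part; lv_is_merging_cow(), display_lvname() and log_error() are existing lvm2 helpers):

  /* Hypothetical check, run before the generic "does not accept LV with
   * properties" rule rejects the LV. */
  static int _reject_repeated_merge(const struct logical_volume *lv)
  {
          if (lv_is_merging_cow(lv)) {
                  /* Name the LV and say the merge is already pending. */
                  log_error("Merge attempt already initiated for %s.",
                            display_lvname(lv));
                  return 0;
          }

          return 1;       /* Fine to go on and schedule the merge. */
  }

Something along those lines would turn the generic property-rule failure above into a message that actually tells the user the merge is already pending.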

Oddly, if the VG is then activated and this exact command is re-attempted quickly enough, you apparently end up hitting bug 1481383.

[root@host-080 ~]# vgchange -ay cache_sanity
  3 logical volume(s) in volume group "cache_sanity" now active
  Background polling started for 1 logical volume(s) in volume group "cache_sanity"
[root@host-080 ~]# lvconvert --yes --merge cache_sanity/merge3
[Deadlock]


Aug 14 14:59:39 host-080 lvmpolld: W: #011LVPOLL: PID 16865: STDERR: '  Internal error: Performing unsafe table load while 15 device(s) are known to be suspended:  (253:8) '
Aug 14 15:00:01 host-080 systemd: Started Session 1190 of user root.
Aug 14 15:00:01 host-080 systemd: Starting Session 1190 of user root.
Aug 14 15:00:39 host-080 lvmpolld: W: LVMPOLLD: polling for output of the lvm cmd (PID 16865) has timed out
Aug 14 15:01:02 host-080 systemd: Started Session 1191 of user root.
Aug 14 15:01:02 host-080 systemd: Starting Session 1191 of user root.
Aug 14 15:01:39 host-080 lvmpolld: W: LVMPOLLD: polling for output of the lvm cmd (PID 16865) has timed out
[607440.178194] INFO: task kworker/0:5:16864 blocked for more than 120 seconds.
[607440.180021] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[607440.181996] kworker/0:5     D ffff88001cd4d714     0 16864      2 0x00000080
[607440.183849] Workqueue: ksnaphd do_metadata [dm_snapshot]
[607440.185183]  ffff8800380a3c10 0000000000000046 ffff88002345cf10 ffff8800380a3fd8
[607440.187156]  ffff8800380a3fd8 ffff8800380a3fd8 ffff88002345cf10 ffff88003fc16cc0
[607440.189157]  0000000000000000 7fffffffffffffff ffff88002345cf10 ffff88002a13b6a0
[607440.191111] Call Trace:



Version-Release number of selected component (if applicable):
3.10.0-686.el7.x86_64

lvm2-2.02.171-8.el7    BUILT: Wed Jun 28 13:28:58 CDT 2017
lvm2-libs-2.02.171-8.el7    BUILT: Wed Jun 28 13:28:58 CDT 2017
lvm2-cluster-2.02.171-8.el7    BUILT: Wed Jun 28 13:28:58 CDT 2017
device-mapper-1.02.140-8.el7    BUILT: Wed Jun 28 13:28:58 CDT 2017
device-mapper-libs-1.02.140-8.el7    BUILT: Wed Jun 28 13:28:58 CDT 2017
device-mapper-event-1.02.140-8.el7    BUILT: Wed Jun 28 13:28:58 CDT 2017
device-mapper-event-libs-1.02.140-8.el7    BUILT: Wed Jun 28 13:28:58 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017

Comment 2 Zdenek Kabelac 2017-08-22 13:11:19 UTC
This is not related to the actual old-snapshot merging code - the bug is in the handling of stacked devices, where certain properties propagate through the device stack.

Comment 3 Zdenek Kabelac 2020-11-18 17:22:04 UTC
Cloned as bug 1899193 for later investigation - this is not going to happen in RHEL 7.