Bug 1533932

Summary: Locking error when merging thin snapshot
Product: Red Hat Enterprise Linux 7
Component: lvm2
Sub Component: Thin Provisioning
Version: 7.5
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Keywords: Regression
Target Milestone: rc
Flags: rhandlin: needinfo+
Reporter: Roman Bednář <rbednar>
Assignee: Zdenek Kabelac <zkabelac>
QA Contact: cluster-qe <cluster-qe>
CC: agk, heinzm, jbrassow, mcsontos, msnitzer, prajnoha, prockai, rhandlin, thornber, zkabelac
Fixed In Version: lvm2-2.02.177-1.el7
Last Closed: 2018-04-10 15:23:49 UTC
Type: Bug

Description Roman Bednář 2018-01-12 15:01:27 UTC
Not 100% reproducible, possible race. 

=============================
SCENARIO - [invalidated_thin_snap_merge]
 Create "invalidated" (full) thin snapshots and then verify that merge attempts will not cause problem
 Making pool volume
 lvcreate --activate ey --thinpool POOL -L 500M  --zero n --poolmetadatasize 4M snapper_thinp
 
 Sanity checking pool device (POOL) metadata
 thin_check /dev/mapper/snapper_thinp-meta_swap.601
 examining superblock
 examining devices tree
 examining mapping tree
 checking space map counts
 
 
 Making origin volume
 lvcreate --activate ey --virtualsize 100M -T snapper_thinp/POOL -n origin
 lvcreate --activate ey --virtualsize 100M -T snapper_thinp/POOL -n other1
 lvcreate --activate ey -V 100M -T snapper_thinp/POOL -n other2
 lvcreate --activate ey -V 100M -T snapper_thinp/POOL -n other3
 lvcreate --activate ey --virtualsize 100M -T snapper_thinp/POOL -n other4
 lvcreate --activate ey -V 100M -T snapper_thinp/POOL -n other5
   WARNING: Sum of all thin volume sizes (600.00 MiB) exceeds the size of thin pool snapper_thinp/POOL (500.00 MiB).
 
 lvcreate --activate ey -y -k n -s /dev/snapper_thinp/origin -n invalid1
 Filling snapshot /dev/snapper_thinp/invalid1
 dd if=/dev/zero of=/dev/snapper_thinp/invalid1 bs=1M count=101
 dd: error writing ‘/dev/snapper_thinp/invalid1’: No space left on device
 101+0 records in
 100+0 records out
 104857600 bytes (105 MB) copied, 0.307725 s, 341 MB/s
 Attempt to merge back an invalidated snapshot volume
 lvconvert --merge /dev/snapper_thinp/invalid1 --yes
   Error locking on node UNKNOWN 1: Refusing activation of partial LV snapper_thinp/origin.  Use '--activationmode partial' to override.
   Failed to reactivate origin snapper_thinp/origin.
 couldn't merge invalidated snap


=============================
lvm2-2.02.176-5.el7.x86_64
kernel-3.10.0-826.el7.x86_64
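
Because the failure is intermittent, the merge step from the scenario above can be repeated until it fails. The following is only a minimal sketch, not the QA harness itself. It assumes the volume group snapper_thinp, the pool POOL and the thin volume origin already exist as created in the log above. The variables VG, ITERATIONS and DRY_RUN and the helper functions run and attempt_merge are hypothetical names introduced for illustration. By default it only prints the commands; set DRY_RUN=0 to execute them as root on a scratch clustered VG.

```shell
#!/bin/sh
# Hypothetical reproducer loop. The LVM and dd commands are copied from the
# scenario log; everything else (names, defaults) is an assumption.
VG=${VG:-snapper_thinp}
ITERATIONS=${ITERATIONS:-3}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# One pass of the scenario: snapshot the origin, fill the snapshot, merge it back.
attempt_merge() {
    run lvcreate --activate ey -y -k n -s "/dev/$VG/origin" -n invalid1 || return 2
    # dd is expected to stop with "No space left on device" once the
    # 100M snapshot is full, so its exit status is ignored.
    run dd if=/dev/zero of="/dev/$VG/invalid1" bs=1M count=101 || true
    run lvconvert --merge "/dev/$VG/invalid1" --yes
}

i=1
while [ "$i" -le "$ITERATIONS" ]; do
    if ! attempt_merge; then
        echo "iteration $i failed"
        break
    fi
    i=$((i + 1))
done
```

A larger ITERATIONS value gives the race more chances to trigger. After a successful merge the snapshot takes over the origin name, so each pass can create invalid1 again.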

Comment 2 Zdenek Kabelac 2018-01-12 15:04:55 UTC
The issue seems to pop up as a race on a cluster, where the reactivated origin needs to be activated exclusively.

Reasonably simple fix:


diff --git a/tools/lvconvert.c b/tools/lvconvert.c
index deb7cc909..618d81953 100644
--- a/tools/lvconvert.c
+++ b/tools/lvconvert.c
@@ -2170,7 +2170,7 @@ static int _lvconvert_merge_thin_snapshot(struct cmd_context *cmd,
 		log_print_unless_silent("Volume %s replaced origin %s.",
 					display_lvname(origin), display_lvname(lv));
 
-		if (origin_is_active && !activate_lv(cmd, lv)) {
+		if (origin_is_active && !activate_lv_excl(cmd, lv)) {
 			log_error("Failed to reactivate origin %s.",
 				  display_lvname(lv));
 			return 0;

Comment 4 Zdenek Kabelac 2018-01-17 16:02:56 UTC
I believe this is fixed upstream with the following patch (which also fixes a few other possible locking problems for lvconvert and stacking).


https://www.redhat.com/archives/lvm-devel/2018-January/msg00049.html

Comment 11 errata-xmlrpc 2018-04-10 15:23:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0853
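
To confirm that a host already carries the fix, the installed lvm2 package can be compared against the Fixed In Version recorded for this bug (lvm2-2.02.177-1.el7). The following is a rough sketch under stated assumptions: the helper version_ge is a hypothetical name, and it relies on GNU sort -V, which only approximates RPM's own version ordering.

```shell
#!/bin/sh
# Version taken from the "Fixed In Version" field of this bug.
FIXED="2.02.177-1.el7"

# Hypothetical helper: succeeds when $1 sorts at or after $2 under
# GNU "sort -V". This approximates, but is not identical to, RPM ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v rpm >/dev/null 2>&1; then
    if installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}' lvm2 2>/dev/null); then
        if version_ge "$installed" "$FIXED"; then
            echo "lvm2 $installed includes the fix"
        else
            echo "lvm2 $installed predates $FIXED"
        fi
    else
        echo "lvm2 is not installed"
    fi
else
    echo "rpm not available on this host"
fi
```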