Bug 757140

Summary: LVM fails to remove snapshot
Product: [Fedora] Fedora
Reporter: eduardo.perezesteban
Component: lvm2
Assignee: LVM and device-mapper development team <lvm-team>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 15
CC: agk, bmarzins, bmr, dwysocha, heinzm, jonathan, lvm-team, mbroz, msnitzer, prajnoha, prockai, zkabelac
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2012-02-24 14:20:08 UTC

Description eduardo.perezesteban 2011-11-25 14:59:35 UTC
Description of problem:
I use LVM snapshots with rsnapshot for my backups; everything worked flawlessly until I changed how my root volume is referenced in the /etc/fstab file.

It works with:
  /dev/mapper/vg_disk1-lv_fc15   / ext4  defaults,noatime 1 1

But fails with:
  LABEL=ROOT / ext4  defaults,noatime 1 1

Error message is:
  Can't remove open logical volume "..."

Version-Release number of selected component (if applicable):
  LVM version:     2.02.84(2) (2011-02-09)
  Library version: 1.02.63 (2011-02-09)
  Driver version:  4.21.0

How reproducible:
Almost always fails when LABEL is used; always works when the device name is used.

Steps to Reproduce:
1. Use LABEL in /etc/fstab for the root volume.
2. Create a snapshot of the root volume.
3. Remove the snapshot.
  
Actual results:
  Can't remove open logical volume "..."

Expected results:
 Logical volume "..." successfully removed

Additional info:

Comment 1 Alasdair Kergon 2011-11-25 15:09:45 UTC
So you have 2 devices both with the same label now.
Which one gets mounted?

Comment 2 eduardo.perezesteban 2011-11-25 17:08:39 UTC
Sorry, no: I have one device (/dev/mapper/vg_disk1-lv_fc15) labeled as "ROOT", and there is no other device labeled as "ROOT"; I use either "LABEL=.." or "/dev/..." in fstab, but not both at the same time.

When I use "/dev/...", LVM creates and removes snapshots successfully; when I use "LABEL=...", LVM fails to remove snapshots.

Comment 3 Alasdair Kergon 2011-11-25 17:37:31 UTC
You have two devices - the origin and its snapshot.  Or do you change the label on the snapshot immediately after you create it?

Comment 4 eduardo.perezesteban 2011-11-25 19:17:16 UTC
Oh, I see what you mean now; sorry.

Only the original device exists when the computer boots up, so there is no room for confusion here. Then I create, mount and unmount the snapshot by hand (there is no entry for the snapshot in fstab), using only the device name (not the label, so there is no confusion here either).

Comment 5 Alasdair Kergon 2011-11-25 19:53:28 UTC
LVM does not use /etc/fstab.
So there's no obvious answer here.
You'll need to trace the whole sequence of commands you're running (and possibly other system components that might interfere like udevd).

Do 2 sets of traces - when it works, and then repeat when it doesn't work - and compare the two to see what is done differently.

Maybe also capture /dev in both cases and compare them to see if all the symlinks are the same (does the label lead to symlinks?), and check relevant bits of /proc and sysfs.

There are various ways to trace.  For lvm, you can add -vvvv to get more verbose messages from it.  strace, ltrace and others are also options.
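
One way to do the /dev comparison suggested above is sketched below. The helper name capture_links and the output file names are hypothetical and not part of LVM; the sketch only assumes find, sort, readlink and diff are available.

```shell
# Hypothetical helper: print every symlink under a directory as
# "link -> target", sorted so that two captures can be compared with diff.
capture_links() {
    dir=$1
    find "$dir" -type l | sort | while IFS= read -r link; do
        printf '%s -> %s\n' "$link" "$(readlink "$link")"
    done
}

# Example use (as root), once in the working and once in the failing setup:
#   capture_links /dev > dev-working.txt
#   lvremove -vvvv <the snapshot> > lvremove-working.log 2>&1
# Then compare the two captures:
#   diff dev-working.txt dev-failing.txt
```

A label that shows up as an extra symlink in only one of the two captures would be a hint about which component is touching the snapshot.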

Comment 6 Alasdair Kergon 2011-11-25 19:56:22 UTC
Also, see if changing the label on the snapshot immediately after creating it makes any difference.

Comment 7 eduardo.perezesteban 2011-11-26 22:13:09 UTC
Well, it looks like this is not related to fstab after all: I reverted to the old configuration, and it is failing again. I'm reconsidering what I may have changed that triggered this, because this has been working for ages; the next suspect on my list is that I recently changed /tmp to a tmpfs.

Comment 8 Peter Rajnoha 2012-02-02 16:09:46 UTC
This looks like udev interaction again. Are you able to remove the snapshot with udev stopped temporarily? Please try this:

  systemctl stop udev.service udev-control.socket udev-kernel.socket
  lvremove <the snapshot>
  systemctl start udev.service udev-control.socket udev-kernel.socket
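
The same stop/remove/start sequence can be wrapped so that udev is started again even when the removal fails. This is only a sketch: the function name with_udev_stopped and the SYSTEMCTL override are hypothetical, and the unit names are simply the ones quoted above, which may differ on other releases.

```shell
# Unit names as quoted above for Fedora 15; adjust if they differ.
UDEV_UNITS="udev.service udev-control.socket udev-kernel.socket"
# SYSTEMCTL can be overridden, e.g. to try the wrapper without systemd.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Hypothetical wrapper: stop udev, run the given command, then always
# start udev again and return the command's exit status.
with_udev_stopped() {
    # $UDEV_UNITS is left unquoted on purpose so it splits into three names.
    "$SYSTEMCTL" stop $UDEV_UNITS || return 1
    "$@"
    status=$?
    "$SYSTEMCTL" start $UDEV_UNITS
    return "$status"
}

# Example use (as root):
#   with_udev_stopped lvremove <the snapshot>
```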

My guess is this is a dup of bug #712100.

Comment 10 eduardo.perezesteban 2012-02-23 17:39:47 UTC
I tried commenting out the line in /lib/udev/rules.d/80-udisks.rules, as explained in bug #712100; the situation improved, but the problem still happened now and then. Later I upgraded to Fedora 16 and re-enabled the rule in the udev config file; I have not detected any issue since then. Thanks.

Comment 11 Peter Rajnoha 2012-02-24 14:20:08 UTC
(In reply to comment #10)

OK, I'm marking this one as a dup of bug #712100, as it's a udev-interaction problem with 99% certainty. If you encounter this problem again in F16, there is also a clone of the former bug for F16, bug #753105, but the update hasn't been done yet (this will probably end up as CLOSED NEXTRELEASE).

F17 will have all the retry-remove logic directly in libdevmapper/lvm2, which should prevent this problem from occurring (the problem is that we can't synchronize with synthesized events, as already noted in those previous bug reports).
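
Until that logic is available, a similar effect can be approximated from a script by retrying the removal a few times. The sketch below is a userspace workaround, not the libdevmapper/lvm2 implementation; the function name retry and the attempt count and delay are hypothetical choices.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts; succeeds as soon as one attempt succeeds.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        "$@" && return 0
        [ "$n" -ge "$attempts" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example use (as root); -f skips the confirmation prompt:
#   retry 5 1 lvremove -f <the snapshot>
```

A short delay gives whatever briefly opened the snapshot a chance to close it before the next attempt.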

If you encounter the problem again in F16, please feel free to add a comment to bug #753105 and we'll see whether there is another cause of this problem or whether it's the same one. Thanks.

*** This bug has been marked as a duplicate of bug 712100 ***