Bug 1024347

Summary: "device-mapper: remove ioctl on failed" error when converting to thin pool volume
Product: Red Hat Enterprise Linux 6 Reporter: Peter Rajnoha <prajnoha>
Component: lvm2 Assignee: Peter Rajnoha <prajnoha>
lvm2 sub component: Thin Provisioning (RHEL6) QA Contact: Cluster QE <mspqa-list>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: agk, dwysocha, heinzm, jbrassow, msnitzer, nperic, prajnoha, prockai, thornber, zkabelac
Version: 6.6   
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: lvm2-2.02.107-1.el6 Doc Type: Bug Fix
Doc Text:
Cause: When converting an existing logical volume to a thin pool logical volume, LVM needs to open the volume temporarily to initialize its start with zeroes. However, this initialization step caused the WATCH udev rule to trigger. The WATCH udev rule is set for all top-level logical volumes so that udev database records are updated when device content changes: it triggers once on each close of a device that was open for writing. Opening the volume for thin pool initialization is exactly such a case. When the WATCH rule triggers, all udev rules are reevaluated, which causes the device to be scanned for changes. At the same time, LVM may try to deactivate the device, which fails with an error because LVM cannot remove a device that is still open.
Consequence: An error message was issued while converting to a thin pool volume: "device-mapper: remove ioctl on failed: Device or resource busy".
Fix: LVM now uses proper flags for temporary volumes that are used as an intermediate step during conversions. These flags direct udev not to set the WATCH rule or initiate any scanning on such devices until they are properly initialized.
Result: The message about device-mapper device removal is no longer issued during conversion of a logical volume to a thin pool logical volume.
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-10-14 08:24:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Peter Rajnoha 2013-10-29 13:11:13 UTC
(this is a second part to the original problem reported in bug #1003441)

Original report from Corey:

Still seeing this on 3 separate machines during convert. Did something get missed in the build? Do I need the latest kernel? Or are converts not covered by this bug?


SCENARIO - [linear_to_pool_conversion]
Create linear volumes and convert them to pool and pool meta volumes
lvcreate -L 100M -n to_pool_convert snapper_thinp
lvcreate -L 100M -n to_pool_meta_convert snapper_thinp
lvconvert --thinpool snapper_thinp/to_pool_convert --poolmetadata to_pool_meta_convert
  device-mapper: remove ioctl on  failed: Device or resource busy


SCENARIO - [raid1_to_pool_conversion]
Create raid1 volumes and convert them to pool and pool meta volumes
lvcreate --type raid1 -m 1 -L 100M -n to_pool_convert snapper_thinp
lvcreate --type raid1 -m 1 -L 100M -n to_pool_meta_convert snapper_thinp
lvconvert --thinpool snapper_thinp/to_pool_convert --poolmetadata to_pool_meta_convert
  device-mapper: remove ioctl on  failed: Device or resource busy
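
Because the failure is timing dependent, it helps to repeat a conversion several times and check the output for the message each time. The following is a minimal sketch of such a check, not part of the original test suite: the helper name run_and_check and the loop in the comments are assumptions made for illustration, and the commented loop needs root plus an existing volume group named snapper_thinp.

```shell
#!/bin/sh
# Hypothetical helper (not from the original test suite).
# Pattern for the error reported in this bug; the device name between
# "on" and "failed" is empty in the reports above, so allow anything there.
ERR_PATTERN='device-mapper: remove ioctl on .*failed: Device or resource busy'

# run_and_check CMD [ARGS...]
# Runs the command, echoes its combined stdout/stderr, and returns 1 if the
# output contains the busy-removal error, 0 otherwise.
run_and_check() {
    out=$("$@" 2>&1)
    printf '%s\n' "$out"
    if printf '%s\n' "$out" | grep -q "$ERR_PATTERN"; then
        return 1
    fi
    return 0
}

# Example loop for the linear scenario (requires root and the VG):
# i=1
# while [ "$i" -le 5 ]; do
#     lvcreate -L 100M -n to_pool_convert snapper_thinp
#     lvcreate -L 100M -n to_pool_meta_convert snapper_thinp
#     run_and_check lvconvert --thinpool snapper_thinp/to_pool_convert \
#         --poolmetadata to_pool_meta_convert || echo "iteration $i: hit the error"
#     lvremove -f snapper_thinp/to_pool_convert
#     i=$((i + 1))
# done
```

The helper only inspects the text of the output, so a clean run proves nothing beyond that iteration; several iterations are needed before concluding the error is gone.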


SCENARIO - [convert_vol_to_snap]
Making origin volume
lvcreate --thinpool POOL --zero y -L 1G snapper_thinp
Sanity checking pool device metadata
(thin_check /dev/mapper/snapper_thinp-POOL_tmeta)
examining superblock
examining devices tree
examining mapping tree
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n origin
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n other1
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n other2
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n other3
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n other4
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n other5
Making a linear volume
Converting the linear to be a snapshot of the origin volume
  device-mapper: remove ioctl on  failed: Device or resource busy


2.6.32-423.el6.x86_64

lvm2-2.02.100-7.el6    BUILT: Wed Oct 23 17:19:11 CEST 2013
lvm2-libs-2.02.100-7.el6    BUILT: Wed Oct 23 17:19:11 CEST 2013
lvm2-cluster-2.02.100-7.el6    BUILT: Wed Oct 23 17:19:11 CEST 2013
udev-147-2.50.el6    BUILT: Fri Oct 11 12:58:10 CEST 2013
device-mapper-1.02.79-7.el6    BUILT: Wed Oct 23 17:19:11 CEST 2013
device-mapper-libs-1.02.79-7.el6    BUILT: Wed Oct 23 17:19:11 CEST 2013
device-mapper-event-1.02.79-7.el6    BUILT: Wed Oct 23 17:19:11 CEST 2013
device-mapper-event-libs-1.02.79-7.el6    BUILT: Wed Oct 23 17:19:11 CEST 2013
device-mapper-persistent-data-0.2.8-2.el6    BUILT: Mon Oct 21 16:14:25 CEST 2013
cmirror-2.02.100-7.el6    BUILT: Wed Oct 23 17:19:11 CEST 2013

Comment 1 Peter Rajnoha 2013-10-29 13:15:36 UTC
The fix for the lvconvert part (proper flagging with the LV_TEMPORARY flag):

https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=f1a42aa8ec4c6d693b57832a8575ee68b9e71b12

Comment 3 Nenad Peric 2014-07-01 13:00:50 UTC
Unfortunately the SCENARIO - [convert_vol_to_snap] still has the issue. 
The other scenarios did not hit it in my short test of 3 iterations. 
I can repeat the tests any time if needed with other test scenarios as well. 
But for now moving it back to ASSIGNED.

Comment 4 Peter Rajnoha 2014-07-01 13:23:03 UTC
The problem here is that there's an existing, active LV that we're trying to convert to a snapshot without first deactivating it and then reactivating it with the proper udev flags to mark the device as "private".

When running lvconvert, we're directly calling:

#metadata/lv_manip.c:5768     Initializing 4.00 KiB of logical volume "snapper_thinp/snap_to_convert" with value 0.
...
#activate/activate.c:2137         Deactivating snapper_thinp/snap_to_convert.
...
#ioctl/libdm-iface.c:1773   device-mapper: remove ioctl on  failed: Device or resource busy

The steps taken:
  - zeroing
  - deactivating

The steps that should be taken instead:
  - deactivating
  - activating with proper flags (LV_TEMPORARY)
  - zeroing
  - deactivating
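
The difference between the two step orders above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: it does not call LVM or udev, the function and state names are hypothetical, and it treats the race as always lost (in practice the error appears only when the udev scan overlaps the deactivation).

```shell
#!/bin/sh
# Toy model only: hypothetical names, no LVM or udev calls.

# simulate STEP...
# Prints "busy" if a deactivation happens while a watch-triggered scan is
# assumed to hold the device open, "ok" otherwise.
simulate() {
    active=yes      # the LV starts out active, as described in this comment
    watched=yes     # top-level LVs normally have the WATCH rule set
    scanning=no     # yes once closing a written device has triggered a scan
    result=ok
    for step in "$@"; do
        case "$step" in
            deactivate)
                # Removal fails if a scan still has the device open.
                if [ "$active" = yes ] && [ "$scanning" = yes ]; then
                    result=busy
                fi
                active=no
                scanning=no
                ;;
            activate_temporary)
                # Activation with the temporary flag: no watch is set.
                active=yes
                watched=no
                ;;
            zero)
                # Open for writing, then close: fires the watch if one is set.
                if [ "$watched" = yes ]; then
                    scanning=yes
                fi
                ;;
        esac
    done
    echo "$result"
}

simulate zero deactivate                                  # prints "busy"
simulate deactivate activate_temporary zero deactivate    # prints "ok"
```

In the model, the first deactivation in the corrected order succeeds because nothing has written to the device yet, and the zeroing that follows happens on a device udev has been told not to watch.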

Comment 5 Peter Rajnoha 2014-07-31 12:17:30 UTC
The fix should already be in (v104; the 6.6 build already has v107/v108). Moving to MODIFIED.

Comment 7 Nenad Peric 2014-07-31 12:27:11 UTC
Ran 5 iterations of all the scenarios, then repeated convert_vol_to_snap for another 5 just to be sure.
The reported issue did not occur anymore.

Marking VERIFIED with:
kernel: 2.6.32-491.el6.x86_64


lvm2-2.02.107-2.el6    BUILT: Fri Jul 11 15:47:33 CEST 2014
lvm2-libs-2.02.107-2.el6    BUILT: Fri Jul 11 15:47:33 CEST 2014
lvm2-cluster-2.02.107-2.el6    BUILT: Fri Jul 11 15:47:33 CEST 2014
udev-147-2.56.el6    BUILT: Fri Jul 11 16:53:07 CEST 2014
device-mapper-1.02.86-2.el6    BUILT: Fri Jul 11 15:47:33 CEST 2014
device-mapper-libs-1.02.86-2.el6    BUILT: Fri Jul 11 15:47:33 CEST 2014
device-mapper-event-1.02.86-2.el6    BUILT: Fri Jul 11 15:47:33 CEST 2014
device-mapper-event-libs-1.02.86-2.el6    BUILT: Fri Jul 11 15:47:33 CEST 2014
device-mapper-persistent-data-0.3.2-1.el6    BUILT: Fri Apr  4 15:43:06 CEST 2014
cmirror-2.02.107-2.el6    BUILT: Fri Jul 11 15:47:33 CEST 2014

Comment 8 errata-xmlrpc 2014-10-14 08:24:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1387.html