Bug 927437

Summary: "device-mapper: remove ioctl on failed" error when creating thin pool volume
Product: Red Hat Enterprise Linux 7 Reporter: Corey Marthaler <cmarthal>
Component: lvm2Assignee: LVM and device-mapper development team <lvm-team>
lvm2 sub component: Default / Unclassified QA Contact: cluster-qe <cluster-qe>
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: low    
Priority: low CC: agk, cmarthal, heinzm, jbrassow, msnitzer, prajnoha, prockai, thornber, zkabelac
Version: 7.0Keywords: Reopened, Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: lvm2-2.02.103-1.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1003441 (view as bug list) Environment:
Last Closed: 2014-06-13 09:41:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1003441    
Attachments:
Description Flags
-vvvv of convert cmd
none
-vvvv of create cmd none

Description Corey Marthaler 2013-03-25 21:59:41 UTC
Description of problem:
SCENARIO - [display_snap]
Create a snapshot and then display it a couple ways
Making origin volume
lvcreate --thinpool POOL -L 1G snapper_thinp
  device-mapper: remove ioctl on  failed: Device or resource busy

The command passes in creating the thin pool volume, the error is just annoying.

Version-Release number of selected component (if applicable):
3.7.0-0.36.el7.x86_64

lvm2-2.02.99-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
lvm2-libs-2.02.99-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
lvm2-cluster-2.02.99-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
device-mapper-1.02.78-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
device-mapper-libs-1.02.78-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
device-mapper-event-1.02.78-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
device-mapper-event-libs-1.02.78-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
cmirror-2.02.99-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013


How reproducible:
Everytime

Comment 1 Zdenek Kabelac 2013-05-14 09:28:03 UTC
Please attach  -vvvv trace for lvcreate command.

Comment 2 Corey Marthaler 2013-05-15 16:43:38 UTC
I can no longer reproduce this with the latest build from today (15-May-2013 15:29).

[root@qalvm-01 ~]# lvcreate --thinpool POOL -L 1G test
  Logical volume "POOL" created


3.8.0-0.40.el7.x86_64
lvm2-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
lvm2-libs-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
lvm2-cluster-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-libs-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-event-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-event-libs-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
cmirror-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013

Comment 3 Corey Marthaler 2013-05-15 21:40:52 UTC
Not so fast :) this error still exists when converting a linear to be a snapshot of a thin origin volume. I'll attach the full -vvvv file.

lvconvert -vvvv -s snapper_thinp/origin snapper_thinp/snap_to_convert



#libdm-deptree.c:947     Removing snapper_thinp-snap_to_convert (253:12)
#libdm-common.c:2136         Udev cookie 0xd4d6d12 (semid 14974976) created
#libdm-common.c:2156         Udev cookie 0xd4d6d12 (semid 14974976) incremented to 1
#libdm-common.c:2028         Udev cookie 0xd4d6d12 (semid 14974976) incremented to 2
#libdm-common.c:2269         Udev cookie 0xd4d6d12 (semid 14974976) assigned to REMOVE task(2) with flags DISABLE_LIBRARY_FALLBACK (0x20)
#ioctl/libdm-iface.c:1724         dm remove   (253:12) NFT    [16384] (*1)
#ioctl/libdm-iface.c:1742   device-mapper: remove ioctl on  failed: Device or resource busy
#ioctl/libdm-iface.c:1724         dm remove   (253:12) NFT    [16384] (*2)
#libdm-common.c:1340         snapper_thinp-snap_to_convert: Stacking NODE_DEL [trust_udev]

Comment 4 Corey Marthaler 2013-05-15 21:42:58 UTC
Created attachment 748523 [details]
-vvvv of convert cmd

Comment 5 Corey Marthaler 2013-05-15 22:01:24 UTC
Actually, I can reproduce this still during normal pool creates, it's just not as often now. I'll try and get a -vvvv from a standard pool create as well.

SCENARIO - [snap_of_multiple_virt_origins]
Create snapshots of each of the many virtual devices
Making origin volume
lvcreate --thinpool POOL -L 1G snapper_thinp
  device-mapper: remove ioctl on  failed: Device or resource busy

[...]

SCENARIO - [fs_io_B]
Create snapshots of origin with fs, and then write fs data to each snapshot
Making origin volume
lvcreate --thinpool POOL -L 4G snapper_thinp
  device-mapper: remove ioctl on  failed: Device or resource busy

[...]

SCENARIO - [contiguous_opt_check]
Sanity check for the contiguous allocation option
Recreating PVs/VG with smaller sizes
Making origin volume
lvcreate --thinpool POOL -L 250M snapper_thinp
  device-mapper: remove ioctl on  failed: Device or resource busy


3.8.0-0.40.el7.x86_64
lvm2-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
lvm2-libs-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
lvm2-cluster-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
package udev is not installed
date: invalid date â1970-01-01 UTC package udev is not installed secondsâ
device-mapper-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-libs-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-event-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-event-libs-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
cmirror-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013

Comment 6 Corey Marthaler 2013-05-15 22:54:11 UTC
It took many iterations but I finally found one. 

 for i in $(seq 1 100); do lvcreate -vvvv --thinpool pool$i -L 1G FOO > /tmp/lvcreate$i 2>&1; sleep 1; done


#libdm-deptree.c:947     Removing FOO-pool31 (253:122)
#libdm-common.c:2136         Udev cookie 0xd4de7b2 (semid 31490048) created
#libdm-common.c:2156         Udev cookie 0xd4de7b2 (semid 31490048) incremented to 1
#libdm-common.c:2028         Udev cookie 0xd4de7b2 (semid 31490048) incremented to 2
#libdm-common.c:2269         Udev cookie 0xd4de7b2 (semid 31490048) assigned to REMOVE task(2) with flags DISABLE_LIBRARY_FALLBACK (0x20)
#ioctl/libdm-iface.c:1724         dm remove   (253:122) NFT    [16384] (*1)
#ioctl/libdm-iface.c:1742   device-mapper: remove ioctl on  failed: Device or resource busy
#ioctl/libdm-iface.c:1724         dm remove   (253:122) NFT    [16384] (*2)
#libdm-common.c:1340         FOO-pool31: Stacking NODE_DEL [trust_udev]
#mm/memlock.c:421         Leaving critical section (deactivated).

Comment 7 Corey Marthaler 2013-05-15 22:54:56 UTC
Created attachment 748536 [details]
-vvvv of create cmd

Comment 8 Zdenek Kabelac 2013-05-16 07:32:04 UTC
Ok, I think I start to see the problem here - it's the long outstanding issue we have - and the upstream code has only partial solution for deactivation retry - my own code has a better version - but needs some extra polishing for upstream inclusion.

I'll try to submit better patch for deactivation retry for review.

Comment 9 Zdenek Kabelac 2013-05-27 11:53:46 UTC
Upstream support for better testing of mounted filesystems - which should improve retry-ability of some lvm operations 

https://www.redhat.com/archives/lvm-devel/2013-May/msg00065.html

Can you please eventually retest with packages build from sources containing patches from 20 May?

Comment 10 Zdenek Kabelac 2013-05-31 08:04:10 UTC
Ok, I've managed to reproduce the problem with my latest code - here is the story behind:

Code retries 'delete' of device which could be scanned in parallel by udev - but for each such retry it also prints this a bit scary error message, but the code works correctly in this case as the next retry iteration will remove device (udev has finished it's scan) (you might see up-to 25 such message in row) before the code gives-up.

I'll see if there is a way to not print error message during retry loop and use just a debug notification for 'in-retry' loop.

Comment 11 Zdenek Kabelac 2013-07-24 12:14:25 UTC
So for non-clustered VGs there is now code path that skip reactivation - thus it should avoid reporting udev errors.

Upstream fix:
https://www.redhat.com/archives/lvm-devel/2013-July/msg00163.html

For clustered VG there is still a chance clvmd will report so retry message, since lvm currently is unable to synchronize with udev watch rule execution (basically random device access anytime later after written device is closed).

Comment 13 Corey Marthaler 2014-03-17 22:04:37 UTC
Fix verified in the latest rpms. I ran the loop in comment #6 a few times and was unable to hit this error when creating a thin pool volume.


3.10.0-110.el7.x86_64
lvm2-2.02.105-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
lvm2-libs-2.02.105-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
lvm2-cluster-2.02.105-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
device-mapper-1.02.84-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
device-mapper-libs-1.02.84-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
device-mapper-event-1.02.84-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
device-mapper-event-libs-1.02.84-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
device-mapper-persistent-data-0.2.8-4.el7    BUILT: Fri Jan 24 14:28:55 CST 2014
cmirror-2.02.105-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014

Comment 14 Ludek Smid 2014-06-13 09:41:33 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.