Bug 927437 - "device-mapper: remove ioctl on failed" error when creating thin pool volume
"device-mapper: remove ioctl on failed" error when creating thin pool volume
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2 (Show other bugs)
7.0
x86_64 Linux
low Severity low
: rc
: ---
Assigned To: Zdenek Kabelac
Cluster QE
: Reopened
Depends On:
Blocks: 1003441
  Show dependency treegraph
 
Reported: 2013-03-25 17:59 EDT by Corey Marthaler
Modified: 2014-06-17 21:18 EDT (History)
9 users (show)

See Also:
Fixed In Version: lvm2-2.02.103-1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1003441 (view as bug list)
Environment:
Last Closed: 2014-06-13 05:41:33 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)
-vvvv of convert cmd (108.33 KB, text/plain)
2013-05-15 17:42 EDT, Corey Marthaler
no flags Details
-vvvv of create cmd (499.71 KB, text/plain)
2013-05-15 18:54 EDT, Corey Marthaler
no flags Details

  None (edit)
Description Corey Marthaler 2013-03-25 17:59:41 EDT
Description of problem:
SCENARIO - [display_snap]
Create a snapshot and then display it a couple ways
Making origin volume
lvcreate --thinpool POOL -L 1G snapper_thinp
  device-mapper: remove ioctl on  failed: Device or resource busy

The command passes in creating the thin pool volume, the error is just annoying.

Version-Release number of selected component (if applicable):
3.7.0-0.36.el7.x86_64

lvm2-2.02.99-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
lvm2-libs-2.02.99-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
lvm2-cluster-2.02.99-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
device-mapper-1.02.78-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
device-mapper-libs-1.02.78-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
device-mapper-event-1.02.78-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
device-mapper-event-libs-1.02.78-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013
cmirror-2.02.99-0.5.el7    BUILT: Wed Mar 13 09:27:55 CDT 2013


How reproducible:
Everytime
Comment 1 Zdenek Kabelac 2013-05-14 05:28:03 EDT
Please attach  -vvvv trace for lvcreate command.
Comment 2 Corey Marthaler 2013-05-15 12:43:38 EDT
I can no longer reproduce this with the latest build from today (15-May-2013 15:29).

[root@qalvm-01 ~]# lvcreate --thinpool POOL -L 1G test
  Logical volume "POOL" created


3.8.0-0.40.el7.x86_64
lvm2-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
lvm2-libs-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
lvm2-cluster-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-libs-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-event-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-event-libs-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
cmirror-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
Comment 3 Corey Marthaler 2013-05-15 17:40:52 EDT
Not so fast :) this error still exists when converting a linear to be a snapshot of a thin origin volume. I'll attach the full -vvvv file.

lvconvert -vvvv -s snapper_thinp/origin snapper_thinp/snap_to_convert



#libdm-deptree.c:947     Removing snapper_thinp-snap_to_convert (253:12)
#libdm-common.c:2136         Udev cookie 0xd4d6d12 (semid 14974976) created
#libdm-common.c:2156         Udev cookie 0xd4d6d12 (semid 14974976) incremented to 1
#libdm-common.c:2028         Udev cookie 0xd4d6d12 (semid 14974976) incremented to 2
#libdm-common.c:2269         Udev cookie 0xd4d6d12 (semid 14974976) assigned to REMOVE task(2) with flags DISABLE_LIBRARY_FALLBACK (0x20)
#ioctl/libdm-iface.c:1724         dm remove   (253:12) NFT    [16384] (*1)
#ioctl/libdm-iface.c:1742   device-mapper: remove ioctl on  failed: Device or resource busy
#ioctl/libdm-iface.c:1724         dm remove   (253:12) NFT    [16384] (*2)
#libdm-common.c:1340         snapper_thinp-snap_to_convert: Stacking NODE_DEL [trust_udev]
Comment 4 Corey Marthaler 2013-05-15 17:42:58 EDT
Created attachment 748523 [details]
-vvvv of convert cmd
Comment 5 Corey Marthaler 2013-05-15 18:01:24 EDT
Actually, I can reproduce this still during normal pool creates, it's just not as often now. I'll try and get a -vvvv from a standard pool create as well.

SCENARIO - [snap_of_multiple_virt_origins]
Create snapshots of each of the many virtual devices
Making origin volume
lvcreate --thinpool POOL -L 1G snapper_thinp
  device-mapper: remove ioctl on  failed: Device or resource busy

[...]

SCENARIO - [fs_io_B]
Create snapshots of origin with fs, and then write fs data to each snapshot
Making origin volume
lvcreate --thinpool POOL -L 4G snapper_thinp
  device-mapper: remove ioctl on  failed: Device or resource busy

[...]

SCENARIO - [contiguous_opt_check]
Sanity check for the contiguous allocation option
Recreating PVs/VG with smaller sizes
Making origin volume
lvcreate --thinpool POOL -L 250M snapper_thinp
  device-mapper: remove ioctl on  failed: Device or resource busy


3.8.0-0.40.el7.x86_64
lvm2-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
lvm2-libs-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
lvm2-cluster-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
package udev is not installed
date: invalid date ‘1970-01-01 UTC package udev is not installed seconds’
device-mapper-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-libs-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-event-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
device-mapper-event-libs-1.02.78-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
cmirror-2.02.99-0.32.el7    BUILT: Wed May 15 08:28:08 CDT 2013
Comment 6 Corey Marthaler 2013-05-15 18:54:11 EDT
It took many iterations but I finally found one. 

 for i in $(seq 1 100); do lvcreate -vvvv --thinpool pool$i -L 1G FOO > /tmp/lvcreate$i 2>&1; sleep 1; done


#libdm-deptree.c:947     Removing FOO-pool31 (253:122)
#libdm-common.c:2136         Udev cookie 0xd4de7b2 (semid 31490048) created
#libdm-common.c:2156         Udev cookie 0xd4de7b2 (semid 31490048) incremented to 1
#libdm-common.c:2028         Udev cookie 0xd4de7b2 (semid 31490048) incremented to 2
#libdm-common.c:2269         Udev cookie 0xd4de7b2 (semid 31490048) assigned to REMOVE task(2) with flags DISABLE_LIBRARY_FALLBACK (0x20)
#ioctl/libdm-iface.c:1724         dm remove   (253:122) NFT    [16384] (*1)
#ioctl/libdm-iface.c:1742   device-mapper: remove ioctl on  failed: Device or resource busy
#ioctl/libdm-iface.c:1724         dm remove   (253:122) NFT    [16384] (*2)
#libdm-common.c:1340         FOO-pool31: Stacking NODE_DEL [trust_udev]
#mm/memlock.c:421         Leaving critical section (deactivated).
Comment 7 Corey Marthaler 2013-05-15 18:54:56 EDT
Created attachment 748536 [details]
-vvvv of create cmd
Comment 8 Zdenek Kabelac 2013-05-16 03:32:04 EDT
Ok, I think I start to see the problem here - it's the long outstanding issue we have - and the upstream code has only partial solution for deactivation retry - my own code has a better version - but needs some extra polishing for upstream inclusion.

I'll try to submit better patch for deactivation retry for review.
Comment 9 Zdenek Kabelac 2013-05-27 07:53:46 EDT
Upstream support for better testing of mounted filesystems - which should improve retry-ability of some lvm operations 

https://www.redhat.com/archives/lvm-devel/2013-May/msg00065.html

Can you please eventually retest with packages build from sources containing patches from 20 May?
Comment 10 Zdenek Kabelac 2013-05-31 04:04:10 EDT
Ok, I've managed to reproduce the problem with my latest code - here is the story behind:

Code retries 'delete' of device which could be scanned in parallel by udev - but for each such retry it also prints this a bit scary error message, but the code works correctly in this case as the next retry iteration will remove device (udev has finished it's scan) (you might see up-to 25 such message in row) before the code gives-up.

I'll see if there is a way to not print error message during retry loop and use just a debug notification for 'in-retry' loop.
Comment 11 Zdenek Kabelac 2013-07-24 08:14:25 EDT
So for non-clustered VGs there is now code path that skip reactivation - thus it should avoid reporting udev errors.

Upstream fix:
https://www.redhat.com/archives/lvm-devel/2013-July/msg00163.html

For clustered VG there is still a chance clvmd will report so retry message, since lvm currently is unable to synchronize with udev watch rule execution (basically random device access anytime later after written device is closed).
Comment 13 Corey Marthaler 2014-03-17 18:04:37 EDT
Fix verified in the latest rpms. I ran the loop in comment #6 a few times and was unable to hit this error when creating a thin pool volume.


3.10.0-110.el7.x86_64
lvm2-2.02.105-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
lvm2-libs-2.02.105-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
lvm2-cluster-2.02.105-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
device-mapper-1.02.84-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
device-mapper-libs-1.02.84-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
device-mapper-event-1.02.84-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
device-mapper-event-libs-1.02.84-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
device-mapper-persistent-data-0.2.8-4.el7    BUILT: Fri Jan 24 14:28:55 CST 2014
cmirror-2.02.105-12.el7    BUILT: Wed Mar 12 10:49:52 CDT 2014
Comment 14 Ludek Smid 2014-06-13 05:41:33 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.

Note You need to log in before you can comment on or make changes to this bug.