Description of problem:
This is a regression of the test case for bug 501473.

SCENARIO - [pvmove_suspend_verification]
Create a linear and a fake left over pvmove target and verify that doesn't cause a pvmove attempt to leave the linear suspended
grant-01: lvcreate -n suspended -L 50M mirror_sanity
grant-01: dmsetup create mirror_sanity-pvmove0 --notable
Attempting pvmove of /dev/sdc3 on grant-01
grant-01: pvmove /dev/sdc3
Error locking on node grant-01: device-mapper: create ioctl failed: Device or resource busy
Failed to suspend suspended
Verifying the linear's dm state
grant-01: dmsetup info mirror_sanity-suspended | grep ACTIVE
grant-01: dmsetup info mirror_sanity-suspended | grep SUSPENDED
grant-01: dmsetup remove mirror_sanity-pvmove0
Deactivating mirror suspended...
[DEADLOCK]

qarshd[18179]: Running cmdline: lvchange -an /dev/mirror_sanity/suspended
udevd[540]: worker [18156] unexpectedly returned with status 0x0100
udevd[540]: worker [18156] failed while handling '/devices/virtual/block/dm-3'
kernel: INFO: task lvchange:18180 blocked for more than 120 seconds.
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: lvchange D 0000000000000002 0 18180 18179 0x00000080
kernel: ffff88021c4f3b18 0000000000000086 ffff88021c4f3ad8 ffffffffa00041cc
kernel: ffff88021c4f3ae8 00000000996a433b ffff88021c4f3b08 ffff88011a3ea240
kernel: ffff88021cafa5f8 ffff88021c4f3fd8 000000000000f508 ffff88021cafa5f8
kernel: Call Trace:
kernel: [<ffffffffa00041cc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
kernel: [<ffffffff8109b769>] ? ktime_get_ts+0xa9/0xe0
kernel: [<ffffffff814ec413>] io_schedule+0x73/0xc0
kernel: [<ffffffff811b15be>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
kernel: [<ffffffff811b1aae>] __blockdev_direct_IO+0x5e/0xd0
kernel: [<ffffffff811ae3b0>] ? blkdev_get_blocks+0x0/0xc0
kernel: [<ffffffff811af217>] blkdev_direct_IO+0x57/0x60
kernel: [<ffffffff811ae3b0>] ? blkdev_get_blocks+0x0/0xc0
kernel: [<ffffffff811126bb>] generic_file_aio_read+0x6bb/0x700
kernel: [<ffffffff81213181>] ? avc_has_perm+0x71/0x90
kernel: [<ffffffff8120cc7f>] ? security_inode_permission+0x1f/0x30
kernel: [<ffffffff81175f3a>] do_sync_read+0xfa/0x140
kernel: [<ffffffff81090b70>] ? autoremove_wake_function+0x0/0x40
kernel: [<ffffffff811ae7ec>] ? block_ioctl+0x3c/0x40
kernel: [<ffffffff81188ed2>] ? vfs_ioctl+0x22/0xa0
kernel: [<ffffffff8121877b>] ? selinux_file_permission+0xfb/0x150
kernel: [<ffffffff8120bb16>] ? security_file_permission+0x16/0x20
kernel: [<ffffffff81176935>] vfs_read+0xb5/0x1a0
kernel: [<ffffffff810d4602>] ? audit_syscall_entry+0x272/0x2a0
kernel: [<ffffffff81176a71>] sys_read+0x51/0x90
kernel: [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b

The problem is that instead of failing "gracefully" like it used to:
device-mapper: create ioctl failed: Device or resource busy
Temporary pvmove mirror activation failed.

It now attempts to do the pvmove and leaves pvmove targets on all nodes, causing any other lvm cmds to deadlock.
Error locking on node grant-01: device-mapper: create ioctl failed: Device or resource busy
Failed to suspend suspended

Version-Release number of selected component (if applicable):
2.6.32-192.el6.x86_64
lvm2-2.02.87-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011
lvm2-libs-2.02.87-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011
lvm2-cluster-2.02.87-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011
udev-147-2.37.el6 BUILT: Wed Aug 10 07:48:15 CDT 2011
device-mapper-1.02.66-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011
device-mapper-libs-1.02.66-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011
device-mapper-event-1.02.66-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011
device-mapper-event-libs-1.02.66-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011
cmirror-2.02.87-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011

How reproducible:
Every time
Here's the same test case run in single machine mode on the current 6.2 rpms versus the 6.1 stable rpms.

### SINGLE NODE (Current 6.2 RPMS)
[root@grant-03 ~]# lvs -a -o +devices
LV        VG            Attr   LSize  Devices
suspended mirror_sanity -wi-a- 52.00m /dev/sdc3(0)
[root@grant-03 ~]# dmsetup create mirror_sanity-pvmove0 --notable
[root@grant-03 ~]# pvmove /dev/sdc3
device-mapper: create ioctl failed: Device or resource busy
Failed to suspend suspended

2.6.32-192.el6.x86_64
lvm2-2.02.87-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011
udev-147-2.37.el6 BUILT: Wed Aug 10 07:48:15 CDT 2011
device-mapper-1.02.66-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011

### SINGLE NODE (6.1 RPMS)
[root@grant-03 ~]# lvs -a -o +devices
LV        VG            Attr   LSize  Devices
suspended mirror_sanity -wi-a- 52.00m /dev/sdb1(0)
[root@grant-03 ~]# dmsetup create mirror_sanity-pvmove0 --notable
[root@grant-03 ~]# dmsetup ls
mirror_sanity-suspended (253, 2)
mirror_sanity-pvmove0 (253, 4)
[root@grant-03 ~]# pvmove /dev/sdb1
device-mapper: create ioctl failed: Device or resource busy
Temporary pvmove mirror activation failed.

2.6.32-131.0.15.el6.x86_64
lvm2-2.02.83-3.el6 BUILT: Fri Mar 18 09:31:10 CDT 2011
udev-147-2.35.el6 BUILT: Wed Mar 30 07:32:05 CDT 2011
device-mapper-1.02.62-3.el6 BUILT: Fri Mar 18 09:31:10 CDT 2011
Corey, can you try to reproduce it with the latest scratch build (which contains retry on remove)? (lvm2-2.02.87-2.1.el6.x86_64)
This still fails with the latest scratch build.

[root@taft-01 ~]# vgcreate mirror_sanity /dev/sd[bcdefgh]1
Volume group "mirror_sanity" successfully created
[root@taft-01 ~]# lvcreate -n suspended -L 50M mirror_sanity
Rounding up size to full physical extent 52.00 MiB
[root@taft-01 ~]# lvs -a -o +devices
LV        VG            Attr   LSize  Devices
suspended mirror_sanity -wi-a- 52.00m /dev/sdb1(0)
[root@taft-01 ~]# dmsetup create mirror_sanity-pvmove0 --notable
[root@taft-01 ~]# dmsetup ls
mirror_sanity-suspended (253, 4)
mirror_sanity-pvmove0 (253, 3)
[root@taft-01 ~]# pvmove /dev/sdb1
device-mapper: create ioctl failed: Device or resource busy
Failed to suspend suspended

2.6.32-195.el6.x86_64
lvm2-2.02.87-2.1.el6 BUILT: Wed Sep 14 09:44:16 CDT 2011
lvm2-libs-2.02.87-2.1.el6 BUILT: Wed Sep 14 09:44:16 CDT 2011
lvm2-cluster-2.02.87-2.1.el6 BUILT: Wed Sep 14 09:44:16 CDT 2011
udev-147-2.38.el6 BUILT: Fri Sep 9 16:25:50 CDT 2011
device-mapper-1.02.66-2.1.el6 BUILT: Wed Sep 14 09:44:16 CDT 2011
device-mapper-libs-1.02.66-2.1.el6 BUILT: Wed Sep 14 09:44:16 CDT 2011
device-mapper-event-1.02.66-2.1.el6 BUILT: Wed Sep 14 09:44:16 CDT 2011
device-mapper-event-libs-1.02.66-2.1.el6 BUILT: Wed Sep 14 09:44:16 CDT 2011
cmirror-2.02.87-2.1.el6 BUILT: Wed Sep 14 09:44:16 CDT 2011
Well, there is a slight incompatibility in dmsetup; you should fix the test to use:
dmsetup create mirror_sanity-pvmove0 --notable --addnodeoncreate
(That's not the real problem, though.)
I've updated the test to create with the '--addnodeoncreate' flag.

grant-01: dmsetup create mirror_sanity-pvmove0 --notable --addnodeoncreate
Attempting pvmove of /dev/sdc6 on grant-01
device-mapper: create ioctl failed: Device or resource busy
Failed to suspend suspended
In the latest rpms as well.

SCENARIO - [pvmove_suspend_verification]
Create a linear and a fake left over pvmove target and verify that doesn't cause a pvmove attempt to leave the linear suspended
grant-02: lvcreate -n suspended -L 50M mirror_sanity
grant-02: dmsetup create mirror_sanity-pvmove0 --notable --addnodeoncreate
Attempting pvmove of /dev/sdb1 on grant-02
Failed messages found, possible regression of 736509
Error locking on node grant-02: device-mapper: create ioctl failed: Device or resource busy
Failed to suspend suspended
Verifying the linear's dm state
grant-02: dmsetup remove mirror_sanity-pvmove0

qarshd[11716]: Running cmdline: lvchange -an /dev/mirror_sanity/suspended
kernel: INFO: task lvchange:11717 blocked for more than 120 seconds.
kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: lvchange D 0000000000000001 0 11717 11716 0x00000080
kernel: ffff880102061b18 0000000000000082 ffff880102061ad8 ffffffffa00041cc
kernel: ffff880102061ae8 00000000f187fa8f ffff880102061b08 ffff88021d169c80
kernel: ffff88011ad07af8 ffff880102061fd8 000000000000f508 ffff88011ad07af8
kernel: Call Trace:
kernel: [<ffffffffa00041cc>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
kernel: [<ffffffff8109b779>] ? ktime_get_ts+0xa9/0xe0
kernel: [<ffffffff814ecc33>] io_schedule+0x73/0xc0
kernel: [<ffffffff811b17de>] __blockdev_direct_IO_newtrunc+0x6fe/0xb90
kernel: [<ffffffff811b1cce>] __blockdev_direct_IO+0x5e/0xd0
kernel: [<ffffffff811ae5d0>] ? blkdev_get_blocks+0x0/0xc0
kernel: [<ffffffff811af437>] blkdev_direct_IO+0x57/0x60
kernel: [<ffffffff811ae5d0>] ? blkdev_get_blocks+0x0/0xc0
kernel: [<ffffffff811127db>] generic_file_aio_read+0x6bb/0x700
kernel: [<ffffffff81213741>] ? avc_has_perm+0x71/0x90
kernel: [<ffffffff8120d23f>] ? security_inode_permission+0x1f/0x30
kernel: [<ffffffff811761ca>] do_sync_read+0xfa/0x140
kernel: [<ffffffff81090b60>] ? autoremove_wake_function+0x0/0x40
kernel: [<ffffffff811aea0c>] ? block_ioctl+0x3c/0x40
kernel: [<ffffffff811890f2>] ? vfs_ioctl+0x22/0xa0
kernel: [<ffffffff81218d3b>] ? selinux_file_permission+0xfb/0x150
kernel: [<ffffffff8120c0d6>] ? security_file_permission+0x16/0x20
kernel: [<ffffffff81176bc5>] vfs_read+0xb5/0x1a0
kernel: [<ffffffff810d4612>] ? audit_syscall_entry+0x272/0x2a0
kernel: [<ffffffff81176d01>] sys_read+0x51/0x90
kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

2.6.32-207.el6.x86_64
lvm2-2.02.87-5.el6 BUILT: Wed Oct 12 10:47:46 CDT 2011
lvm2-libs-2.02.87-5.el6 BUILT: Wed Oct 12 10:47:46 CDT 2011
lvm2-cluster-2.02.87-5.el6 BUILT: Wed Oct 12 10:47:46 CDT 2011
udev-147-2.40.el6 BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.66-5.el6 BUILT: Wed Oct 12 10:47:46 CDT 2011
device-mapper-libs-1.02.66-5.el6 BUILT: Wed Oct 12 10:47:46 CDT 2011
device-mapper-event-1.02.66-5.el6 BUILT: Wed Oct 12 10:47:46 CDT 2011
device-mapper-event-libs-1.02.66-5.el6 BUILT: Wed Oct 12 10:47:46 CDT 2011
cmirror-2.02.87-5.el6 BUILT: Wed Oct 12 10:47:46 CDT 2011
In the latest as well:
lvm2-2.02.87-6.el6 BUILT: Wed Oct 19 06:46:31 CDT 2011
lvm2-cluster-2.02.87-6.el6 BUILT: Wed Oct 19 06:46:31 CDT 2011
So, the current code, non-clustered, gives me:

device-mapper: create ioctl on vg2-pvmove0 failed: Device or resource busy
Failed to suspend lvol0
ABORTING: Volume group metadata update failed. (first_time: 1)

Checking the code, this is the correct behaviour, even though the last message is bogus (the metadata update did not fail).
Error message fixed upstream:

device-mapper: create ioctl on vg2-pvmove0 failed: Device or resource busy
Failed to suspend lvol0
ABORTING: Temporary pvmove mirror activation failed.

http://sourceware.org/cgi-bin/cvsweb.cgi/LVM2/tools/pvmove.c.diff?r1=1.93&r2=1.94&cvsroot=lvm2
So it's worth re-testing this with the 6.3 RPMs in a proper cluster now: apart from the cosmetic error message problem I can't get it to leave things in a mess here now.
This test case still fails with the latest rpms/kernel.

SCENARIO - [pvmove_suspend_verification]
Create a linear and a fake left over pvmove target and verify that doesn't cause a pvmove attempt to leave the linear suspended
grant-02: lvcreate -n suspended -L 50M mirror_sanity
grant-02: dmsetup create mirror_sanity-pvmove0 --notable --addnodeoncreate

[root@grant-02 ~]# lvs -a -o +devices
LV        VG            Attr     LSize  Devices
suspended mirror_sanity -wi-a--- 52.00m /dev/sdc6(0)
[root@grant-02 ~]# dmsetup ls
mirror_sanity-suspended (253:2)
mirror_sanity-pvmove0 (253:4)
[root@grant-02 ~]# pvscan
PV /dev/sdc6 VG mirror_sanity lvm2 [54.49 GiB / 54.44 GiB free]
PV /dev/sdc5 VG mirror_sanity lvm2 [54.48 GiB / 54.48 GiB free]
PV /dev/sdc3 VG mirror_sanity lvm2 [54.49 GiB / 54.49 GiB free]
PV /dev/sdc2 VG mirror_sanity lvm2 [54.48 GiB / 54.48 GiB free]
PV /dev/sdc1 VG mirror_sanity lvm2 [54.49 GiB / 54.49 GiB free]
PV /dev/sdb6 VG mirror_sanity lvm2 [40.87 GiB / 40.87 GiB free]
PV /dev/sdb5 VG mirror_sanity lvm2 [40.86 GiB / 40.86 GiB free]
PV /dev/sdb3 VG mirror_sanity lvm2 [40.87 GiB / 40.87 GiB free]
PV /dev/sdb2 VG mirror_sanity lvm2 [40.87 GiB / 40.87 GiB free]
PV /dev/sdb1 VG mirror_sanity lvm2 [40.86 GiB / 40.86 GiB free]
PV /dev/sda2 VG vg_grant02 lvm2 [74.01 GiB / 0 free]
Total: 11 [550.79 GiB] / in use: 11 [550.79 GiB] / in no VG: 0 [0 ]

Attempting pvmove of /dev/sdc6 on grant-02
[root@grant-02 ~]# pvmove /dev/sdc6
Error locking on node grant-02: device-mapper: create ioctl on mirror_sanity-pvmove0 failed: Device or resource busy
Failed to suspend suspended
ABORTING: Volume group metadata update failed. (first_time: 1)

2.6.32-251.el6.x86_64
lvm2-2.02.95-1.el6 BUILT: Tue Mar 6 10:00:33 CST 2012
lvm2-libs-2.02.95-1.el6 BUILT: Tue Mar 6 10:00:33 CST 2012
lvm2-cluster-2.02.95-1.el6 BUILT: Tue Mar 6 10:00:33 CST 2012
udev-147-2.40.el6 BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.74-1.el6 BUILT: Tue Mar 6 10:00:33 CST 2012
device-mapper-libs-1.02.74-1.el6 BUILT: Tue Mar 6 10:00:33 CST 2012
device-mapper-event-1.02.74-1.el6 BUILT: Tue Mar 6 10:00:33 CST 2012
device-mapper-event-libs-1.02.74-1.el6 BUILT: Tue Mar 6 10:00:33 CST 2012
cmirror-2.02.95-1.el6 BUILT: Tue Mar 6 10:00:33 CST 2012
I think the current lvm package works properly for the problem as originally reported. No device should be left in the 'suspended' state, as was reported in the description of this bugzilla. The tool now properly aborts, since it finds a conflicting device with the name -pvmove0, and the pvmove tool currently doesn't try to use any other name (we may think about smarter behavior in the future).

So I think the test needs to be fixed. An abort is to be expected, but no suspended devices should be left behind. So are there any devices left suspended on any of the cluster nodes? (In our local test environment we do not get them.) If not, I think this bug could be closed, possibly to be replaced with a new bz requesting smarter behavior. Currently lvm does not expect the user to modify dm tables and take away a device that lvm tries to use.
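One way the test could check for leftover suspended devices on each node is sketched below. This is only a rough illustration, not the actual test code: the helper name list_suspended is made up, and it assumes that dmsetup info prints one "Name:" line and one "State:" line per device, with the state shown as ACTIVE or SUSPENDED (the same strings the test already greps for).

```shell
#!/bin/sh
# Hypothetical helper (not part of lvm2 or the test suite).
# Reads `dmsetup info`-style text on stdin and prints the name of every
# device whose State field is SUSPENDED, one name per line.
# Assumes the "Name:" line precedes the "State:" line for each device.
list_suspended() {
    awk '
        $1 == "Name:"  { name = $2 }
        $1 == "State:" && $2 == "SUSPENDED" { print name }
    '
}

# Intended use on a cluster node (needs root, so it is left commented out):
#   dmsetup info | list_suspended
```

An empty result on every node after the aborted pvmove would mean nothing was left suspended, which is the condition the test should be verifying.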