This exists in rhel5.8 as well.

SCENARIO - [pvmove_suspend_verification]
Create a linear and a fake left over pvmove target and verify that doesn't cause a pvmove attempt to leave the linear suspended
grant-01: lvcreate -n suspended -L 50M mirror_sanity
grant-01: dmsetup create mirror_sanity-pvmove0 --notable --addnodeoncreate
Attempting pvmove of /dev/sdb1 on grant-01
Failed messages found, possible regression of 736509
Error locking on node grant-01: device-mapper: create ioctl failed: Device or resource busy
Failed to suspend suspended
Verifying the linear's dm state
grant-01: dmsetup remove mirror_sanity-pvmove0

Dec 9 00:07:17 grant-03 qarshd[1546]: Running cmdline: lvchange -an /dev/mirror_sanity/suspended
Dec 9 00:09:36 grant-03 kernel: INFO: task lvchange:1549 blocked for more than 120 seconds.
Dec 9 00:09:36 grant-03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 9 00:09:36 grant-03 kernel: lvchange D ffff81000100caa0 0 1549 1546 (NOTLB)
Dec 9 00:09:36 grant-03 kernel: ffff8101e9b7bcb8 0000000000000082 00000000000000ff ffffffff8001c452
Dec 9 00:09:36 grant-03 kernel: ffff810114abe300 0000000000000008 ffff810202795820 ffff81012396a860
Dec 9 00:09:36 grant-03 kernel: 0000629ea2c6e3c1 000000000002c57b ffff810202795a08 0000000200000000
Dec 9 00:09:36 grant-03 kernel: Call Trace:
Dec 9 00:09:36 grant-03 kernel: [<ffffffff8001c452>] generic_make_request+0x211/0x228
Dec 9 00:09:36 grant-03 kernel: [<ffffffff8006ec8f>] do_gettimeofday+0x40/0x90
Dec 9 00:09:36 grant-03 kernel: [<ffffffff800637ce>] io_schedule+0x3f/0x67
Dec 9 00:09:36 grant-03 kernel: [<ffffffff800f7fb6>] __blockdev_direct_IO+0x8d5/0xa88
Dec 9 00:09:36 grant-03 kernel: [<ffffffff800e885d>] blkdev_direct_IO+0x32/0x37
Dec 9 00:09:36 grant-03 kernel: [<ffffffff800e8795>] blkdev_get_blocks+0x0/0x96
Dec 9 00:09:36 grant-03 kernel: [<ffffffff8000c6fd>] __generic_file_aio_read+0xb8/0x198
Dec 9 00:09:36 grant-03 kernel: [<ffffffff8012e9a3>] inode_has_perm+0x56/0x63
Dec 9 00:09:36 grant-03 kernel: [<ffffffff800c935a>] generic_file_read+0xac/0xc5
Dec 9 00:09:36 grant-03 kernel: [<ffffffff8012e9a3>] inode_has_perm+0x56/0x63
Dec 9 00:09:36 grant-03 kernel: [<ffffffff800a2e5d>] autoremove_wake_function+0x0/0x2e
Dec 9 00:09:36 grant-03 kernel: [<ffffffff80041ef3>] do_ioctl+0x21/0x6b
Dec 9 00:09:36 grant-03 kernel: [<ffffffff80131530>] selinux_file_permission+0x9f/0xb4
Dec 9 00:09:36 grant-03 kernel: [<ffffffff8000b7c7>] vfs_read+0xcb/0x171
Dec 9 00:09:36 grant-03 kernel: [<ffffffff80011d5a>] sys_read+0x45/0x6e
Dec 9 00:09:36 grant-03 kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0

2.6.18-274.el5
lvm2-2.02.88-5.el5 BUILT: Fri Dec 2 12:25:45 CST 2011
lvm2-cluster-2.02.88-5.el5 BUILT: Fri Dec 2 12:48:37 CST 2011
device-mapper-1.02.67-2.el5 BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5 BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5 BUILT: Wed Nov 2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5 BUILT: Tue Dec 22 13:39:47 CST 2009
I think this is fixed upstream. We should compare relevant parts of the RHEL5 tree with upstream and see what's needed.
Corey, this test is flawed (I added it to simulate a specific situation, but mixing dmsetup with lvm operations is not a correct workflow). Are we able to simulate it with normal lvm commands? If not, I think this is not a blocker. I'll check the code, but any changes in pvmove are risky, and if we have no real-world reproducer it is not worth doing in this phase, IMHO. (Cond nack reproducer - I mean one without the manual dmsetup step involved.)
This has not been seen in normal lvm commands. The hack to create that left over device appears to be required in order for this problem to occur. I'll remove the blocker flag.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.
Corey, can you please check, with the 5.8 rpms and using only normal lvm commands, that no devices are left in the suspended state? That's the only concern. Otherwise, mixing dmsetup and lvm commands is not a supported configuration, so I will close this the same way as RHEL6 bug #736509 and suggest removing this test case.
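A minimal sketch of how the "no devices left suspended" check could be scripted. It assumes the usual `dmsetup info` layout, with one `Name:` line followed by a `State:` line (ACTIVE or SUSPENDED) per device; the helper name `list_suspended` is made up for illustration and is not part of lvm2 or device-mapper.

```shell
# list_suspended: hypothetical helper, not an lvm2/device-mapper command.
# Reads `dmsetup info` output on stdin and prints the name of every device
# whose State line starts with SUSPENDED (assumed output layout).
list_suspended() {
    awk '
        $1 == "Name:"                       { name = $2 }
        $1 == "State:" && $2 == "SUSPENDED" { print name }
    '
}

# Usage on a live system (needs root); no output means nothing is suspended:
#   dmsetup info | list_suspended
```

Reading from stdin keeps the parsing separate from the privileged `dmsetup` call, so it can also be run against `dmsetup info` output captured after a test run.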
I ran this test case on the latest 5.9 tree and, just like in the rhel6 version of this bug (736509), it still produces the locking errors and "Failed to suspend" messages but no longer deadlocks due to a device stuck in the SUSPEND state. So, marking closed...

2.6.18-333.el5
lvm2-2.02.88-9.el5 BUILT: Wed Jul 25 10:13:00 CDT 2012
lvm2-cluster-2.02.88-9.el5 BUILT: Wed Jul 25 10:09:34 CDT 2012
device-mapper-1.02.67-2.el5 BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5 BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-15.el5 BUILT: Thu Apr 26 17:07:01 CDT 2012
kmod-cmirror-0.1.22-3.el5 BUILT: Tue Dec 22 13:39:47 CST 2009