Bug 765981 - pvmove now fails to revert changes when left over pvmove target remains
Summary: pvmove now fails to revert changes when left over pvmove target remains
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2
Version: 5.8
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: LVM and device-mapper development team
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 736509
Blocks: 807971
 
Reported: 2011-12-09 19:51 UTC by Corey Marthaler
Modified: 2012-08-08 15:46 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 736509
Environment:
Last Closed: 2012-08-08 15:46:56 UTC
Target Upstream Version:



Comment 1 Corey Marthaler 2011-12-09 19:53:23 UTC
This exists in RHEL 5.8 as well.

SCENARIO - [pvmove_suspend_verification]
Create a linear and a fake left over pvmove target and verify
that doesn't cause a pvmove attempt to leave the linear suspended
grant-01: lvcreate -n suspended -L 50M mirror_sanity
grant-01: dmsetup create mirror_sanity-pvmove0 --notable --addnodeoncreate
Attempting pvmove of /dev/sdb1 on grant-01
Failed messages found, possible regression of 736509
  Error locking on node grant-01: device-mapper: create ioctl failed: Device or resource busy
  Failed to suspend suspended
Verifying the linear's dm state
grant-01: dmsetup remove mirror_sanity-pvmove0


Dec  9 00:07:17 grant-03 qarshd[1546]: Running cmdline: lvchange -an /dev/mirror_sanity/suspended
Dec  9 00:09:36 grant-03 kernel: INFO: task lvchange:1549 blocked for more than 120 seconds. 
Dec  9 00:09:36 grant-03 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec  9 00:09:36 grant-03 kernel: lvchange      D ffff81000100caa0     0  1549   1546                     (NOTLB)
Dec  9 00:09:36 grant-03 kernel:  ffff8101e9b7bcb8 0000000000000082 00000000000000ff ffffffff8001c452
Dec  9 00:09:36 grant-03 kernel:  ffff810114abe300 0000000000000008 ffff810202795820 ffff81012396a860
Dec  9 00:09:36 grant-03 kernel:  0000629ea2c6e3c1 000000000002c57b ffff810202795a08 0000000200000000
Dec  9 00:09:36 grant-03 kernel: Call Trace:
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff8001c452>] generic_make_request+0x211/0x228
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff8006ec8f>] do_gettimeofday+0x40/0x90
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff800637ce>] io_schedule+0x3f/0x67
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff800f7fb6>] __blockdev_direct_IO+0x8d5/0xa88
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff800e885d>] blkdev_direct_IO+0x32/0x37
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff800e8795>] blkdev_get_blocks+0x0/0x96
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff8000c6fd>] __generic_file_aio_read+0xb8/0x198
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff8012e9a3>] inode_has_perm+0x56/0x63
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff800c935a>] generic_file_read+0xac/0xc5
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff8012e9a3>] inode_has_perm+0x56/0x63
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff800a2e5d>] autoremove_wake_function+0x0/0x2e
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff80041ef3>] do_ioctl+0x21/0x6b
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff80131530>] selinux_file_permission+0x9f/0xb4
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff8000b7c7>] vfs_read+0xcb/0x171
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff80011d5a>] sys_read+0x45/0x6e
Dec  9 00:09:36 grant-03 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0


2.6.18-274.el5

lvm2-2.02.88-5.el5    BUILT: Fri Dec  2 12:25:45 CST 2011
lvm2-cluster-2.02.88-5.el5    BUILT: Fri Dec  2 12:48:37 CST 2011
device-mapper-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5    BUILT: Wed Nov  2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009
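
The scenario in Comment 1 can be sketched as a small shell script. This is only an illustration under stated assumptions: the `lvcreate`, `dmsetup create` and `dmsetup remove` lines are taken from the scenario log, while the exact `pvmove` command line and the use of `dmsetup info` for the "Verifying the linear's dm state" step are guesses, and the `run` and `reproduce` helpers are hypothetical names. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch of the Comment 1 scenario. DRY_RUN=1 (the default) only prints the
# commands; set DRY_RUN=0 on a disposable test machine to execute them as root.
DRY_RUN=${DRY_RUN:-1}
VG=${VG:-mirror_sanity}   # volume group name from the scenario log
PV=${PV:-/dev/sdb1}       # physical volume from the scenario log

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: the scenario steps in order.
reproduce() {
    run lvcreate -n suspended -L 50M "$VG"
    # Fake left over pvmove target: an empty dm device with the pvmove name.
    run dmsetup create "$VG-pvmove0" --notable --addnodeoncreate
    # Assumed invocation; the log only says "Attempting pvmove of /dev/sdb1".
    run pvmove "$PV"
    # Assumed verification step: inspect the linear's dm state.
    run dmsetup info "$VG-suspended"
    run dmsetup remove "$VG-pvmove0"
}
```

Calling `reproduce` with the defaults prints the five commands prefixed with `+`. As noted later in this bug, mixing dmsetup with lvm commands is not a supported workflow, so this is suitable for a throwaway test system only.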

Comment 2 Alasdair Kergon 2011-12-09 21:06:16 UTC
I think this is fixed upstream.  We should compare relevant parts of the RHEL5 tree with upstream and see what's needed.

Comment 4 Milan Broz 2011-12-19 10:06:52 UTC
Corey, this test is flawed (I added it to simulate a specific situation, but mixing dmsetup with lvm operations is not a correct workflow).

Are we able to simulate it with normal lvm commands? If not, I think this is not a blocker.

I'll check the code, but any changes in pvmove are risky, and if we have no real-world reproducer it is not worth doing in this phase, IMHO.

(Cond nack reproducer - I mean one without the manual dmsetup step involved.)

Comment 5 Corey Marthaler 2012-01-11 15:06:28 UTC
This has not been seen with normal lvm commands. The hack to create that leftover device appears to be required in order for this problem to occur. I'll remove the blocker flag.

Comment 9 RHEL Program Management 2012-04-02 10:25:13 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.

Comment 10 Milan Broz 2012-04-20 13:36:58 UTC
Corey, can you please check that no devices are left in the suspended state (with the 5.8 rpms) when using normal lvm commands?

That's the only concern; otherwise, mixing dmsetup and lvm commands is not a supported configuration, so I will close this the same way as RHEL6 bug #736509, suggesting that this test case be removed.
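
The check requested in Comment 10 could be scripted roughly as follows. This is a minimal sketch, assuming the usual `Name:` / `State:` layout of plain `dmsetup info` output; the `list_suspended` function name is hypothetical. It reads the output on stdin so it can also be run against saved output.

```shell
#!/bin/sh
# Print the names of device-mapper devices whose State is SUSPENDED.
# Input: `dmsetup info` output on stdin (assumed "Name:" / "State:" layout).
list_suspended() {
    awk '
        $1 == "Name:"                       { name = $2 }
        $1 == "State:" && $2 == "SUSPENDED" { print name }
    '
}
```

On a live system it would be used as `dmsetup info | list_suspended`; empty output means no device is reported as suspended.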

Comment 11 Corey Marthaler 2012-08-08 15:46:56 UTC
I ran this test case on the latest 5.9 tree and, just as in the RHEL6 version of this bug (736509), it still produces the locking errors and "Failed to suspend" messages but no longer deadlocks due to a device stuck in the SUSPEND state. So, marking closed...


2.6.18-333.el5

lvm2-2.02.88-9.el5    BUILT: Wed Jul 25 10:13:00 CDT 2012
lvm2-cluster-2.02.88-9.el5    BUILT: Wed Jul 25 10:09:34 CDT 2012
device-mapper-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-15.el5    BUILT: Thu Apr 26 17:07:01 CDT 2012
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009

