Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 684083

Summary:	pvmove stuck waiting for I/O to complete
Product:	Red Hat Enterprise Linux 6	Reporter:	Lachlan McIlroy <lmcilroy>
Component:	lvm2	Assignee:	Alasdair Kergon <agk>
Status:	CLOSED ERRATA	QA Contact:	Corey Marthaler <cmarthal>
Severity:	medium	Docs Contact:
Priority:	high
Version:	6.0	CC:	agk, dejohnso, dwysocha, heinzm, jbrassow, mbroz, prajnoha, prockai, pyu, thornber, vgaikwad, zkabelac
Target Milestone:	rc
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	lvm2-2.02.86-1.el6	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:	602516	Environment:
Last Closed:	2011-12-06 16:54:39 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	602516, 706036

Comment 1 Lachlan McIlroy 2011-03-11 04:19:28 UTC

The customer has reproduced this bug on RHEL 6.

"We are trying to move the content of a physical disk in a volume group with 20 open logical volumes via the command "pvmove" to another physical disk freshly added to this volume group. To simulate database I/O we started two parallel iozone programs on two different logical volumes of the mentioned 20 logical volumes, which are all mounted. We can reproduce that the pvmove command hangs after some time and the two iozone processes are stalled, too.

We would expect that the pvmove command moves the physical volume even when there is some load on the volumes as we often have to move disks when there is a database accessing this volume."

$ grep -E 'Suspend|Resume' pvmove_verbose_1.txt 
#libdm-deptree.c:1077     Suspending TEST1-test.1 (253:22) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.2 (253:23) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.3 (253:24) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.4 (253:25) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.5 (253:26) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.6 (253:27) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.7 (253:28) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.8 (253:29) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.9 (253:30) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.10 (253:31) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.11 (253:32) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.12 (253:33) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.13 (253:34) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.14 (253:35) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.15 (253:36) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.16 (253:37) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.17 (253:38) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.18 (253:39) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.19 (253:40) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.20 (253:41) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.1 (253:22) with device flush
#libdm-deptree.c:1077     Suspending TEST1-pvmove0 (253:42) with device flush    <---- pvmove0 suspended
#libdm-deptree.c:1077     Suspending TEST1-test.2 (253:23) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.3 (253:24) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.4 (253:25) with device flush
#libdm-deptree.c:1077     Suspending TEST1-test.5 (253:26) with device flush     <---- TEST1-test.5 is waiting for I/O to complete that's stuck in pvmove0



Mar  8 10:20:57 degtlun1843 kernel: pvmove        S ffff8801a7828800     0 17762  17761 0x00000001
Mar  8 10:20:57 degtlun1843 kernel: ffff88018011dcb8 0000000000000082 0000000000000000 ffff88019258ec00
Mar  8 10:20:57 degtlun1843 kernel: ffff88018011dc38 ffffffff8123b274 ffff88019bc58ec0 0000000103f42701
Mar  8 10:20:57 degtlun1843 kernel: ffff88019d116678 ffff88018011dfd8 0000000000010518 ffff88019d116678
Mar  8 10:20:57 degtlun1843 kernel: Call Trace:
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffff8123b274>] ? blk_unplug+0x34/0x70
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffff814c9533>] io_schedule+0x73/0xc0
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffffa000298b>] dm_wait_for_completion+0x9b/0x100 [dm_mod]
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffff8105c530>] ? default_wake_function+0x0/0x20
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffffa0002af8>] dm_suspend+0x108/0x1f0 [dm_mod]
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffffa00085a6>] dev_suspend+0x76/0x240 [dm_mod]
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffffa0008530>] ? dev_suspend+0x0/0x240 [dm_mod]
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffffa0008fc3>] ctl_ioctl+0x1a3/0x240 [dm_mod]
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffffa0009073>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffff8117fa12>] vfs_ioctl+0x22/0xa0
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffff810c711c>] ? utrace_stop+0x12c/0x1e0
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffff8117fbb4>] do_vfs_ioctl+0x84/0x580
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffff810c865e>] ? utrace_report_syscall_entry+0x10e/0x160
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffff81180131>] sys_ioctl+0x81/0xa0
Mar  8 10:20:57 degtlun1843 kernel: [<ffffffff81013387>] tracesys+0xd9/0xde

Comment 3 RHEL Program Management 2011-04-04 02:03:34 UTC

Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 5 Corey Marthaler 2011-06-03 18:04:24 UTC

Adding QA ack for 6.2.

However, Devel will need to provide unit testing results before this bug can
ultimately be verified by QA.

Comment 6 hank 2011-07-01 09:33:19 UTC

The same bug is reported against RHEL 5:
https://bugzilla.redhat.com/show_bug.cgi?id=706036

Someone is working on a fix.

In comment 13, Alasdair provides a workaround:

In the meantime, as a workaround, try using the -n option of pvmove to move
only one LV at once.

List of LVS in VG:  lvs --noheadings -o name $vg
Move one LV:  pvmove -i0 -n $lvname
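
The two commands above could be combined into a loop along the following lines. This is only a rough sketch and has not been run against the configuration in this bug. The function name pvmove_one_lv_at_a_time and the explicit source PV argument are assumptions added for illustration (the commands quoted above do not show a PV argument). The sketch also assumes that it is acceptable to continue with the next LV when one move fails, for example for an LV that has nothing to move on that PV.

```shell
#!/bin/sh
# Sketch: move data off a source PV one LV at a time, as suggested in the
# workaround. Hypothetical helper; the name and the SOURCE_PV argument are
# assumptions, not taken from the bug report.
pvmove_one_lv_at_a_time() {
    vg=$1
    src_pv=$2
    if [ -z "$vg" ] || [ -z "$src_pv" ]; then
        echo "usage: pvmove_one_lv_at_a_time VG SOURCE_PV" >&2
        return 2
    fi

    failed=0
    # lvs pads each name with whitespace; the unquoted expansion strips it.
    for lvname in $(lvs --noheadings -o name "$vg"); do
        echo "Moving $vg/$lvname off $src_pv"
        if ! pvmove -i0 -n "$lvname" "$src_pv"; then
            # Assumption: keep going and report the failure at the end.
            echo "pvmove failed for $vg/$lvname" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Example call (the VG name is taken from the log above, the PV path is a placeholder): pvmove_one_lv_at_a_time TEST1 /dev/sdX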

Comment 7 Alasdair Kergon 2011-07-06 16:57:56 UTC

This passed the upstream test suite for the first time last night.  However, due to the complexity of the change and the amount of regression testing I believe it needs, I am not offering this as a Z-stream release, but only releasing it as part of the next scheduled update, viz. 6.2.  In the meantime, I'm afraid the above workaround is the best I can offer.

Comment 8 Alasdair Kergon 2011-07-08 21:36:10 UTC

Upstream release 2.02.86 is included in Fedora rawhide.  Please test.

Comment 10 Corey Marthaler 2011-10-07 20:17:25 UTC

I added a basic pvmove during I/O regression test case. I didn't see any issues while running it on the latest rpms. Marking this verified (SanityOnly).

SCENARIO - [pvmove_during_io]
Pvmove a volume during active I/O
grant-01: lvcreate -n move_during_io -L 800M mirror_sanity
Starting io to linear to be pvmoved
Attempting pvmove of /dev/sdc6 on grant-01
Deactivating mirror move_during_io... and removing


2.6.32-203.el6.x86_64

lvm2-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
lvm2-libs-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
lvm2-cluster-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
device-mapper-libs-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
device-mapper-event-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
device-mapper-event-libs-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
cmirror-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011

Comment 11 errata-xmlrpc 2011-12-06 16:54:39 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1522.html