Bug 500849 - pvmove on a cluster logical volume results in suspended devices
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: lvm2
Version: 4.7
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Milan Broz
QA Contact: Cluster QE
Keywords: ZStream
Depends On:
Blocks: 506030
 
Reported: 2009-05-14 10:31 EDT by Matthew Whitehead
Modified: 2013-02-28 23:07 EST
CC List: 15 users

See Also:
Fixed In Version: lvm2-2.02.42-8.el4
Doc Type: Bug Fix
Doc Text:
Under certain circumstances, running the pvmove utility could cause a device to be left in a suspended state. This update fixes the pvmove abort to be cluster-aware when a temporary mirror activation fails.
Story Points: ---
Clone Of:
Clones: 501473
Environment:
Last Closed: 2011-02-16 08:56:07 EST


Attachments: None
Description Matthew Whitehead 2009-05-14 10:31:30 EDT
Description of problem: running pvmove leaves devices suspended, which hangs clustered disks.


Version-Release number of selected component (if applicable): lvm2-cluster-2.02.37-3.el4


How reproducible: Intermittent, but frequent


Steps to Reproduce:
1. Run 'pvmove /dev/sd??'
  
Actual results:
The move may report success but still fail, leaving the device-mapper device in a suspended state.


Expected results: 
Devices should not be left suspended after pvmove reports 'success'.


Additional info:
Comment 5 Milan Broz 2009-05-14 12:34:14 EDT
This is a simple case showing how pvmove can leave suspended devices on the local node
(this is not a clustered VG):

# lvs -o +devices vg_test
  LV   VG      Attr   LSize Origin Snap%  Move Log Copy%  Convert Devices
  lv   vg_test -wi-ao 1.00G                                       /dev/sda3(0)

# create a fake pvmove target, as might be left over from a previous failure.
# note that there is no PVMOVE flag in the metadata!

# dmsetup create vg_test-pvmove0 --notable

# dmsetup info vg_test-lv
Name:              vg_test-lv
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 3
Number of targets: 1
UUID: LVM-QLxnTqNJIayeQgPG6TU32fsFJ56gAW2YLu6xaNTn42zKCPfHuAwTsfLwlBefINup

# pvmove /dev/sda3
  device-mapper: create ioctl failed: Device or resource busy
  ABORTING: Temporary mirror activation failed.  Run pvmove --abort.
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument

# dmsetup info vg_test-lv
Name:              vg_test-lv
State:             SUSPENDED
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 3
Number of targets: 1
UUID: LVM-QLxnTqNJIayeQgPG6TU32fsFJ56gAW2YLu6xaNTn42zKCPfHuAwTsfLwlBefINup

(pvmove --abort fixes this situation)
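
To spot devices stuck in this state, the State line of the dmsetup info output shown above can be filtered. The following is only a minimal sketch: the function name list_suspended is hypothetical, and it assumes the "Name:" / "State:" layout printed in this report.

```shell
#!/bin/sh
# Sketch: print the names of device-mapper devices reported as SUSPENDED.
# Reads `dmsetup info` output on stdin and relies on the "Name:" and
# "State:" lines appearing in that order for each device, as in this report.
list_suspended() {
    awk '
        $1 == "Name:"  { name = $2 }                       # remember current device
        $1 == "State:" && $2 == "SUSPENDED" { print name } # report it if suspended
    '
}

# Usage (as root): dmsetup info | list_suspended
```

Any name it prints is a candidate for the pvmove --abort recovery mentioned above.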
Comment 6 Milan Broz 2009-05-18 17:24:41 EDT
The patch is here. With it, pvmove should fail and revert its changes, so no suspended devices are left behind:
https://www.redhat.com/archives/lvm-devel/2009-May/msg00142.html

The situation above can happen if, for some reason, a residual temporary pvmove device is left behind but the LVM metadata has no pvmove flag.

In a cluster this can happen, for example, when the VG clustered flag is removed, a local pvmove is run, and the local pvmove operation is later not properly aborted (or is aborted from another node). The clustered pvmove must then fail because one node already has another local temporary pvmove device.
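
A node can be checked for such a leftover by listing device-mapper names that look like temporary pvmove devices and comparing them with lvs -a, which shows a move recorded in the metadata as a hidden pvmove volume. This is a rough sketch only: the function name list_pvmove_devices is hypothetical, and the match assumes the <vg>-pvmoveN naming seen in this report.

```shell
#!/bin/sh
# Sketch: print device-mapper names that look like temporary pvmove devices.
# Reads `dmsetup ls` output on stdin; only the first whitespace-separated
# field (the device name) is inspected.
list_pvmove_devices() {
    awk '$1 ~ /-pvmove[0-9]+$/ { print $1 }'
}

# Usage (as root): dmsetup ls | list_pvmove_devices
# A name printed here with no matching pvmove volume in `lvs -a` output
# suggests a residual device like the one described above.
```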
Comment 7 Milan Broz 2009-05-27 15:33:29 EDT
Fixed upstream in lvm2 2.02.47.
Comment 8 James G. Brown III 2009-05-28 07:51:25 EDT
Milan, is there a way to avoid this in the meantime, or any specific steps to take after it has happened?

- James
Comment 19 Milan Broz 2010-10-21 13:15:55 EDT
Fixed in lvm2-2.02.42-8.el4.
Comment 21 Corey Marthaler 2011-01-18 16:13:55 EST
I was unable to reproduce the 'SUSPENDED' state. Marking verified.

2.6.9-94.ELsmp

lvm2-2.02.42-9.el4    BUILT: Thu Oct 21 15:49:57 CDT 2010
lvm2-cluster-2.02.42-10.el4    BUILT: Tue Jan 18 06:17:17 CST 2011
device-mapper-1.02.28-3.el4    BUILT: Thu Mar  4 14:48:16 CST 2010
cmirror-1.0.2-1.el4    BUILT: Thu Feb 26 15:29:27 CST 2009
cmirror-kernel-2.6.9-43.14.el4    BUILT: Wed Dec 22 16:24:19 CST 2010


[root@taft-01 ~]# lvs -o +devices vg_test
  LV   VG      Attr   LSize Origin Snap%  Move Log Copy%  Convert Devices
  lv   vg_test -wi-a- 1.00G                                       /dev/sdb1(0)

[root@taft-01 ~]# dmsetup create vg_test-pvmove0 --notable

[root@taft-01 ~]# dmsetup info vg_test-lv
Name:              vg_test-lv
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        0
Event number:      0
Major, minor:      253, 2
Number of targets: 1
UUID: LVM-X27StDfNRoRDCiQLaGO5LPpwlbOJ1GzS5BG9ehTK3YuPNrgIf5tYlveeqtyYKxS0

[root@taft-01 ~]# pvmove  /dev/sdb1
  device-mapper: create ioctl failed: Device or resource busy
  Temporary pvmove mirror activation failed.

[root@taft-01 ~]# dmsetup info vg_test-lv
Name:              vg_test-lv
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        0
Event number:      0
Major, minor:      253, 2
Number of targets: 1
UUID: LVM-X27StDfNRoRDCiQLaGO5LPpwlbOJ1GzS5BG9ehTK3YuPNrgIf5tYlveeqtyYKxS0
Comment 22 Jaromir Hradilek 2011-01-20 07:48:39 EST
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Under certain circumstances, running the pvmove utility could cause a device to be left in a suspended state. This update fixes the pvmove abort to be cluster-aware when a temporary mirror activation fails.
Comment 23 errata-xmlrpc 2011-02-16 08:56:07 EST
An advisory has been issued which should help with the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0236.html