Bug 669111

Summary: Avoid partial pvmoves by only committing the change after all the data has been moved.
Product: Red Hat Enterprise Linux 6
Reporter: Chris Williams <cww>
Component: lvm2
Assignee: Jonathan Earl Brassow <jbrassow>
lvm2 sub component: Mirroring and RAID (RHEL6)
QA Contact: Cluster QE <mspqa-list>
Status: CLOSED ERRATA
Docs Contact:
Severity: medium
Priority: low
CC: agk, cmarthal, coughlan, cww, dwysocha, fhirtz, gborsuk, heinzm, jbrassow, jharriga, joe.thornber, msnitzer, nperic, pbatkowski, prajnoha, prockai, spurrier, ssaha, thornber, zkabelac
Version: 6.3
Keywords: FutureFeature, Triaged
Target Milestone: rc
Target Release: 6.3
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: lvm2-2.02.107-1.el6
Doc Type: Enhancement
Doc Text:
A new command-line option has been added to pvmove: '--atomic'. The current behaviour of pvmove is to identify all the segments of logical volumes to be moved from a given physical volume to another and process them one-by-one. The downside to this behaviour is that an aborted or failed pvmove may result in some parts of LVs being on the destination device while others remain on the source.

The '--atomic' option causes all identified logical volumes to be moved together. The commit that places each logical volume on its final destination does not happen until the last LV is processed. Therefore, any abort of the pvmove will ensure that all affected logical volumes remain on the source device.

Synopsis:
# pvmove --atomic <source PV> <dest PV>

NOTE: There is not yet a persistent log for the pvmove mirror. This means a system crash or VG deactivation will force the move to restart from the beginning.
Last Closed: 2014-10-14 08:22:36 UTC
Bug Depends On:    
Bug Blocks: 726492, 756082, 782183, 835616, 960054, 994246, 1075263    

Comment 1 Alasdair Kergon 2011-01-19 17:14:38 UTC
Use case:
We are looking for a built-in LVM way of migrating data from a source array to
a target storage.

The new feature would effectively do the following:
1) Commands to establish the intention to move/evacuate
2) Monitor progress
3) Once operations are complete, remove the PVs originating from the source
array from the volume group.

The existing pvmove method is not suitable because, at any point in time during
the move, a logical volume can span PVs from both the source and target arrays,
and failure of one array will result in loss of data.

Although using lvmirror followed by a mirror split avoids the pitfalls of
pvmove, it currently has the following issues:
a) Mirroring striped volumes is not supported
b) No single command wraps around the entire migration operation (which needs
to be treated as a single logical operation).

Comment 2 Alasdair Kergon 2011-01-19 17:19:17 UTC
Like a combination of vgsplit with a built-in pvmove.

Perhaps pvmove could be converted to use a real mirror log.

pvmove could be changed to not 'commit' the constituent pvmoves until the end.

It could initiate some (or all) in parallel.

Comment 23 Jonathan Earl Brassow 2014-04-23 18:54:02 UTC
I'd like to be clear as to what is being asked for here.

Is the user asking for something akin to the '-k' option in vxevac?  That is, the ability to hold the move as tentative until a "commit" or "rollback" operation is issued?

Currently, LVM uses what might be called a migrating mirror.  That is, it sets up the entire move but moves through each LV on that PV one at a time.  For each LV in sequence, a mirror is made and upon completion of the copy, the mirror is broken and the destination PV is kept.  Thus, any completed LVs are on the destination, any in-progress moves are on both PVs, and any up-coming LVs are only on the source PV.  A 'pvmove --abort' causes some of the LVs to be on the source and some to be on the destination.

Changing 'pvmove' to allow a tentative move (similar to the '-k' option to vxevac) would allow LVs that have completed their copy to maintain their mirror until all LVs on the PV have been moved.  This would allow an abort to keep all LVs on the original source PV.  The mirror(s) would not be broken until all LVs have moved to the destination entirely (or a "commit" is issued).

Is this what the customer is asking for?
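
The difference between the two behaviours can be illustrated with a toy model. This is only a sketch and not LVM code: the function names and the abort_after parameter are hypothetical, and real pvmove works on segments through a mirror rather than on a dictionary of LV names.

```python
def pvmove_sequential(lvs, src, dst, abort_after=None):
    """Toy model of the existing behaviour: each LV is committed to the
    destination as soon as its own copy completes."""
    placement = {lv: src for lv in lvs}
    for copied, lv in enumerate(lvs):
        if abort_after is not None and copied >= abort_after:
            break  # abort: LVs already committed stay on dst
        placement[lv] = dst  # mirror broken, destination kept
    return placement


def pvmove_atomic(lvs, src, dst, abort_after=None):
    """Toy model of a tentative move: nothing is committed until every
    LV has finished copying."""
    placement = {lv: src for lv in lvs}
    copied = []
    for lv in lvs:
        if abort_after is not None and len(copied) >= abort_after:
            return placement  # abort: every LV is still on src
        copied.append(lv)  # copy finished, but the mirror is kept
    for lv in copied:
        placement[lv] = dst  # single commit at the very end
    return placement
```

Aborting after the first LV leaves a mixed placement in the sequential model and an untouched placement in the atomic one.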

Comment 30 Alasdair Kergon 2014-05-10 12:48:43 UTC
We held an internal discussion about this last week and concluded that we will experiment to see whether we can offer an alternative pvmove mechanism that makes use of a single mirror (with log).  The final commit will be performed automatically when the mirror sync is complete unless the user provides a cmdline option (or lvm.conf setting) that means an explicit 'pvmove --commit' will be supplied by the user.  pvmove --abort will abort the entire operation.

Comment 31 Jonathan Earl Brassow 2014-06-12 14:24:50 UTC
Criteria for QA testing:

Currently, if a pvmove that is in the process of moving several LVs is aborted, those LVs that have completed the move stay on the destination device while the others remain on the source.  When testing the atomic pvmove, an abort should leave all LVs on the source device; that is, either all LVs are moved or none are.

I will have the CLI option for specifying atomic pvmove when finished.
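
One possible way to check this criterion mechanically is to parse an lvs report and list which LVs have segments on a given PV. The sketch below is an assumption, not part of any test suite: lvs_on_pv is a hypothetical helper, and it expects text in the form produced by something like `lvs -a --noheadings --separator '|' -o lv_name,devices`.

```python
import re


def lvs_on_pv(report, pv):
    """Return the sorted names of LVs with at least one segment on `pv`.

    `report` holds one "lv_name|devices" record per line; hidden LVs may
    appear in square brackets, which are stripped from the result.
    """
    found = set()
    for line in report.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, devices = line.partition("|")
        for dev in devices.split(","):
            # Drop the trailing "(start_extent)" suffix, e.g. "/dev/sdb1(1)".
            if re.sub(r"\(\d+\)$", "", dev.strip()) == pv:
                found.add(name.strip().strip("[]"))
    return sorted(found)
```

After an aborted atomic pvmove, the list for the destination PV should be empty and the list for the source PV should be unchanged from before the move.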

Comment 32 Jonathan Earl Brassow 2014-06-18 04:05:40 UTC
Feature checked in upstream:

commit 5ebff6cc9f631b7409d99b72fa0b39ccec30bf1f
Author: Jonathan Brassow <jbrassow>
Date:   Tue Jun 17 22:59:36 2014 -0500

    pvmove: Enable all-or-nothing (atomic) pvmoves

Comment 33 Jonathan Earl Brassow 2014-06-18 04:10:13 UTC
There is one new CLI option, '--atomic'.  The check-in is still fresh and there may be some issue with calling it '--atomic'.  However, for now if you want the all-or-nothing pvmove, you would do the following:

# pvmove --atomic <source PV> <dest PV>

If a 'pvmove --abort' is performed, all LVs will remain on the source PV.

NOTE: I have not yet included a persistent log for the pvmove mirror.  This means a system crash or VG deactivation will force the move to start from the beginning.  It is trivial to add a log, I think; and I will try to have this done for 6.6 if no other priority takes over.

Comment 34 Alasdair Kergon 2014-06-18 11:25:38 UTC
See comment #30: there's a small additional patch required: the requirement included being able to control the commit at the end manually.

"The final commit will be performed automatically when the mirror sync is complete unless the user provides a cmdline option (or lvm.conf setting) that means an explicit 'pvmove --commit' will be supplied by the user.  pvmove --abort will abort the entire operation."

Comment 36 Alasdair Kergon 2014-06-23 18:26:11 UTC
So three further pieces here:

1) use a persistent log (a patch is in progress)

2) add a --waitforcommit or --manualcommit option (with lvm.conf equivalent).  This means the committing of the move will not happen automatically.  --abort will still abort it at any stage.  If the move is complete --commit will commit it.  If the move is not complete, --commit will override --waitforcommit and ensure the commit is performed automatically when the move is complete.

3) extend the pvmove syntax to allow more than one PV to be moved at once with a single commit.
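
The commit rules proposed in item 2 can be summarised as a small state machine. This is a sketch of the proposal only: --waitforcommit and --commit are suggested options at this point in the discussion, and the AtomicMove class and its method names are hypothetical.

```python
class AtomicMove:
    """Toy model of the proposed commit semantics for an atomic pvmove."""

    def __init__(self, wait_for_commit=False):
        self.wait_for_commit = wait_for_commit  # models --waitforcommit
        self.copy_complete = False
        self.state = "in_progress"  # or "committed" / "aborted"

    def _maybe_commit(self):
        # Commit only when the copy is done and no manual commit is pending.
        if (self.state == "in_progress" and self.copy_complete
                and not self.wait_for_commit):
            self.state = "committed"

    def finish_copy(self):
        """The mirror has finished syncing."""
        if self.state == "in_progress":
            self.copy_complete = True
            self._maybe_commit()

    def commit(self):
        """Models --commit: commit now if the copy is complete, otherwise
        cancel the wait so the commit happens automatically later."""
        self.wait_for_commit = False
        self._maybe_commit()

    def abort(self):
        """Models --abort: allowed at any stage before the commit."""
        if self.state == "in_progress":
            self.state = "aborted"
```

In this model a commit issued before the copy finishes does not commit immediately; it only clears the wait flag, which matches the description that --commit overrides --waitforcommit.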

Comment 38 Jonathan Earl Brassow 2014-08-15 02:32:19 UTC
(In reply to Alasdair Kergon from comment #36)
> So three further pieces here:
> 
> 1) use a persistent log (a patch is in progress)

bug 1130352

> 
> 2) add a --waitforcommit or --manualcommit option (with lvm.conf
> equivalent).  This means the committing of the move will not happen
> automatically.  --abort will still abort it at any stage.  If the move is
> complete --commit will commit it.  If the move is not complete, --commit
> will override --waitforcommit and ensure the commit is performed
> automatically when the move is complete.

bug 1130353

> 
> 3) extend the pvmove syntax to allow more than one PV to be moved at once
> with a single commit.

bug 1130354

Comment 42 Nenad Peric 2014-08-20 11:22:16 UTC
[root@virt-065 ~]# lvs -a -o+devices
  LV                 VG         Attr       LSize   Data%  Meta%  Move Log Cpy%Sync Convert Devices
  radi_lv            test       rwi-a-r---   5.00g                        100.00           radi_lv_rimage_0(0),radi_lv_rimage_1(0)
  [radi_lv_rimage_0] test       iwi-aor---   5.00g                                         /dev/sdb1(1)
  [radi_lv_rmeta_0]  test       ewi-aor---   4.00m                                         /dev/sdb1(0)
  [radi_lv_rimage_1] test       iwi-aor---   5.00g                                         /dev/sdc1(1)
  [radi_lv_rmeta_1]  test       ewi-aor---   4.00m                                         /dev/sdc1(0)
  lv_root            vg_virt065 -wi-ao----   6.71g                                         /dev/vda2(0)
  lv_swap            vg_virt065 -wi-ao---- 816.00m                                         /dev/vda2(1718)

[root@virt-065 ~]# pvmove --atomic /dev/sdb1 /dev/sde1 &

[root@virt-065 ~]# pvmove --abort

[root@virt-065 ~]# lvs -a -o+devices
  LV                 VG         Attr       LSize   Data%  Meta%  Move Log Cpy%Sync Convert Devices
  radi_lv            test       rwi-a-r---   5.00g                        100.00           radi_lv_rimage_0(0),radi_lv_rimage_1(0)
  [radi_lv_rimage_0] test       iwi-aor---   5.00g                                         /dev/sdb1(1)
  [radi_lv_rmeta_0]  test       ewi-aor---   4.00m                                         /dev/sdb1(0)
  [radi_lv_rimage_1] test       iwi-aor---   5.00g                                         /dev/sdc1(1)
  [radi_lv_rmeta_1]  test       ewi-aor---   4.00m                                         /dev/sdc1(0)
  lv_root            vg_virt065 -wi-ao----   6.71g                                         /dev/vda2(0)
  lv_swap            vg_virt065 -wi-ao---- 816.00m                                         /dev/vda2(1718)



[root@virt-065 ~]# pvmove --atomic /dev/sdb1 /dev/sde1 &
[root@virt-065 ~]# lvs -a -o+devices
  LV                 VG         Attr       LSize   Data%  Meta%  Move      Log Cpy%Sync Convert Devices
  radi_lv            test       rwi-a-r---   5.00g                             100.00           radi_lv_rimage_0(0),radi_lv_rimage_1(0)
  [radi_lv_rimage_0] test       iwI-aor---   5.00g                                              pvmove0(0)
  [radi_lv_rmeta_0]  test       ewI-aor---   4.00m                                              pvmove0(1280)
  [radi_lv_rimage_1] test       iwi-aor---   5.00g                                              /dev/sdc1(1)
  [radi_lv_rmeta_1]  test       ewi-aor---   4.00m                                              /dev/sdc1(0)
  [pvmove0]          test       p-C-aom---   5.00g               /dev/sdb1     15.53            pvmove0_mimage_0(0),pvmove0_mimage_1(0)
  [pvmove0_mimage_0] test       Iwi-aom---   5.00g                                              /dev/sdb1(1)
  [pvmove0_mimage_0] test       Iwi-aom---   5.00g                                              /dev/sdb1(0)
  [pvmove0_mimage_1] test       Iwi-aom---   5.00g                                              /dev/sde1(0)
  [pvmove0_mimage_1] test       Iwi-aom---   5.00g                                              /dev/sde1(1280)
  lv_root            vg_virt065 -wi-ao----   6.71g                                              /dev/vda2(0)
  lv_swap            vg_virt065 -wi-ao---- 816.00m                                              /dev/vda2(1718)
[root@virt-065 ~]# lvs -a -o+devices
  LV                 VG         Attr       LSize   Data%  Meta%  Move Log Cpy%Sync Convert Devices
  radi_lv            test       rwi-a-r---   5.00g                        100.00           radi_lv_rimage_0(0),radi_lv_rimage_1(0)
  [radi_lv_rimage_0] test       iwi-aor---   5.00g                                         /dev/sde1(0)
  [radi_lv_rmeta_0]  test       ewi-aor---   4.00m                                         /dev/sde1(1280)
  [radi_lv_rimage_1] test       iwi-aor---   5.00g                                         /dev/sdc1(1)
  [radi_lv_rmeta_1]  test       ewi-aor---   4.00m                                         /dev/sdc1(0)
  lv_root            vg_virt065 -wi-ao----   6.71g                                         /dev/vda2(0)
  lv_swap            vg_virt065 -wi-ao---- 816.00m                                         /dev/vda2(1718)


Marking this behaviour, as described in comment 33, as VERIFIED with:

lvm2-2.02.109-1.el6    BUILT: Tue Aug  5 17:36:23 CEST 2014
lvm2-libs-2.02.109-1.el6    BUILT: Tue Aug  5 17:36:23 CEST 2014
lvm2-cluster-2.02.109-1.el6    BUILT: Tue Aug  5 17:36:23 CEST 2014
udev-147-2.57.el6    BUILT: Thu Jul 24 15:48:47 CEST 2014
device-mapper-1.02.88-1.el6    BUILT: Tue Aug  5 17:36:23 CEST 2014
device-mapper-libs-1.02.88-1.el6    BUILT: Tue Aug  5 17:36:23 CEST 2014
device-mapper-event-1.02.88-1.el6    BUILT: Tue Aug  5 17:36:23 CEST 2014
device-mapper-event-libs-1.02.88-1.el6    BUILT: Tue Aug  5 17:36:23 CEST 2014
device-mapper-persistent-data-0.3.2-1.el6    BUILT: Fri Apr  4 15:43:06 CEST 2014
cmirror-2.02.109-1.el6    BUILT: Tue Aug  5 17:36:23 CEST 2014

Comment 44 errata-xmlrpc 2014-10-14 08:22:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1387.html