Support the ability to replace specific devices in a RAID array.

RAID is not like traditional LVM mirroring. LVM mirroring required failed devices to be removed or the logical volume would simply hang. RAID arrays can keep on running with failed devices. In fact, for RAID types other than RAID1, removing a device would mean substituting an error target or converting to a lower-level RAID (e.g. RAID6 -> RAID5, or RAID4/5 -> RAID0). Therefore, rather than removing a failed device unconditionally and potentially allocating a replacement, RAID allows the user to "replace" a device with a new one. This approach is a 1-step solution vs the current 2-step solution.

Example:
  lvconvert --replace <dev to remove> vg/lv [possible replacements]
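The per-RAID-type consequences described above can be sketched as a small lookup, which is why "replace" is preferable to "remove". This is a hypothetical illustration, not lvm2 code; the function name `removal_consequence` is invented for this sketch.

```python
# Hypothetical sketch (not lvm2 code): what unconditionally removing a
# device would imply for each RAID segment type, per the description above.

def removal_consequence(raid_type: str) -> str:
    """Return what an unconditional device removal would mean for the array."""
    if raid_type == "raid1":
        # A raid1 leg can simply be dropped; the mirror keeps running.
        return "drop leg"
    if raid_type == "raid6":
        return "convert to raid5 (or substitute an error target)"
    if raid_type in ("raid4", "raid5"):
        return "convert to raid0 (or substitute an error target)"
    raise ValueError("unknown RAID type: %s" % raid_type)
```

Because every non-RAID1 case forces a level change or an error target, the one-step `--replace` (allocate new leg, rebuild, drop old leg) avoids ever passing through a degraded intermediate layout.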
Release criteria (test requirements):

1) Ability to replace a device in an array:

   [root@bp-01 LVM2]# lvcreate --type raid1 -m2 -L 1G -n lv vg
     Logical volume "lv" created
   [root@bp-01 LVM2]# devices vg
     LV            Copy%   Devices
     lv            100.00  lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
     [lv_rimage_0]         /dev/sdb1(1)
     [lv_rimage_1]         /dev/sdb2(1)
     [lv_rimage_2]         /dev/sdc1(1)
     [lv_rmeta_0]          /dev/sdb1(0)
     [lv_rmeta_1]          /dev/sdb2(0)
     [lv_rmeta_2]          /dev/sdc1(0)
   [root@bp-01 LVM2]# lvconvert --replace /dev/sdb2 vg/lv
   [root@bp-01 LVM2]# devices vg
     LV            Copy%   Devices
     lv            37.50   lv_rimage_0(0),lv_rimage_1(0),lv_rimage_2(0)
     [lv_rimage_0]         /dev/sdb1(1)
     [lv_rimage_1]         /dev/sdc2(1)
     [lv_rimage_2]         /dev/sdc1(1)
     [lv_rmeta_0]          /dev/sdb1(0)
     [lv_rmeta_1]          /dev/sdc2(0)
     [lv_rmeta_2]          /dev/sdc1(0)

2) The device being rebuilt should be synced properly. See in #1 how, after the convert, the 'Copy%' reflects that the device is being rebuilt. You can even tell which specific device by checking 'dmsetup status' and looking for the lowercase 'a', which means "(a)live" but resyncing. (An uppercase 'A' means "(A)live" and synced.)

   [root@bp-01 LVM2]# dmsetup status vg-lv
   0 2097152 raid raid1 3 AaA 524800/2097152

3) Check the integrity of the replacement by writing a pattern to a two-way RAID1, replacing one device, then replacing the other, and verifying the pattern.

4) Test the ability to specify a replacement device:

   lvconvert --replace <old PV> vg/lv <new PV>

5) Try replacing more than one device at a time by specifying multiple '--replace' arguments. This should work for n-1 devices of RAID1, 2 devices of RAID6, and 1 device for RAID4/5.

   lvconvert --replace <old PV1> --replace <old PV2> vg/lv

6) It should be forbidden to replace devices while the array is not in-sync.

7) Replacement devices should never be allocated from extra space on drives already used in the array. IOW, lv_rimage_0 and lv_rimage_1 should not be located on the same PV.
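The health characters from criterion #2 can be checked mechanically. A minimal parser, assuming only the field layout visible in the example line (start, length, "raid", type, device count, health characters, synced/total); `parse_raid_status` is a name invented for this sketch, not an lvm2 or device-mapper API:

```python
# Minimal parser for the dm-raid status line shown in criterion #2.
# 'A' = alive and in-sync, 'a' = alive but still resyncing.

def parse_raid_status(line):
    """Return (raid_type, indices of resyncing devices, sync percent)."""
    fields = line.split()
    raid_type = fields[3]                 # e.g. "raid1"
    ndev = int(fields[4])                 # number of devices in the array
    health = fields[5]                    # one character per device
    synced, total = (int(x) for x in fields[6].split("/"))
    assert len(health) == ndev
    resyncing = [i for i, c in enumerate(health) if c == "a"]
    return raid_type, resyncing, 100.0 * synced / total

rtype, resyncing, pct = parse_raid_status(
    "0 2097152 raid raid1 3 AaA 524800/2097152")
# Device index 1 (the replaced leg) is still resyncing; sync is ~25%.
```

Applied to the example above, this identifies the middle device as the one being rebuilt, matching the 'a' in "AaA".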
Feature checked in upstream - version 2.02.89.
Git commit id: 02941f999ce0f8fa68b923f13cd48219db1fbab6
Adding QA ack for 6.3.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
New feature for 6.3. No documentation required. Bug 732458 is the bug that requires a release note for the RAID features. Other documentation is found in the LVM manual. Operational bugs need no documentation because they are being fixed before their initial release.
Feature verified in the latest rpms:

lvm2-2.02.95-5.el6                      BUILT: Thu Apr 19 10:29:01 CDT 2012
lvm2-libs-2.02.95-5.el6                 BUILT: Thu Apr 19 10:29:01 CDT 2012
lvm2-cluster-2.02.95-5.el6              BUILT: Thu Apr 19 10:29:01 CDT 2012
udev-147-2.41.el6                       BUILT: Thu Mar  1 13:01:08 CST 2012
device-mapper-1.02.74-5.el6             BUILT: Thu Apr 19 10:29:01 CDT 2012
device-mapper-libs-1.02.74-5.el6        BUILT: Thu Apr 19 10:29:01 CDT 2012
device-mapper-event-1.02.74-5.el6       BUILT: Thu Apr 19 10:29:01 CDT 2012
device-mapper-event-libs-1.02.74-5.el6  BUILT: Thu Apr 19 10:29:01 CDT 2012
cmirror-2.02.95-5.el6                   BUILT: Thu Apr 19 10:29:01 CDT 2012

./raid_shuffle -o taft-01 -l /home/msp/cmarthal/work/sts/sts-root -r /usr/tests/sts-rhel6.3 -i 20
[...]
=== Iteration 20 of 20 started on taft-01 at Mon Apr 23 17:46:14 CDT 2012 ===

INUSE PVS IN VG:     /dev/sdf2 /dev/sdg1 /dev/sdg2 /dev/sdh1 /dev/sdh2
NOT INUSE PVS IN VG: /dev/sde2 /dev/sdf1
FREE PVS:            /dev/sdc2 /dev/sdd1 /dev/sdd2 /dev/sde1

Adding /dev/sde1 to volume group
  vgextend shuffle /dev/sde1
Moving data (replacing raid image) from /dev/sdg1 to /dev/sde1 on taft-01
  lvconvert --replace /dev/sdg1 shuffle/raid /dev/sde1
Waiting until all mirror|raid volumes become fully syncd...
  0/1 mirror(s) are fully synced: ( 78.02% )
  1/1 mirror(s) are fully synced: ( 100.00% )
Verify the device moved from /dev/sdg1 is no longer present
Verify the device moved to /dev/sde1 is present
Checking files on /mnt/raid
  /usr/tests/sts-rhel6.3/bin/checkit -w /mnt/raid -f /tmp/raid_shuffleA.15500 -v
  checkit starting with: VERIFY
  Verify XIOR Stream: /tmp/raid_shuffleA.15500
  Working dir: /mnt/raid
Removing /dev/sde2 from volume group
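The "Waiting until all mirror|raid volumes become fully syncd" step in the log above is a poll-until-100% loop. A sketch of that loop, assuming a hypothetical `copy_percent` callable that reports the LV's Copy% (e.g. scraped from 'lvs -o copy_percent'); this is not the actual raid_shuffle code:

```python
# Sketch of the test's wait-until-synced polling loop. copy_percent is a
# hypothetical probe returning the RAID LV's Copy% as a float.

def wait_until_synced(copy_percent, max_polls=100):
    """Poll until the RAID LV reports 100% in-sync; return the samples seen."""
    samples = []
    for _ in range(max_polls):
        pct = copy_percent()
        samples.append(pct)
        if pct >= 100.0:
            return samples
    raise TimeoutError("volume never became fully synced")

# Example with a canned progression matching the log (78.02% then 100%):
progress = iter([78.02, 100.00])
samples = wait_until_synced(lambda: next(progress))
```

Bounding the polls (rather than looping forever) lets the harness fail the iteration if a replaced leg never finishes rebuilding.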
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0962.html