Bug 877221
| Field | Value |
| --- | --- |
| Summary | lvconvert --repair won't reuse physical volumes |
| Product | Red Hat Enterprise Linux 6 |
| Reporter | benscott |
| Component | lvm2 |
| lvm2 sub component | Mirroring and RAID (RHEL6) |
| Assignee | Jonathan Earl Brassow <jbrassow> |
| QA Contact | Cluster QE <mspqa-list> |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | high |
| CC | agk, cmarthal, dwysocha, heinzm, jbrassow, mkarg, msnitzer, nperic, prajnoha, prockai, thornber, zkabelac |
| Version | 6.5 |
| Target Milestone | beta |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | lvm2-2.02.107-2.el6 |
| Doc Type | Bug Fix |
| Doc Text | No documentation needed. |
| Last Closed | 2014-10-14 08:23:39 UTC |
| Type | Bug |
| Bug Blocks | 960054, 1056252, 1075263 |
Description
benscott, 2012-11-16 00:41:47 UTC
While waiting for the code to be able to do this automatically, here are the steps to do what you want manually:

1) Remove the failed device (removes the failed device from the VG):
   `# vgreduce --removemissing --force <vg>`
2) Down-convert the LV (removes the empty slot from the RAID LV):
   `# lvconvert -m -1 <vg>/<lv>`
3) Up-convert the LV (use available VG space to allocate another RAID image):
   `# lvconvert --type raid1 -m 1 <vg>/<lv>`

Example:

```
Couldn't find device with uuid ObYo59-GiW9-QS50-VIGT-dYu8-E3Gt-8Hhhou.
  LV            Attr       Cpy%Sync Devices
  lv            rwi-a-r-p- 100.00   lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] iwi-aor-p-          unknown device(1)
  [lv_rimage_0] iwi-aor-p-          /dev/sdd1(0)
  [lv_rimage_1] iwi-aor---          /dev/sdc1(1)
  [lv_rimage_1] iwi-aor---          /dev/sde1(0)
  [lv_rmeta_0]  ewi-aor-p-          unknown device(0)
  [lv_rmeta_1]  ewi-aor---          /dev/sdc1(0)

[root@bp-01 lvm2]# vgreduce --removemissing --force vg
  /dev/sdb1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdb1: read failed after 0 of 512 at 898381381632: Input/output error
  /dev/sdb1: read failed after 0 of 512 at 898381488128: Input/output error
  /dev/sdb1: read failed after 0 of 512 at 0: Input/output error
  /dev/sdb1: read failed after 0 of 512 at 4096: Input/output error
  Couldn't find device with uuid ObYo59-GiW9-QS50-VIGT-dYu8-E3Gt-8Hhhou.
  Wrote out consistent volume group vg

[root@bp-01 lvm2]# devices vg
  /dev/sdb1: open failed: No such device or address
  LV            Attr       Cpy%Sync Devices
  lv            rwi-a-r-r- 100.00   lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] vwi-aor-r-
  [lv_rimage_1] iwi-aor---          /dev/sdc1(1)
  [lv_rimage_1] iwi-aor---          /dev/sde1(0)
  [lv_rmeta_0]  ewi-aor-r-
  [lv_rmeta_1]  ewi-aor---          /dev/sdc1(0)

[root@bp-01 lvm2]# lvconvert -m 0 vg/lv
  /dev/sdb1: open failed: No such device or address

[root@bp-01 lvm2]# devices vg
  /dev/sdb1: open failed: No such device or address
  LV Attr       Cpy%Sync Devices
  lv -wi-a-----          /dev/sdc1(1)
  lv -wi-a-----          /dev/sde1(0)

[root@bp-01 lvm2]# lvconvert --type raid1 -m 1 vg/lv
  /dev/sdb1: open failed: No such device or address

[root@bp-01 lvm2]# devices vg
  /dev/sdb1: open failed: No such device or address
  LV            Attr       Cpy%Sync Devices
  lv            rwi-a-r--- 100.00   lv_rimage_0(0),lv_rimage_1(0)
  [lv_rimage_0] iwi-aor---          /dev/sdc1(1)
  [lv_rimage_0] iwi-aor---          /dev/sde1(0)
  [lv_rimage_1] iwi-aor---          /dev/sdd1(1)
  [lv_rmeta_0]  ewi-aor---          /dev/sdc1(0)
  [lv_rmeta_1]  ewi-aor---          /dev/sdd1(0)
```

Three upstream check-ins are needed to fix this bug (plus an extra one for a test that will otherwise cause merge conflicts):

commit ed3c2537b82be4e326a53c7e3e6d5eccdd833800
Author: Jonathan Brassow <jbrassow>
Date: Wed Jun 25 22:26:06 2014 -0500

raid: Allow repair to reuse PVs from same image that suffered a PV failure

When repairing RAID LVs that have multiple PVs per image, allow replacement images to be reallocated from the PVs that have not failed in the image if there is sufficient space. This allows for scenarios where a 2-way RAID1 is spread across 4 PVs, where each image lives on two PVs but doesn't use the entire space on any of them. If one PV fails and there is sufficient space on the remaining PV in the image, the image can be reallocated on just the remaining PV.
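As a rough sketch of the repair scenario this first commit describes (the VG name vg1, the device names, and the sizes are illustrative rather than taken from the fix itself; the verification steps later in this report run essentially the same sequence):

```
# Assumptions: a VG named vg1 built from four PVs /dev/sda1../dev/sdd1,
# each with more free space than the extent ranges requested below.

# 2-way RAID1; limiting each PV to extents 0-200 forces each 1000m image
# to be split across two PVs.
lvcreate -m 1 --type raid1 -n lvol0 -L 1000m vg1 \
    /dev/sda1:0-200 /dev/sdb1:0-200 /dev/sdc1:0-200 /dev/sdd1:0-200

# After one PV of an image fails (say /dev/sdc1), refresh LVM's view of it.
pvscan --cache /dev/sdc1

# Repair. With this fix the replacement image may be allocated from the
# remaining free space on /dev/sda1, the surviving PV of the affected image.
lvconvert --repair -y vg1/lvol0 /dev/sda1
lvs --all --segments -o +devices vg1
```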
commit 7028fd31a0f2d2234ffdd1b94ea6ae6128ca9362
Author: Jonathan Brassow <jbrassow>
Date: Wed Jun 25 22:04:58 2014 -0500

misc: after releasing a PV segment, merge it with any adjacent free space

Previously, the seg_pvs used to track free and allocated space were left in place after 'release_pv_segment' was called to free space from an LV. Now, an attempt is made to combine any adjacent seg_pvs that also track free space. Usually, this doesn't provide much benefit, but in a case where one command might free some space and then do an allocation, it can make a difference. One such case is during a repair of a RAID LV, where one PV of a multi-PV image fails. This new behavior is used when the replacement image can be allocated from the remaining space of the PV that did not fail. (First the entire image with the failed PV is removed. Then the image is reallocated from the remaining PVs.)

commit b35fb0b15af1d87693be286f0630e95622056a77
Author: Jonathan Brassow <jbrassow>
Date: Wed Jun 25 21:20:41 2014 -0500

raid/misc: Allow creation of parallel areas by LV vs segment

I've changed build_parallel_areas_from_lv to take a new parameter that allows the caller to build parallel areas by LV vs by segment. Previously, the function created a list of parallel areas for each segment in the given LV. When it came time for allocation, the parallel areas were honored on a segment basis. This was problematic for RAID because any new RAID image must avoid being placed on any PVs used by other images in the RAID. For example, if we have a linear LV that has half its space on one PV and half on another, we do not want an up-convert to use either of those PVs. It should especially not wind up with the following, where the first portion of one LV is paired up with the second portion of the other:

```
------PV1-------  ------PV2-------
[ 2of2 image_1 ]  [ 1of2 image_1 ]
[ 1of2 image_0 ]  [ 2of2 image_0 ]
----------------  ----------------
```

Previously, it was possible for this to happen. The change makes it so that the returned parallel areas list contains one "super" segment (seg_pvs) with a list of all the PVs from every actual segment in the given LV and covering the entire logical extent range. This change allows RAID conversions to function properly when there are existing images that contain multiple segments that span more than one PV.

commit 1f1675b059d65768524398791b2e505b7dfe2497
Author: Jonathan Brassow <jbrassow>
Date: Sat Jun 21 15:33:52 2014 -0500

test: Test addition to show incorrect allocator behavior

If a RAID LV has images that are spread across more than one PV and you allocate a new image that requires more than one PV, parallel_areas is only honored for one segment. This commit adds a test for this condition.
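To make the allocator problem behind the last two commits concrete, here is a minimal sketch under assumed names (vg0, /dev/sdX1, and /dev/sdY1 are placeholders and the sizes are arbitrary; this sequence is not taken from the report): a linear LV split across two PVs is up-converted, and the new RAID image has to avoid both of those PVs, not just the PVs of whichever segment is currently being allocated for.

```
# Assumptions: VG vg0 with at least three PVs; the third PV has enough free
# space to hold a full copy of the LV plus a small rmeta area.

# Linear LV deliberately split across two PVs (roughly half its extents on each).
lvcreate -n lv0 -L 800m vg0 /dev/sdX1:0-100 /dev/sdY1:0-100

# Up-convert to RAID1. With parallel areas built per LV rather than per
# segment, the allocator keeps the new image off both /dev/sdX1 and /dev/sdY1.
lvconvert --type raid1 -m 1 vg0/lv0

# Confirm that no lv0_rimage_1 segment landed on /dev/sdX1 or /dev/sdY1.
lvs -a --segments -o +devices vg0
```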
*** Bug 1113180 has been marked as a duplicate of this bug. ***

This appears to work with the latest rpms. Marking verified.

```
2.6.32-485.el6.x86_64
lvm2-2.02.107-2.el6                        BUILT: Fri Jul 11 08:47:33 CDT 2014
lvm2-libs-2.02.107-2.el6                   BUILT: Fri Jul 11 08:47:33 CDT 2014
lvm2-cluster-2.02.107-2.el6                BUILT: Fri Jul 11 08:47:33 CDT 2014
udev-147-2.55.el6                          BUILT: Wed Jun 18 06:30:21 CDT 2014
device-mapper-1.02.86-2.el6                BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-libs-1.02.86-2.el6           BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-event-1.02.86-2.el6          BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-event-libs-1.02.86-2.el6     BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-persistent-data-0.3.2-1.el6  BUILT: Fri Apr 4 08:43:06 CDT 2014
```

```
[root@host-002 ~]# lvcreate -m 1 --type raid1 -n lvol0 vg1 -L 1000m /dev/sda1:0-200 /dev/sdb1:0-200 /dev/sdc1:0-200 /dev/sdd1:0-200
  Logical volume "lvol0" created

# A working RAID mirror with two legs and two segments each:
[root@host-002 ~]# lvs --all --segments -o +devices
  LV               VG  Attr       #Str Type   SSize    Devices
  lvol0            vg1 rwi-a-r---    2 raid1  1000.00m lvol0_rimage_0(0),lvol0_rimage_1(0)
  [lvol0_rimage_0] vg1 Iwi-aor---    1 linear  800.00m /dev/sda1(1)
  [lvol0_rimage_0] vg1 Iwi-aor---    1 linear  200.00m /dev/sdc1(0)
  [lvol0_rimage_1] vg1 Iwi-aor---    1 linear  800.00m /dev/sdb1(1)
  [lvol0_rimage_1] vg1 Iwi-aor---    1 linear  200.00m /dev/sdd1(0)
  [lvol0_rmeta_0]  vg1 ewi-aor---    1 linear    4.00m /dev/sda1(0)
  [lvol0_rmeta_1]  vg1 ewi-aor---    1 linear    4.00m /dev/sdb1(0)

# With a device removed:
[root@host-002 ~]# lvs -a -o +devices
  /dev/sdc1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 8052408320: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 8052506624: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 4096: Input/output error
  Couldn't find device with uuid 9tpQJq-xTsJ-xSHU-VDfs-36qn-JzIl-O3cDaO.
  LV               VG  Attr       LSize    Cpy%Sync Devices
  lvol0            vg1 rwi-a-r-p- 1000.00m 100.00   lvol0_rimage_0(0),lvol0_rimage_1(0)
  [lvol0_rimage_0] vg1 iwi-aor-p- 1000.00m          /dev/sda1(1)
  [lvol0_rimage_0] vg1 iwi-aor-p- 1000.00m          unknown device(0)
  [lvol0_rimage_1] vg1 iwi-aor--- 1000.00m          /dev/sdb1(1)
  [lvol0_rimage_1] vg1 iwi-aor--- 1000.00m          /dev/sdd1(0)
  [lvol0_rmeta_0]  vg1 ewi-aor-r-    4.00m          /dev/sda1(0)
  [lvol0_rmeta_1]  vg1 ewi-aor---    4.00m          /dev/sdb1(0)

[root@host-002 ~]# pvscan --cache /dev/sdc1

[root@host-002 ~]# lvconvert --alloc anywhere --repair -y vg1/lvol0 /dev/sda1
  Option --alloc cannot be used with --repair.
  Run `lvconvert --help' for more information.

[root@host-002 ~]# lvconvert --repair -y vg1/lvol0 /dev/sda1
  /dev/sdc1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 8052408320: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 8052506624: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 4096: Input/output error
  Couldn't find device with uuid 9tpQJq-xTsJ-xSHU-VDfs-36qn-JzIl-O3cDaO.
  Insufficient suitable allocatable extents for logical volume : 251 more required
  Faulty devices in vg1/lvol0 successfully replaced.
```
```
[root@host-002 ~]# lvs --all --segments -o +devices
  /dev/sdc1: open failed: No such device or address
  Couldn't find device with uuid 9tpQJq-xTsJ-xSHU-VDfs-36qn-JzIl-O3cDaO.
  LV               VG  Attr       #Str Type   SSize    Devices
  lvol0            vg1 rwi-a-r---    2 raid1  1000.00m lvol0_rimage_0(0),lvol0_rimage_1(0)
  [lvol0_rimage_0] vg1 iwi-aor---    1 linear 1000.00m /dev/sda1(2)
  [lvol0_rimage_1] vg1 iwi-aor---    1 linear  800.00m /dev/sdb1(1)
  [lvol0_rimage_1] vg1 iwi-aor---    1 linear  200.00m /dev/sdd1(0)
  [lvol0_rmeta_0]  vg1 ewi-aor---    1 linear    4.00m /dev/sda1(1)
  [lvol0_rmeta_1]  vg1 ewi-aor---    1 linear    4.00m /dev/sdb1(0)
```

(In reply to Corey Marthaler from comment #14)
> [root@host-002 ~]# lvconvert --alloc anywhere --repair -y vg1/lvol0 /dev/sda1
>   Option --alloc cannot be used with --repair.
>   Run `lvconvert --help' for more information.

This is unfortunately a bug that has slipped in: the --alloc option needs to be allowed with --repair.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1387.html