Description of problem:
[root@hayes-01 bin]# pvscan
  PV /dev/etherd/e1.1p9    VG raid_sanity   lvm2 [908.23 GiB / 908.23 GiB free]
  PV /dev/etherd/e1.1p8    VG raid_sanity   lvm2 [908.23 GiB / 908.23 GiB free]
  PV /dev/etherd/e1.1p7    VG raid_sanity   lvm2 [908.23 GiB / 908.23 GiB free]
  PV /dev/etherd/e1.1p6    VG raid_sanity   lvm2 [908.23 GiB / 908.23 GiB free]
  PV /dev/etherd/e1.1p5    VG raid_sanity   lvm2 [908.23 GiB / 908.23 GiB free]
  PV /dev/etherd/e1.1p4    VG raid_sanity   lvm2 [908.23 GiB / 908.23 GiB free]
  PV /dev/etherd/e1.1p3    VG raid_sanity   lvm2 [908.23 GiB / 908.23 GiB free]
  PV /dev/etherd/e1.1p2    VG raid_sanity   lvm2 [908.23 GiB / 908.23 GiB free]
  PV /dev/etherd/e1.1p10   VG raid_sanity   lvm2 [908.23 GiB / 908.23 GiB free]
  PV /dev/etherd/e1.1p1    VG raid_sanity   lvm2 [908.23 GiB / 908.23 GiB free]

[root@hayes-01 bin]# lvcreate --type raid4 -i 2 -n alloc_anywhere --alloc anywhere -L 50M raid_sanity /dev/etherd/e1.1p4:0-1500 /dev/etherd/e1.1p1:0-1500
  Using default stripesize 64.00 KiB
  Rounding up size to full physical extent 52.00 MiB
  Rounding size (13 extents) up to stripe boundary size (14 extents)
  Segment with extent 1 in PV /dev/etherd/e1.1p1 not found
  Failed to extend alloc_anywhere_rmeta_1 in alloc_anywhere.

[root@hayes-01 bin]# lvcreate --type raid4 -i 3 -n alloc_anywhere --alloc anywhere -L 50M raid_sanity /dev/etherd/e1.1p4:0-1500 /dev/etherd/e1.1p1:0-1500
  Using default stripesize 64.00 KiB
  Rounding up size to full physical extent 52.00 MiB
  Rounding size (13 extents) up to stripe boundary size (15 extents)
  Inconsistent length: 1 0
  PV segment pe_alloc_count mismatch: 12 != 4294734804
  Inconsistent length: 1 0
  PV segment pe_alloc_count mismatch: 12 != 4294734804
  PV segment VG free_count mismatch: 2325036 != 2790044
  Internal error: PV segments corrupted in raid_sanity.
  LV alloc_anywhere_rimage_0: segment 1 has inconsistent PV area 0
  LV alloc_anywhere_rimage_0: segment 2 has inconsistent PV area 0
  Internal error: LV segments corrupted in alloc_anywhere_rimage_0.
  LV alloc_anywhere_rimage_1: segment 1 has inconsistent PV area 0
  LV alloc_anywhere_rimage_1: segment 2 has inconsistent PV area 0
  Internal error: LV segments corrupted in alloc_anywhere_rimage_1.

Version-Release number of selected component (if applicable):
2.6.32-278.el6.x86_64

lvm2-2.02.95-10.el6                        BUILT: Fri May 18 03:26:00 CDT 2012
lvm2-libs-2.02.95-10.el6                   BUILT: Fri May 18 03:26:00 CDT 2012
lvm2-cluster-2.02.95-10.el6                BUILT: Fri May 18 03:26:00 CDT 2012
udev-147-2.41.el6                          BUILT: Thu Mar  1 13:01:08 CST 2012
device-mapper-1.02.74-10.el6               BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-libs-1.02.74-10.el6          BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-event-1.02.74-10.el6         BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-event-libs-1.02.74-10.el6    BUILT: Fri May 18 03:26:00 CDT 2012
cmirror-2.02.95-10.el6                     BUILT: Fri May 18 03:26:00 CDT 2012

How reproducible:
Every time
1) The allocation section of the -vvvv trace will show what the code is actually doing. (The messages pasted above are just internal consistency checks that detect that something went wrong earlier.)

2) Let's see what the simplest failure case is. These examples use --type raid4 *and* --stripes as well as the 'anywhere' allocation policy. Are all of those parts required to trigger this sort of failure?
First idea: I induced a simple test case that added empty areas to alloced_areas[s] - which is nonsensical. (It found 1 extent for each metadata area, but nothing for any data areas.)

--- a/lib/metadata/lv_manip.c
+++ b/lib/metadata/lv_manip.c
@@ -1110,9 +1122,14 @@ static int _alloc_parallel_area(struct alloc_handle *ah, uint32_t max_to_allocat
 			dm_list_add(&ah->alloced_areas[s], &aa[s].list);
 			s -= ah->area_count + ah->parity_count;
 		}
+
+		aa[s].len = (ah->alloc_and_split_meta) ? len - ah->log_len : len;
+		/* Skip empty allocations */
+		if (!aa[s].len)
+			continue;
+
 		aa[s].pv = pva->map->pv;
 		aa[s].pe = pva->start;
-		aa[s].len = (ah->alloc_and_split_meta) ? len - ah->log_len : len;
 
 		log_debug("Allocating parallel area %" PRIu32
 			  " on %s start PE %" PRIu32
 			  " length %" PRIu32 ".",
Output without the above patch - notice the incorrect allocations listed with 'length 0':

  Allocating parallel metadata area 0 on /dev/loop2 start PE 3 length 1.
  Allocating parallel area 0 on /dev/loop2 start PE 4 length 0.
  Allocating parallel metadata area 1 on /dev/loop2 start PE 0 length 1.
  Allocating parallel area 1 on /dev/loop2 start PE 0 length 0.
  Allocating parallel metadata area 2 on /dev/loop3 start PE 3 length 1.
  Allocating parallel area 2 on /dev/loop3 start PE 4 length 0.
https://lists.fedorahosted.org/pipermail/lvm2-commits/2012-June/000049.html
Verified by running the raid_sanity tests, except those that require a missing PV, due to Bug 867644, which crashes the kernel if an attempt is made to partially activate a RAID LV.

Verified with:
lvm2-2.02.98-2.el6.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-0501.html