Bug 1122698

Summary: lvextend and lvresize -l+100%FREE failed to extend or resize raid or mirror volumes
Product: Red Hat Enterprise Linux 6
Component: lvm2
Sub component: Mirroring and RAID (RHEL6)
Version: 6.6
Keywords: Regression
Hardware: x86_64
OS: Linux
Severity: urgent
Priority: urgent
Status: CLOSED ERRATA
Type: Bug
Reporter: Corey Marthaler <cmarthal>
Assignee: Alasdair Kergon <agk>
QA Contact: Cluster QE <mspqa-list>
CC: agk, heinzm, jbrassow, msnitzer, prajnoha, prockai, tlavigne, zkabelac
Target Milestone: rc
Fixed In Version: lvm2-2.02.110-1.el6
Doc Type: Enhancement
Last Closed: 2014-10-14 08:25:32 UTC

Doc Text:
LVM includes considerable improvements to the calculation of an appropriate amount of space to allocate when using percentages in commands such as lvresize -l+50%FREE. The new behaviour tries to be more intuitive. Sizes specified are treated as relating either to Logical Extents or to Physical Extents, as appropriate, and the new logical size desired for the LV is calculated. Rounding is then performed to make sure any parallel stripes / mirror legs will be the same size as each other. (A short illustration follows this header.)
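For illustration of the new percentage handling described in the Doc Text (the VG and LV names here are hypothetical):

  # Grow the LV by half of the VG's current free space; the result is
  # rounded so that parallel stripes / mirror legs end up the same size.
  lvresize -l+50%FREE myvg/mylv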

Description Corey Marthaler 2014-07-23 19:38:58 UTC
Description of problem:
This is most likely the same issue as bug 1120989, but the errors reported with mirror and raid volumes appear different from those reported there.


# RAID EXTENDS BUT WITH ERRORS
[root@host-002 ~]# lvcreate --type raid1 -m 1 -n 100_percent -L 100M raid_sanity
  Logical volume "100_percent" created
[root@host-002 ~]# lvs -a -o +devices
  LV                     Attr       LSize   Cpy%Sync Devices
  100_percent            rwi-a-r--- 100.00m 100.00   100_percent_rimage_0(0),100_percent_rimage_1(0)
  [100_percent_rimage_0] iwi-aor--- 100.00m          /dev/sdf2(1)
  [100_percent_rimage_1] iwi-aor--- 100.00m          /dev/sdb1(1)
  [100_percent_rmeta_0]  ewi-aor---   4.00m          /dev/sdf2(0)
  [100_percent_rmeta_1]  ewi-aor---   4.00m          /dev/sdb1(0)
[root@host-002 ~]# lvextend -l100%FREE raid_sanity/100_percent
  Extending 2 mirror images.
  Extending logical volume 100_percent to 1.79 GiB
  device-mapper: resume ioctl on  failed: Invalid argument
  Unable to resume raid_sanity-100_percent (253:6)
  Problem reactivating 100_percent
  Releasing activation in critical section.
  libdevmapper exiting with 1 device(s) still suspended.
[root@host-002 ~]# lvs -a -o +devices
  LV                     Attr       LSize    Cpy%Sync Devices
  100_percent            rwi-s-r--- 1016.00m 100.00   100_percent_rimage_0(0),100_percent_rimage_1(0)
  [100_percent_rimage_0] iwi-aor---  916.00m          /dev/sdf2(1)
  [100_percent_rimage_1] iwi-aor---  916.00m          /dev/sdb1(1)
  [100_percent_rmeta_0]  ewi-aor---    4.00m          /dev/sdf2(0)
  [100_percent_rmeta_1]  ewi-aor---    4.00m          /dev/sdb1(0)
[root@host-002 ~]# vgchange -an raid_sanity
  Attempted to decrement suspended device counter below zero.
  0 logical volume(s) in volume group "raid_sanity" now active



# MIRROR FAILS
[root@host-002 ~]# lvcreate -m 1 --alloc anywhere -n mirror --type mirror -L 100M raid_sanity
  Logical volume "mirror" created
[root@host-002 ~]# lvs -a -o +devices
  LV                Attr       LSize   Log         Cpy%Sync Devices
  mirror            mwa-a-m--- 100.00m mirror_mlog 100.00   mirror_mimage_0(0),mirror_mimage_1(0)
  [mirror_mimage_0] iwa-aom--- 100.00m                      /dev/sdd2(0)
  [mirror_mimage_1] iwa-aom--- 100.00m                      /dev/sdf2(0)
  [mirror_mlog]     lwa-aom---   4.00m                      /dev/sdf2(25)
[root@host-002 ~]# lvresize -l100%FREE raid_sanity/mirror
  Extending 2 mirror images.
  Extending logical volume mirror to 1.79 GiB
  LV mirror: mirrored LV segment 0 has wrong size 229 (should be 254).
  LV mirror: mirrored LV segment 1 has wrong size 229 (should be 254).
  Internal error: LV segments corrupted in mirror.
[root@host-002 ~]# lvs -a -o +devices
  LV                Attr       LSize   Log         Cpy%Sync Devices
  mirror            mwa-a-m--- 100.00m mirror_mlog 100.00   mirror_mimage_0(0),mirror_mimage_1(0)
  [mirror_mimage_0] iwa-aom--- 100.00m                      /dev/sdd2(0)
  [mirror_mimage_1] iwa-aom--- 100.00m                      /dev/sdf2(0)
  [mirror_mlog]     lwa-aom---   4.00m                      /dev/sdf2(25)
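Note how the extent arithmetic ties the two failures together: with 4 MiB extents, 254 extents = 1016 MiB (the size the top-level raid1 LV reached above) and 229 extents = 916 MiB (the size its images reached), the same 229-versus-254 mismatch the mirror segment check reports here.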



# LINEAR WORKS
[root@host-002 ~]# lvcreate -L 100M -n linear raid_sanity
  Logical volume "linear" created
[root@host-002 ~]# lvs -a -o +devices
  LV      Attr       LSize   Devices
  linear  -wi-a----- 100.00m /dev/sdd2(0)
[root@host-002 ~]# lvextend -l100%FREE raid_sanity/linear
  Extending logical volume linear to 1.89 GiB
  Logical volume linear successfully resized
[root@host-002 ~]# lvs -a -o +devices
  LV      Attr       LSize   Devices
  linear  -wi-a-----   1.89g /dev/sdd2(0)
  linear  -wi-a-----   1.89g /dev/sdf2(0)


# STRIPE WORKS
[root@host-002 ~]# lvcreate -L 100M -n stripe -i 2 raid_sanity
  Using default stripesize 64.00 KiB
  Rounding size (25 extents) up to stripe boundary size (26 extents).
  Logical volume "stripe" created
[root@host-002 ~]# lvs -a -o +devices
  LV      Attr       LSize   Devices
  stripe  -wi-a----- 104.00m /dev/sdd2(0),/dev/sdf2(0)
[root@host-002 ~]# lvextend -l100%FREE raid_sanity/stripe
  Using stripesize of last segment 64.00 KiB
  Extending logical volume stripe to 1.89 GiB
  Logical volume stripe successfully resized
[root@host-002 ~]# lvs -a -o +devices
  LV      Attr       LSize   Devices
  stripe  -wi-a-----   1.89g /dev/sdd2(0),/dev/sdf2(0)



Version-Release number of selected component (if applicable):
2.6.32-485.el6.x86_64

lvm2-2.02.107-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
lvm2-libs-2.02.107-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
lvm2-cluster-2.02.107-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
udev-147-2.55.el6    BUILT: Wed Jun 18 06:30:21 CDT 2014
device-mapper-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-libs-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-event-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-event-libs-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-persistent-data-0.3.2-1.el6    BUILT: Fri Apr  4 08:43:06 CDT 2014
cmirror-2.02.107-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014


How reproducible:
Every time

Comment 1 Corey Marthaler 2014-07-23 20:19:57 UTC
This test case can also cause the system to deadlock when a pvscan is run afterwards.

[root@host-001 ~]# pvscan

device-mapper: table: 253:10: dm-3 too small for target: start=0, len=1114112, dev_size=1007616
INFO: task pvscan:9720 blocked for more than 120 seconds.
      Not tainted 2.6.32-485.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
pvscan        D 0000000000000000     0  9720   2533 0x00000080
 ffff88003b8aba18 0000000000000086 ffff88003b8ab9a8 ffffffff81041e88
 ffff880000000000 0000166dc6ddb5da ffff88003b8ab9e8 ffff88003e399c20
 00000000000d002a ffffffffac301ac7 ffff88003b29a638 ffff88003b8abfd8
Call Trace:
 [<ffffffff81041e88>] ? pvclock_clocksource_read+0x58/0xd0
 [<ffffffff810aa101>] ? ktime_get_ts+0xb1/0xf0
 [<ffffffff81527d33>] io_schedule+0x73/0xc0
 [<ffffffff811cdb2d>] __blockdev_direct_IO_newtrunc+0xb7d/0x1270
 [<ffffffff81233c21>] ? avc_has_perm+0x71/0x90
 [<ffffffff811c9490>] ? blkdev_get_block+0x0/0x20
 [<ffffffff81278b3d>] ? get_disk+0x7d/0xf0
 [<ffffffff811ce297>] __blockdev_direct_IO+0x77/0xe0
 [<ffffffff811c9490>] ? blkdev_get_block+0x0/0x20
 [<ffffffff811ca517>] blkdev_direct_IO+0x57/0x60
 [<ffffffff811c9490>] ? blkdev_get_block+0x0/0x20
 [<ffffffff8112587b>] generic_file_aio_read+0x6bb/0x700
 [<ffffffff81233c21>] ? avc_has_perm+0x71/0x90
 [<ffffffff81235ad2>] ? selinux_inode_permission+0x72/0xb0
 [<ffffffff811c98d1>] blkdev_aio_read+0x51/0x80
 [<ffffffff8118d28a>] do_sync_read+0xfa/0x140
 [<ffffffff8109e1e0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff811c982c>] ? block_ioctl+0x3c/0x40
 [<ffffffff811a2ae2>] ? vfs_ioctl+0x22/0xa0
 [<ffffffff8123962b>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff8122c486>] ? security_file_permission+0x16/0x20
 [<ffffffff8118dc45>] vfs_read+0xb5/0x1a0
 [<ffffffff8118dd81>] sys_read+0x51/0x90
 [<ffffffff810e518e>] ? __audit_syscall_exit+0x25e/0x290
 [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
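The "too small for target" message above is the same kind of size mismatch expressed in 512-byte sectors: the table asks for len=1114112 sectors (544 MiB) from a device that is only dev_size=1007616 sectors (492 MiB).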

Comment 7 Alasdair Kergon 2014-08-22 01:18:39 UTC
First attempt at these fixes:

https://lists.fedorahosted.org/pipermail/lvm2-commits/2014-August/002679.html

Setting this to POST for now, but it might not be the final version.

Even things like 'lvextend -l50%VG' should work now: after the command has run the LV itself should take up approximately half the VG.  (Subject as ever to there being enough space to extend the mirror on suitable disks.)
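As a minimal sketch of this (reusing the VG/LV names from the transcripts above):

  # Afterwards, 100_percent itself should occupy roughly half of raid_sanity,
  # provided enough free space exists on suitable disks for both legs.
  lvextend -l50%VG raid_sanity/100_percent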

Percentages are converted into absolute numbers of extents.
This shows up with -vvvv as follows:

"Converted 50%FREE into at most 42 physical extents."

If the words 'at most' appear, it means the command should still succeed even if it cannot find all of the space requested.
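One way to surface these messages during a resize (the grep pattern is purely illustrative, and the VG/LV names are taken from the transcripts above):

  lvextend -vvvv -l+50%FREE raid_sanity/100_percent 2>&1 | grep -E 'Converted|New size'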

Sizes specified are then treated either as relating to Logical Extents or to Physical Extents and the new logical size desired for the LV is calculated.
Rounding is then performed to make sure all parallel stripes / mirror legs will be the same size as each other.

"New size for vg6/lvol0: 36. Existing logical extents: 6 / physical extents: 12.
"

"New size for vg5/lvol2: 37. Existing logical extents: 38 / physical extents: 0."   The 0 there indicates it's a virtual volume such as a thin volume.

Then the allocation (or reduction) routine is performed, just like before.
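After the allocation, a quick way to confirm the rounding worked (lv_name, lv_size and seg_size are standard lvs fields) is to check that all parallel images report the same size:

  lvs -a -o lv_name,lv_size,seg_size,devices raid_sanity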

Comment 9 Corey Marthaler 2014-08-27 23:15:58 UTC
Fix verified in the latest build.

2.6.32-497.el6.x86_64
lvm2-2.02.110-1.el6    BUILT: Tue Aug 26 17:53:52 CEST 2014
lvm2-libs-2.02.110-1.el6    BUILT: Tue Aug 26 17:53:52 CEST 2014
lvm2-cluster-2.02.110-1.el6    BUILT: Tue Aug 26 17:53:52 CEST 2014
udev-147-2.57.el6    BUILT: Thu Jul 24 15:48:47 CEST 2014
device-mapper-1.02.89-1.el6    BUILT: Tue Aug 26 17:53:52 CEST 2014
device-mapper-libs-1.02.89-1.el6    BUILT: Tue Aug 26 17:53:52 CEST 2014
device-mapper-event-1.02.89-1.el6    BUILT: Tue Aug 26 17:53:52 CEST 2014
device-mapper-event-libs-1.02.89-1.el6    BUILT: Tue Aug 26 17:53:52 CEST 2014
device-mapper-persistent-data-0.3.2-1.el6    BUILT: Fri Apr  4 15:43:06 CEST 2014
cmirror-2.02.110-1.el6    BUILT: Tue Aug 26 17:53:52 CEST 2014


SCENARIO (raid1) - [extend_100_percent_vg_same_sized_pvs]
Create a raid on a VG and then extend it using -l100%FREE with PVs being the same size
Recreating PVs/VG with same sized devices and only enough needed to make raid type volume
host-110.virt.lab.msp.redhat.com: pvcreate --setphysicalvolumesize 1G /dev/sdg1 /dev/sde2
host-110.virt.lab.msp.redhat.com: vgcreate raid_sanity /dev/sdg1 /dev/sde2
lvcreate --type raid1 -m 1 -n 100_percent -L 100M raid_sanity
lvextend -l100%FREE raid_sanity/100_percent

Quick regression check: performing raid scrubbing (lvchange --syncaction check) on raid raid_sanity/100_percent
  raid_sanity/100_percent state is currently "resync".  Unable to switch to "check".
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 93.89% )
   1/1 mirror(s) are fully synced: ( 100.00% )

Deactivating raid 100_percent... and removing
Restoring VG back to default parameters
vgremove raid_sanity
pvremove /dev/sdg1 /dev/sde2
pvcreate /dev/sdb1 /dev/sdb2 /dev/sde1 /dev/sde2 /dev/sdf1 /dev/sdf2 /dev/sdg1 /dev/sdg2 /dev/sdh1 /dev/sdh2
vgcreate raid_sanity /dev/sdb1 /dev/sdb2 /dev/sde1 /dev/sde2 /dev/sdf1 /dev/sdf2 /dev/sdg1 /dev/sdg2 /dev/sdh1 /dev/sdh2

Comment 10 errata-xmlrpc 2014-10-14 08:25:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1387.html