Bug 986084 - pool meta device resizing can lead to problems [NEEDINFO]
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
7.0
x86_64 Linux
unspecified Severity high
Target Milestone: rc
Target Release: ---
Assigned To: Zdenek Kabelac
Cluster QE
Depends On: 1055944
Blocks:
Reported: 2013-07-18 19:07 EDT by Corey Marthaler
Modified: 2014-06-17 21:19 EDT (History)
8 users

See Also:
Fixed In Version: lvm2-2.02.99-1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-13 08:07:31 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
zkabelac: needinfo? (msnitzer)


Attachments: None
Description Corey Marthaler 2013-07-18 19:07:16 EDT
Description of problem:
SCENARIO - [resize_pool_meta_device]
Create an XFS filesystem, mount it, snapshot it, and attempt to resize its pool meta device while online
Making origin volume
lvcreate --thinpool POOL -L 2G snapper_thinp
  device-mapper: remove ioctl on  failed: Device or resource busy
Sanity checking pool device metadata
(thin_check /dev/mapper/snapper_thinp-POOL_tmeta)
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n origin
lvcreate -V 1G -T snapper_thinp/POOL -n other1
lvcreate -V 1G -T snapper_thinp/POOL -n other2
lvcreate -V 1G -T snapper_thinp/POOL -n other3
lvcreate -V 1G -T snapper_thinp/POOL -n other4
lvcreate -V 1G -T snapper_thinp/POOL -n other5
Placing an XFS filesystem on origin volume
Mounting origin volume

Making snapshot of origin volume
lvcreate -s /dev/snapper_thinp/origin -n meta_resize

Attempt to resize the open snapshotted filesystem multiple times with lvextend/fsadm on qalvm-01
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 12.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 20.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 28.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 36.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 44.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 52.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  [POOL_tmeta] snapper_thinp ewi-ao--- 60.00m  /dev/vdd1(0)  
(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Problem reactivating POOL
  libdevmapper exiting with 2 device(s) still suspended.
Online meta device resize failed

[  138.058126] device-mapper: thin: failed to resize metadata device

[root@qalvm-01 ~]# df -h
/dev/mapper/snapper_thinp-origin 1014M   33M  982M   4% /mnt/origin

[root@qalvm-01 ~]# umount /mnt/origin
[DEADLOCK]

kernel: [  576.274035] umount          D ffff88007fd14600     0  1580    808 0x00000080
kernel: [  576.274035]  ffff88007c015d88 0000000000000086 ffff88007c015fd8 0000000000014600
kernel: [  576.274035]  ffff88007c015fd8 0000000000014600 ffff880070ea0000 ffff880070163800
kernel: [  576.274035]  ffff880070ea0000 0000000000000001 0000000000000000 ffff880070163928
kernel: [  576.274035] Call Trace:
kernel: [  576.274035]  [<ffffffff81602969>] schedule+0x29/0x70
kernel: [  576.274035]  [<ffffffffa01a5109>] _xfs_log_force+0x1e9/0x2a0 [xfs]
kernel: [  576.274035]  [<ffffffff81094290>] ? wake_up_state+0x20/0x20
kernel: [  576.274035]  [<ffffffffa01a51e6>] xfs_log_force+0x26/0x80 [xfs]
kernel: [  576.274035]  [<ffffffffa015aced>] xfs_fs_sync_fs+0x2d/0x50 [xfs]
kernel: [  576.274035]  [<ffffffff811c74f2>] sync_filesystem+0x72/0xa0
kernel: [  576.274035]  [<ffffffff8119cb00>] generic_shutdown_super+0x30/0xd0
kernel: [  576.274035]  [<ffffffff8119cdb7>] kill_block_super+0x27/0x70
kernel: [  576.274035]  [<ffffffff8119d12d>] deactivate_locked_super+0x3d/0x60
kernel: [  576.274035]  [<ffffffff8119d196>] deactivate_super+0x46/0x60
kernel: [  576.274035]  [<ffffffff811b8155>] mntput_no_expire+0xc5/0x120
kernel: [  576.274035]  [<ffffffff811b8f21>] SyS_umount+0x91/0x3a0
kernel: [  576.274035]  [<ffffffff8160c919>] system_call_fastpath+0x16/0x1b


Version-Release number of selected component (if applicable):
3.10.0-0.rc5.61.el7.x86_64
lvm2-2.02.99-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
lvm2-libs-2.02.99-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
lvm2-cluster-2.02.99-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
device-mapper-1.02.78-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
device-mapper-libs-1.02.78-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
device-mapper-event-1.02.78-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
device-mapper-event-libs-1.02.78-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013
cmirror-2.02.99-0.85.el7    BUILT: Fri Jul 12 08:27:37 CDT 2013


How reproducible:
Every time
Comment 1 Corey Marthaler 2013-07-18 19:13:28 EDT
# AFTER REBOOT 

# Which device exactly is full here? I don't get it.

[root@qalvm-01 ~]# pvscan
  PV /dev/vdh1   VG snapper_thinp   lvm2 [2.00 GiB / 0    free]
  PV /dev/vdg1   VG snapper_thinp   lvm2 [2.00 GiB / 1.99 GiB free]
  PV /dev/vdf1   VG snapper_thinp   lvm2 [2.00 GiB / 2.00 GiB free]
  PV /dev/vde1   VG snapper_thinp   lvm2 [2.00 GiB / 2.00 GiB free]
  PV /dev/vdd1   VG snapper_thinp   lvm2 [2.00 GiB / 1.93 GiB free]


[root@qalvm-01 ~]# lvs -a -o +devices
  LV           VG            Attr      LSize  Pool Origin Devices
  POOL         snapper_thinp twi---tz-  2.00g             POOL_tdata(0)
  [POOL_tdata] snapper_thinp Twi-a----  2.00g             /dev/vdh1(0)
  [POOL_tdata] snapper_thinp Twi-a----  2.00g             /dev/vdg1(0)
  [POOL_tmeta] snapper_thinp ewi-a---- 68.00m             /dev/vdd1(0)
  meta_resize  snapper_thinp Vwi---tz-  1.00g POOL origin
  origin       snapper_thinp Vwi---tz-  1.00g POOL
  other1       snapper_thinp Vwi---tz-  1.00g POOL
  other2       snapper_thinp Vwi---tz-  1.00g POOL
  other3       snapper_thinp Vwi---tz-  1.00g POOL
  other4       snapper_thinp Vwi---tz-  1.00g POOL
  other5       snapper_thinp Vwi---tz-  1.00g POOL

[root@qalvm-01 ~]# vgchange -an snapper_thinp
  0 logical volume(s) in volume group "snapper_thinp" now active

[root@qalvm-01 ~]# lvremove snapper_thinp
Removing pool "POOL" will remove 7 dependent volume(s). Proceed? [y/n]: y
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:4)
  Failed to update thin pool POOL.
Comment 2 Corey Marthaler 2013-07-18 19:36:10 EDT
I attempted to swap in a new meta device (knowing that it probably wouldn't work due to bug 973419/973432).

[root@qalvm-01 ~]# lvcreate -n new_meta -L 68M snapper_thinp
  Logical volume "new_meta" created

[root@qalvm-01 ~]# lvconvert --poolmetadata snapper_thinp/new_meta --thinpool snapper_thinp/POOL
  Attempted to decrement suspended device counter below zero.
Do you want to swap metadata of snapper_thinp/POOL pool with volume snapper_thinp/new_meta? [y/n]: y
  device-mapper: create ioctl on snapper_thinp-POOL_tmeta failed: Device or resource busy
  Failed to activate pool logical volume snapper_thinp/POOL.
  Device snapper_thinp-POOL_tdata (253:3) is used by another device.
  Failed to deactivate pool data logical volume.
Comment 3 Heinz Mauelshagen 2013-07-19 04:10:26 EDT
Corey,

seems like this snippet is responsible:

/* Release unneeded blocks in thin pool */
/* TODO: defer when multiple LVs released at once */
if (pool_lv && !update_pool_lv(pool_lv, 1)) {
        log_error("Failed to update thin pool %s.", pool_lv->name);
        return 0;
}

If so, are you able to remove the thin vols individually?
Comment 4 Zdenek Kabelac 2013-07-19 05:04:35 EDT
Unfortunately, pool metadata resize is currently waiting on Joe's kernel patch.
There is a bug which limits the maximum resizable size.
For example, for 2MB the maximum is <64MB; there are some rules about that - but for the current version of the kernel the feature should be marked as unavailable.

It's currently enabled for debugging and testing purposes.
Comment 5 Zdenek Kabelac 2013-08-02 05:39:55 EDT
This upstream patch adds a requirement for thin-pool target version 1.9:

http://www.redhat.com/archives/lvm-devel/2013-July/msg00264.html
Comment 6 Peter Rajnoha 2013-08-05 04:28:43 EDT
(In reply to Zdenek Kabelac from comment #5)
> This upstream patch adds a requirement for thin-pool target version 1.9:
> 
> http://www.redhat.com/archives/lvm-devel/2013-July/msg00264.html

(this is already in latest RHEL7 package - lvm2-2.02.99-1.el7)
Comment 8 Corey Marthaler 2014-01-10 16:08:31 EST
This still exists in the latest rpms.

3.10.0-64.el7.x86_64
lvm2-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
lvm2-libs-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
lvm2-cluster-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-libs-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-event-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-event-libs-1.02.82-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014
device-mapper-persistent-data-0.2.8-2.el7    BUILT: Wed Oct 30 10:20:48 CDT 2013
cmirror-2.02.103-10.el7    BUILT: Tue Jan  7 07:44:33 CST 2014


(lvresize --poolmetadatasize +8M /dev/snapper_thinp/POOL)
  device-mapper: resume ioctl on  failed: No space left on device
  Unable to resume snapper_thinp-POOL-tpool (253:5)
  Problem reactivating POOL
  libdevmapper exiting with 2 device(s) still suspended.
Online meta device resize failed

/dev/mapper/snapper_thinp-origin on /mnt/origin type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
[root@host-049 ~]# umount /mnt/origin
[DEADLOCK]
Comment 9 Zdenek Kabelac 2014-01-28 08:06:22 EST
Online metadata resize requires the latest thin-pool kernel target with the fixes mentioned in Bug 1056647, and the thin-related fixes mainly in Bug 1055944.

A temporarily usable brew kernel build can be found in
Bug 1056647 comment 27.

Related upstream lvm commits that enable use of thin-pool target 1.10 for online metadata resize:

https://www.redhat.com/archives/lvm-devel/2014-January/msg00036.html
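
As a quick preflight, the kernel's thin-pool target version can be read from `dmsetup targets` before attempting the online resize. A minimal sketch, assuming the 1.10 threshold from this comment; the awk/`sort -V` parsing is illustrative glue, not part of lvm2:

```shell
# Read the thin-pool target version reported by the running kernel, e.g. "v1.10.0".
current="$(dmsetup targets | awk '$1 == "thin-pool" {print $2; exit}' | tr -d 'v')"
required="1.10.0"

# sort -V orders version strings numerically; if the required version sorts
# first (or equal), the kernel target is new enough for online metadata resize.
if [ "$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n1)" = "$required" ]; then
    lvresize --poolmetadatasize +8M snapper_thinp/POOL
else
    echo "thin-pool target $current lacks the online metadata resize fixes" >&2
fi
```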
Comment 14 Corey Marthaler 2014-02-10 14:40:19 EST
This is now fixed in the latest kernel.

3.10.0-84.el7.x86_64
lvm2-2.02.105-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
lvm2-libs-2.02.105-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
lvm2-cluster-2.02.105-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
device-mapper-1.02.84-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
device-mapper-libs-1.02.84-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
device-mapper-event-1.02.84-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
device-mapper-event-libs-1.02.84-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
device-mapper-persistent-data-0.2.8-4.el7    BUILT: Fri Jan 24 14:28:55 CST 2014
cmirror-2.02.105-3.el7    BUILT: Wed Feb  5 06:36:34 CST 2014
Comment 15 Ludek Smid 2014-06-13 08:07:31 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
