Bug 1416885 - [lvm2] Thin pool creation fails from newly added disk to volume group
Summary: [lvm2] Thin pool creation fails from newly added disk to volume group
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.2
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Assignee: Zdenek Kabelac
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1899134 1418684 1546181
 
Reported: 2017-01-26 17:07 UTC by smahajan@redhat.com
Modified: 2020-11-18 15:29 UTC
CC: 14 users

Fixed In Version: lvm2-2.02.180-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-30 11:02:16 UTC
Target Upstream Version:


Attachments (Terms of Use)
lvcreate failure verbose output (-vvvv) (107.50 KB, text/plain)
2017-01-27 20:55 UTC, Vivek Goyal


Links
Red Hat Product Errata RHBA-2018:3193 - Last Updated: 2018-10-30 11:03:23 UTC

Description smahajan@redhat.com 2017-01-26 17:07:53 UTC
Description of problem:

Host OS: Fedora 23.
When I spin up a VM (RHEL 7.0) using the KVM/QEMU Virtual Machine Manager and run /usr/bin/docker-storage-setup in it to set up my storage for Docker, it blows up.

ERROR MESSAGE: "Insufficient suitable allocatable extents found for logical volume docker-pool"

INFO: Device node /dev/vdb1 exists.
  Physical volume "/dev/vdb1" successfully created
  Volume group "rhel" successfully extended
  Rounding up size to full physical extent 44.00 MiB
  Insufficient suitable allocatable extents found for logical volume docker-pool.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux Server release 7.2 (Maipo)

How reproducible:
100%


Steps to Reproduce:
1. Set up a RHEL 7.0 VM with an additional hard drive (20 GB). Let's call it /dev/vdb.
2. Add these lines to /etc/sysconfig/docker-storage-setup
   DEVS=/dev/vdb
   WIPE_SIGNATURES=true
3. /usr/bin/docker-storage-setup

Actual results:
ERROR MESSAGE: "Insufficient suitable allocatable extents found for logical volume docker-pool"

INFO: Device node /dev/vdb1 exists.
  Physical volume "/dev/vdb1" successfully created
  Volume group "rhel" successfully extended
  Rounding up size to full physical extent 44.00 MiB
  Insufficient suitable allocatable extents found for logical volume docker-pool.


Expected results:

It should successfully create the thin pool named "docker-pool".


Additional info:

Based on the default configuration of Fedora Workstation (F23), when I spin up my VM (RHEL 7.0), it allocates the entire space to the PV and VG.

sh-4.2# pvs
  PV         VG   Fmt  Attr PSize  PFree 
  /dev/vda2  rhel lvm2 a--  19.51g 40.00m

sh-4.2# vgs
  VG   #PV #LV #SN Attr   VSize  VFree 
  rhel   1   2   0 wz--n- 19.51g 40.00m

When I run /usr/bin/docker-storage-setup (d-s-s) with the extra hard disk (/dev/vdb), it creates a PV "/dev/vdb1" and adds that PV's space to the existing volume group "rhel".
And when d-s-s tries to create the thin pool, it blows up. It should ideally create the thin pool successfully, since it now has an extra 20GB of space.

Comment 2 Vivek Goyal 2017-01-27 19:48:52 UTC
This problem basically happens when the following is true.

- Say the rootfs has been carved out of the root volume group and there is very little space left in the VG, say 12MB free. There is only one disk in the VG.

- Now add another disk to the VG to extend it and then try to create a thin pool sized at 40%FREE of the space in the VG. That fails.

- I can see that the metadata volume creation succeeds and then the data volume creation fails (a sketch of the scenario follows below).
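
Roughly, the scenario looks like this (device and VG names here are only illustrative; the actual failing lvcreate invocation from my setup appears in comment 7):

# root VG "rhel" on /dev/vda2 is nearly full (only a few MB free)
pvcreate /dev/vdb1
vgextend rhel /dev/vdb1
# ask for a pool sized at 40% of the VG's free space - this is what fails
lvcreate -y --type thin-pool -l 40%FREE -n docker-pool rhel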

Comment 3 Zdenek Kabelac 2017-01-27 20:01:37 UTC
From the provided output in the comment there is just 40MB of free space in the VG.

I'm unsure what the 'exact' command executed was - but it seems like, with all
the rounding, lvm2 was unable to figure out a proper layout.


We do need to see the state of pvs, vgs, lvs before the commands are run.

Then the output captured during execution of 'lvcreate -vvvv'.

Comment 4 Zdenek Kabelac 2017-01-27 20:02:55 UTC
Also, 'f23' has a pretty old instance of lvm2.

Would it be possible to take some 'newer' rpm package and retest?
There have surely been some improvements.

Comment 5 Vivek Goyal 2017-01-27 20:45:01 UTC
Zdenek, I had tested on Fedora 25 and the same issue was present there as well. I had pasted all the details on IRC yesterday. I will dig them up and paste them here. That will give you a pretty good idea.

Comment 6 Vivek Goyal 2017-01-27 20:55:35 UTC
Created attachment 1245269 [details]
lvcreate failure verbose output (-vvvv)

This is the verbose output of the lvcreate thin pool creation failure. It was captured on f25; the lvm2 version was lvm2-2.02.167-3.fc25.x86_64.

Comment 7 Vivek Goyal 2017-01-27 20:58:52 UTC
 
The following is additional data about the state of the volume group at the time of failure.

[root@vm7-f25 ~]# vgs
VG     #PV #LV #SN Attr   VSize  VFree 
fedora   2   4   0 wz--n- 68.99g 19.94g

[root@vm7-f25 ~]# pvs
PV         VG     Fmt  Attr PSize  PFree 
/dev/vda2  fedora lvm2 a--  49.00g 12.00m
/dev/vdd1  fedora lvm2 a--  20.00g 19.93g


[root@vm7-f25 ~]# lvcreate -y --type thin-pool --zero n -c 512K --poolmetadatasize 144688s -l 40%FREE -n docker-pool fedora
   Using default stripesize 64.00 KiB.
   Rounding up size to full physical extent 72.00 MiB
   Insufficient suitable allocatable extents found for logical volume docker-pool.


I see that a metadata volume of 72m was created, so most likely it failed to create the data volume. I ran the same command with -vvvv and attached the output to the bug.

Comment 8 Zdenek Kabelac 2017-01-27 21:04:21 UTC
The issue in your case is different, as you have a VG with 2 PVs.

There is ATM different logic applied for thin-pool allocation when there is only 1 PV versus when there are 2 PVs available.

For 2 PVs - lvm2 tries to place the 'data' LV on the 1st PV and the 'metadata' LV on the 2nd PV.

But the logic for 2 PVs is not yet able to handle the %FREE allocation you would like to work.

I'll need to think about whether we can do something better in this case.
ATM it's beyond the lvm2 allocator's capabilities without extra user hinting.

LVM just picks one PV for data and the other for metadata - but this logic is not really smart.

For cases like this - one solution is to 'build' the thin-pool from basic LVs on your own - i.e. create a dataLV and a metadataLV and lvconvert them into a thin-pool.
Another idea to check is to pass '--alloc anywhere' to give lvm2 more freedom in space usage, if you don't really care about device separation.
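
A rough sketch of the manual build (names and sizes here are only illustrative - adjust them to your VG):

# create data and metadata LVs by hand, e.g. both on the newly added PV
lvcreate -n pooldata -L 10G vg /dev/vdd1
lvcreate -n poolmeta -L 64M vg /dev/vdd1
# combine them into a thin-pool
lvconvert -y --type thin-pool --poolmetadata vg/poolmeta vg/pooldata

# or, with less control over placement, let the allocator use any free extents:
lvcreate -y --type thin-pool -L 10G -n docker-pool --alloc anywhere vg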

The case reported here originally, however, shows the problem with a single PV.
That should not suffer from the sizing problem across the PV boundary.

So I'd like to see the -vvvv output for the reported case.

Comment 9 Zdenek Kabelac 2017-01-27 22:24:21 UTC
So as a workaround, and as hinting in this case - the user may simply specify the newly added PV as the only place to allocate space for the thin-pool data & metadata:

lvcreate -T -l40%PVS vg/pool /dev/vdd1


The existing logic tries to place metadata on a different PV, without a 'fallback' to the same PV in case the 1st way is not possible.

Comment 10 Vivek Goyal 2017-01-31 16:50:06 UTC
- Is it limited to specifying a single PV? Or can I specify 2 PVs as well?
- So it basically boils down to this: we figure out which PVs have free space and pass all of those PVs to lvcreate?

Comment 11 Vivek Goyal 2017-01-31 16:57:22 UTC
I added two disks to the VG, 20 GB each. That means 40GB of space is free in the volume group and there are a total of 3 disks in the VG.

Now I asked it to create a thin pool of 32GB and that worked. I can see that the metadata is on disk /dev/vdb1 and the data has been split across two disks, /dev/vdb1 and /dev/vdc1.

So lvcreate is smart enough to split data across multiple disks. And in this case it also put the data and metadata volumes on the same disk, /dev/vdb1. I am not sure how it decided that it is OK to do that.

vdb                           252:16   0   20G  0 disk 
└─vdb1                        252:17   0   20G  0 part 
  ├─fedora-docker--pool_tdata 253:4    0 31.8G  0 lvm  
  │ └─fedora-docker--pool     253:5    0 31.8G  0 lvm  
  └─fedora-docker--pool_tmeta 253:3    0   92M  0 lvm  
    └─fedora-docker--pool     253:5    0 31.8G  0 lvm  
sr0                            11:0    1 1024M  0 rom  
vdc                           252:32   0   20G  0 disk 
└─vdc1                        252:33   0   20G  0 part 
  └─fedora-docker--pool_tdata 253:4    0 31.8G  0 lvm  
    └─fedora-docker--pool     253:5    0 31.8G  0 lvm

Comment 12 Zdenek Kabelac 2017-01-31 17:00:28 UTC
After vgname/lvname there follows a PV list - so you could pass a list of any PVs (even with exact extents listed).
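
For illustration (hypothetical device names and extent range), such a PV list could look like:

# allocate only from /dev/vdb1 plus extents 0-2559 of /dev/vdc1
lvcreate -T -l40%PVS vg/pool /dev/vdb1 /dev/vdc1:0-2559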

But as said already, it's ONLY a workaround - there is no reason why docker should play the 'volume manager' role.

So the bug with %FREE simply needs to be fixed internally in lvm2.

My proposal was only meant as an 'instant workaround' for the problem.

Comment 13 Vivek Goyal 2017-01-31 17:05:47 UTC
OK, got it. So I will wait for lvm to fix this issue (until and unless some user comes screaming at me).

It would be great if this issue could get a little higher priority in the list.

Comment 16 Corey Marthaler 2018-04-11 22:58:37 UTC
I don't think this has to do exclusively with %FREE. You just need a 2 PV VG where one of the PVs is almost full; then it appears to be some calculation interaction between pool size and chunk size.

 
# I created a dummy LV to take up the majority of /dev/sda1
[root@host-087 ~]# pvs
  PV         VG            Fmt  Attr PSize   PFree
  /dev/sda1  VG            lvm2 a--  <29.99g  20.00m
  /dev/sdb1  VG            lvm2 a--  <29.99g <29.99g
[root@host-087 ~]# vgs
  VG            #PV #LV #SN Attr   VSize   VFree
  VG              2   1   0 wz--n- <59.98g <30.01g
[root@host-087 ~]# lvs
  LV    VG            Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  dummy VG            -wi-a----- <29.97g


# There's 30G free in this VG, but a 10G pool doesn't work?
[root@host-087 ~]# lvcreate -y --type thin-pool -L 10G -n docker-pool VG 
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  Insufficient suitable allocatable extents for logical volume docker-pool: 2560 more required

# But if we adjust the chunksize it does!
[root@host-087 ~]# lvcreate -y --type thin-pool -c 512K -L 10G -n docker-pool VG 
  Thin pool volume with chunk size 512.00 KiB can address at most 126.50 TiB of data.
  WARNING: Pool zeroing and 512.00 KiB large chunk size slows down thin provisioning.
  WARNING: Consider disabling zeroing (-Zn) or using smaller chunk size (<512.00 KiB).
  Logical volume "docker-pool" created.


# Reset back to earlier
[root@host-087 ~]# lvremove VG/docker-pool
Do you really want to remove active logical volume VG/docker-pool? [y/n]: y
  Logical volume "docker-pool" successfully removed
[root@host-087 ~]# lvs
  LV    VG            Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  dummy VG            -wi-a----- <29.97g                                                    


# Again, there isn't enough space for a 10G pool?
[root@host-087 ~]# lvcreate -y --type thin-pool -L 10G -n docker-pool VG 
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  Insufficient suitable allocatable extents for logical volume docker-pool: 2560 more required

# Or 12G?
[root@host-087 ~]# lvcreate -y --type thin-pool -L 12G -n docker-pool VG 
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  Insufficient suitable allocatable extents for logical volume docker-pool: 3072 more required

# Or 14G?
[root@host-087 ~]# lvcreate -y --type thin-pool -L 14G -n docker-pool VG 
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  Insufficient suitable allocatable extents for logical volume docker-pool: 3584 more required

# But then there is enough space for an 18G pool?
[root@host-087 ~]# lvcreate -y --type thin-pool -L 18G -n docker-pool VG 
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  Logical volume "docker-pool" created.

[root@host-087 ~]# lvs -a -o +devices
  LV                  VG    Attr       LSize   Pool Origin Data%  Meta%   Devices
  docker-pool         VG    twi-a-tz--  18.00g             0.00   0.53    docker-pool_tdata(0)
  [docker-pool_tdata] VG    Twi-ao----  18.00g                            /dev/sdb1(0)
  [docker-pool_tmeta] VG    ewi-ao----  20.00m                            /dev/sdb1(4608)
  dummy               VG    -wi-a----- <29.97g                            /dev/sda1(0)
  [lvol0_pmspare]     VG    ewi-------  20.00m                            /dev/sda1(7672)


3.10.0-862.el7.x86_64

lvm2-2.02.177-4.el7    BUILT: Fri Feb 16 13:22:31 CET 2018
lvm2-libs-2.02.177-4.el7    BUILT: Fri Feb 16 13:22:31 CET 2018
lvm2-cluster-2.02.177-4.el7    BUILT: Fri Feb 16 13:22:31 CET 2018
lvm2-lockd-2.02.177-4.el7    BUILT: Fri Feb 16 13:22:31 CET 2018
lvm2-python-boom-0.8.5-4.el7    BUILT: Fri Feb 16 13:37:10 CET 2018
cmirror-2.02.177-4.el7    BUILT: Fri Feb 16 13:22:31 CET 2018
device-mapper-1.02.146-4.el7    BUILT: Fri Feb 16 13:22:31 CET 2018
device-mapper-libs-1.02.146-4.el7    BUILT: Fri Feb 16 13:22:31 CET 2018
device-mapper-event-1.02.146-4.el7    BUILT: Fri Feb 16 13:22:31 CET 2018
device-mapper-event-libs-1.02.146-4.el7    BUILT: Fri Feb 16 13:22:31 CET 2018

Comment 19 Zdenek Kabelac 2018-07-09 08:46:47 UTC
I'm convinced these 2 patches in the stable branch deal with the problem:

https://www.redhat.com/archives/lvm-devel/2018-July/msg00017.html
https://www.redhat.com/archives/lvm-devel/2018-July/msg00020.html

although the allocator code has become too complex, so it's difficult to say this is the ultimate fix. But ATM it should resolve the reported issue.

Comment 21 Roman Bednář 2018-07-31 11:31:02 UTC
The patches seem to address the issue; I was not able to reproduce what's in Comment 16 with the latest rpms. However, this is still missing the unit testing requested earlier in Comment 17.


[root@virt-371 ~]# pvs
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  PV         VG            Fmt  Attr PSize  PFree 
  /dev/sda   vg            lvm2 a--  29.99g 20.00m
  /dev/sdb   vg            lvm2 a--  29.99g 29.99g
  /dev/vda2  rhel_virt-371 lvm2 a--  <7.00g     0 

[root@virt-371 ~]# lvcreate -y --type thin-pool -L 10G -n docker-pool vg
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  Logical volume "docker-pool" created.

[root@virt-371 ~]# lvremove -y vg/docker-pool
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  Logical volume "docker-pool" successfully removed

[root@virt-371 ~]# lvcreate -y --type thin-pool -L 12G -n docker-pool vg
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  Logical volume "docker-pool" created.

[root@virt-371 ~]# lvremove -y vg/docker-pool
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  Logical volume "docker-pool" successfully removed

[root@virt-371 ~]# lvcreate -y --type thin-pool -L 14G -n docker-pool vg
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  Logical volume "docker-pool" created.

[root@virt-371 ~]# lvremove -y vg/docker-pool
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  Logical volume "docker-pool" successfully removed

[root@virt-371 ~]# lvcreate -y --type thin-pool -L 18G -n docker-pool vg
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  Logical volume "docker-pool" created.

[root@virt-371 ~]# lvremove -y vg/docker-pool
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  Logical volume "docker-pool" successfully removed


lvm2-2.02.180-1.el7.x86_64

Comment 22 Zdenek Kabelac 2018-07-31 12:39:17 UTC
The patch has been accompanied by the following unit test extension:

333eb8667e73439c71dc351a02612c1c7601e8ec

https://www.redhat.com/archives/lvm-devel/2018-July/msg00016.html

---
# Check how allocator works with 2PVs where one is nearly full
lvcreate -l99%PV $vg "$dev1"
lvs -a $vg
# Check when separate metadata is required, allocation needs to fail
fail lvcreate -L10 -T --poolmetadataspare n --config 'allocation/thin_pool_metadata_require_separate_pvs=1' $vg
# Check when data and metadata may share the same PV, it shall pass
lvcreate -L10 -T --poolmetadataspare n --config 'allocation/thin_pool_metadata_require_separate_pvs=0' $vg
lvremove -f $vg

Comment 24 errata-xmlrpc 2018-10-30 11:02:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3193

