Description of problem:
Created around 150+ volumes using heketi-cli. After 150, creation is failing due to the metadata size.

heketi-cli volume create -name=vol161 -size=100 -durability="replicate" -replica=2
Error: Process exited with: 5. Reason was: ()

Heketi logs:
[sshexec] ERROR 2016/03/31 14:22:26 /builddir/build/BUILD/heketi-6563551111f7178b679866e85a0682325929d037/src/github.com/heketi/heketi/utils/ssh/ssh.go:155: Failed to run command [sudo lvcreate --poolmetadatasize 262144K -c 256K -L 52428800K -T vg_b2895b77ebe7ba3ba064544d15e54653/tp_99f7a9139e07e7f041afc4141016b4d4 -V 52428800K -n brick_99f7a9139e07e7f041afc4141016b4d4] on <node-4 ip>:22: Err[Process exited with: 5. Reason was: ()]: Stdout []: Stderr [  VG vg_b2895b77ebe7ba3ba064544d15e54653 metadata too large for circular buffer
  Failed to write VG vg_b2895b77ebe7ba3ba064544d15e54653.
]

Version-Release number of selected component (if applicable):
heketi-1.0.2-1

How reproducible:
After 150+ volumes

Steps to Reproduce:
Try to create a volume using heketi-cli

Actual results:
Failing with "Error: Process exited with: 5. Reason was: ()"

Expected results:
Volume creation should be successful.

Additional info:
Will add heketi logs and setup details.
Added bug: https://github.com/heketi/heketi/issues/268
It looks like the default metadatasize for PVs is too small: https://www.redhat.com/archives/linux-lvm/2010-November/msg00088.html May need to increase it to something like 64MB, which (I'm guessing here) should allow for about ~2-3K LVs.
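For illustration only (device and VG names below are placeholders, not from this setup), sizing the PV metadata area would look roughly like this; it has to happen at pvcreate time, since the area cannot be grown afterwards:

# hypothetical device; metadata area is fixed once the PV is created
pvcreate --metadatasize 64M /dev/sdX
vgcreate vg_example /dev/sdX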
PR https://github.com/heketi/heketi/pull/275 is available. The change sets the metadata size to 128MB. We still need to verify that this value is correct.
Checked with the #lvm community and they mentioned that RHEV/oVirt uses a 128M metadata size. I found confirmation in this email, but still haven't found the code: http://lists.ovirt.org/pipermail/users/2013-July/015435.html . I will move forward with this change using the 128M size.
Although this solution is used by oVirt, the latest LVM autoresizes the metadata. See thin_pool_autoextend_threshold in lvm.conf.
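For reference, a minimal lvm.conf sketch of that autoextend behaviour (values are illustrative, and dmeventd monitoring must be enabled for it to trigger):

# illustrative lvm.conf snippet, activation section
activation {
    monitoring = 1
    # autoextend the thin pool (and its metadata) once it is 70% full,
    # growing it by 20% each time
    thin_pool_autoextend_threshold = 70
    thin_pool_autoextend_percent = 20
}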
This is an example of mixing apples and oranges, so please spend a few minutes reading the man page (and yes, it's been me answering your question on freenode...).

-- lvm2 metadata size is the preallocated buffer that stores the metadata for a volume group, and it's typically located at the front of the disk/device. For the case where you want to keep fewer than 5K LVs, an 8MB metadata size should be OK. It's essential to get this size set properly when you create/initialize your PV/VG; changing it later is not supported by the lvm2 tools and is non-trivial.

-- thin-pool metadata is the size used to keep information about thin volumes within a single thin-pool. By default lvm2 targets 128MB, but it can easily be created bigger if you know up front that you need more. And yes, thin-pool metadata can be resized online and can be resized automatically when the threshold is reached.

Now, using thousands of active thin volumes from a single thin-pool is not advised, although it would be interesting to get any feedback comparing some workload against native (non-thin) volumes.
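In short, a hedged sketch of the two knobs (device, VG and pool names are made up; sizes are only examples):

# PV/VG metadata area: set once, at PV initialization time
pvcreate --metadatasize 8M /dev/sdX
vgcreate vg_example /dev/sdX
# thin-pool metadata: per thin-pool, can be set larger up front and resized online later
lvcreate -L 100G --poolmetadatasize 256M -T vg_example/tp_example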
Thanks Zdenek. Our model is as follows: on each single disk we create a PV, and then one VG on top of it. On the VG, we create a thin pool for each individual LV. Here is an example diagram:

+---------------+ +---------------+
|    Brick A    | |    Brick B    |
|      XFS      | |      XFS      |
+---------------+ +---------------+
|    ThinP A    | |    ThinP B    |
+---------------------------------+
|               VG                |
+---------------------------------+
|               PV                |
+---------------------------------+
|              Disk               |
+---------------------------------+
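Roughly, as an illustrative sketch of that stack per disk (names, sizes and alignment are placeholders, partly taken from the log above, not our exact code path):

# one PV + VG per disk
pvcreate --dataalignment 256K /dev/sdX
vgcreate vg_<id> /dev/sdX
# one thin pool + thin LV (brick) pair per volume, then XFS on the brick
lvcreate --poolmetadatasize 262144K -c 256K -L 52428800K -T vg_<id>/tp_<id> -V 52428800K -n brick_<id>
mkfs.xfs /dev/vg_<id>/brick_<id>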
In this case do not forget to count on a far bigger metadata size. A single thin LV + thin-pool leads to 4 LVs stored in metadata, so the size grows much faster than if you were using just 1 linear LV. And as said earlier, when the number of LVs in the lvm2 metadata grows too much, processing will get noticeably slower (lvm2 is not really a database tool ATM, and was never designed for 10K volumes or more...). Also the archiving and backup in /etc/lvm/archive might become noticeable...

So:
- for lots of LVs: pvcreate --metadatasize
- for lots of thin LVs in a single thin-pool: lvcreate --poolmetadatasize
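If the /etc/lvm/archive overhead becomes a problem, archiving can be switched off; a minimal lvm.conf sketch (illustrative):

# illustrative lvm.conf snippet, backup section
backup {
    archive = 0
}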
Created attachment 1156679 [details]
Heketi Logs
Prasanth,

It seems the heketi rpm/image you are using doesn't have the fix for this issue; in the logs it is taking the default metadata size:

pvcreate --dataalignment 256K /dev/vdd

After the fix the metadata size is "128M", and I was able to create 256+ volumes with it:

pvcreate --metadatasize=128M --dataalignment=256K /dev/sda
(In reply to Prasanth from comment #13)
> Created attachment 1156679 [details]
> Heketi Logs

Creating 183 thin-pools + thin-volumes: each one creates 4 volumes! Let's approximate 3KB of lvm2 metadata space per 4 volumes, and you easily run out of the ~500KB max metadata size you get with a 1M --metadatasize.

You can check the used space at any point with 'vgcfgbackup', or by looking at the archived size in /etc/lvm/archive in case archiving is enabled. (In this case I'd strongly advise disabling archiving and clearing the existing content of the /etc/lvm/archive directory.)

For using this amount of LVs in a single VG, you simply have to create big metadata at 'pvcreate' time, e.g.:

pvcreate --metadatasize 64M /dev/sdX
vgcreate vg /dev/sdX

Once the PV is 'pvcreated' you can't change the metadata size.

Surely no bug on the lvm2 side here; the default 1MB size is simply not big enough.
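Another way to check the metadata area usage at any point (field names are standard pvs/vgs report columns; the device and the VG name 'vg' follow the example above):

# show PV/VG metadata area size and free space
pvs -o +pv_mda_size,pv_mda_free /dev/sdX
vgs -o +vg_mda_size,vg_mda_free vg
# or dump the current metadata to a file and check its size
vgcfgbackup -f /tmp/vg-metadata.txt vg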
(In reply to Neha from comment #14)
> Prasanth,
>
> It seems the heketi rpm/image you are using doesn't have the fix for this
> issue; in the logs it is taking the default metadata size:
>
> pvcreate --dataalignment 256K /dev/vdd
>
> After the fix the metadata size is "128M", and I was able to create 256+
> volumes with it:
>
> pvcreate --metadatasize=128M --dataalignment=256K /dev/sda

It looks to me that the issue is different from the fix mentioned here. That said, in c#11 Zdenek mentioned that if we are creating lots of thin LVs we need a bigger 'poolmetadatasize'. If I understand correctly, we are hitting that limit here.
This is fixed in the server since 1.0.3. This should be moved to ON_QA.
Able to create 260 volumes on a single PV using the current heketi image. Moving it to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-1498.html