Description of problem:

Metadata utilization can reach 100% if the LVM thin pool is large. The metadata LV was only 88.00m for a 5.44t pool, and its utilization reached 100% within a few months.

===
lvs -a
  LV                      VG   Attr       LSize  Pool   Origin                Data%  Meta%  Move Log Cpy%Sync Convert
  [lvol0_pmspare]         rhvh ewi------- 88.00m
  pool00                  rhvh twi-cotzM-  5.44t                              11.86  99.95
  [pool00_tdata]          rhvh Twi-ao----  5.44t
  [pool00_tmeta]          rhvh ewi-ao---- 88.00m
  rhvh-4.0-0.20170104.0   rhvh Vwi---tz-k  5.42t pool00 root
  rhvh-4.0-0.20170104.0+1 rhvh Vwi-aotz--  5.42t pool00 rhvh-4.0-0.20170104.0 11.80
  root                    rhvh Vwi-a-tz--  5.42t pool00                        0.17
  swap                    rhvh -wi-ao----  4.00g
  var                     rhvh Vwi-aotz-- 15.00g pool00                        4.09
===

Per the lvmthin(7) man page, the recommended metadata size is 1GiB:

===
It can be hard to predict the amount of metadata space that will be needed, so it is recommended to start with a size of 1GiB which should be enough for all practical purposes.
===

Also, by default thin_pool_autoextend_threshold is set to 100, which disables automatic extension of the pool (and its metadata LV) when it reaches the threshold:

===
grep "thin_pool_autoextend_threshold" /etc/lvm/lvm.conf
	thin_pool_autoextend_threshold = 100
===

Version-Release number of selected component (if applicable):
RHV-H 7.3

Actual results:
Metadata utilization can reach 100%, which can cause various issues.

Expected results:
Prevent the metadata volume from reaching 100%.

Additional info:
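On a live system the current metadata size and usage can be read with `lvs -o lv_metadata_size,metadata_percent` and the LV grown with `lvextend --poolmetadatasize`. The sketch below only shows the size comparison itself; the helper name is mine, not from imgbased or lvm2:

```shell
# Sketch only: convert an lvs size string such as "88.00m" or "1.00g" to
# MiB so it can be compared against the 1 GiB (1024 MiB) recommendation
# from lvmthin(7).
size_to_mib() {
    num=${1%[kmgt]}    # drop the unit suffix
    case "$1" in
        *k) awk -v n="$num" 'BEGIN { printf "%d\n", n / 1024 }' ;;
        *m) awk -v n="$num" 'BEGIN { printf "%d\n", n }' ;;
        *g) awk -v n="$num" 'BEGIN { printf "%d\n", n * 1024 }' ;;
        *t) awk -v n="$num" 'BEGIN { printf "%d\n", n * 1024 * 1024 }' ;;
    esac
}

size_to_mib 88.00m   # 88   -- the reported pool is far below the target
size_to_mib 1.00g    # 1024 -- the recommended minimum
```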
There are two different bugs here. One is that Anaconda does not seem to set reasonable defaults. The other is that RHV-H doesn't have thin_pool_autoextend_threshold set below 100 (I'd argue that this should probably be set differently in the lvm2 defaults, but we'll change it from our end).
For a complete solution to this, we should:

* Check the used metadata size (and add an abstraction to imgbased for this)
* On upgrades, extend the metadata size to 1GB if it is not already (plugins.osupdater will also trigger on "layout --init", so we can also cover the install case here)
* Have nodectl check ensure that this value is sane on login

More generally, autoextend should also be set in lvm.conf. This is tricky, though: vdsm already uses lvmlocal.conf, so we could submit a patch to have it set there, or file a bug against lvm2 to change the default. To my knowledge, however, there is nothing like '/etc/lvm.conf.d/' in which we can set one-off values (other than lvmlocal), so we would need to seek integration with another team. We could also just clobber this value, but then lvm.conf from new images won't supersede ours, which is not nice.
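For illustration, the autoextend settings in question live in the activation section of lvm.conf (or lvmlocal.conf, per the vdsm idea above). The 80/20 values below are illustrative, not something decided in this bug:

```
# activation settings in /etc/lvm/lvm.conf or /etc/lvm/lvmlocal.conf (sketch)
activation {
        # extend a thin pool once it crosses 80% usage...
        thin_pool_autoextend_threshold = 80
        # ...growing it by 20% of its current size each time
        thin_pool_autoextend_percent = 20
}
```

Note that a threshold of 100 (the shipped default) disables automatic extension entirely.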
Part 1. Install test

Test versions:
Build1: redhat-virtualization-host-4.1-20170403.0
        imgbased-0.9.20-0.1.el7ev.noarch

Test steps:
1. Install Build1, reboot and log into Build1.
2. Run "imgbase w", "lvs -a".

Test results:
1. After step 2, the results are:

[root@fctest ~]# imgbase w
[INFO] You are on rhvh-4.1-0.20170403.0+1
[root@fctest ~]# lvs -a
  WARNING: Not using lvmetad because config setting use_lvmetad=0.
  WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
  LV                      VG          Attr       LSize   Pool   Origin                Data%  Meta%  Move Log Cpy%Sync Convert
  [lvol0_pmspare]         rhvh_fctest ewi-------  96.00m
  pool00                  rhvh_fctest twi-aotz-- 375.11g                               2.78   0.20
  [pool00_tdata]          rhvh_fctest Twi-ao---- 375.11g
  [pool00_tmeta]          rhvh_fctest ewi-ao----   1.00g
  rhvh-4.1-0.20170403.0   rhvh_fctest Vwi---tz-k 360.11g pool00 root
  rhvh-4.1-0.20170403.0+1 rhvh_fctest Vwi-aotz-- 360.11g pool00 rhvh-4.1-0.20170403.0  2.17
  root                    rhvh_fctest Vwi-a-tz-- 360.11g pool00                        2.16
  swap                    rhvh_fctest -wi-ao----   7.88g
  var                     rhvh_fctest Vwi-aotz--  15.00g pool00                        3.67

After the initial installation, the size of pool00_tmeta is 1G.

Part 2. Upgrade test

Test versions:
Build1: redhat-virtualization-host-4.1-20170202.0
Build2: redhat-virtualization-host-4.1-20170403.0
        imgbased-0.9.20-0.1.el7ev.noarch

Test steps:
1. Install Build1, reboot and log into Build1.
2. Run "imgbase w", "lvs -a".
3. Upgrade to Build2 using "yum update".
4. Reboot and log into Build2.
5. Run "imgbase w", "lvs -a".
6. Reboot and log into Build1.
7. Run "imgbase w", "lvs -a".

Test results:
1.
After step 2, the results are:

[root@test ~]# imgbase w
[INFO] You are on rhvh-4.1-0.20170202.0+1
[root@test ~]# lvs -a
  LV                      VG        Attr       LSize   Pool   Origin                Data%  Meta%  Move Log Cpy%Sync Convert
  [lvol0_pmspare]         rhvh_test ewi------- 108.00m
  pool00                  rhvh_test twi-aotz-- 212.00g                               1.76   1.42
  [pool00_tdata]          rhvh_test Twi-ao---- 212.00g
  [pool00_tmeta]          rhvh_test ewi-ao---- 108.00m
  rhvh-4.1-0.20170202.0   rhvh_test Vwi---tz-k 197.00g pool00 root
  rhvh-4.1-0.20170202.0+1 rhvh_test Vwi-aotz-- 197.00g pool00 rhvh-4.1-0.20170202.0  1.42
  root                    rhvh_test Vwi-a-tz-- 197.00g pool00                        1.49
  swap                    rhvh_test -wi-ao----   3.88g
  var                     rhvh_test Vwi-aotz--  15.00g pool00                        3.38

The pool00_tmeta size is 108M.

2. After step 5, the results are:

[root@test ~]# imgbase w
[INFO] You are on rhvh-4.1-0.20170403.0+1
[root@test ~]# lvs -a
  WARNING: Not using lvmetad because config setting use_lvmetad=0.
  WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
  LV                      VG        Attr       LSize   Pool   Origin                Data%  Meta%  Move Log Cpy%Sync Convert
  [lvol0_pmspare]         rhvh_test ewi------- 108.00m
  pool00                  rhvh_test twi-aotz-- 212.00g                               3.75   0.27
  [pool00_tdata]          rhvh_test Twi-ao---- 212.00g
  [pool00_tmeta]          rhvh_test ewi-ao----   1.00g
  rhvh-4.1-0.20170202.0   rhvh_test Vwi---tz-k 197.00g pool00 root
  rhvh-4.1-0.20170202.0+1 rhvh_test Vwi-a-tz-- 197.00g pool00 rhvh-4.1-0.20170202.0  1.90
  rhvh-4.1-0.20170403.0   rhvh_test Vri---tz-k 197.00g pool00
  rhvh-4.1-0.20170403.0+1 rhvh_test Vwi-aotz-- 197.00g pool00 rhvh-4.1-0.20170403.0  1.39
  root                    rhvh_test Vwi-a-tz-- 197.00g pool00                        1.49
  swap                    rhvh_test -wi-ao----   3.88g
  var                     rhvh_test Vwi-aotz--  15.00g pool00                        2.91

After upgrading to the latest version, the pool00_tmeta size is extended to 1G.

3.
After step 7, the results are:

[root@test ~]# imgbase w
[INFO] You are on rhvh-4.1-0.20170202.0+1
[root@test ~]# lvs -a
  LV                      VG        Attr       LSize   Pool   Origin                Data%  Meta%  Move Log Cpy%Sync Convert
  [lvol0_pmspare]         rhvh_test ewi------- 108.00m
  pool00                  rhvh_test twi-aotz-- 212.00g                               3.89   0.28
  [pool00_tdata]          rhvh_test Twi-ao---- 212.00g
  [pool00_tmeta]          rhvh_test ewi-ao----   1.00g
  rhvh-4.1-0.20170202.0   rhvh_test Vwi---tz-k 197.00g pool00 root
  rhvh-4.1-0.20170202.0+1 rhvh_test Vwi-aotz-- 197.00g pool00 rhvh-4.1-0.20170202.0  1.91
  rhvh-4.1-0.20170403.0   rhvh_test Vri---tz-k 197.00g pool00
  rhvh-4.1-0.20170403.0+1 rhvh_test Vwi-a-tz-- 197.00g pool00 rhvh-4.1-0.20170403.0  1.53
  root                    rhvh_test Vwi-a-tz-- 197.00g pool00                        1.49
  swap                    rhvh_test -wi-ao----   3.88g
  var                     rhvh_test Vwi-aotz--  15.00g pool00                        2.94

After rolling back to the old version, the pool00_tmeta size is still 1G.

Conclusion:
1. The pool metadata size is 1G after an initial installation of the latest build, redhat-virtualization-host-4.1-20170403.0.
2. The pool metadata size is extended to 1G after upgrading to the latest build, and remains 1G after rolling back to the old build.

From the point of view of testing only this patch, the bug is fixed, so setting the status to VERIFIED.
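For reference, the manual tmeta check above is easy to script against captured output. A minimal sketch (the sample lines are copied from the upgrade transcript above; the parsing is mine, not part of nodectl or imgbased):

```shell
# Sketch: confirm pool00_tmeta is at the recommended 1.00g in captured
# `lvs -a` output. Sample lines copied from the upgrade test above.
lvs_output='
  [lvol0_pmspare] rhvh_test ewi------- 108.00m
  [pool00_tmeta]  rhvh_test ewi-ao----   1.00g'

# Field 4 of the tmeta row is the LV size.
tmeta=$(printf '%s\n' "$lvs_output" | awk '$1 == "[pool00_tmeta]" { print $4 }')
echo "$tmeta"   # 1.00g
```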
Hi Yuval. I'm working on the doc text for this bug. Can you please clarify whether this fix is for upgrades only?
Hi Emma, yes, this fix runs on upgrades at the moment.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1114