Bug 1432359 - Default thin pool metadata size in RHV-H is less and utilization can reach 100%
Summary: Default thin pool metadata size in RHV-H is less and utilization can reach 100%
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node-ng
Version: 4.0.6
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ovirt-4.1.1-1
Target Release: ---
Assignee: Yuval Turgeman
QA Contact: Qin Yuan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-03-15 08:41 UTC by nijin ashok
Modified: 2020-07-16 09:18 UTC (History)

Fixed In Version: imgbased-0.9.19-0.1.el7ev
Doc Type: Bug Fix
Doc Text:
Previously, Anaconda created a thin pool metadata logical volume that was too small. As a result, the metadata logical volume could reach 100% utilization. In this release, during upgrades, if the metadata logical volume is smaller than 1 GB, it is resized to 1 GB.
Clone Of:
Environment:
Last Closed: 2017-04-20 19:04:33 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 2924091 0 None None None 2017-06-18 06:15:54 UTC
Red Hat Product Errata RHEA-2017:1114 0 normal SHIPPED_LIVE redhat-virtualization-host bug fix and enhancement update 2017-04-20 22:57:46 UTC
oVirt gerrit 74487 0 'None' MERGED osupdate: resize thinpool metadata on upgrades 2021-01-29 14:54:41 UTC
oVirt gerrit 74863 0 'None' MERGED osupdate: resize thinpool metadata on upgrades 2021-01-29 14:53:58 UTC

Description nijin ashok 2017-03-15 08:41:02 UTC
Description of problem:

Metadata utilization can reach 100% when the LVM thin pool is large. In this case, the metadata LV was only 88.00m for a 5.44t pool, and its utilization reached 100% within a few months.

===

lvs -a
  LV                      VG   Attr       LSize  Pool   Origin                Data%  Meta%  Move Log Cpy%Sync Convert
  [lvol0_pmspare]         rhvh ewi------- 88.00m
  pool00                  rhvh twi-cotzM-  5.44t                              11.86  99.95
  [pool00_tdata]          rhvh Twi-ao----  5.44t
  [pool00_tmeta]          rhvh ewi-ao---- 88.00m
  rhvh-4.0-0.20170104.0   rhvh Vwi---tz-k  5.42t pool00 root
  rhvh-4.0-0.20170104.0+1 rhvh Vwi-aotz--  5.42t pool00 rhvh-4.0-0.20170104.0 11.80
  root                    rhvh Vwi-a-tz--  5.42t pool00                       0.17
  swap                    rhvh -wi-ao----  4.00g
  var                     rhvh Vwi-aotz-- 15.00g pool00                       4.09
===
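
The Meta% value shown above can also be checked by a script rather than by eye. The following is a minimal sketch and not part of imgbased: it assumes the text comes from "lvs --noheadings --separator , -o vg_name,lv_name,metadata_percent", and the function name and the 80% warning threshold are illustrative choices only.

```python
def pools_near_full(lvs_output, threshold=80.0):
    """Return (vg, lv, meta_percent) tuples for LVs at or above threshold.

    Assumed input format (one LV per line):
      lvs --noheadings --separator , -o vg_name,lv_name,metadata_percent
    """
    flagged = []
    for line in lvs_output.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 3 or not fields[2]:
            continue  # blank line, or an LV with no metadata usage to report
        vg, lv, meta = fields
        percent = float(meta)
        if percent >= threshold:
            flagged.append((vg, lv, percent))
    return flagged


# Values taken from the report above; the comma-separated layout is assumed.
sample = "  rhvh,pool00,99.95\n  rhvh,root,\n  rhvh,swap,\n"
print(pools_near_full(sample))
```

Only the thin pool carries a metadata percentage, so the other LVs are skipped.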

As per the man page of lvmthin, the recommended size is 1GiB.

===
It can be hard to predict the amount of metadata space that will be needed, so it is recommended to start with a size of 1GiB which should be enough for all practical purposes.
===

Also, by default, thin_pool_autoextend_threshold is set to 100, which disables automatic extension of the metadata LV when it reaches the threshold.

grep "thin_pool_autoextend_threshold" /etc/lvm/lvm.conf
	thin_pool_autoextend_threshold = 100
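
The same check can be scripted. This is a minimal sketch, assuming lvm.conf is read as plain text and the setting appears on an uncommented "name = value" line; both helper names are hypothetical. A missing setting is treated the same as 100, that is, autoextend disabled.

```python
import re

# Matches an uncommented "thin_pool_autoextend_threshold = N" line.
_THRESHOLD_RE = re.compile(
    r"^[ \t]*thin_pool_autoextend_threshold[ \t]*=[ \t]*(\d+)", re.MULTILINE
)


def autoextend_threshold(conf_text):
    """Return the configured threshold as an int, or None if it is not set."""
    match = _THRESHOLD_RE.search(conf_text)
    return int(match.group(1)) if match else None


def autoextend_enabled(conf_text):
    """Autoextend is only active when the threshold is below 100."""
    threshold = autoextend_threshold(conf_text)
    return threshold is not None and threshold < 100


# Usage on a host (path as shown in the grep above):
#   with open("/etc/lvm/lvm.conf") as conf:
#       print(autoextend_enabled(conf.read()))
```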

Version-Release number of selected component (if applicable):

RHV-H 7.3


Actual results:

Metadata utilization can reach 100% which can cause various issues.

Expected results:

Prevent the metadata volume from reaching 100%.

Additional info:

Comment 1 Ryan Barry 2017-03-15 20:54:11 UTC
There are two different bugs here.

One is that Anaconda does not seem to set reasonable defaults. 

The other is that RHV-H doesn't have thin_pool_autoextend_threshold set below 100 (I'd argue that this should probably be set differently in the lvm2 defaults, but we'll change it from our end).

Comment 4 Ryan Barry 2017-03-16 14:08:16 UTC
For a complete solution to this, we should:

* Check the used metadata size (and add an abstraction to imgbased for this)
* On upgrades, extend the metadata size to 1GB if it is not already (plugins.osupdater will also trigger on "layout --init", so we can also cover the install case here)
* nodectl check should ensure that this value is sane on login
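
The upgrade-time step proposed above could look roughly like the following. This is a sketch under stated assumptions and not the merged imgbased patch: the function name, the "rhvh/pool00" default, and the choice to build an "lvextend --poolmetadatasize" command line rather than run it are all illustrative.

```python
MIB = 1024 ** 2
TARGET_METADATA_BYTES = 1024 * MIB  # 1 GiB, the starting size lvmthin suggests


def metadata_extend_command(current_bytes, pool="rhvh/pool00"):
    """Return an lvextend argv that grows pool metadata to 1 GiB, or None.

    None means the metadata LV is already at least 1 GiB and is left alone.
    """
    if current_bytes >= TARGET_METADATA_BYTES:
        return None
    # Round the shortfall up to whole MiB; the "+" prefix requests a
    # size relative to the current one.
    missing_mib = -(-(TARGET_METADATA_BYTES - current_bytes) // MIB)
    return ["lvextend", "--poolmetadatasize", "+%dm" % missing_mib, pool]


# The 88.00m metadata LV from the original report:
print(metadata_extend_command(88 * MIB))
```

Returning the argument list instead of executing it keeps the decision separate from the side effect, so the size check can be exercised without touching a real volume group.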

More generally, autoextend should also be set in lvm.conf. This is tricky, though. vdsm already uses lvmlocal.conf, so we could submit a patch to have it set there. Or file a bug against lvm2 to change the default. But to my knowledge, there is nothing like '/etc/lvm.conf.d/' in which we can set one-off values (other than lvmlocal), so we'll need to seek integration with another team.

We could also just clobber this value, but this means that lvm.conf from new images won't supersede ours, which is not nice.

Comment 12 Qin Yuan 2017-04-06 06:58:16 UTC
Part1. Install test
Test Versions:
Build1:
redhat-virtualization-host-4.1-20170403.0
imgbased-0.9.20-0.1.el7ev.noarch

Test Steps:
1. Install Build1, reboot and log into Build1.
2. Run "imgbase w", "lvs -a"

Test Results:
1. After step2, the results are:
[root@fctest ~]# imgbase w
[INFO] You are on rhvh-4.1-0.20170403.0+1
[root@fctest ~]# lvs -a
  WARNING: Not using lvmetad because config setting use_lvmetad=0.
  WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
  LV                      VG          Attr       LSize   Pool   Origin                Data%  Meta%  Move Log Cpy%Sync Convert
  [lvol0_pmspare]         rhvh_fctest ewi-------  96.00m                                                                     
  pool00                  rhvh_fctest twi-aotz-- 375.11g                              2.78   0.20                            
  [pool00_tdata]          rhvh_fctest Twi-ao---- 375.11g                                                                     
  [pool00_tmeta]          rhvh_fctest ewi-ao----   1.00g                                                                     
  rhvh-4.1-0.20170403.0   rhvh_fctest Vwi---tz-k 360.11g pool00 root                                                         
  rhvh-4.1-0.20170403.0+1 rhvh_fctest Vwi-aotz-- 360.11g pool00 rhvh-4.1-0.20170403.0 2.17                                   
  root                    rhvh_fctest Vwi-a-tz-- 360.11g pool00                       2.16                                   
  swap                    rhvh_fctest -wi-ao----   7.88g                                                                     
  var                     rhvh_fctest Vwi-aotz--  15.00g pool00                       3.67    

After initial installation, the size of pool00_tmeta is 1G.

Part2. Upgrade test
Test Versions:
Build1:
redhat-virtualization-host-4.1-20170202.0
Build2:
redhat-virtualization-host-4.1-20170403.0
imgbased-0.9.20-0.1.el7ev.noarch

Test Steps:
1. Install Build1, reboot and log into Build1.
2. Run "imgbase w", "lvs -a"
3. Upgrade to Build2 using "yum update"
4. Reboot and log into Build2.
5. Run "imgbase w", "lvs -a"
6. Reboot and log into Build1.
7. Run "imgbase w", "lvs -a"

Test Results:
1. After step2, the results are:
[root@test ~]# imgbase w
[INFO] You are on rhvh-4.1-0.20170202.0+1
[root@test ~]# lvs -a
  LV                      VG        Attr       LSize   Pool   Origin                Data%  Meta%  Move Log Cpy%Sync Convert
  [lvol0_pmspare]         rhvh_test ewi------- 108.00m                                                                     
  pool00                  rhvh_test twi-aotz-- 212.00g                              1.76   1.42                            
  [pool00_tdata]          rhvh_test Twi-ao---- 212.00g                                                                     
  [pool00_tmeta]          rhvh_test ewi-ao---- 108.00m                                                                     
  rhvh-4.1-0.20170202.0   rhvh_test Vwi---tz-k 197.00g pool00 root                                                         
  rhvh-4.1-0.20170202.0+1 rhvh_test Vwi-aotz-- 197.00g pool00 rhvh-4.1-0.20170202.0 1.42                                   
  root                    rhvh_test Vwi-a-tz-- 197.00g pool00                       1.49                                   
  swap                    rhvh_test -wi-ao----   3.88g                                                                     
  var                     rhvh_test Vwi-aotz--  15.00g pool00                       3.38   

The pool00_tmeta size is 108M.

2. After step5, the results are:
[root@test ~]# imgbase w
[INFO] You are on rhvh-4.1-0.20170403.0+1
[root@test ~]# lvs -a
  WARNING: Not using lvmetad because config setting use_lvmetad=0.
  WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).
  LV                      VG        Attr       LSize   Pool   Origin                Data%  Meta%  Move Log Cpy%Sync Convert
  [lvol0_pmspare]         rhvh_test ewi------- 108.00m                                                                     
  pool00                  rhvh_test twi-aotz-- 212.00g                              3.75   0.27                            
  [pool00_tdata]          rhvh_test Twi-ao---- 212.00g                                                                     
  [pool00_tmeta]          rhvh_test ewi-ao----   1.00g                                                                     
  rhvh-4.1-0.20170202.0   rhvh_test Vwi---tz-k 197.00g pool00 root                                                         
  rhvh-4.1-0.20170202.0+1 rhvh_test Vwi-a-tz-- 197.00g pool00 rhvh-4.1-0.20170202.0 1.90                                   
  rhvh-4.1-0.20170403.0   rhvh_test Vri---tz-k 197.00g pool00                                                              
  rhvh-4.1-0.20170403.0+1 rhvh_test Vwi-aotz-- 197.00g pool00 rhvh-4.1-0.20170403.0 1.39                                   
  root                    rhvh_test Vwi-a-tz-- 197.00g pool00                       1.49                                   
  swap                    rhvh_test -wi-ao----   3.88g                                                                     
  var                     rhvh_test Vwi-aotz--  15.00g pool00                       2.91  

After upgrading to the latest version, the pool00_tmeta size is extended to 1G.

3. After step7, the results are:
[root@test ~]# imgbase w
[INFO] You are on rhvh-4.1-0.20170202.0+1
[root@test ~]# lvs -a
  LV                      VG        Attr       LSize   Pool   Origin                Data%  Meta%  Move Log Cpy%Sync Convert
  [lvol0_pmspare]         rhvh_test ewi------- 108.00m                                                                     
  pool00                  rhvh_test twi-aotz-- 212.00g                              3.89   0.28                            
  [pool00_tdata]          rhvh_test Twi-ao---- 212.00g                                                                     
  [pool00_tmeta]          rhvh_test ewi-ao----   1.00g                                                                     
  rhvh-4.1-0.20170202.0   rhvh_test Vwi---tz-k 197.00g pool00 root                                                         
  rhvh-4.1-0.20170202.0+1 rhvh_test Vwi-aotz-- 197.00g pool00 rhvh-4.1-0.20170202.0 1.91                                   
  rhvh-4.1-0.20170403.0   rhvh_test Vri---tz-k 197.00g pool00                                                              
  rhvh-4.1-0.20170403.0+1 rhvh_test Vwi-a-tz-- 197.00g pool00 rhvh-4.1-0.20170403.0 1.53                                   
  root                    rhvh_test Vwi-a-tz-- 197.00g pool00                       1.49                                   
  swap                    rhvh_test -wi-ao----   3.88g                                                                     
  var                     rhvh_test Vwi-aotz--  15.00g pool00                       2.94   

After rolling back to the old version, the pool00_tmeta size is still 1G.

Conclusion:
1. The pool metadata size is 1G after initial installation of the latest build, redhat-virtualization-host-4.1-20170403.0.
2. The pool metadata size is extended to 1G after upgrading to the latest build, and remains 1G after rolling back to the old build.
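
The size comparisons in these test results can be automated. A minimal sketch, assuming LSize strings in the default lvs format shown above, where a lowercase unit suffix is a power of 1024; the helper names are hypothetical and other formats (for example a leading "<") are not handled.

```python
# Lowercase lvs unit suffixes are binary multiples.
UNIT_BYTES = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3, "t": 1024 ** 4}


def lsize_to_bytes(lsize):
    """Convert an lvs LSize string such as '108.00m' or '1.00g' to bytes."""
    number, unit = lsize[:-1], lsize[-1]
    return int(float(number) * UNIT_BYTES[unit])


def metadata_ok(lsize, minimum="1.00g"):
    """True when the metadata LV is at least the minimum size."""
    return lsize_to_bytes(lsize) >= lsize_to_bytes(minimum)


# pool00_tmeta before and after the upgrade in Part2:
print(metadata_ok("108.00m"), metadata_ok("1.00g"))
```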

Strictly from the point of view of testing the patch, the bug is fixed, so I am setting the status to VERIFIED.

Comment 13 Emma Heftman 2017-04-09 14:30:48 UTC
Hi Yuval. I'm working on the doc text for this bug. Can you please clarify whether this fix is for upgrades only?

Comment 14 Yuval Turgeman 2017-04-13 07:28:40 UTC
Hi Emma, yes, this fix runs on upgrades at the moment.

Comment 15 errata-xmlrpc 2017-04-20 19:04:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1114

