Bug 1364244

Summary: LVM Cache: limit number of cache chunks to the amount tested
Product: Red Hat Enterprise Linux 7 Reporter: Jonathan Earl Brassow <jbrassow>
Component: lvm2 Assignee: Zdenek Kabelac <zkabelac>
lvm2 sub component: Cache Logical Volumes QA Contact: cluster-qe <cluster-qe>
Status: CLOSED ERRATA Docs Contact: Milan Navratil <mnavrati>
Severity: unspecified    
Priority: unspecified CC: agk, cmarthal, hartsjc, heinzm, jbrassow, mnavrati, msnitzer, prajnoha, rbednar, thornber, zkabelac
Version: 7.3   
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: lvm2-2.02.164-4.el7 Doc Type: Release Note
Doc Text:
LVM can now set the maximum number of cache pool chunks

The new `cache_pool_max_chunks` parameter in the allocation section of the `lvm.conf` file limits the maximum number of cache pool chunks. When this parameter is undefined or set to 0, the built-in default is used.
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-04 04:16:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Description Jonathan Earl Brassow 2016-08-04 19:23:17 UTC
The size of the cache pool and the cache chunk size should be limited such that the maximum number of chunks in memory will not exceed that which we have tested.  We wish to limit the exposure of our customers to bugs that might be uncovered at extreme sizes.  Currently, we test up to 1M blocks/chunks.

I'm open to the idea that the check can be disabled.

Comment 1 Joe Thornber 2016-08-17 14:21:06 UTC
Definitely make it so this check can be disabled in lvm.conf please.  What we want is people to be forced to take an action to break the limit.  The warning in the kernel log wasn't enough to make JT realise they were using cache in an unusual way.

Comment 2 Zdenek Kabelac 2016-08-17 14:25:29 UTC
We have a number of variants to go with:


1 Just show 'WARNING: Too many cache chunks'
   But users tend to ignore any warning we print....

2 Put a hard limit and refuse to proceed
   Could be a bit too strict

3 WARN + prompt to continue:
   Commonly used in various parts of lvm2

4 Use an lvm.conf 'expert_user' setting to let the operation pass?

probably a number of others...


So which way seems preferable?

Comment 3 Alasdair Kergon 2016-08-17 18:52:33 UTC
So there should be a limit set in lvm.conf.

Checked at creation time, yes.
Checked also at activation time?

Is it a hard or soft limit (in each case, creation/activation)?
Warning message with prompt allowing override, or hard limit that can only be changed by editing the configuration?

Comment 4 Jonathan Earl Brassow 2016-08-22 14:53:11 UTC
No prompt.
Go by config limit in lvm.conf (0 = no limit).
Only applies to creation.

Comment 6 Jonathan Earl Brassow 2016-08-22 21:09:29 UTC
Verify that the limit works, i.e. you cannot create cache pools with a chunk count greater than what is set.  Users have the option of changing the limit, but it should be clear that we only support up to the original amount.

Comment 9 Zdenek Kabelac 2016-08-31 11:28:54 UTC
Current new behavior for testing:


lvm.conf settings: allocation/cache_pool_max_chunks

Limits the maximum number of cache chunks that lvm2 allows at cache pool creation.

A cache pool should not be created with more chunks than this limit.

When chunk_size is unspecified, lvm2 picks a bigger chunk_size to
fit within the max_chunks limit (just as it scales the thin-pool chunk_size).

When chunk_size is specified, the error message suggests how to proceed.

The option itself is currently undefined in lvm.conf and is evaluated at runtime (= 1000000 as of now).
A user-specified number is respected; however, when it is higher than what we see as a reasonable maximum, we still print at least a warning.


Existing cache pools are not affected by this change, and there is no max_chunks detection during activation (only the kernel logs a warning in this case).
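
For readers who want to check the numbers, the creation-time rule described above can be sketched roughly as follows. This is an illustration only, not lvm2's source: the function names are hypothetical, and the sketch assumes that the minimal chunk size is the pool size divided by the chunk limit, rounded up to the 32 KiB granularity that cache chunk sizes must follow.

```python
# Illustrative sketch; names are hypothetical and this is not lvm2 code.
CHUNK_STEP_KIB = 32             # cache chunk sizes are multiples of 32 KiB
DEFAULT_MAX_CHUNKS = 1_000_000  # built-in default quoted in this report


def min_chunk_size_kib(pool_size_kib: int, max_chunks: int = 0) -> int:
    """Smallest chunk size (KiB) that keeps the pool within the chunk limit."""
    limit = max_chunks or DEFAULT_MAX_CHUNKS   # 0 / undefined -> default
    exact = -(-pool_size_kib // limit)         # ceiling division
    steps = max(1, -(-exact // CHUNK_STEP_KIB))
    return steps * CHUNK_STEP_KIB              # round up to a 32 KiB multiple


def chunk_size_allowed(pool_size_kib: int, chunk_size_kib: int,
                       max_chunks: int = 0) -> bool:
    """True if an explicitly requested chunk size satisfies the limit."""
    return chunk_size_kib >= min_chunk_size_kib(pool_size_kib, max_chunks)


if __name__ == "__main__":
    two_gib = 2 * 1024 * 1024  # 2 GiB expressed in KiB
    print(min_chunk_size_kib(two_gib, 100))         # minimal chunk size in KiB
    print(chunk_size_allowed(two_gib, 20960, 100))  # requested size vs. limit
    print(chunk_size_allowed(two_gib, 20960, 150))
```

Under these assumptions, a 2 GiB pool with a limit of 100 chunks needs a chunk size of at least 20992 KiB (20.50 MiB), so a requested chunk size of 20960 KiB is refused, while the same request passes once the limit is raised to 150.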

Comment 11 Roman Bednář 2016-09-21 14:07:52 UTC
Verified with latest rpms. 
New lvm.conf parameter (allocation/cache_pool_max_chunks) works as expected.

NOTE:
cache_pool_max_chunks = 0 means "undefined"; the default limit (1000000) is used in this case.

========================================================
Set a value lower than the recommended maximum of 1000000 chunks and try to create a cache pool over the limit:

# lvmconfig | grep cache_pool_max_chunks
	cache_pool_max_chunks=100

# lvcreate --type cache-pool --chunksize 20960 -n CPOOL2 vg -L2G
  Using default stripesize 64.00 KiB.
  Chunk size 20.47 MiB is less then required minimal chunk size 20.50 MiB for a cache pool of 2.00 GiB size and limit 100 chunks.
  To allow use of more chunks, see setting allocation/cache_pool_max_chunks.

# echo "2097152.00/20960" | bc
100


# lvmconfig | grep cache_pool_max_chunks
	cache_pool_max_chunks=150

# lvcreate --type cache-pool --chunksize 20960 -n CPOOL2 vg -L2G
  Using default stripesize 64.00 KiB.
  Logical volume "CPOOL2" created.


Do the same, but with a cache_pool_max_chunks value higher than 1000000:
# lvmconfig | grep cache_pool_max_chunks
	cache_pool_max_chunks=1000100

Hit the limit here:
# echo "32003200/32" | bc
1000100

# lvcreate --type cache-pool --chunksize 32 -n CPOOL vg -L32003200k
  Using default stripesize 64.00 KiB.
  Rounding up size to full physical extent 30.52 GiB
  WARNING: Configured cache_pool_max_chunks value 1000100 is higher then recommended 1000000.
  Chunk size 32.00 KiB is less then required minimal chunk size 64.00 KiB for a cache pool of 30.52 GiB size and limit 1000100 chunks.
  To allow use of more chunks, see setting allocation/cache_pool_max_chunks.

Override limit:
# lvmconfig | grep cache_pool_max_chunks
	cache_pool_max_chunks=10000000

# lvcreate --type cache-pool --chunksize 32 -n CPOOL vg -L32003200k
  Using default stripesize 64.00 KiB.
  Rounding up size to full physical extent 30.52 GiB
  WARNING: Configured cache_pool_max_chunks value 10000000 is higher then recommended 1000000.
  Logical volume "CPOOL" created.
========================================================

3.10.0-505.el7.x86_64

lvm2-2.02.165-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
lvm2-libs-2.02.165-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
lvm2-cluster-2.02.165-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
device-mapper-1.02.134-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
device-mapper-libs-1.02.134-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
device-mapper-event-1.02.134-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
device-mapper-event-libs-1.02.134-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
device-mapper-persistent-data-0.6.3-1.el7    BUILT: Fri Jul 22 12:29:13 CEST 2016
cmirror-2.02.165-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016

Comment 13 errata-xmlrpc 2016-11-04 04:16:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1445.html