Red Hat Bugzilla – Bug 866971
Support policies for thin pool and being overfilled
Last modified: 2016-05-10 21:19:51 EDT
Description of problem:
When the thin pool grows above its defined threshold and, for various reasons, we cannot increase the size of the thin pool, lvm2 needs to define policies for such cases.
Unlike Bug 852812 - where dmeventd serves as the last resort for a failing lvextend command - these policies should allow more user-defined behavior.
We should be able to select, for example:
(+ possibly switch thin volumes into read-only mode - updates metadata)
wait/block in-flight operations and await the admin's decision
(we need to define what should happen when shutdown is requested).
Ensure this solution is persistent across error cases (e.g. shutdown).
Currently we allow startup of a thin pool that is already above the threshold - we need
to define what should happen in such a case.
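As a minimal sketch of the admin-side side of that check (the pool name, the 70% threshold, and the stubbed usage value are assumptions for illustration, not from this report), the comparison against a threshold at activation time could look like the following. On a real system the usage would come from `lvs --noheadings -o data_percent vg/pool`; here it is stubbed so the logic is self-contained:

```shell
#!/bin/sh
# Hypothetical check: warn when an activated thin pool is already above
# its configured threshold. The usage value is a stand-in for the output of:
#   lvs --noheadings -o data_percent vg/pool
POOL_DATA_PERCENT="82.15"   # stubbed lvs output (assumed value)
THRESHOLD=70                # assumed warning threshold (percent)

pct=${POOL_DATA_PERCENT%%.*}   # drop the fractional part for integer compare
if [ "$pct" -ge "$THRESHOLD" ]; then
    echo "WARNING: thin pool already at ${POOL_DATA_PERCENT}%, above ${THRESHOLD}% threshold"
fi
```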
This bug might be split into sub-features.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Let's use this BZ to enhance the current support for pools running out of space.
One of the improvements should be use of the kernel's thin-pool warning-threshold support to speed up dmeventd's reaction time.
(ATM we react instantly only when the pool gets 100% full - which is more or less the state we WANT to avoid at all costs.)
1. speed-up resize via dmeventd.
2. add some noticeable message for the user when a pool is approaching its limits, possibly during related lvm2 commands.
i.e. now we just WARN about overprovisioning - but we can do more - we could actually also tell the user he is running into serious trouble.
3. think about whether it's time to add some policy options for pools.
So, to recap what has already been done and what is being moved forward.
With release lvm2 2.02.142
(patch https://www.redhat.com/archives/lvm-devel/2016-February/msg00004.html and a few other surrounding ones)
lvm2/dmeventd has gained much better speed when resizing an out-of-space pool - the reaction should now be nearly instant, compared with the previous up-to-10-second delay - although there can still be 'tiny' rounding differences, so if you are on 'block-exact' corners the behavior might still differ.
So let's say the threshold is set to 70% - whenever you fill the pool to 75%, dmeventd should react instantly and resize it (no more 10 sec. timeouts).
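The threshold and resize step for that behavior are configured in lvm.conf; a minimal fragment matching the 70% example could look like the following (the 20% extend step is an illustrative assumption, not taken from this report):

```
activation {
    # React once thin pool usage crosses 70%...
    thin_pool_autoextend_threshold = 70
    # ...and grow the pool by 20% of its current size each time.
    thin_pool_autoextend_percent = 20
}
```

With the threshold at its default of 100, autoextension is disabled and only the full-pool handling applies.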
The missing part here is a better 'metadata' reaction - the logic in the target differs from lvm2's, so how best to combine the two is still under development.
When the threshold is below 100%, we currently do some checks and warn about overprovisioning.
The future version will deploy more advanced detection of metadata space (to utilize target logic better).
Since we recently improved status reporting for thin pools, these things will also be reused later for better warning/error reporting.
Still nothing new here, as we are not even close to finishing the existing stuff.
Marking verified in the latest rpms.
As mentioned in comments #8 and #11...
1. the resize now appears to take just a couple of seconds.
2. low-water-mark messages now appear when getting close to, but not exceeding, the threshold.
3. over-provision messages now appear when usage is greater than the pool size (with threshold off) and greater than the VG autoextend size limit (with threshold on).
lvm2-2.02.143-1.el6 BUILT: Wed Feb 24 07:59:50 CST 2016
lvm2-libs-2.02.143-1.el6 BUILT: Wed Feb 24 07:59:50 CST 2016
lvm2-cluster-2.02.143-1.el6 BUILT: Wed Feb 24 07:59:50 CST 2016
udev-147-2.71.el6 BUILT: Wed Feb 10 07:07:17 CST 2016
device-mapper-1.02.117-1.el6 BUILT: Wed Feb 24 07:59:50 CST 2016
device-mapper-libs-1.02.117-1.el6 BUILT: Wed Feb 24 07:59:50 CST 2016
device-mapper-event-1.02.117-1.el6 BUILT: Wed Feb 24 07:59:50 CST 2016
device-mapper-event-libs-1.02.117-1.el6 BUILT: Wed Feb 24 07:59:50 CST 2016
device-mapper-persistent-data-0.6.2-0.1.rc5.el6 BUILT: Wed Feb 24 07:07:09 CST 2016
cmirror-2.02.143-1.el6 BUILT: Wed Feb 24 07:59:50 CST 2016
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.