Bug 1301235 - new thin volume creation is not enforced when free space reaches threshold
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.8
Hardware: x86_64 Linux
Priority: unspecified   Severity: medium
Target Milestone: rc
Assigned To: LVM and device-mapper development team
QA Contact: cluster-qe@redhat.com
Reported: 2016-01-22 18:47 EST by Corey Marthaler
Modified: 2016-11-04 09:36 EDT
CC: 10 users

Doc Type: Bug Fix
Last Closed: 2016-11-04 09:36:38 EDT
Type: Bug

Description Corey Marthaler 2016-01-22 18:47:48 EST
Description of problem:
Current RHEL7.2:

[root@host-075 ~]# lvcreate  -V 1G -T snapper_thinp/POOL -n other6
  Cannot create new thin volume, free space in thin pool snapper_thinp/POOL reached threshold.

Current RHEL6.8:

Jan 22 16:25:09 host-115 lvm[9644]: WARNING: Thin pool snapper_thinp-POOL-tpool data is now 100.00% full.
device-mapper: thin: 253:4: switching pool to out-of-data-space (error IO) mode

Attempt to create other virtual volumes while pool is full and in RO mode

[root@host-115 ~]# lvcreate  -V 1G -T snapper_thinp/POOL -n other7
  WARNING: Sum of all thin volume sizes (16.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB)!
  For thin pool auto extension activation/thin_pool_autoextend_threshold should be below 100.
  /dev/snapper_thinp/other7: write failed after 0 of 4096 at 0: Input/output error
  Logical volume "other7" created.

[root@host-115 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize   Pool Origin Data%  Meta%  Devices
  POOL            snapper_thinp twi-aot-D-   1.00g             100.00 14.75  POOL_tdata(0)
  [POOL_tdata]    snapper_thinp Twi-ao----   1.00g                           /dev/sdc1(1)
  [POOL_tmeta]    snapper_thinp ewi-ao----   4.00m                           /dev/sdf1(0)
  full_snap       snapper_thinp Vwi-aot---   2.00g POOL origin 49.98
  [lvol0_pmspare] snapper_thinp ewi-------   4.00m                           /dev/sdc1(0)
  origin          snapper_thinp Vwi-a-t---   2.00g POOL        4.74
  other1          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other2          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other3          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other4          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other5          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other6          snapper_thinp Vwi-a-t---   1.00g POOL        0.00
  other7          snapper_thinp Vwi-a-t---   1.00g POOL        0.00


Version-Release number of selected component (if applicable):
2.6.32-604.el6.x86_64

lvm2-2.02.140-3.el6    BUILT: Thu Jan 21 05:40:10 CST 2016
lvm2-libs-2.02.140-3.el6    BUILT: Thu Jan 21 05:40:10 CST 2016
lvm2-cluster-2.02.140-3.el6    BUILT: Thu Jan 21 05:40:10 CST 2016
udev-147-2.63.el6_7.1    BUILT: Thu Nov 12 10:11:28 CST 2015
device-mapper-1.02.114-3.el6    BUILT: Thu Jan 21 05:40:10 CST 2016
device-mapper-libs-1.02.114-3.el6    BUILT: Thu Jan 21 05:40:10 CST 2016
device-mapper-event-1.02.114-3.el6    BUILT: Thu Jan 21 05:40:10 CST 2016
device-mapper-event-libs-1.02.114-3.el6    BUILT: Thu Jan 21 05:40:10 CST 2016
device-mapper-persistent-data-0.6.0-1.el6    BUILT: Wed Jan 20 11:23:29 CST 2016
cmirror-2.02.140-3.el6    BUILT: Thu Jan 21 05:40:10 CST 2016

How reproducible:
Every time
Comment 2 Zdenek Kabelac 2016-01-25 05:49:20 EST
(In reply to Corey Marthaler from comment #0)

> Current RHEL6.8
> 
> Jan 22 16:25:09 host-115 lvm[9644]: WARNING: Thin pool
> snapper_thinp-POOL-tpool data is now 100.00% full.
> device-mapper: thin: 253:4: switching pool to out-of-data-space (error IO)
> mode
> 
> Attempt to create other virtual volumes while pool is full and in RO mode
> 
> [root@host-115 ~]# lvcreate  -V 1G -T snapper_thinp/POOL -n other7
>   WARNING: Sum of all thin volume sizes (16.00 GiB) exceeds the size of thin
> pool snapper_thinp/POOL (1.00 GiB)!
>   For thin pool auto extension activation/thin_pool_autoextend_threshold
> should be below 100.
>   /dev/snapper_thinp/other7: write failed after 0 of 4096 at 0: Input/output
> error
>   Logical volume "other7" created.


My estimate of what happens here:

On RHEL7 you had the threshold configured below 100% - in that case lvm2 & dmeventd monitor the free space in the thin pool and do not allow creating a new LV once usage is above the threshold.
The same threshold used for 'autoextension' is also used for this 'guard' mechanism.
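
As an illustration, a minimal lvm.conf sketch of such a configuration (the percentages here are only examples, not values from this report):

    activation {
        # below 100: dmeventd auto-extends the pool once usage passes 70%,
        # and lvcreate refuses new thin LVs past the same threshold
        thin_pool_autoextend_threshold = 70
        # each auto-extension grows the pool by 20% of its current size
        thin_pool_autoextend_percent = 20
    }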

On RHEL6.8 the warning shows the threshold is set to 100 (if I'm wrong here please provide lvm.conf), so no bounds are checked and the user is allowed to create a new thin LV. Clearing that LV will 'fail' on the full thin pool - but the thin LV has already been created before the clearing step.

So is there a bug in lvm.conf parsing?
Comment 3 Corey Marthaler 2016-01-25 10:46:56 EST
The threshold was turned off (100) in both the 7.2 and 6.8 cases. It's hard to run full-thin-pool test cases with the threshold enabled, since that triggers auto-extensions.
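
For reference, the effective value can also be queried directly instead of grepping the file ('lvm dumpconfig' on these releases; newer releases call it 'lvmconfig'):

    lvm dumpconfig activation/thin_pool_autoextend_threshold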



# RHEL7.2

[root@host-075 ~]# uptime
 09:14:47 up 1 min,  2 users,  load average: 0.29, 0.18, 0.07

[root@host-075 ~]# grep thin_pool_autoextend_threshold /etc/lvm/lvm.conf
        # Configuration option activation/thin_pool_autoextend_threshold.
        # thin_pool_autoextend_threshold = 70
        thin_pool_autoextend_threshold = 100

[root@host-075 ~]# ps -ef | grep dmeventd
root      2427  2373  0 09:14 pts/0    00:00:00 grep --color=auto dmeventd

[root@host-075 ~]# lvcreate  -k n -s /dev/snapper_thinp/origin -n full_snap
  WARNING: Sum of all thin volume sizes (10.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB)!
  For thin pool auto extension activation/thin_pool_autoextend_threshold should be below 100.
  Logical volume "full_snap" created.

[root@host-075 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize   Pool Origin Data%  Meta% Devices
  POOL            snapper_thinp twi-aot---   1.00g             0.02   1.37  POOL_tdata(0)
  [POOL_tdata]    snapper_thinp Twi-ao----   1.00g                          /dev/sdc1(1)
  [POOL_tmeta]    snapper_thinp ewi-ao----   4.00m                          /dev/sdf1(0)
  full_snap       snapper_thinp Vwi-a-t---   2.00g POOL origin 0.00
  [lvol0_pmspare] snapper_thinp ewi-------   4.00m                          /dev/sdc1(0)
  origin          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other1          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other2          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other3          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
[root@host-075 ~]# dd if=/dev/zero of=/dev/snapper_thinp/full_snap count=1500 bs=1M oflag=direct
^C
[1]+  Stopped                 dd if=/dev/zero of=/dev/snapper_thinp/full_snap count=1500 bs=1M oflag=direct

[root@host-075 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize   Pool Origin Data%  Meta% Devices
  POOL            snapper_thinp twi-aot-D-   1.00g             100.00 14.06 POOL_tdata(0)
  [POOL_tdata]    snapper_thinp Twi-ao----   1.00g                          /dev/sdc1(1)
  [POOL_tmeta]    snapper_thinp ewi-ao----   4.00m                          /dev/sdf1(0)
  full_snap       snapper_thinp Vwi-aot---   2.00g POOL origin 49.99
  [lvol0_pmspare] snapper_thinp ewi-------   4.00m                          /dev/sdc1(0)
  origin          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other1          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other2          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other3          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
[root@host-075 ~]# lvcreate  -V 2G -T snapper_thinp/POOL -n other4
  Cannot create new thin volume, free space in thin pool snapper_thinp/POOL reached threshold.

[root@host-075 ~]# ps -ef | grep dmeventd
root      2484     1  0 09:15 ?        00:00:00 /usr/sbin/dmeventd -f
root      2666  2373  0 09:39 pts/0    00:00:00 grep --color=auto dmeventd




# RHEL6.8

[root@host-116 ~]# uptime
 09:23:48 up 9 min,  1 user,  load average: 0.07, 0.05, 0.05

[root@host-116 ~]# grep thin_pool_autoextend_threshold /etc/lvm/lvm.conf
        # Configuration option activation/thin_pool_autoextend_threshold.
        # thin_pool_autoextend_threshold = 70
    thin_pool_autoextend_threshold = 100

[root@host-116 ~]# ps -ef | grep dmeventd
root      2342  2320  0 09:24 pts/0    00:00:00 grep dmeventd

[root@host-116 ~]# lvcreate  -k n -s /dev/snapper_thinp/origin -n full_snap
  WARNING: Sum of all thin volume sizes (10.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB)!
  For thin pool auto extension activation/thin_pool_autoextend_threshold should be below 100.
  Logical volume "full_snap" created.
[root@host-116 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize   Pool Origin Data%  Meta% Devices
  POOL            snapper_thinp twi-aot---   1.00g             0.02   1.37  POOL_tdata(0)
  [POOL_tdata]    snapper_thinp Twi-ao----   1.00g                          /dev/sda1(1)
  [POOL_tmeta]    snapper_thinp ewi-ao----   4.00m                          /dev/sdf1(0)
  full_snap       snapper_thinp Vwi-a-t---   2.00g POOL origin 0.00
  [lvol0_pmspare] snapper_thinp ewi-------   4.00m                          /dev/sda1(0)
  origin          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other1          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other2          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other3          snapper_thinp Vwi-a-t---   2.00g POOL        0.00

[root@host-116 ~]# dd if=/dev/zero of=/dev/snapper_thinp/full_snap count=1500 bs=1M oflag=direct
dd: writing `/dev/snapper_thinp/full_snap': Input/output error
1024+0 records in
1023+0 records out
1072693248 bytes (1.1 GB) copied, 70.7544 s, 15.2 MB/s
[root@host-116 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize   Pool Origin Data%  Meta% Devices
  POOL            snapper_thinp twi-aot-D-   1.00g             100.00 14.06 POOL_tdata(0)
  [POOL_tdata]    snapper_thinp Twi-ao----   1.00g                          /dev/sda1(1)
  [POOL_tmeta]    snapper_thinp ewi-ao----   4.00m                          /dev/sdf1(0)
  full_snap       snapper_thinp Vwi-a-t---   2.00g POOL origin 49.99
  [lvol0_pmspare] snapper_thinp ewi-------   4.00m                          /dev/sda1(0)
  origin          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other1          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other2          snapper_thinp Vwi-a-t---   2.00g POOL        0.00
  other3          snapper_thinp Vwi-a-t---   2.00g POOL        0.00

[root@host-116 ~]# lvcreate  -V 2G -T snapper_thinp/POOL -n other4
  WARNING: Sum of all thin volume sizes (12.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB)!
  For thin pool auto extension activation/thin_pool_autoextend_threshold should be below 100.
  /dev/snapper_thinp/other4: write failed after 0 of 4096 at 0: Input/output error
  Logical volume "other4" created.

[root@host-116 ~]# ps -ef | grep dmeventd
root      2400     1  0 09:24 ?        00:00:00 /sbin/dmeventd
root      2594  2320  0 09:39 pts/0    00:00:00 grep dmeventd
Comment 4 Zdenek Kabelac 2016-01-26 05:35:32 EST
Still not sure I get it right - but is this about seeing a waiting 'dd' process and then an instantly 'errored' one (the pool flagged 'D')?

Before the thin pool switches to 'D' there is an implicit timeout (configurable as a kernel module parameter) - ATM 60s.

So whoever fills the thin pool first hits the timeout.

So even if there is no 'dmeventd' autoextend, the user still has a 60s window to resize the pool himself. (With monitoring enabled, dmeventd logs some messages to syslog.)

So if 'dd' is the first app to reach this full-pool data state, it will wait 60s before it errors out.
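
As a sketch of that window (assuming a kernel that exposes the dm-thin timeout as the 'no_space_timeout' module parameter - it may not exist on older kernels):

    # the out-of-data-space timeout in seconds (0 = wait forever)
    cat /sys/module/dm_thin_pool/parameters/no_space_timeout
    # growing the pool data LV within the window avoids the error mode
    lvextend -L +1G snapper_thinp/POOL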

There is even an 'ugly' hidden issue if you had not used 'odirect' - 'dd' may fill the page cache and exit without even telling you there is a problem.
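
For example (conv=fsync is a GNU dd option that forces the flush, so the write error surfaces before dd exits):

    dd if=/dev/zero of=/dev/snapper_thinp/full_snap bs=1M count=1500 conv=fsync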

Anyway - so far I'm still confused about what the actual difference between 7.2 and 6.8 is.

(probably ping me)
Comment 6 Zdenek Kabelac 2016-01-26 10:55:12 EST
OK - so to recap the discussion:

lvm2 now checks threshold boundaries correctly.

i.e. with the threshold set to 80% (for snapshots and thin volumes), usage of exactly 80% is still valid; only when it goes above is the size increased. So you can also create a thin LV
when the thin pool is exactly at the given threshold.

The fact that this in the past accidentally 'guarded' against creating a new thin LV when the thin pool was 100% full and the threshold was set to 100 was rather a side effect of a '>=' check (which is now correctly '>' everywhere).

The remaining open question now is: do we want to 'reintroduce' an extra check for a 100% full thin pool?

We may also want to check against a slightly smaller value, e.g. 95% (the last value dmeventd usually reports before the pool gets 100% full),
as checking for 100 is 'too late' anyway.
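
As a sketch, a script could enforce such a cutoff itself before creating (data_percent is a standard lvs report field; the 95 here just mirrors the value above):

    # only create the new thin LV while pool data usage is below 95%
    usage=$(lvs --noheadings -o data_percent snapper_thinp/POOL | tr -d ' ')
    if [ "${usage%.*}" -lt 95 ]; then
        lvcreate -V 1G -T snapper_thinp/POOL -n new_thin
    fi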

It's relatively easy to add.

Just the origin of this bug is rather a 'feature misuse' than actually losing some supported & documented behavior...
Comment 7 Corey Marthaler 2016-01-26 11:51:30 EST
Looks like the 6.7 behavior was even worse. 

[root@host-117 ~]# lvcreate -L 200M -T vgtest/mythinpool -V1G -n thin1
  Logical volume "thin1" created.
[root@host-117 ~]# lvs -a -o +devices
  LV                 VG     Attr       LSize   Pool       Origin Data%  Meta% Devices
  [lvol0_pmspare]    vgtest ewi-------   4.00m                                /dev/sda1(0)
  mythinpool         vgtest twi-aotz-- 200.00m                   0.00   0.98  mythinpool_tdata(0)
  [mythinpool_tdata] vgtest Twi-ao---- 200.00m                                /dev/sda1(1)
  [mythinpool_tmeta] vgtest ewi-ao----   4.00m                                /dev/sdh1(0)
  thin1              vgtest Vwi-a-tz--   1.00g mythinpool        0.00

[root@host-117 ~]# dd if=/dev/zero of=/dev/mapper/vgtest-thin1 bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 64.237 s, 16.7 MB/s

[root@host-117 ~]# lvs -a -o +devices
  /dev/vgtest/thin1: read failed after 0 of 4096 at 1073676288: Input/output error
  /dev/vgtest/thin1: read failed after 0 of 4096 at 1073733632: Input/output error
  LV                 VG     Attr       LSize   Pool       Origin Data%  Meta% Devices
  [lvol0_pmspare]    vgtest ewi-------   4.00m                                /dev/sda1(0)
  mythinpool         vgtest twi-aotzM- 200.00m                   100.00 3.42  mythinpool_tdata(0)
  [mythinpool_tdata] vgtest Twi-ao---- 200.00m                                /dev/sda1(1)
  [mythinpool_tmeta] vgtest ewi-ao----   4.00m                                /dev/sdh1(0)
  thin1              vgtest Vwi-a-tz--   1.00g mythinpool        19.53

[root@host-117 ~]# lvcreate -n thin2 -V 1G -T vgtest/mythinpool
  /dev/vgtest/thin1: read failed after 0 of 4096 at 1073676288: Input/output error
  /dev/vgtest/thin1: read failed after 0 of 4096 at 1073733632: Input/output error
  device-mapper: message ioctl on  failed: Invalid argument
  Failed to resume mythinpool.


2.6.32-573.12.1.el6.x86_64

lvm2-2.02.118-3.el6_7.4    BUILT: Tue Nov 10 12:12:57 CST 2015
lvm2-libs-2.02.118-3.el6_7.4    BUILT: Tue Nov 10 12:12:57 CST 2015
lvm2-cluster-2.02.118-3.el6_7.4    BUILT: Tue Nov 10 12:12:57 CST 2015
udev-147-2.63.el6_7.1    BUILT: Thu Nov 12 10:11:28 CST 2015
device-mapper-1.02.95-3.el6_7.4    BUILT: Tue Nov 10 12:12:57 CST 2015
device-mapper-libs-1.02.95-3.el6_7.4    BUILT: Tue Nov 10 12:12:57 CST 2015
device-mapper-event-1.02.95-3.el6_7.4    BUILT: Tue Nov 10 12:12:57 CST 2015
device-mapper-event-libs-1.02.95-3.el6_7.4    BUILT: Tue Nov 10 12:12:57 CST 2015
device-mapper-persistent-data-0.3.2-1.el6    BUILT: Fri Apr  4 08:43:06 CDT 2014
Comment 9 Zdenek Kabelac 2016-11-04 09:31:56 EDT
As Bug 1189221 suggests - this has all been resolved with version 2.02.124 and later.
Comment 10 Peter Rajnoha 2016-11-04 09:36:38 EDT
(In reply to Zdenek Kabelac from comment #9)
> As Bug 1189221 suggests - this has all been resolved with version 2.02.124
> and later.

That is included in RHEL 6.8 lvm2 package.
