Description of problem:

Current RHEL7.2:

[root@host-075 ~]# lvcreate -V 1G -T snapper_thinp/POOL -n other6
  Cannot create new thin volume, free space in thin pool snapper_thinp/POOL reached threshold.

Current RHEL6.8:

Jan 22 16:25:09 host-115 lvm[9644]: WARNING: Thin pool snapper_thinp-POOL-tpool data is now 100.00% full.
device-mapper: thin: 253:4: switching pool to out-of-data-space (error IO) mode

Attempt to create other virtual volumes while the pool is full and in RO mode:

[root@host-115 ~]# lvcreate -V 1G -T snapper_thinp/POOL -n other7
  WARNING: Sum of all thin volume sizes (16.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB)!
  For thin pool auto extension activation/thin_pool_autoextend_threshold should be below 100.
  /dev/snapper_thinp/other7: write failed after 0 of 4096 at 0: Input/output error
  Logical volume "other7" created.

[root@host-115 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize Pool Origin Data%  Meta%  Devices
  POOL            snapper_thinp twi-aot-D- 1.00g             100.00 14.75  POOL_tdata(0)
  [POOL_tdata]    snapper_thinp Twi-ao---- 1.00g                           /dev/sdc1(1)
  [POOL_tmeta]    snapper_thinp ewi-ao---- 4.00m                           /dev/sdf1(0)
  full_snap       snapper_thinp Vwi-aot--- 2.00g POOL origin 49.98
  [lvol0_pmspare] snapper_thinp ewi------- 4.00m                           /dev/sdc1(0)
  origin          snapper_thinp Vwi-a-t--- 2.00g POOL        4.74
  other1          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other2          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other3          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other4          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other5          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other6          snapper_thinp Vwi-a-t--- 1.00g POOL        0.00
  other7          snapper_thinp Vwi-a-t--- 1.00g POOL        0.00

Version-Release number of selected component (if applicable):
2.6.32-604.el6.x86_64

lvm2-2.02.140-3.el6                          BUILT: Thu Jan 21 05:40:10 CST 2016
lvm2-libs-2.02.140-3.el6                     BUILT: Thu Jan 21 05:40:10 CST 2016
lvm2-cluster-2.02.140-3.el6                  BUILT: Thu Jan 21 05:40:10 CST 2016
udev-147-2.63.el6_7.1                        BUILT: Thu Nov 12 10:11:28 CST 2015
device-mapper-1.02.114-3.el6                 BUILT: Thu Jan 21 05:40:10 CST 2016
device-mapper-libs-1.02.114-3.el6            BUILT: Thu Jan 21 05:40:10 CST 2016
device-mapper-event-1.02.114-3.el6           BUILT: Thu Jan 21 05:40:10 CST 2016
device-mapper-event-libs-1.02.114-3.el6      BUILT: Thu Jan 21 05:40:10 CST 2016
device-mapper-persistent-data-0.6.0-1.el6    BUILT: Wed Jan 20 11:23:29 CST 2016
cmirror-2.02.140-3.el6                       BUILT: Thu Jan 21 05:40:10 CST 2016

How reproducible:
Every time
(In reply to Corey Marthaler from comment #0)
> Current RHEL6.8
>
> Jan 22 16:25:09 host-115 lvm[9644]: WARNING: Thin pool
> snapper_thinp-POOL-tpool data is now 100.00% full.
> device-mapper: thin: 253:4: switching pool to out-of-data-space (error IO)
> mode
>
> Attempt to create other virtual volumes while pool is full and in RO mode
>
> [root@host-115 ~]# lvcreate -V 1G -T snapper_thinp/POOL -n other7
>   WARNING: Sum of all thin volume sizes (16.00 GiB) exceeds the size of thin
>   pool snapper_thinp/POOL (1.00 GiB)!
>   For thin pool auto extension activation/thin_pool_autoextend_threshold
>   should be below 100.
>   /dev/snapper_thinp/other7: write failed after 0 of 4096 at 0: Input/output
>   error
>   Logical volume "other7" created.

So what happens here - my estimate:

On RHEL7 you had the threshold configured below 100% - in that case lvm2 & dmeventd monitor the amount of free space in the thin pool and do not allow creating a new LV when usage is above the threshold. The same threshold used for 'autoextension' is also used for this 'guard' mechanism.

On RHEL6.8 it complains that the threshold is set to 100 (if I'm wrong here, please provide lvm.conf), so no bounds are checked and the user is allowed to create a new thin LV. Clearing such an LV will 'fail' on a full thin-pool - but the thin LV has already been created before the clearing.

So is there a bug in lvm.conf parsing?
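The guard/autoextension mechanism described above is driven by the autoextend settings in lvm.conf; a minimal illustrative fragment (the values are examples, not recommendations) looks like:

```
# /etc/lvm/lvm.conf -- illustrative fragment only
activation {
    # A value below 100 enables dmeventd monitoring of pool free space
    # (and thus the "guard" against creating new thin LVs past the
    # threshold); 100 turns the mechanism off.
    thin_pool_autoextend_threshold = 70

    # How much to grow the pool (as a % of its current size) once the
    # threshold is crossed.
    thin_pool_autoextend_percent = 20
}
```

Note that, as discussed in this bug, the same threshold value serves both autoextension and the creation-time free-space check.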
The threshold was turned off (100) in both the 7.2 and 6.8 cases. It's hard to exercise full-thin-pool cases with the threshold on, since that causes auto extensions.

# RHEL7.2

[root@host-075 ~]# uptime
 09:14:47 up 1 min,  2 users,  load average: 0.29, 0.18, 0.07
[root@host-075 ~]# grep thin_pool_autoextend_threshold /etc/lvm/lvm.conf
        # Configuration option activation/thin_pool_autoextend_threshold.
        # thin_pool_autoextend_threshold = 70
        thin_pool_autoextend_threshold = 100
[root@host-075 ~]# ps -ef | grep dmeventd
root      2427  2373  0 09:14 pts/0    00:00:00 grep --color=auto dmeventd

[root@host-075 ~]# lvcreate -k n -s /dev/snapper_thinp/origin -n full_snap
  WARNING: Sum of all thin volume sizes (10.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB)!
  For thin pool auto extension activation/thin_pool_autoextend_threshold should be below 100.
  Logical volume "full_snap" created.
[root@host-075 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize Pool Origin Data%  Meta%  Devices
  POOL            snapper_thinp twi-aot--- 1.00g             0.02   1.37   POOL_tdata(0)
  [POOL_tdata]    snapper_thinp Twi-ao---- 1.00g                           /dev/sdc1(1)
  [POOL_tmeta]    snapper_thinp ewi-ao---- 4.00m                           /dev/sdf1(0)
  full_snap       snapper_thinp Vwi-a-t--- 2.00g POOL origin 0.00
  [lvol0_pmspare] snapper_thinp ewi------- 4.00m                           /dev/sdc1(0)
  origin          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other1          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other2          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other3          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00

[root@host-075 ~]# dd if=/dev/zero of=/dev/snapper_thinp/full_snap count=1500 bs=1M oflag=direct
^C
[1]+  Stopped                 dd if=/dev/zero of=/dev/snapper_thinp/full_snap count=1500 bs=1M oflag=direct

[root@host-075 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize Pool Origin Data%  Meta%  Devices
  POOL            snapper_thinp twi-aot-D- 1.00g             100.00 14.06  POOL_tdata(0)
  [POOL_tdata]    snapper_thinp Twi-ao---- 1.00g                           /dev/sdc1(1)
  [POOL_tmeta]    snapper_thinp ewi-ao---- 4.00m                           /dev/sdf1(0)
  full_snap       snapper_thinp Vwi-aot--- 2.00g POOL origin 49.99
  [lvol0_pmspare] snapper_thinp ewi------- 4.00m                           /dev/sdc1(0)
  origin          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other1          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other2          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other3          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00

[root@host-075 ~]# lvcreate -V 2G -T snapper_thinp/POOL -n other4
  Cannot create new thin volume, free space in thin pool snapper_thinp/POOL reached threshold.
[root@host-075 ~]# ps -ef | grep dmeventd
root      2484     1  0 09:15 ?        00:00:00 /usr/sbin/dmeventd -f
root      2666  2373  0 09:39 pts/0    00:00:00 grep --color=auto dmeventd

# RHEL6.8

[root@host-116 ~]# uptime
 09:23:48 up 9 min,  1 user,  load average: 0.07, 0.05, 0.05
[root@host-116 ~]# grep thin_pool_autoextend_threshold /etc/lvm/lvm.conf
        # Configuration option activation/thin_pool_autoextend_threshold.
        # thin_pool_autoextend_threshold = 70
        thin_pool_autoextend_threshold = 100
[root@host-116 ~]# ps -ef | grep dmeventd
root      2342  2320  0 09:24 pts/0    00:00:00 grep dmeventd

[root@host-116 ~]# lvcreate -k n -s /dev/snapper_thinp/origin -n full_snap
  WARNING: Sum of all thin volume sizes (10.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB)!
  For thin pool auto extension activation/thin_pool_autoextend_threshold should be below 100.
  Logical volume "full_snap" created.
[root@host-116 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize Pool Origin Data%  Meta%  Devices
  POOL            snapper_thinp twi-aot--- 1.00g             0.02   1.37   POOL_tdata(0)
  [POOL_tdata]    snapper_thinp Twi-ao---- 1.00g                           /dev/sda1(1)
  [POOL_tmeta]    snapper_thinp ewi-ao---- 4.00m                           /dev/sdf1(0)
  full_snap       snapper_thinp Vwi-a-t--- 2.00g POOL origin 0.00
  [lvol0_pmspare] snapper_thinp ewi------- 4.00m                           /dev/sda1(0)
  origin          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other1          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other2          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other3          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00

[root@host-116 ~]# dd if=/dev/zero of=/dev/snapper_thinp/full_snap count=1500 bs=1M oflag=direct
dd: writing `/dev/snapper_thinp/full_snap': Input/output error
1024+0 records in
1023+0 records out
1072693248 bytes (1.1 GB) copied, 70.7544 s, 15.2 MB/s

[root@host-116 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize Pool Origin Data%  Meta%  Devices
  POOL            snapper_thinp twi-aot-D- 1.00g             100.00 14.06  POOL_tdata(0)
  [POOL_tdata]    snapper_thinp Twi-ao---- 1.00g                           /dev/sda1(1)
  [POOL_tmeta]    snapper_thinp ewi-ao---- 4.00m                           /dev/sdf1(0)
  full_snap       snapper_thinp Vwi-a-t--- 2.00g POOL origin 49.99
  [lvol0_pmspare] snapper_thinp ewi------- 4.00m                           /dev/sda1(0)
  origin          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other1          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other2          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00
  other3          snapper_thinp Vwi-a-t--- 2.00g POOL        0.00

[root@host-116 ~]# lvcreate -V 2G -T snapper_thinp/POOL -n other4
  WARNING: Sum of all thin volume sizes (12.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (1.00 GiB)!
  For thin pool auto extension activation/thin_pool_autoextend_threshold should be below 100.
  /dev/snapper_thinp/other4: write failed after 0 of 4096 at 0: Input/output error
  Logical volume "other4" created.
[root@host-116 ~]# ps -ef | grep dmeventd
root      2400     1  0 09:24 ?        00:00:00 /sbin/dmeventd
root      2594  2320  0 09:39 pts/0    00:00:00 grep dmeventd
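The over-provisioning WARNING seen in the transcripts is plain arithmetic: the sum of the virtual sizes of all thin LVs compared against the pool size. A tiny standalone sketch of that condition (not lvm2 source code; the function name is made up, sizes in whole GiB):

```shell
# Report whether a set of thin LVs over-provisions a pool.
# usage: overprovisioned POOL_GIB LV_GIB...
overprovisioned() {
  pool=$1; shift
  sum=0
  # Add up the virtual sizes of every thin LV in the pool.
  for s in "$@"; do sum=$((sum + s)); done
  [ "$sum" -gt "$pool" ] && echo "overprovisioned: ${sum} GiB > ${pool} GiB" \
                         || echo "fits"
}

# The RHEL6.8 case above: a 1 GiB pool holding thin LVs whose virtual
# sizes sum to 16 GiB (origin, full_snap, other1-5 at 2 GiB; other6-7 at 1 GiB).
overprovisioned 1 2 2 2 2 2 2 2 1 1
```

Over-provisioning itself is a supported thin-provisioning feature; the WARNING is only a reminder that autoextension is disabled while the pool can run out of real space.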
Still not sure if I get it right - but is this about seeing a waiting 'dd' process and an instantly 'errored' one (the "D" flag)?

Before the thin-pool switches to 'D' there is an implicit (yet kernel modprobe-configurable) timeout - ATM 60s. So whoever fills the thin-pool first hits the timeout. Even with no 'dmeventd' autoextend, the user still has a 60s window to resize the pool himself. (With monitoring enabled, dmeventd throws some messages into syslog.)

So if 'dd' is the first app to reach this full-pool data state, it will wait 60s before it errors out. There is even an 'ugly' hidden issue if you had not used 'oflag=direct' - 'dd' may fill the page cache and exit without even telling you there is a problem.

Anyway - so far I'm still confused about what the actual difference between 7.2 and 6.8 is. (probably ping me)
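The modprobe-configurable timeout mentioned above is, on kernels whose dm-thin build exposes it, the 'no_space_timeout' module parameter (in seconds; 0 means queue I/O indefinitely). An illustrative fragment - the value 120 is only an example, and availability of the parameter depends on the kernel build:

```
# /etc/modprobe.d/dm-thin.conf -- example only
options dm_thin_pool no_space_timeout=120

# Or inspected/changed at runtime, if the running kernel exposes it:
#   cat /sys/module/dm_thin_pool/parameters/no_space_timeout
#   echo 120 > /sys/module/dm_thin_pool/parameters/no_space_timeout
```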
Ok - so to recap the discussion - lvm2 now correctly checks the threshold borders. I.e. when set to 80% (for snapshot and thin): at exactly 80% it's still valid; when it's above, the size is increased. So you can also create a thin LV when the thin-pool is exactly at the given threshold.

The fact that this in the past accidentally acted as a 'guard' against creating a new thin LV when the thin-pool was 100% full and the threshold was set to 100 was rather a side-effect of the '>=' check (which is now correctly '>' everywhere).

The remaining open question now is - do we want to 'reintroduce' an extra check for a 100% full thin-pool? We may also want to check for a slightly smaller value, e.g. 95% (as this is the last value reported by dmeventd before the pool usually gets 100% full), since checking for 100 is 'too late' anyway. It's relatively easy to add. Just the origin of this bug is rather 'feature misuse' than actually losing some supported & documented one...
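The '>=' vs '>' boundary change can be sketched like this (a toy illustration of the comparison semantics only, not the actual lvm2 source; the function names are made up):

```shell
# Hypothetical sketch of the guard boundary change; usage and
# threshold are whole percentages.
old_guard() { [ "$1" -ge "$2" ] && echo refuse || echo allow; }  # pre-fix: >=
new_guard() { [ "$1" -gt "$2" ] && echo refuse || echo allow; }  # fixed:   >

# With threshold=100 and a pool at exactly 100% data usage:
old_guard 100 100   # prints "refuse" -- the accidental guard
new_guard 100 100   # prints "allow"  -- the behavior reported in this bug
```

With the threshold semantics fixed, a 100% setting never triggers the check, which is exactly why the RHEL6.8 build lets the new thin LV be created on a full pool.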
Looks like the 6.7 behavior was even worse.

[root@host-117 ~]# lvcreate -L 200M -T vgtest/mythinpool -V1G -n thin1
  Logical volume "thin1" created.
[root@host-117 ~]# lvs -a -o +devices
  LV                 VG     Attr       LSize   Pool       Origin Data%  Meta% Devices
  [lvol0_pmspare]    vgtest ewi-------   4.00m                                /dev/sda1(0)
  mythinpool         vgtest twi-aotz-- 200.00m                   0.00   0.98  mythinpool_tdata(0)
  [mythinpool_tdata] vgtest Twi-ao---- 200.00m                                /dev/sda1(1)
  [mythinpool_tmeta] vgtest ewi-ao----   4.00m                                /dev/sdh1(0)
  thin1              vgtest Vwi-a-tz--   1.00g mythinpool        0.00

[root@host-117 ~]# dd if=/dev/zero of=/dev/mapper/vgtest-thin1 bs=1M count=1K
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 64.237 s, 16.7 MB/s

[root@host-117 ~]# lvs -a -o +devices
  /dev/vgtest/thin1: read failed after 0 of 4096 at 1073676288: Input/output error
  /dev/vgtest/thin1: read failed after 0 of 4096 at 1073733632: Input/output error
  LV                 VG     Attr       LSize   Pool       Origin Data%  Meta% Devices
  [lvol0_pmspare]    vgtest ewi-------   4.00m                                /dev/sda1(0)
  mythinpool         vgtest twi-aotzM- 200.00m                   100.00 3.42  mythinpool_tdata(0)
  [mythinpool_tdata] vgtest Twi-ao---- 200.00m                                /dev/sda1(1)
  [mythinpool_tmeta] vgtest ewi-ao----   4.00m                                /dev/sdh1(0)
  thin1              vgtest Vwi-a-tz--   1.00g mythinpool        19.53

[root@host-117 ~]# lvcreate -n thin2 -V 1G -T vgtest/mythinpool
  /dev/vgtest/thin1: read failed after 0 of 4096 at 1073676288: Input/output error
  /dev/vgtest/thin1: read failed after 0 of 4096 at 1073733632: Input/output error
  device-mapper: message ioctl on  failed: Invalid argument
  Failed to resume mythinpool.
2.6.32-573.12.1.el6.x86_64

lvm2-2.02.118-3.el6_7.4                      BUILT: Tue Nov 10 12:12:57 CST 2015
lvm2-libs-2.02.118-3.el6_7.4                 BUILT: Tue Nov 10 12:12:57 CST 2015
lvm2-cluster-2.02.118-3.el6_7.4              BUILT: Tue Nov 10 12:12:57 CST 2015
udev-147-2.63.el6_7.1                        BUILT: Thu Nov 12 10:11:28 CST 2015
device-mapper-1.02.95-3.el6_7.4              BUILT: Tue Nov 10 12:12:57 CST 2015
device-mapper-libs-1.02.95-3.el6_7.4         BUILT: Tue Nov 10 12:12:57 CST 2015
device-mapper-event-1.02.95-3.el6_7.4        BUILT: Tue Nov 10 12:12:57 CST 2015
device-mapper-event-libs-1.02.95-3.el6_7.4   BUILT: Tue Nov 10 12:12:57 CST 2015
device-mapper-persistent-data-0.3.2-1.el6    BUILT: Fri Apr  4 08:43:06 CDT 2014
As Bug 1189221 suggests - this has all been resolved with version 2.02.124 and later.
(In reply to Zdenek Kabelac from comment #9)
> As Bug 1189221 suggests - this has all been resolved with version 2.02.124
> and later.

That version is included in the RHEL 6.8 lvm2 package.