Bug 852812
Summary: | Automatic umount (in theory) of full thinp snapshots (in one of the many cases that exist) | |
---|---|---|---
Product: | Red Hat Enterprise Linux 6 | Reporter: | Corey Marthaler <cmarthal> |
Component: | lvm2 | Assignee: | Zdenek Kabelac <zkabelac> |
Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
Severity: | high | Docs Contact: | |
Priority: | high | |
Version: | 6.3 | CC: | agk, coughlan, ddumas, dwysocha, heinzm, jbrassow, msnitzer, prajnoha, prockai, thornber, zkabelac |
Target Milestone: | rc | |
Target Release: | --- | |
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | | |
Fixed In Version: | lvm2-2.02.98-1.el6 | Doc Type: | Known Issue |
Doc Text: |
When a thin pool is filled to 100% by writing to a thin volume device, access to all thin volumes using that thin pool can become blocked. To prevent this, avoid overfilling the pool. If the pool is overfilled and this error occurs, extend the thin pool with new space to continue using it.
|
Story Points: | --- |
Clone Of: | | Environment: |
Last Closed: | 2013-02-21 08:13:16 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Corey Marthaler
2012-08-29 16:26:17 UTC
The problem here is that thin volumes/snapshots are different: while an old-style snapshot was a separate device with its own life, overfilling a thin snapshot currently kills all related thin volumes. Setting up enough free space into which the thin pool can grow via dmeventd's lvextend is therefore the key element here. We need to decide precisely what should happen in several scenarios: we can have a full data device and/or a full metadata device, and all thin volumes from one thin pool share the same space in those devices. If we want to safely unmount a device, we need to try unmounting it much sooner than 100% fullness is reported (so that all metadata & data can still be stored).

The upstream solution for 6.4 went into the 2.02.98 release:
https://www.redhat.com/archives/lvm-devel/2012-October/msg00126.html

The system detects when the thin pool goes above the lvm.conf value of activation/thin_pool_autoextend_threshold. When 'lvextend --use-policies' fails to extend the pool LV (which can now also be tested by setting the lvm.conf value of activation/thin_pool_autoextend_percent to 0, since extension by 0% returns an error), dmeventd should try to 'umount -fl' all thin volumes using this pool. For now this check is based only on the dmeventd timer, so it may take up to 10 seconds to observe the umount reaction. A new bugzilla for 6.5 will be opened to cover further fixes for related problems. Bug 866971 should take care of lvm2 thin pool policies.
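For reference, the two policy settings named above live in the activation section of lvm.conf. A minimal sketch - the option names are taken from the comment above, while the values 70 and 20 are illustrative assumptions, not recommended defaults:

```
# /etc/lvm/lvm.conf (excerpt) -- illustrative values, not defaults
activation {
    # dmeventd runs 'lvextend --use-policies' once pool usage crosses
    # this percentage
    thin_pool_autoextend_threshold = 70

    # how much to grow the pool on each autoextend step, in percent;
    # setting this to 0 makes the extension fail, which in turn triggers
    # the forced lazy umount of all thin volumes using the pool
    thin_pool_autoextend_percent = 20
}
```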
It's now impossible to fill a thinp snapshot. So was this part of the fix? Or is there no longer a need to auto unmount a "full" thin snapshot if filling it is no longer possible?

```
SCENARIO - [full_EXT_snap_verification]
Verify full snapshots on top of EXT are automatically unmounted and can be removed
Making origin volume
Creating thinpool and corresponding thin virtual volumes (one to be used as an origin)
lvcreate --thinpool POOL -L 1G snapper_thinp
lvcreate --virtualsize 800M --thinpool snapper_thinp/POOL -n origin
Placing an EXT filesystem on origin volume
mke2fs 1.41.12 (17-May-2010)
Making snapshot of origin volume
lvcreate -s /dev/snapper_thinp/origin -n full_snap
Mounting snapshot volume
Filling snapshot /dev/snapper_thinp/full_snap
dd if=/dev/zero of=/mnt/full_snap/fill_file count=801 bs=1M oflag=direct
dd: writing `/mnt/full_snap/fill_file': No space left on device
786+0 records in
785+0 records out
823132160 bytes (823 MB) copied, 16.042 s, 51.3 MB/s
snapshot didn't fill completely up

[root@taft-01 ~]# lvs -a -o +devices
  LV           VG            Attr      LSize   Pool Origin Data%  Devices
  POOL         snapper_thinp twi-a-tz-   1.00g              78.19 POOL_tdata(0)
  [POOL_tdata] snapper_thinp Twi-aot--   1.00g                    /dev/sdg1(0)
  [POOL_tmeta] snapper_thinp ewi-aot--   4.00m                    /dev/sdc1(0)
  full_snap    snapper_thinp Vwi-aotz- 800.00m POOL origin  99.95
  origin       snapper_thinp Vwi-a-tz- 800.00m POOL          1.65

[root@taft-01 ~]# df -h
Filesystem                           Size  Used Avail Use% Mounted on
/dev/mapper/snapper_thinp-full_snap  788M  787M     0 100% /mnt/full_snap
```

From the lvs output: the pool size is 1G, while the origin and its snapshot are each only 800M. The origin has provisioned 1.65% of its 800M; its snapshot, after the 'dd', has provisioned 99.95% of 800M. This all fits within the pool, which is now 78.19% used (so ~13.2MB + ~799.6MB = ~812.8MB consumed; since the pool is reported to hold ~800.7MB, around 12MB of that is shared).

So to test filling of the snapshot (which is now limited only by the thin pool size) you need to use a smaller pool size, and there must be no free space for pool extension, or extension must be disabled - so that you overfill the thin pool.

(This is where thin snapshots and old-style snapshots differ significantly, since the thin pool size is now the only limit for all snapshots inside the pool - this may change with the 'external origin' snapshot planned for 6.5. This, however, leads to a whole new area of problems with pool recovery after the pool has been overfilled.)
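A minimal sketch of such an overfill test, following the advice above. The VG name, sizes, and mount point are assumptions; a plain thin volume stands in for the snapshot, since (as noted later in this report) a snapshot consumes pool space exactly like a normal thin volume:

```
# The pool is deliberately small and the thin volume's virtual size
# larger, so writing through the filesystem must overfill the pool.
# Assumes thin_pool_autoextend_percent = 0 (or no free extents left in
# the VG), so dmeventd's extension attempt fails and the umount path
# is exercised.
lvcreate --thinpool POOL -L 256M testvg
lvcreate --virtualsize 1G --thinpool testvg/POOL -n thinvol
mkfs.ext4 /dev/testvg/thinvol
mkdir -p /mnt/thinvol
mount /dev/testvg/thinvol /mnt/thinvol
dd if=/dev/zero of=/mnt/thinvol/fill_file bs=1M count=512 oflag=direct
```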
I must still be missing something: just like in comment #5, I can fill the file system but cannot fully fill the actual snapshot volume, and again the full file system is not auto unmounted. Is there actually a devel-tested scenario where a full thinp snap volume is auto unmounted? If so please post the results.

```
SCENARIO - [full_EXT_snap_verification]
Verify full snapshots on top of EXT are automatically unmounted and can be removed
Making origin volume
Creating thinpool and corresponding thin virtual volumes (one to be used as an origin)
lvcreate --thinpool POOL -L 800M snapper_thinp
lvcreate --virtualsize 800M --thinpool snapper_thinp/POOL -n origin
lvcreate --virtualsize 800M --thinpool snapper_thinp/POOL -n other1
lvcreate --virtualsize 800M --thinpool snapper_thinp/POOL -n other2
lvcreate --virtualsize 800M --thinpool snapper_thinp/POOL -n other3
lvcreate --virtualsize 800M --thinpool snapper_thinp/POOL -n other4
lvcreate --virtualsize 800M --thinpool snapper_thinp/POOL -n other5
Placing an EXT filesystem on origin volume
mke2fs 1.41.12 (17-May-2010)
Making snapshot of origin volume
lvcreate -s /dev/snapper_thinp/origin -n full_snap
Mounting snapshot volume
Filling snapshot /dev/snapper_thinp/full_snap
dd if=/dev/zero of=/mnt/full_snap/fill_file count=850 bs=1M oflag=direct

[root@hayes-01 ~]# df -h
Filesystem                           Size  Used Avail Use% Mounted on
/dev/mapper/snapper_thinp-full_snap  788M  788M     0 100% /mnt/full_snap

[root@hayes-01 ~]# lvs -a -o +devices
  LV           Attr      LSize   Pool Origin Data%  Devices
  POOL         twi-a-tz- 800.00m             100.00 POOL_tdata(0)
  [POOL_tdata] Twi-aot-- 800.00m                    /dev/etherd/e1.1p2(0)
  [POOL_tmeta] ewi-aot--   4.00m                    /dev/etherd/e1.1p1(0)
  full_snap    Vwi-aotz- 800.00m POOL origin  99.90
  origin       Vwi-a-tz- 800.00m POOL          1.65
  other1       Vwi-a-tz- 800.00m POOL          0.00
  other2       Vwi-a-tz- 800.00m POOL          0.00
  other3       Vwi-a-tz- 800.00m POOL          0.00
  other4       Vwi-a-tz- 800.00m POOL          0.00
  other5       Vwi-a-tz- 800.00m POOL          0.00
```

```
Jan 25 14:19:46 hayes-01 lvm[2516]: Thin snapper_thinp-POOL-tpool is now 80% full.
Jan 25 14:20:55 hayes-01 lvm[2516]: Thin snapper_thinp-POOL-tpool is now 85% full.
Jan 25 14:22:05 hayes-01 lvm[2516]: Thin snapper_thinp-POOL-tpool is now 90% full.
Jan 25 14:23:35 hayes-01 lvm[2516]: Thin snapper_thinp-POOL-tpool is now 95% full.
Jan 25 14:24:31 hayes-01 kernel: device-mapper: thin: 253:5: reached low water mark, sending event.
Jan 25 14:24:31 hayes-01 lvm[2516]: Thin snapper_thinp-POOL-tpool is now 100% full.
Jan 25 14:24:31 hayes-01 kernel: device-mapper: thin: 253:5: no free space available.
Jan 25 14:27:09 hayes-01 kernel: INFO: task dd:2618 blocked for more than 120 seconds.
Jan 25 14:27:09 hayes-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 25 14:27:09 hayes-01 kernel: dd            D 0000000000000002     0  2618   2617 0x00000080
Jan 25 14:27:09 hayes-01 kernel:  ffff88011d54fa78 0000000000000082 0000000000000000 ffffffffa00043ec
Jan 25 14:27:09 hayes-01 kernel:  ffff88011d54fa48 00000000ed7583d7 0000000000000000 ffff88011cebb540
Jan 25 14:27:09 hayes-01 kernel:  ffff88011c3165f8 ffff88011d54ffd8 000000000000fb88 ffff88011c3165f8
Jan 25 14:27:09 hayes-01 kernel: Call Trace:
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffffa00043ec>] ? dm_table_unplug_all+0x5c/0x100 [dm_mod]
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff8150d9c3>] io_schedule+0x73/0xc0
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff811be84e>] __blockdev_direct_IO_newtrunc+0x6de/0xb30
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff811becfe>] __blockdev_direct_IO+0x5e/0xd0
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffffa047c380>] ? ext2_get_block+0x0/0x50 [ext2]
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffffa047afde>] ext2_direct_IO+0x5e/0x60 [ext2]
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffffa047c380>] ? ext2_get_block+0x0/0x50 [ext2]
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff8111a8f2>] generic_file_direct_write+0xc2/0x190
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff8111c221>] __generic_file_aio_write+0x3a1/0x490
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff810572f0>] ? __dequeue_entity+0x30/0x50
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff8111c398>] generic_file_aio_write+0x88/0x100
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff81180baa>] do_sync_write+0xfa/0x140
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff81510025>] ? page_fault+0x25/0x30
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff81283512>] ? __clear_user+0x42/0x70
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff81228a6b>] ? selinux_file_permission+0xfb/0x150
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff8121b946>] ? security_file_permission+0x16/0x20
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff81180ea8>] vfs_write+0xb8/0x1a0
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff811821ab>] ? fget_light+0x3b/0x90
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff811817a1>] sys_write+0x51/0x90
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff810dc565>] ? __audit_syscall_exit+0x265/0x290
Jan 25 14:27:09 hayes-01 kernel:  [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
```

Unmounting of thin volumes works only for the case where the pool extension is attempted and the extension fails. If writes to the thin pool are very fast and the pool is small, there can be a latency problem in getting any reaction, and the thin pool reaches 100% full. At that moment the device driver is stuck and expects more space to be added to the pool to finish in-flight operations. (Since a snapshot is the same as a normal thin volume, this applies equally to regular writes to a single thin volume in a thin pool.) So the user should ensure the pool does not get 100% full; if it does, extra space needs to be added to the VG (vgextend) and the pool extended (lvextend) to unblock the blocked pool.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0501.html
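For reference, the recovery sequence described in the comment above (vgextend to add space to the VG, then lvextend to grow the pool) might look like this - a sketch; the PV name and the size increment are assumptions:

```
# Add a new physical volume to the VG, then grow the blocked pool so
# that in-flight operations can complete and the pool unblocks.
vgextend snapper_thinp /dev/sdh1     # /dev/sdh1 is a hypothetical spare PV
lvextend -L +512M snapper_thinp/POOL
```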