Description of problem:

With the recent update to the latest thin-pool kernel target, the driver
switches itself to read-only mode: the device returns errors while waiting
for extension and remains in read-only mode even after the new space has
been added.

Here is an example:

lvcreate -L10 -T vg/pool
lvcreate -V20 -T vg/pool

Now, while dmeventd is watching the threshold, a single dd does this:

LC_ALL=C dd if=/dev/zero of=/dev/vg/lvol1 bs=1M
dd: error writing '/dev/vg/lvol1': No space left on device
21+0 records in
20+0 records out
20971520 bytes (21 MB) copied, 0.33572 s, 62.5 MB/s

# dmsetup table
vg-pool: 0 61312 linear 253:3 0
vg-pool-tpool: 0 61312 thin-pool 253:1 253:2 128 0 0
vg-pool_tdata: 0 20480 linear 7:0 6144
vg-pool_tdata: 20480 40832 linear 7:0 30720
vg-pool_tmeta: 0 4096 linear 7:0 26624

# dmsetup status
vg-pool: 0 61312 linear
vg-pool-tpool: 0 61312 thin-pool 1 12/512 320/479 - ro discard_passdown queue_if_no_space
vg-pool_tdata: 0 20480 linear
vg-pool_tdata: 20480 40832 linear
vg-pool_tmeta: 0 4096 linear

With the pool in this state, the user cannot manipulate it.

The kernel logs this:

[79436.148722] device-mapper: thin: 253:3: reached low water mark for data device: sending event.
[79436.150555] device-mapper: thin: 253:3: no free data space available.
[79436.150559] device-mapper: thin: 253:3: switching pool to read-only mode
[79436.162560] Buffer I/O error on device dm-4, logical block 2528
[79436.162574] lost page write due to I/O error on dm-4
[79436.162584] device-mapper: thin: 253:3: metadata operation 'dm_pool_commit_metadata' failed: error = -1
[79436.162595] Buffer I/O error on device dm-4, logical block 2529
[79436.162597] lost page write due to I/O error on dm-4
[79436.162602] Buffer I/O error on device dm-4, logical block 2530
[79436.162604] lost page write due to I/O error on dm-4
[79436.162607] Buffer I/O error on device dm-4, logical block 2531
[79436.162609] lost page write due to I/O error on dm-4
[79436.162614] Buffer I/O error on device dm-4, logical block 2532
[79436.162615] lost page write due to I/O error on dm-4
[79436.162619] Buffer I/O error on device dm-4, logical block 2533
[79436.162620] lost page write due to I/O error on dm-4
[79436.162625] Buffer I/O error on device dm-4, logical block 2534
[79436.162626] lost page write due to I/O error on dm-4
[79436.162631] Buffer I/O error on device dm-4, logical block 2535
[79436.162632] lost page write due to I/O error on dm-4
[79436.162636] Buffer I/O error on device dm-4, logical block 2536
[79436.162638] lost page write due to I/O error on dm-4
[79436.175890] device-mapper: thin: 253:3: switching pool to write mode
[79436.175898] device-mapper: thin: 253:3: growing the data device from 160 to 192 blocks
[79436.177374] device-mapper: thin: 253:3: reached low water mark for data device: sending event.
[79436.178616] device-mapper: thin: 253:3: no free data space available.
[79436.178621] device-mapper: thin: 253:3: switching pool to read-only mode
[79436.227868] device-mapper: thin: 253:3: switching pool to write mode
[79436.227876] device-mapper: thin: 253:3: growing the data device from 192 to 230 blocks
[79436.228816] device-mapper: thin: 253:3: reached low water mark for data device: sending event.
[79436.230885] device-mapper: thin: 253:3: no free data space available.
[79436.230891] device-mapper: thin: 253:3: switching pool to read-only mode
[79436.280139] device-mapper: thin: 253:3: switching pool to write mode
[79436.280145] device-mapper: thin: 253:3: growing the data device from 230 to 277 blocks
[79436.280794] device-mapper: thin: 253:3: reached low water mark for data device: sending event.
[79436.283262] device-mapper: thin: 253:3: no free data space available.
[79436.283269] device-mapper: thin: 253:3: switching pool to read-only mode
[79436.328451] device-mapper: thin: 253:3: switching pool to write mode
[79436.328458] device-mapper: thin: 253:3: growing the data device from 277 to 332 blocks
[79436.346794] device-mapper: thin: 253:3: switching pool to read-only mode
[79436.391660] device-mapper: thin: 253:3: switching pool to write mode
[79436.391668] device-mapper: thin: 253:3: growing the data device from 332 to 399 blocks
[79436.394991] device-mapper: thin: 253:3: switching pool to read-only mode
[79436.430559] device-mapper: thin: 253:3: switching pool to write mode
[79436.430567] device-mapper: thin: 253:3: growing the data device from 399 to 479 blocks
[79436.434421] device-mapper: thin: 253:3: switching pool to read-only mode

Version-Release number of selected component (if applicable):
3.10.0-85.el7.x86_64

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:
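The mode flag shown in the thin-pool status line above ('ro') can be checked
directly from a shell; a minimal sketch, assuming the pool device is named
vg-pool-tpool and that on this kernel the pool mode is the 8th
whitespace-separated field of the status line (after the transaction id,
used/total metadata blocks, used/total data blocks and the held metadata
root):

# print just the pool mode (rw or ro) for the vg-pool-tpool device
dmsetup status vg-pool-tpool | awk '{ print $8 }'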
Created attachment 863191 [details]
Patch 1

First patch; suggested as a partial fix for the problem.
Created attachment 863192 [details]
Patch 2

Second patch, proposed by Mike.
When both patches are applied, the problem seems to be slightly different.
Now lvm2 is able to resize the data device for the pool without write errors
being generated during an ongoing write to a thin volume.

Here are, again, steps to reproduce the problem without even using dmeventd -
just multiple terminals are needed.

Check that lvm.conf disables the autoextend threshold for the pool (or
disables monitoring) and filters out dm devices (accept the PV device and
reject everything else):

thin_pool_autoextend_threshold = 100
monitoring = 0
filter = ["a/loop/", "r/.*/"]

Create the pool and a thin volume:

- lvcreate -L10 -V20 -T vg/pool

Start a 20MB dd write, which will block:

- dd if=/dev/zero of=/dev/vg/lvol1 bs=1M

Now the pool should be blocked and awaiting resize. In the 2nd terminal:

- lvextend -L+20 vg/pool

This should unblock the pool and let the write finish. During this I obtain
the following kernel log:

[21144.209733] device-mapper: thin: 253:3: reached low water mark for data device: sending event.
[21144.210063] device-mapper: thin: 253:3: no free data space available.
[21144.210066] device-mapper: thin: 253:3: switching pool to read-only mode
[21144.218431] bio: create slab <bio-0> at 0
[21177.928424] bio: create slab <bio-0> at 0
[21177.929101] device-mapper: thin: 253:3: growing the data device from 160 to 480 blocks
[21177.929172] device-mapper: thin: 253:3: switching pool to write mode
[21177.992551] device-mapper: thin: 253:3: switching pool to read-only mode
[21178.001530] bio: create slab <bio-0> at 0

As can be seen, the pool seems to stay in read-only mode.

Now when I try to remove thin volume lvol1, I get an error when the delete
message is passed to the thin pool:

[21269.301661] device-mapper: space map common: dm_tm_shadow_block() failed
[21269.301671] device-mapper: space map common: dm_tm_shadow_block() failed
[21269.301674] device-mapper: space map metadata: unable to allocate new metadata block
[21269.301677] device-mapper: thin: Deletion of thin device 1 failed.

Maybe a special new command needs to be passed to the thin pool after the
resize? But that would also mean the driver is incompatible with the previous
behavior, where the pool simply waited for the resize and then continued with
normal operation.
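For convenience, the two-terminal sequence can also be driven from a single
shell; a rough sketch, assuming the same vg/pool and lvol1 names as above,
that lvm.conf is already configured as described, and that the pool target
appears as vg-pool-tpool:

# start the write that will block once the pool runs out of data space
dd if=/dev/zero of=/dev/vg/lvol1 bs=1M &
DD_PID=$!

# give dd a moment to exhaust the 10MB pool and block
sleep 5

# grow the pool; with the patches applied this should let the write continue
lvextend -L+20 vg/pool

# wait for dd, then check the mode reported in the thin-pool status line
wait $DD_PID
dmsetup status vg-pool-tpool | awk '{ print $8 }'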
(In reply to Zdenek Kabelac from comment #4)
> When both patches are applied, the problem seems to be slightly different.
> Now lvm2 is able to resize the data device for the pool without write errors
> being generated during an ongoing write to a thin volume.
>
> Here are, again, steps to reproduce the problem without even using dmeventd -
> just multiple terminals are needed.

Thanks, I'll try to reproduce today.

> As can be seen, the pool seems to stay in read-only mode.

Yeah, considering there is no indication of an error that would cause the
transition back to read-only, this is weird.
(In reply to Mike Snitzer from comment #5)
> (In reply to Zdenek Kabelac from comment #4)
> > When both patches are applied, the problem seems to be slightly different.
> > Now lvm2 is able to resize the data device for the pool without write errors
> > being generated during an ongoing write to a thin volume.
> >
> > Here are, again, steps to reproduce the problem without even using dmeventd -
> > just multiple terminals are needed.
>
> Thanks, I'll try to reproduce today.
>
> > As can be seen, the pool seems to stay in read-only mode.
>
> Yeah, considering there is no indication of an error that would cause the
> transition back to read-only, this is weird.

lvextend of the pool is suspending and resuming the pool multiple times
during the resize:

#libdm-deptree.c:2476  Loading stec-pool-tpool table (253:3)
#libdm-deptree.c:2420  Adding target to (253:3): 0 65536 thin-pool 253:1 253:2 128 0 0
#ioctl/libdm-iface.c:1750  dm table (253:3) OF [16384] (*1)
#ioctl/libdm-iface.c:1750  dm reload (253:3) NF [16384] (*1)
#libdm-deptree.c:2528  Table size changed from 24576 to 65536 for stec-pool-tpool (253:3).
#libdm-deptree.c:1263  Resuming stec-pool-tpool (253:3)
#libdm-common.c:2154  Udev cookie 0xd4db319 (semid 1736707) incremented to 3
#libdm-common.c:2395  Udev cookie 0xd4db319 (semid 1736707) assigned to RESUME task(5) with flags DISABLE_SUBSYSTEM_RULES DISABLE_DISK_RULES DISABLE_OTHER_RULES DISABLE_LIBRARY_FALLBACK (0x2e)
#ioctl/libdm-iface.c:1750  dm resume (253:3) NF [16384] (*1)
#libdm-common.c:1352  stec-pool-tpool: Stacking NODE_ADD (253,3) 0:6 0660 [trust_udev]
#libdm-common.c:1362  stec-pool-tpool: Stacking NODE_READ_AHEAD 256 (flags=1)

and later:

#libdm-deptree.c:1314  Suspending stec-pool-tpool (253:3) with device flush
#ioctl/libdm-iface.c:1750  dm suspend (253:3) NFS [16384] (*1)
...
#libdm-deptree.c:2476  Loading stec-pool-tpool table (253:3)
#libdm-deptree.c:2420  Adding target to (253:3): 0 65536 thin-pool 253:1 253:2 128 0 0
#ioctl/libdm-iface.c:1750  dm table (253:3) OF [16384] (*1)
#libdm-deptree.c:2511  Suppressed stec-pool-tpool (253:3) identical table reload.
...
#libdm-deptree.c:1263  Resuming stec-pool-tpool (253:3)
#libdm-common.c:2154  Udev cookie 0xd4db319 (semid 1736707) incremented to 4
#libdm-common.c:2395  Udev cookie 0xd4db319 (semid 1736707) assigned to RESUME task(5) with flags DISABLE_SUBSYSTEM_RULES DISABLE_DISK_RULES DISABLE_OTHER_RULES DISABLE_LIBRARY_FALLBACK (0x2e)
#ioctl/libdm-iface.c:1750  dm resume (253:3) NF [16384] (*1)
#libdm-common.c:1352  stec-pool-tpool: Stacking NODE_ADD (253,3) 0:6 0660 [trust_udev]
#libdm-common.c:1362  stec-pool-tpool: Stacking NODE_READ_AHEAD 256 (flags=1)
#libdm-common.c:225  Suspended device counter reduced to 1

Not sure why lvextend is doing the second suspend/resume (probably some bug
in libdm-deptree?), but regardless... the table isn't changing, so it
certainly shouldn't be causing the pool to transition to read-only mode.
(In reply to Mike Snitzer from comment #6)
> Not sure why lvextend is doing the second suspend/resume (probably some bug
> in libdm-deptree?), but regardless... the table isn't changing, so it
> certainly shouldn't be causing the pool to transition to read-only mode.

Using the device-mapper-test-suite -- which doesn't use lvm -- I've been able
to confirm that a suspend+resume of a read-write pool, at the end of our
"resize_io" test, will cause it to transition to read-only.
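The same trigger can be exercised by hand at the dmsetup level; a rough
sketch, assuming a pool target that is active as vg-pool-tpool, currently in
write mode, and that has already gone through the out-of-space/resize cycle
described earlier (as in the "resize_io" scenario):

# mode before the cycle (expected: rw)
dmsetup status vg-pool-tpool | awk '{ print $8 }'

# plain suspend + resume, with no table change
dmsetup suspend vg-pool-tpool
dmsetup resume vg-pool-tpool

# on an unfixed kernel the mode is expected to flip to ro here,
# per the test-suite confirmation above; a fixed kernel should stay rw
dmsetup status vg-pool-tpool | awk '{ print $8 }'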
Subtree suspend/resume is how the whole of lvm currently operates. It is
mostly there to advertise on the top-level node that something underneath is
suspended, so you can avoid opening a device you know you would sleep on
(yet it's not really race-free...).

Anyway, I guess it shouldn't cause trouble - the operation is just slightly
slower.
Created attachment 863524 [details]
Patch 3

This patch fixes the unexpected pool mode transition on table reload.
FYI, the latest fixes are all available in the 'devel' branch of this git
repo: git://git.kernel.org/pub/scm/linux/kernel/git/snitzer/linux.git

You can browse the changes here:
https://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/log/?h=devel

I'll be pulling a subset (or all) of these thinp changes into linux-dm.git
(as fixes for 3.14) once I've had a chance to coordinate with Joe.
(In reply to Mike Snitzer from comment #7)
> (In reply to Mike Snitzer from comment #6)
>
> > Not sure why lvextend is doing the second suspend/resume (probably some bug
> > in libdm-deptree?), but regardless... the table isn't changing, so it
> > certainly shouldn't be causing the pool to transition to read-only mode.
>
> Using the device-mapper-test-suite -- which doesn't use lvm -- I've been
> able to confirm that a suspend+resume of a read-write pool, at the end of
> our "resize_io" test, will cause it to transition to read-only.

Hi Mike,
I'm reproducing the issue using the device-mapper-test-suite; below is my
test result. Could you confirm whether this failure is the expected one, and
explain how you confirmed that a suspend+resume of a read-write pool causes
the transition, as you said above?

# dmtest run --suite thin-provisioning -n resize_io --profile spindle
Loaded suite thin-provisioning
Started
test_resize_io(PoolResizeTests): F

Finished in 7.896977168 seconds.

  1) Failure:
test_resize_io(PoolResizeTests)
    [/usr/local/rvm/gems/ruby-1.9.3-p484/gems/rspec-expectations-2.14.5/lib/rspec/expectations/fail_with.rb:32:in `fail_with'
     /usr/local/rvm/gems/ruby-1.9.3-p484/gems/rspec-expectations-2.14.5/lib/rspec/expectations/handler.rb:36:in `handle_matcher'
     /usr/local/rvm/gems/ruby-1.9.3-p484/gems/rspec-expectations-2.14.5/lib/rspec/expectations/syntax.rb:53:in `should'
     /root/device-mapper-test-suite/lib/dmtest/tests/thin-provisioning/pool_resize_tests.rb:91:in `block in resize_io_many'
     /root/device-mapper-test-suite/lib/dmtest/pool-stack.rb:33:in `call'
     /root/device-mapper-test-suite/lib/dmtest/pool-stack.rb:33:in `block in activate'
     /root/device-mapper-test-suite/lib/dmtest/prelude.rb:6:in `bracket'
     /root/device-mapper-test-suite/lib/dmtest/device-mapper/lexical_operators.rb:12:in `with_dev'
     /root/device-mapper-test-suite/lib/dmtest/pool-stack.rb:31:in `activate'
     /root/device-mapper-test-suite/lib/dmtest/thinp-mixin.rb:125:in `with_standard_pool'
     /root/device-mapper-test-suite/lib/dmtest/tests/thin-provisioning/pool_resize_tests.rb:61:in `resize_io_many'
     /root/device-mapper-test-suite/lib/dmtest/tests/thin-provisioning/pool_resize_tests.rb:99:in `test_resize_io']:
expected: false value
got: true

1 tests, 0 assertions, 1 failures, 0 errors

[root@hp-dl385pg8-03 device-mapper-test-suite]# lvs tsvg
  LV         VG   Attr       LSize Pool       Origin Data%  Move Log Cpy%Sync Convert
  lvol0      tsvg -wi------- 8.00m
  mythinpool tsvg twi-a-tz-- 4.88g                   20.49
  thinlv1    tsvg Vwi-a-tz-- 1.00g mythinpool         0.07
  thinlv2    tsvg Vwi-a-tz-- 4.00g mythinpool        25.00

# cat ~/.dmtest/config
profile :spindle do
  metadata_dev '/dev/tsvg/thinlv1'
  data_dev '/dev/tsvg/thinlv2'
end
(In reply to yanfu,wang from comment #11)
> (In reply to Mike Snitzer from comment #7)
> > (In reply to Mike Snitzer from comment #6)
> >
> > > Not sure why lvextend is doing the second suspend/resume (probably some bug
> > > in libdm-deptree?), but regardless... the table isn't changing, so it
> > > certainly shouldn't be causing the pool to transition to read-only mode.
> >
> > Using the device-mapper-test-suite -- which doesn't use lvm -- I've been
> > able to confirm that a suspend+resume of a read-write pool, at the end of
> > our "resize_io" test, will cause it to transition to read-only.
>
> Hi Mike,
> I'm reproducing the issue using the device-mapper-test-suite; below is my
> test result. Could you confirm whether this failure is the expected one, and
> explain how you confirmed that a suspend+resume of a read-write pool causes
> the transition, as you said above?
>
> # dmtest run --suite thin-provisioning -n resize_io --profile spindle
> Loaded suite thin-provisioning
> Started
> test_resize_io(PoolResizeTests): F
>
> Finished in 7.896977168 seconds.
>
>   1) Failure:

<snip>

> /root/device-mapper-test-suite/lib/dmtest/tests/thin-provisioning/pool_resize_tests.rb:99:in `test_resize_io']:
> expected: false value
> got: true

The failed assertion is:
status.options[:read_only].should be_false

Meaning, the pool is in read-only mode after the suspend+resume. A kernel
with the fix wouldn't transition the pool to read-only.

FYI, the 'devel' branch that I referenced in comment #10 doesn't exist any
more -- it was replaced by the 'dm-3.14-fixes' branch of the
snitzer/linux.git repo.
(In reply to Mike Snitzer from comment #12)
> (In reply to yanfu,wang from comment #11)
> > (In reply to Mike Snitzer from comment #7)
> > > (In reply to Mike Snitzer from comment #6)
> > >
> > > > Not sure why lvextend is doing the second suspend/resume (probably some bug
> > > > in libdm-deptree?), but regardless... the table isn't changing, so it
> > > > certainly shouldn't be causing the pool to transition to read-only mode.
> > >
> > > Using the device-mapper-test-suite -- which doesn't use lvm -- I've been
> > > able to confirm that a suspend+resume of a read-write pool, at the end of
> > > our "resize_io" test, will cause it to transition to read-only.
> >
> > Hi Mike,
> > I'm reproducing the issue using the device-mapper-test-suite; below is my
> > test result. Could you confirm whether this failure is the expected one, and
> > explain how you confirmed that a suspend+resume of a read-write pool causes
> > the transition, as you said above?
> >
> > # dmtest run --suite thin-provisioning -n resize_io --profile spindle
> > Loaded suite thin-provisioning
> > Started
> > test_resize_io(PoolResizeTests): F
> >
> > Finished in 7.896977168 seconds.
> >
> >   1) Failure:
>
> <snip>
>
> > /root/device-mapper-test-suite/lib/dmtest/tests/thin-provisioning/pool_resize_tests.rb:99:in `test_resize_io']:
> > expected: false value
> > got: true
>
> The failed assertion is:
> status.options[:read_only].should be_false

Hi Mike,
It seems I'm running into an unexpected failure; could you help me see where
I'm going wrong? I'm using
https://github.com/jthornber/device-mapper-test-suite.git and testing on
kernel 3.10.0-75.el7.x86_64 to reproduce.

My setup is shown below:

# modprobe scsi-debug dev_size_mb=6000 lbpu=1 lbpws10=1
# pvcreate /dev/sdb
# vgcreate tsvg /dev/sdb
# lvcreate -L 5000M -T tsvg/mythinpool
# lvcreate -V1G -T tsvg/mythinpool -n thinlv1
# lvcreate -V4G -T tsvg/mythinpool -n thinlv2

Edit ~/.dmtest/config:

profile :spindle do
  metadata_dev '/dev/tsvg/thinlv1'
  data_dev '/dev/tsvg/thinlv2'
end

Then I run 'dmtest run --suite thin-provisioning -n resize_io --profile
spindle' and get the failure shown in comment #11.

How do I get the expected failure, status.options[:read_only].should
be_false? Thanks in advance.

> Meaning, the pool is in read-only mode after the suspend+resume. A kernel
> with the fix wouldn't transition the pool to read-only.
>
> FYI, the 'devel' branch that I referenced in comment #10 doesn't exist any
> more -- it was replaced by the 'dm-3.14-fixes' branch of the
> snitzer/linux.git repo.
(In reply to yanfu,wang from comment #13)
> Then I run 'dmtest run --suite thin-provisioning -n resize_io --profile
> spindle' and get the failure shown in comment #11.
>
> How do I get the expected failure, status.options[:read_only].should
> be_false?

That is _not_ the expected failure. 'status.options[:read_only].should
be_false' is the ruby code that causes the failed assertion error.

The "expected: false value" error you're seeing is what I'd expect from a
kernel that isn't fixed.
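The state that assertion inspects can also be read back by hand for any
active thin pool right after a suspend+resume; a rough shell equivalent,
assuming the pool of interest is the only thin-pool target active and that
the pool mode is the 9th field when dmsetup prefixes each status line with
the device name. A fixed kernel should keep reporting rw; an unfixed one
flips to ro:

# list every active thin-pool target with its reported mode
dmsetup status --target thin-pool | awk '{ print $1, $9 }'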
Do you need any additional info from me? The "resize_io" test is one of the
tests we have in the device-mapper-test-suite that validates that resizing
the data volume works as expected.

We also have new tests that were developed to exercise new aspects of the
kernel fixes (these fixes will be posted to rhkernel-list for inclusion in
the RHEL7 Snap11 kernel).
(In reply to Mike Snitzer from comment #15)
> Do you need any additional info from me? The "resize_io" test is one of the
> tests we have in the device-mapper-test-suite that validates that resizing
> the data volume works as expected.
>
> We also have new tests that were developed to exercise new aspects of the
> kernel fixes (these fixes will be posted to rhkernel-list for inclusion in
> the RHEL7 Snap11 kernel).

Thanks for the reply, Mike; I will try testing again as per the above
comments. Setting qa_ack+.
Patch(es) available on kernel-3.10.0-108.el7
Patch(es) available on kernel-3.10.0-109.el7, not 108 as previously stated
Due to issues in 309, please wait to test these patches in kernel-3.10.0-110.el7 or later
This request was resolved in Red Hat Enterprise Linux 7.0. Contact your manager or support representative in case you have further questions about the request.