Bug 1030411

Summary:	resizing thin-snapshot with external origin should return zeros behind origin's end
Product:	Red Hat Enterprise Linux 6	Reporter:	Marian Csontos <mcsontos>
Component:	kernel	Assignee:	Joe Thornber <thornber>
kernel sub component:	Thin Provisioning	QA Contact:	yanfu,wang <yanwang>
Status:	CLOSED ERRATA	Docs Contact:
Severity:	high
Priority:	high	CC:	agk, bdonahue, dwysocha, heinzm, jbrassow, msnitzer, prajnoha, prockai, shyu, thornber, yanwang, zkabelac
Version:	6.5
Target Milestone:	rc
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	kernel-2.6.32-493.el6	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2014-10-14 05:37:41 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Marian Csontos 2013-11-14 11:40:54 UTC

Description of problem:
Extending (lvextend) a thin snapshot that has an external origin results in EIO errors when reading past the end of the origin.
One may argue that you should not read what you did not write, but anyone attempting to write chunks smaller than 4k will also get a failure (for example, mkfs.ext4 does this [1]).

[1] Yes, I am aware it is not a good idea to format a snapshot and that I would be better off creating a new thin volume. I tried anyway. :-)


Version-Release number of selected component (if applicable):
lvm2-2.02.100-8.el6.x86_64
lvm2-2.02.104-0.145.el6.x86_64 (Upstream lvm)
kernel-2.6.32-431.el6.x86_64

RHEL7 and F20 are also affected.

dmsetup targets:
thin-pool        v1.7.6
thin             v1.8.6
mirror           v1.12.0
striped          v1.5.6
linear           v1.1.0
error            v1.0.1


How reproducible:
100%


Steps to Reproduce:
    VG=vg
    lvcreate --thinpool pool -L 8G $VG
    lvcreate -n l1 -L 1G $VG
    lvchange -an -pr $VG/l1
    lvcreate -s -n s1 --thinpool pool $VG/l1
    lvextend -L+1G $VG/s1
    lvs
    # Writing a 4k block just before and just past the end of the origin - both succeed:
    dd if=/dev/zero of=/dev/$VG/s1 bs=4096 count=1 seek=$((1024*1024/4-1))
    dd if=/dev/zero of=/dev/$VG/s1 bs=4096 count=1 seek=$((1024*1024/4))
    # Writing a 1k block just before and just past the end of the origin - the write past the end fails:
    dd if=/dev/zero of=/dev/$VG/s1 bs=1024 count=1 seek=$((1024*1024-1))
    dd if=/dev/zero of=/dev/$VG/s1 bs=1024 count=1 seek=$((1024*1024)) # THIS FAILS
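
For reference, the byte offsets targeted by the four dd commands above follow from bs * seek. The snippet below is only an illustrative sketch and not part of the original reproducer: the helper names dd_offset and where are made up here, and the 1 GiB origin size is taken from the "lvcreate -n l1 -L 1G" step.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names), assuming a 1 GiB origin as in the steps above.
ORIGIN_BYTES=$(( 1024 * 1024 * 1024 ))

# Byte offset at which dd starts writing: block size times seek count.
dd_offset() {
    echo $(( $1 * $2 ))
}

# Report where a single block of size $1 written at seek=$2 lands
# relative to the end of the origin.
where() {
    w_start=$(dd_offset "$1" "$2")
    w_end=$(( w_start + $1 ))
    if [ "$w_end" -le "$ORIGIN_BYTES" ]; then
        echo "inside origin"
    elif [ "$w_start" -ge "$ORIGIN_BYTES" ]; then
        echo "beyond origin"
    else
        echo "straddles origin end"
    fi
}

where 4096 $(( 1024 * 1024 / 4 - 1 ))   # last 4k block of the origin
where 4096 $(( 1024 * 1024 / 4 ))       # first 4k block past the origin
where 1024 $(( 1024 * 1024 - 1 ))       # last 1k block of the origin
where 1024 $(( 1024 * 1024 ))           # first 1k block past the origin (the failing write)
```

Both the 4k write at seek=$((1024*1024/4)) and the failing 1k write at seek=$((1024*1024)) start at byte 1073741824, i.e. exactly at the origin's end; only the write size differs.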


Actual results:
Attempts to read from the extended `s1` fail with EIO, e.g. the lvs output below says:
  /dev/vg/s1: read failed after 0 of 4096 at 2147418112: Input/output error
  /dev/vg/s1: read failed after 0 of 4096 at 2147475456: Input/output error
This does not seem to be a problem (although the errors are reported in messages) until one tries to write chunks smaller than 4k.
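
As a quick check of where those two failing reads fall, the sketch below compares the reported offsets with the origin size (1 GiB) and the extended snapshot size (2 GiB). It is an illustration only; the helper names from_end and past_origin are invented for this example.

```shell
#!/bin/sh
# Sizes taken from the report: l1 is 1G, s1 is 2.00g after lvextend -L+1G.
ORIGIN_BYTES=$(( 1024 * 1024 * 1024 ))
DEVICE_BYTES=$(( 2 * 1024 * 1024 * 1024 ))

# Distance in bytes between a read offset and the end of the extended device.
from_end() {
    echo $(( DEVICE_BYTES - $1 ))
}

# Succeeds when the offset lies at or past the end of the external origin.
past_origin() {
    [ "$1" -ge "$ORIGIN_BYTES" ]
}

# The two offsets reported by lvs in the error messages.
for off in 2147418112 2147475456; do
    if past_origin "$off"; then
        echo "offset $off: $(from_end "$off") bytes before device end, past origin end"
    fi
done
```

Both reads land in the second gigabyte of s1 (64 KiB and 8 KiB before the end of the device), which is the region that has no backing data in the origin.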

Expected results:
This should work as if the origin was zero-padded.

Additional info:

(05:10:35) [root@zaphodc1-node01:~]$ lvcreate --thinpool pool -L 8G vg
  Logical volume "lvol0" created
  Logical volume "pool" created
(05:10:55) [root@zaphodc1-node01:~]$ lvcreate -n l1 -L 1G vg
  Logical volume "l1" created
(05:11:17) [root@zaphodc1-node01:~]$ lvchange -an -pr vg/l1
  Logical volume "l1" changed.
(05:11:25) [root@zaphodc1-node01:~]$ lvcreate -s -n s1 --thinpool pool vg/l1
  Logical volume "s1" created
(05:11:54) [root@zaphodc1-node01:~]$ lvextend -L+1G vg/s1
  Extending logical volume s1 to 2.00 GiB
  Logical volume s1 successfully resized
(05:12:23) [root@zaphodc1-node01:~]$ lvs
  /dev/vg/s1: read failed after 0 of 4096 at 2147418112: Input/output error
  /dev/vg/s1: read failed after 0 of 4096 at 2147475456: Input/output error
  LV      VG       Attr       LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  lv_root VolGroup -wi-ao----   5.81g                                             
  lv_swap VolGroup -wi-ao---- 716.00m                                             
  l1      vg       ori-------   1.00g                                             
  pool    vg       twi-a-tz--   8.00g               0.00                          
  s1      vg       Vwi-a-tz--   2.00g pool l1       0.00                          

(05:12:25) [root@zaphodc1-node01:~]$ dd if=/dev/zero of=/dev/vg/s1 bs=1024 count=1 seek=$((1024*1024-1))
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.0149106 s, 68.7 kB/s

(05:12:59) [root@zaphodc1-node01:~]$ dd if=/dev/zero of=/dev/vg/s1 bs=1024 count=1 seek=$((1024*1024))
dd: writing `/dev/vg/s1': Input/output error
1+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00171456 s, 0.0 kB/s

(05:13:19) [root@zaphodc1-node01:~]$ dd if=/dev/zero of=/dev/vg/s1 bs=4096 count=1 seek=$((1024*1024/4-1))
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00187546 s, 2.2 MB/s

(05:13:33) [root@zaphodc1-node01:~]$ dd if=/dev/zero of=/dev/vg/s1 bs=4096 count=1 seek=$((1024*1024/4))
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.0215167 s, 190 kB/s

Comment 1 Zdenek Kabelac 2013-11-14 11:48:28 UTC

Looks like a duplicate of RHEL7 bug #976045.

Comment 3 Alasdair Kergon 2013-11-14 14:27:09 UTC

Let's use this bug for the kernel fix as suggested in bug 976045:


"E.g. record the external origin size on table load

tc->origin_dev_size = i_size_read(origin_dev->bdev->bd_inode) >> SECTOR_SHIFT;

and check the end of the bio data against this in process_bio() when deciding whether to remap or provision.

I think it's OK to require that the external origin is not reduced in size and this case does not need handling in the kernel.

We may also need to check what happens at the end of the device if the external origin is not an exact multiple of the pool block size."


It needs to cope with either device being resized.
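
The check suggested in the quote above can be modelled outside the kernel as a comparison between the I/O range and the recorded origin size. The following is a rough sketch under stated assumptions, not the actual dm-thin code or the eventual fix: classify_read and its three result labels are hypothetical, sizes are in 512-byte sectors (SECTOR_SHIFT is 9), and it only models reads of blocks that have not been provisioned in the thin device.

```shell
#!/bin/sh
# Hypothetical model of the decision, not kernel code.
# Origin size in 512-byte sectors, assuming the 1 GiB origin from this report.
ORIGIN_SECTORS=$(( (1024 * 1024 * 1024) >> 9 ))

# classify_read START_SECTOR NR_SECTORS ORIGIN_SECTORS
# Prints how a read of an unprovisioned range could be served:
#   origin  - range ends at or before the origin's end: remap to the origin
#   zero    - range starts at or past the origin's end: return zeros
#   partial - range straddles the origin's end: origin data, then zeros
classify_read() {
    cr_start=$1
    cr_end=$(( $1 + $2 ))
    cr_origin=$3
    if [ "$cr_end" -le "$cr_origin" ]; then
        echo "origin"
    elif [ "$cr_start" -ge "$cr_origin" ]; then
        echo "zero"
    else
        echo "partial"
    fi
}

classify_read $(( ORIGIN_SECTORS - 8 )) 8 "$ORIGIN_SECTORS"   # last 4k of the origin
classify_read "$ORIGIN_SECTORS" 8 "$ORIGIN_SECTORS"           # first 4k past the origin
classify_read $(( ORIGIN_SECTORS - 4 )) 8 "$ORIGIN_SECTORS"   # 4k read straddling the end
```

The "partial" case corresponds to the concern in the last quoted paragraph: an origin whose size is not a multiple of the pool block size necessarily has a block that straddles its end.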

Comment 4 Zdenek Kabelac 2014-01-29 14:04:18 UTC

A related problem could be the use of an external origin whose size is not an exact multiple of the thin pool chunk size.

For example, if the thin pool uses a 192KB chunk size but the external origin is 1MB (~5.3 * 192KB), writes to such a thin volume work only for the first 5 chunks. The 6th chunk does not seem to be provisioned, and data is always read from the external origin, even when it has been updated in the thin volume.
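
The arithmetic behind this example can be spelled out as follows. This is just a worked calculation with made-up helper names (full_chunks, origin_tail_kib), using the 192KB chunk size and 1MB origin mentioned above.

```shell
#!/bin/sh
# Values from the example above, in KiB.
CHUNK_KIB=192
ORIGIN_KIB=1024

# Number of pool chunks that lie entirely within the origin.
# Arguments: chunk size, origin size.
full_chunks() {
    echo $(( $2 / $1 ))
}

# Amount of origin data that spills into the next, partially covered chunk.
# Arguments: chunk size, origin size.
origin_tail_kib() {
    echo $(( $2 % $1 ))
}

full=$(full_chunks "$CHUNK_KIB" "$ORIGIN_KIB")
tail_kib=$(origin_tail_kib "$CHUNK_KIB" "$ORIGIN_KIB")

echo "whole chunks covered by origin: $full"
echo "origin data in chunk $(( full + 1 )): $tail_kib KiB"
echo "space past origin end in that chunk: $(( CHUNK_KIB - tail_kib )) KiB"
```

So the 6th chunk holds 64 KiB of origin data followed by 128 KiB that lies past the origin's end, which is the chunk that misbehaves in the description above.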

LVM has currently been patched to prohibit the use of unaligned external origins with this upstream commit:

https://www.redhat.com/archives/lvm-devel/2014-January/msg00071.html

Once the kernel target is fixed, the condition could be relaxed for newer thin pool targets.

Comment 6 RHEL Program Management 2014-03-26 00:16:04 UTC

This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 8 Rafael Aquini 2014-07-28 18:32:39 UTC

Patch(es) available on kernel-2.6.32-493.el6

Comment 12 errata-xmlrpc 2014-10-14 05:37:41 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-1392.html