Bug 976837 - lseek with SEEK_DATA / SEEK_HOLE broken on ext4
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 18
Assigned To: Eric Sandeen
QA Contact: Fedora Extras Quality Assurance
Reported: 2013-06-21 11:32 EDT by Martin Wilck
Modified: 2013-07-20 05:40 EDT
Fixed In Version: kernel-3.9.10-200.fc18
Doc Type: Bug Fix
Last Closed: 2013-07-18 02:09:58 EDT
Type: Bug
Description Martin Wilck 2013-06-21 11:32:13 EDT
Description of problem:
SEEK_DATA / SEEK_HOLE doesn't work as expected on ext4.
This can be seen e.g. with star which uses SEEK_DATA / SEEK_HOLE for sparse file support.

Version-Release number of selected component (if applicable):
3.8.5-201.fc18.x86_64

How reproducible:
always

Steps to Reproduce:
1. star c -H exustar -sparse -f /dev/null /var/log/lastlog

Actual results:
Hangs forever

Expected results:
Finishes

Additional info:
strace output:

open("/var/log/lastlog", O_RDONLY)      = 3
lseek(3, 0, 0x4 /* SEEK_??? */)         = 4096
lseek(3, 0, SEEK_SET)                   = 0
write(2, "/var/log/lastlog is sparse\n", 27/var/log/lastlog is sparse
) = 27
lseek(3, 0, 0x4 /* SEEK_??? */)         = 4096
lseek(3, 4096, 0x3 /* SEEK_??? */)      = 8192
lseek(3, 8192, 0x4 /* SEEK_??? */)      = 24576
lseek(3, 24576, 0x3 /* SEEK_??? */)     = 28672
lseek(3, 28672, 0x4 /* SEEK_??? */)     = 32768
lseek(3, 32768, 0x3 /* SEEK_??? */)     = 49152
lseek(3, 49152, 0x4 /* SEEK_??? */)     = 53248
lseek(3, 53248, 0x3 /* SEEK_??? */)     = 286720
lseek(3, 286720, 0x4 /* SEEK_??? */)    = 294912
lseek(3, 294912, 0x3 /* SEEK_??? */)    = 607887360
lseek(3, 607887360, 0x4 /* SEEK_??? */) = 607887360
lseek(3, 607887360, 0x3 /* SEEK_??? */) = 607887360
lseek(3, 607887360, 0x4 /* SEEK_??? */) = 607887360
lseek(3, 607887360, 0x3 /* SEEK_??? */) = 607887360
lseek(3, 607887360, 0x4 /* SEEK_??? */) = 607887360
[...]

The trace shows that star is stuck at the same offset. star makes alternating calls to lseek(..., SEEK_DATA) (0x3) and lseek(..., SEEK_HOLE) (0x4), and the resulting offset value is always equal to the input.

This can't be correct: for an offset inside the file, either SEEK_DATA or SEEK_HOLE must return a value larger than the input offset (or SEEK_DATA must fail with ENXIO if no data follows).

# ls -l /var/log/lastlog
-rw-r--r--. 1 root root 4902858704 Jun 21 14:28 /var/log/lastlog
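
For illustration, the alternating scan described above can be sketched in Python. This is a minimal sketch, not star's actual code: the function name data_segments is hypothetical, and it assumes a Linux system where os.SEEK_DATA and os.SEEK_HOLE are available. Unlike the loop in the trace, it raises an error when SEEK_HOLE fails to advance past the start of a data region, instead of spinning forever.

```python
import errno
import os


def data_segments(path):
    """Return a list of (start, end) byte ranges that the kernel reports as data.

    Hypothetical helper for illustration; it is not taken from star.
    """
    segments = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Find the next data region at or after `offset`.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # no data at or after `offset`: the rest is a hole
                raise
            # Find the end of that data region (EOF counts as a hole).
            end = os.lseek(fd, start, os.SEEK_HOLE)
            if end <= start:
                # Data starts at `start`, so the next hole must lie beyond it.
                # Anything else means the kernel returned an inconsistent answer.
                raise RuntimeError(
                    "no progress at offset %d: SEEK_DATA=%d, SEEK_HOLE=%d"
                    % (offset, start, end)
                )
            segments.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return segments
```

With a progress check of this kind, the pattern seen in the trace (SEEK_DATA and SEEK_HOLE both returning 607887360) would presumably surface as an error rather than a hang.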
Comment 1 Martin Wilck 2013-06-21 11:56:41 EDT
This happens only with input files of at least a certain size. The critical value appears to be 0xfffff000 (4 GiB minus one 4 KiB page).

# truncate -s $((0xffffefff)) /var/tmp/hepp
# star c -H exustar -sparse -f /dev/null /var/tmp/hepp 
/var/tmp/hepp is sparse
star: 1 blocks + 0 bytes (total of 10240 bytes = 10.00k).
(success)

# truncate -s $((0xfffff000)) /var/tmp/hepp
# star c -H exustar -sparse -f /dev/null /var/tmp/hepp 
/var/tmp/hepp is sparse
(*HANGS*)
Comment 2 Martin Wilck 2013-06-21 12:09:15 EDT
This also happens with 3.9.6-200.fc18.x86_64.

[root@cooper martin]# uname -a
Linux cooper.psw.pdbps.fsc.net 3.9.6-200.fc18.x86_64 #1 SMP Thu Jun 13 18:56:55 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@cooper martin]# truncate -s $((0xfffff000)) /var/tmp/hepp
[root@cooper martin]# strace -e trace=lseek star c -H exustar -sparse -f /dev/null /var/tmp/hepp 
lseek(7, 0, SEEK_CUR)                   = 0
lseek(3, 0, 0x4 /* SEEK_??? */)         = 0
/var/tmp/hepp is sparse
lseek(3, 0, 0x4 /* SEEK_??? */)         = 0
lseek(3, 0, 0x3 /* SEEK_??? */)         = 0
lseek(3, 0, 0x4 /* SEEK_??? */)         = 0
lseek(3, 0, 0x3 /* SEEK_??? */)         = 0
[...]


[root@cooper martin]# truncate -s $((0xffffefff)) /var/tmp/hepp
[root@cooper martin]# strace -e trace=lseek star c -H exustar -sparse -f /dev/null /var/tmp/hepp 
lseek(7, 0, SEEK_CUR)                   = 0
lseek(3, 0, 0x4 /* SEEK_??? */)         = 0
/var/tmp/hepp is sparse
lseek(3, 0, 0x4 /* SEEK_??? */)         = 0
lseek(3, 0, 0x3 /* SEEK_??? */)         = -1 ENXIO (No such device or address)
lseek(3, 4294963199, 0x4 /* SEEK_??? */) = -1 ENXIO (No such device or address)
lseek(3, 0, SEEK_SET)                   = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=2081, si_status=0, si_utime=0, si_stime=0} ---
star: 1 blocks + 0 bytes (total of 10240 bytes = 10.00k).
+++ exited with 0 +++
Comment 3 Martin Wilck 2013-06-21 12:12:51 EDT
Comparing the two traces above shows that the error is in the SEEK_DATA call: it should fail with errno ENXIO because the file contains no data, but instead it returns 0.
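
The same check can be made without star. The following is a minimal sketch under stated assumptions: the helper name seek_data_on_hole_only_file and the constant CRITICAL_SIZE are hypothetical, and os.SEEK_DATA must be available (Linux). It creates a file consisting only of a hole, as truncate does in comment 1, and reports what lseek(fd, 0, SEEK_DATA) returns.

```python
import errno
import os

CRITICAL_SIZE = 0xFFFFF000  # size at which the hang was observed in comment 1


def seek_data_on_hole_only_file(path, size=CRITICAL_SIZE):
    """Create a hole-only file of `size` bytes and probe it with SEEK_DATA.

    Returns None if lseek() fails with ENXIO (no data in the file),
    otherwise the offset that lseek() returned.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.ftruncate(fd, size)  # extend without writing: the file is one big hole
        try:
            return os.lseek(fd, 0, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                return None
            raise
    finally:
        os.close(fd)
```

On a filesystem that tracks holes, None is the expected result. A return value of 0 is only conclusive there: filesystems without native SEEK_DATA support fall back to treating the whole file as data and also return 0.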
Comment 4 Martin Wilck 2013-06-21 12:29:48 EDT
This should fix it.

https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git/commit/fs/ext4/file.c?h=unstable&id=e7293fd146846e2a44d29e0477e0860c60fb856b

ext4: fix overflows in SEEK_HOLE, SEEK_DATA implementations
ext4_lblk_t is just u32 so multiplying it by blocksize can easily overflow for files larger than 4 GB. Fix that by properly typing the block offsets before shifting.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
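
The overflow described in the commit message can be illustrated with a little arithmetic. This is a sketch in Python that emulates fixed-width integers by masking, not the kernel code itself; the function names are hypothetical, and a 4 KiB block size (a shift of 12 bits) is assumed. It shows that truncating to 32 bits reproduces both offsets seen in the traces: 0 for the 0xfffff000-byte test file and 607887360 for /var/log/lastlog.

```python
U32_MASK = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF
BLKBITS = 12  # assumed: 4 KiB blocks


def block_to_offset_u32(block, blkbits=BLKBITS):
    """Shift a block index as if the arithmetic were done in a 32-bit type."""
    return (block << blkbits) & U32_MASK


def block_to_offset_u64(block, blkbits=BLKBITS):
    """Shift a block index after widening it to a 64-bit type."""
    return (block << blkbits) & U64_MASK


# First block past the end of a 0xfffff000-byte file (comment 1).
print(block_to_offset_u32(0x100000))   # 0
print(block_to_offset_u64(0x100000))   # 4294967296

# Last block of the 4902858704-byte /var/log/lastlog from the description.
last_block = 4902858704 >> BLKBITS     # 1196986
print(block_to_offset_u32(last_block)) # 607887360
print(block_to_offset_u64(last_block)) # 4902854656
```

The block index itself fits in 32 bits in both cases; only the shifted byte offset needs the wider type, which matches the commit's description of typing the block offsets before shifting.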
Comment 5 Josh Boyer 2013-07-01 13:33:31 EDT
Hm.  I wonder why that commit isn't CC'd to stable.  Eric?
Comment 6 Eric Sandeen 2013-07-09 10:13:13 EDT
Because remembering to cc: stable is leaky & error prone I suppose.
Comment 7 Eric Sandeen 2013-07-09 10:15:24 EDT
the original patch submission from Jan said:

> Likely this is also stable material so Ted, you might want to add
> stable@vger.kernel.org to CC when merging the patches.

:(

I pinged on the list, I'll see if we can get it done retroactively.
Comment 8 Josh Boyer 2013-07-09 10:38:02 EDT
(In reply to Eric Sandeen from comment #7)
> the original patch submission from Jan said:
> 
> > Likely this is also stable material so Ted, you might want to add
> > stable@vger.kernel.org to CC when merging the patches.
> 
> :(
> 
> I pinged on the list, I'll see if we can get it done retroactively.

OK.  I was mostly curious if it was kept out on purpose, but seems not.  We can grab this for Fedora if it doesn't show up with 3.9.10 and 3.10.1.
Comment 9 Eric Sandeen 2013-07-09 10:58:47 EDT
From Ted:

> It was an oversight; my fault, sorry.  I'll send a request to the
> stable kernel tree for the following patches:
> 
> 8af8eec ext4: fix overflow when counting used blocks on 32-bit architectures
> a60697f ext4: fix data offset overflow in ext4_xattr_fiemap() on 32-bit archs
> e7293fd ext4: fix overflows in SEEK_HOLE, SEEK_DATA implementations
> eaf3793 ext4: fix data offset overflow on 32-bit archs in ext4_inline_data_fiemap()

(not sure what the disposition of this bug should be now?)
Comment 10 Josh Boyer 2013-07-09 10:59:50 EDT
POST is fine.  Once they're in the Fedora kernel repo (one way or another), we'll move it to MODIFIED.
Comment 11 Josh Boyer 2013-07-12 09:36:50 EDT
I grabbed the 4 patches Ted highlighted upstream, since they aren't queued for 3.9.10/3.10.1 at the moment.  Greg has a bunch of others to sort through, but there's no reason to wait on these.
Comment 12 Fedora Update System 2013-07-14 07:22:21 EDT
kernel-3.9.10-100.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kernel-3.9.10-100.fc17
Comment 13 Fedora Update System 2013-07-14 07:27:52 EDT
kernel-3.9.10-200.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/kernel-3.9.10-200.fc18
Comment 14 Fedora Update System 2013-07-14 21:05:08 EDT
Package kernel-3.9.10-200.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.9.10-200.fc18'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-12987/kernel-3.9.10-200.fc18
then log in and leave karma (feedback).
Comment 15 Fedora Update System 2013-07-18 02:09:58 EDT
kernel-3.9.10-100.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 16 Fedora Update System 2013-07-20 05:40:32 EDT
kernel-3.9.10-200.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.