Bug 554735

Summary: readahead/sync_file_range/fadvise64 compat calls broken
Product: Red Hat Enterprise Linux 5
Reporter: Veaceslav Falico <vfalico>
Component: kernel
Assignee: Chris Lalancette <clalance>
Status: CLOSED DUPLICATE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high
Priority: high
Version: 5.4
CC: john.haxby, peterm
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2010-03-05 13:37:17 UTC
Bug Depends On:    
Bug Blocks: 559410    
Attachments:
Program to demonstrate badly working fadvise() (flags: none)
systemtap script (flags: none)
Proposed patch (flags: none)

Description Veaceslav Falico 2010-01-12 14:25:23 UTC
Description of problem:
In RHEL5 the readahead, sync_file_range and fadvise64 compat calls are broken (x86/x86_64 to ia32). The upstream fix is commit e412ac4971d27ea84f3d63ce425c6ab2d6a67f23. A patch for RHEL5 is included.


Version-Release number of selected component (if applicable):
2.6.18-164.9.1.el5

Comment 1 john.haxby@oracle.com 2010-03-04 17:02:56 UTC
That's interesting. I've just patched this. I have a systemtap script and a program that can be used to test the system calls (to be attached shortly). Without the fix in place, stap gives this for "./randtest32 /etc/passwd":

fadvise64   (fd=3. offset=0, len=0, advice=0/FADV_NORMAL)
fadvise64_64(fd=3. offset=0, len=0, advice=0/FADV_NORMAL)
fadvise64   (fd=3. offset=5, len=0, advice=7/UNKNOWN VALUE: 7)
fadvise64_64(fd=3. offset=5, len=0, advice=7/UNKNOWN VALUE: 7)
fadvise64_64(fd=3. offset=0, len=0, advice=1/FADV_RANDOM)
fadvise64_64(fd=3. offset=5, len=7, advice=2/FADV_SEQUENTIAL)

Whereas "./randtest64 /etc/passwd" gives this:

fadvise64   (fd=3. offset=0, len=0, advice=1/FADV_RANDOM)
fadvise64_64(fd=3. offset=0, len=0, advice=1/FADV_RANDOM)
fadvise64   (fd=3. offset=5, len=7, advice=2/FADV_SEQUENTIAL)
fadvise64_64(fd=3. offset=5, len=7, advice=2/FADV_SEQUENTIAL)
fadvise64   (fd=3. offset=0, len=0, advice=1/FADV_RANDOM)
fadvise64_64(fd=3. offset=0, len=0, advice=1/FADV_RANDOM)
fadvise64   (fd=3. offset=5, len=7, advice=2/FADV_SEQUENTIAL)
fadvise64_64(fd=3. offset=5, len=7, advice=2/FADV_SEQUENTIAL)

randtest.c, the stap script and the patch follow.
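
The bad values in the 32-bit trace are consistent with the kernel reading a 32-bit caller's argument slots as if they were native 64-bit arguments. The following is a minimal sketch of that effect, not the kernel code. It assumes the usual ia32 convention in which the 64-bit offset of fadvise64 occupies two 32-bit slots (low half first); the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the decoded arguments. */
struct fadvise_args {
    int      fd;
    uint64_t offset;
    uint64_t len;
    int      advice;
};

/*
 * Assumed slot layout for a 32-bit caller of fadvise64:
 *   slot[0] = fd, slot[1] = offset (low 32 bits),
 *   slot[2] = offset (high 32 bits), slot[3] = len, slot[4] = advice
 */

/* Wrong: read the slots as if each one were a whole native argument. */
static struct fadvise_args decode_as_native(const uint32_t slot[5])
{
    struct fadvise_args a;
    a.fd     = (int)slot[0];
    a.offset = slot[1];        /* only the low half of the offset */
    a.len    = slot[2];        /* really the high half of the offset */
    a.advice = (int)slot[3];   /* really the length */
    return a;
}

/* Right: reassemble the 64-bit offset from its two halves first. */
static struct fadvise_args decode_as_compat(const uint32_t slot[5])
{
    struct fadvise_args a;
    a.fd     = (int)slot[0];
    a.offset = ((uint64_t)slot[2] << 32) | slot[1];
    a.len    = slot[3];
    a.advice = (int)slot[4];
    return a;
}
```

For a call with offset 5, length 7 and advice 2, the slots would be {3, 5, 0, 7, 2}. The native reading gives offset=5, len=0, advice=7, which is what the randtest32 trace shows, while the compat reading gives offset=5, len=7, advice=2 as in the randtest64 trace.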

Comment 2 john.haxby@oracle.com 2010-03-04 17:07:27 UTC
Created attachment 397870 [details]
Program to demonstrate badly working fadvise()

Note that the posix_fadvise() arguments are only there to illustrate the problem; I don't expect the values to be at all useful.

Compile the program with "-m32" or "-m64" and note that two of the calls fail with EINVAL. (Also note that strace is not terribly useful here, as it suffers from a similar bug; a fix should be available soon.)
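
For readers without the attachment, a test along these lines can be sketched as follows. This is not the attached randtest.c: the helper name try_fadvise is hypothetical, and the argument values in the usage note are simply the ones visible in the stap output in comment 1.

```c
#define _XOPEN_SOURCE 600   /* for posix_fadvise() */
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` read-only and apply one posix_fadvise() call to it.
 * Returns 0 on success, the error number reported by posix_fadvise()
 * on failure (it returns the error directly rather than setting errno),
 * or -1 if the file could not be opened.
 */
int try_fadvise(const char *path, off_t offset, off_t len, int advice)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = posix_fadvise(fd, offset, len, advice);
    close(fd);
    return rc;
}
```

Calling try_fadvise("/etc/passwd", 0, 0, POSIX_FADV_RANDOM) and try_fadvise("/etc/passwd", 5, 7, POSIX_FADV_SEQUENTIAL) from a small main(), built once with "-m32" and once with "-m64", should return 0 in both builds on a kernel where the compat path is correct.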

Comment 3 john.haxby@oracle.com 2010-03-04 17:10:12 UTC
Created attachment 397872 [details]
systemtap script.

Just run this with stap while you run randtest32 (32-bit randtest) or randtest64 (64-bit randtest) in another window. The bad arguments are pretty obvious.

Comment 4 john.haxby@oracle.com 2010-03-04 17:16:22 UTC
Created attachment 397874 [details]
Proposed patch

Note that this is a slightly re-worked version of the upstream patch as the xen syscalls need to be modified as well as the bare metal syscalls.

With this in place the fadvise() calls in randtest all work and the stap script shows correct values being passed. strace still reports rubbish arguments, but it no longer reports EINVAL. That is, where strace showed this before the patch:

fadvise64(3, 0, 0, POSIX_FADV_NORMAL)   = 0
fadvise64(3, 5, 0, 0x7 /* POSIX_FADV_??? */) = -1 EINVAL (Invalid argument)
fadvise64_64(3, 0, 0, POSIX_FADV_NORMAL) = 0
fadvise64_64(3, 5, 0, 0x7 /* POSIX_FADV_??? */) = 0

the second call now returns 0, not EINVAL. With a working strace you get

fadvise64(3, 0, 0, POSIX_FADV_RANDOM)   = 0
fadvise64(3, 5, 7, POSIX_FADV_SEQUENTIAL) = 0
fadvise64_64(3, 0, 0, POSIX_FADV_RANDOM) = 0
fadvise64_64(3, 5, 7, POSIX_FADV_SEQUENTIAL) = 0

which matches the stap output.
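
The difference between the two strace outputs comes down to how the tracer decodes the 32-bit argument slots before printing them. A rough sketch of a correct decoder for the 32-bit fadvise64 call is below. It is not strace's actual code: the function names are hypothetical, the slot layout (offset split into low and high halves) is an assumption, and the advice numbers are the Linux x86 values, of which 0, 1 and 2 are confirmed by the stap output above.

```c
#include <stdint.h>
#include <stdio.h>

/* Advice names indexed by their numeric value on Linux x86. */
static const char *const advice_names[] = {
    "POSIX_FADV_NORMAL",      /* 0 */
    "POSIX_FADV_RANDOM",      /* 1 */
    "POSIX_FADV_SEQUENTIAL",  /* 2 */
    "POSIX_FADV_WILLNEED",    /* 3 */
    "POSIX_FADV_DONTNEED",    /* 4 */
    "POSIX_FADV_NOREUSE",     /* 5 */
};

/*
 * Format a 32-bit fadvise64 call in an strace-like style.
 * Assumed slot layout: fd, offset low, offset high, len, advice.
 * Returns the value of snprintf().
 */
int format_fadvise64(char *buf, size_t size, const uint32_t slot[5])
{
    /* Rebuild the 64-bit offset from its two 32-bit halves. */
    unsigned long long offset =
        ((unsigned long long)slot[2] << 32) | slot[1];
    uint32_t advice = slot[4];
    size_t known = sizeof(advice_names) / sizeof(advice_names[0]);

    if (advice < known)
        return snprintf(buf, size, "fadvise64(%d, %llu, %u, %s)",
                        (int)slot[0], offset, (unsigned)slot[3],
                        advice_names[advice]);

    /* Unknown advice value: print it in hex with a marker. */
    return snprintf(buf, size,
                    "fadvise64(%d, %llu, %u, %#x /* POSIX_FADV_??? */)",
                    (int)slot[0], offset, (unsigned)slot[3],
                    (unsigned)advice);
}
```

With slots {3, 5, 0, 7, 2} this produces the same call text as the working strace line for the sequential case. A decoder that skips the offset reassembly would instead print length 0 and an unknown advice value of 0x7, as in the broken output.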

Comment 5 john.haxby@oracle.com 2010-03-04 17:18:50 UTC
This problem was found outside Oracle.