Bug 456453 - GFS2: d_rwdirectempty fails with short read
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: All Linux
Priority: medium
Severity: medium
Target Milestone: beta
Target Release: ---
Assigned To: Ben Marzinski
QA Contact: Martin Jenner
Depends On:
Blocks:
Reported: 2008-07-23 15:30 EDT by Nate Straz
Modified: 2009-01-20 15:08 EST
Doc Type: Bug Fix
Last Closed: 2009-01-20 15:08:23 EST


Attachments
patch that fixes the short reads (1.77 KB, patch)
2008-07-25 18:02 EDT, Ben Marzinski
Port of upstream fix. (5.36 KB, patch)
2008-08-12 14:20 EDT, Ben Marzinski

Description Nate Straz 2008-07-23 15:30:56 EDT
Description of problem:

While running distributed I/O test cases on GFS2, the d_rwdirectempty test case
fails.  This test case starts with a new zero length file and performs
sequential writes using O_DIRECT to the file.  The writes are then verified on a
different node.  In the failing case, the read returns 0 bytes.
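The verifying read can be approximated outside the test harness. The following is only a minimal sketch, not the d_doio implementation: the helper name direct_read_len() and the 4096-byte buffer alignment are assumptions made for illustration, and it opens the file read-only rather than with O_RDWR as the test does.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux/glibc */
#define _FILE_OFFSET_BITS 64   /* the test file is far larger than 2 GB */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of d_doio): read `count` bytes at `offset`
 * using O_DIRECT. Returns the number of bytes read, or -1 on error.
 */
static ssize_t direct_read_len(const char *path, off_t offset, size_t count)
{
    void *buf = NULL;
    ssize_t n;
    int fd;

    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;

    /* O_DIRECT needs an aligned buffer; 4096 is an assumed safe alignment. */
    if (posix_memalign(&buf, 4096, count) != 0) {
        close(fd);
        return -1;
    }

    /* A single positioned read, like the failing verify request. */
    n = pread(fd, buf, count, offset);

    free(buf);
    close(fd);
    return n;
}
```

Called on the verifying node as direct_read_len("rwdirectempty", 36082688, 783872) after another node has written that range, a return value of 0 would correspond to the "Short read(), read 0 of 783872 bytes" line in the log below.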

$ cat r3/8.d_rwdirectempty/*/cmd.log
d_iogen starting up with the following:
Start Time:                 Wed Jul 23 14:15:11 2008
Session id:                 19737
Resource file:              /local/nstraz/svn/sts-rhel5/sts-root/var/share/resource_files/tank-cluster.xml
Internal Region Lock Type:  clm
Iterations:                 30s
Seed:                       11883
Offset-mode:                sequential
Overlap Flag:               off
Mintrans:                   512000
Maxtrans:                   8388608
Requests:                   read,write
Syscalls:                   read,readv,write,writev
Verify Syscalls:            read
IO type:                    direct

Test Files:

Path                                                      Size
                                                        (bytes)
---------------------------------------------------------------
rwdirectempty                                        758990139392
d_doio ior status != expected status

======== msg ========
type: 2 (verify)
status: 0 (nack)
expected status: 1 (ack)
srchost: tank-03
srcpid: 9932
desthost: try
destpid: 0
ior: 
----- xior ----
magic: 0xfeed10
type: 4 (read)
path: rwdirectempty
syscall: read
oflags: 16386 (O_RDWR|O_DIRECT)
offset: 36082688
count: 783872
pattern: N:9974:tank-04:writev*
chksum: 0xa8da8144

=====================
Cleanup took 1 seconds.
d_doio(11063) 0 requests finished, exiting...
d_doio(11064) 2 requests finished, exiting...
d_doio(11065) 1 requests finished, exiting...
d_doio(11062) 0 requests finished, exiting...
d_doio(11066) 1 requests finished, exiting...
d_doio(9927) 2 requests finished, exiting...
d_doio(9928) 1 requests finished, exiting...
d_doio(9930) 1 requests finished, exiting...
d_doio(9929) 1 requests finished, exiting...
d_doio(9931) 0 requests finished, exiting...
Short read(), read 0 of 783872 bytes at 36082688 on rwdirectempty
(parent) pid 9932 exited non-zero
d_doio(9973) 1 requests finished, exiting...
d_doio(9972) 1 requests finished, exiting...
d_doio(9976) 1 requests finished, exiting...
d_doio(9974) 1 requests finished, exiting...
d_doio(9975) 1 requests finished, exiting...



Version-Release number of selected component (if applicable):
kernel-2.6.18-98.el5
kmod-gfs2-1.98-1.1.el5.abhi.4

How reproducible:
Every time.

Steps to Reproduce:
1. Run dd_io on a GFS2 file system.
  
Actual results:


Expected results:


Additional info:

The test case passes on GFS.
Comment 2 Ben Marzinski 2008-07-25 11:50:40 EDT
The problem is that the VFS code in __generic_file_aio_read() checks whether the
position of a direct I/O read is past the end of the file. If it is, the VFS
never calls generic_file_direct_IO(), which is what hooks into the gfs2 direct
I/O read code.  This check is useful for local filesystems, but gfs2 does not
hold a lock on the file at that point, so there is no guarantee that the file
size it sees is correct.  This isn't a problem for gfs, since it grabs the locks
at an earlier stage of the system call.
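To illustrate the failure mode, here is a simplified user-space model of that check. It is not the kernel code: struct model_inode, fs_direct_read() and vfs_direct_read() are hypothetical names, and the model assumes that the filesystem's own read path would refresh the size once it takes its lock.

```c
/* Hypothetical stand-in for an inode as seen from one cluster node. */
struct model_inode {
    long long cached_size;  /* file size this node currently believes */
    long long actual_size;  /* real size after another node's writes */
};

/*
 * Models the filesystem's direct I/O read path. Assumption: taking the
 * cluster lock here brings the cached size up to date.
 */
static long long fs_direct_read(struct model_inode *ino,
                                long long pos, long long count)
{
    long long avail;

    ino->cached_size = ino->actual_size;  /* refreshed under the lock */
    if (pos >= ino->actual_size)
        return 0;
    avail = ino->actual_size - pos;
    return count < avail ? count : avail;
}

/*
 * Models the VFS-level check: if the position is at or past the cached
 * size, the filesystem is never consulted and the read returns 0.
 */
static long long vfs_direct_read(struct model_inode *ino,
                                 long long pos, long long count)
{
    if (pos < ino->cached_size)
        return fs_direct_read(ino, pos, count);
    return 0;
}
```

With cached_size still 0 on the reading node and the data already written elsewhere, vfs_direct_read() returns 0 without ever reaching fs_direct_read(), which matches the short read seen in the test.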
Comment 3 Ben Marzinski 2008-07-25 18:02:10 EDT
Created attachment 312687 [details]
patch that fixes the short reads

This patch applies on top of the 2.6.18-98.el5 RHEL5 kernel. It adds another
inode flag, S_NOSIZECHK, which tells the VFS to skip the test of whether the
read position is past the end of the file.  GFS2 sets this flag on all of its
inodes, so the check is always skipped for GFS2 files.
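The idea behind the flag can be sketched as follows. This is an illustrative model only, not the attached patch: the bit value of MODEL_S_NOSIZECHK, struct flag_inode and should_call_direct_io() are all hypothetical.

```c
/* Hypothetical bit value; the real S_NOSIZECHK value is defined by the patch. */
#define MODEL_S_NOSIZECHK 0x1u

/* Hypothetical stand-in for the inode fields the check looks at. */
struct flag_inode {
    unsigned int flags;     /* per-inode flags set by the filesystem */
    long long cached_size;  /* possibly stale file size */
};

/*
 * Returns 1 if the read should be handed to the filesystem's direct I/O
 * code, 0 if the VFS would return early without calling it.
 */
static int should_call_direct_io(const struct flag_inode *ino, long long pos)
{
    /* Filesystems that cannot trust the cached size opt out of the check. */
    if (ino->flags & MODEL_S_NOSIZECHK)
        return 1;
    return pos < ino->cached_size;
}
```

In this model an inode with the flag set always reaches the filesystem's read code, which can then take its lock and decide for itself whether the position is really past the end of the file.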
Comment 4 Ben Marzinski 2008-08-04 14:01:57 EDT
I have been totally unable to recreate this issue on the upstream 2.6.26 kernel. However, I can't see any reason why it should behave any differently there.
Comment 5 Ben Marzinski 2008-08-12 14:20:35 EDT
Created attachment 314123 [details]
Port of upstream fix.

My last patch only dealt with the direct I/O case, but this happens for cached reads too. However, the problem is fixed by an already existing patch in the upstream kernel, so this is a port of that patch.
Comment 6 Don Zickus 2008-09-02 23:40:52 EDT
in kernel-2.6.18-107.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 9 Nate Straz 2008-11-13 17:19:24 EST
Verified against kernel-2.6.18-122.el5.
Comment 11 errata-xmlrpc 2009-01-20 15:08:23 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html