
Bug 164338

Summary: fix aio hang when reading beyond EOF
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Jason Baron <jbaron>
Assignee: Jeff Moyer <jmoyer>
QA Contact: Brian Brock <bbrock>
CC: knoel
Fixed In Version: RHSA-2005-514
Doc Type: Bug Fix
Last Closed: 2005-10-05 13:45:30 UTC
Bug Blocks: 156322

Description Jason Baron 2005-07-27 01:47:52 UTC
Additional info:

| Hello all,
|
| I came across the following problem while running ltp-aiodio testcases
| from ltp-full-20050405 on linux-2.6.12-rc3-mm3. I tried running the
| tests with EXT3 as well as JFS filesystems.
|
| One or two fsx-linux testcases were hung after some time. These
| testcases were hanging at wait_for_all_aios().
|
| From initial debugging I found that some iocbs were not getting
| completed even though the last retry for them returned
| -EIOCBQUEUED. All such pending iocbs represented READ operations.
|
| Further debugging revealed that all such iocbs hit EOF in the DIO layer.
| To be more precise, the "pos" from which they were trying to read was
| greater than the "size" of the file. So the generic_file_direct_IO
| returned 0.
|
| This happens rarely, as there is already a check in
| __generic_file_aio_read() for whether "pos" < "size" before calling
| the direct IO routine.
|
| > size = i_size_read(inode);
| > if (pos < size) {
| >       retval = generic_file_direct_IO(READ, iocb,
| >                                iov, pos, nr_segs);
|
|
| But for READ, we take inode->i_sem only in the DIO layer. So it is
| possible for some other process to change the size of the file before
| we take the i_sem. In such a case (when "pos" > "size"),
| __generic_file_aio_read() would return -EIOCBQUEUED even though no
| I/O requests were submitted by the DIO layer. This would cause the
| AIO layer to expect aio_complete() for the iocb, which does not happen.
| Thus the test hangs forever, waiting for an I/O completion when no
| requests were submitted at all.
|
| The following patch makes __generic_file_aio_read() return 0 (instead
| of returning -EIOCBQUEUED) on getting 0 from generic_file_direct_IO(),
| so that the AIO layer does the aio_complete().
|
| Testing:
|
| I have tested the patch on an SMP machine (with 2 Pentium 4 (HT) CPUs)
| running linux-2.6.12-rc3-mm3. I ran the ltp-aiodio testcases and none
| of the fsx-linux tests hung. The aio-stress tests also ran without any
| problem.
|
| --
| thanks,
|
| Suzuki K P
| Linux Technology Centre
| IBM Software Labs
`----
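
The return-value handling described in the report above can be sketched as a small user-space model. This is not the kernel patch itself, which is not included in this report. The helper name model_aio_read_retval() and its "patched" flag are hypothetical, and the model assumes that the unpatched read path converted any non-negative result from generic_file_direct_IO() into -EIOCBQUEUED for asynchronous iocbs.

```c
/* Kernel-internal "iocb queued" code; used here only as a marker value. */
#define EIOCBQUEUED 529

/*
 * Hypothetical model of how __generic_file_aio_read() maps the return
 * value of generic_file_direct_IO() for a direct-IO READ.
 *
 *   dio_ret  value returned by the DIO layer
 *            (0 means EOF was hit and nothing was submitted)
 *   is_sync  non-zero for a synchronous iocb
 *   patched  non-zero to model the behaviour after the fix
 */
static long model_aio_read_retval(long dio_ret, int is_sync, int patched)
{
    long retval = dio_ret;

    if (is_sync)
        return retval;          /* sync callers get the result directly */

    if (patched) {
        /* Only report "queued" when the DIO layer did some work. */
        if (retval > 0)
            retval = -EIOCBQUEUED;
    } else {
        /* Assumed old behaviour: 0 is also turned into "queued", so the
         * AIO layer waits for an aio_complete() that never arrives. */
        if (retval >= 0)
            retval = -EIOCBQUEUED;
    }
    return retval;
}
```

In this model, model_aio_read_retval(0, 0, 0) yields -EIOCBQUEUED, which corresponds to the hang, while model_aio_read_retval(0, 0, 1) yields 0, so the AIO layer can complete the iocb immediately.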

Comment 5 Red Hat Bugzilla 2005-10-05 13:45:30 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-514.html