Bug 164338 - fix aio hang when reading beyond EOF
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Jeffrey Moyer
QA Contact: Brian Brock
Depends On:
Blocks: 156322
Reported: 2005-07-26 21:47 EDT by Jason Baron
Modified: 2013-03-06 00:58 EST

See Also:
Fixed In Version: RHSA-2005-514
Doc Type: Bug Fix
Last Closed: 2005-10-05 09:45:30 EDT
Description Jason Baron 2005-07-26 21:47:52 EDT
Additional info:

| Hello all,
|
| I came across the following problem while running ltp-aiodio testcases
| from ltp-full-20050405 on linux-2.6.12-rc3-mm3. I tried running the
| tests with EXT3 as well as JFS filesystems.
|
| One or two fsx-linux testcases were hung after some time. These
| testcases were hanging at wait_for_all_aios().
|
| From initial debugging I found that some iocbs were never
| completed even though the last retry for them returned
| -EIOCBQUEUED. All such pending iocbs were READ operations.
|
| Further debugging revealed that all such iocbs hit EOF in the DIO layer.
| To be more precise, the "pos" from which they were trying to read was
| greater than the "size" of the file, so generic_file_direct_IO()
| returned 0.
|
| This happens rarely, because __generic_file_aio_read() already checks
| that "pos" < "size" before calling the direct I/O routine.
|
| > size = i_size_read(inode);
| > if (pos < size) {
| >       retval = generic_file_direct_IO(READ, iocb,
| >                                iov, pos, nr_segs);
|
|
| But for READ, we take inode->i_sem only in the DIO layer, so some
| other process can change the size of the file before we take the
| i_sem. In such a case (when "pos" > "size"),
| __generic_file_aio_read() would return -EIOCBQUEUED even though no
| I/O requests were submitted by the DIO layer. This causes the AIO
| layer to expect an aio_complete() for the iocb, which never happens.
| The test thus hangs forever, waiting for an I/O completion when no
| requests were submitted at all.
|
| The following patch makes __generic_file_aio_read() return 0 (instead
| of -EIOCBQUEUED) on getting 0 from generic_file_direct_IO(), so that
| the AIO layer does the aio_complete().
|
| Testing:
|
| I have tested the patch on an SMP machine (with two Pentium 4 (HT)
| CPUs) running linux-2.6.12-rc3-mm3. I ran the ltp-aiodio testcases and
| none of the fsx-linux tests hung. The aio-stress tests also ran without
| any problem.
|
| --
| thanks,
|
| Suzuki K P
| Linux Technology Centre
| IBM Software Labs
`----
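The race and the fix described in the quoted mail can be illustrated with a small user-space model. This is only a sketch, not the kernel code or the actual patch: struct model_inode, model_direct_io() and model_aio_read() are hypothetical stand-ins, the truncate_to parameter simulates another process shrinking the file inside the race window, and the pre-patch behaviour is assumed to treat any non-negative direct I/O result as "queued". The value 529 for EIOCBQUEUED is the one defined in the kernel's include/linux/errno.h.

```c
/* Kernel-internal code: "iocb queued, will get completion event". */
#define EIOCBQUEUED 529

/* Hypothetical stand-in for the inode; only the file size matters here. */
struct model_inode {
    long long size;
};

/*
 * Stand-in for generic_file_direct_IO(READ, ...). It looks at the size
 * as it is once the lock would be held and returns the number of bytes
 * submitted. At or beyond EOF it submits nothing and returns 0.
 */
static long long model_direct_io(const struct model_inode *inode,
                                 long long pos, long long count)
{
    if (pos >= inode->size)
        return 0;
    long long avail = inode->size - pos;
    return count < avail ? count : avail;
}

/*
 * Model of the O_DIRECT branch of the AIO read path.
 *   truncate_to >= 0 : another process shrinks the file to this size
 *                      after the unlocked size check (the race).
 *   fixed != 0       : patched behaviour, 0 bytes is reported as 0.
 */
static long long model_aio_read(struct model_inode *inode, long long pos,
                                long long count, long long truncate_to,
                                int fixed)
{
    long long size = inode->size;      /* unlocked i_size_read() */
    long long retval = 0;

    if (pos < size) {
        if (truncate_to >= 0)
            inode->size = truncate_to; /* race window before i_sem */

        retval = model_direct_io(inode, pos, count);

        if (retval > 0)
            return -EIOCBQUEUED;       /* I/O in flight, completion follows */
        if (retval == 0 && !fixed)
            return -EIOCBQUEUED;       /* assumed bug: nothing was submitted,
                                          so no completion ever arrives */
    }
    return retval;                     /* 0: caller completes the iocb now */
}
```

With a 4096-byte file truncated to 0 during the window, model_aio_read(&f, 1024, 512, 0, 0) returns -EIOCBQUEUED, which is the hang described above, while the same call with fixed set to 1 returns 0 so the caller can complete the iocb immediately.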
Comment 5 Red Hat Bugzilla 2005-10-05 09:45:30 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-514.html
