Bug 164338 - fix aio hang when reading beyond EOF
Summary: fix aio hang when reading beyond EOF
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel   
Version: 4.0
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Jeff Moyer
QA Contact: Brian Brock
Depends On:
Blocks: 156322
Reported: 2005-07-27 01:47 UTC by Jason Baron
Modified: 2013-03-06 05:58 UTC (History)

Fixed In Version: RHSA-2005-514
Doc Type: Bug Fix
Last Closed: 2005-10-05 13:45:30 UTC


External Trackers
Tracker ID: Red Hat Product Errata RHSA-2005:514
Priority: qe-ready
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 2
Last Updated: 2005-10-05 04:00:00 UTC

Description Jason Baron 2005-07-27 01:47:52 UTC
Additional info:

| Hello all,
| I came across the following problem while running the ltp-aiodio
| testcases from ltp-full-20050405 on linux-2.6.12-rc3-mm3. I ran the
| tests on EXT3 as well as JFS filesystems.
| One or two fsx-linux testcases hung after some time. These
| testcases were hanging in wait_for_all_aios().
| From initial debugging I found that some iocbs were never
| completed even though the last retry for each of them returned
| -EIOCBQUEUED. All of these pending iocbs were READ operations.
| Further debugging revealed that all such iocbs hit EOF in the DIO layer.
| To be more precise, the "pos" from which they were trying to read was
| greater than the "size" of the file, so generic_file_direct_IO()
| returned 0.
| This happens rarely, as there is already a check in
| __generic_file_aio_read() for whether "pos" < "size" before calling
| the direct IO routine.
| > size = i_size_read(inode);
| > if (pos < size) {
| >       retval = generic_file_direct_IO(READ, iocb,
| >                                iov, pos, nr_segs);
| But for READ, we take inode->i_sem only in the DIO layer, so
| another process can change the size of the file before we take
| i_sem. In such a case (when "pos" > "size"),
| __generic_file_aio_read() returns -EIOCBQUEUED even though the DIO
| layer submitted no I/O requests. The AIO layer then expects an
| aio_complete() for the iocb, which does not happen.
| The test therefore hangs forever, waiting for an I/O completion when
| no requests were submitted at all.
| The following patch makes __generic_file_aio_read() return 0 (instead
| of -EIOCBQUEUED) on getting 0 from generic_file_direct_IO(),
| so that the AIO layer does the aio_complete().
| Testing:
| I have tested the patch on an SMP machine (with 2 Pentium 4 (HT) CPUs)
| running linux-2.6.12-rc3-mm3. I ran the ltp-aiodio testcases and none
| of the fsx-linux tests hung. The aio-stress tests also ran without any
| problem.
| --
| thanks,
| Suzuki K P
| Linux Technology Centre
| IBM Software Labs

Comment 5 Red Hat Bugzilla 2005-10-05 13:45:30 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

