Bug 164338
| Summary: | fix aio hang when reading beyond EOF | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Jason Baron <jbaron> |
| Component: | kernel | Assignee: | Jeff Moyer <jmoyer> |
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.0 | CC: | knoel |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | RHSA-2005-514 | Doc Type: | Bug Fix |
| Last Closed: | 2005-10-05 13:45:30 UTC | Type: | --- |
| Bug Blocks: | 156322 | ||
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-514.html
Description of problem:

Hello all,

I came across the following problem while running the ltp-aiodio testcases from ltp-full-20050405 on linux-2.6.12-rc3-mm3. I tried running the tests with EXT3 as well as JFS filesystems.

One or two fsx-linux testcases hung after some time. These testcases were hanging at wait_for_all_aios().

From initial debugging I found that some iocbs were not getting completed even though the last retry for them returned -EIOCBQUEUED. All such pending iocbs represented READ operations.

Further debugging revealed that all such iocbs hit EOF in the DIO layer. To be more precise, the "pos" from which they were trying to read was greater than the "size" of the file, so generic_file_direct_IO() returned 0.

This happens rarely, as there is already a check in __generic_file_aio_read() for whether "pos" < "size" before calling the direct IO routine:

> size = i_size_read(inode);
> if (pos < size) {
>         retval = generic_file_direct_IO(READ, iocb,
>                                         iov, pos, nr_segs);

But for READ, we take inode->i_sem only in the DIO layer, so it is possible for some other process to change the size of the file before we take the i_sem. In such a case (when "pos" > "size"), __generic_file_aio_read() would return -EIOCBQUEUED even though no I/O requests were submitted by the DIO layer. This causes the AIO layer to expect an aio_complete() for the iocb, which does not happen. The test therefore hangs forever, waiting for an I/O completion when no requests were submitted at all.

The following patch makes __generic_file_aio_read() return 0 (instead of returning -EIOCBQUEUED) on getting 0 from generic_file_direct_IO(), so that the AIO layer does the aio_complete().
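The patch itself is not reproduced in this report. As a rough illustration of the race and of the change in return value described above, here is a minimal user-space model in C. It is not kernel code: the `model_*` functions and their parameters are hypothetical names invented for this sketch, and the exact condition used in the real `__generic_file_aio_read()` is an assumption inferred from the description. `checked_size` stands for the size seen by the unlocked `i_size_read()` check, and `locked_size` for the size the DIO layer sees once it holds `i_sem`.

```c
#include <stdbool.h>

/* Kernel-internal "iocb queued" code; defined here only for this model. */
#define EIOCBQUEUED 529

/* Hypothetical stand-in for the DIO layer: returns the number of bytes
 * readable at pos given the size seen under the lock, or 0 when pos is
 * at or beyond EOF (no I/O request is submitted in that case). */
static long model_direct_io(long pos, long len, long locked_size)
{
    if (pos >= locked_size)
        return 0;
    long avail = locked_size - pos;
    return len < avail ? len : avail;
}

/* Assumed pre-patch behaviour: any non-negative DIO result on an async
 * iocb is reported as queued, even when nothing was submitted. */
static long model_aio_read_old(long pos, long len, long checked_size,
                               long locked_size, bool sync)
{
    long ret = 0;
    if (pos < checked_size) {            /* unlocked size check */
        ret = model_direct_io(pos, len, locked_size);
        if (ret >= 0 && !sync)
            ret = -EIOCBQUEUED;
    }
    return ret;
}

/* Assumed post-patch behaviour: a 0 from the DIO layer is passed back
 * as 0, so the caller can complete the iocb immediately. */
static long model_aio_read_fixed(long pos, long len, long checked_size,
                                 long locked_size, bool sync)
{
    long ret = 0;
    if (pos < checked_size) {            /* unlocked size check */
        ret = model_direct_io(pos, len, locked_size);
        if (ret > 0 && !sync)
            ret = -EIOCBQUEUED;
    }
    return ret;
}
```

With `pos = 50`, `checked_size = 100` and `locked_size = 10` (the file was truncated between the check and the lock), the old model reports the request as queued although no I/O exists, while the fixed model returns 0.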
Testing:

I have tested the patch on an SMP machine (with 2 Pentium 4 (HT)) running linux-2.6.12-rc3-mm3. I ran the ltp-aiodio testcases and none of the fsx-linux tests hung. The aio-stress tests also ran without any problem.

--
thanks,

Suzuki K P
Linux Technology Centre
IBM Software Labs