Red Hat Bugzilla – Bug 456686
race in aio_complete() leads to process hang
Last modified: 2009-05-22 06:44:29 EDT
Description of problem:
There is a missing memory barrier in aio_complete() in the current RHEL4 kernels. This allows a race between read_events() and aio_complete() in which the thread in read_events() sleeps indefinitely, hanging the application that is waiting on I/O completion.

This was reported upstream by Quentin Barnes of Yahoo:
http://lkml.org/lkml/2008/3/12/207

The fix has been merged in 2.6.26:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6cb2a21049b8990df4576c5fce4d48d0206c22d5

It was also accepted for 2.6.24.y:
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.24.y.git;a=commit;h=0db49fc729eee503836ea12745b55f7f802d2abb

Version-Release number of selected component (if applicable):

How reproducible:
Unclear. Depends on the AIO application; Quentin reports seeing hangs virtually 100% of the time. Looking for a straightforward reproducer for this now and will update with details when they are available.

Steps to Reproduce:
< to be filled >

Actual results:
Application hangs in read_events()

Expected results:
No hang. AIO completes as normal.

Additional info:
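For illustration only, here is a userspace sketch of the general kind of race described above. It is not the kernel code and not the upstream patch. It assumes the problem has the usual "publish, then check for a waiter without holding the lock" shape: the completion side stores the new ring tail and then tests whether anyone is waiting, while the waiting side announces itself and then tests the tail. Without a full barrier between each store and the following load, both sides can read stale values, so the completer skips the wakeup and the waiter goes to sleep. All names (tail, waiter_active, complete_event, wait_for_event, run_once) are hypothetical stand-ins, and atomic_thread_fence() stands in for a kernel full barrier.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not kernel symbols. */
static atomic_int  tail;           /* stands in for the event ring tail */
static atomic_bool waiter_active;  /* stands in for "wait queue is non-empty" */
static bool woken;                 /* wakeup flag, protected by lock */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;

/* Completion side: publish the event, then wake a waiter if one is visible. */
static void complete_event(void)
{
    atomic_store_explicit(&tail, 1, memory_order_relaxed);
    /* Full barrier: orders the tail store before the unlocked waiter check. */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&waiter_active, memory_order_relaxed)) {
        pthread_mutex_lock(&lock);
        woken = true;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
    }
}

/* Waiting side: announce the waiter, then sleep only if no event is visible. */
static void *wait_for_event(void *arg)
{
    int *seen = arg;

    atomic_store_explicit(&waiter_active, true, memory_order_relaxed);
    /* Matching full barrier: orders the announcement before the tail check. */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&tail, memory_order_relaxed) == 0) {
        pthread_mutex_lock(&lock);
        while (!woken)
            pthread_cond_wait(&cond, &lock);
        pthread_mutex_unlock(&lock);
    }
    *seen = atomic_load_explicit(&tail, memory_order_relaxed);
    return NULL;
}

/* Run one waiter against one completion; returns the tail the waiter saw. */
static int run_once(void)
{
    pthread_t t;
    int seen = 0;
    int rc;

    atomic_store(&tail, 0);
    atomic_store(&waiter_active, false);
    woken = false;

    rc = pthread_create(&t, NULL, wait_for_event, &seen);
    assert(rc == 0);
    complete_event();
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    return seen;
}
```

With both fences in place, at least one side is guaranteed to observe the other's store, so run_once() always returns with the waiter having seen the event. Removing the fence in complete_event() reintroduces the window in which the wakeup can be skipped, although whether a hang actually shows up depends on the CPU and timing.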
Updating PM score.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Patch posted for review: http://post-office.corp.redhat.com/archives/rhkernel-list/2009-January/msg00042.html
Committed in 78.29.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-1024.html