Bug 456686 - race in aio_complete() leads to process hang
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.7
Hardware: All
OS: Linux
Priority: urgent
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Jeff Moyer
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks: 461304 475814 489935
Reported: 2008-07-25 15:33 UTC by Bryn M. Reeves
Modified: 2009-05-22 10:44 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-05-18 19:08:21 UTC
Target Upstream Version:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2009:1024 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 4.8 kernel security and bug fix update 2009-05-18 14:57:26 UTC

Description Bryn M. Reeves 2008-07-25 15:33:37 UTC
Description of problem:
A memory barrier is missing from aio_complete() in the current RHEL4 kernels.
This opens a race between read_events() and aio_complete() in which the thread
in read_events() can sleep indefinitely, hanging the application that is
waiting on I/O completion.
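
For illustration only, here is a minimal user-space sketch of this class of
race. It is not the kernel code and not the upstream patch. It assumes the
usual pattern where the completion side publishes an event and then checks
for a waiter, while the waiter queues itself and then re-checks for events.
The names ring_ctx, complete_event, may_sleep and the demo helper are
hypothetical, and the C11 seq_cst fence stands in for a kernel full memory
barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an AIO context: a ring tail plus a flag that
 * plays the role of "a waiter is on the wait queue". */
struct ring_ctx {
    atomic_uint tail;
    atomic_bool waiter_queued;
};

static void ring_init(struct ring_ctx *c)
{
    atomic_init(&c->tail, 0u);
    atomic_init(&c->waiter_queued, false);
}

/* Completion side: publish the event, then decide whether to wake.
 * Returns true if a wakeup must be issued. */
static bool complete_event(struct ring_ctx *c)
{
    atomic_fetch_add_explicit(&c->tail, 1u, memory_order_relaxed);
    /* Full barrier. Without it, the load below may be satisfied before the
     * tail update is visible to the waiter, so both sides can miss each
     * other: the waiter sees no event, the completer sees no waiter. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&c->waiter_queued, memory_order_relaxed);
}

/* Waiter side: queue first, then re-check the ring before sleeping.
 * Returns true if sleeping is safe (no event pending past 'head'). */
static bool may_sleep(struct ring_ctx *c, unsigned head)
{
    atomic_store_explicit(&c->waiter_queued, true, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&c->tail, memory_order_relaxed) == head;
}

/* Walks through both sequential orderings and counts lost wakeups, i.e.
 * cases where the waiter would sleep and the completer would not wake it. */
static int lost_wakeups_in_sequential_orders(void)
{
    struct ring_ctx c;
    int lost = 0;

    ring_init(&c);                      /* waiter first */
    bool sleeps = may_sleep(&c, 0u);
    assert(sleeps);                     /* ring is empty, so sleeping looks safe */
    bool wakes = complete_event(&c);
    if (sleeps && !wakes)
        lost++;

    ring_init(&c);                      /* completer first */
    wakes = complete_event(&c);
    sleeps = may_sleep(&c, 0u);
    if (sleeps && !wakes)
        lost++;

    return lost;
}
```

With a full fence on both sides, at least one side must observe the other's
store, so "waiter sleeps and completer skips the wakeup" cannot happen. The
sequential walk-through only demonstrates the two decision functions; the
hang itself needs the two sides running concurrently on different CPUs with
the completion-side barrier missing.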

This was reported upstream by Quentin Barnes of Yahoo:

http://lkml.org/lkml/2008/3/12/207

Fix has been merged in 2.6.26:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6cb2a21049b8990df4576c5fce4d48d0206c22d5

And was also accepted for 2.6.24.y:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.24.y.git;a=commit;h=0db49fc729eee503836ea12745b55f7f802d2abb

Version-Release number of selected component (if applicable):


How reproducible:
Unclear. Depends on the AIO application; Quentin reports seeing hangs virtually
100% of the time. Looking for a straightforward reproducer for this now and will
update with details when they are available.

Steps to Reproduce:
< to be filled >
  
Actual results:
Application hangs in read_events

Expected results:
No hang. AIO completes as normal.

Additional info:

Comment 1 RHEL Program Management 2008-09-03 13:12:05 UTC
Updating PM score.

Comment 2 RHEL Program Management 2008-09-22 17:53:39 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 3 Jeff Moyer 2009-01-05 19:50:24 UTC
Patch posted for review:
http://post-office.corp.redhat.com/archives/rhkernel-list/2009-January/msg00042.html

Comment 4 Vivek Goyal 2009-01-15 14:04:05 UTC
Committed in 78.29.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/

Comment 12 errata-xmlrpc 2009-05-18 19:08:21 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html

