Bug 446599 - jbd races lead to EIO for O_DIRECT
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.1
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Bryn M. Reeves
QA Contact: Martin Jenner
Depends On:
Blocks:
Reported: 2008-05-15 05:55 EDT by Bryn M. Reeves
Modified: 2011-01-24 18:01 EST

Doc Type: Bug Fix
Last Closed: 2009-01-20 14:46:42 EST


Attachments
testcase to trigger O_DIRECT EIO problem (580 bytes, text/x-csrc)
2008-05-15 05:55 EDT, Bryn M. Reeves
Patch correcting jbd races (5.67 KB, patch)
2008-06-10 14:21 EDT, Bryn M. Reeves

Description Bryn M. Reeves 2008-05-15 05:55:38 EDT
Description of problem:
When the attached test case is run on an ext3 file system, one of the
processes using direct I/O (O_DIRECT) eventually fails with EIO.

This has been reported to occur during, e.g., database load operations.

This only occurs on kernels that include the patch:

linux-2.6-fs-jbd-wait-for-t_sync_datalist-buf-to-complete.patch

Version-Release number of selected component (if applicable):
2.6.18-53.1.13 onwards

How reproducible:
100%

Steps to Reproduce:
1. Compile the attached testcase with:
$ gcc -Wall -D_GNU_SOURCE -o testcase testcase.c

2. Create a testfile:
dd if=/dev/zero of=testfile bs=64k count=1000

3. Run multiple copies of the test in parallel with half using direct I/O and
half using buffered I/O, e.g.:
# ./testcase & ./testcase -d & ./testcase & ./testcase -d & ./testcase &
./testcase -d & ./testcase & ./testcase -d
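
For readers without access to the attachment, the following is a minimal sketch of what one writer in such a test might look like. It is NOT the attached testcase.c: the function names open_testfile and write_pass, the 64 KiB block size (chosen to match bs=64k above), the 4096-byte buffer alignment, and the sequential overwrite pattern are all assumptions made for illustration.

```c
/* Hypothetical sketch only; not the attached testcase.c. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc only exposes O_DIRECT with this set */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE (64 * 1024)  /* assumed: matches bs=64k used for testfile */
#define ALIGNMENT  4096         /* assumed: safe alignment for O_DIRECT */

/* Open an existing file for writing; request O_DIRECT when direct != 0. */
int open_testfile(const char *path, int direct)
{
    int flags = O_WRONLY;

    if (direct) {
#ifdef O_DIRECT
        flags |= O_DIRECT;
#else
        errno = EINVAL;         /* platform does not expose O_DIRECT */
        return -1;
#endif
    }
    return open(path, flags);
}

/*
 * Overwrite the first nblocks blocks of the file once, front to back.
 * Returns 0 on success, or -1 with errno set. On an affected kernel the
 * failure described in this bug would surface here as errno == EIO.
 */
int write_pass(int fd, size_t nblocks)
{
    void *buf;
    size_t i;
    int err;

    /* O_DIRECT needs the buffer, length and offset suitably aligned. */
    assert(BLOCK_SIZE % ALIGNMENT == 0);
    err = posix_memalign(&buf, ALIGNMENT, BLOCK_SIZE);
    if (err != 0) {
        errno = err;
        return -1;
    }
    memset(buf, 'a', BLOCK_SIZE);

    for (i = 0; i < nblocks; i++) {
        ssize_t n = pwrite(fd, buf, BLOCK_SIZE, (off_t)i * BLOCK_SIZE);
        if (n != BLOCK_SIZE) {
            int saved = errno;
            free(buf);
            /* Sketch-level choice: report a short write as EIO. */
            errno = (n < 0) ? saved : EIO;
            return -1;
        }
    }
    free(buf);
    return 0;
}
```

A driver for this sketch would presumably parse -d, call open_testfile("testfile", direct), and repeat write_pass(fd, 1000) until it fails, printing the error with perror("write failed"). That would produce a message of the same form as the one shown under Actual results, but the real testcase may differ in every one of these details.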

  
Actual results:
[1] 18481
[2] 18482
[3] 18483
[4] 18484
[5] 18485
[6] 18486
[7] 18487
write failed: Input/output error


Expected results:
Test runs indefinitely without error

Additional info:
Several upstream threads discuss this:

http://lkml.org/lkml/2008/5/1/160
http://lkml.org/lkml/2008/5/12/193
Comment 1 Bryn M. Reeves 2008-05-15 05:55:39 EDT
Created attachment 305460
testcase to trigger O_DIRECT EIO problem
Comment 3 Issue Tracker 2008-05-20 15:06:40 EDT
Mirroring events from IT


This event sent from IssueTracker by balkov 
 issue 172641
Comment 4 Ben 2008-05-27 14:00:26 EDT
IT is refusing to mirror even when done manually...

----- Additional Comments From mranweil@us.ibm.com (prefers email at
mjr@us.ibm.com)  2008-05-27 13:30 EDT -------
The testcase ran fine over the long weekend with version 7 of the patch.  Elmar -
did it fix the problem for you, too?
Comment 5 Bryn M. Reeves 2008-06-10 14:21:26 EDT
Created attachment 308846
Patch correcting jbd races

This is the final version of the patch pushed upstream by IBM. It is now in -mm
and is expected to be merged in 2.6.26.
Comment 7 RHEL Product and Program Management 2008-07-25 13:03:51 EDT
This request was evaluated by Red Hat Product Management for
inclusion, but this component is not scheduled to be updated in
the current Red Hat Enterprise Linux release. If you would like
this request to be reviewed for the next minor release, ask your
support representative to set the next rhel-x.y flag to "?".
Comment 8 Brad Peters 2008-08-04 20:03:00 EDT
Posted for review - pending PM ack based on Joe K.'s request

http://post-office.corp.redhat.com/archives/rhkernel-list/2008-August/msg00097.html
Comment 10 RHEL Product and Program Management 2008-08-07 18:14:53 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 11 Don Zickus 2008-08-13 12:07:17 EDT
in kernel-2.6.18-104.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 12 Don Zickus 2008-08-13 13:25:46 EDT
in kernel-2.6.18-104.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5
Comment 19 errata-xmlrpc 2009-01-20 14:46:42 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html
