Bug 718232

Summary: [xfs] mis-sized O_DIRECT I/O results in hung task timeouts
Product: Red Hat Enterprise Linux 5 Reporter: Jeff Moyer <jmoyer>
Component: kernel Assignee: Jeff Moyer <jmoyer>
Status: CLOSED ERRATA QA Contact: Petr Beňas <pbenas>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 5.6 CC: andrei, branto, bstein, cww, dchinner, dhoward, eguan, esandeen, jmoyer, mgahagan, pbenas, pstehlik, rwheeler, syeghiay
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
A problem with the XFS dio error handling was discovered. If a misaligned write I/O operation was issued, XFS would return -EINVAL without unlocking the inode's mutex. This caused any further operations on the inode to become unresponsive. This update adds a missing mutex_unlock operation to the dio error path, solving this issue.
Story Points: ---
Clone Of: 716991 Environment:
Last Closed: 2012-02-21 03:42:48 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 695827, 716991    
Bug Blocks: 727590    

Comment 1 RHEL Program Management 2011-07-01 14:19:49 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 12 Jarod Wilson 2011-08-23 14:04:55 UTC
Patch(es) available in kernel-2.6.18-282.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.

Comment 13 Andrei Maslennikov 2011-08-31 16:35:52 UTC
I had an XFS-related issue, 00511623, open with Red Hat, which I just closed after trying the test kernel (-283.el5xen) as suggested by Jarod. The error I was observing was 100% reproducible with kernel -274.el5xen. To reproduce it, reboot into kernel -274.el5xen, place /usr/src/redhat on an XFS filesystem, and try to rebuild the coreutils-5.97-34 package with "rpmbuild --rebuild coreutils-5.97-34.el5.src.rpm". When the build reaches the post-compile tests, it hangs on the "shred" test. The same build completes correctly when /usr/src/redhat is on ext3.

This does not happen with the "-238.19.1.el5xen" kernel, and the problem was apparently fixed by the XFS-related patches in "-283.el5xen".

Andrei.

Comment 14 Martin Prpič 2011-09-08 14:59:00 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
A problem with the XFS dio error handling was discovered. If a misaligned write I/O operation was issued, XFS would return -EINVAL without unlocking the inode's mutex. This caused any further operations on the inode to become unresponsive. This update adds a missing mutex_unlock operation to the dio error path, solving this issue.
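
The failure mode described in the note above can be illustrated with a small user-space sketch. This is not the XFS code and not the actual patch: it assumes a pthread mutex standing in for the inode mutex, a hypothetical struct demo_inode, hypothetical function names, and an assumed 512-byte alignment requirement. It only shows why returning -EINVAL before unlocking leaves the lock held, and how routing the error through a single unlock path avoids that.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; NOT the kernel's struct inode. */
struct demo_inode {
    pthread_mutex_t lock;
};

/* Assumed alignment requirement, chosen for illustration only. */
#define DEMO_SECTOR_SIZE 512

static int demo_is_aligned(size_t offset, size_t len)
{
    return (offset % DEMO_SECTOR_SIZE == 0) && (len % DEMO_SECTOR_SIZE == 0);
}

/* Buggy pattern: the error return skips the unlock, so the lock stays held
 * and every later operation on this "inode" would block. */
int demo_dio_write_buggy(struct demo_inode *ip, size_t offset, size_t len)
{
    pthread_mutex_lock(&ip->lock);
    if (!demo_is_aligned(offset, len))
        return -EINVAL;             /* BUG: lock is still held here */
    /* ... the I/O would be submitted here ... */
    pthread_mutex_unlock(&ip->lock);
    return 0;
}

/* Fixed pattern: every exit, including the error path, goes through
 * the same unlock. */
int demo_dio_write_fixed(struct demo_inode *ip, size_t offset, size_t len)
{
    int ret = 0;

    pthread_mutex_lock(&ip->lock);
    if (!demo_is_aligned(offset, len)) {
        ret = -EINVAL;
        goto out_unlock;
    }
    /* ... the I/O would be submitted here ... */
out_unlock:
    pthread_mutex_unlock(&ip->lock);
    return ret;
}
```

After a misaligned call to demo_dio_write_buggy(), pthread_mutex_trylock() on the same lock reports EBUSY; after the same call to demo_dio_write_fixed(), the lock can be taken again. In the kernel the equivalent symptom is the hung task timeout named in the summary.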

Comment 16 Petr Beňas 2011-10-17 07:36:40 UTC
Reproduced in 2.6.18-275.el5 and verified in 2.6.18-276.el5.

Comment 17 errata-xmlrpc 2012-02-21 03:42:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0150.html