Bug 789452

Summary: bonnie++ fails on 6.2 kernels indicates bug in NFS
Product: Red Hat Enterprise Linux 6 Reporter: Colin.Simpson
Component: kernel Assignee: Jeff Layton <jlayton>
Status: CLOSED DUPLICATE QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high Docs Contact:
Priority: high    
Version: 6.2 CC: admin, bburke264, jcastillo, jlayton, jperrin, ksquizza, nragusa, pasteur, rwheeler, steved, toracat
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-05-01 13:28:59 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---

Description Colin.Simpson 2012-02-10 19:33:30 UTC
Description of problem:

If you run bonnie++ (from EPEL, though the source of the package seems immaterial) from a RHEL 6.2 machine against another RHEL 6 machine (NFSv4) or a RHEL 5 machine (NFSv3), it fails with:

Bonnie: drastic I/O error (rmdir): Directory not empty

This succeeded with the 6.1 kernels using an identical bonnie++ version, so it seems to have been broken in 6.2.

I'm concerned that this might indicate a serious problem with NFS in 6.2 that might hit us on production applications. 

Version-Release number of selected component (if applicable):


How reproducible:
Every time


Steps to Reproduce:
1. Just run bonnie++ against an NFS mount (-d flag)

  
Actual results:
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...Bonnie: drastic I/O error (rmdir): Directory not empty
Cleaning up test directory after error.


Expected results:

Successful bonnie++ run.

Additional info:

I see this in a pure RHEL 6.2 environment, but googling around shows that the CentOS guys have seen this issue too and have dug a bit deeper into it:

http://bugs.centos.org/view.php?id=5496

Might help with debugging this issue.

Comment 2 Ric Wheeler 2012-02-29 17:56:55 UTC
Thanks for the report - can you please also open a call with Red Hat support for us?

Regards,

Ric

Comment 3 Colin.Simpson 2012-02-29 18:04:29 UTC
One has been open for 2 weeks as Case#00598633.

Comment 4 Ric Wheeler 2012-02-29 18:50:09 UTC
Sorry, I missed that!

Comment 5 Jeff Layton 2012-03-30 19:42:02 UTC
I think what's needed here is some understanding of what's happening at the system call level. What syscall is generating the EIO in this case?

I saw some of the upstream discussion that referenced this bug. If the
problems are with readdir(), I wonder if the readdir fixes that have 
already been queued up for 6.3 will make any difference here. It might be
worthwhile to test a current 6.3-ish kernel and see if it helps.
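
One way to look at this below the bonnie++ level is a small script that walks through the same kind of sequence (create files, list the directory, unlink what the listing returned, then rmdir) and reports what is left behind if rmdir fails. This is only a minimal sketch: the function name, the file naming, and the default count are hypothetical, and it assumes the delete phase removes whatever a directory listing returns, which has not been confirmed against the bonnie++ source.

```python
import errno
import os
import tempfile


def create_list_delete(parent, count=1024):
    """Create `count` empty files in a fresh directory under `parent`,
    unlink every name that a directory listing returns, then rmdir it.

    Returns (listed, leftover): how many names the listing returned, and
    the names still present if rmdir failed with ENOTEMPTY.
    """
    workdir = tempfile.mkdtemp(prefix="rmdir-test-", dir=parent)
    for i in range(count):
        # Hypothetical naming scheme; any unique names would do.
        with open(os.path.join(workdir, "f%07d" % i), "w"):
            pass

    # If the listing is incomplete, some files survive the unlink loop.
    names = os.listdir(workdir)
    for name in names:
        os.unlink(os.path.join(workdir, name))

    try:
        os.rmdir(workdir)
    except OSError as exc:
        if exc.errno != errno.ENOTEMPTY:
            raise
        # A fresh listing shows which entries the first one missed.
        return len(names), sorted(os.listdir(workdir))
    return len(names), []
```

Calling `create_list_delete("/mnt/nfs")` (with the path of the NFS mount under test) and comparing the first value against `count` would show whether the listing came back short. Running the script under strace would additionally show the underlying getdents and rmdir calls and their return values.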

Comment 6 Jeff Layton 2012-03-30 20:23:17 UTC
If it does turn out to be reproducible on 6.3-ish kernels, then another thing to test would be to see if it's still reproducible if you mount with '-o nordirplus'.
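
When comparing runs with and without that option, it helps to confirm which options are actually in effect on the mount being tested. The sketch below parses text in /proc/mounts format and returns the filesystem type and option list for the mount containing a given path. The function name is hypothetical, escaped characters in mount points (such as \040 for a space) are not handled, and it is an assumption that the client kernel lists nordirplus among the displayed options when it is active.

```python
def mount_options_for(path, mounts_text):
    """Return (fstype, options) for the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format. The longest matching
    mount point wins. Returns None if no mount point matches.
    """
    wanted = path.rstrip("/") + "/"
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, fstype = fields[1], fields[2]
        options = fields[3].split(",")
        # Compare with a trailing slash so /mnt/nfs does not match /mnt/nfs2.
        prefix = mount_point.rstrip("/") + "/"
        if wanted.startswith(prefix):
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fstype, options)
    if best is None:
        return None
    return best[1], best[2]
```

For example, reading /proc/mounts into a string and calling `mount_options_for("/mnt/nfs", text)` returns the option list, which can then be checked for "nordirplus" before starting the bonnie++ run.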

Comment 7 Colin.Simpson 2012-03-31 10:52:17 UTC
I have a support call open on this. Will they be providing me with a 6.3 kernel to test?

Comment 8 Jeff Layton 2012-04-03 12:18:24 UTC
Yes, they should.

Comment 9 csb sysadmin 2012-04-25 21:14:57 UTC
I have the same issue from NFS clients running kernel 2.6.32-220.2.1.el6.x86_64 (CentOS 6.2) against a BlueArc Titan 3200 NFS server, with these mount options:

rw,proto=tcp,rsize=32768,wsize=32768,timeo=600,hard,intr,nfsvers=3,sloppy

I don't recall having this issue with earlier 6.x kernels or with 5.x kernels.

Comment 10 csb sysadmin 2012-04-25 21:20:57 UTC
Where are the new test kernels located? There's nothing at http://people.redhat.com/dzickus/rhel6/

Comment 13 Jeff Layton 2012-05-01 13:28:59 UTC
Ok, sounds like this was fixed by some patches that are going in for 6.3, so
closing this as a duplicate.

*** This bug has been marked as a duplicate of bug 770250 ***