Bug 789452

Summary:	bonnie++ fails on 6.2 kernels, indicating a bug in NFS
Product:	Red Hat Enterprise Linux 6	Reporter:	Colin.Simpson
Component:	kernel	Assignee:	Jeff Layton <jlayton>
Status:	CLOSED DUPLICATE	QA Contact:	Red Hat Kernel QE team <kernel-qe>
Severity:	high	Docs Contact:
Priority:	high
Version:	6.2	CC:	admin, bburke264, jcastillo, jlayton, jperrin, ksquizza, nragusa, pasteur, rwheeler, steved, toracat
Target Milestone:	rc
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2012-05-01 13:28:59 UTC	Type:	---

Description Colin.Simpson 2012-02-10 19:33:30 UTC

Description of problem:

If you run bonnie++ (from EPEL, though the source of the package seems immaterial) on a RHEL6.2 machine against an NFS mount exported by another RHEL6 machine (NFSv4) or a RHEL5 machine (NFSv3), it fails with:

Bonnie: drastic I/O error (rmdir): Directory not empty

This succeeded with the 6.1 kernels using the identical bonnie++ version, so it seems to have been broken in 6.2.

I'm concerned that this might indicate a serious problem with NFS in 6.2 that might hit us on production applications. 

Version-Release number of selected component (if applicable):


How reproducible:
Every time


Steps to Reproduce:
1. Just run bonnie++ against an NFS mount (-d flag)

  
Actual results:
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...Bonnie: drastic I/O error (rmdir): Directory not empty
Cleaning up test directory after error.


Expected results:

Successful bonnie++ run.

Additional info:

I'm testing in a pure RHEL6.2 environment and see this there. Googling around, I found that the CentOS folks have also seen this issue and have dug a bit deeper into it:

http://bugs.centos.org/view.php?id=5496

Might help with debugging this issue.

Comment 2 Ric Wheeler 2012-02-29 17:56:55 UTC

Thanks for the report - can you please also open a call with Red Hat support for us?

Regards,

Ric

Comment 3 Colin.Simpson 2012-02-29 18:04:29 UTC

One has been open for 2 weeks as Case#00598633.

Comment 4 Ric Wheeler 2012-02-29 18:50:09 UTC

Sorry, I missed that!

Comment 5 Jeff Layton 2012-03-30 19:42:02 UTC

I think what's needed here is some understanding of what's happening at the system call level. What syscall is generating the EIO in this case?

I saw some of the upstream discussion that referenced this bug. If the
problems are with readdir(), I wonder if the readdir fixes that have 
already been queued up for 6.3 will make any difference here. It might be
worthwhile to test a current 6.3-ish kernel and see if it helps.
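
One way to narrow down the failing call without the full benchmark is a small stand-alone script. The following is only a rough sketch, not taken from bonnie++: it assumes the failure lies in the sequence "create files, unlink them while reading the directory, rmdir" that the error message and the readdir() discussion point at. The function name create_delete_cycle, the file-name pattern, and the default file count are made up for illustration.

```python
import errno
import os
import tempfile


def create_delete_cycle(parent, count=1000):
    """Create `count` empty files, unlink them while iterating, then rmdir.

    Returns a list of (call, errno_name) tuples, one per failed call.
    An empty list means every call succeeded.
    """
    failures = []
    # Work in a fresh subdirectory so nothing else in `parent` is touched.
    workdir = tempfile.mkdtemp(prefix="rmdir-test-", dir=parent)

    for i in range(count):
        # Create an empty file.
        with open(os.path.join(workdir, "f%07d" % i), "w"):
            pass

    # os.scandir() reads the directory lazily, so on large directories
    # the unlink() calls are interleaved with further directory reads.
    with os.scandir(workdir) as entries:
        for entry in entries:
            try:
                os.unlink(entry.path)
            except OSError as exc:
                name = errno.errorcode.get(exc.errno, str(exc.errno))
                failures.append(("unlink", name))

    try:
        os.rmdir(workdir)
    except OSError as exc:
        # ENOTEMPTY here would match "Directory not empty" from bonnie++.
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        failures.append(("rmdir", name))

    return failures
```

Calling create_delete_cycle() with the path of the NFS mount (for example a hypothetical /mnt/nfs) and printing the result shows which call failed and with which errno. If rmdir reports ENOTEMPTY while no unlink failed, that would suggest the directory listing skipped entries rather than a delete going wrong. Running the script under strace would additionally show the underlying getdents calls.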

Comment 6 Jeff Layton 2012-03-30 20:23:17 UTC

If it does turn out to be reproducible on 6.3-ish kernels, then another thing to test would be to see if it's still reproducible if you mount with '-o nordirplus'.
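
Before comparing runs, it may be worth confirming that the option really took effect on the mount under test. The sketch below parses text in the /proc/mounts format and reports whether a given option is present for a mount point. The helper names are invented here, and whether the kernel lists nordirplus in /proc/mounts for a given NFS version is an assumption that should be checked on the actual client.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for `mount_point`, or None if not mounted.

    `mounts_text` is text in /proc/mounts format:
    device, mount point, fs type, options, dump, pass.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last match: a later mount shadows an earlier one.
            found = fields[3].split(",")
    return found


def has_option(mounts_text, mount_point, option):
    """True if `option` appears verbatim in the mount's option list."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and option in opts


# Hypothetical sample line; on a real client, read /proc/mounts instead.
sample = "server:/export /mnt/nfs nfs rw,vers=3,nordirplus 0 0\n"
print(has_option(sample, "/mnt/nfs", "nordirplus"))
```

On a live system the text would come from open("/proc/mounts").read(). Mount points containing spaces are escaped in that file (as \040) and are not handled by this sketch.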

Comment 7 Colin.Simpson 2012-03-31 10:52:17 UTC

I have a support call open on this. Will they be providing me with a 6.3 kernel to test?

Comment 8 Jeff Layton 2012-04-03 12:18:24 UTC

Yes, they should.

Comment 9 csb sysadmin 2012-04-25 21:14:57 UTC

I have the same issue from NFS clients running kernel 2.6.32-220.2.1.el6.x86_64 (CentOS 6.2) against a BlueArc Titan 3200 NFS server, with these mount options:

rw,proto=tcp,rsize=32768,wsize=32768,timeo=600,hard,intr,nfsvers=3,sloppy

I don't recall having this issue with earlier 6.x kernels or with 5.x kernels.

Comment 10 csb sysadmin 2012-04-25 21:20:57 UTC

Where are the new test kernels located? There's nothing at http://people.redhat.com/dzickus/rhel6/

Comment 13 Jeff Layton 2012-05-01 13:28:59 UTC

Ok, sounds like this was fixed by some patches that are going in for 6.3, so
closing this as a duplicate.

*** This bug has been marked as a duplicate of bug 770250 ***