Bug 119314

Summary: NFS performance degrades over time on an Opteron
Product: Red Hat Enterprise Linux 3 Reporter: David Alden <alden>
Component: kernel Assignee: Steve Dickson <steved>
Status: CLOSED NOTABUG QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0 CC: petrides, riel
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-04-15 11:13:18 UTC Type: ---
Attachments:
Description Flags
ethereal packets none

Description David Alden 2004-03-29 12:24:51 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1)
Gecko/20031114

Description of problem:
NFS slows down to a crawl on our Dual Opteron workstation over time.
I'm not sure if it's a gradual slowdown or it just hits a point and
goes slow from then on.  Rebooting temporarily fixes the problem.  

Note that this occurred under kernel-2.4.21-9.EL too.


Version-Release number of selected component (if applicable):
kernel-2.4.21-9.0.1.EL

How reproducible:
Always

Steps to Reproduce:
1.  Reboot
2.  Run programs that access NFS.


Actual Results:  NFS performance was abysmal.

Expected Results:  NFS performance should not degrade.

Additional info:

Comment 1 Steve Dickson 2004-04-05 13:00:09 UTC
Is kernel-2.4.21-9.0.1.EL running on the server, the client, or both?

Is nfsstat reporting retrans? Is nfsstat reporting badcalls?

Does the ethereal trace show any type of errors or retrans?


Just saying "NFS performance was abysmal" or "NFS performance 
should not degrade." w/out any supporting facts really helps nobody
(including yourself) and is truly a waste of time for me and you... 

Comment 2 David Alden 2004-04-05 19:14:49 UTC
The kernel-2.4.21-9.0.1.EL is for the client.  For the server, I've
tried 2.4.18-27.8.0bigmem, 2.4.22-1.2174.nptl and 2.4.21-9.EL.

nfsstat shows no "badcalls", but several retrans:

calls      retrans    authrefrsh
1366325    6227       0

5 minutes later:

calls      retrans    authrefrsh
1367381    6285       0
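
The two nfsstat samples above can be compared both cumulatively and over the 5-minute interval between them. The following is a minimal sketch of that arithmetic; the `RpcSample` class and `interval_ratio` helper are hypothetical names used only for illustration, and the numbers are copied from the output above.

```python
from dataclasses import dataclass


@dataclass
class RpcSample:
    """One reading of the client RPC counters reported by nfsstat."""
    calls: int
    retrans: int

    def ratio(self) -> float:
        # Cumulative retransmissions per call since the counters started.
        return self.retrans / self.calls if self.calls else 0.0


def interval_ratio(first: RpcSample, second: RpcSample) -> float:
    """Retransmissions per call between two samples (hypothetical helper)."""
    delta_calls = second.calls - first.calls
    delta_retrans = second.retrans - first.retrans
    return delta_retrans / delta_calls if delta_calls > 0 else 0.0


# Values taken from the two nfsstat outputs in this comment.
first = RpcSample(calls=1366325, retrans=6227)
second = RpcSample(calls=1367381, retrans=6285)

print(f"cumulative: {second.ratio():.2%}")
print(f"interval:   {interval_ratio(first, second):.2%}")
```

With these figures the cumulative ratio is about 0.46%, while the interval covers 1056 calls and 58 retransmissions, or about 5.5%. The cumulative counters include all traffic since they were last reset, so the interval figure may describe recent behavior more closely.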


Yes, ethereal shows many calls like:

[RPC retransmission of #3068]V3 GETATTR Call, FH:[...]
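
To see which requests are retransmitted most often, lines like the one above can be tallied per original frame number. This is only a sketch: it assumes the trace has been exported as text with one packet summary per line, and the `count_retransmissions` function and the sample lines (other than the GETATTR line quoted above) are hypothetical.

```python
import re
from collections import Counter

# Matches the marker shown in the ethereal summary line quoted above.
RETRANS_RE = re.compile(r"\[RPC retransmission of #(\d+)\]")


def count_retransmissions(lines):
    """Count retransmissions per original frame number (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = RETRANS_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Assumed text export: one packet summary per line.
sample = [
    "[RPC retransmission of #3068]V3 GETATTR Call, FH:[...]",
    "V3 GETATTR Reply",  # hypothetical non-retransmitted packet
    "[RPC retransmission of #3068]V3 GETATTR Call, FH:[...]",
]

counts = count_retransmissions(sample)
for frame, n in counts.most_common():
    print(f"frame #{frame}: {n} retransmission(s)")
```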


I'm sorry for the poor bug report -- I've not had troubles with NFS
in over 10 years, so I'm not sure how to go about debugging it.  :-)


Comment 3 Steve Dickson 2004-04-06 15:26:20 UTC
hmm... the ratio of calls versus retrans is really
not that bad... I've seen worse... 

Are the mounts over UDP? If so, try using TCP to see
if that helps.

Also, if the trace is not too big, post a bzip2 compressed
copy of the ethereal trace.

Comment 4 David Alden 2004-04-07 15:46:13 UTC
Created attachment 99193 [details]
ethereal packets

The mounts were done over UDP.  I've tried doing them over TCP to a
2.4.22-1.2174.nptl kernel and I get the same results.

The trace is ~225K bzip2'ed -- should I attach a copy?  I've pulled the
last few packets out (they included some of the retransmissions) and will
attach that -- let me know if you'd like the whole thing.

Comment 5 Steve Dickson 2004-04-15 11:13:18 UTC
From looking at the last trace, it really appears to be some
type of network issue.... especially if using TCP does not help....