Bug 70080

Summary: high load NFS performance
Product: [Retired] Red Hat Linux
Component: kernel
Version: 7.2
Hardware: i386
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: medium
Reporter: Need Real Name <aander07>
Assignee: Steve Dickson <steved>
QA Contact: Brian Brock <bbrock>
CC: k.georgiou, steved
Doc Type: Bug Fix
Last Closed: 2004-01-12 12:34:43 UTC

Description Need Real Name 2002-07-30 03:10:22 UTC
Description of Problem:

I have observed that some resource allocated on a per-mount basis becomes a
performance bottleneck under high NFS I/O load.  The symptoms are that file
operations on the affected mount slow noticeably under load (4 seconds to run
'df' on the mount, stat() taking 2-5 seconds, etc.).  Other mounts to the same
NFS server still respond in a reasonable time frame, even when they are simply
a second mount of the same export.
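
An illustrative way to quantify the per-mount latency is to time stat() and
statfs() (the call behind 'df') against each mount and compare the loaded
mount with an idle one.  This is only a sketch; the file name stat_timer.c and
the mount paths passed on the command line are placeholders:

/* stat_timer.c - rough timing of stat()/statfs() on an NFS mount.
 * Build: cc -o stat_timer stat_timer.c
 * Run:   ./stat_timer /mnt/nfs1 /mnt/nfs2
 */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/vfs.h>
#include <sys/time.h>
#include <unistd.h>

static double elapsed_ms(struct timeval *a, struct timeval *b)
{
    return (b->tv_sec - a->tv_sec) * 1000.0 +
           (b->tv_usec - a->tv_usec) / 1000.0;
}

int main(int argc, char **argv)
{
    int i;

    for (i = 1; i < argc; i++) {
        struct stat st;
        struct statfs sfs;
        struct timeval t0, t1, t2;

        gettimeofday(&t0, NULL);
        if (stat(argv[i], &st) < 0)
            perror("stat");
        gettimeofday(&t1, NULL);
        if (statfs(argv[i], &sfs) < 0)   /* what 'df' does per mount */
            perror("statfs");
        gettimeofday(&t2, NULL);

        printf("%s: stat %.1f ms, statfs %.1f ms\n",
               argv[i], elapsed_ms(&t0, &t1), elapsed_ms(&t1, &t2));
    }
    return 0;
}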

These symptoms lead me to believe that either a per-mount lock is being held,
or a per-mount linked list is being walked linearly, and that it fails to
scale under a high rate of NFS operations (1300-1800 NFS ops/sec,
>20 Mbytes/sec of I/O).
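
For reference, figures like these can be derived by sampling the client RPC
call counter over an interval.  A minimal sketch, assuming the counters in
/proc/net/rpc/nfs (the same ones nfsstat -c reads); the file name
nfs_opsrate.c and the 5-second interval are just placeholders:

/* nfs_opsrate.c - estimate client NFS ops/sec from /proc/net/rpc/nfs.
 * Build: cc -o nfs_opsrate nfs_opsrate.c
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static long read_rpc_calls(void)
{
    char line[256];
    long calls = -1;
    FILE *f = fopen("/proc/net/rpc/nfs", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        /* the "rpc <calls> <retrans> ..." line holds the total call count */
        if (strncmp(line, "rpc ", 4) == 0) {
            sscanf(line, "rpc %ld", &calls);
            break;
        }
    }
    fclose(f);
    return calls;
}

int main(void)
{
    const int interval = 5;        /* seconds between samples */
    long before, after;

    before = read_rpc_calls();
    sleep(interval);
    after = read_rpc_calls();

    if (before < 0 || after < 0) {
        fprintf(stderr, "cannot read /proc/net/rpc/nfs\n");
        return 1;
    }
    printf("~%ld NFS ops/sec\n", (after - before) / interval);
    return 0;
}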

Were any NFS patches rolled into 2.4.18 that may address this?  If not,
has anyone been in touch with Trond about the current stability of his
latest NFS patch sets?

Version-Release number of selected component (if applicable):
2.4.9-31 and 2.4.9-34

Comment 1 Need Real Name 2002-08-05 15:14:52 UTC
I have also tested 2.4.18-5; the maximum throughput under that kernel is
around 30Mb/s.  The long waits for stat() and friends did not show up, but
that could simply be because the server is not handling as much traffic,
given the new upper performance limit in 2.4.18-5.

I have also observed that 2 of the 4 hosts I have installed it on have had
processes go into device wait and not recover, while new processes have no
problem accessing the same NFS mount point.  One of the affected hosts has
1 GB of RAM, the other has 2 GB.
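
"Device wait" here means the processes are stuck in uninterruptible sleep
(the 'D' state that ps reports).  As a rough illustration, one way to confirm
which processes are stuck is to walk /proc and check the state field of each
/proc/<pid>/stat; the file name dstate.c is just a placeholder:

/* dstate.c - list processes stuck in uninterruptible sleep ("device wait").
 * Build: cc -o dstate dstate.c
 */
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>

int main(void)
{
    DIR *proc = opendir("/proc");
    struct dirent *de;

    if (!proc) {
        perror("opendir /proc");
        return 1;
    }
    while ((de = readdir(proc)) != NULL) {
        char path[288], comm[64], state;
        int pid;
        FILE *f;

        if (!isdigit((unsigned char)de->d_name[0]))
            continue;            /* only numeric entries are processes */
        snprintf(path, sizeof(path), "/proc/%s/stat", de->d_name);
        f = fopen(path, "r");
        if (!f)
            continue;
        /* /proc/<pid>/stat format: pid (comm) state ... */
        if (fscanf(f, "%d (%63[^)]) %c", &pid, comm, &state) == 3 &&
            state == 'D')
            printf("%d %s in D state\n", pid, comm);
        fclose(f);
    }
    closedir(proc);
    return 0;
}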

Comment 2 Need Real Name 2002-10-18 15:12:37 UTC
2.4.18-17.7.x still exhibits the same device wait behavior as 2.4.18-5 and
2.4.18-10 did.

Comment 3 Steve Dickson 2003-02-17 14:59:14 UTC
How did you generate the "high NFS I/O loads"? 

How did you get these numbers (1300-1800 NFS ops/sec, >20Mbytes/sec I/O)?