I have three Red Hat systems: one is an NFS server and two are NFS clients. They are:

server:  Red Hat 6.0, kernel 2.2.13 SMP, knfsd 1.4.7-7
client1: Red Hat 6.0, kernel 2.2.12-20smp, knfsd 1.4.7-7, knfsd-clients 1.4.7-7
client2: Red Hat 6.0, kernel 2.2.13, knfsd 1.4.7-7, knfsd-clients 1.4.7-7

The client systems are periodically losing contact with the server. When this happens it is not for a long time, maybe several seconds, but it causes pauses on the client side while people are trying to access files. These pauses are very frustrating to the users, and this is happening about 30 times a day. Here are log messages from the systems:

server:
Jan 12 14:21:40 khan kernel: fh_verify: sweets/.nfs0003d803000000e9 permission failure, acc=2, error=13
Jan 12 14:59:02 khan kernel: fh_verify: a/admin permission failure, acc=1, error=13
Jan 12 14:59:49 khan kernel: fh_verify: a/admin permission failure, acc=1, error=13
Jan 12 14:59:49 khan kernel: fh_verify: a/admin permission failure, acc=1, error=13
Jan 12 15:01:38 khan kernel: fh_verify: d/daver permission failure, acc=1, error=13
nfsd_d_validate: invalid address feebbaca
nfsd_d_validate: invalid address feebbaca
get_empty_dquot: pruning 465

client1:
Jan 11 14:13:21 spock kernel: nfs: task 23473 can't get a request slot
Jan 12 14:01:08 spock kernel: nfs: server khan2 OK
Jan 12 14:02:03 spock kernel: nfs: server khan2 not responding, still trying
Jan 12 14:02:03 spock kernel: nfs: server khan2 OK
Jan 12 14:09:39 spock kernel: nfs: server khan2 not responding, still trying
Jan 12 14:09:39 spock kernel: nfs: server khan2 not responding, still trying
Jan 12 14:09:50 spock kernel: nfs: task 37169 can't get a request slot
Jan 12 14:09:58 spock kernel: nfs: server khan2 OK
Jan 12 14:21:42 spock kernel: NFS: can't silly-delete sweets/.nfs0003d803000000e9, error=-13

client2:
Jan 12 15:07:41 locutus kernel: nfs: task 34523 can't get a request slot
Jan 12 15:07:42 locutus kernel: nfs: task 34524 can't get a request slot
Jan 12 15:14:12 locutus kernel: nfs: server khan2 not responding, still trying
Jan 12 15:14:15 locutus kernel: nfs: server khan2 OK
Jan 12 15:14:15 locutus kernel: nfs: server khan2 OK
Jan 12 15:15:33 locutus kernel: nfs: server khan2 not responding, still trying
Jan 12 15:15:34 locutus kernel: nfs: server khan2 not responding, still trying
Jan 12 15:15:39 locutus kernel: nfs: server khan2 OK
Jan 12 15:15:39 locutus kernel: nfs: server khan2 OK

Let me know if there is any more information I can provide to aid in solving this problem.

Thanks,
Luke
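(For readers hitting the same "not responding, still trying" / "can't get a request slot" pattern: the pauses can often be softened on the client by tuning NFS mount options. This is only an illustrative mitigation, not the fix eventually identified in this report; the export and mount point names below are placeholders.)

```
# Illustrative /etc/fstab entry on a client: keep "hard" semantics but make
# interrupts possible and stretch the retry behavior so brief server stalls
# cause fewer visible errors. timeo is in tenths of a second.
khan2:/export  /mnt/export  nfs  rw,hard,intr,timeo=14,retrans=5  0 0
```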
There is some heavy networking going on there that causes the kernel on the client side to run out of available sockets. I doubt this is related to the NFS server. Adjusted priorities and severity of the problem report.
Do you know where I would look to see about adjusting the number of sockets on the client system?
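(If socket buffer limits are indeed the bottleneck, which is an assumption here, the 2.2-era kernels expose the relevant knobs under /proc/sys/net/core. A minimal sketch for inspecting them:)

```shell
# Read the current default and maximum socket receive buffer sizes.
rmem_default=$(cat /proc/sys/net/core/rmem_default)
rmem_max=$(cat /proc/sys/net/core/rmem_max)
echo "default receive buffer: $rmem_default bytes (max: $rmem_max)"
# To raise the ceiling, as root (262144 is an illustrative value, not a
# recommendation from this thread):
#   echo 262144 > /proc/sys/net/core/rmem_max
```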
assigned to johnsonm
We have 3 Dell Precision 420 workstations, 2 with single CPUs (the clients/desktops) and one with two CPUs (intended as a compute/file/print/web server). Each workstation exports file systems via NFS to the other two. Accessing the NFS-mounted file systems on the server from the clients often results in hangups ("NFS task xxx can't get a request slot") on the clients lasting from a few seconds up to several minutes. This must be a problem with the SMP kernel: if I run the single-processor kernel (RH 2.2.16-22 in both cases) on the server, the problem does not exist. There is also no problem accessing the file systems on the clients from the server. Network traffic is always low; ping gives times around 150 microseconds even during an NFS hangup. The load on the server is also very low.
The running out of slots and other problems listed above were mainly fixed by the newer nfsd that shipped with Red Hat Linux 7.2. See the rpc.nfsd man page for how to set the number of threads to start; on Red Hat systems this is configured in /etc/rc.d/init.d/nfs.
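(A sketch of the relevant part of the Red Hat init script, for context. The exact contents vary by release; the excerpt below is illustrative, not copied from a specific version.)

```shell
# Excerpt-style illustration of /etc/rc.d/init.d/nfs: RPCNFSDCOUNT is the
# number of nfsd threads rpc.nfsd starts. Raising it from the shipped
# default can help a loaded server answer requests promptly.
RPCNFSDCOUNT=16

# Later in the script's start) case, the count is passed to rpc.nfsd:
#   daemon rpc.nfsd $RPCNFSDCOUNT
```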