Bug 38392

Summary:	kernel: nfs: task 264936 can't get a request slot - after upgrading to 2.2.19 / nfs-utils-0.3.1-0.6.x.1
Product:	[Retired] Red Hat Linux	Reporter:	Bruce Garlock <bruce>
Component:	nfs-utils	Assignee:	Pete Zaitcev <zaitcev>
Status:	CLOSED WORKSFORME	QA Contact:	David Lawrence <dkl>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	6.2	CC:	bruce, magnus.moren, oabundes
Target Milestone:	---
Target Release:	---
Hardware:	i386
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Last Closed:	2002-11-12 20:47:57 UTC	Type:	---

Description Bruce Garlock 2001-04-30 13:40:39 UTC

From Bugzilla Helper:
User-Agent: Mozilla/4.77 [en] (Win95; U)


After upgrading to the latest RH 6.2 kernel (2.2.19) and updating mount, losetup, and nfs-utils, I get this error on my console:
kernel: nfs: task 264936 can't get a request slot
....
....
....

I did not get this message with the 2.2.16 kernel. The error also seems to happen only while my system is backing up. I back up a few NFS mounts from a SCO Unix Openserver 5.0.4 NFS server. The total backup time has also increased as a result of these errors, but I get no errors when the backup is verified (bit-level verification), so the data seems to be getting there, just a lot more slowly.

I seem to have no problems accessing the nfs mount, and reading files during normal operations.

Any suggestions?

Reproducible: Always
Steps to Reproduce:
1.  Overnight backups will trigger this nfs error.

Comment 1 Rex Dieter 2001-04-30 14:23:25 UTC

FYI, the "can't get a request slot" error is usually the result of network
congestion or misconfiguration, or of a badly behaving network driver.  What
make/model of network card do you use?

I'd suggest trying a different one to see if the problem goes away.

Comment 2 Bruce Garlock 2001-04-30 15:20:37 UTC

The network card is built into the server (IBM Netfinity-5000).  It uses the pcnet32 module.  Since the only thing I did was upgrade the kernel and its dependencies, I would suspect the pcnet32 module: my network has not changed in months, and it was working fine before the kernel upgrade.  I guess I could test this next weekend by booting back into 2.2.16 to see if that clears up the issue.  Any other ideas?

Comment 3 Rex Dieter 2001-04-30 18:21:24 UTC

Another thing to consider... Are you mounting NFS v2 or v3?  Whatever you're 
currently using, try the other one, provided the SCO server supports v3.

And, for the record, since the Linux machine is acting as the 'client' in this
case, nfs-utils is most likely not involved (it is primarily the 'server'
NFS component), though it does handle client-side file locking, which you
could try turning off to see if it helps:
/etc/rc.d/init.d/nfslock stop

Comment 4 Bruce Garlock 2001-05-01 12:42:15 UTC

I turned off nfslock and still got the errors during my overnight backups.  I'm waiting for a response to a question I posted to comp.unix.sco.misc about which version of the NFS server runs on SCO Openserver 5.0.4.  Maybe someone from that newsgroup will be able to help with this problem.  Another question: should I change any mount options?  Right now they all look like this:

Server:/usr/covalent/users/miked /opt/work/users/miked/covalent nfs rsize=8192,wsize=8192,hard,intr 0 0

Maybe the linux client does not like this anymore?
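
For anyone who wants to experiment with the options, here is a rough Python sketch that splits the fstab entry quoted above into its six fields and writes it back with one option changed. The helper names are made up for this example, and nfsvers=3 is only an assumed option to try (following the earlier suggestion to switch NFS versions); nothing in this report confirms that it helps.

```python
# Sketch only: parse an /etc/fstab NFS entry and rewrite one mount option.
# The entry is copied from this comment; helper names are hypothetical.

ENTRY = ("Server:/usr/covalent/users/miked /opt/work/users/miked/covalent "
         "nfs rsize=8192,wsize=8192,hard,intr 0 0")


def parse_fstab_line(line):
    """Split one fstab line into its six whitespace-separated fields."""
    device, mountpoint, fstype, options, dump, passno = line.split()
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": options.split(","),
        "dump": int(dump),
        "passno": int(passno),
    }


def with_option(entry, key, value=None):
    """Return a copy of entry with option `key` set, replacing any old value."""
    kept = [o for o in entry["options"] if o.split("=", 1)[0] != key]
    kept.append(key if value is None else f"{key}={value}")
    return {**entry, "options": kept}


def format_fstab_line(entry):
    """Join the fields back into a single fstab line."""
    return " ".join([
        entry["device"], entry["mountpoint"], entry["fstype"],
        ",".join(entry["options"]), str(entry["dump"]), str(entry["passno"]),
    ])


if __name__ == "__main__":
    entry = parse_fstab_line(ENTRY)
    # Assumption: try NFS v3, provided the server supports it.
    print(format_fstab_line(with_option(entry, "nfsvers", 3)))
```

The same with_option helper could be used to try smaller rsize/wsize values, changing one option at a time so that any difference in the backup run can be attributed to it.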

Comment 5 Bob Matthews 2001-05-01 14:36:42 UTC

Just FYI:  as of 2.2.19, the kernel starts lockd automatically as needed, so
this is probably not related to your problem.

Comment 6 Bruce Garlock 2001-05-01 17:39:23 UTC

For the time being, I am excluding all NFS mounts from my backup.  'sar' on the SCO box reports heavy system usage during the backup (much more than during the same window before the 2.2.19 upgrade), and the last thing I need is a page at 0300.  I'll write up a quick script that tars and bzips the directories I need and FTPs the archive over to the Linux box.  Hopefully I'll get a chance to boot back into 2.2.16 this weekend and see if the problem persists, but I really think this is related to the 2.2.19 upgrade.

Comment 7 Bryan Cater 2001-05-14 14:02:40 UTC

I've had similar problems with Red Hat 7.1 running the 2.4.2-2 kernel.  When I
mount a remote file system from a Solaris 2.8 box onto my Linux laptop, I'm able
to create files and I'm able to read them, but if I try to copy them from the
remote fs to the local fs the process freezes and never completes; as a matter
of fact, the mounted fs then has to be unmounted and remounted to get access
again.  I hope they fix these problems soon; it'd be nice if they did more
testing before releasing a new version of the OS.

Comment 8 Oscar Abundes 2002-11-12 20:23:36 UTC

I'm getting the same problem with a RH 7.3, kernel 2.4.18-3smp, NFS client of a 
Solaris 8 2/02 server:

.....
Nov 11 09:35:14 hcidaldelws70 kernel: nfs: task 43977 can't get a request slot
.....
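
To check whether these messages line up with a particular time window (a backup run, for example), one option is to count them per hour. Below is a minimal Python sketch, assuming the classic syslog line format shown above; the function name and the regular expression are my own, and the /var/log/messages path in the usage note is an assumption about where syslog writes kernel messages.

```python
import re
from collections import Counter

# Matches syslog lines like the one quoted above:
#   "Nov 11 09:35:14 <host> kernel: nfs: task <n> can't get a request slot"
SLOT_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"(?P<host>\S+)\s+kernel:\s+nfs: task (?P<task>\d+) "
    r"can't get a request slot"
)


def count_by_hour(lines):
    """Count request-slot errors per (month, day, hour)."""
    counts = Counter()
    for line in lines:
        m = SLOT_RE.match(line)
        if m:
            counts[(m["month"], int(m["day"]), int(m["hour"]))] += 1
    return counts


SAMPLE = [
    # Real line, copied from this comment.
    "Nov 11 09:35:14 hcidaldelws70 kernel: nfs: task 43977 can't get a request slot",
    # Made-up non-matching line, only here to show that it is skipped.
    "this line is not an NFS error",
]

if __name__ == "__main__":
    for (month, day, hour), n in sorted(count_by_hour(SAMPLE).items()):
        print(f"{month} {day:2d} {hour:02d}:00  {n}")
```

To run it against a real log, pass an open file instead of SAMPLE, e.g. count_by_hour(open("/var/log/messages")), and compare the busy hours with the network and server load at those times.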

Possible culprits I've come up with from researching the problem:

1. Network congestion. However, our network load has not changed. The RH
machine is just another NFS client in our landscape, doing the same thing as
all the other ones (which are all Solaris, though).

2. Busy servers. We have two and they're not. Once clients automount data 
directories, NFS traffic is pretty low.

3. Bad NIC. It's a possibility but this is a brand new machine. I've already 
eliminated the switch and cabling to the wiring closet by connecting it to the 
same port where a Solaris 8 client was happily working.

Someone suggested getting the latest kernel available via up2date. Does anyone
know if the latest kernel has NFS fixes?

Comment 9 Pete Zaitcev 2002-11-12 20:30:53 UTC

The bad news is that, because of the way errors are propagated by the
sunrpc stack, the "can't get a request slot" error may have any of a
thousand different causes.

Kernel 2.4.18-17 may fix the problem, or perhaps not. I think it is
worth downloading and trying; its NFS is better than it was.
Also, victims of early Intel e100s are helped by a newer driver
(this has nothing to do with NFS itself, but NFS is more sensitive
than, say, FTP).

Comment 10 Pete Zaitcev 2002-11-12 20:39:59 UTC

Oh, and BTW, I just noticed we are hijacking a 6.2 bug report.
Please file new bugs for kernel 2.4 based clients;
it's a somewhat different code base.

Comment 11 Oscar Abundes 2002-11-12 20:47:51 UTC

Yes, I realized this bug referred to RH 6.2 but I decided to comment because I 
saw no resolution posted. My apologies. I'll file a report for 7.3. Thank you.

BTW, I will load the latest kernel and test to see if the problem goes away.

Comment 12 Pete Zaitcev 2002-11-15 16:40:49 UTC

Oscar says 2.4.18-17 works for him.

I am closing this one, because we only do security fixes for 6.2 by now.
Don't hesitate to file new bugs against contemporary releases.