Bug 7483
Summary: | knfsd stops functioning. | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Rex Dieter <rdieter> |
Component: | nfs-utils | Assignee: | Michael K. Johnson <johnsonm> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 7.0 | CC: | a.j.every, bartschies, dch, johnb, kirk.erickson, mapatw=bugs, ung |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2003-01-25 00:56:30 UTC | Type: | --- |
Description
Rex Dieter
1999-12-01 16:55:04 UTC
We were experiencing this problem for months. It frequently occurred whenever the NFS server was under high load (during backups, for example). I couldn't find a solution anywhere, so I finally bit the bullet and commented out the piece of kernel code that was triggering the errors. We haven't had any NFS problems since. Maybe the folks at Red Hat have a less risky solution?

As per johnb's comments: Can you give a few more details pertaining to your mention of "commenting the piece of kernel code triggering the error"?

We may have a very similar problem. We export our home directories from a Red Hat 6.1 system (kernel 2.2.12-20). After a period of usage, sometimes as much as a day :-), the system becomes overrun with stale file handles. We are using knfsd-1.4.7 and have recently tried the latest stable kernel (2.2.14). None of this has improved the problem. This causes us approximately one hour of downtime every two days and seems to be related to load. We do not have an environment where people are heavily sharing files, so we cannot understand why so many stale file handles exist. We are having massive problems with this, and if we can't find a workaround soon we will have to shift all our home filespace back to our slower Solaris server. I don't really want this extra work.

Our problems have almost completely gone away since:
1. We started using far fewer non-Linux clients (in our case, NeXTSTEP).
2. We reconfigured NIS and /etc/nsswitch.conf to NOT use NIS for hostname lookups.
3. We upgraded to kernel-2.2.14-1.3.0 (it was once available at rawhide). I wouldn't hesitate to say that an upgrade from 2.2.12-20 is absolutely essential. I haven't upgraded further simply because we've had problem-free uptimes of 1-2 months. (If it ain't broke...)
4. rpc.mountd DOES still occasionally die (once every ~2 weeks), preventing any new mounts. I think this is related to hostname lookup problems (our campus DNS servers crash semi-often).
I wrote a little /etc/cron.hourly script to check for rpc.mountd's existence, and to relaunch it if necessary:

------ /etc/cron.hourly/rpc.mountd -------- snip ------
#!/bin/sh
. /etc/rc.d/init.d/functions
dead=0
prog=rpc.mountd
pid=`pidof $prog`
#Only do check if nfs subsystem is activated
if [ -f /var/lock/subsys/nfs ]; then
  if [ "$pid" != "" ]; then
    dead=0
  else
    dead=1
    date
    echo -n "$prog dead... restarting:"
    daemon /usr/sbin/rpc.mountd --no-nfs-version 3
  fi
fi
-------- /etc/cron.hourly/rpc.mountd ------- snip ------

assigned to johnsonm
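For anyone trying the second workaround listed above (taking NIS out of hostname lookups), here is a rough sketch of a check that could be run on the server. It is not from the original report: the function name hosts_uses_nis is made up for illustration, and it assumes the usual nsswitch.conf layout, where a single "hosts:" line lists the lookup sources in order.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original report): return success
# if the "hosts:" line of an nsswitch.conf-style file lists "nis" as a
# lookup source. Defaults to /etc/nsswitch.conf when no file is given.
hosts_uses_nis() {
    conf="${1:-/etc/nsswitch.conf}"
    # Strip comments, print the first remaining "hosts:" line and stop,
    # split it into one word per line, then look for the exact word "nis".
    sed -n 's/#.*//; /^[[:space:]]*hosts:/{p;q;}' "$conf" |
        tr -s '[:space:]' '\n' |
        grep -q -x 'nis'
}

# Example use: warn if hostname lookups still consult NIS.
if [ -f /etc/nsswitch.conf ] && hosts_uses_nis; then
    echo "warning: hosts lookups in /etc/nsswitch.conf still use NIS"
fi
```

If the check reports NIS, a hosts line such as "hosts: files dns" would keep hostname resolution away from NIS. Whether that is appropriate depends on the site's setup.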