Bug 56665 - nfsd fails to serve exports after a few minutes uptime in 2.4.9-12smp
Status: CLOSED NOTABUG
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i686 Linux
Priority: medium  Severity: medium
Assigned To: Steve Dickson
Brock Organ
Depends On:
Blocks:
Reported: 2001-11-23 13:17 EST by Paul Raines
Modified: 2007-04-18 12:38 EDT (History)
1 user

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2004-08-11 06:54:09 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Paul Raines 2001-11-23 13:17:21 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.4)
Gecko/20011019 Netscape6/6.2

Description of problem:
I just installed RH7.1 on a machine, upgraded to the latest errata
versions, and installed the 2.4.9-12smp kernel.  The machine has an
internal SCSI disk and an external IDE->SCSI RAID.  I converted all
partitions except / to ext3; these are exported.  I start nfs and can
mount the exports just fine from a couple of clients.  I then start a
script that loops over many tens of clients, mounting an export off
the server.  After about ten or so mounts, further mounts stop working
and I get I/O errors.  Unmounting on a previously successful client and
retrying the mount also fails.

I can '/etc/init.d/nfs restart' and things start working again for
another few minutes.

If I downgrade to the RH 2.4.7-2.9 kernel (which I patched for ext3),
the problem goes away.
Mounts of the ext2 root volume also fail, so I don't think it is an
ext3 problem.

I tried upgrading nfs-utils and mount to rawhide versions, but the
problem did not go away.


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Upgrade to the 2.4.9-12 kernel
2. Export several mount points
3. Mount the exports from several clients until you get an I/O error
   and the server refuses further mounts

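The reproduction loop described above can be sketched as a short dry-run shell script; the client hostnames (node01..node20), the ssh fan-out, and the /mnt mount point are assumptions for illustration, not details from the report:

```shell
#!/bin/sh
# Sketch of the reproduction loop: mount the same export from many
# clients in sequence.  RUN=echo makes this a dry run that only prints
# the commands; clear it to actually execute them.
SERVER=monte
EXPORT=/local_mount/homes/monte/1
RUN=echo
for i in $(seq -w 1 20); do
    # On each client in turn, try to mount the export from the server.
    $RUN ssh "root@node$i" "mount -t nfs $SERVER:$EXPORT /mnt"
done
```

In the reported setup the failures began after roughly ten such mounts.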
Actual Results:  Could no longer mount NFS-exported volumes off the server

Expected Results:  Should have been able to mount the exported NFS volumes

Additional info:

When the problem first starts, you will see messages like this in
syslog:

Nov 23 12:36:46 monte rpc.mountd: authenticated mount request from
132.183.203.39:1022 for /local_mount/homes/monte/1
(/local_mount/homes/monte/1) 
Nov 23 12:36:46 monte last message repeated 19 times

However, soon no messages appear for other attempted mounts, so
rpc.mountd is probably completely locked up.

Here is the /etc/exports file:
===============
/local_mount/homes/monte/1 \
  @all(rw) \
  192.168.100.0/255.255.255.0(rw,no_root_squash,insecure)

/local_mount/space/monte/1 \
  @all(rw) \
  192.168.100.0/255.255.255.0(rw,no_root_squash,insecure)

/local_mount/space/monte/2 \
  @all(rw) \
  192.168.100.0/255.255.255.0(rw,no_root_squash,insecure)

/local_mount/space/monte/3 \
  @all(rw) \
  192.168.100.0/255.255.255.0(rw,no_root_squash,insecure)

/local_mount/space/monte/4 \
  @all(rw) \
  192.168.100.0/255.255.255.0(rw,no_root_squash,insecure)

/local_mount/space/monte/5 \
  @all(rw) \
  192.168.100.0/255.255.255.0(rw,no_root_squash,insecure)

/local_mount/space/monte/6 \
  @all(rw) \
  192.168.100.0/255.255.255.0(rw,no_root_squash,insecure)

/local_mount/space/monte/7 \
  @all(rw) \
  192.168.100.0/255.255.255.0(rw,no_root_squash,insecure)

/local_mount/space/monte/8 \
  @all(rw) \
  192.168.100.0/255.255.255.0(rw,no_root_squash,insecure)

/local_mount/space/monte/9 \
  @all(rw) \
  192.168.100.0/255.255.255.0(rw,no_root_squash,insecure)

/export/redhat-7.1 \
  @all(rw) \
  192.168.100.0/255.255.255.0(rw,no_root_squash,insecure)

===========
The machine is a master of batch (beowulf) cluster so has
two network devices with batch nodes on 192.168.100.*
Comment 1 Paul Raines 2002-01-17 19:23:38 EST
I discovered this problem was related to iptables. The machine serves as
a bridge between a private network and the main network and is set up to
masquerade using iptables.  Below is how it is configured.  As soon as
I turn off iptables, the NFS problem goes away.  So the IP filtering looks
like it is breaking NFS somehow.

# /etc/init.d/iptables status
Table: nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
MASQUERADE  all  --  anywhere             anywhere           

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
Table: filter
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
Comment 2 Pete Zaitcev 2002-01-17 21:23:16 EST
Make sure "-o <public_ethN>" is used in iptables.
Don't let it masquerade what goes inside, or else
the connection tracker chokes.

The output of "iptables -L -t nat" does not show
additional options such as -o.
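A minimal sketch of the suggested fix, assuming eth0 is the public interface on this host (the interface name is an assumption). Note that iptables -o takes an output interface name such as eth0, not a network address:

```shell
# Remove the catch-all MASQUERADE rule, then masquerade only traffic
# leaving via the public interface so internal NFS traffic between the
# master and the 192.168.100.* nodes is never rewritten.
iptables -t nat -F POSTROUTING
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

# Verbose listing shows the in/out interface columns, so the -o
# restriction is visible here (plain "iptables -L -t nat" hides it).
iptables -t nat -L POSTROUTING -v
```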
Comment 3 Paul Raines 2002-01-18 08:47:24 EST
I tried adding the "-o <public_ethN>" option and it still broke NFS.
It made no difference.  Specifically, I added
"-o 192.168.100.0/255.255.255.0".
