Bug 65772 - close() hangs on file in NFS-mounted dir using
Status: CLOSED ERRATA
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: i686 Linux
Priority: medium
Severity: medium
Assigned To: Ben LaHaise
QA Contact: Brian Brock
Reported: 2002-05-31 12:16 EDT by Erik Williamson
Modified: 2007-04-18 12:42 EDT (History)
Doc Type: Bug Fix
Last Closed: 2002-06-03 18:09:14 EDT


Description Erik Williamson 2002-05-31 12:16:39 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.0 (X11; Linux i686; U;) Gecko/20020516

Description of problem:
With my home directory mounted via NFS from a Solaris 8 server, if I try to use 'ar'
(other apps are affected as well), the program freezes when attempting
to close the output file (found this out using strace).  Note that if I attempt
to do the same task in a directory mounted from a Linux box, all is well.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. copy some object files ( .o ) to an NFS share on a Solaris 8 box
2. run 'ar rc outlib.a *.o'
3. wait!
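
For anyone who wants to reproduce this without 'ar', here is a minimal sketch that writes a file and measures only the time spent in close(). It assumes the hang is in close() because the NFS client normally flushes pending writes to the server at that point; the function name and the default payload size are made up for illustration and are not taken from this report.

```python
import os
import time


def timed_write_and_close(path, payload=b"x" * 65536):
    """Write payload to path and return the seconds spent inside close().

    Hypothetical helper: the name and default payload size are
    illustrative only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        # os.write() may write fewer bytes than requested, so loop
        # until the whole payload has been handed to the kernel.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
    except BaseException:
        os.close(fd)
        raise
    # Time only the close() call, which is where strace showed the hang.
    start = time.monotonic()
    os.close(fd)
    return time.monotonic() - start
```

Pointing path at a file inside the NFS-mounted directory and comparing the result with a run on local disk should show whether close() is the slow call. If the bug triggers, the call simply will not return until the process is un-hung.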
	

Actual Results:  'ar' successfully creates its temporary file in the directory,
yet when it tries to close(), it hangs.  The rest of the machine is responsive,
though.  'ps' shows that the ar process is in state 'D'.  I can magically
un-hang the process by ssh-ing to the box, and the process completes
successfully.  Weird, huh?
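
State 'D' in 'ps' means uninterruptible sleep, usually waiting on I/O. As a rough sketch of the same check without 'ps', the snippet below reads the state field from /proc/<pid>/stat on Linux. The function names are hypothetical, and it assumes the usual "pid (comm) state ..." layout of that file.

```python
import os


def parse_stat_state(stat_line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the first '(' and the last ')'.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx].strip())
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_state(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling uninterruptible_processes() while 'ar' is stuck should list it, if the process really is blocked in the kernel as described.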

Additional info:

Sometimes this works, though, and I can't figure out why!  Sometimes it doesn't
work with a single file, sometimes it does.  Sometimes, if there's an existing
output file and I'm adding to the archive, it works.  While it almost
consistently fails, sometimes it works...

Anyhow, thanks for the help!
Comment 1 Erik Williamson 2002-05-31 12:26:18 EDT
Sorry, I forgot to mention that I can perform the same task on RH 7.1 & 7.2 boxes
(completely patched) with the same dir mounted on the same server, and it works
just ducky.

Thanks Again!
Erik.
Comment 2 Ben LaHaise 2002-06-03 15:03:54 EDT
What network card/driver is being used?  It sounds like the driver is missing a
wakeup, which in turn causes NFS traffic to be delayed.
Comment 3 Erik Williamson 2002-06-03 15:23:28 EDT
The affected systems are Dell Precision 530s; lsmod tells me they're
using the 3c59x module.

Here's what dmesg says:
3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
04:0b.0: 3Com PCI 3c905C Tornado at 0xec80. Vers LK1.1.16

FWIW, this is an on-board NIC.

Cheers & thanks for getting on this so quick!
e.
Comment 4 Ben LaHaise 2002-06-03 15:30:56 EDT
Hmm, the 3c59x driver is in pretty good shape.  What are rsize/wsize set to
(cat /proc/mounts)?  Try limiting them to 4K if they're set larger, and see if
that makes a difference.  The next 2.4.18 kernel erratum includes patches to
default to a smaller [rw]size.
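
The check suggested above can be sketched as a small parser over the contents of /proc/mounts. This is only an illustration: the function names are hypothetical, and it assumes the standard "device mountpoint fstype options dump pass" line layout with rsize= and wsize= appearing in the options field.

```python
def nfs_transfer_sizes(mounts_text):
    """Return (mountpoint, rsize, wsize) for each NFS entry in mounts_text.

    rsize or wsize is None when the option is not listed for a mount.
    """
    results = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        sizes = {}
        for opt in fields[3].split(","):
            key, sep, value = opt.partition("=")
            if sep and key in ("rsize", "wsize") and value.isdigit():
                sizes[key] = int(value)
        results.append((fields[1], sizes.get("rsize"), sizes.get("wsize")))
    return results


def oversized(mounts_text, limit=4096):
    """Return the NFS mounts whose rsize or wsize exceeds limit (4K here)."""
    return [
        entry for entry in nfs_transfer_sizes(mounts_text)
        if (entry[1] or 0) > limit or (entry[2] or 0) > limit
    ]
```

Feeding it the text of /proc/mounts, for example oversized(open("/proc/mounts").read()), lists the mounts worth remounting with rsize=4096,wsize=4096 to try the workaround.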
Comment 5 Erik Williamson 2002-06-03 18:09:08 EDT
Beautiful - the smaller [rw]size fixed it - Thanks for the help!  

When do you anticipate the kernel release to be?

Thanks - e.
Comment 6 Ben LaHaise 2002-06-03 20:54:18 EDT
I can't give an exact timeframe other than "soon".  The errata kernel will be
2.4.18-4 or higher; please reopen if the fix included in that kernel doesn't work.
