Bug 91577 - NFS corruptions with kernel 2.4.20-13.9
Summary: NFS corruptions with kernel 2.4.20-13.9
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 9
Hardware: All
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2003-05-24 19:08 UTC by Stephen John Smoogen
Modified: 2007-04-18 16:53 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2003-06-07 02:59:19 UTC
Embargoed:



Description Stephen John Smoogen 2003-05-24 19:08:52 UTC
Description of problem:

For various policy reasons, our home directories are NFS mounted onto
our machines that have disks. Most of the machines are diskless. We
upgraded a couple of machines to 9 this week, and with the errata kernel
started seeing NFS/network issues whenever someone does something large in
their home directory. In my case it was recompiling the RPM-4.2 package
for our environment, and having the network 'fall' out from under me as the
kernel said that it needed to be half-duplex. The kernel printk'd that
it was renegotiating with the link partner and was told to be
half-duplex, but the switch says it is full duplex.

The funny thing is that I didn't have this problem with the RHL 7.3 errata
kernel (kernel-2.4.20-13.7), and I was recompiling much larger packages
than RPM. (Same hardware; the only difference is 7.3 versus 9.) I can try
forcing the network to be full duplex only, but it seems strange that this
wasn't needed before.
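
For reference, forcing the link as mentioned above could be sketched as follows. This is a minimal sketch, not something that was run for this report: the interface name eth0 is an assumption (the report does not name it), and whether the driver for this card honors a forced setting is not confirmed here. The script only prints the mii-tool command so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch only: build the command that would force 100 Mbit/s full duplex.
# "eth0" is an assumed interface name; the report does not say which one is used.
IFACE="${IFACE:-eth0}"

# Print the mii-tool invocation instead of running it. Running it needs
# root, and the switch port should be set to the same fixed speed and
# duplex, otherwise the two ends can still disagree.
force_full_duplex_cmd() {
    echo "mii-tool -F 100baseTx-FD $1"
}

force_full_duplex_cmd "$IFACE"
```

Running mii-tool with only the interface name afterwards reports the current link status, which can be compared against what the switch shows.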

The NFS server is a NetApp 760 with updates from March. The mount is as follows:

/home on /home type nfs
(rw,nosuid,nodev,fg,vers=3,rsize=8192,wsize=8192,hard,nointr,tcp,timeo=600,addr=


[smoogen@smoogen1 SPECS]$ lspci
00:00.0 Host bridge: Intel Corp. 82860 860 (Wombat) Chipset Host Bridge (MCH) (rev 04)
00:01.0 PCI bridge: Intel Corp. 82850 850 (Tehama) Chipset AGP Bridge (rev 04)
00:02.0 PCI bridge: Intel Corp. 82860 860 (Wombat) Chipset AGP Bridge (rev 04)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB PCI Bridge (rev 04)
00:1f.0 ISA bridge: Intel Corp. 82801BA ISA Bridge (LPC) (rev 04)
00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 04)
00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #1) (rev 04)
00:1f.3 SMBus: Intel Corp. 82801BA/BAM SMBus (rev 04)
00:1f.4 USB Controller: Intel Corp. 82801BA/BAM USB (Hub #2) (rev 04)
00:1f.5 Multimedia audio controller: Intel Corp. 82801BA/BAM AC'97 Audio (rev 04)
01:00.0 VGA compatible controller: ATI Technologies Inc Radeon VE QY
02:1f.0 PCI bridge: Intel Corp. 82806AA PCI64 Hub PCI Bridge (rev 03)
03:00.0 PIC: Intel Corp. 82806AA PCI64 Hub Advanced Programmable Interrupt Controller (rev 01)
03:0e.0 SCSI storage controller: Adaptec AIC-7892P U160/m (rev 02)
04:0b.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
04:0c.0 FireWire (IEEE 1394): Texas Instruments TSB12LV26 IEEE-1394 Controller (Link)

I do have iptables installed with a default policy of drop, but nothing was
logged when the box went to half-duplex.

To reproduce: take a large compile (XFree86, rpm, kernel) and build it over NFS
from such a mount. The problem has occurred 3 out of 3 times with
auto-negotiation enabled on Red Hat Linux 9, and never on the 7.3 machine.

Comment 1 Steve Dickson 2003-06-06 00:35:13 UTC
Please be more specific: what are the exact issues,
and how are they reproduced?

Things like error messages and network traces
would be a good start.
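
A minimal sketch of collecting that kind of information, under stated assumptions: the interface name eth0 and the server name `filer` are placeholders (neither appears in this report), and the capture file name is arbitrary. The script only prints the commands so they can be checked before being run as root while the compile is in progress.

```shell
#!/bin/sh
# Sketch only: print commands for gathering kernel messages and an NFS trace.
# IFACE and SERVER are assumed placeholders, not values from the report.
IFACE="${IFACE:-eth0}"
SERVER="${SERVER:-filer}"

# Kernel log lines that mention duplex (e.g. renegotiation messages).
kernel_msgs_cmd() {
    echo "dmesg | grep -i duplex"
}

# Full-packet capture of traffic to and from the NFS server on port 2049,
# written to a file for later analysis.
nfs_trace_cmd() {
    echo "tcpdump -i $1 -s 0 -w nfs-trace.pcap host $2 and port 2049"
}

kernel_msgs_cmd
nfs_trace_cmd "$IFACE" "$SERVER"
```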

Comment 2 Stephen John Smoogen 2003-06-07 02:59:19 UTC
After a week of this problem, I can no longer duplicate it. The network people
say they didn't fix anything on the switches between the NetApp and my machine,
but I can't help any further :(

