Bug 335371 - mounting a Redhat EL5 server share makes Fedora 7 hang accessing this share
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All
OS: Linux
Priority: low
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Andy Gospodarek
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-10-16 21:24 UTC by Thomas Schweikle
Modified: 2014-06-29 22:59 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-04-09 07:28:02 UTC
Target Upstream Version:
Embargoed:



Description Thomas Schweikle 2007-10-16 21:24:27 UTC
Description of problem:
Mounting a Red Hat EL5 server share via NFS makes Fedora 7 hang when
accessing this share.

Version-Release number of selected component (if applicable):
Client:
- kernel-2.6.22.9-91.fc7
- nfs-utils-1.1.0-3.fc7
- nfs-utils-lib-1.0.8-10.fc7
- system-config-nfs-1.3.25-1.fc7

Server:
- kernel-PAE-2.6.18-8.1.8.el5
- nfs-utils-1.0.9-16.el5
- nfs-utils-lib-1.0.8-7.2.z2


How reproducible:
always

Steps to Reproduce:
1. Install Red Hat EL5, update to the latest package versions,
   create an exports file:
   /somepath       ip-address/netmask(rw,root_squash)

2. Install Fedora 7, update to the latest package versions,
   create an auto.master:
   /- /etc/auto.direct

   create an auto.direct:
   /mountpoint -fstype=nfs,rw,rsize=8192,wsize=8192 server:/somepath

   Start autofs:
   /etc/init.d/autofs start

3. "ls -la /mountpoint" may work, but
   "cp /mountpoint/largefile /tmp" will suddenly hang.

Actual results:
The last entered command on the mounted share will hang.

Expected results:
The last entered command on the mounted share finishes without error.

Additional info:
The mounts worked for months without any problem. For the past two weeks
(since patches were installed on Oct. 5, 2007) they have not worked any more!

Comment 1 Steve Dickson 2007-11-01 11:38:25 UTC
Would it be possible to post a bzip2-compressed tshark network trace?
Something similar to:

    tshark -w /tmp/bz335371.pcap host <server>
    bzip2 /tmp/bz335371.pcap
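
Alongside a capture, it may also help to check whether full-size packets
reach the server at all. The following is only a sketch: it assumes IPv4
without options, a default Ethernet MTU of 1500, and the iputils ping
(-M do sets the don't-fragment bit). The helper name icmp_payload and the
host name "server" are placeholders, not something from this report.

```shell
#!/bin/sh
# Hypothetical helper: largest ICMP echo payload that fits into one packet
# of the given MTU (20-byte IPv4 header + 8-byte ICMP header).
icmp_payload() {
    echo $(( $1 - 20 - 8 ))
}

# Placeholder host name; replace with the real NFS server.
server=server

# Print (rather than run) two probes: the first fills a 1500-byte MTU,
# the second leaves four bytes of headroom.
echo "ping -c 3 -M do -s $(icmp_payload 1500) $server"
echo "ping -c 3 -M do -s $(icmp_payload 1496) $server"
```

If the smaller probe gets replies and the larger one does not, full-size
packets are probably being dropped somewhere on the path.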



Comment 2 Thomas Schweikle 2007-11-01 19:30:37 UTC
Meanwhile I was able to find out what made it hang:

NFS hangs only if the connection to the NFS server goes via a VLAN. I had to set
the MTU of the VLAN interface four bytes lower than the MTU of the adapter to
make it work again:

ifcfg-eth0:
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=dhcp
HWADDR=00:0c:29:f1:0f:50

ifcfg-eth0.2:
DEVICE=eth0.2
VLAN=yes
ONBOOT=yes
BOOTPROTO=dhcp
HWADDR=00:0c:29:f1:0f:50

I had to add "MTU=1496" to ifcfg-eth0.2:
DEVICE=eth0.2
VLAN=yes
ONBOOT=yes
BOOTPROTO=dhcp
HWADDR=00:0c:29:f1:0f:50
MTU=1496

It seems this was done automatically on the installations that had not been
updated. I am not sure whether this is a kernel or a script problem. But in my
opinion this is bad, since some devices allow a larger MTU and some want it
lower, and I have to look up what the MTU actually is for eth0 to set it four
bytes lower for eth0.2!
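
The manual step described here (look up the MTU of eth0, subtract four, put
the result into ifcfg-eth0.2) could be scripted. This is a minimal sketch of
the workaround only, not a fix: the helper name vlan_mtu is hypothetical, and
it assumes the parent MTU is readable from /sys/class/net/<device>/mtu.

```shell
#!/bin/sh
# Hypothetical helper implementing the workaround from this comment:
# VLAN MTU = parent MTU minus the 4 bytes taken by the 802.1Q tag.
vlan_mtu() {
    echo $(( $1 - 4 ))
}

# Parent device as used in this report.
parent=eth0

# Read the parent's current MTU from sysfs; assume 1500 if it is unavailable.
if [ -r "/sys/class/net/$parent/mtu" ]; then
    mtu=$(cat "/sys/class/net/$parent/mtu")
else
    mtu=1500
fi

# Print the line that would go into ifcfg-$parent.2
echo "MTU=$(vlan_mtu "$mtu")"
```

With a parent MTU of 1500 this prints the same MTU=1496 line as added by hand
above.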

Comment 3 Bug Zapper 2008-05-14 14:46:25 UTC
This message is a reminder that Fedora 7 is nearing the end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 7. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '7'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 7's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 7 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. If possible, it is recommended that you try the newest available Fedora distribution to see if your bug still exists.

Please read the Release Notes for the newest Fedora distribution to make sure it will meet your needs:
http://docs.fedoraproject.org/release-notes/

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 4 Thomas Schweikle 2008-05-15 05:22:13 UTC
This problem exists for Fedora 8/9 too. The VLAN MTU is set to the same size
as the MTU for the real adapter. But this is wrong, since the tag needs four
bytes! The MTU must be decremented by four for every VLAN created!
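
The four bytes in question can be shown with a little arithmetic. This is a
sketch only, assuming standard Ethernet framing (14-byte header, 4-byte FCS)
and a 4-byte 802.1Q tag; the function name frame_len is hypothetical.

```shell
#!/bin/sh
ETH_HEADER=14   # destination MAC + source MAC + EtherType
FCS=4           # frame check sequence
VLAN_TAG=4      # 802.1Q tag

# Hypothetical helper: on-wire frame length.
# $1 = MTU (payload size), $2 = number of 802.1Q tags on the frame (0 or 1 here)
frame_len() {
    echo $(( $1 + ETH_HEADER + FCS + $2 * VLAN_TAG ))
}

echo "untagged, MTU 1500: $(frame_len 1500 0) bytes"
echo "tagged,   MTU 1500: $(frame_len 1500 1) bytes"
echo "tagged,   MTU 1496: $(frame_len 1496 1) bytes"
```

A tagged frame at MTU 1500 is 1522 bytes, four more than the 1518 bytes of an
untagged one; at MTU 1496 the tagged frame is back to 1518 bytes. Whether a
given card and driver accept the longer frame is hardware dependent.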

Comment 5 Steve Dickson 2008-06-27 18:55:50 UTC
I'm not sure why this is an nfs-utils bug. nfs-utils has nothing to do
with setting MTUs...

Comment 6 Thomas Schweikle 2008-06-28 06:41:24 UTC
This is not an nfs or nfs-utils bug, but at first glance it looked as if it
were. It should be redirected to Network-Manager bugs or net-utils bugs.

Comment 7 Zdenek Prikryl 2008-07-09 12:48:55 UTC
Hello,
I think that the default MTU setting is up to the driver module. It seems that
if the module for your card doesn't account for VLAN tagging (4 bytes) in the
MTU, then you have to change the MTU yourself. I tried to reproduce this bug,
but I wasn't successful; everything worked out of the box. In your case it is
strange that it worked. What Ethernet card do you use?

Anyway, it definitely isn't a bug in net-tools.

Comment 8 Thomas Schweikle 2008-07-09 21:36:24 UTC
As I thought at first: a kernel/module problem. It seems this is related to
some Broadcom devices. But since I do not have access to the Dell workstation
any more, I can't test whether it goes away when changing network cards. AFAIK
I could see this with some Intel Centrino Duo/Centrino devices too. But I do
not use Linux on my laptop at the moment (too many drivers missing: camera,
SD slot, WLAN, Bluetooth), so it is impossible to test in the near future.

Comment 9 Zdenek Prikryl 2008-07-10 13:40:19 UTC
Ok, I'm reassigning this to kernel, so they can add their comments.

Comment 10 Jeff Layton 2008-07-25 16:04:14 UTC
So is this problem a client or server issue? If a client issue then it should
probably get moved to the fedora kernel queue...


Comment 11 Zdenek Prikryl 2008-07-29 12:02:04 UTC
I'm not 100% sure which component sets MTU. But in my opinion, the kernel module
does it. So that is why I want to have comments from kernel side. 

In this particular case, the bug is on the server side, because the VLAN is set
there. But I think that the problem is on both sides. I mean, if we use a Fedora
PC as a server and set a VLAN there, then the bug will be on the Fedora side.

Comment 12 Jeff Layton 2008-07-29 12:25:46 UTC
I'm afraid I'm still confused...

In comment #2, you mentioned that you had to reduce the MTU by 4 bytes to
account for the VLAN. What's not clear is whether you had to do this on the
client or server side, or both...


Comment 13 Zdenek Prikryl 2008-07-29 12:46:44 UTC
Comment #2 isn't my comment, but anyway I'll try to answer :-). I think that you
have to do it on both sides. Unfortunately, I don't have the hardware to
reproduce it, so this is only my guess.

Comment 14 Andy Gospodarek 2008-07-29 14:31:25 UTC
Thomas,

You mentioned that an upgrade caused this problem.  Was it an upgrade to the EL5
system that now prevents any Fedora system with VLANs and no MTU=1496 statement
in ifcfg-eth0.2 from working?  If so, can you tell me what kernel version worked
before?  Can you also tell me exactly which type of Broadcom NIC you are using
(lspci output will be fine)?

Thanks!

Comment 15 Thomas Schweikle 2008-07-31 00:23:47 UTC
I first noticed the problem on Fedora 6, shortly after on CentOS 5, then on
Red Hat Enterprise Linux 5. When using a VLAN between two Linux systems, it was
necessary to set the MTU for the VLAN to four bytes less than the MTU for the
corresponding non-VLAN adapter. This applied to both the client and the server.

The problem started with one of the first kernel updates for Fedora 6 and
CentOS/RHEL 5. Going back to the old kernel made the whole thing work again.

Meanwhile I found this very same problem on Ubuntu 8.04, but not on SuSE 10.3, 
11.0. Since all of them use distribution specific patched kernels I tried a 
vanilla kernel from kernel.org: same problem. Looks like SuSE applies a working 
patch for this problem. Maybe this patch is already on its way into the kernel 
tree now.

Comment 16 Thomas Schweikle 2009-04-08 23:59:33 UTC
This bug seems to be gone. I could mount a Red Hat Enterprise Linux 5 share from all Fedora systems I could try:
Fedora 6 (really old, now upgraded to 10)
Fedora 7 (really old, now upgraded to 10)
Fedora 8 (old, now upgraded to 10)
Fedora 9 (now upgraded to upcoming 11)
Fedora 10
Fedora 11

Thus there is no test system available any more! Maybe we could close this bug?!

Comment 17 Zdenek Prikryl 2009-04-09 07:28:02 UTC
Ok, closing this.

