From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

Description of problem:
We have a Sun NIS environment with around 200 Red Hat machines. Some of the servers run Enterprise 3 and others run Enterprise 4. The client machines are mostly Enterprise 3 or Red Hat 7.2. While the NFS servers were on Enterprise 3 everything worked fine, although we occasionally saw autofs fail to mount from the NFS server; restarting autofs on the client made that problem disappear. Recently we upgraded some NFS servers to Enterprise 4, and since then autofs on most clients has been behaving abnormally. Even restarting autofs on the client does not help much; rebooting the client is the only solution. The client autofs versions are autofs-3.1.7-21 on the 7.2 clients, or autofs-4.1.3-67 or autofs-4.1.3-131.

Version-Release number of selected component (if applicable):
nfs-utils-1.0.6-46

How reproducible:
Sometimes

Steps to Reproduce:
1. Boot a Red Hat 7.2 machine with autofs-3.1.7-21.
2. Boot the NFS server (Enterprise 4) and export the file system with exportfs.
3. Try to mount the file system through autofs.

Actual Results:
The client hangs and the machine needs rebooting.

Expected Results:
The client should mount the exported file system smoothly.

Additional info:
This is a very vague problem description. Does the problem occur on the RHEL 3 clients? If so, then please provide the information asked for in the "Filing bug reports" section of the following URL:

http://people.redhat.com/jmoyer/

If you can only reproduce with your RH 7.2 clients, then please configure syslog to capture debug messages. You can do this like so:

o Add a line like the following to your /etc/syslog.conf:

    *.* /var/log/debug

o Restart syslogd (or send it a HUP signal).

Then, when the problem occurs, attach the /var/log/debug file to this bugzilla.
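For anyone applying the syslog change above to many clients, here is a minimal sketch that adds the catch-all rule only if it is not already present. The function name add_debug_rule and its arguments are hypothetical, not part of any package, and the rule is written with a single space as in the example line above.

```shell
#!/bin/sh
# add_debug_rule CONF [LOGFILE]
# Hypothetical helper: append the catch-all rule "*.* LOGFILE" to CONF
# unless an identical line already exists. LOGFILE defaults to
# /var/log/debug, the path used in the instructions above.
add_debug_rule() {
    conf=$1
    log=${2:-/var/log/debug}
    rule="*.* $log"
    # -F: fixed string, -x: match the whole line, -q: no output
    if grep -Fqx "$rule" "$conf" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$rule" >> "$conf"
}
```

After running it against /etc/syslog.conf, syslogd still has to be restarted or sent a HUP signal, as described above.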
I am getting a similar error on both the Red Hat 7.2 and the EL 3 clients. My auto.master looks like this:

    /delsoft yp:auto.home -o soft,intr --debug

(I added --debug at your suggestion.) Initially the soft option was not there, but whenever I accessed a file system on the EL4 server, the access would hang, so I added soft to avoid the hang. Earlier, when I ran

    file /delsoft/analyzer/OVL/3.5.0_ovl_2/cheetah/lib/libspyglassVE-static-Linux2.a

it would hang. Now, because of soft,intr, it does not hang and gives this error instead:

    /delsoft/analyzer/OVL/3.5.0_ovl_2/cheetah/lib/libspyglassVE-static-Linux2.a: file: read failed (Input/output error).

Just FYI, /delsoft/analyzer is on an EL4 server. Also, I do not get any errors in the /var/log/debug file, so I am not attaching it.

One more thing: if I run the same file command repeatedly, it sometimes gives the Input/output error and sometimes shows the proper status:

    rajdeep@colorado [44] file /delsoft/analyzer/OVL/3.5.0_ovl_2/cheetah/lib/libspyglassVE-static-Linux2.a
    /delsoft/analyzer/OVL/3.5.0_ovl_2/cheetah/lib/libspyglassVE-static-Linux2.a: file: read failed (Input/output error).
    rajdeep@colorado [45] file /delsoft/analyzer/OVL/3.5.0_ovl_2/cheetah/lib/libspyglassVE-static-Linux2.a
    /delsoft/analyzer/OVL/3.5.0_ovl_2/cheetah/lib/libspyglassVE-static-Linux2.a: ELF 32-bit LSB relocatable, Intel 80386, version 1, not stripped

Normally the first access gives the error, while the second or third access does not.
I have also noticed that if you wait for some time and run file again, it gives the error again the first time and then works fine on the second or third attempt.

The uname -a output of my 7.2 machine is:

    uname -a
    Linux colorado 2.4.7-10 #1 Thu Sep 6 17:27:27 EDT 2001 i686 unknown

while the uname -a output of my EL 3 machine is:

    uname -a
    Linux spider 2.4.21-4.ELsmp #1 SMP

Both machines have more or less the same issue.
Also, I forgot to include the autofs versions.

The autofs version on the 7.2 client machine is autofs-3.1.7-21.
The autofs version on the EL3 client machine is autofs-3.1.7-41.

The content of /etc/auto.master is:

    /delsoft yp:auto.home -o soft,intr --debug
Your autofs version is very old. Can you upgrade to the latest RHEL update?
One further question: is the file system mounted when the "file" command fails? Does it always fail on the initial access to the mount (i.e. when it triggers an automount)?
First of all, we cannot change the Red Hat version, because we ship releases to customers on these well-defined platforms.

Answer to the second question: yes, the file system remains mounted when the "file" command fails; I have checked that in the debug file. You are also correct that it fails on the initial access to the mount, even if the file system has been mounted for a long time. Once it gives the I/O error, the next attempt gives no error and works fine, and continued access to the same file keeps working. But if you then try to access another file in the same directory, it gives the error on the first attempt again.

Also, if I remove "soft,intr" from my auto.master, it hangs instead of giving the I/O error. It hangs until I reboot the machine. Even restarting autofs does not help.
OK, so let's see if I have this all straight. You did not change anything on the client side. The NFS servers were upgraded to RHEL 4. Then you started seeing these problems. Is that right? If so, can you trigger the problem without the automounter? I.e. mount a share by hand, and see if the first access will give you the error? I'm putting Steve back on the CC list. This is really sounding like an NFS issue, to me.
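The by-hand check suggested above could be scripted roughly as follows. This is only a sketch under stated assumptions: probe_reads is a hypothetical helper, it reads the first 4096 bytes of a file with dd rather than running file, and it does nothing to bypass client-side caching. On a hard mount without intr, a stuck read will still block the script.

```shell
#!/bin/sh
# probe_reads FILE [COUNT]
# Hypothetical helper: try to read the first block of FILE COUNT times
# (default 3) and print one line per attempt, "N ok" or "N FAIL", so a
# failure on the first access can be told apart from later ones.
probe_reads() {
    target=$1
    count=${2:-3}
    i=1
    while [ "$i" -le "$count" ]; do
        # dd exits non-zero if the file cannot be opened or read
        if dd if="$target" of=/dev/null bs=4096 count=1 2>/dev/null; then
            echo "$i ok"
        else
            echo "$i FAIL"
        fi
        i=$((i + 1))
    done
}
```

For example, after mounting the share by hand on some directory, run probe_reads on a file under that mount point that has not been touched yet, then run it again on a different file in the same directory.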
Yes, I did not change anything on the client side, at least on the 7.2 machine. I did try a newer version of autofs on the EL 3 machine, but got the same error. You are correct: all these problems started as soon as I upgraded my NFS server to RHEL 4.

I have now mounted the share by hand on a temporary directory /a, and when I use the "file" command there, it hangs forever:

    [root@colorado /]# mount elephant:/home6/analyzer /a
    [root@colorado /]# file /a/OVL/3.5.0_ovl_2/cheetah/lib/libspyglassVE-static-SunOS5.a

I need to kill the shell.
Thanks very much for the clarification and the quick testing turn-around. Steve, this is most certainly not an autofs issue. I'm changing the component to kernel and reassigning to you.
Try running NFS clients with iptables disabled. If the problem goes away, then this bug may be a duplicate of https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=171267 Bug 171267 seems to be triggered when (after successful authorization) the client connects to the server to read the filesystem superblock. The client sends a TCP SYN to establish a connection, but the server (sometimes) responds with a pure ACK without the SYN bit; this is blocked by iptables, so the superblock read fails. On my systems, this bug is often triggered by client reboots.
We do not enable iptables in our network, so it is disabled by default. So bug 171267 does not apply to our case, at least.
This happens here to us too. Both client and server are CentOS 4.4 machines. Client is x86_64 kernel-2.6.9-42.0.3.EL, server is i686, kernel-2.6.9-34.0.2.EL (yes, just noticed; will update ASAP), everything else is up to date. iptables is enabled, can't take it down.
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release for which you requested us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.