Bug 448898
| Summary: | After upgrade to F9, NAS device can no longer be mounted | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Jeffrey M. Birnbaum <jmbnyc> |
| Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
| Status: | CLOSED NEXTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | low | Priority: | low |
| Version: | 9 | CC: | ADent123, allan-redhat, bmartin, chris, drepper, flailios, gartim, gk4, jpazdziora, mike, pwaldenlinux |
| Hardware: | x86_64 | OS: | Linux |
| Doc Type: | Bug Fix | Last Closed: | 2009-01-24 02:38:19 UTC |
Description
Jeffrey M. Birnbaum
2008-05-29 12:41:01 UTC
What does mount -v -t nfs 192.168.0.107:/SHARE1 /share1 say?

[root@arturo jmb]# date
Tue Jun 10 16:15:01 EDT 2008
[root@arturo jmb]# mount -v -t nfs 192.168.0.107:/SHARE1 /share1
mount.nfs: timeout set for Tue Jun 10 16:17:05 2008
mount.nfs: text-based options: 'addr=192.168.0.107'
mount.nfs: internal error

I am also seeing this on my Fedora 9 machines. NFS mounts worked just fine on Fedora 8. After the F9 update, they get the same "mount.nfs: internal error" message when you try to mount. No mount options seem to change anything.

Would it be possible to get an strace of the mount?

Created attachment 310064 [details]
mount.nfs using fstab
I am not the original reporter, but I will go ahead and post my straces. The first one was generated with:

sudo strace mount devsys3:/ > mount.nfs.fstab.txt 2>&1

devsys3 is a pingable and fully functional NFSv3 server.
Created attachment 310065 [details]
mount.nfs using command line options
The second strace uses as many command-line options as necessary, including the IP address and mount path:

sudo strace mount -t nfs 172.17.100.224:/ /media/devsys3 > mount.nfs.bash.txt 2>&1
Created attachment 310066 [details]
mount.nfs directly
This third and last strace may be the most helpful. I used the mount.nfs command directly:

sudo strace /sbin/mount.nfs 172.17.100.224:/ /media/devsys3 -v > mount.nfs.full.txt 2>&1
Created attachment 310091 [details]
strace /sbin/mount.nfs 192.168.0.107:/SHARE1 /share1 > mount.nfs.txt 2>&1
The first of two files with strace output.
Created attachment 310093 [details]
strace mount -t nfs 192.168.0.107:/SHARE1 /share1 > mount.txt 2>&1
Thanks for all the info... I'm wondering if the "mount.nfs: internal error" is being caused by the mount system call returning EIO... A couple of things: what machine architecture are you using? Is SELinux enabled? Disable it and see what happens. Turn on the mount debugging in the kernel with

sudo rpcdebug -m nfs -s mount

and then please post that output, which will be found in either 'dmesg' or /var/log/messages.

Here's the Fedora Forum thread with more users reporting this error: http://fedoraforum.org/forum/showthread.php?t=189949

This is also the output from dmesg:

NFS: nfs mount opts='addr=172.17.100.224'
NFS: parsing nfs mount option 'addr=172.17.100.224'
NFS: sending MNT request for 172.17.100.224:/
NFS: failed to create RPC client, status=-5
NFS: unable to mount server 172.17.100.224, error -5

And from /var/log/messages:

Jun 24 09:25:56 michael kernel: NFS: nfs mount opts='addr=172.17.100.224'
Jun 24 09:25:56 michael kernel: NFS: parsing nfs mount option 'addr=172.17.100.224'
Jun 24 09:25:56 michael kernel: NFS: sending MNT request for 172.17.100.224:/
Jun 24 09:25:56 michael kernel: NFS: failed to create RPC client, status=-5

Excuse the spam. SELinux is disabled. Pentium 4 with i686 kernel.

OK, I do appreciate your patience... it's a bit frustrating that I can't reproduce this... Looking at the output in the Fedora Forum thread, I would like to verify that you are seeing the same rpcbind failure that is seen in the thread. So let's turn on kernel RPC debugging with:

sudo rpcdebug -m rpc -s call

Due to the volume of output, please redirect the dmesg output into a file (i.e. dmesg > /tmp/bz448898.dmesg) and then attach the file to this bz. Now, if this is a remote rpcbind failure, we should be able to see the server returning the error with a network trace. So in addition to setting the debug flags, let's get a network trace with the following commands:

tshark -w /tmp/bz448898.pcap host <server>
bzip2 /tmp/bz448898.pcap

then attach the bzip2-ed file.
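Steve's hunch that the mount system call is returning EIO lines up with the kernel debug output: the NFS client prints failures as negative errno values, and status=-5 is -EIO. A minimal Python sketch (my own illustration, not part of the original report) decoding that value:

```python
import errno
import os

def decode_kernel_status(status):
    """Map a negative kernel status, as printed by the NFS debug
    messages (e.g. "status=-5"), to its errno name and message."""
    e = -status
    return errno.errorcode.get(e, "UNKNOWN"), os.strerror(e)

# "NFS: failed to create RPC client, status=-5" -> EIO, i.e. the
# mount really is failing with an I/O error, which mount.nfs then
# reports as the unhelpful "internal error".
print(decode_kernel_status(-5))  # ('EIO', 'Input/output error') on Linux
```

This only decodes the number; the interesting question, pursued below, is why the kernel's RPC client setup fails in the first place.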
I'm getting the same problem with mount.nfs: internal error. I was mounting a snapserver drive successfully on Fedora Core 4-8; it breaks on Fedora 9. Here's the rpc debug output for my machine:

RPC: creating mount client for snap (xprt ddcf0000)
RPC: 23 call_start mount3 proc 0 (sync)
RPC: 23 call_reserve (status 0)
RPC: 23 call_reserveresult (status 0)
RPC: 23 call_allocate (status 0)
RPC: 23 call_bind (status 0)
RPC: creating rpcbind client for snap (xprt ccca8c00)
RPC: 24 call_start rpcbind4 proc 9 (async)
RPC: 24 call_reserve (status 0)
RPC: 24 call_reserveresult (status 0)
RPC: 24 call_allocate (status 0)
RPC: 24 call_bind (status 0)
RPC: 24 call_connect xprt ccca8c00 is not connected
RPC: rpc_release_client(f318b200)
RPC: 24 call_connect_status (status -107)
RPC: 24 call_timeout (minor)
RPC: 24 call_bind (status 0)
RPC: 24 call_connect xprt ccca8c00 is connected
RPC: 24 call_transmit (status 0)
RPC: 24 call_encode (status 0)
RPC: 24 call_status (status 24)
RPC: 24 call_decode (status 24)
RPC: 24 call_verify: proc f8c4efec unsupported by program 100000, version 4 on server snap
RPC: 24 call_verify: call failed with error -95
RPC: rpc_release_client(f318b200)
RPC: destroying rpcbind client for snap
RPC: 23 unrecognized rpcbind error (95)
RPC: rpc_release_client(f304ba00)
RPC: shutting down mount client for snap
RPC: rpc_release_client(f304ba00)
RPC: destroying mount client for snap

It appears the problem might be that the older portmappers are only listening for requests on the UDP transport. I'm noting the rpcinfo output in the original Description of problem: there is only one 'portmapper' entry, and I'm guessing (since the netid field is blank) that entry is a UDP listener. Usually there are two 'portmapper' entries: one for UDP and one for TCP. To test this theory, please put a '-o udp' on the mount command, which should cause the mount to succeed. Also, a tshark trace as described in Comment #13 would help greatly...
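The numeric errors in the trace above decode to ordinary errno values, which fits the old-portmapper theory. The interpretation below is mine, not from the thread, and the numeric values are Linux-specific:

```python
import errno

# Negative status codes seen in the rpcdebug trace, with my reading
# of each step (illustrative only):
trace_errors = {
    107: "call_connect_status: transport not yet connected; retried",
    95: "call_verify: rpcbind v4 GETVERSADDR rejected by the old portmapper",
}

for code, context in trace_errors.items():
    # errno.errorcode maps a number back to a symbolic name, e.g.
    # 107 is ENOTCONN and 95 is EOPNOTSUPP (alias ENOTSUP) on Linux.
    print(f"-{code} ({errno.errorcode[code]}): {context}")
```

The -107 (ENOTCONN) is harmless: the RPC layer reconnects and retries. The fatal one is -95 (EOPNOTSUPP): the server's portmapper only implements the old version 2 protocol, so the rpcbind version 4 GETVERSADDR call fails and mount.nfs gives up with the generic "internal error".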
mount.nfs using udp also produced the same internal error.

mount.nfs snap:/Drive1 /mnt/snap -o udp

RPC: creating mount client for snap (xprt f4dfd800)
RPC: 25 call_start mount3 proc 0 (sync)
RPC: 25 call_reserve (status 0)
RPC: 25 call_reserveresult (status 0)
RPC: 25 call_allocate (status 0)
RPC: 25 call_bind (status 0)
RPC: creating rpcbind client for snap (xprt f249ec00)
RPC: 26 call_start rpcbind4 proc 9 (async)
RPC: 26 call_reserve (status 0)
RPC: 26 call_reserveresult (status 0)
RPC: 26 call_allocate (status 0)
RPC: 26 call_bind (status 0)
RPC: 26 call_connect xprt f249ec00 is not connected
RPC: rpc_release_client(f79c1000)
RPC: 26 call_connect_status (status -107)
RPC: 26 call_timeout (minor)
RPC: 26 call_bind (status 0)
RPC: 26 call_connect xprt f249ec00 is connected
RPC: 26 call_transmit (status 0)
RPC: 26 call_encode (status 0)
RPC: 26 call_status (status 24)
RPC: 26 call_decode (status 24)
RPC: 26 call_verify: proc f8efdfec unsupported by program 100000, version 4 on server snap
RPC: 26 call_verify: call failed with error -95
RPC: 25 unrecognized rpcbind error (95)
RPC: rpc_release_client(f79c1000)
RPC: destroying rpcbind client for snap
RPC: rpc_release_client(cb0c7000)
RPC: shutting down mount client for snap
RPC: rpc_release_client(cb0c7000)
RPC: destroying mount client for snap

tshark data for mount.nfs snap:/Drive1 /mnt/snap -o udp, attached as /tmp/bz448898.pcap bzip'ed:

1 0.000000 192.168.56.126 -> 192.168.56.100 Portmap V4 GETVERSADDR Call
2 0.000311 Adaptec_00:99:55 -> Broadcast ARP Who has 192.168.56.126? Tell 192.168.56.100
3 0.000338 Intel_77:37:73 -> Adaptec_00:99:55 ARP 192.168.56.126 is at 00:0e:0c:77:37:73
4 0.000460 192.168.56.100 -> 192.168.56.126 Portmap V4 GETVERSADDR Reply (Call In 1)
5 0.000492 Adaptec_00:99:55 -> Intel_77:37:73 ARP 192.168.56.100 is at 00:c0:b6:00:99:55

Sorry - for some reason bugzilla won't allow me to add the pcap attachment. It says the file is empty, but it's not. I can email it to your redhat account.
Please either email it to me or put it somewhere where I can download it. Also, is the debug output in Comment #16 from 'rpcdebug -m rpc -s bind'?

Both should be in your inbox.

I'm having a similar NFS issue on F9. I turned the firewall off, then I was able to NFS mount.

Firewall enabled or disabled makes no difference, at least for me. A firewall that blocks NFS is a completely different issue, i.e. if you block NFS then NFS mounts don't work. I installed a fresh FC9 on new hardware and NFS does not work. I blew that away and installed FC8 on the same hardware and NFS works fine. The progress on this bug is very annoying. There is a real bug here that makes FC9 unusable. Somehow this bug has been downgraded to 'low severity', which makes zero sense. I find that odd for multiple reasons, especially given that multiple people are reporting the problem. In addition, there has not been a single useful acknowledgment of the bug.

Do we know exactly which part of the program is causing the termination? "internal error" is very non-descript. If this is not clear, we could as a first step build binaries which print some more information (like file and line number of the call).

(In reply to comment #21)
> I'm having a similar NFS issue on F9. I turned the firewall off, then I was
> able to NFS mount.

A tangent comment, but useful if you came across this page while googling "Internal Error" for mount.nfs (as I did). However, turning off the firewall is only advised for testing. Once I discovered that it was the firewall, I went through testing other ports. It turned out to be the mount daemon's port, which I found with rpcinfo -p. I know this is the wrong place for this info, but it should be ruled out if one suspects this bug to be their issue, as I did. Maybe someone should make mount.nfs a little bit more verbose and outlaw generic error messages?

Steve, is anyone ever going to step up to the plate and either explain what is causing the bug or fix it?
Or just acknowledge that there is a bug that is causing mount to fail under FC9. /JMB

I have an FC8 server and an FC9 client, with the same error "mount.nfs: internal error". Here is what worked for me: on the client machine, "service rpcbind start". I was then able to mount from FC9 to FC8 as expected.

Don't forget "chkconfig rpcbind on" to make it persistent.

Fedora 10 fixed this for me. Whatever was done, thanks.

(In reply to comment #28)
> Fedora 10 fixed this for me. Whatever was done, thanks.

Fedora 10 does not completely solve the problem for me. I can now mount all of my NAS devices, but one of the NAS devices is very flaky. It constantly gets 'Stale NFS handle' errors. The NAS device in question is a SNAP Appliance device which has worked fine with every version of Fedora (and Ubuntu) up until Fedora 9. Because I saw no light at the end of the tunnel here, I decided to replace this NAS device with a new one. I bought a ReadyNAS Duo and it is working fine with F10.

Jeffrey, I would suspect the stale file handles are a different problem. Could we address that in a different bz report?

For everybody else, I've updated both the rpcbind and libtirpc packages in F-9 to the latest upstream version (the same version that is in F-10). The builds are:

http://koji.fedoraproject.org/koji/buildinfo?buildID=70422
http://koji.fedoraproject.org/koji/buildinfo?buildID=70421

Please give them a try and see if they help with the problem.

If I might chime in here, this issue has been driving me nuts for months. I have 2 identical (fully updated) F9 systems. NFS had worked fine for me ever since I set it up in Fedora Core mumble (3ish?). Suddenly, when the update to kernel 2.6.27 came through, it stopped working. I tried everything I could think of, but nothing worked. My solution was to revert the server to the previous 2.6.26 kernel. This allowed NFS to work as normal.
This week both boxes received some yum updates, among which were:

Jan 15 21:53:01 Updated: 32:bind-libs-9.5.1-1.P1.fc9.i386
Jan 15 21:53:03 Updated: selinux-policy-3.3.1-117.fc9.noarch
Jan 15 21:53:29 Updated: selinux-policy-devel-3.3.1-117.fc9.noarch
Jan 15 21:53:50 Updated: selinux-policy-targeted-3.3.1-117.fc9.noarch
Jan 15 21:54:25 Updated: file-4.23-7.fc9.i386
Jan 15 21:54:27 Updated: 32:bind-utils-9.5.1-1.P1.fc9.i386
Jan 15 21:54:30 Updated: 1:nfs-utils-1.1.2-9.fc9.i386

Now NFS no longer works with either kernel 2.6.26 or 2.6.27. I don't know what has caused the change, but I have tried it with the firewall on and off (on both systems) and SELinux in both enforcing and permissive mode (on both systems). I am now stuck. My server is headless and now almost inaccessible to me... This - to me - is now severity "high". What more can I do? AD

Had the same problem as AD above and urgently needed NFS working again. I tried many things, but I think one of the following made it work:

1) Added ALL: <ipnumber of client machine>/255.255.255.0 to my /etc/hosts.allow on the server
2) Added the client machine to /etc/hosts on the server to allow name resolution to work

I think the problem is in the nfs-utils-1.1.2-9.fc9.i386 upgrade, but I do not know specifically what the problem is. Let me know if this fixes your problem or if I need to dig further into what I did.

> I think the problem is in the nfs-utils-1.1.2-9.fc9.i386 upgrade, but do not
> know specifically what the problem is.

With the 1.1.2-9.fc9 nfs-utils version, I "fixed" the tcp wrapper code to work as it should: denying mounts from unknown IP addresses. I believe this is the way other system daemons (such as sshd) also work.

Also see: https://bugzilla.redhat.com/show_bug.cgi?id=480420 which I believe is a dup of this bug.

(In reply to comment #33)
> With the 1.1.2-9.fc9 nfs-utils version, I "fixed" the tcp wrapper code
> to work as it should: denying mounts from unknown IP addresses. I believe
> this is the way other system daemons (such as sshd) also work.

Ah, thanks Steve - that fits. I was checking out the changelog:

* Mon Jan 05 2009 Steve Dickson <steved> 1.1.2-9
- Added warnings to tcp wrapper code when mounts are denied due to misconfigured DNS configurations.
- gssd: By default, don't spam syslog when users' credentials expire

* Sat Dec 20 2008 Steve Dickson <steved> 1.1.2-8
- Re-enabled and fixed/enhanced tcp wrappers.

which made me believe that the 1.1.2-8 to 1.1.2-9 change would only add warnings. However, version 1.1.2-9 was the first nfs-utils update in fc9 - the previous version (1.1.2-2) was from the core install. (Or I'm looking at a mirror without history.) So my yum updater probably updated from 1.1.2-2 to 1.1.2-9, which resulted in the client no longer being able to mount its NFS shares because the tcp wrappers fix in 1.1.2-8 was included.

I am a simple home user. I am not entirely sure of the benefit of running my own DNS server for my little network (but I will if someone can demonstrate the benefit - other than just for fixing this...). I can however confirm that putting "192.168.123.100 othersysname" into my /etc/hosts file enabled NFS to work for me as before.

I have another problem, however, which is related to but not exactly the same as this bug. For me - with both machines running fully updated F9 installs - trying to access my NFS directories from the client machine causes Nautilus (or even the command line) to hang. This first happened when the kernel was upgraded from 2.6.26 to 2.6.27 - the only solution (for me) is to keep the server on 2.6.26 (the client can be on 2.6.27) and all works fine. I have written more about it here: http://www.linuxformat.co.uk/index.php?name=PNphpBB2&file=viewtopic&t=9185 Is this problem connected to this bug? Thanks... AD

nfs-utils-1.1.2-10.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/nfs-utils-1.1.2-10.fc9

nfs-utils-1.1.4-7.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/nfs-utils-1.1.4-7.fc10

[root@n0 gartim]# yum list | grep nfs
nfs-utils.x86_64 1:1.1.2-10.fc9 installed
nfs-utils-lib.x86_64 1.1.1-5.fc9 installed

Just to verify: upgrading to 1:1.1.2-10.fc9 for nfs-utils.x86_64 worked for me. Thanks.

Good report, g. artim. I think that the underlying issue has been resolved; however, can we do something about the weak error message that the mount command reports:

mount.nfs: internal error

The error message "internal error" does not point to the underlying cause of the error - to me it actually hints at some code error inside the mount command, not that the server denies access to the export.

Note this: I found, well after the fact, that the log on my NFS server was flagging a problem. I was so tired (this happened late at night) that I never looked there for messages and just myopically checked the NFS client:

Jan 19 19:36:23 hostname mountd[14781]: Warning: Client IP address '192.168.1.2' not found in host lookup
Jan 19 20:04:55 hostname mountd[2892]: Warning: Client IP address '192.168.1.2' not found in host lookup

Live and learn, -- Gary

nfs-utils-1.1.4-7.fc10 has been pushed to the Fedora 10 stable repository. If problems still persist, please make note of it in this bug report.

nfs-utils-1.1.2-10.fc9 has been pushed to the Fedora 9 stable repository. If problems still persist, please make note of it in this bug report.

My FC9 system just updated this package and now I am getting an internal error when I attempt a mount. It was working recently, so this update makes me suspicious.

[root@walden3 ~]# mount.nfs walden4:/opt2 /opt2 -v
mount.nfs: timeout set for Mon Jan 26 19:50:37 2009
mount.nfs: text-based options: 'addr=192.168.1.140'
mount.nfs: internal error
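A closing note on the "Client IP address ... not found in host lookup" warnings reported above: the re-enabled tcp-wrapper code in mountd denies clients whose IP address cannot be reverse-resolved, which is why adding the client to /etc/hosts on the server works as a fix. A sketch of the kind of lookup that has to succeed (`client_resolves` is my own hypothetical helper, not part of nfs-utils):

```python
import socket

def client_resolves(ip):
    """Reverse-resolve ip via DNS or /etc/hosts, returning the
    hostname or None -- roughly the lookup that must succeed before
    the tcp-wrapper check in mountd will allow the mount."""
    try:
        name, _aliases, _addrs = socket.gethostbyaddr(ip)
        return name
    except OSError:
        return None

# 127.0.0.1 normally resolves through the loopback entry in /etc/hosts.
# A LAN client such as 192.168.1.2 resolves only if you add it to
# /etc/hosts on the server (or set up reverse DNS) -- exactly the
# workaround reported in this thread.
print(client_resolves("127.0.0.1"))
```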