Bug 448898

Summary: After upgrade to F9, NAS device can no longer be mounted
Product: [Fedora] Fedora
Reporter: Jeffrey M. Birnbaum <jmbnyc>
Component: nfs-utils
Assignee: Steve Dickson <steved>
Status: CLOSED NEXTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Priority: low
Version: 9
CC: ADent123, allan-redhat, bmartin, chris, drepper, flailios, gartim, gk4, jpazdziora, mike, pwaldenlinux
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-01-24 02:38:19 UTC
Attachments:
Description | Flags
mount.fs using fstab | none
mount.nfs using command line options | none
mount.nfs directly | none
strace /sbin/mount.nfs 192.168.0.107:/SHARE1 /share1 > mount.nfs.txt 2>&1 | none
strace mount -t nfs 192.168.0.107:/SHARE1 /share1 > mount.txt 2>&1 | none

Description Jeffrey M. Birnbaum 2008-05-29 12:41:01 UTC
Description of problem:

I upgraded to Fedora 9 and I can no longer mount an older NAS device that has
worked for years. A newer NAS device works fine.

[root@arturo /]# mount -t nfs 192.168.0.107:/SHARE1 /share1
mount.nfs: internal error

[root@arturo /]# mount -t nfs 192.168.0.112:/c/user /readynas
[root@arturo /]#

I have tried all types of -o options to specify the NFS version, etc., with no luck.
The rpcinfo command below shows that the device supports NFS versions 2 and 3.

[root@arturo fix]# /usr/sbin/rpcinfo 192.168.0.107
   program version netid     address                service    owner
    100000    2              192.168.0.107.0.111    portmapper root
    100005    2              192.168.0.107.4.5      mountd     root
    100005    1              192.168.0.107.4.5      mountd     root
    100005    3              192.168.0.107.4.5      mountd     root
    100003    2              192.168.0.107.8.1      nfs        root
    100003    3              192.168.0.107.8.1      nfs        root
    150001    2              192.168.0.107.3.183    pcnfsd     root
    100024    1              192.168.0.107.4.36     status     root
    100021    3              192.168.0.107.4.35     nlockmgr   root
    100021    4              192.168.0.107.4.35     nlockmgr   root
    100021    2              192.168.0.107.4.35     nlockmgr   root
    100021    1              192.168.0.107.4.35     nlockmgr   root
    351396    1              192.168.0.107.4.37     -          root
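
For readers decoding the address column above: these are RPC "universal addresses", in which the last two dot-separated numbers carry the port as a high byte and a low byte. The following is a minimal Python sketch, not part of rpcinfo; the function name is made up for illustration, and the rows are copied from the output above.

```python
def uaddr_to_host_port(uaddr):
    """Split an IPv4 universal address 'h1.h2.h3.h4.p1.p2' into (host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("expected six dot-separated fields: %r" % uaddr)
    host = ".".join(parts[:4])
    # The port is encoded as two bytes: high byte first, then low byte.
    port = int(parts[4]) * 256 + int(parts[5])
    return host, port


# (service, universal address) pairs copied from the rpcinfo output above.
rows = [
    ("portmapper", "192.168.0.107.0.111"),
    ("mountd", "192.168.0.107.4.5"),
    ("nfs", "192.168.0.107.8.1"),
]

for service, uaddr in rows:
    host, port = uaddr_to_host_port(uaddr)
    print("%-10s %s port %d" % (service, host, port))
```

Decoded this way, the portmapper entry is on port 111 and the nfs entries are on port 2049, so the device advertises its services on the usual ports.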

Version-Release number of selected component (if applicable):

[root@arturo fix]# rpm -q nfs-utils
nfs-utils-1.1.2-2.fc9.x86_64

How reproducible:


Steps to Reproduce:
1. mount -t nfs 192.168.0.107:/SHARE1 /share1
2.
3.
  
Actual results:

mount.nfs: internal error

Expected results:

a clean mount (same command works fine on my FC8 and FC7 boxes)
Additional info:

Comment 1 Steve Dickson 2008-06-03 11:29:53 UTC
What does mount -v -t nfs 192.168.0.107:/SHARE1 /share1 say?

Comment 2 Jeffrey M. Birnbaum 2008-06-10 20:15:38 UTC
[root@arturo jmb]# date
Tue Jun 10 16:15:01 EDT 2008
[root@arturo jmb]# mount -v -t nfs 192.168.0.107:/SHARE1 /share1
mount.nfs: timeout set for Tue Jun 10 16:17:05 2008
mount.nfs: text-based options: 'addr=192.168.0.107'
mount.nfs: internal error


Comment 3 Michael Cronenworth 2008-06-12 19:04:43 UTC
I am also seeing this on my Fedora 9 machines. NFS mounts worked just fine on
Fedora 8. After the F9 update, they get the same "mount.nfs: internal error"
message when you try to mount. No mount options seem to change anything.

Comment 4 Steve Dickson 2008-06-23 17:47:05 UTC
would it be possible to get an strace of the mount?

Comment 5 Michael Cronenworth 2008-06-23 19:16:29 UTC
Created attachment 310064 [details]
mount.fs using fstab

I am not the original reporter, but I will go ahead and post my straces. The
first one has been generated with: sudo strace mount devsys3:/ >
mount.nfs.fstab.txt 2>&1

Devsys3 is a pingable and fully functional NFSv3 computer.

Comment 6 Michael Cronenworth 2008-06-23 19:17:54 UTC
Created attachment 310065 [details]
mount.nfs using command line options

The second strace is by using as many command line options as necessary,
including using the IP address and mount path. sudo strace mount -t nfs
172.17.100.224:/ /media/devsys3 > mount.nfs.bash.txt 2>&1

Comment 7 Michael Cronenworth 2008-06-23 19:22:58 UTC
Created attachment 310066 [details]
mount.nfs directly

This third and last strace may be the most helpful. I used the mount.nfs
command directly. sudo strace /sbin/mount.nfs 172.17.100.224:/ /media/devsys3
-v > mount.nfs.full.txt 2>&1

Comment 8 Jeffrey M. Birnbaum 2008-06-24 01:14:35 UTC
Created attachment 310091 [details]
strace /sbin/mount.nfs 192.168.0.107:/SHARE1 /share1 > mount.nfs.txt 2>&1

The first of two files with strace output.

strace /sbin/mount.nfs 192.168.0.107:/SHARE1 /share1 > mount.nfs.txt 2>&1

Comment 9 Jeffrey M. Birnbaum 2008-06-24 01:15:16 UTC
Created attachment 310093 [details]
strace mount -t nfs 192.168.0.107:/SHARE1 /share1 > mount.txt 2>&1

strace mount -t nfs 192.168.0.107:/SHARE1 /share1 > mount.txt 2>&1

Comment 10 Steve Dickson 2008-06-24 11:08:11 UTC
Thanks for all the info... I'm wondering if the "mount.nfs: internal error"
is being caused by the mount system call returning EIO... 

A couple of things, what machine architecture are you guys using?

Is SELinux enabled? If so, disable it and see what happens.

Turn on the mount debug in the kernel with 
    sudo rpcdebug -m nfs -s mount

and then please post that output, which will be found in 
either 'dmesg' or /var/log/messages


Comment 11 Michael Cronenworth 2008-06-24 14:30:24 UTC
Here's the Fedora Forum thread with more users reporting this error.
http://fedoraforum.org/forum/showthread.php?t=189949

This is also the output from dmesg:
NFS: nfs mount opts='addr=172.17.100.224'
NFS:   parsing nfs mount option 'addr=172.17.100.224'
NFS: sending MNT request for 172.17.100.224:/
NFS: failed to create RPC client, status=-5
NFS: unable to mount server 172.17.100.224, error -5

And from /var/log/messages:
Jun 24 09:25:56 michael kernel: NFS: nfs mount opts='addr=172.17.100.224'
Jun 24 09:25:56 michael kernel: NFS:   parsing nfs mount option 'addr=172.17.100.224'
Jun 24 09:25:56 michael kernel: NFS: sending MNT request for 172.17.100.224:/
Jun 24 09:25:56 michael kernel: NFS: failed to create RPC client, status=-5

Comment 12 Michael Cronenworth 2008-06-24 14:32:37 UTC
Excuse the spam.

Selinux is disabled. Pentium 4 with i686 kernel.

Comment 13 Steve Dickson 2008-06-24 22:26:42 UTC
OK, I do appreciate your patience... it's a bit frustrating that I can't reproduce this...

Looking at the output in the Fedora Forum thread, I would like to
verify that you are seeing the same rpcbind failure that is seen
in the thread. So let's turn on kernel RPC debugging with:

    sudo rpcdebug -m rpc -s call

Due to the volume of output, please redirect the dmesg output into a 
file (i.e. dmesg > /tmp/bz448898.dmesg) and then attach the file to this bz.

Now if this is a remote rpcbind failure we should be able to see 
the server returning the error with a network trace. So in addition 
to setting the debug, let's get a network trace with the following command.

    tshark -w /tmp/bz448898.pcap host <server>
    bzip2 /tmp/bz448898.pcap

then attach the bzip2-ed file.




Comment 14 john casey 2008-06-25 00:20:53 UTC
I'm getting the same problem with mount.nfs: internal error.
Mounting a snapserver drive worked on Fedora Core 4 through 8; it breaks on Fedora 9.
Here's the rpc debug output for my machine.

RPC:       creating mount client for snap (xprt ddcf0000)
RPC:    23 call_start mount3 proc 0 (sync)
RPC:    23 call_reserve (status 0)
RPC:    23 call_reserveresult (status 0)
RPC:    23 call_allocate (status 0)
RPC:    23 call_bind (status 0)
RPC:       creating rpcbind client for snap (xprt ccca8c00)
RPC:    24 call_start rpcbind4 proc 9 (async)
RPC:    24 call_reserve (status 0)
RPC:    24 call_reserveresult (status 0)
RPC:    24 call_allocate (status 0)
RPC:    24 call_bind (status 0)
RPC:    24 call_connect xprt ccca8c00 is not connected
RPC:       rpc_release_client(f318b200)
RPC:    24 call_connect_status (status -107)
RPC:    24 call_timeout (minor)
RPC:    24 call_bind (status 0)
RPC:    24 call_connect xprt ccca8c00 is connected
RPC:    24 call_transmit (status 0)
RPC:    24 call_encode (status 0)
RPC:    24 call_status (status 24)
RPC:    24 call_decode (status 24)
RPC:    24 call_verify: proc f8c4efec unsupported by program 100000, version 4 on server snap
RPC:    24 call_verify: call failed with error -95
RPC:       rpc_release_client(f318b200)
RPC:       destroying rpcbind client for snap
RPC:    23 unrecognized rpcbind error (95)
RPC:       rpc_release_client(f304ba00)
RPC:       shutting down mount client for snap
RPC:       rpc_release_client(f304ba00)
RPC:       destroying mount client for snap
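
The negative numbers in kernel output like the above are Linux errno values with the sign flipped. A small Python sketch for annotating such lines follows; it is only a reading aid, the helper name is invented for illustration, and the table is deliberately limited to the three codes that appear in the logs in this report.

```python
import re

# Linux errno values seen in the kernel logs in this report.
LINUX_ERRNO = {
    5: "EIO (input/output error)",
    95: "EOPNOTSUPP (operation not supported)",
    107: "ENOTCONN (transport endpoint is not connected)",
}

# Matches "status=-5", "error -95" and "(status -107)".
# Positive values such as "(status 24)" are left alone.
_NEG_CODE = re.compile(r"(?:status|error)[ =]-(\d+)")


def annotate(line):
    """Append the errno name to a log line that carries a negative code."""
    match = _NEG_CODE.search(line)
    if not match:
        return line
    name = LINUX_ERRNO.get(int(match.group(1)), "unknown errno")
    return "%s    <- %s" % (line, name)


sample = [
    "NFS: failed to create RPC client, status=-5",
    "RPC:    24 call_connect_status (status -107)",
    "RPC:    24 call_status (status 24)",
    "RPC:    24 call_verify: call failed with error -95",
]
for line in sample:
    print(annotate(line))
```

Read this way, the -5 in comment 11 is EIO, which matches the guess in comment 10, and the -95 here is EOPNOTSUPP, logged right after call_verify reports the procedure as unsupported by program 100000, version 4.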


Comment 15 Steve Dickson 2008-06-25 19:04:19 UTC
It appears the problem might be that the older portmappers
are only listening for requests on the UDP transport. I'm 
noting the rpcinfo output in the original Description of problem.
There is only one 'portmapper' entry and I'm guessing (since the
netid field is blank) that entry is a UDP listener. Usually
there are two 'portmapper' entries: one for UDP and one for TCP.

To test this theory, please put a '-o udp' on the mount command
which should cause the mount to succeed. 

Also a tshark trace as described in Comment #13 would also help
greatly...



Comment 16 john casey 2008-06-26 05:08:07 UTC
mount.nfs using udp also produced the same internal error

mount.nfs  snap:/Drive1 /mnt/snap -o udp

RPC:       creating mount client for snap (xprt f4dfd800)
RPC:    25 call_start mount3 proc 0 (sync)
RPC:    25 call_reserve (status 0)
RPC:    25 call_reserveresult (status 0)
RPC:    25 call_allocate (status 0)
RPC:    25 call_bind (status 0)
RPC:       creating rpcbind client for snap (xprt f249ec00)
RPC:    26 call_start rpcbind4 proc 9 (async)
RPC:    26 call_reserve (status 0)
RPC:    26 call_reserveresult (status 0)
RPC:    26 call_allocate (status 0)
RPC:    26 call_bind (status 0)
RPC:    26 call_connect xprt f249ec00 is not connected
RPC:       rpc_release_client(f79c1000)
RPC:    26 call_connect_status (status -107)
RPC:    26 call_timeout (minor)
RPC:    26 call_bind (status 0)
RPC:    26 call_connect xprt f249ec00 is connected
RPC:    26 call_transmit (status 0)
RPC:    26 call_encode (status 0)
RPC:    26 call_status (status 24)
RPC:    26 call_decode (status 24)
RPC:    26 call_verify: proc f8efdfec unsupported by program 100000, version 4 on server snap
RPC:    26 call_verify: call failed with error -95
RPC:    25 unrecognized rpcbind error (95)
RPC:       rpc_release_client(f79c1000)
RPC:       destroying rpcbind client for snap
RPC:       rpc_release_client(cb0c7000)
RPC:       shutting down mount client for snap
RPC:       rpc_release_client(cb0c7000)
RPC:       destroying mount client for snap


Comment 17 john casey 2008-06-26 05:26:08 UTC
tshark data for mount.nfs snap:/Drive1 /mnt/snap -o udp
Attached as /tmp/bz448898.pcap  bzip'ed.

  1   0.000000 192.168.56.126 -> 192.168.56.100 Portmap V4 GETVERSADDR Call
  2   0.000311 Adaptec_00:99:55 -> Broadcast    ARP Who has 192.168.56.126? Tell 192.168.56.100
  3   0.000338 Intel_77:37:73 -> Adaptec_00:99:55 ARP 192.168.56.126 is at 00:0e:0c:77:37:73
  4   0.000460 192.168.56.100 -> 192.168.56.126 Portmap V4 GETVERSADDR Reply (Call In 1)
  5   0.000492 Adaptec_00:99:55 -> Intel_77:37:73 ARP 192.168.56.100 is at 00:c0:b6:00:99:55


Comment 18 john casey 2008-06-26 05:35:56 UTC
sorry - for some reason bugzilla won't allow me to add the pcap attachment.  it
says the file is empty; but it's not.  I can email it to your redhat account.


Comment 19 Steve Dickson 2008-06-26 09:27:52 UTC
Please either email it to me or put it somewhere where I can download it.

Also, is the debug output in Comment #16 from 'rpcdebug -m rpc -s bind'? 

Comment 20 john casey 2008-06-26 15:46:05 UTC
both should be in your inbox

Comment 21 George Kraft 2008-08-27 20:34:37 UTC
I'm having a similar NFS issue on F9.  I turned the firewall off, then I was able to NFS mount.

Comment 22 Michael Cronenworth 2008-08-27 20:40:19 UTC
Firewall enabled or disabled makes no difference, at least for me.

Comment 23 Jeffrey M. Birnbaum 2008-08-27 22:24:55 UTC
A firewall that blocks NFS is a completely different issue, i.e. if you block NFS then NFS mounts don't work. 

I installed a fresh FC9 on new hardware and NFS does not work. I blew that away and installed FC8 on the same hw and NFS works fine. The progress on this bug is very annoying. There is a real bug here that makes FC9 unusable. Somehow this bug has been downgraded to 'low severity' which makes zero sense. I find that odd for multiple reasons esp given that multiple people are reporting the problem. In addition, there has not been a single useful acknowledgment of the bug.

Comment 24 Ulrich Drepper 2008-08-30 16:21:58 UTC
Do we know exactly which part of the program is causing the termination?  "internal error" is very non-descript.  If this is not clear we could as a first step build binaries which print some more information (like file and line number of the call).

Comment 25 Kent Reynolds 2008-09-03 06:06:38 UTC
(In reply to comment #21)
> I'm having a similar NFS issue on F9.  I turned the firewall off, then I was
> able to NFS mount.

A tangential comment, but useful if you came across this page while googling "Internal Error" for mount.nfs (as I did). However, turning off the firewall is only advised for testing. Once I discovered that it was the firewall, I went through testing other ports. It turned out to be the port for the mount daemon, which I found with rpcinfo -p.
I know this is the wrong place for this info, but it should be ruled out if one suspects this bug to be their issue, as I did.

Maybe someone should make mount.nfs a little bit more verbose and outlaw generic error messages?

Comment 26 Jeffrey M. Birnbaum 2008-10-31 03:04:07 UTC
Steve,
Is anyone ever going to step up to the plate and either explain what is causing the bug or fix it? Or just acknowledge that there is a bug that is causing mount to fail under FC9?
/JMB

Comment 27 Brian Martin 2008-11-04 17:52:25 UTC
I have an FC8 server, FC9 client

same error "mount.nfs: internal error"


Here is what worked for me:

on the client machine, "service rpcbind start".
I was then able to mount from FC9 to FC8 as expected

don't forget "chkconfig rpcbind on" to make it persistent

Comment 28 Michael Cronenworth 2008-12-05 18:15:31 UTC
Fedora 10 fixed this for me. Whatever was done, thanks.

Comment 29 Jeffrey M. Birnbaum 2008-12-06 09:24:52 UTC
(In reply to comment #28)
> Fedora 10 fixed this for me. Whatever was done, thanks.

Fedora 10 does not completely solve the problem for me. I can now mount all of my NAS devices, but one of them is very flaky. It constantly gets 'Stale NFS handle' errors. The NAS device in question is a SNAP Appliance device which has worked fine with every version of Fedora (and Ubuntu) up until Fedora 9. Because I saw no light at the end of the tunnel here, I decided to replace this NAS device with a new one. I bought a ReadyNAS Duo and it is working fine with F10.

Comment 30 Steve Dickson 2008-12-08 12:48:44 UTC
Jeffrey,

I would suspect the stale file handles are a different problem.
Could we address that in a different bz report? 

For everybody else,

I've updated both the rpcbind and libtirpc packages 
in F-9 to the latest upstream version (the same version that is in F-10).

The builds are:

http://koji.fedoraproject.org/koji/buildinfo?buildID=70422
http://koji.fedoraproject.org/koji/buildinfo?buildID=70421

Please give them a try and see if they help with the problem.

Comment 31 Arthur Dent 2009-01-17 13:58:54 UTC
If I might chime in here, this issue has been driving me nuts for months.

I have 2 identical (fully updated) F9 systems. NFS had worked fine for me ever since I set it up in Fedora Core mumble (3ish?). Suddenly when the update to kernel 2.6.27 came through it stopped working.

I tried everything I could think of, but nothing worked. My solution was to revert the server to the previous 2.6.26 kernel. This allowed NFS to work as normal.

This week both boxes received some yum updates among which were:

Jan 15 21:53:01 Updated: 32:bind-libs-9.5.1-1.P1.fc9.i386
Jan 15 21:53:03 Updated: selinux-policy-3.3.1-117.fc9.noarch
Jan 15 21:53:29 Updated: selinux-policy-devel-3.3.1-117.fc9.noarch
Jan 15 21:53:50 Updated: selinux-policy-targeted-3.3.1-117.fc9.noarch
Jan 15 21:54:25 Updated: file-4.23-7.fc9.i386
Jan 15 21:54:27 Updated: 32:bind-utils-9.5.1-1.P1.fc9.i386
Jan 15 21:54:30 Updated: 1:nfs-utils-1.1.2-9.fc9.i386

Now NFS no longer works with either kernel 2.6.26 or 2.6.27

I don't know what has caused the change, but I have tried it with the firewall on and off (on both systems) and selinux in both enforcing and permissive mode (on both systems).

I am now stuck. My server is headless and now almost inaccessible to me...

This - to me - is now severity "high". What more can I do?

AD

Comment 32 allan-redhat 2009-01-19 09:20:17 UTC
Had the same problem as AD above and urgently needed nfs working again. Tried many things, but think one of the following things made it work:

1) Added 

ALL: <ipnumber of client machine>/255.255.255.0

to my /etc/hosts.allow on the server

2) Added client machine to /etc/hosts on the server to allow name resolution to work

I think the problem is in the nfs-utils-1.1.2-9.fc9.i386 upgrade, but do not know specifically what the problem is.

Let me know if this fixes your problem or I need to dig further into what I did.

Comment 33 Steve Dickson 2009-01-19 13:44:55 UTC
> I think the problem is in the nfs-utils-1.1.2-9.fc9.i386 upgrade, but do not
> know specifically what the problem is.

With the 1.1.2-9.fc9 nfs-utils version, I "fixed" the tcp wrapper code
to work as it should: denying mounts from unknown IP addresses. I believe 
this is the way other system daemons (such as sshd) also work. 

Also see:
 https://bugzilla.redhat.com/show_bug.cgi?id=480420

which I believe is a dup of this bug.

Comment 34 allan-redhat 2009-01-19 18:21:34 UTC
(In reply to comment #33)
> With this the 1.1.2-9.fc9 nfs-utils version, I "fixed" the tcp wrapper code
> to work as it should. Denying mounts from unknown IP address. I believe 
> this is the way other system daemons (such as sshd) also work. 
> 
Ah, thanks Steve - that fits. I was checking out the changelog:

* Mon Jan 05 2009 Steve Dickson <steved> 1.1.2-9
  - Added warnings to tcp wrapper code when mounts are 
    denied due to misconfigured DNS configurations.
  - gssd: By default, don't spam syslog when users' credentials expire
* Sat Dec 20 2008 Steve Dickson <steved> 1.1.2-8
  - Re-enabled and fixed/enhanced tcp wrappers.


which made me believe that the 1.1.2-8 to 1.1.2-9 change would only add warnings. However, version 1.1.2-9 was the first nfs-utils update in fc9 - previous version (1.1.2-2) was from the core install. (Or I'm looking at a mirror without history.)

So my yum updater probably updated from 1.1.2-2 to 1.1.2-9, which resulted in the client no longer being able to mount its nfs shares because the tcp wrappers fix in 1.1.2-8 was included.

Comment 35 Arthur Dent 2009-01-20 10:59:20 UTC
I am a simple home user. I am not entirely sure as to the benefit of running my own DNS server for my little network (but I will if someone can demonstrate the benefit - other than just for fixing this...)

I can however confirm that putting
"192.168.123.100 othersysname" 
into my /etc/hosts file enabled NFS to work for me as before.
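
For anyone wanting to confirm that such an entry is actually present, here is a minimal Python sketch that looks up an IP address in /etc/hosts-style text. The function name and the sample text are illustrative only; it reads the file format and does no DNS lookup of its own.

```python
def hosts_names_for(hosts_text, ip):
    """Return the names that /etc/hosts-style text maps to the given IP."""
    names = []
    for raw in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # First field is the address, the rest are names and aliases.
        if fields[0] == ip:
            names.extend(fields[1:])
    return names


# Sample text using the entry quoted above; a real check would read /etc/hosts.
sample_hosts = """\
127.0.0.1        localhost localhost.localdomain
192.168.123.100  othersysname   # the other machine
"""

print(hosts_names_for(sample_hosts, "192.168.123.100"))
print(hosts_names_for(sample_hosts, "192.168.123.101"))
```

An empty result for the other machine's address would suggest the entry is missing, which is the situation the workaround above addresses.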

I have another problem however, which is related to but not exactly the same as this bug.

For me - with both machines running fully updated F9 installs, trying to access my NFS directories from the client machine causes Nautilus (or even the command line) to hang. This first happened when the kernel was upgraded from 2.6.26 to 2.6.27 - The only solution (for me) is to keep the server on 2.6.26 (the client can be on 2.6.27) and all works fine.

I have written more about it here: http://www.linuxformat.co.uk/index.php?name=PNphpBB2&file=viewtopic&t=9185

Is this problem connected to this bug?

Thanks...

AD

Comment 36 Fedora Update System 2009-01-20 12:53:01 UTC
nfs-utils-1.1.2-10.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/nfs-utils-1.1.2-10.fc9

Comment 37 Fedora Update System 2009-01-20 12:55:19 UTC
nfs-utils-1.1.4-7.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/nfs-utils-1.1.4-7.fc10

Comment 38 g. artim 2009-01-20 20:01:28 UTC
[root@n0 gartim]# yum list|grep nfs
nfs-utils.x86_64                       1:1.1.2-10.fc9             installed     
nfs-utils-lib.x86_64                   1.1.1-5.fc9                installed    

just to verify, upgrading to 1:1.1.2-10.fc9 for nfs-utils.x86_64 worked for me.

thanks

Comment 39 allan-redhat 2009-01-21 11:56:34 UTC
Good report g. artim.

I think that the underlying issue has been resolved, however can we do something with the weak error message that the mount command reports:

mount.nfs: internal error

The error message "internal error" does not point to the underlying cause of the error - to me it actually hints at some code error inside the mount command and not that the server denies access to the export.

Comment 40 g. artim 2009-01-21 23:25:31 UTC
note this, I found well after the fact that my log on my nfs server was flagging a problem. I was so tired (this happened late at nite) that I never looked there for messages, just over myopically checked the nfs client:

Jan 19 19:36:23 hostname mountd[14781]: Warning: Client IP address '192.168.1.2' not found in host lookup
Jan 19 20:04:55 hostname  mountd[2892]: Warning: Client IP address '192.168.1.2' not found in host lookup
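
To spot these warnings quickly on the server, something like the following Python sketch can pull the affected client addresses out of log text. The pattern is based only on the two lines above, so it is an assumption that other nfs-utils versions word the warning the same way; the function name is invented for illustration.

```python
import re

# Pattern derived from the two mountd warning lines quoted above.
_WARN = re.compile(
    r"mountd\[\d+\]: Warning: Client IP address '([^']+)' not found in host lookup"
)


def unresolved_clients(log_text):
    """Return client IPs mountd could not resolve, in first-seen order."""
    seen = []
    for match in _WARN.finditer(log_text):
        ip = match.group(1)
        if ip not in seen:
            seen.append(ip)
    return seen


# The two lines from the server log above; a real run would read /var/log/messages.
sample_log = """\
Jan 19 19:36:23 hostname mountd[14781]: Warning: Client IP address '192.168.1.2' not found in host lookup
Jan 19 20:04:55 hostname  mountd[2892]: Warning: Client IP address '192.168.1.2' not found in host lookup
"""

print(unresolved_clients(sample_log))
```

Each address it reports is a client the server could not find in its host lookup, which is the condition described in comment 33.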

live and learn,

-- Gary

Comment 41 Fedora Update System 2009-01-24 02:38:07 UTC
nfs-utils-1.1.4-7.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 42 Fedora Update System 2009-01-24 02:40:57 UTC
nfs-utils-1.1.2-10.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 43 Philip Walden 2009-01-27 04:00:17 UTC
My FC9 system just updated this package and now I am getting an internal error when I attempt a mount. It was working recently, so this update makes me suspicious.

[root@walden3 ~]# mount.nfs walden4:/opt2 /opt2 -v
mount.nfs: timeout set for Mon Jan 26 19:50:37 2009
mount.nfs: text-based options: 'addr=192.168.1.140'
mount.nfs: internal error