Description of problem:
When trying to mount an NFS partition from a Fedora 8 server on a Fedora 10 client, I get the error message: mount.nfs: Unknown error 521.

[root@willow src]# mount gabrielle:/export/vol/intra /mnt/nfs/
mount.nfs: Unknown error 521
[root@willow src]#

The same command from a Fedora 8 client to the Fedora 8 server works fine. The Fedora 10 machine also works fine as a server with the Fedora 8 machines as clients. There is a previously reported issue with the Fedora 8 server that delays its response to a mount request.

Version-Release number of selected component (if applicable):
kernel-2.6.27.5-117.fc10.x86_64

How reproducible:
Unsure at present. It has happened sometimes; other file systems are OK.

Steps to Reproduce:
1. f10server# mount f8server:/export/partition /mnt/nfs/

Actual results:
Error message: mount.nfs: Unknown error 521

Expected results:
A meaningful error message (Permission Denied, Time-Out, etc.) or success.

Additional info:
Strange, up until this past weekend when I updated my server to F10, I had the same configuration and NFS was working fine... Does anything show up in dmesg when you do one of these failed mount attempts?
Jeff,

OK, I've been investigating further. It turns out a corruption had crept into my /etc/exports file and the filesystem wasn't being exported to the f10client. (The f10 machine is now my new home directory server and had started out life on a temporary IP address which was cached on the f8server...)

This now appears to be a simple case of a misleading error message: if it had just said "Permission Denied" or quite possibly "Request Timed Out", that would have made life much easier. I've moved this down to severity low, but maybe you could consider fixing it so that "Unknown error 521" is replaced by permission denied, request timed out, or something similar.

The situation may be somewhat unusual, as the f8server takes a good thirty seconds to respond to a mount request the first time it's asked, due to the symptoms detailed in bugzilla bug 452430. Basically, due to the large number of lvm partitions (>40), the device scan gets into an endless loop, timing out before it has actually finished stat'ing that many mounted filesystems. This means that rpc.mountd disappears into a 99% CPU load thrash for about thirty seconds, until it too times out somewhere in the code, decides that the requested node is indeed exported just fine, and returns that to the requesting host, just seriously belatedly. I suspect that timeout value may be the same as that of the NFS client (in this case the f10 box) making the request. Hence the unknown error 521 message...

Hope that clarifies things. Thanks for looking at the bug.

Regards, Bevis.
With a corrupt exports file, we may not be able to fix this. Do you have a way to reproduce this? Could you attach a copy of your corrupted exports file to this case?

On the other thing (30 seconds to respond to a mount request): that doesn't appear to be a bug in mountd, but rather one in libblkid1 that was fixed in Debian's e2fsprogs in 1.39+1.40-WIP-2007.04.07+dfsg-1. I vaguely recall a similar bug in Fedora/RHEL, but I thought it was fixed quite some time ago. F8 now uses e2fsprogs-1.40.4-3.fc8. Is this still a problem with that package? If so, you may want to transition bug 452430 to an e2fsprogs bug, though I'm not sure whether it'll be fixed this close to F8's EOL.

You may also (though I'm not certain) be able to work around that by assigning explicit fsid values to your exports. See the fsid option in exports(5).
Jeff - I'm sorry I no longer have an appropriate exports file - I didn't keep an archive of older versions. I think you'd actually get the same effect if the partition was simply not exported to that host. Regards, Bevis.
I see this bug when trying to connect to an FC8 box from either of two FC10 boxes. Both FC10 boxes connect to each other fine, and the FC8 box connects to them as well. The problem did not exist while all of them were FC7 and FC8.

The following update of the FC8 box did not change anything:

 Package               Arch   Version         Repository       Size
=============================================================================
Updating:
 e2fsprogs             i386   1.40.4-3.fc8    updates-newkey   610 k
Installing for dependencies:
 device-mapper-devel   i386   1.02.22-1.fc8   fedora           137 k
Updating for dependencies:
 e2fsprogs-devel       i386   1.40.4-3.fc8    updates-newkey   644 k
 e2fsprogs-libs        i386   1.40.4-3.fc8    updates-newkey   138 k

What is really interesting is that sometimes I do get NFS clients connected to the FC8 box, but it seems to me that this is possible only within some interval after a reboot of the FC8 box, and not right after the reboot.

The /etc/exports file on FC8 is trivial:

/ *.iv.dev.null(rw,sync,no_root_squash)
/huge1 *.iv.dev.null(rw,sync,no_root_squash)
/huge2 *.iv.dev.null(rw,sync,no_root_squash)
/oldroot *.iv.dev.null(rw,sync,no_root_squash)

All four exports are whole mounts of ext3 partitions, with no special mount parameters used. The corresponding lines of /etc/fstab on the FC10 clients are:

10.1.1.1:/ /master nfs rsize=8192,wsize=8192,timeo=14,intr 0 0
10.1.1.1:/huge1/ /master/huge1 nfs rsize=8192,wsize=8192,timeo=14,intr 0 0
10.1.1.1:/huge2/ /master/huge2 nfs rsize=8192,wsize=8192,timeo=14,intr 0 0
10.1.1.1:/oldroot/ /master/oldroot nfs rsize=8192,wsize=8192,timeo=14,intr 0 0

The error 521 message is returned instantly, so it is surely not a timeout. I have a live specimen here, so feel free to ask questions or ask me to run some tests.
This error is:

#define EBADHANDLE 521 /* Illegal NFS file handle */

...which sounds like either the client or the server is sending along bad filehandles. What might be best is a binary network capture of a mount attempt between these two hosts. Something like this from the client:

# tcpdump -i [ifname] -s0 -w /tmp/mount-attempt.pcap host [server]

...then attempt the mount. After it fails, ^c the capture and attach the file to the case so I can have a look at what's happening on the wire.
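For anyone wondering why mount.nfs prints "Unknown error 521" rather than a readable message: codes 521 through 528 are kernel-internal NFS errors defined in the Linux kernel's include/linux/errno.h, and libc has no strerror() text for them. A minimal Python sketch of a lookup that falls back to the C library for ordinary codes; the table and the describe_errno helper are illustrative only, not part of nfs-utils:

```python
import os

# Kernel-internal NFSv3 error codes (Linux include/linux/errno.h).
# They are not meant to reach user space, so libc cannot name them.
KERNEL_NFS_ERRNOS = {
    521: ("EBADHANDLE", "Illegal NFS file handle"),
    522: ("ENOTSYNC", "Update synchronization mismatch"),
    523: ("EBADCOOKIE", "Cookie is stale"),
    524: ("ENOTSUPP", "Operation is not supported"),
    525: ("ETOOSMALL", "Buffer or request is too small"),
    526: ("ESERVERFAULT", "An untranslatable error occurred"),
    527: ("EBADTYPE", "Type not supported by server"),
    528: ("EJUKEBOX", "Request initiated, but will not complete before timeout"),
}


def describe_errno(code):
    """Return a readable description, preferring the kernel-internal table."""
    if code in KERNEL_NFS_ERRNOS:
        name, text = KERNEL_NFS_ERRNOS[code]
        return f"{name} ({code}): {text}"
    # Ordinary errno values: let the C library describe them.
    return f"errno {code}: {os.strerror(code)}"


print(describe_errno(521))
```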
Created attachment 329229
Network capture of an attempt that demonstrates error 521

The FC8 NFS server is on the box named master.iv.dev.null with IP address 10.1.1.1. The FC10 NFS client is on the box named octo.iv.dev.null with IP address 10.1.1.16. The capture was made with:

tcpdump -i eth0 -s0 -w /tmp/mount-attempt.pcap host 10.1.1.1 and port 2049

14 packets captured, 14 received by filter, 0 dropped by kernel.

The tail of /var/log/messages on the server box contains only one relevant line:

Jan 16 22:43:02 master mountd[7404]: authenticated mount request from octo.iv.dev.null:817 for / (/)
The "port 2049" clause here is excluding all of the mountd traffic. We'll need to see that to tell whether the server is sending a bogus filehandle or the client is mangling it somehow. I do see this in the capture:

10 0.000978 10.1.1.16 -> 10.1.1.1 NFS V3 FSINFO Call, FH:0x00007f89
11 0.001195 10.1.1.1 -> 10.1.1.16 NFS V3 FSINFO Reply (Call In 10) Error:NFS3ERR_BADHANDLE

...so that's where the error is coming from. The client is sending a filehandle and the server is rejecting it. The problem is that without the mountd communications we can't tell whether this is a client or server problem. Please redo the capture w/o filtering on the port...
The traffic between these two boxes is too heavy to be captured entirely (they are part of an instrumental server farm used for RDBMS development). I've scheduled them for maintenance on Tuesday, and it will be possible to retry with all noisy services switched off and only NFS in use.
The capture shouldn't take long and I can filter out what we don't need to see. Another option is to determine which ports mountd uses on the server and add those to the filter. Do:

$ rpcinfo -p [server]

...and look for the mountd service. Usually there will be 2 ports, one for TCP and one for UDP. Be sure to get both. The ports have to be OR'ed together inside the filter, since no single packet matches several different ports at once. The filter will then look like:

tcpdump -i eth0 -s0 -w /tmp/mount-attempt.pcap 'host 10.1.1.1 and (port 2049 or port mountd_udp_port or port mountd_tcp_port)'

...that may give enough info to go on here.
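If it helps, the filter expression can be assembled from the rpcinfo output mechanically. This is only a rough Python sketch: the SAMPLE text is made up (ports 746 and 749 are placeholders, not values from this case), and mountd_ports and capture_filter are hypothetical helper names. It assumes the usual five-column "program vers proto port service" layout printed by rpcinfo -p.

```python
def mountd_ports(rpcinfo_output):
    """Collect the distinct ports registered for mountd in `rpcinfo -p` output."""
    ports = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Data rows have five columns: program, vers, proto, port, service.
        if len(fields) == 5 and fields[4] == "mountd" and fields[3].isdigit():
            ports.add(int(fields[3]))
    return sorted(ports)


def capture_filter(server, rpcinfo_output, nfs_port=2049):
    """Build a tcpdump filter that matches NFS and mountd traffic for one host."""
    ports = [nfs_port] + [p for p in mountd_ports(rpcinfo_output) if p != nfs_port]
    # The ports are OR'ed: a packet only ever matches one of them.
    clause = " or ".join(f"port {p}" for p in ports)
    return f"host {server} and ({clause})"


# Made-up sample output; real port numbers will differ per server.
SAMPLE = """\
   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100003    3   udp   2049  nfs
    100003    3   tcp   2049  nfs
    100005    3   udp    746  mountd
    100005    3   tcp    749  mountd
"""

print(capture_filter("10.1.1.1", SAMPLE))
```

The printed expression should be passed to tcpdump as a single quoted argument so the shell does not interpret the parentheses.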
Someone at RH pinged me internally on this. They had a RHEL5 server and a F10 client. The interesting parts from the mount attempt:

18 0.004937 10.11.243.176 -> 10.11.243.135 MOUNT V3 MNT Call /exports
19 0.005181 10.11.243.135 -> 10.11.243.176 MOUNT V3 MNT Reply (Call In 18) Error:ERR_ACCESS

...and...

42 0.012930 10.11.243.176 -> 10.11.243.135 NFS V3 FSINFO Call, FH:0x00007f34
43 0.012974 10.11.243.135 -> 10.11.243.176 NFS V3 FSINFO Reply (Call In 42) Error:NFS3ERR_BADHANDLE

...so it looks like we got an error from the mount request but the client still tried to do the mount anyway. In his case, he had exported /export (no "s" on the end) and had just fat-fingered the mount command. This may just be bad error handling by the mount helper program.
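To illustrate the handling the trace suggests is missing, here is a small Python model of a client that checks the MNT status before issuing FSINFO. The status values are the mountstat3 codes from RFC 1813; everything else (MountError, check_mnt_reply, mount, and the two callables standing in for the RPC calls) is hypothetical and is not the actual client code or the eventual fix.

```python
MNT3_OK = 0

# mountstat3 values from the MOUNT version 3 protocol (RFC 1813, Appendix I).
MOUNTSTAT3 = {
    1: "MNT3ERR_PERM",
    2: "MNT3ERR_NOENT",
    5: "MNT3ERR_IO",
    13: "MNT3ERR_ACCES",
    20: "MNT3ERR_NOTDIR",
    22: "MNT3ERR_INVAL",
    63: "MNT3ERR_NAMETOOLONG",
    10004: "MNT3ERR_NOTSUPP",
    10006: "MNT3ERR_SERVERFAULT",
}


class MountError(Exception):
    """Raised when the MNT call itself is rejected by the server."""


def check_mnt_reply(status):
    """Stop here on any non-OK status instead of carrying on with the mount."""
    if status != MNT3_OK:
        raise MountError(MOUNTSTAT3.get(status, f"unknown mountstat3 {status}"))


def mount(export, mnt_call, fsinfo_call):
    """mnt_call and fsinfo_call are stand-ins for the real RPC requests."""
    status, fhandle = mnt_call(export)
    check_mnt_reply(status)  # without this, a junk handle goes to FSINFO
    return fsinfo_call(fhandle)


# Simulate the trace above: the server answers MNT with an access error.
try:
    mount("/exports", lambda path: (13, None), lambda fh: "fsinfo ok")
except MountError as err:
    print(f"mount failed: {err}")
```

With the check in place the failure surfaces as an access error from the MNT step, and the FSINFO call that produced NFS3ERR_BADHANDLE is never sent.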
I had the same problem with my FC10. In my case, I have two boxes on the same subnet (192.168.1.0/24) and a third box which is a VirtualBox guest running on one of those two boxes. I was able to NFS mount from my client to the NFS server. I was NOT able to mount the NFS shares from the server onto the server itself. I was NOT able to mount the NFS shares from the VirtualBox guest. Checking /var/log/messages, I found:

Feb 10 10:19:35 storage mountd[19664]: refused mount request from 192.168.1.50 for /var/cache/yum/updates/packages (/var/cache/yum/updates/packages): illegal port 58490
Feb 10 10:19:41 storage mountd[19664]: refused mount request from 192.168.1.50 for /var/cache/yum/fedora/packages (/var/cache/yum/fedora/packages): illegal port 58497
Feb 10 10:19:41 storage mountd[19664]: refused mount request from 192.168.1.50 for /var/cache/yum/updates/packages (/var/cache/yum/updates/packages): illegal port 58501
Feb 10 10:19:41 storage mountd[19664]: refused mount request from 192.168.1.50 for /store (/store): illegal port 58505

After a little research, I found the problem was caused by the VirtualBox NAT. Adding "insecure" to my /etc/exports entries fixed the problem. I hope this can help with your problem.
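For context on the "illegal port" lines: per exports(5), exports default to the "secure" option, which requires requests to originate from a port below IPPORT_RESERVED (1024), and NAT typically rewrites the source port to a high one. A simplified Python model of that rule follows; port_allowed is a hypothetical helper and not mountd's actual code.

```python
IPPORT_RESERVED = 1024  # requests must come from a port below this by default


def port_allowed(source_port, export_options):
    """Model of the secure/insecure export option described in exports(5)."""
    opts = {opt.strip() for opt in export_options.split(",")}
    if "insecure" in opts:
        return True  # any source port is accepted
    return source_port < IPPORT_RESERVED


print(port_allowed(58490, "rw,sync"))           # high port, default "secure"
print(port_allowed(58490, "rw,sync,insecure"))  # same port with "insecure"
```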
This turns out to be a kernel problem. The mount client there isn't recognizing the error properly. F11 doesn't have this problem. When I try to mount an export that doesn't exist, I get a proper error:

# mount -t nfs salusa:/exportfoo /mnt/test
mount.nfs: access denied by server while mounting salusa:/exportfoo

...looks like this was fixed in 2.6.28. I'm going to close this with a resolution of UPSTREAM. Should be fixed when f10 moves to 2.6.28 kernels or later.