Bug 868584
Description
Wendell Baker
2012-10-21 00:09:42 UTC
(In reply to comment #0) > (retry manually) > $ sudo mount.nfs -v drippie:/mnt/bisquick /tmp/drippie/bisquick > mount.nfs: timeout set for Sat Oct 20 16:28:46 2012 > mount.nfs: trying text-based options > 'vers=4,addr=clientaddr=fd12:3456:789a:0:212:34ff:fe56:789a,local_lock=none, > addr=fd12:3456:789a:0:298:76ff:fe54:3210' > mount.nfs: mount(2): Cannot allocate memory This means the actual mount(2) system call is failing. Could you please turn on kernel debugging in the mounting code by do rpcdebug -m nfs -s mount Then please post the contains of /var/log/message after the failure. Also a network trace could be handy as well. On the client: tshark -o /tmp/data.pcap host <server-name> bzip2 /tmp/data.pcap Also: is there anything different about the filesystem at /mnt/bisquick? (Could you just paste the output of "mount" run on the server?) Both devices (clarification & bisquick) are the same. Mounted the same. (last two lines) [wbaker:wbaker@drippie (F17 Beefy_Miracle, nfs server) $ mount proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel) devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=1019204k,nr_inodes=214592,mode=755) devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000) tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel) tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755) /dev/mapper/vg_drippie-lv_root on / type ext4 (rw,relatime,seclabel,data=ordered) securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime) tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,seclabel,mode=755) cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd) cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu) cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls) cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=31,pgrp=1,timeout=300,minproto=5,maxproto=5,direct) debugfs on /sys/kernel/debug type debugfs (rw,relatime) configfs on /sys/kernel/config type configfs (rw,relatime) mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel) hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel) tmpfs on /media type tmpfs (rw,nosuid,nodev,noexec,relatime,seclabel,mode=755) sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime) sunrpc on /proc/fs/nfsd type nfsd (rw,relatime) /dev/sda1 on /boot type ext4 (rw,relatime,seclabel,data=ordered) /dev/mapper/vg_drippie-lv_home on /home type ext4 (rw,relatime,seclabel,data=ordered) /dev/mapper/vg_drippie-lv_local on /local type ext4 (rw,relatime,seclabel,data=ordered) /etc/auto/direct on /srv type autofs (rw,relatime,fd=6,pgrp=929,timeout=300,minproto=5,maxproto=5,direct) /etc/auto/direct on /work type autofs (rw,relatime,fd=6,pgrp=929,timeout=300,minproto=5,maxproto=5,direct) /etc/auto/backup on /backup type autofs (rw,relatime,fd=11,pgrp=929,timeout=300,minproto=5,maxproto=5,indirect) /etc/auto/borg on /borg type autofs (rw,relatime,fd=17,pgrp=929,timeout=300,minproto=5,maxproto=5,indirect) /etc/auto/net on /net type autofs (rw,relatime,fd=23,pgrp=929,timeout=60,minproto=5,maxproto=5,indirect) /etc/auto/dlna on /dlna type autofs (rw,relatime,fd=29,pgrp=929,timeout=300,minproto=5,maxproto=5,indirect) /etc/auto/vol on /vol type autofs (rw,relatime,fd=35,pgrp=929,timeout=300,minproto=5,maxproto=5,indirect) /dev/mapper/luks-4c8d49ad-a06f-49be-9e65-789b41721af8 on /mnt/bisquick type ext4 (rw,nosuid,nodev,relatime,seclabel,data=ordered) /dev/mapper/luks-547d2d1a-2896-4fa5-bd05-c26597e7eff0 on /mnt/clarification type ext4 (rw,nosuid,nodev,relatime,seclabel,data=ordered) They are both happy as local filesystems on drippie $ ls -al /mnt/bisquick total 76 drwxrwxrwt. 15 root root 4096 Jul 21 10:12 . drwxr-xr-x. 4 root root 4096 Oct 14 17:37 .. drwxrwsr-t. 5 wbaker source 4096 Jan 18 2012 archive drwxrwsr-t. 11 wbaker source 4096 Oct 22 10:00 areas drwxrwxr-x. 4 wbaker bakers 4096 Feb 6 2012 deprecated drwxrwsr-x. 4 wbaker bakers 4096 Dec 23 2011 devices drwxrwsr-t. 75 wbaker bakers 4096 Feb 5 2012 disks drwxrwsr-t. 7 wbaker source 4096 Dec 20 2011 foreign drwxrwxr-x. 4 wbaker wbaker 4096 Jul 21 10:06 j drwxrwsr-t. 9 wbaker source 4096 Feb 4 2012 local drwx------. 2 root root 16384 Jan 15 2012 lost+found -rw-rw-r--. 1 wbaker wbaker 177 Jul 21 10:12 PROVENANCE drwxrwsr-x. 2 wbaker source 4096 Jul 21 10:08 self drwx------. 4 wbaker wbaker 4096 Jan 29 2011 .Trash-500 drwxrwxr-x. 6 wbaker bakers 4096 Jan 6 2012 users drwxrwsr-x. 17 wbaker source 4096 Oct 15 10:09 whole $ ls -al /mnt/clarification total 36 drwxrwsrwt. 5 wbaker source 4096 Apr 17 2012 . drwxr-xr-x. 4 root root 4096 Oct 14 17:37 .. drwxrwsr-x. 2 wbaker source 4096 Jan 6 2012 logs drwx------. 2 root root 16384 Dec 17 2011 lost+found -rwxrwxr-x. 1 wbaker source 1684 Apr 17 2012 sync drwxrwsrwt. 9 wbaker source 4096 Mar 9 2012 volumes working on the tcpdump exhibition. eggplants is busy doing stuff now, so I had to use another F17 host for the client. eggplants is on a more modern kernel $ uname -a Linux eggplants 3.6.1-1.fc17.i686.PAE #1 SMP Wed Oct 10 12:32:58 UTC 2012 i686 i686 i386 GNU/Linux frequented exhibits the issue (similar to F14 oddly enough) frequented is the client; $ uname -a; cat /etc/fedora-release ; rpm -q -a | grep nfs | sort Linux frequented 3.4.6-2.fc17.i686 #1 SMP Thu Jul 19 22:15:33 UTC 2012 i686 i686 i386 GNU/Linux Fedora release 17 (Beefy Miracle) libnfsidmap-0.25-3.fc17.i686 nfs-utils-1.2.6-3.fc17.i686 pertaining to the nearby pcap txt files Experiment #1 $ for i in /mnt/bisquick /mnt/clarification /local /opt ; do b=$(basename $i) ; echo $i ; mkdir -p /tmp/drippie/$b ; sudo mount -t nfs drippie:$i /tmp/drippie/$b ; done /mnt/bisquick mount.nfs: Connection timed out /mnt/clarification /local /opt Experiment #2 $ sudo mount.nfs drippie:/mnt/bisquick /tmp/drippie/bisquick -v -v mount.nfs: timeout set for Mon Oct 22 11:49:03 2012 mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: trying text-based options 'vers=4,addr=fd12:3456:789a:0:298:76ff:fe54:3210,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' mount.nfs: mount(2): Stale NFS file handle mount.nfs: Connection timed out Surrounding both experiments on client & server ... on frequented for exp in "exp1" "exp2" sudo tcpdump -i em1 -w on_frequented_nfs_between_frequented_and_drippie_$exp.pcap -e 'host drippie || host frequented' on drippie for exp in "exp1" "exp2" sudo tcpdump -i em1 -w on_drippie_nfs_between_frequented_and_drippie_$exp.pcap -e 'host drippie || host frequented' for i in *.pcap ; do tcpdump -r $i >& o.$i.txt; done for i in *.txt ; do grep -Ee '(frequented.*drippie|drippie.*frequented)' $i > $i.only; done perl -i -p -e "s/$(dnsdomainname)/example.com/g" *.only for i in *.only ; do bzip2 < $i > $i.bz2; done *.bz2 submitted nearby Created attachment 631699 [details]
on frequented: nfs between frequented and drippie, experiment #1; edited; compressed text
Created attachment 631700 [details]
on frequented: nfs between frequented and drippie, experiment #1; edited; compressed text
Created attachment 631701 [details]
on frequented: nfs between frequented and drippie, experiment #2; edited; compressed text
Created attachment 631702 [details]
on drippie: nfs between frequented and drippie, experiment #2; edited; compressed text
Created attachment 631703 [details]
on drippie: nfs between frequented and drippie, experiment #1; edited; compressed text
exhibition of rpcdebug from the client wbaker:wbaker@frequented $ sudo rpcdebug -m nfs -s mount nfs mount $ sudo mount.nfs drippie:/mnt/bisquick /tmp/drippie/bisquick -v -v <output previously exhibited> $ sudo rpcdebug -m nfs -c mount nfs <no flags set> $ tail /var/log/messages Oct 22 12:18:09 frequented kernel: [7341782.538068] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:09 frequented kernel: [7341782.538089] NFS: parsing nfs mount option 'vers=4' Oct 22 12:18:09 frequented kernel: [7341782.538102] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:18:09 frequented kernel: [7341782.538119] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:09 frequented kernel: [7341782.538132] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:18:09 frequented kernel: [7341782.538136] --> nfs4_try_mount() Oct 22 12:18:09 frequented kernel: [7341782.569053] <-- nfs4_try_mount() = -116 [error] Oct 22 12:18:09 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:18:10 frequented kernel: [7341783.570130] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:10 frequented kernel: [7341783.570161] NFS: parsing nfs mount option 'vers=4' Oct 22 12:18:10 frequented kernel: [7341783.570182] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:18:10 frequented kernel: [7341783.570208] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:10 frequented kernel: [7341783.570229] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:18:10 frequented kernel: [7341783.570236] --> nfs4_try_mount() Oct 22 12:18:10 frequented kernel: [7341783.584068] <-- nfs4_try_mount() = -116 [error] Oct 22 12:18:12 frequented kernel: [7341785.584985] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:12 frequented kernel: [7341785.585057] NFS: parsing nfs mount option 'vers=4' Oct 22 12:18:12 frequented kernel: [7341785.585077] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:18:12 frequented kernel: [7341785.585103] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:12 frequented kernel: [7341785.585124] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:18:12 frequented kernel: [7341785.585130] --> nfs4_try_mount() Oct 22 12:18:12 frequented kernel: [7341785.603054] <-- nfs4_try_mount() = -116 [error] Oct 22 12:18:12 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:18:16 frequented kernel: [7341789.603964] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:16 frequented kernel: [7341789.603997] NFS: parsing nfs mount option 'vers=4' Oct 22 12:18:16 frequented kernel: [7341789.604063] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:18:16 frequented kernel: [7341789.604089] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:16 frequented kernel: [7341789.604110] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:18:16 frequented kernel: [7341789.604117] --> nfs4_try_mount() Oct 22 12:18:16 frequented kernel: [7341789.621067] <-- nfs4_try_mount() = -116 [error] Oct 22 12:18:16 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:18:24 frequented kernel: [7341797.621881] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:24 frequented kernel: [7341797.621914] NFS: parsing nfs mount option 'vers=4' Oct 22 12:18:24 frequented kernel: [7341797.621935] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:18:24 frequented kernel: [7341797.621961] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:24 frequented kernel: [7341797.621982] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:18:24 frequented kernel: [7341797.621988] --> nfs4_try_mount() Oct 22 12:18:24 frequented kernel: [7341797.643090] <-- nfs4_try_mount() = -116 [error] Oct 22 12:18:24 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:18:27 frequented named[11830]: client 74.53.87.138#52069 (d.8.3.0.7.6.0.0.0.7.4.0.1.0.0.2.ip6.arpa): view edge-clients: query: d.8.3.0.7.6.0.0.0.7.4.0.1.0.0.2.ip6.arpa IN SOA -E (173.228.29.167) Oct 22 12:18:34 frequented kernel: [7341807.644068] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:34 frequented kernel: [7341807.644101] NFS: parsing nfs mount option 'vers=4' Oct 22 12:18:34 frequented kernel: [7341807.644121] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:18:34 frequented kernel: [7341807.644148] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:34 frequented kernel: [7341807.644169] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:18:34 frequented kernel: [7341807.644175] --> nfs4_try_mount() Oct 22 12:18:34 frequented kernel: [7341807.662073] <-- nfs4_try_mount() = -116 [error] Oct 22 12:18:34 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:18:44 frequented kernel: [7341817.663162] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:44 frequented kernel: [7341817.663196] NFS: parsing nfs mount option 'vers=4' Oct 22 12:18:44 frequented kernel: [7341817.663216] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:18:44 frequented kernel: [7341817.663242] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:44 frequented kernel: [7341817.663264] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:18:44 frequented kernel: [7341817.663270] --> nfs4_try_mount() Oct 22 12:18:44 frequented kernel: [7341817.683088] <-- nfs4_try_mount() = -116 [error] Oct 22 12:18:44 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:18:54 frequented kernel: [7341827.684163] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:54 frequented kernel: [7341827.684193] NFS: parsing nfs mount option 'vers=4' Oct 22 12:18:54 frequented kernel: [7341827.684213] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:18:54 frequented kernel: [7341827.684239] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:18:54 frequented kernel: [7341827.684260] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:18:54 frequented kernel: [7341827.684267] --> nfs4_try_mount() Oct 22 12:18:54 frequented kernel: [7341827.704088] <-- nfs4_try_mount() = -116 [error] Oct 22 12:18:54 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:19:04 frequented kernel: [7341837.705307] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:04 frequented kernel: [7341837.705339] NFS: parsing nfs mount option 'vers=4' Oct 22 12:19:04 frequented kernel: [7341837.705360] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:19:04 frequented kernel: [7341837.705386] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:04 frequented kernel: [7341837.705407] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:19:04 frequented kernel: [7341837.705413] --> nfs4_try_mount() Oct 22 12:19:04 frequented kernel: [7341837.723045] <-- nfs4_try_mount() = -116 [error] Oct 22 12:19:04 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:19:14 frequented kernel: [7341847.222609] NFS: nfs mount opts='sloppy,vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:14 frequented kernel: [7341847.222632] NFS: parsing nfs mount option 'sloppy' Oct 22 12:19:14 frequented kernel: [7341847.222638] NFS: relaxing parsing rules Oct 22 12:19:14 frequented kernel: [7341847.222641] NFS: parsing nfs mount option 'vers=4' Oct 22 12:19:14 frequented kernel: [7341847.222653] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:19:14 frequented kernel: [7341847.222669] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:14 frequented kernel: [7341847.222683] NFS: MNTPATH: '/mnt/clarification' Oct 22 12:19:14 frequented kernel: [7341847.222687] --> nfs4_try_mount() Oct 22 12:19:14 frequented kernel: [7341847.243053] <-- nfs4_try_mount() = 0 Oct 22 12:19:14 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:19:14 frequented kernel: [7341847.724590] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:14 frequented kernel: [7341847.724623] NFS: parsing nfs mount option 'vers=4' Oct 22 12:19:14 frequented kernel: [7341847.724643] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:19:14 frequented kernel: [7341847.724670] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:14 frequented kernel: [7341847.724691] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:19:14 frequented kernel: [7341847.724697] --> nfs4_try_mount() Oct 22 12:19:14 frequented kernel: [7341847.738055] <-- nfs4_try_mount() = -116 [error] Oct 22 12:19:24 frequented kernel: [7341857.738991] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:24 frequented kernel: [7341857.739068] NFS: parsing nfs mount option 'vers=4' Oct 22 12:19:24 frequented kernel: [7341857.739089] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:19:24 frequented kernel: [7341857.739115] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:24 frequented kernel: [7341857.739136] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:19:24 frequented kernel: [7341857.739142] --> nfs4_try_mount() Oct 22 12:19:24 frequented kernel: [7341857.759065] <-- nfs4_try_mount() = -116 [error] Oct 22 12:19:25 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:19:34 frequented kernel: [7341867.760007] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:34 frequented kernel: [7341867.760083] NFS: parsing nfs mount option 'vers=4' Oct 22 12:19:34 frequented kernel: [7341867.760104] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:19:34 frequented kernel: [7341867.760130] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:34 frequented kernel: [7341867.760151] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:19:34 frequented kernel: [7341867.760158] --> nfs4_try_mount() Oct 22 12:19:35 frequented kernel: [7341867.778060] <-- nfs4_try_mount() = -116 [error] Oct 22 12:19:35 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:19:45 frequented kernel: [7341877.778860] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:45 frequented kernel: [7341877.778892] NFS: parsing nfs mount option 'vers=4' Oct 22 12:19:45 frequented kernel: [7341877.778913] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:19:45 frequented kernel: [7341877.778939] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:45 frequented kernel: [7341877.778960] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:19:45 frequented kernel: [7341877.778966] --> nfs4_try_mount() Oct 22 12:19:45 frequented kernel: [7341877.799069] <-- nfs4_try_mount() = -116 [error] Oct 22 12:19:45 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:19:49 frequented named[11830]: client 74.53.87.138#63953 (0.a.f.9.4.0.0.0.8.a.5.0.1.0.0.2.ip6.arpa): view edge-clients: query: 0.a.f.9.4.0.0.0.8.a.5.0.1.0.0.2.ip6.arpa IN SOA -E (173.228.29.167) Oct 22 12:19:55 frequented kernel: [7341887.800150] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:55 frequented kernel: [7341887.800183] NFS: parsing nfs mount option 'vers=4' Oct 22 12:19:55 frequented kernel: [7341887.800203] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:19:55 frequented kernel: [7341887.800229] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:19:55 frequented kernel: [7341887.800250] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:19:55 frequented kernel: [7341887.800256] --> nfs4_try_mount() Oct 22 12:19:55 frequented kernel: [7341887.820089] <-- nfs4_try_mount() = -116 [error] Oct 22 12:19:55 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:20:05 frequented kernel: [7341897.820923] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:20:05 frequented kernel: [7341897.820954] NFS: parsing nfs mount option 'vers=4' Oct 22 12:20:05 frequented kernel: [7341897.820974] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:20:05 frequented kernel: [7341897.821000] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:20:05 frequented kernel: [7341897.821063] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:20:05 frequented kernel: [7341897.821070] --> nfs4_try_mount() Oct 22 12:20:05 frequented kernel: [7341897.835051] <-- nfs4_try_mount() = -116 [error] Oct 22 12:20:05 frequented udevd[350]: specified group 'plugdev' unknown Oct 22 12:20:15 frequented kernel: [7341907.835965] NFS: nfs mount opts='vers=4,addr=fd12:3456:789a:0:2ab:cdff:feab:cdef,clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:20:15 frequented kernel: [7341907.835999] NFS: parsing nfs mount option 'vers=4' Oct 22 12:20:15 frequented kernel: [7341907.836063] NFS: parsing nfs mount option 'addr=fd12:3456:789a:0:2ab:cdff:feab:cdef' Oct 22 12:20:15 frequented kernel: [7341907.836089] NFS: parsing nfs mount option 'clientaddr=fd12:3456:789a:0:2ab:cdff:feef:abcd' Oct 22 12:20:15 frequented kernel: [7341907.836110] NFS: MNTPATH: '/mnt/bisquick' Oct 22 12:20:15 frequented kernel: [7341907.836117] --> nfs4_try_mount() Oct 22 12:20:15 frequented kernel: [7341907.856066] <-- nfs4_try_mount() = -116 [error] Oct 22 12:20:15 frequented udevd[350]: specified group 'plugdev' unknown Reminder ... NFS "works" between frequented and drippie for /mnt/bisquick, one just has to use NFSv2. NFS "works" between frequented and drippie for /mnt/clarification in nfs4 [on frequented] $ sudo mount.nfs drippie:/mnt/bisquick /tmp/drippie/bisquick -o vers=2 -v -v mount.nfs: timeout set for Mon Oct 22 12:29:22 2012 mount.nfs: trying text-based options 'vers=2,addr=<drippie'sfd::/8>' mount.nfs: prog 100003, trying vers=2, prot=6 mount.nfs: trying <drippie'sfd::/8> prog 100003 vers 2 prot TCP port 2049 mount.nfs: prog 100005, trying vers=1, prot=17 mount.nfs: trying <drippie'sfd::/8> prog 100005 vers 1 prot UDP port 20048 (OK) [on frequented] $ ls -l /tmp/drippie/clarification total 28 drwxrwsr-x. 2 wbaker source 4096 Jan 6 2012 logs drwx------. 2 root root 16384 Dec 17 2011 lost+found -rwxrwxr-x. 1 wbaker source 1684 Apr 17 2012 sync drwxrwsrwt. 9 wbaker source 4096 Mar 9 2012 volumes $ ls -l /tmp/drippie/bisquick total 64 drwxrwsr-t. 5 wbaker source 4096 Jan 18 2012 archive drwxrwsr-t. 11 wbaker source 4096 Oct 22 10:00 areas drwxrwxr-x. 4 wbaker bakers 4096 Feb 6 2012 deprecated drwxrwsr-x. 4 wbaker bakers 4096 Dec 23 2011 devices drwxrwsr-t. 75 wbaker bakers 4096 Feb 5 2012 disks drwxrwsr-t. 7 wbaker source 4096 Dec 20 2011 foreign drwxrwxr-x. 4 wbaker wbaker 4096 Jul 21 10:06 j drwxrwsr-t. 9 wbaker source 4096 Feb 4 2012 local drwx------. 2 root root 16384 Jan 15 2012 lost+found -rw-rw-r--. 1 wbaker wbaker 177 Jul 21 10:12 PROVENANCE drwxrwsr-x. 2 wbaker source 4096 Jul 21 10:08 self drwxrwxr-x. 6 wbaker bakers 4096 Jan 6 2012 users drwxrwsr-x. 17 wbaker source 4096 Oct 15 10:09 whole $ grep -Ee '(clarification|bisquick)' /proc/mounts drippie:/mnt/clarification/ /tmp/drippie/clarification nfs4 rw,relatime,vers=4.0,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp6,port=0,timeo=600,retrans=2,sec=sys,clientaddr=<frequented'sfd::/8>,local_lock=none,addr=<drippie'sfd::/8> 0 0 drippie:/mnt/bisquick/ /tmp/drippie/bisquick nfs rw,relatime,vers=2,rsize=8192,wsize=8192,namlen=255,hard,proto=tcp6,timeo=600,retrans=2,sec=sys,mountaddr=<drippie'sfd::/8>,mountvers=1,mountport=20048,mountproto=udp6,local_lock=none,addr=<drippie'sfd::/8> 0 0 tcpdump output isn't very useful for NFSv4, which is why Steve asked for the full binary data. In any case, I don't see any obvious reason why we should be getting ESTALE's. One of the two exported directories has sgid bit set, not the other, and they have different ownership, but I can't see why either of those should matter. Do I understand correctly that NFSv3 mounts also fail? That's a bit surprising--I would have expected v2 and v3 to behave the same way. Created attachment 632900 [details]
experiment #1, raw pcap
Created attachment 632901 [details]
experiment #2, raw pcap
Created attachment 632902 [details]
experiment #1; raw pcap (client side)
Created attachment 632903 [details]
experiment #2; raw pcap (client side)
Created attachment 632906 [details]
trials across nfs4, nfs3, nfs2
Exhibition
for i in 4 3 2 ; do
echo trying vers=$i ...
sudo mount.nfs drippie:/mnt/bisquick /tmp/drippie/bisquick -o vers=$i -v -v
done
Just looking at attachment 632900 [details]....
In frames 319 and 320 there's a putrootfh/getfh, to get filehandle with hash 0x62d40c52.
Then a lookup of "mnt" against that filehandle in frames 339/340 gives filehandle with hash 0xc0ebafb6.
Then a lookup of "bisquick" against that filehandle in frames 353 and 354 gives filehandle with hash 0xd93120a5.
Then an attempt to use that filehandle gives ESTALE.
Also slightly odd: both of those lookups are duplicated. Why would the client be sending identical lookups one right after the other? (For example, frames 339/340 are and 337/338 are different rpc's with different xid's, but do the same lookup.)
The contents of /proc/net/rpc/*/content after this test might be interesting.
Is it possible that those two filesystems could have the same uuid? For example, is one some kind of clone or snapshot of the other?
That might explain the problem as the v2 filehandle might be too small to contain uuid and might fall back on dev major, minor (I don't remember the details). Though I'm not sure why that would cause this ESTALE.
The filesystems of bisquick and clarification were made in different weeks. they are not replicas. They are maintained by rsync. The hardware on each is a 6GB Western Digital MyBook formatted as a single LUKS container (thus there are two partitions on the device). Both were created by pushing the buttons on the palimpsest of the F13 era. Seems like 2011-02 was when palimpest became useable enough that it was easier to go graphical than run the fdisk & cryptsetup recipe from the command line (these devices would have had to use parted anyway) The history here is that drippie was F13 until a few weeks ago when I upgraded it to F17. Until that time drippie was working fine as a server (and using nfs2 or nfs3 ... never bothered to study it much ... it "just worked"). There's been some (ahem) teething pains getting nfs to work on F17 with the identity mapping technologies changing across F14, F16 & F17 and nfs2, nfs3, nfs4. History: clarification created 2011-12-17 bisquick created 2011-12-30 Neither is a replica of the other. Both are used for backup type activity, but the structure of the ext4 filesystems on them is completely different. on drippie (the server) $ sudo parted -l [deleted /dev/sda and the boot disk lvms -- WCB] Model: WD My Book (scsi) Disk /dev/sdb: 6001GB Sector size (logical/physical): 512B/512B Partition Table: gpt Disk Flags: Number Start End Size File system Name Flags 1 20.5kB 210MB 210MB fat32 EFI System Partition boot 2 210MB 6001GB 6001GB bisquick lvm Model: WD My Book (scsi) Disk /dev/sdc: 6001GB Sector size (logical/physical): 512B/512B Partition Table: gpt Disk Flags: Number Start End Size File system Name Flags 1 20.5kB 210MB 210MB fat32 EFI System Partition boot 2 210MB 6001GB 6001GB clarification lvm Model: Linux device-mapper (crypt) (dm) Disk /dev/mapper/luks-547d2d1a-2896-4fa5-bd05-c26597e7eff0: 6001GB Sector size (logical/physical): 512B/512B Partition Table: loop Disk Flags: Number Start End Size File system Flags 1 0.00B 6001GB 6001GB ext4 Model: Linux device-mapper (crypt) (dm) Disk /dev/mapper/luks-4c8d49ad-a06f-49be-9e65-789b41721af8: 6001GB Sector size (logical/physical): 512B/512B Partition Table: loop Disk Flags: Number Start End Size File system Flags 1 0.00B 6001GB 6001GB ext4 Created attachment 633487 [details]
for i in /proc/net/rpc/*/content; do echo "file $i" ; sudo cat $i; done (after the experiment)
[on frequented, the client]
$ sudo umount /tmp/drippie/bisquick
[on frequented, the client]
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
<snip/>
drippie:/mnt/clarification/ 5768225280 3528027776 1947188480 65% /tmp/drippie/clarification
drippie:/local/ 370016256 54267904 296952704 16% /tmp/drippie/local
drippie:/opt/ 51606144 3910400 45074304 8% /tmp/drippie/opt
drippie:/mnt/bisquick/ 5768225200 4697623096 777593112 86% /tmp/drippie/bisquick
(the point being that bisquick isn't mounted any more)
(run the experiment)
[on frequented, the client]
$ for i in 4 3 2 ; do echo trying vers=$i ... ; sudo mount.nfs drippie:/mnt/bisquick /tmp/drippie/bisquick -o vers=$i -v -v; done >& o.ignore.out
(we already saw the output)
[on frequented, the client]$ for i in /proc/net/rpc/*/content; do echo "file $i" ; sudo cat $i; done >& o.on_frequented_nfs_proc_net_rpc.out
(is uninteresting)
[on frequented, the client]
$ cat o.on_frequented_nfs_proc_net_rpc.out
file /proc/net/rpc/auth.unix.gid/content
#uid cnt: gids...
file /proc/net/rpc/auth.unix.ip/content
#class IP domain
[on drippie, the server]
$ for i in /proc/net/rpc/*/content; do echo "file $i" ; sudo cat $i; done > o.on_drippie_nfs_proc_net_rpc_again.out 2>&1
attached nearby.
This line: # *.sanguine.emerson.baker.org 6 0x8e958c4ccee53bfa0000000000000000 in /proc/net/rpc/nfsd.fh/content says that the kernel asked mountd whether there was a filesystem with uuid 0x8e95..., and something went wrong--mountd didn't answer, I think, I'm not sure. Despite that, we can see that there's a filesystem with that uuid: /mnt/bisquick *.sanguine.emerson.baker.org(rw,no_root_squash,sync,wdelay,no_subtree_check,uuid=4c8c958e:fa3be5ce:00000000:00000000) (note that uuid's byte-swapped, but it's the same one.) That suggests some bug or misconfiguration affecting mountd or libblkid. The output of the "blkid" command might be interesting. Also possibly useful: restart the server, then strace -p($pidof rpc.mountd) -s4096 -eopen,read,write -oOUTPUT then try the mounts, then kill the strace and attach OUTPUT. I'm also not completely sure why there is a line: *.sanguine.emerson.baker.org 5 0x8e958c4ccee53bfa /mnt/bisquick Probably that's from a successful v2 mount, as v2 likely needs the smaller uuid to fit v2 filehandle limits. Created attachment 634369 [details]
testing and repair; restarting nfs; and other output
The problem went away with the restart of nfs-server.service
Diagnosis: mountd got confused somehow (how?)
Treatment: restart nfs (restart mountd)
The commands are roughly in the order shown.
nfs2, nfs3, nfs4 behave correctly now.
frequented behaves correctly.
eggplants behaves correctly.
the problem was on drippie (the server).
One for the record books. Unless you guys can figure out what went wrong, there is no problem to exhibit any more.
Created attachment 634370 [details]
(after restart nfs-server.service) sudo strace -p$(pidof rpc.mountd) -s4096 -eopen,read,write -oo.on_drippie_strace_rpc_mountd_open_read_write.out
Created attachment 634371 [details]
(after restart nfs-server.service) for i in /proc/net/rpc/*/content; do echo "file $i" ; sudo cat $i; done >& o.on_drippie_after_restart_nfs_proc_net_rpc.out
OK, thanks--reopen if you figure out a reproducer. It happened again. I'm not going to reopen the ticket unless/until I have more hard data. I didn't do a lot of detail work debugging or reproducing. There's a "there" here somehow, but there's an easy workaround. For reasons related here or not, drippie needed a reboot. It was rebooted. 1. Reboot drippie. 2. on drippie mounted clarification with palimpest (supply the password) 3. on drippie mounted bisquick with palimpest (supply the password) From remote machines on F14 & F16, via nfs autofs, try to ls /vol/clarification ls /vol/bisquick The ls of /vol/clarification worked. The ls of /vol/bisquick hung. The tcpdump in both cases showed the funny Stale NFS File Handle behavior. remediation: on drippie systemctl restart nfs-server.service ... and all is well. For example: 14:50:52.442547 IP6 wantowen.905 > drippie.nfs: Flags [.], ack 1069, win 56, options [nop,nop,TS val 523126768 ecr 736806], length 0 14:50:52.442592 IP6 wantowen.4106132903 > drippie.nfs: 140 fsinfo fh Unknown/010006004C8C958EFA3BE5CE0000000000000000000000000000000000000000 14:50:52.442889 IP6 drippie.nfs > wantowen.4106132903: reply ok 32 fsinfo ERROR: Stale NFS file handle 14:50:52.442936 IP6 wantowen.4122910119 > drippie.nfs: 140 fsinfo fh Unknown/010006004C8C958EFA3BE5CE000000000000000000000014010006000CF9586E 14:50:52.443222 IP6 drippie.nfs > wantowen.4122910119: reply ok 32 fsinfo ERROR: Stale NFS file handle I know this is closed, and I probably can't add much to it, but I just thought it worth mentioning that I've been seeing exactly the same problem. The box had plenty of memory available, but v4 mounts where failing with ENOMEM. v3 mounts were hanging indefinitely. v2 mounts were working. leto:~% rpm -q rpcbind nfs-utils rpcbind-0.2.0-19.fc17.x86_64 nfs-utils-1.2.6-6.fc17.x86_64 Restarting the NFS server (CentOS 6.3) seemed to fix the problem, at least for now. To clarify #27 ... (in case it's reopened) It's deterministic and repeatable - boot the server (clients need not cold start) - start up nfs-server.service - be at runlevel 5 and operating - bring multiple new storage units online (e.g. hotplug, supply password) - $XYZABC - nfs mount multiple storage from the clients (autofs, manually, whatever) - observe the problem The remediation is to insert $XYZABC = systemctl restart nfs-server.service One can show this with a laptop running F17 as server a laptop running F17 as client 2x usb ports on the server (3x or more is fine too) 2x outboard HDD on the server (same) |