Bug 661341
Summary: | home dir on nfsv4: completely unreliable |
---|---|
Product: | [Fedora] Fedora |
Component: | kernel |
Version: | 14 |
Hardware: | x86_64 |
OS: | Linux |
Status: | CLOSED WONTFIX |
Severity: | medium |
Priority: | low |
Reporter: | Thomas Sailer <fedora> |
Assignee: | Steve Dickson <steved> |
QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
CC: | dougsland, gansalmon, itamar, jonathan, kernel-maint, kmcmartin, madhu.chinakonda |
Doc Type: | Bug Fix |
Last Closed: | 2012-08-16 22:20:00 UTC |
Description
Thomas Sailer
2010-12-08 15:55:27 UTC
When the processes are hung, please log in as root and type the following commands:

    dmesg -c > /dev/null
    echo t > /proc/sysrq-trigger
    dmesg > /tmp/system.txt

Then please attach /tmp/system.txt to this bz.

Also, what type of NFS server are you using? Linux? If so, what version of Linux?

As a workaround you could set Defaultvers=3 in the NFSMount_Global_Options section of the new /etc/nfsmount.conf to make all mounts use version 3 as the default version. Please see man nfsmount.conf(5) for details.

(In reply to comment #1)
Hi Steve,

here you go. I don't think it's very illuminating; the interesting processes (like 2772) are not in the "system.txt" log.

> Also what type of NFS server are you using. Linux?
> If so what version of Linux?

Fedora 14, Linux server.xxxxx.com 2.6.35.9-64.fc14.x86_64 #1 SMP Fri Dec 3 12:19:41 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

nfs-utils-1.2.3-2.fc14.x86_64
krb5-libs-1.8.2-7.fc14.x86_64

> As a workaround you could set the Defaultvers=3

Yes, that's the workaround I'm currently using (i.e. mounting with version 3); the downside is that I'm losing encryption...

Created attachment 468384 [details]
output of echo t > /proc/sysrq-trigger
Created attachment 468385 [details]
Output of ps axwl
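As a rough aid for reading a listing like the attached `ps axwl` output, the sketch below filters for tasks in uninterruptible sleep (state D), which is how hung processes usually show up. The function name `hung_tasks` is hypothetical, and the column positions assume the usual procps long format (PID in column 3, WCHAN in column 9, STAT in column 10, COMMAND in column 13); other `ps` implementations may lay the columns out differently.

```shell
# Hypothetical helper: read `ps axwl`-style output on stdin and print
# PID, wait channel and command name for every task whose STAT starts with D.
# Assumes the procps long-format column order (STAT is column 10).
hung_tasks() {
    awk 'NR > 1 && $10 ~ /^D/ { print $3, $9, $13 }'
}
```

Typical use would be `ps axwl | hung_tasks`, run while the processes are stuck.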
(In reply to comment #2)
> (In reply to comment #1)
>
> Hi Steve,
>
> here you go. I don't think it's very illuminating; the interesting processes
> (like 2772) are not in the "system.txt" log.
>
> > Also what type of NFS server are you using. Linux?
> > If so what version of Linux?
>
> Fedora 14, Linux server.xxxxx.com 2.6.35.9-64.fc14.x86_64 #1 SMP Fri Dec 3
> 12:19:41 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
>
> nfs-utils-1.2.3-2.fc14.x86_64
> krb5-libs-1.8.2-7.fc14.x86_64
>
> > As a workaround you could set the Defaultvers=3
>
> Yes, that's the workaround I'm currently using (i.e. mounting with version 3);
> the downside is that I'm losing encryption...

FYI... Secure NFS is available with *all* versions of NFS...

(In reply to comment #3)
> Created attachment 468384 [details]
> output of echo t > /proc/sysrq-trigger

Here are the four processes that appear to be hung. They are all waiting for a mutex lock, not in the NFS code. Also, this backtrace shows no processes in the NFS code at all, which is a bit odd since I would expect to see at least one or two rpciod kernel processes.

[ 561.404130] gnome-volume- D ffff8800b80eea00 0 2842 2716 0x00000080
[ 561.404130] Call Trace:
[ 561.404130] [<ffffffff81468c10>] __mutex_lock_common.clone.5+0x12f/0x196
[ 561.404130] [<ffffffff81120335>] ? putname+0x34/0x36
[ 561.404130] [<ffffffff81468c8a>] __mutex_lock_slowpath+0x13/0x15
[ 561.404130] [<ffffffff81468ac7>] mutex_lock+0x36/0x50
[ 561.404130] [<ffffffff81115a79>] chown_common.clone.10+0x65/0x89
[ 561.404130] [<ffffffff8111f61d>] ? path_put+0x22/0x27
[ 561.404130] [<ffffffff81116364>] sys_chown+0x51/0x7c
[ 561.404130] [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b

[ 561.404130] canberra-gtk- D ffff8800b8010e00 0 2849 2716 0x00000080
[ 561.404130] Call Trace:
[ 561.404130] [<ffffffff81468c10>] __mutex_lock_common.clone.5+0x12f/0x196
[ 561.404130] [<ffffffff81120335>] ? putname+0x34/0x36
[ 561.404130] [<ffffffff81468c8a>] __mutex_lock_slowpath+0x13/0x15
[ 561.404130] [<ffffffff81468ac7>] mutex_lock+0x36/0x50
[ 561.404130] [<ffffffff81115a79>] chown_common.clone.10+0x65/0x89
[ 561.404130] [<ffffffff8111f61d>] ? path_put+0x22/0x27
[ 561.404130] [<ffffffff81116364>] sys_chown+0x51/0x7c
[ 561.404130] [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b

[ 561.404130] pulseaudio D ffff880037a07b80 0 2850 2848 0x00000080
[ 561.404130] Call Trace:
[ 561.404130] [<ffffffff81468c10>] __mutex_lock_common.clone.5+0x12f/0x196
[ 561.404130] [<ffffffff81468c8a>] __mutex_lock_slowpath+0x13/0x15
[ 561.404130] [<ffffffff81468ac7>] mutex_lock+0x36/0x50
[ 561.404130] [<ffffffff8111f768>] do_lookup+0xa5/0x18b
[ 561.404130] [<ffffffff811dedc3>] ? selinux_file_alloc_security+0x0/0x75
[ 561.404130] [<ffffffff81121ebd>] do_last+0x180/0x5d4
[ 561.404130] [<ffffffff81122541>] do_filp_open+0x230/0x5e1
[ 561.404130] [<ffffffff810e929d>] ? pmd_offset+0x19/0x40
[ 561.404130] [<ffffffff81221850>] ? might_fault+0x21/0x23
[ 561.404130] [<ffffffff81221950>] ? __strncpy_from_user+0x1f/0x4e
[ 561.404130] [<ffffffff8112b765>] ? alloc_fd+0x74/0x11f
[ 561.404130] [<ffffffff811165b3>] do_sys_open+0x64/0x110
[ 561.404130] [<ffffffff8111667f>] sys_open+0x20/0x22
[ 561.404130] [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b

[ 561.404130] restorecond D 0000000000000001 0 2856 2716 0x00000080
[ 561.404130] Call Trace:
[ 561.404130] [<ffffffff81468c10>] __mutex_lock_common.clone.5+0x12f/0x196
[ 561.404130] [<ffffffff81127245>] ? dput+0x45/0x110
[ 561.404130] [<ffffffff81468c8a>] __mutex_lock_slowpath+0x13/0x15
[ 561.404130] [<ffffffff81468ac7>] mutex_lock+0x36/0x50
[ 561.404130] [<ffffffff8113061f>] vfs_setxattr+0x55/0x9f
[ 561.404130] [<ffffffff81130736>] setxattr+0xcd/0xff
[ 561.404130] [<ffffffff81120335>] ? putname+0x34/0x36
[ 561.404130] [<ffffffff81121b73>] ? user_path_at+0x62/0x91
[ 561.404130] [<ffffffff811182f5>] ? fput+0x1de/0x1ed
[ 561.404130] [<ffffffff81130864>] sys_lsetxattr+0x6a/0x8f
[ 561.404130] [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b

(In reply to comment #5)
> FYI... Secure NFS is available with *all* versions of NFS...

How?

mount -t nfs4 -o soft,intr,rsize=8192,wsize=8192,rw,sec=krb5p server.xxxxx.com:/export xx
gives me a V4 mount

mount -t nfs4 -o soft,intr,nfsvers=3,rsize=8192,wsize=8192,rw,sec=krb5p server.xxxxx.com:/export xx
gives me -EINVAL

Setting Defaultvers=3 in /etc/mount.conf gives me a V4 mount, and setting Nfsvers=3 in /etc/mount.conf gives me -EINVAL.

(In reply to comment #7)
> (In reply to comment #5)
> > FYI... Secure NFS is available with *all* versions of NFS...
>
> How?
>
> mount -t nfs4 -o soft,intr,rsize=8192,wsize=8192,rw,sec=krb5p
> server.xxxxx.com:/export xx
> gives me a V4 mount
>
> mount -t nfs4 -o soft,intr,nfsvers=3,rsize=8192,wsize=8192,rw,sec=krb5p
> server.xxxxx.com:/export xx
> gives me -EINVAL
>
> Setting Defaultvers=3 in /etc/mount.conf gives me a V4 mount, and setting
> Nfsvers=3 in /etc/mount.conf gives me -EINVAL.

Try 'mount -t nfs -o intr,v3,rsize=8192,wsize=8192,rw,sec=krb5p'

The reason you were getting the -EINVAL error is that you were specifying both v4 (-t nfs4) and v3 (-o nfsvers=3).

Note that soft mounts are not a good idea. They can easily lead to data corruption...
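The -EINVAL explanation above (a filesystem type of nfs4 combined with an option asking for v3) can be checked before calling mount. The following is only an illustrative sketch: `nfs_version_conflict` is a hypothetical helper, not part of nfs-utils, and the list of version options it recognizes (`v2`, `v3`, `vers=`, `nfsvers=`) is an assumption that may not cover every spelling mount.nfs accepts.

```shell
# Hypothetical pre-flight check: succeeds (exit 0) when "-t nfs4" is
# combined with an option that requests NFS version 2 or 3.
nfs_version_conflict() {
    fstype=$1
    opts=$2
    # Only "-t nfs4" pins the version; plain "-t nfs" leaves it to the options.
    [ "$fstype" = "nfs4" ] || return 1
    # Walk the comma-separated option list.
    old_ifs=$IFS
    IFS=,
    for opt in $opts; do
        case $opt in
            v2|v3|vers=2|vers=3|nfsvers=2|nfsvers=3)
                IFS=$old_ifs
                return 0 ;;
        esac
    done
    IFS=$old_ifs
    return 1
}
```

With the option strings from this thread, `nfs_version_conflict nfs4 soft,intr,nfsvers=3,rsize=8192,wsize=8192,rw,sec=krb5p` reports a conflict, while the suggested `-t nfs -o intr,v3,...` form does not.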
When I do that (-t nfs ...), I get -EACCES, even though rpc.mountd on the server thinks it's ok:

rpc.mountd: check_default: access by 192.168.1.244 ALLOWED (cached)
rpc.mountd: Received NULL request from 192.168.1.244
rpc.mountd: check_default: access by 192.168.1.244 ALLOWED (cached)
rpc.mountd: Received NULL request from 192.168.1.244
rpc.mountd: check_default: access by 192.168.1.244 ALLOWED (cached)
rpc.mountd: Received MNT3(/mnt/data/home-axsem) request from 192.168.1.244
rpc.mountd: authenticated mount request from 192.168.1.244:944 for /mnt/data (/mnt/data)
rpc.mountd: check_default: access by 192.168.1.244 ALLOWED (cached)
rpc.mountd: Received UMNT(/mnt/data/home-axsem) request from 192.168.1.244
rpc.mountd: authenticated unmount request from 192.168.1.244:803 for /mnt/data (/mnt/data)

rpc.gssd on the client and rpc.svcgssd on the server remain silent...

In /etc/sysconfig/nfs please set RPCGSSDARGS="-vvv" for rpc.gssd on the client and set RPCSVCGSSDARGS="-vv" for the server side.

rpc.svcgssd on the server is happily handling requests from other clients, but none from the client above, and rpc.gssd on the client just says

rpc.gssd[9607]: beginning poll

and nothing else.

It appears SELinux is enabled... Does 'setenforce 0' make any difference?

It's actually permissive, both on the client and the server.

WRT getting secure NFS working on an F14 server... Try adding "rdns = false" to the libdefaults section in /etc/krb5.conf... I needed that to get rpc.svcgssd to start up correctly...

This message is a notice that Fedora 14 is now at end of life. Fedora has stopped maintaining and issuing updates for Fedora 14. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At this time, all open bugs with a Fedora 'version' of '14' have been closed as WONTFIX.

(Please note: Our normal process is to give advance warning of this occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, feel free to reopen this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that we were unable to fix it before Fedora 14 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to click on "Clone This Bug" (top right of this page) and open it against that version of Fedora.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping