
Bug 533835

Summary: nfs4-daemon hangs after a while when fc12-client is connected
Product: Red Hat Enterprise Linux 5
Reporter: Espen Stefansen <libbe>
Component: kernel
Assignee: Jeff Layton <jlayton>
Status: CLOSED DUPLICATE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high
Priority: low
Version: 5.4
CC: jlayton, sprabhu, steved
Target Milestone: rc
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2011-05-02 18:45:01 UTC

Description Espen Stefansen 2009-11-09 13:43:18 UTC
Description of problem:
We run an NFSv4/Kerberos/LDAP setup at work, and this machine is the NFSv4 server. When RHEL 5.4 clients are connected, everything works as it should. When a Fedora 12 client is connected, the NFSv4 daemon hangs within a short while. The nfs service can't be restarted unless the F12 client is turned off.

These are the error messages in /var/log/messages:
Nov  9 12:53:25 artem kernel: slab error in kmem_cache_destroy(): cache `nfsd4_files': Can't free all objects
Nov  9 12:53:25 artem kernel: 
Nov  9 12:53:25 artem kernel: Call Trace:
Nov  9 12:53:25 artem kernel:  [<ffffffff800db76f>] kmem_cache_destroy+0x7e/0x177
Nov  9 12:53:25 artem kernel:  [<ffffffff888abef9>] :nfsd:nfsd4_free_slab+0x11/0x4d
Nov  9 12:53:25 artem kernel:  [<ffffffff888abf51>] :nfsd:nfsd4_free_slabs+0x1c/0x33
Nov  9 12:53:25 artem kernel:  [<ffffffff888acec7>] :nfsd:nfs4_state_shutdown+0x17e/0x18a
Nov  9 12:53:25 artem kernel:  [<ffffffff88896570>] :nfsd:nfsd_last_thread+0x45/0x76
Nov  9 12:53:25 artem kernel:  [<ffffffff8877eff4>] :sunrpc:svc_destroy+0x77/0xad
Nov  9 12:53:25 artem kernel:  [<ffffffff88896856>] :nfsd:nfsd+0x2b5/0x2cb
Nov  9 12:53:25 artem kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Nov  9 12:53:25 artem kernel:  [<ffffffff888965a1>] :nfsd:nfsd+0x0/0x2cb
Nov  9 12:53:25 artem kernel:  [<ffffffff888965a1>] :nfsd:nfsd+0x0/0x2cb
Nov  9 12:53:25 artem kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Nov  9 12:53:25 artem kernel: 
Nov  9 12:53:25 artem kernel: BUG: warning at fs/nfsd/nfs4state.c:1016/nfsd4_free_slab() (Tainted: P     )
Nov  9 12:53:25 artem kernel: 
Nov  9 12:53:25 artem kernel: Call Trace:
Nov  9 12:53:25 artem kernel:  [<ffffffff888abf51>] :nfsd:nfsd4_free_slabs+0x1c/0x33
Nov  9 12:53:25 artem kernel:  [<ffffffff888acec7>] :nfsd:nfs4_state_shutdown+0x17e/0x18a
Nov  9 12:53:25 artem kernel:  [<ffffffff88896570>] :nfsd:nfsd_last_thread+0x45/0x76
Nov  9 12:53:25 artem kernel:  [<ffffffff8877eff4>] :sunrpc:svc_destroy+0x77/0xad
Nov  9 12:53:25 artem kernel:  [<ffffffff88896856>] :nfsd:nfsd+0x2b5/0x2cb
Nov  9 12:53:25 artem kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Nov  9 12:53:25 artem kernel:  [<ffffffff888965a1>] :nfsd:nfsd+0x0/0x2cb
Nov  9 12:53:25 artem kernel:  [<ffffffff888965a1>] :nfsd:nfsd+0x0/0x2cb
Nov  9 12:53:25 artem kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Nov  9 12:53:25 artem kernel: 
Nov  9 12:53:25 artem kernel: slab error in kmem_cache_destroy(): cache `nfsd4_delegations': Can't free all objects
Nov  9 12:53:25 artem kernel: 
Nov  9 12:53:25 artem kernel: Call Trace:
Nov  9 12:53:25 artem kernel:  [<ffffffff800db76f>] kmem_cache_destroy+0x7e/0x177
Nov  9 12:53:25 artem kernel:  [<ffffffff888abef9>] :nfsd:nfsd4_free_slab+0x11/0x4d
Nov  9 12:53:25 artem kernel:  [<ffffffff888acec7>] :nfsd:nfs4_state_shutdown+0x17e/0x18a
Nov  9 12:53:25 artem kernel:  [<ffffffff88896570>] :nfsd:nfsd_last_thread+0x45/0x76
Nov  9 12:53:25 artem kernel:  [<ffffffff8877eff4>] :sunrpc:svc_destroy+0x77/0xad
Nov  9 12:53:25 artem kernel:  [<ffffffff88896856>] :nfsd:nfsd+0x2b5/0x2cb 
Nov  9 12:53:25 artem kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Nov  9 12:53:25 artem kernel:  [<ffffffff888965a1>] :nfsd:nfsd+0x0/0x2cb
Nov  9 12:53:25 artem kernel:  [<ffffffff888965a1>] :nfsd:nfsd+0x0/0x2cb
Nov  9 12:53:25 artem kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Nov  9 12:53:25 artem kernel: 
Nov  9 12:53:25 artem kernel: BUG: warning at fs/nfsd/nfs4state.c:1016/nfsd4_free_slab() (Tainted: P     )
Nov  9 12:53:25 artem kernel: 
Nov  9 12:53:25 artem kernel: Call Trace:
Nov  9 12:53:25 artem kernel:  [<ffffffff888acec7>] :nfsd:nfs4_state_shutdown+0x17e/0x18a
Nov  9 12:53:25 artem kernel:  [<ffffffff88896570>] :nfsd:nfsd_last_thread+0x45/0x76
Nov  9 12:53:25 artem kernel:  [<ffffffff8877eff4>] :sunrpc:svc_destroy+0x77/0xad
Nov  9 12:53:25 artem kernel:  [<ffffffff88896856>] :nfsd:nfsd+0x2b5/0x2cb
Nov  9 12:53:25 artem kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Nov  9 12:53:25 artem kernel:  [<ffffffff888965a1>] :nfsd:nfsd+0x0/0x2cb
Nov  9 12:53:25 artem kernel:  [<ffffffff888965a1>] :nfsd:nfsd+0x0/0x2cb
Nov  9 12:53:25 artem kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Nov  9 12:53:25 artem kernel: 
Nov  9 12:53:25 artem kernel: nfsd: last server has exited
Nov  9 12:53:25 artem kernel: nfsd: unexporting all filesystems
Nov  9 12:53:48 artem kernel: kmem_cache_create: duplicate cache nfsd4_files
Nov  9 12:53:48 artem kernel: 
Nov  9 12:53:48 artem kernel: Call Trace:
Nov  9 12:53:48 artem kernel:  [<ffffffff800396ce>] kmem_cache_create+0x56a/0x5a4
Nov  9 12:53:48 artem kernel:  [<ffffffff888acf25>] :nfsd:nfs4_state_start+0x52/0x18f
Nov  9 12:53:48 artem kernel:  [<ffffffff888963ae>] :nfsd:nfsd_svc+0x6c/0x1e9
Nov  9 12:53:48 artem kernel:  [<ffffffff88896f8e>] :nfsd:write_threads+0x0/0xa9
Nov  9 12:53:48 artem kernel:  [<ffffffff88896ffd>] :nfsd:write_threads+0x6f/0xa9
Nov  9 12:53:48 artem kernel:  [<ffffffff8002b8e9>] get_zeroed_page+0x21/0x82
Nov  9 12:53:48 artem kernel:  [<ffffffff800f0e2b>] simple_transaction_get+0x8b/0xa5
Nov  9 12:53:48 artem kernel:  [<ffffffff88896f8e>] :nfsd:write_threads+0x0/0xa9
Nov  9 12:53:48 artem kernel:  [<ffffffff88896d59>] :nfsd:nfsctl_transaction_write+0x42/0x77
Nov  9 12:53:48 artem kernel:  [<ffffffff80016927>] vfs_write+0xce/0x174
Nov  9 12:53:48 artem kernel:  [<ffffffff800171df>] sys_write+0x45/0x6e
Nov  9 12:53:48 artem kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Nov  9 12:53:48 artem kernel: 
Nov  9 12:53:48 artem nfsd[5493]: nfssvc: Cannot allocate memory
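
The log above mixes two kinds of slab messages: caches that could not be emptied at shutdown, and caches reported as duplicates at the next start. A minimal sketch for pulling the cache names out of such a log is shown below. It assumes the exact message wording seen in this report; the function name `summarize_slab_errors` is made up for illustration and is not part of any tool.

```python
import re
from collections import Counter

# Patterns copied from the message wording in the log excerpt above.
SLAB_DESTROY = re.compile(r"slab error in kmem_cache_destroy\(\): cache `([^']+)'")
SLAB_DUPLICATE = re.compile(r"kmem_cache_create: duplicate cache (\S+)")


def summarize_slab_errors(lines):
    """Count, per cache name, the destroy failures and duplicate-create reports."""
    not_freed = Counter()
    duplicated = Counter()
    for line in lines:
        match = SLAB_DESTROY.search(line)
        if match:
            not_freed[match.group(1)] += 1
            continue
        match = SLAB_DUPLICATE.search(line)
        if match:
            duplicated[match.group(1)] += 1
    return not_freed, duplicated
```

Feeding it the lines of /var/log/messages (for example `summarize_slab_errors(open("/var/log/messages"))`) would return two counters keyed by cache name, which makes it easier to see whether the cache that failed to free is the same one that later fails to be created.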


Version-Release number of selected component (if applicable):
kernel-2.6.18-164.3.1.el5
nfs-utils-1.0.9-42.el5
nfs-utils-lib-1.0.8-7.6.el5


Comment 1 Espen Stefansen 2010-02-27 16:19:47 UTC
I managed to find the root of the problem. One file did not have a valid uid/gid (it had a huge number instead), so when tracker found this file, the NFS server froze. After I changed the file to be owned by me, the problem never occurred again.
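
For anyone hunting for a similar file, the search described above can be sketched as a walk over the exported tree that reports entries whose numeric uid or gid is unusually large. This is only an illustration: the function name `find_files_with_large_ids` and the default threshold of 65535 are assumptions, not values taken from this report, and the threshold should be adjusted to the ID range actually in use on the site.

```python
import os


def find_files_with_large_ids(root, max_id=65535):
    """Yield (path, uid, gid) for entries under root whose uid or gid exceeds max_id.

    max_id is an assumed cut-off for "normal" IDs; pick one that fits your site.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat inspects a symlink itself instead of following it.
                st = os.lstat(path)
            except OSError:
                # The entry vanished or is unreadable; skip it.
                continue
            if st.st_uid > max_id or st.st_gid > max_id:
                yield path, st.st_uid, st.st_gid
```

Iterating over `find_files_with_large_ids("/path/to/export")` (the path is a placeholder) and printing each tuple gives a list of candidates to re-own with chown.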

Comment 2 Sachin Prabhu 2010-03-01 10:10:29 UTC
Espen,

This could be related to bz 519184, where a uid/gid of 4294967294 confused idmapd on the NFS server and caused it to hang. A fix is available in the RHEL 5.5 beta. If possible, could you please test that version and report on bz 519184 whether it fixes the issue for you?

Sachin Prabhu
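
The uid/gid value quoted above is not arbitrary: 4294967294 is 2^32 - 2, which is -2 when the same 32 bits are read as a signed integer. The short sketch below only checks that arithmetic. The helper name `reinterpret_u32_as_i32` is made up for illustration, and any link between this value and a particular "nobody" account on a given system is an assumption that this report does not confirm.

```python
import struct


def reinterpret_u32_as_i32(value):
    """Read the bits of an unsigned 32-bit integer as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


UID_FROM_REPORT = 4294967294  # value quoted in the comment above

signed_view = reinterpret_u32_as_i32(UID_FROM_REPORT)
low_16_bits = UID_FROM_REPORT & 0xFFFF

print(UID_FROM_REPORT == 2**32 - 2)  # True
print(signed_view)                   # -2
print(low_16_bits)                   # 65534
```

Seen this way, a "huge number" owner on a file is often a small negative ID that was widened to 32 bits, which may help when tracing where such a file came from.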

Comment 3 Jeff Layton 2011-05-02 18:45:01 UTC
Marking as a duplicate of bug 519184 under the presumption that Sachin's analysis is correct. Please reopen if that's not the case.

*** This bug has been marked as a duplicate of bug 519184 ***