Bug 338691

Summary: lockd hang in D state
Product: [Fedora] Fedora
Reporter: David Rees <drees76>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 7
CC: chris.brown
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2008-01-16 13:08:55 UTC

Description David Rees 2007-10-18 19:45:40 UTC
An NFS server's lockd process hung in D state today and stopped responding.
kill -9 had no effect on the process, and only a reboot cleared it. This is the
first time I've seen this happen.

I first noticed it when NFS clients' lock requests stopped being answered.
Before rebooting the server, I used SysRq-W to capture the blocked state of the
lockd process.
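
For anyone trying to catch a similar hang before rebooting, here is a minimal
Python sketch that lists tasks currently in uninterruptible sleep (state D) by
reading /proc/<pid>/stat. It is not part of the original report. The function
names parse_stat and blocked_tasks are hypothetical, and the scan only covers
the entries that appear directly under /proc. The SysRq-W dump itself can also
be requested without a keyboard by writing "w" to /proc/sysrq-trigger as root.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = stat_text.index("(")
    rparen = stat_text.rindex(")")
    pid = int(stat_text[:lparen])
    comm = stat_text[lparen + 1:rparen]
    state = stat_text[rparen + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for tasks whose state field is 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited mid-scan, or the line was malformed
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

A task that shows up here on every run over several minutes is a reasonable
candidate for a SysRq-W dump, since a brief D state during normal I/O is
expected.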

This is on kernel 2.6.22.9-91.fc7PAE, running on a Dell 1420 with dual Xeon
processors (HT disabled) and 4 GB of RAM. Let me know if I can provide any more
information. SysRq-W output below:

SysRq : Show Blocked State
  task                PC stack   pid father
lockd         D f7364d5c  2152  1694      2
       f7364d70 00000046 00000002 f7364d5c f7364d54 00000000 f7364000 00000001 
       00000000 f7a78c00 f7a78d9c c349fa80 00000001 c4395700 00000000 00000000 
       cad59b40 f799ce90 0885e879 cad59b40 e90f6a48 ffffffff 00000000 00000000 
Call Trace:
 [<c060d512>] __mutex_lock_slowpath+0x43/0x72
 [<c060d40b>] mutex_lock+0x26/0x29
 [<f8cae533>] fh_put+0x14e/0x15e [nfsd]
 [<f8bcdfb5>] nlmsvc_traverse_blocks+0x1d/0x81 [lockd]
 [<f8bcf06c>] nlmsvc_mark_host+0x0/0x7 [lockd]
 [<f8bcf06c>] nlmsvc_mark_host+0x0/0x7 [lockd]
 [<f8bcf226>] nlm_traverse_files+0x18f/0x1d9 [lockd]
 [<f8bcc3cb>] nlm_gc_hosts+0x47/0x183 [lockd]
 [<f8bcc881>] nlm_lookup_host+0xab/0x2af [lockd]
 [<f8bccab2>] nlmsvc_lookup_host+0x2d/0x33 [lockd]
 [<f8bce224>] nlmsvc_lock+0xc7/0x307 [lockd]
 [<f8bd169f>] nlm4svc_proc_lock+0x8b/0xd4 [lockd]
 [<f8c55682>] svc_process+0x33f/0x67d [sunrpc]
 [<f8c5825d>] svc_recv+0x326/0x395 [sunrpc]
 [<f8bcd240>] lockd+0x14a/0x222 [lockd]
 [<c0404e72>] ret_from_fork+0x6/0x1c
 [<f8bcd0f6>] lockd+0x0/0x222 [lockd]
 [<f8bcd0f6>] lockd+0x0/0x222 [lockd]
 [<c0405b6b>] kernel_thread_helper+0x7/0x10
 =======================
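
The trace above lists the innermost frame first: the lockd thread is waiting
in mutex_lock, reached through fh_put in the nfsd module. As an aid to reading
traces like this, here is a small Python sketch (not part of the original
report) that splits each frame into address, symbol, offset, size, and module.
The names FRAME_RE and parse_frames are hypothetical, and the label "kernel"
for frames without a module suffix is a choice made for this example.

```python
import re

# Matches lines such as:  [<f8cae533>] fh_put+0x14e/0x15e [nfsd]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>\S+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?"
)


def parse_frames(trace_text):
    """Return one dict per stack frame, in the order the frames appear."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue  # header, register dump, or separator line
        frames.append({
            "addr": int(m.group("addr"), 16),
            "symbol": m.group("symbol"),
            "offset": int(m.group("offset"), 16),
            "size": int(m.group("size"), 16),
            # No [module] suffix means the symbol is in the core kernel image;
            # "kernel" is just the label used here.
            "module": m.group("module") or "kernel",
        })
    return frames


# First four frames copied from the report.
SAMPLE = """\
 [<c060d512>] __mutex_lock_slowpath+0x43/0x72
 [<c060d40b>] mutex_lock+0x26/0x29
 [<f8cae533>] fh_put+0x14e/0x15e [nfsd]
 [<f8bcdfb5>] nlmsvc_traverse_blocks+0x1d/0x81 [lockd]
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        print(frame["module"], frame["symbol"], hex(frame["offset"]))
```

Grouping frames by module this way makes it easier to see where a call path
crosses from lockd into nfsd before blocking.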

Comment 1 Christopher Brown 2008-01-16 02:47:54 UTC
Hello,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try to assist you in resolving it if I can.

There hasn't been much activity on this bug for a while. Could you tell me if
you are still having problems with the latest kernel?

If the problem no longer exists, please close this bug; otherwise I'll close it
in a few days if no additional information is added.

Comment 2 David Rees 2008-01-16 03:42:18 UTC
I have not seen this issue since my original report. I'm not sure what state to
put it into, since none of the resolved Bugzilla states seem to fit the case of
"have not seen the bug again since reporting, may or may not be resolved", so I
have not resolved the bug.

Comment 3 Christopher Brown 2008-01-16 13:08:55 UTC
Hi David,

Thanks for the update. As it's been three months, I'll close this as
INSUFFICIENT_DATA, but please re-open it should the problem recur. Thanks for
taking the time to file the original report.

Cheers
Chris