Bug 592512

Summary: nfs: possible circular locking dependency detected
Product: Red Hat Enterprise Linux 6
Reporter: Qian Cai <qcai>
Component: kernel
Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED CURRENTRELEASE
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium
Docs Contact:
Priority: low
Version: 6.0
CC: anton, dougsland, emcnabb, esandeen, gansalmon, itamar, jlayton, jonathan, kernel-maint, mikko.tiihonen, mvadkert, rwheeler, steved
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 561763
Environment:
Last Closed: 2010-10-14 01:10:44 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 561763
Bug Blocks:

Description Qian Cai 2010-05-15 03:28:56 UTC
+++ This bug was initially created as a clone of Bug #561763 +++

Description of problem:
When running the connectathon testsuite on an NFSv4 root with the command below, lockdep reports a possible circular locking dependency:

./runcthon --server 10.34.33.104 --serverdir /nfs --onlyv4

./server -b -F nfs4 -o proto=tcp -m /mnt/nfsv4tcp -p /nfs/nfsv4tcp 10.34.33.104
Waiting for 'b' to finish...

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.33-0.26.rc6.git1.fc13.i686.PAE #1
-------------------------------------------------------
test5/26460 is trying to acquire lock:
 (&sb->s_type->i_mutex_key#16){+.+.+.}, at: [<f8e2b9a4>] nfs_revalidate_mapping+0x67/0xa1 [nfs]

but task is already holding lock:
 (&mm->mmap_sem){++++++}, at: [<c04cf27c>] sys_mmap_pgoff+0xab/0xee

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&mm->mmap_sem){++++++}:
       [<c046a958>] __lock_acquire+0xa17/0xb76
       [<c046ab4a>] lock_acquire+0x93/0xb1
       [<c04c78e9>] might_fault+0x69/0x86
       [<c05cb432>] copy_to_user+0x34/0x10a
       [<c04f37ec>] filldir64+0x9c/0xd0
       [<f8e2779f>] nfs_do_filldir+0x310/0x3bb [nfs]
       [<f8e27ed8>] nfs_readdir+0x68e/0x70c [nfs]
       [<c04f3a14>] vfs_readdir+0x6d/0x99
       [<c04f3aa8>] sys_getdents64+0x68/0xaa
       [<c0408b5f>] sysenter_do_call+0x12/0x38

-> #0 (&sb->s_type->i_mutex_key#16){+.+.+.}:
       [<c046a85a>] __lock_acquire+0x919/0xb76
       [<c046ab4a>] lock_acquire+0x93/0xb1
       [<c07c1afb>] __mutex_lock_common+0x32/0x30a
       [<c07c1e80>] mutex_lock_nested+0x35/0x3d
       [<f8e2b9a4>] nfs_revalidate_mapping+0x67/0xa1 [nfs]
       [<f8e291ca>] nfs_file_mmap+0x55/0x5d [nfs]
       [<c04ced93>] mmap_region+0x250/0x3f7
       [<c04cf181>] do_mmap_pgoff+0x247/0x297
       [<c04cf295>] sys_mmap_pgoff+0xc4/0xee
       [<c0408b5f>] sysenter_do_call+0x12/0x38

other info that might help us debug this:

1 lock held by test5/26460:
 #0:  (&mm->mmap_sem){++++++}, at: [<c04cf27c>] sys_mmap_pgoff+0xab/0xee

stack backtrace:
Pid: 26460, comm: test5 Not tainted 2.6.33-0.26.rc6.git1.fc13.i686.PAE #1
Call Trace:
 [<c07c091d>] ? printk+0x14/0x17
 [<c0469c13>] print_circular_bug+0x8a/0x96
 [<c046a85a>] __lock_acquire+0x919/0xb76
 [<c045e4f7>] ? sched_clock_cpu+0x125/0x12d
 [<c046ab4a>] lock_acquire+0x93/0xb1
 [<f8e2b9a4>] ? nfs_revalidate_mapping+0x67/0xa1 [nfs]
 [<c07c1afb>] __mutex_lock_common+0x32/0x30a
 [<f8e2b9a4>] ? nfs_revalidate_mapping+0x67/0xa1 [nfs]
 [<f8e47f55>] ? rcu_read_unlock+0x0/0x1e [nfs]
 [<c07c1e80>] mutex_lock_nested+0x35/0x3d
 [<f8e2b9a4>] ? nfs_revalidate_mapping+0x67/0xa1 [nfs]
 [<f8e2b9a4>] nfs_revalidate_mapping+0x67/0xa1 [nfs]
 [<f8e291ca>] nfs_file_mmap+0x55/0x5d [nfs]
 [<c04ced93>] mmap_region+0x250/0x3f7
 [<c04cf181>] do_mmap_pgoff+0x247/0x297
 [<c04cf295>] sys_mmap_pgoff+0xc4/0xee
 [<c0408b5f>] sysenter_do_call+0x12/0x38
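
For readers unfamiliar with lockdep reports, the two chains above amount to a lock-order inversion. On the getdents64 path, mmap_sem is acquired (via might_fault in copy_to_user) while the directory's i_mutex is already held; on the mmap path, nfs_revalidate_mapping takes i_mutex while mmap_sem is already held. The Python sketch below only illustrates that reasoning and is not kernel code. The short lock names and the find_cycle helper are made up for illustration, and the assumption that i_mutex is held during the readdir path is inferred from lockdep's "already depends on the new lock" message rather than shown directly in the trace.

```python
def find_cycle(edges):
    """Return a list of locks forming a cycle in the 'held -> acquired' graph, or None."""
    graph = {}
    for held, acquired in edges:
        graph.setdefault(held, []).append(acquired)

    def visit(node, path, on_path):
        # Reaching a lock that is already on the current path closes a cycle.
        if node in on_path:
            return path[path.index(node):] + [node]
        on_path.add(node)
        path.append(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt, path, on_path)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        return None

    for start in graph:
        cycle = visit(start, [], set())
        if cycle:
            return cycle
    return None


# Each edge reads "while holding the first lock, the second was acquired".
# Names are shorthand for the lock classes in the report (illustrative only).
EDGES = [
    ("i_mutex", "mmap_sem"),  # chain #1: getdents64 -> nfs_readdir -> copy_to_user
    ("mmap_sem", "i_mutex"),  # chain #0: mmap -> nfs_file_mmap -> nfs_revalidate_mapping
]

cycle = find_cycle(EDGES)
print(" -> ".join(cycle))  # prints "i_mutex -> mmap_sem -> i_mutex"
```

With only the first edge recorded there is no cycle. The warning fires when the mmap path adds the reverse edge, which is why it appears as soon as test5 calls mmap on an NFS file.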

Version-Release number of selected component (if applicable):
kernel-2.6.33-0.26.rc6.git1.fc13
nfs-utils-1.2.1-16.fc13

How reproducible:
always

Steps to Reproduce:
1. Follow the instructions here:
http://fedoraproject.org/wiki/QA:Testcase_nfs_connectathon
  
Actual results:
Warnings in dmesg.

Expected results:
No warning in dmesg?

--- Additional comment from mvadkert on 2010-02-05 07:01:54 EST ---

Same failure in connectathon testsuite with krb5:
https://fedoraproject.org/wiki/QA:Testcase_nfs_connectathon_secure
./runcthon --server dhcp-lab-104.englab.brq.redhat.com --serverdir /nfs --onlyv4 --onlykrb5

--- Additional comment from fedora-triage-list on 2010-03-15 10:22:41 EDT ---


This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and the reason for this action are here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

--- Additional comment from mikko.tiihonen on 2010-04-10 13:07:49 EDT ---

I see the same lockdep warning on the 2.6.33.1-19.fc13.x86_64 kernel on every boot when an nfs4 home directory is mounted.

Comment 1 RHEL Program Management 2010-05-15 03:45:15 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 2 Jeff Layton 2010-06-07 19:10:22 UTC
I think this was fixed in some of the recent patches that steved proposed. Are you still able to reproduce this on -33.el6 or so?

Comment 3 RHEL Program Management 2010-07-15 14:39:09 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 5 Jeff Layton 2010-10-14 01:10:44 UTC
I'm fairly certain this bug is no longer present in more recent RHEL6 kernels. Closing as resolved.