Bug 998283

Summary: crda possible circular locking dependency detected
Product: [Fedora] Fedora Reporter: Vinson Lee <vlee>
Component: kernelAssignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhideCC: gansalmon, itamar, jonathan, kernel-maint, ktmdms, madhu.chinakonda
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-08-26 14:46:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Vinson Lee 2013-08-18 21:44:51 UTC
Description of problem:
possible circular locking dependency detected

Version-Release number of selected component (if applicable):
3.11.0-0.rc5.git4.1.fc20.x86_64

How reproducible:


Steps to Reproduce:
1. Boot.
2.
3.

Actual results:

[   26.523950] ======================================================
[   26.523951] [ INFO: possible circular locking dependency detected ]
[   26.523953] 3.11.0-0.rc5.git4.1.fc20.x86_64 #1 Not tainted
[   26.523954] -------------------------------------------------------
[   26.523955] crda/547 is trying to acquire lock:
[   26.523957]  (genl_mutex){+.+.+.}, at: [<ffffffff816220b2>] ctrl_dumpfamily+0x172/0x180
[   26.523964] 
but task is already holding lock:
[   26.523965]  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff8161e5e1>] netlink_dump+0x21/0x230
[   26.523969] 
which lock already depends on the new lock.

[   26.523970] 
the existing dependency chain (in reverse order) is:
[   26.523971] 
-> #1 (nlk->cb_mutex){+.+.+.}:
[   26.523973]        [<ffffffff810e9412>] lock_acquire+0xa2/0x1f0
[   26.523977]        [<ffffffff81727ef7>] mutex_lock_nested+0x87/0x3e0
[   26.523979]        [<ffffffff8161f1a5>] __netlink_dump_start+0xb5/0x1c0
[   26.523981]        [<ffffffff81623274>] genl_family_rcv_msg+0x264/0x300
[   26.523983]        [<ffffffff8162339e>] genl_rcv_msg+0x8e/0xd0
[   26.523984]        [<ffffffff81621059>] netlink_rcv_skb+0xa9/0xc0
[   26.523986]        [<ffffffff816215b8>] genl_rcv+0x28/0x40
[   26.523988]        [<ffffffff8162069d>] netlink_unicast+0xdd/0x190
[   26.523989]        [<ffffffff81620a79>] netlink_sendmsg+0x329/0x750
[   26.523991]        [<ffffffff815d1119>] sock_sendmsg+0x99/0xd0
[   26.523997]        [<ffffffff815d153e>] ___sys_sendmsg+0x39e/0x3b0
[   26.523999]        [<ffffffff815d2aa2>] __sys_sendmsg+0x42/0x80
[   26.524001]        [<ffffffff815d2af2>] SyS_sendmsg+0x12/0x20
[   26.524003]        [<ffffffff817371d9>] system_call_fastpath+0x16/0x1b
[   26.524006] 
-> #0 (genl_mutex){+.+.+.}:
[   26.524008]        [<ffffffff810e88b4>] __lock_acquire+0x17b4/0x1b20
[   26.524010]        [<ffffffff810e9412>] lock_acquire+0xa2/0x1f0
[   26.524011]        [<ffffffff81727ef7>] mutex_lock_nested+0x87/0x3e0
[   26.524013]        [<ffffffff816220b2>] ctrl_dumpfamily+0x172/0x180
[   26.524016]        [<ffffffff8161e640>] netlink_dump+0x80/0x230
[   26.524018]        [<ffffffff8161eac8>] netlink_recvmsg+0x2d8/0x390
[   26.524020]        [<ffffffff815d17c8>] sock_recvmsg+0xa8/0xe0
[   26.524023]        [<ffffffff815d2233>] ___sys_recvmsg+0x113/0x290
[   26.524026]        [<ffffffff815d2d12>] __sys_recvmsg+0x42/0x80
[   26.524029]        [<ffffffff815d2d62>] SyS_recvmsg+0x12/0x20
[   26.524032]        [<ffffffff817371d9>] system_call_fastpath+0x16/0x1b
[   26.524034] 
other info that might help us debug this:

[   26.524036]  Possible unsafe locking scenario:

[   26.524037]        CPU0                    CPU1
[   26.524038]        ----                    ----
[   26.524039]   lock(nlk->cb_mutex);
[   26.524041]                                lock(genl_mutex);
[   26.524043]                                lock(nlk->cb_mutex);
[   26.524044]   lock(genl_mutex);
[   26.524046] 
 *** DEADLOCK ***

[   26.524048] 1 lock held by crda/547:
[   26.524049]  #0:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff8161e5e1>] netlink_dump+0x21/0x230
[   26.524053] 
stack backtrace:
[   26.524055] CPU: 0 PID: 547 Comm: crda Not tainted 3.11.0-0.rc5.git4.1.fc20.x86_64 #1
[   26.524057] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2012
[   26.524058]  ffffffff82584270 ffff88001ca5f910 ffffffff817242e6 ffffffff82584270
[   26.524061]  ffff88001ca5f950 ffffffff8171fee7 ffff88001ca5f9a0 ffff88002237cd50
[   26.524063]  ffff88002237c500 0000000000000001 0000000000000001 ffff88002237cd50
[   26.524065] Call Trace:
[   26.524068]  [<ffffffff817242e6>] dump_stack+0x54/0x74
[   26.524071]  [<ffffffff8171fee7>] print_circular_bug+0x201/0x20f
[   26.524073]  [<ffffffff810e88b4>] __lock_acquire+0x17b4/0x1b20
[   26.524076]  [<ffffffff810e9412>] lock_acquire+0xa2/0x1f0
[   26.524078]  [<ffffffff816220b2>] ? ctrl_dumpfamily+0x172/0x180
[   26.524079]  [<ffffffff81727ef7>] mutex_lock_nested+0x87/0x3e0
[   26.524081]  [<ffffffff816220b2>] ? ctrl_dumpfamily+0x172/0x180
[   26.524083]  [<ffffffff816220b2>] ? ctrl_dumpfamily+0x172/0x180
[   26.524086]  [<ffffffff815db081>] ? __kmalloc_reserve.isra.25+0x31/0x90
[   26.524088]  [<ffffffff816220b2>] ctrl_dumpfamily+0x172/0x180
[   26.524089]  [<ffffffff8161e640>] netlink_dump+0x80/0x230
[   26.524091]  [<ffffffff8161eac8>] netlink_recvmsg+0x2d8/0x390
[   26.524093]  [<ffffffff815d17c8>] sock_recvmsg+0xa8/0xe0
[   26.524096]  [<ffffffff810e359d>] ? trace_hardirqs_off+0xd/0x10
[   26.524098]  [<ffffffff810e40ff>] ? lock_release_holdtime.part.28+0xf/0x1a0
[   26.524100]  [<ffffffff810e9868>] ? lock_release_non_nested+0x308/0x350
[   26.524102]  [<ffffffff810e40ff>] ? lock_release_holdtime.part.28+0xf/0x1a0
[   26.524104]  [<ffffffff815d2233>] ___sys_recvmsg+0x113/0x290
[   26.524106]  [<ffffffff810e40ff>] ? lock_release_holdtime.part.28+0xf/0x1a0
[   26.524109]  [<ffffffff810a5b5f>] ? up_read+0x1f/0x40
[   26.524113]  [<ffffffff812101a7>] ? fget_light+0x3a7/0x510
[   26.524115]  [<ffffffff815d2d12>] __sys_recvmsg+0x42/0x80
[   26.524117]  [<ffffffff815d2d62>] SyS_recvmsg+0x12/0x20
[   26.524119]  [<ffffffff817371d9>] system_call_fastpath+0x16/0x1b


Expected results:


Additional info:

Comment 1 Vinson Lee 2013-08-19 07:27:51 UTC
3.11.0-0.rc5.git5.1.fc20.x86_64 - bad
3.11.0-0.rc5.git4.1.fc20.x86_64 - bad
3.11.0-0.rc5.git2.1.fc20.x86_64 - good
3.11.0-0.rc5.git0.1.fc20.x86_64 - good

Comment 2 kevin martin 2013-08-20 20:47:33 UTC
3.11.0-0.rc5.git6.1.fc20.x86_64 - bad


[   17.919778] ======================================================
[   17.919778] [ INFO: possible circular locking dependency detected ]
[   17.919781] 3.11.0-0.rc5.git6.1.fc20.x86_64 #1 Not tainted
[   17.919781] -------------------------------------------------------
[   17.919782] crda/317 is trying to acquire lock:
[   17.919793]  (genl_mutex){+.+.+.}, at: [<ffffffff81622152>] ctrl_dumpfamily+0x172/0x180
[   17.919796] 
but task is already holding lock:
[   17.919799]  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff8161e681>] netlink_dump+0x21/0x230
[   17.919800] 
which lock already depends on the new lock.

[   17.919800] 
the existing dependency chain (in reverse order) is:
[   17.919803] 
-> #1 (nlk->cb_mutex){+.+.+.}:
[   17.919808]        [<ffffffff810e9412>] lock_acquire+0xa2/0x1f0
[   17.919811]        [<ffffffff81727f97>] mutex_lock_nested+0x87/0x3e0
[   17.919813]        [<ffffffff8161f245>] __netlink_dump_start+0xb5/0x1c0
[   17.919815]        [<ffffffff81623314>] genl_family_rcv_msg+0x264/0x300
[   17.919817]        [<ffffffff8162343e>] genl_rcv_msg+0x8e/0xd0
[   17.919820]        [<ffffffff816210f9>] netlink_rcv_skb+0xa9/0xc0
[   17.919822]        [<ffffffff81621658>] genl_rcv+0x28/0x40
[   17.919824]        [<ffffffff8162073d>] netlink_unicast+0xdd/0x190
[   17.919826]        [<ffffffff81620b19>] netlink_sendmsg+0x329/0x750
[   17.919832]        [<ffffffff815d11b9>] sock_sendmsg+0x99/0xd0
[   17.919834]        [<ffffffff815d15de>] ___sys_sendmsg+0x39e/0x3b0
[   17.919837]        [<ffffffff815d2b42>] __sys_sendmsg+0x42/0x80
[   17.919839]        [<ffffffff815d2b92>] SyS_sendmsg+0x12/0x20
[   17.919843]        [<ffffffff81737259>] system_call_fastpath+0x16/0x1b
[   17.919845] 
-> #0 (genl_mutex){+.+.+.}:
[   17.919848]        [<ffffffff810e88b4>] __lock_acquire+0x17b4/0x1b20
[   17.919850]        [<ffffffff810e9412>] lock_acquire+0xa2/0x1f0
[   17.919852]        [<ffffffff81727f97>] mutex_lock_nested+0x87/0x3e0
[   17.919854]        [<ffffffff81622152>] ctrl_dumpfamily+0x172/0x180
[   17.919856]        [<ffffffff8161e6e0>] netlink_dump+0x80/0x230
[   17.919858]        [<ffffffff8161eb68>] netlink_recvmsg+0x2d8/0x390
[   17.919861]        [<ffffffff815d1868>] sock_recvmsg+0xa8/0xe0
[   17.919863]        [<ffffffff815d22d3>] ___sys_recvmsg+0x113/0x290
[   17.919865]        [<ffffffff815d2db2>] __sys_recvmsg+0x42/0x80
[   17.919868]        [<ffffffff815d2e02>] SyS_recvmsg+0x12/0x20
[   17.919870]        [<ffffffff81737259>] system_call_fastpath+0x16/0x1b
[   17.919871] 
other info that might help us debug this:

[   17.919872]  Possible unsafe locking scenario:

[   17.919872]        CPU0                    CPU1
[   17.919873]        ----                    ----
[   17.919874]   lock(nlk->cb_mutex);
[   17.919876]                                lock(genl_mutex);
[   17.919877]                                lock(nlk->cb_mutex);
[   17.919878]   lock(genl_mutex);
[   17.919879] 
 *** DEADLOCK ***

[   17.919881] 1 lock held by crda/317:
[   17.919885]  #0:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff8161e681>] netlink_dump+0x21/0x230
[   17.919885] 
stack backtrace:
[   17.919888] CPU: 0 PID: 317 Comm: crda Not tainted 3.11.0-0.rc5.git6.1.fc20.x86_64 #1
[   17.919889] Hardware name: TOSHIBA Qosmio X505/Qosmio X505, BIOS V2.90    12/10/2010
[   17.919893]  ffffffff825870f0 ffff8801328c7910 ffffffff81724386 ffffffff825870f0
[   17.919896]  ffff8801328c7950 ffffffff8171ff87 ffff8801328c79a0 ffff880132cc0850
[   17.919898]  ffff880132cc0000 0000000000000001 0000000000000001 ffff880132cc0850
[   17.919899] Call Trace:
[   17.919905]  [<ffffffff81724386>] dump_stack+0x54/0x74
[   17.919910]  [<ffffffff8171ff87>] print_circular_bug+0x201/0x20f
[   17.919912]  [<ffffffff810e88b4>] __lock_acquire+0x17b4/0x1b20
[   17.919915]  [<ffffffff810e9412>] lock_acquire+0xa2/0x1f0
[   17.919918]  [<ffffffff81622152>] ? ctrl_dumpfamily+0x172/0x180
[   17.919920]  [<ffffffff81727f97>] mutex_lock_nested+0x87/0x3e0
[   17.919922]  [<ffffffff81622152>] ? ctrl_dumpfamily+0x172/0x180
[   17.919924]  [<ffffffff81622152>] ? ctrl_dumpfamily+0x172/0x180
[   17.919928]  [<ffffffff815db121>] ? __kmalloc_reserve.isra.25+0x31/0x90
[   17.919931]  [<ffffffff81622152>] ctrl_dumpfamily+0x172/0x180
[   17.919933]  [<ffffffff8161e6e0>] netlink_dump+0x80/0x230
[   17.919935]  [<ffffffff8161eb68>] netlink_recvmsg+0x2d8/0x390
[   17.919940]  [<ffffffff810b7855>] ? sched_clock_cpu+0xb5/0x100
[   17.919943]  [<ffffffff815d1868>] sock_recvmsg+0xa8/0xe0
[   17.919946]  [<ffffffff810e359d>] ? trace_hardirqs_off+0xd/0x10
[   17.919949]  [<ffffffff810e40ff>] ? lock_release_holdtime.part.28+0xf/0x1a0
[   17.919951]  [<ffffffff810e9868>] ? lock_release_non_nested+0x308/0x350
[   17.919954]  [<ffffffff810e40ff>] ? lock_release_holdtime.part.28+0xf/0x1a0
[   17.919957]  [<ffffffff815d22d3>] ___sys_recvmsg+0x113/0x290
[   17.919959]  [<ffffffff810e40ff>] ? lock_release_holdtime.part.28+0xf/0x1a0
[   17.919963]  [<ffffffff810a5b5f>] ? up_read+0x1f/0x40
[   17.919973]  [<ffffffff812101a7>] ? fget_light+0x3a7/0x510
[   17.919976]  [<ffffffff815d2db2>] __sys_recvmsg+0x42/0x80
[   17.919978]  [<ffffffff815d2e02>] SyS_recvmsg+0x12/0x20
[   17.919981]  [<ffffffff81737259>] system_call_fastpath+0x16/0x1b

Comment 3 Vinson Lee 2013-08-24 18:55:35 UTC
3.11.0-0.rc6.git2.1.fc21.x86_64

[   26.812822] ======================================================
[   26.812823] [ INFO: possible circular locking dependency detected ]
[   26.812825] 3.11.0-0.rc6.git2.1.fc21.x86_64 #1 Not tainted
[   26.812826] -------------------------------------------------------
[   26.812835] crda/547 is trying to acquire lock:
[   26.812836]  (genl_mutex){+.+.+.}, at: [<ffffffff816215b2>] ctrl_dumpfamily+0x172/0x180
[   26.812843] 
but task is already holding lock:
[   26.812844]  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff8161dae1>] netlink_dump+0x21/0x230
[   26.812847] 
which lock already depends on the new lock.

[   26.812849] 
the existing dependency chain (in reverse order) is:
[   26.812850] 
-> #1 (nlk->cb_mutex){+.+.+.}:
[   26.812852]        [<ffffffff810e9422>] lock_acquire+0xa2/0x1f0
[   26.812856]        [<ffffffff81727317>] mutex_lock_nested+0x87/0x3e0
[   26.812858]        [<ffffffff8161e6a5>] __netlink_dump_start+0xb5/0x1c0
[   26.812860]        [<ffffffff81622774>] genl_family_rcv_msg+0x264/0x300
[   26.812862]        [<ffffffff8162289e>] genl_rcv_msg+0x8e/0xd0
[   26.812863]        [<ffffffff81620559>] netlink_rcv_skb+0xa9/0xc0
[   26.812865]        [<ffffffff81620ab8>] genl_rcv+0x28/0x40
[   26.812867]        [<ffffffff8161fb9d>] netlink_unicast+0xdd/0x190
[   26.812868]        [<ffffffff8161ff79>] netlink_sendmsg+0x329/0x750
[   26.812870]        [<ffffffff815d0619>] sock_sendmsg+0x99/0xd0
[   26.812877]        [<ffffffff815d0a3e>] ___sys_sendmsg+0x39e/0x3b0
[   26.812878]        [<ffffffff815d1fa2>] __sys_sendmsg+0x42/0x80
[   26.812881]        [<ffffffff815d1ff2>] SyS_sendmsg+0x12/0x20
[   26.812882]        [<ffffffff817365d9>] system_call_fastpath+0x16/0x1b
[   26.812885] 
-> #0 (genl_mutex){+.+.+.}:
[   26.812887]        [<ffffffff810e88c4>] __lock_acquire+0x17b4/0x1b20
[   26.812889]        [<ffffffff810e9422>] lock_acquire+0xa2/0x1f0
[   26.812891]        [<ffffffff81727317>] mutex_lock_nested+0x87/0x3e0
[   26.812892]        [<ffffffff816215b2>] ctrl_dumpfamily+0x172/0x180
[   26.812894]        [<ffffffff8161db40>] netlink_dump+0x80/0x230
[   26.812895]        [<ffffffff8161dfc8>] netlink_recvmsg+0x2d8/0x390
[   26.812897]        [<ffffffff815d0cc8>] sock_recvmsg+0xa8/0xe0
[   26.812899]        [<ffffffff815d1733>] ___sys_recvmsg+0x113/0x290
[   26.812901]        [<ffffffff815d2212>] __sys_recvmsg+0x42/0x80
[   26.812903]        [<ffffffff815d2262>] SyS_recvmsg+0x12/0x20
[   26.812904]        [<ffffffff817365d9>] system_call_fastpath+0x16/0x1b
[   26.812906] 
other info that might help us debug this:

[   26.812908]  Possible unsafe locking scenario:

[   26.812909]        CPU0                    CPU1
[   26.812909]        ----                    ----
[   26.812910]   lock(nlk->cb_mutex);
[   26.812911]                                lock(genl_mutex);
[   26.812913]                                lock(nlk->cb_mutex);
[   26.812914]   lock(genl_mutex);
[   26.812915] 
 *** DEADLOCK ***

[   26.812925] 1 lock held by crda/547:
[   26.812925]  #0:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff8161dae1>] netlink_dump+0x21/0x230
[   26.812928] 
stack backtrace:
[   26.812930] CPU: 0 PID: 547 Comm: crda Not tainted 3.11.0-0.rc6.git2.1.fc21.x86_64 #1
[   26.812931] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2012
[   26.812933]  ffffffff82584270 ffff8800382ff910 ffffffff81723706 ffffffff82584270
[   26.812935]  ffff8800382ff950 ffffffff8171f307 ffff8800382ff9a0 ffff88001d21e450
[   26.812937]  ffff88001d21dc00 0000000000000001 0000000000000001 ffff88001d21e450
[   26.812939] Call Trace:
[   26.812943]  [<ffffffff81723706>] dump_stack+0x54/0x74
[   26.812946]  [<ffffffff8171f307>] print_circular_bug+0x201/0x20f
[   26.812948]  [<ffffffff810e88c4>] __lock_acquire+0x17b4/0x1b20
[   26.812950]  [<ffffffff810e9422>] lock_acquire+0xa2/0x1f0
[   26.812952]  [<ffffffff816215b2>] ? ctrl_dumpfamily+0x172/0x180
[   26.812953]  [<ffffffff81727317>] mutex_lock_nested+0x87/0x3e0
[   26.812955]  [<ffffffff816215b2>] ? ctrl_dumpfamily+0x172/0x180
[   26.812957]  [<ffffffff816215b2>] ? ctrl_dumpfamily+0x172/0x180
[   26.812959]  [<ffffffff815da581>] ? __kmalloc_reserve.isra.25+0x31/0x90
[   26.812961]  [<ffffffff816215b2>] ctrl_dumpfamily+0x172/0x180
[   26.812963]  [<ffffffff8161db40>] netlink_dump+0x80/0x230
[   26.812964]  [<ffffffff8161dfc8>] netlink_recvmsg+0x2d8/0x390
[   26.812966]  [<ffffffff815d0cc8>] sock_recvmsg+0xa8/0xe0
[   26.812968]  [<ffffffff810e35ad>] ? trace_hardirqs_off+0xd/0x10
[   26.812970]  [<ffffffff810e410f>] ? lock_release_holdtime.part.28+0xf/0x1a0
[   26.812972]  [<ffffffff810e9878>] ? lock_release_non_nested+0x308/0x350
[   26.812974]  [<ffffffff810e410f>] ? lock_release_holdtime.part.28+0xf/0x1a0
[   26.812976]  [<ffffffff815d1733>] ___sys_recvmsg+0x113/0x290
[   26.812978]  [<ffffffff810e410f>] ? lock_release_holdtime.part.28+0xf/0x1a0
[   26.812981]  [<ffffffff810a5b6f>] ? up_read+0x1f/0x40
[   26.812985]  [<ffffffff812101a7>] ? fget_light+0x3a7/0x510
[   26.812987]  [<ffffffff815d2212>] __sys_recvmsg+0x42/0x80
[   26.812989]  [<ffffffff815d2262>] SyS_recvmsg+0x12/0x20
[   26.812991]  [<ffffffff817365d9>] system_call_fastpath+0x16/0x1b

Comment 4 Josh Boyer 2013-08-26 14:46:40 UTC
This should be fixed in 3.11-rc7, which will be built later today.