Bug 438584

Summary: iwl4965/938 is trying to acquire lock: [...] but task is already holding lock: [...]
Product: [Fedora] Fedora
Reporter: Adrian "Adi1981" P. <adi1981.2k5>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Priority: low
Version: rawhide
CC: grgustaf, jane.lv, kernel-maint
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-05-07 11:56:47 UTC
Attachments:
Excerpt from /var/log/messages (attachment 299594, flags: none)

Description Adrian "Adi1981" P. 2008-03-22 13:49:41 UTC
Description of problem:

I found this today in dmesg, but I have no idea what triggered the error.
My wlan (Intel Corporation PRO/Wireless 4965 AG or AGN Network Connection
(rev 61)) is working normally and has no problems connecting to the AP
(WPA encrypted, using wpa_supplicant).

wlan0: No ProbeResp from current AP 00:0e:2e:ea:9f:e2 - assume out of range

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.25-0.121.rc5.git4.fc9 #1
-------------------------------------------------------
iwl4965/938 is trying to acquire lock:
 (rtnl_mutex){--..}, at: [<c05cd150>] rtnl_lock+0xf/0x11

but task is already holding lock:
 (&ifsta->work){--..}, at: [<c043651e>] run_workqueue+0x91/0x1a1

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&ifsta->work){--..}:
       [<c0445774>] __lock_acquire+0xa99/0xc11
       [<c0445956>] lock_acquire+0x6a/0x90
       [<c043655a>] run_workqueue+0xcd/0x1a1
       [<c04366e4>] worker_thread+0xb6/0xc2
       [<c04392e6>] kthread+0x3b/0x61
       [<c04069ef>] kernel_thread_helper+0x7/0x10
       [<ffffffff>] 0xffffffff

-> #1 ((name)){--..}:
       [<c0445774>] __lock_acquire+0xa99/0xc11
       [<c0445956>] lock_acquire+0x6a/0x90
       [<c0436de2>] flush_workqueue+0x44/0x85
       [<f89fb421>] ieee80211_stop+0x28d/0x348 [mac80211]
       [<c05c5a11>] dev_close+0x52/0x6f
       [<c05c5768>] dev_change_flags+0x9f/0x152
       [<c05cc2eb>] do_setlink+0x24a/0x2fc
       [<c05cc47f>] rtnl_setlink+0xe2/0xe6
       [<c05cd31a>] rtnetlink_rcv_msg+0x1a2/0x1bc
       [<c05da486>] netlink_rcv_skb+0x30/0x86
       [<c05cd170>] rtnetlink_rcv+0x1e/0x26
       [<c05d9faa>] netlink_unicast+0x1b7/0x215
       [<c05da260>] netlink_sendmsg+0x258/0x265
       [<c05ba4e9>] sock_sendmsg+0xde/0xf9
       [<c05ba643>] sys_sendmsg+0x13f/0x192
       [<c05bb577>] sys_socketcall+0x16b/0x188
       [<c0405d12>] syscall_call+0x7/0xb
       [<ffffffff>] 0xffffffff

-> #0 (rtnl_mutex){--..}:
       [<c0445693>] __lock_acquire+0x9b8/0xc11
       [<c0445956>] lock_acquire+0x6a/0x90
       [<c0637e09>] mutex_lock_nested+0xdb/0x271
       [<c05cd150>] rtnl_lock+0xf/0x11
       [<f8a03628>] ieee80211_associated+0x15c/0x19b [mac80211]
       [<f8a05eaa>] ieee80211_sta_work+0x15ad/0x172a [mac80211]
       [<c0436560>] run_workqueue+0xd3/0x1a1
       [<c04366e4>] worker_thread+0xb6/0xc2
       [<c04392e6>] kthread+0x3b/0x61
       [<c04069ef>] kernel_thread_helper+0x7/0x10
       [<ffffffff>] 0xffffffff

other info that might help us debug this:

2 locks held by iwl4965/938:
 #0:  ((name)){--..}, at: [<c043651e>] run_workqueue+0x91/0x1a1
 #1:  (&ifsta->work){--..}, at: [<c043651e>] run_workqueue+0x91/0x1a1

stack backtrace:
Pid: 938, comm: iwl4965 Not tainted 2.6.25-0.121.rc5.git4.fc9 #1
 [<c0444ac5>] print_circular_bug_tail+0x5b/0x66
 [<c0444938>] ? print_circular_bug_entry+0x39/0x43
 [<c0445693>] __lock_acquire+0x9b8/0xc11
 [<c040a2f4>] ? native_sched_clock+0xb5/0xd1
 [<c04447bb>] ? trace_hardirqs_on+0xe9/0x10a
 [<c0445956>] lock_acquire+0x6a/0x90
 [<c05cd150>] ? rtnl_lock+0xf/0x11
 [<c0637e09>] mutex_lock_nested+0xdb/0x271
 [<c05cd150>] ? rtnl_lock+0xf/0x11
 [<c05cd150>] ? rtnl_lock+0xf/0x11
 [<c05cd150>] rtnl_lock+0xf/0x11
 [<f8a03628>] ieee80211_associated+0x15c/0x19b [mac80211]
 [<f8a05eaa>] ieee80211_sta_work+0x15ad/0x172a [mac80211]
 [<c040a2f4>] ? native_sched_clock+0xb5/0xd1
 [<c040a2f4>] ? native_sched_clock+0xb5/0xd1
 [<c040a2f4>] ? native_sched_clock+0xb5/0xd1
 [<c040a2f4>] ? native_sched_clock+0xb5/0xd1
 [<c040a026>] ? sched_clock+0x8/0xb
 [<c0436560>] run_workqueue+0xd3/0x1a1
 [<c043651e>] ? run_workqueue+0x91/0x1a1
 [<f8a048fd>] ? ieee80211_sta_work+0x0/0x172a [mac80211]
 [<c04366e4>] worker_thread+0xb6/0xc2
 [<c0439537>] ? autoremove_wake_function+0x0/0x33
 [<c043662e>] ? worker_thread+0x0/0xc2
 [<c04392e6>] kthread+0x3b/0x61
 [<c04392ab>] ? kthread+0x0/0x61
 [<c04069ef>] kernel_thread_helper+0x7/0x10
 =======================

Version-Release number of selected component (if applicable):
kernel-2.6.25-0.121.rc5.git4.fc9.i686

Steps to Reproduce:
1. Just booted the laptop and started using wifi, but I don't have more info on
when exactly the error occurred

  
Actual results:
Wireless is working though.
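
Editorial note: the lockdep report above can be read as a lock-order graph. The sketch below is only an illustration of that reading, not how lockdep is implemented in the kernel. The node names are shorthand chosen here (`workqueue` stands for the `((name))` lock class, `ifsta_work` for `&ifsta->work`), and the edges are taken from chain entries #1, #2 and #0 of the report.

```python
from typing import Dict, List, Optional

# "A": ["B"] means: B was acquired (or waited on) while A was held.
# Node names are illustrative shorthand, not kernel identifiers.
EDGES: Dict[str, List[str]] = {
    # Entry #1: ieee80211_stop() calls flush_workqueue() under rtnl_mutex.
    "rtnl_mutex": ["workqueue"],
    # Entry #2: run_workqueue() runs the work item inside the workqueue.
    "workqueue": ["ifsta_work"],
    # Entry #0: ieee80211_associated() calls rtnl_lock() from the work item.
    "ifsta_work": ["rtnl_mutex"],
}


def find_cycle(edges: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of nodes (first == last), or None."""
    path: List[str] = []   # nodes on the current depth-first path
    done = set()           # nodes fully explored without finding a cycle

    def dfs(node: str) -> Optional[List[str]]:
        if node in path:
            # Reached a node already on the path: the tail of the path is a cycle.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for nxt in edges.get(node, []):
            cycle = dfs(nxt)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in edges:
        cycle = dfs(start)
        if cycle:
            return cycle
    return None


print(" -> ".join(find_cycle(EDGES)))
# prints: rtnl_mutex -> workqueue -> ifsta_work -> rtnl_mutex
```

The cycle is why the report is only a "possible" deadlock: the work item wants rtnl_mutex while another task could be holding rtnl_mutex and waiting in flush_workqueue() for that same work item to finish.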

Comment 1 Volker Braun 2008-03-29 20:58:21 UTC
Created attachment 299594 [details]
Excerpt from /var/log/messages

I found a very similar trace with 2.6.25-0.150.rc6.git7.fc9 (Fedora 9 beta +
yum update). In my case, it is triggered by suspend-to-ram (happens during the
suspend phase). Resume works perfectly, I get right back to X with the
out-of-the-box install (awesome work, Fedora team!).
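
Editorial note: to compare traces like the two mentioned in this bug, it can help to pull the task and the two lock names out of a log excerpt. The following is a rough sketch that assumes the report layout shown in the description (banner line, then "is trying to acquire lock:" and "already holding lock:" each followed by a lock line). The function name `parse_reports` and the 20-line search window are choices made here, not part of any existing tool.

```python
import re
from typing import Dict, List

BANNER = "possible circular locking dependency detected"
# Matches lines such as " (rtnl_mutex){--..}, at: [<c05cd150>] rtnl_lock+0xf/0x11".
# search() is used so that a syslog prefix in front of the line does not matter.
LOCK_LINE = re.compile(r"\((?P<name>.+?)\)\{[^}]*\}, at:")


def parse_reports(log_text: str) -> List[Dict[str, str]]:
    """Extract task, lock being acquired and lock held from each lockdep report."""
    lines = log_text.splitlines()
    reports: List[Dict[str, str]] = []
    for i, line in enumerate(lines):
        if BANNER not in line:
            continue
        report: Dict[str, str] = {}
        # Look at the few lines after the banner; each marker line is
        # followed directly by the line naming the lock.
        for j in range(i + 1, min(i + 20, len(lines) - 1)):
            lock = LOCK_LINE.search(lines[j + 1])
            if lock is None:
                continue
            if "is trying to acquire lock:" in lines[j]:
                report["task"] = lines[j].split(" is trying")[0].split()[-1]
                report["acquiring"] = lock.group("name")
            elif "already holding lock:" in lines[j]:
                report["holding"] = lock.group("name")
        reports.append(report)
    return reports


SAMPLE = """\
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.25-0.121.rc5.git4.fc9 #1
-------------------------------------------------------
iwl4965/938 is trying to acquire lock:
 (rtnl_mutex){--..}, at: [<c05cd150>] rtnl_lock+0xf/0x11

but task is already holding lock:
 (&ifsta->work){--..}, at: [<c043651e>] run_workqueue+0x91/0x1a1
"""

for report in parse_reports(SAMPLE):
    print(report)
```

Two reports with the same "acquiring" and "holding" pair are likely to be the same lock-order problem even when the triggering action differs (normal use in one case, suspend-to-ram in the other).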

Comment 2 John W. Linville 2008-03-31 13:50:25 UTC
*** Bug 439712 has been marked as a duplicate of this bug. ***

Comment 3 John W. Linville 2008-05-02 20:38:05 UTC
Are you still seeing this?  I haven't seen any reports like this in a while.

Comment 4 Adrian "Adi1981" P. 2008-05-03 17:14:44 UTC
Well, I saw this only once or twice, and I don't know what needs to be done to
reproduce it, so I can't check whether it repeats. For now I can say that I
haven't seen it for some time. If kerneloops also handles this kind of alert,
then you will know when (if) it appears again :)

Comment 5 Dave Jones 2008-05-03 17:37:04 UTC
Since the regular kernel now has debugging disabled, you'll need to install
kernel-debug to see these (assuming the bug is still there).

Comment 6 Adrian "Adi1981" P. 2008-05-04 00:18:27 UTC
So kerneloops will not catch these bugs unless debugging is enabled? In that
case I can install -debuginfo, but I'm quite sure that this bug will not appear
again (or I will probably not catch it :] ). Anyway, this bug can be closed imo;
if I see it once more, I can always reopen it.

Comment 7 Dave Jones 2008-05-04 01:23:32 UTC
Note: not -debuginfo, kernel-debug. It's a separate kernel image that has
debugging config options turned on. -debuginfo is just a bunch of symbols.

And yes, kerneloops won't catch this msg on the regular kernel, because it won't
happen :)  There's quite a bit of overhead with the lockdep debug feature, which
is why we disable it in the non-debug builds.

Hope this clears things up.
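
Editorial note: one way to tell whether a given kernel can emit these reports at all is to look for the lock-proving option in its build config. The sketch below assumes the config is installed as `/boot/config-<release>` (the usual Fedora layout, but an assumption here) and that `CONFIG_PROVE_LOCKING=y` is the option of interest. The helper names are made up for this illustration.

```python
import os
import platform
import re
from typing import Optional


def lockdep_enabled(config_text: str) -> bool:
    """True if the kernel config text contains CONFIG_PROVE_LOCKING=y."""
    # Disabled options appear as "# CONFIG_PROVE_LOCKING is not set",
    # which this pattern deliberately does not match.
    pattern = r"^CONFIG_PROVE_LOCKING=y$"
    return re.search(pattern, config_text, re.MULTILINE) is not None


def check_running_kernel() -> Optional[bool]:
    """Check the running kernel's config; None if the file cannot be read."""
    # Assumed location of the config file for the running kernel.
    path = "/boot/config-" + platform.release()
    if not os.path.exists(path):
        return None
    try:
        with open(path, encoding="utf-8", errors="replace") as fh:
            return lockdep_enabled(fh.read())
    except OSError:
        return None


if __name__ == "__main__":
    state = check_running_kernel()
    if state is None:
        print("kernel config not found, cannot tell")
    elif state:
        print("lock proving is enabled; lockdep reports can appear")
    else:
        print("lock proving is disabled; install kernel-debug to see lockdep reports")
```

On a regular (non-debug) kernel this is expected to report the option as disabled, which matches the explanation above that kerneloops will not see the message there.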

Comment 8 Adrian "Adi1981" P. 2008-05-06 23:53:39 UTC
Yes, now it's clear to me. I'll let you know if I find this again.

Comment 9 John W. Linville 2008-05-07 11:56:47 UTC
I'll close this for now, and Adrian can reopen it if the problem reoccurs.  
Thanks, Adrian!