Bug 734519 - possible circular locking dependency detected in restore_regulatory_settings
Summary: possible circular locking dependency detected in restore_regulatory_settings
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 16
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: John W. Linville
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-08-30 16:44 UTC by Mikko Tiihonen
Modified: 2012-09-07 16:07 UTC (History)
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-09-07 16:07:55 UTC
Type: ---
Embargoed:



Description Mikko Tiihonen 2011-08-30 16:44:37 UTC
Description of problem:
=======================================================
[ INFO: possible circular locking dependency detected ]
3.1.0-0.rc4.git0.0.fc16.x86_64 #1
-------------------------------------------------------
kworker/5:2/511 is trying to acquire lock:
 (cfg80211_mutex){+.+.+.}, at: [<ffffffffa02ef603>] restore_regulatory_settings+0x2f/0x2e6 [cfg80211]

but task is already holding lock:
 ((reg_timeout).work){+.+...}, at: [<ffffffff81075b91>] process_one_work+0x14d/0x3e7

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 ((reg_timeout).work){+.+...}:
       [<ffffffff8108f143>] lock_acquire+0xf3/0x13e
       [<ffffffff81074f79>] wait_on_work+0x55/0xc7
       [<ffffffff810760b3>] __cancel_work_timer+0xcc/0x10a
       [<ffffffff81076103>] cancel_delayed_work_sync+0x12/0x14
       [<ffffffffa02eecb8>] reg_set_request_processed+0x4e/0x68 [cfg80211]
       [<ffffffffa02efeef>] set_regdom+0x43c/0x4c0 [cfg80211]
       [<ffffffffa02f9159>] nl80211_set_reg+0x1ce/0x22a [cfg80211]
       [<ffffffff8143af6e>] genl_rcv_msg+0x1db/0x206
       [<ffffffff8143a997>] netlink_rcv_skb+0x43/0x8f
       [<ffffffff8143ad8c>] genl_rcv+0x26/0x2d
       [<ffffffff8143a493>] netlink_unicast+0xec/0x156
       [<ffffffff8143a781>] netlink_sendmsg+0x284/0x2c5
       [<ffffffff81402f0f>] sock_sendmsg+0xe6/0x109
       [<ffffffff81404cc3>] __sys_sendmsg+0x226/0x2cf
       [<ffffffff81405efb>] sys_sendmsg+0x42/0x60
       [<ffffffff8150b742>] system_call_fastpath+0x16/0x1b

-> #1 (reg_mutex){+.+.+.}:
       [<ffffffff8108f143>] lock_acquire+0xf3/0x13e
       [<ffffffff81502cb3>] __mutex_lock_common+0x5d/0x39a
       [<ffffffff815030ff>] mutex_lock_nested+0x40/0x45
       [<ffffffffa02ef10d>] reg_todo+0x32/0x4a4 [cfg80211]
       [<ffffffff81075c49>] process_one_work+0x205/0x3e7
       [<ffffffff810768f7>] worker_thread+0xda/0x15d
       [<ffffffff8107a2bd>] kthread+0xa8/0xb0
       [<ffffffff8150d944>] kernel_thread_helper+0x4/0x10

-> #0 (cfg80211_mutex){+.+.+.}:
       [<ffffffff8108e963>] __lock_acquire+0xa2f/0xd0c
       [<ffffffff8108f143>] lock_acquire+0xf3/0x13e
       [<ffffffff81502cb3>] __mutex_lock_common+0x5d/0x39a
       [<ffffffff815030ff>] mutex_lock_nested+0x40/0x45
       [<ffffffffa02ef603>] restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
       [<ffffffffa02ef8cd>] reg_timeout_work+0x13/0x15 [cfg80211]
       [<ffffffff81075c49>] process_one_work+0x205/0x3e7
       [<ffffffff810768f7>] worker_thread+0xda/0x15d
       [<ffffffff8107a2bd>] kthread+0xa8/0xb0
       [<ffffffff8150d944>] kernel_thread_helper+0x4/0x10

other info that might help us debug this:

Chain exists of:
  cfg80211_mutex --> reg_mutex --> (reg_timeout).work

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((reg_timeout).work);
                               lock(reg_mutex);
                               lock((reg_timeout).work);
  lock(cfg80211_mutex);

 *** DEADLOCK ***

2 locks held by kworker/5:2/511:
 #0:  (events){.+.+.+}, at: [<ffffffff81075b91>] process_one_work+0x14d/0x3e7
 #1:  ((reg_timeout).work){+.+...}, at: [<ffffffff81075b91>] process_one_work+0x14d/0x3e7

stack backtrace:
Pid: 511, comm: kworker/5:2 Tainted: G        W   3.1.0-0.rc4.git0.0.fc16.x86_64 #1
Call Trace:
 [<ffffffff814fa254>] print_circular_bug+0x1f8/0x209
 [<ffffffff8108e963>] __lock_acquire+0xa2f/0xd0c
 [<ffffffff8108dd41>] ? mark_lock+0x2d/0x220
 [<ffffffff8108e439>] ? __lock_acquire+0x505/0xd0c
 [<ffffffffa02ef603>] ? restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
 [<ffffffff8108f143>] lock_acquire+0xf3/0x13e
 [<ffffffffa02ef603>] ? restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
 [<ffffffffa02ef603>] ? restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
 [<ffffffffa02ef8ba>] ? restore_regulatory_settings+0x2e6/0x2e6 [cfg80211]
 [<ffffffff81502cb3>] __mutex_lock_common+0x5d/0x39a
 [<ffffffffa02ef603>] ? restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
 [<ffffffff810804e6>] ? local_clock+0x36/0x4d
 [<ffffffff81014ded>] ? paravirt_read_tsc+0x9/0xd
 [<ffffffff810152b7>] ? native_sched_clock+0x34/0x36
 [<ffffffff8108b9b5>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffffa02ef8ba>] ? restore_regulatory_settings+0x2e6/0x2e6 [cfg80211]
 [<ffffffff815030ff>] mutex_lock_nested+0x40/0x45
 [<ffffffffa02ef603>] restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
 [<ffffffffa02ef8cd>] reg_timeout_work+0x13/0x15 [cfg80211]
 [<ffffffff81075c49>] process_one_work+0x205/0x3e7
 [<ffffffff81075b91>] ? process_one_work+0x14d/0x3e7
 [<ffffffff8108d01b>] ? lock_acquired+0x210/0x243
 [<ffffffff810768f7>] worker_thread+0xda/0x15d
 [<ffffffff8107681d>] ? manage_workers+0x176/0x176
 [<ffffffff8107a2bd>] kthread+0xa8/0xb0
 [<ffffffff8150d944>] kernel_thread_helper+0x4/0x10
 [<ffffffff81504db4>] ? retint_restore_args+0x13/0x13
 [<ffffffff8107a215>] ? __init_kthread_worker+0x5a/0x5a
 [<ffffffff8150d940>] ? gs_change+0x13/0x13


Version-Release number of selected component (if applicable):
kernel 3.1.0-0.rc4.git0.0.fc16.x86_64

How reproducible:
I had a flaky WLAN connection that kept switching between the world regulatory domain and the local regulatory domain. So far I have only found this warning once in the system logs.
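
For reference, the dependency cycle lockdep reports above can be reduced to a minimal pattern: a delayed-work handler that takes a mutex, combined with a path that calls cancel_delayed_work_sync() on that same work while holding the mutex. Lockdep models "waiting for the work to finish" as acquiring the work's pseudo-lock, so the two orderings together form a cycle. The sketch below is only an illustration, not the cfg80211 code itself; the demo_* names are hypothetical, and the real chain above runs through two mutexes (cfg80211_mutex and reg_mutex) rather than one.

/*
 * Illustrative module reproducing the reported pattern:
 *   - the delayed work handler takes a mutex            (work -> mutex)
 *   - another path cancels the work synchronously
 *     while holding that same mutex                     (mutex -> work)
 */
#include <linux/module.h>
#include <linux/mutex.h>
#include <linux/workqueue.h>
#include <linux/jiffies.h>

static DEFINE_MUTEX(demo_mutex);        /* stands in for reg_mutex/cfg80211_mutex */

static void demo_timeout(struct work_struct *work);
static DECLARE_DELAYED_WORK(demo_work, demo_timeout);   /* stands in for reg_timeout */

/* Work handler: runs in a kworker and grabs the mutex (work -> mutex). */
static void demo_timeout(struct work_struct *work)
{
        mutex_lock(&demo_mutex);
        /* ... restore settings ... */
        mutex_unlock(&demo_mutex);
}

/* Request path: holds the mutex and waits for the work (mutex -> work). */
static void demo_request_processed(void)
{
        mutex_lock(&demo_mutex);
        cancel_delayed_work_sync(&demo_work);   /* may wait for demo_timeout() */
        mutex_unlock(&demo_mutex);
}

static int __init demo_init(void)
{
        schedule_delayed_work(&demo_work, HZ);
        demo_request_processed();  /* with lockdep on, the cycle is reported once both orderings are seen */
        return 0;
}

static void __exit demo_exit(void)
{
        cancel_delayed_work_sync(&demo_work);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");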

Comment 1 Dave Jones 2012-03-22 17:14:33 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.


Comment 4 Ben Greear 2012-05-18 22:48:56 UTC
I hit this same thing in a somewhat-hacked 3.3.6+ kernel, so it seems
that it still exists in the latest stable code.

Comment 5 Josh Boyer 2012-09-07 16:07:55 UTC
This appears to have been fixed with commit fe20b39ec32e975f1054c0b7866c873a954adf05 in 3.5.  That was backported to 3.4.5 and should thus be fixed in F16.
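
The commit itself is not quoted here, but the usual way to break this kind of cycle is to stop waiting synchronously for the work while the mutex is held, either by cancelling with the non-blocking variant or by doing the synchronous cancel outside the lock. A hypothetical sketch, continuing the demo_* names from the description above (not necessarily what commit fe20b39ec32e actually does):

/*
 * Hypothetical rearrangement: drop the mutex -> work dependency by
 * never waiting for a running handler while the mutex is held.
 */
static void demo_request_processed_fixed(void)
{
        mutex_lock(&demo_mutex);
        cancel_delayed_work(&demo_work);        /* non-blocking: no wait, no edge */
        mutex_unlock(&demo_mutex);
}

The trade-off is that a non-blocking cancel does not wait for an already-running handler, so the caller must tolerate one late execution of demo_timeout().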

