Bug 734519 - possible circular locking dependency detected in restore_regulatory_settings
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 16
Hardware: x86_64 Linux
Priority: unspecified
Severity: unspecified
Assigned To: John W. Linville
QA Contact: Fedora Extras Quality Assurance
Reported: 2011-08-30 12:44 EDT by Mikko Tiihonen
Modified: 2012-09-07 12:07 EDT

Doc Type: Bug Fix
Last Closed: 2012-09-07 12:07:55 EDT

Description Mikko Tiihonen 2011-08-30 12:44:37 EDT
Description of problem:
=======================================================
[ INFO: possible circular locking dependency detected ]
3.1.0-0.rc4.git0.0.fc16.x86_64 #1
-------------------------------------------------------
kworker/5:2/511 is trying to acquire lock:
 (cfg80211_mutex){+.+.+.}, at: [<ffffffffa02ef603>] restore_regulatory_settings+0x2f/0x2e6 [cfg80211]

but task is already holding lock:
 ((reg_timeout).work){+.+...}, at: [<ffffffff81075b91>] process_one_work+0x14d/0x3e7

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 ((reg_timeout).work){+.+...}:
       [<ffffffff8108f143>] lock_acquire+0xf3/0x13e
       [<ffffffff81074f79>] wait_on_work+0x55/0xc7
       [<ffffffff810760b3>] __cancel_work_timer+0xcc/0x10a
       [<ffffffff81076103>] cancel_delayed_work_sync+0x12/0x14
       [<ffffffffa02eecb8>] reg_set_request_processed+0x4e/0x68 [cfg80211]
       [<ffffffffa02efeef>] set_regdom+0x43c/0x4c0 [cfg80211]
       [<ffffffffa02f9159>] nl80211_set_reg+0x1ce/0x22a [cfg80211]
       [<ffffffff8143af6e>] genl_rcv_msg+0x1db/0x206
       [<ffffffff8143a997>] netlink_rcv_skb+0x43/0x8f
       [<ffffffff8143ad8c>] genl_rcv+0x26/0x2d
       [<ffffffff8143a493>] netlink_unicast+0xec/0x156
       [<ffffffff8143a781>] netlink_sendmsg+0x284/0x2c5
       [<ffffffff81402f0f>] sock_sendmsg+0xe6/0x109
       [<ffffffff81404cc3>] __sys_sendmsg+0x226/0x2cf
       [<ffffffff81405efb>] sys_sendmsg+0x42/0x60
       [<ffffffff8150b742>] system_call_fastpath+0x16/0x1b

-> #1 (reg_mutex){+.+.+.}:
       [<ffffffff8108f143>] lock_acquire+0xf3/0x13e
       [<ffffffff81502cb3>] __mutex_lock_common+0x5d/0x39a
       [<ffffffff815030ff>] mutex_lock_nested+0x40/0x45
       [<ffffffffa02ef10d>] reg_todo+0x32/0x4a4 [cfg80211]
       [<ffffffff81075c49>] process_one_work+0x205/0x3e7
       [<ffffffff810768f7>] worker_thread+0xda/0x15d
       [<ffffffff8107a2bd>] kthread+0xa8/0xb0
       [<ffffffff8150d944>] kernel_thread_helper+0x4/0x10

-> #0 (cfg80211_mutex){+.+.+.}:
       [<ffffffff8108e963>] __lock_acquire+0xa2f/0xd0c
       [<ffffffff8108f143>] lock_acquire+0xf3/0x13e
       [<ffffffff81502cb3>] __mutex_lock_common+0x5d/0x39a
       [<ffffffff815030ff>] mutex_lock_nested+0x40/0x45
       [<ffffffffa02ef603>] restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
       [<ffffffffa02ef8cd>] reg_timeout_work+0x13/0x15 [cfg80211]
       [<ffffffff81075c49>] process_one_work+0x205/0x3e7
       [<ffffffff810768f7>] worker_thread+0xda/0x15d
       [<ffffffff8107a2bd>] kthread+0xa8/0xb0
       [<ffffffff8150d944>] kernel_thread_helper+0x4/0x10

other info that might help us debug this:

Chain exists of:
  cfg80211_mutex --> reg_mutex --> (reg_timeout).work

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((reg_timeout).work);
                               lock(reg_mutex);
                               lock((reg_timeout).work);
  lock(cfg80211_mutex);

 *** DEADLOCK ***

2 locks held by kworker/5:2/511:
 #0:  (events){.+.+.+}, at: [<ffffffff81075b91>] process_one_work+0x14d/0x3e7
 #1:  ((reg_timeout).work){+.+...}, at: [<ffffffff81075b91>] process_one_work+0x14d/0x3e7

stack backtrace:
Pid: 511, comm: kworker/5:2 Tainted: G        W   3.1.0-0.rc4.git0.0.fc16.x86_64 #1
Call Trace:
 [<ffffffff814fa254>] print_circular_bug+0x1f8/0x209
 [<ffffffff8108e963>] __lock_acquire+0xa2f/0xd0c
 [<ffffffff8108dd41>] ? mark_lock+0x2d/0x220
 [<ffffffff8108e439>] ? __lock_acquire+0x505/0xd0c
 [<ffffffffa02ef603>] ? restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
 [<ffffffff8108f143>] lock_acquire+0xf3/0x13e
 [<ffffffffa02ef603>] ? restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
 [<ffffffffa02ef603>] ? restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
 [<ffffffffa02ef8ba>] ? restore_regulatory_settings+0x2e6/0x2e6 [cfg80211]
 [<ffffffff81502cb3>] __mutex_lock_common+0x5d/0x39a
 [<ffffffffa02ef603>] ? restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
 [<ffffffff810804e6>] ? local_clock+0x36/0x4d
 [<ffffffff81014ded>] ? paravirt_read_tsc+0x9/0xd
 [<ffffffff810152b7>] ? native_sched_clock+0x34/0x36
 [<ffffffff8108b9b5>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffffa02ef8ba>] ? restore_regulatory_settings+0x2e6/0x2e6 [cfg80211]
 [<ffffffff815030ff>] mutex_lock_nested+0x40/0x45
 [<ffffffffa02ef603>] restore_regulatory_settings+0x2f/0x2e6 [cfg80211]
 [<ffffffffa02ef8cd>] reg_timeout_work+0x13/0x15 [cfg80211]
 [<ffffffff81075c49>] process_one_work+0x205/0x3e7
 [<ffffffff81075b91>] ? process_one_work+0x14d/0x3e7
 [<ffffffff8108d01b>] ? lock_acquired+0x210/0x243
 [<ffffffff810768f7>] worker_thread+0xda/0x15d
 [<ffffffff8107681d>] ? manage_workers+0x176/0x176
 [<ffffffff8107a2bd>] kthread+0xa8/0xb0
 [<ffffffff8150d944>] kernel_thread_helper+0x4/0x10
 [<ffffffff81504db4>] ? retint_restore_args+0x13/0x13
 [<ffffffff8107a215>] ? __init_kthread_worker+0x5a/0x5a
 [<ffffffff8150d940>] ? gs_change+0x13/0x13


Version-Release number of selected component (if applicable):
kernel 3.1.0-0.rc4.git0.0.fc16.x86_64

How reproducible:
I had a flaky WLAN link that caused the system to switch back and forth between the world regulatory domain and the local regulatory domain. So far I have seen this warning only once in the system logs.
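
To make the reported cycle concrete: reg_timeout_work() calls restore_regulatory_settings(), which takes cfg80211_mutex and reg_mutex, while the set_regdom() path, already holding those mutexes, calls cancel_delayed_work_sync(&reg_timeout) via reg_set_request_processed() and so waits for that very work item. Lockdep models "waiting for a work item to finish" as acquiring (reg_timeout).work, which is what closes the cycle. Below is a minimal, self-contained module sketch of the same pattern; the names demo_mutex/demo_work and the module itself are illustrative, not from cfg80211, and the real chain has a third lock (cfg80211_mutex), but the two-party version shows the shape:

#include <linux/module.h>
#include <linux/mutex.h>
#include <linux/workqueue.h>

static DEFINE_MUTEX(demo_mutex);          /* stands in for reg_mutex */

static void demo_timeout(struct work_struct *work);
static DECLARE_DELAYED_WORK(demo_work, demo_timeout); /* stands in for reg_timeout */

/* Like reg_timeout_work(): the work handler takes the mutex. */
static void demo_timeout(struct work_struct *work)
{
	mutex_lock(&demo_mutex);
	/* ... restore settings ... */
	mutex_unlock(&demo_mutex);
}

/* Like the set_regdom() path: holds the mutex, then waits synchronously
 * for the work. If demo_timeout() is already running and blocked on
 * demo_mutex, neither side can make progress. */
static void demo_set(void)
{
	mutex_lock(&demo_mutex);
	cancel_delayed_work_sync(&demo_work); /* waits for demo_timeout() */
	mutex_unlock(&demo_mutex);
}

static int __init demo_init(void)
{
	schedule_delayed_work(&demo_work, 1);
	demo_set();
	return 0;
}
module_init(demo_init);
MODULE_LICENSE("GPL");

If demo_timeout() has already started and is blocked on demo_mutex when demo_set() runs, the module can hang exactly as in the CPU0/CPU1 scenario table above.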
Comment 1 Dave Jones 2012-03-22 13:14:33 EDT
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.
Comment 4 Ben Greear 2012-05-18 18:48:56 EDT
I hit this same lockdep report in a somewhat-hacked 3.3.6+ kernel, so it seems the bug still exists in the latest stable code.
Comment 5 Josh Boyer 2012-09-07 12:07:55 EDT
This appears to have been fixed by commit fe20b39ec32e975f1054c0b7866c873a954adf05 in 3.5. That commit was backported to 3.4.5 and should therefore be fixed in F16.
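
For reference, the standard remedy for this class of cycle, and, as far as I can tell, what the referenced commit does, is to stop waiting synchronously on the work while holding a mutex that the work handler itself takes. In reg_set_request_processed() that amounts to something like:

-	cancel_delayed_work_sync(&reg_timeout);
+	cancel_delayed_work(&reg_timeout);

cancel_delayed_work() only dequeues a pending instance and never sleeps waiting for a running handler, so the dependency from reg_mutex to (reg_timeout).work disappears and the chain above can no longer close into a cycle. (This is a sketch of the pattern, not necessarily the verbatim upstream diff; see the commit itself for the exact change.)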
