Bug 718428

Summary: shutdown: possible recursive locking detected
Product: Fedora
Component: kernel
Version: 16
Hardware: x86_64
OS: Linux
Status: CLOSED WORKSFORME
Severity: unspecified
Priority: unspecified
Reporter: John Reiser <jreiser>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: aquini, gansalmon, itamar, jonathan, jwboyer, kernel-maint, madhu.chinakonda
Doc Type: Bug Fix
Last Closed: 2011-08-15 15:41:47 UTC
Attachments: serial console output captured by minicom

Description John Reiser 2011-07-02 20:45:23 UTC
Created attachment 511012 [details]
serial console output captured by minicom

Description of problem: Shutting down from the command line triggers a kernel lockdep complaint about possible recursive locking.


Version-Release number of selected component (if applicable):
kernel-3.0-0.rc5.git0.1.fc16.x86_64

How reproducible: every time (3 times so far)


Steps to Reproduce:
1. boot kernel-3.0-0.rc5.git0.1.fc16.x86_64
2. login as root (superuser privileges)
3. shutdown -h now
  
Actual results: (from serial console)
[   71.778245] Power down.
[   71.781041]
[   71.781041] =============================================
[   71.781711] [ INFO: possible recursive locking detected ]
[   71.781711] 3.0-0.rc5.git0.1.fc16.x86_64 #1
[   71.781711] ---------------------------------------------
[   71.781711] systemd-shutdow/1 is trying to acquire lock:
[   71.781711]  (&(lock)->rlock){......}, at: [<ffffffff812a5947>] acpi_os_acquire_lock+0xe/0x10
[   71.781711]
[   71.781711] but task is already holding lock:
[   71.781711]  (&(lock)->rlock){......}, at: [<ffffffff812a5947>] acpi_os_acquire_lock+0xe/0x10
[   71.781711]
[   71.781711] other info that might help us debug this:
[   71.781711]  Possible unsafe locking scenario:
[   71.781711]
[   71.781711]        CPU0
[   71.781711]        ----
[   71.781711]   lock(&(lock)->rlock);
[   71.781711]   lock(&(lock)->rlock);
[   71.781711]
[   71.781711]  *** DEADLOCK ***
[   71.781711]
[   71.781711]  May be due to missing lock nesting notation
[   71.781711]
[   71.781711] 2 locks held by systemd-shutdow/1:
[   71.781711]  #0:  (reboot_mutex){+.+.+.}, at: [<ffffffff8106b4fb>] sys_reboot+0x9f/0x1b0
[   71.781711]  #1:  (&(lock)->rlock){......}, at: [<ffffffff812a5947>] acpi_os_acquire_lock+0xe/0x10
[   71.781711]
[   71.781711] stack backtrace:
[   71.781711] Pid: 1, comm: systemd-shutdow Not tainted 3.0-0.rc5.git0.1.fc16.x86_64 #1
[   71.781711] Call Trace:
[   71.781711]  [<ffffffff81088b11>] __lock_acquire+0x917/0xcf7
[   71.781711]  [<ffffffff810886ea>] ? __lock_acquire+0x4f0/0xcf7
[   71.781711]  [<ffffffff812a5947>] ? acpi_os_acquire_lock+0xe/0x10
[   71.781711]  [<ffffffff8108937e>] lock_acquire+0xbf/0x103
[   71.781711]  [<ffffffff812a5947>] ? acpi_os_acquire_lock+0xe/0x10
[   71.781711]  [<ffffffff812c128c>] ? acpi_hw_enable_runtime_gpe_block+0x48/0x48
[   71.781711]  [<ffffffff814f2eff>] _raw_spin_lock_irqsave+0x45/0x7f
[   71.781711]  [<ffffffff812a5947>] ? acpi_os_acquire_lock+0xe/0x10
[   71.781711]  [<ffffffff812a5947>] acpi_os_acquire_lock+0xe/0x10
[   71.781711]  [<ffffffff812b8e10>] acpi_ev_walk_gpe_list+0x28/0x92
[   71.781711]  [<ffffffff812c1a7e>] acpi_hw_clear_acpi_status+0x84/0xa2
[   71.781711]  [<ffffffff812c1e51>] acpi_enter_sleep_state+0x92/0x20e
[   71.781711]  [<ffffffff814e912e>] ? printk+0x51/0x53
[   71.781711]  [<ffffffff812a6b62>] acpi_power_off+0x34/0x36
[   71.781711]  [<ffffffff8101fa41>] native_machine_power_off+0x27/0x33
[   71.781711]  [<ffffffff8101fa07>] machine_power_off+0xf/0x11
[   71.781711]  [<ffffffff8106ac18>] kernel_power_off+0x48/0x4a
[   71.781711]  [<ffffffff8106b57e>] sys_reboot+0x122/0x1b0
[   71.781711]  [<ffffffff81085cd2>] ? trace_hardirqs_off+0xd/0xf
[   71.781711]  [<ffffffff8107a4ea>] ? local_clock+0x36/0x4d
[   71.781711]  [<ffffffff81086216>] ? lock_release_holdtime.part.8+0x6b/0x72
[   71.781711]  [<ffffffff81089296>] ? lock_release+0x173/0x19c
[   71.781711]  [<ffffffff814fa1ba>] ? sysret_check+0x2e/0x69
[   71.781711]  [<ffffffff81089781>] ? trace_hardirqs_on_caller+0x10b/0x12f
[   71.781711]  [<ffffffff8125cbae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   71.781711]  [<ffffffff814f3605>] ? _raw_spin_unlock+0x28/0x2c
[   71.781711]  [<ffffffff814fa182>] system_call_fastpath+0x16/0x1b
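
Note on the trace above: the "May be due to missing lock nesting notation" hint points at how lockdep classifies locks. Every spinlock initialised at the same spin_lock_init() call site shares one lock class, so two distinct ACPI spinlocks created through a common helper can look to lockdep like a single lock being taken twice, which matches both held locks showing up at acpi_os_acquire_lock here. The sketch below illustrates that pattern and the nesting annotation that marks such a double acquisition as intentional; the lock and function names are illustrative stand-ins, not the actual ACPI code or the eventual fix.

#include <linux/init.h>
#include <linux/lockdep.h>
#include <linux/module.h>
#include <linux/spinlock.h>

/* Illustrative stand-ins for the two ACPI spinlocks seen in the trace. */
static spinlock_t hw_lock;    /* held across the status-clearing work      */
static spinlock_t gpe_lock;   /* then taken while walking a GPE-like list  */

/* Both locks pass through one constructor, so spin_lock_init() runs at a
 * single call site and lockdep puts both locks in the same lock class. */
static void example_create_lock(spinlock_t *lock)
{
        spin_lock_init(lock);
}

static void example_clear_status(void)
{
        unsigned long hw_flags, gpe_flags;

        spin_lock_irqsave(&hw_lock, hw_flags);

        /*
         * A plain spin_lock_irqsave() here would trigger the
         * "possible recursive locking detected" report, because gpe_lock
         * shares hw_lock's lock class.  The _nested variant tells lockdep
         * this single level of nesting is deliberate.
         */
        spin_lock_irqsave_nested(&gpe_lock, gpe_flags, SINGLE_DEPTH_NESTING);

        /* ... clear hardware status, walk the GPE list ... */

        spin_unlock_irqrestore(&gpe_lock, gpe_flags);
        spin_unlock_irqrestore(&hw_lock, hw_flags);
}

static int __init example_init(void)
{
        example_create_lock(&hw_lock);
        example_create_lock(&gpe_lock);
        example_clear_status();
        return 0;
}
module_init(example_init);
MODULE_LICENSE("GPL");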



Expected results: no complaint

Additional info:

Comment 1 John Reiser 2011-07-02 20:47:53 UTC
systemd-29-1.fc16.x86_64

Comment 2 Josh Boyer 2011-08-15 15:06:06 UTC
Is this still happening on a current f16 kernel?

Comment 3 John Reiser 2011-08-15 15:32:11 UTC
The exact situation is not reproducible in kernel-3.0.0-1.fc16.x86_64, but I do see this which might be related:
[   34.027138] =============================================
[   34.027140] [ INFO: possible recursive locking detected ]
[   34.027141] 3.0.0-1.fc16.x86_64 #1
[   34.027142] ---------------------------------------------
[   34.027144] systemd-logind/945 is trying to acquire lock:
[   34.027145]  (&ep->mtx){+.+.+.}, at: [<ffffffff8116ee12>] ep_scan_ready_list+0x3a/0x19f
[   34.027151]
[   34.027152] but task is already holding lock:
[   34.027153]  (&ep->mtx){+.+.+.}, at: [<ffffffff8116f38c>] sys_epoll_ctl+0x120/0x51d
[   34.027156]
[   34.027157] other info that might help us debug this:
[   34.027158]  Possible unsafe locking scenario:
[   34.027158]
[   34.027159]        CPU0
[   34.027160]        ----
[   34.027160]   lock(&ep->mtx);
[   34.027162]   lock(&ep->mtx);
[   34.027163]
[   34.027163]  *** DEADLOCK ***
[   34.027164]
[   34.027165]  May be due to missing lock nesting notation
[   34.027165]
[   34.027166] 2 locks held by systemd-logind/945:
[   34.027168]  #0:  (epmutex){+.+.+.}, at: [<ffffffff8116f344>] sys_epoll_ctl+0xd8/0x51d
[   34.027171]  #1:  (&ep->mtx){+.+.+.}, at: [<ffffffff8116f38c>] sys_epoll_ctl+0x120/0x51d
[   34.027174]
[   34.027175] stack backtrace:
[   34.027176] Pid: 945, comm: systemd-logind Not tainted 3.0.0-1.fc16.x86_64 #1
[   34.027178] Call Trace:
[   34.027182]  [<ffffffff8108a375>] __lock_acquire+0x917/0xcf7
[   34.027185]  [<ffffffff8100af3a>] ? dump_trace+0x2fe/0x30d
[   34.027188]  [<ffffffff8100e9fd>] ? paravirt_read_tsc+0x9/0xd
[   34.027190]  [<ffffffff8116ee12>] ? ep_scan_ready_list+0x3a/0x19f
[   34.027192]  [<ffffffff8108abe2>] lock_acquire+0xbf/0x103
[   34.027194]  [<ffffffff8116ee12>] ? ep_scan_ready_list+0x3a/0x19f
[   34.027196]  [<ffffffff81087ea6>] ? save_trace+0x3d/0xa7
[   34.027198]  [<ffffffff8116ee12>] ? ep_scan_ready_list+0x3a/0x19f
[   34.027201]  [<ffffffff8116e8bf>] ? ep_remove+0xb4/0xb4
[   34.027204]  [<ffffffff814dc8f2>] __mutex_lock_common+0x4c/0x361
[   34.027206]  [<ffffffff8116ee12>] ? ep_scan_ready_list+0x3a/0x19f
[   34.027208]  [<ffffffff8100e9fd>] ? paravirt_read_tsc+0x9/0xd
[   34.027210]  [<ffffffff8100eec7>] ? native_sched_clock+0x34/0x36
[   34.027213]  [<ffffffff8100eed2>] ? sched_clock+0x9/0xd
[   34.027215]  [<ffffffff8108986b>] ? mark_lock+0x2d/0x220
[   34.027217]  [<ffffffff8116e8bf>] ? ep_remove+0xb4/0xb4
[   34.027219]  [<ffffffff814dcd16>] mutex_lock_nested+0x40/0x45
[   34.027221]  [<ffffffff8116ee12>] ep_scan_ready_list+0x3a/0x19f
[   34.027223]  [<ffffffff8116ef77>] ? ep_scan_ready_list+0x19f/0x19f
[   34.027225]  [<ffffffff8116ef8e>] ep_poll_readyevents_proc+0x17/0x19
[   34.027228]  [<ffffffff8116eb9d>] ep_call_nested.constprop.3+0x90/0xcc
[   34.027230]  [<ffffffff8116ecc2>] ep_eventpoll_poll+0x4e/0x5d
[   34.027232]  [<ffffffff8116f49e>] sys_epoll_ctl+0x232/0x51d
[   34.027234]  [<ffffffff8116ea58>] ? ep_set_mstimeout+0x49/0x49
[   34.027237]  [<ffffffff814e4fc2>] system_call_fastpath+0x16/0x1b
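
Note on the trace above: this is the nested-epoll pattern, where one epoll file descriptor is registered inside another, so the EPOLL_CTL_ADD path has to poll the inner epoll instance (ep_eventpoll_poll -> ep_scan_ready_list) while it already holds the outer instance's ep->mtx, and lockdep sees the same mutex class acquired twice. Below is a minimal userspace sketch of that file-descriptor arrangement, purely to show where such a trace comes from; it is not claimed to be exactly what systemd-logind does.

#include <stdio.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <unistd.h>

int main(void)
{
        /* Outer and inner epoll instances plus a pipe to watch. */
        int outer = epoll_create1(0);
        int inner = epoll_create1(0);
        int pipefd[2];
        struct epoll_event ev = { .events = EPOLLIN };

        if (outer < 0 || inner < 0 || pipe(pipefd) < 0) {
                perror("setup");
                return EXIT_FAILURE;
        }

        /* Watch the pipe's read end through the inner instance. */
        ev.data.fd = pipefd[0];
        if (epoll_ctl(inner, EPOLL_CTL_ADD, pipefd[0], &ev) < 0) {
                perror("epoll_ctl inner");
                return EXIT_FAILURE;
        }

        /*
         * Register the inner epoll fd inside the outer one.  To decide
         * whether the inner fd is already readable, the kernel polls it
         * (ep_eventpoll_poll) while still holding the outer instance's
         * ep->mtx, which is the nesting lockdep flags in the trace above.
         */
        ev.data.fd = inner;
        if (epoll_ctl(outer, EPOLL_CTL_ADD, inner, &ev) < 0) {
                perror("epoll_ctl outer");
                return EXIT_FAILURE;
        }

        close(outer);
        close(inner);
        close(pipefd[0]);
        close(pipefd[1]);
        return EXIT_SUCCESS;
}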

Comment 4 Josh Boyer 2011-08-15 15:41:47 UTC
(In reply to comment #3)
> The exact situation is not reproducible in kernel-3.0.0-1.fc16.x86_64, but I do
> see this which might be related:

No, that's bug 722472 which is already fixed.

I'm going to close this bug out as WORKSFORME.  If you see the ACPI error again, please re-open.