Bug 1056170

Summary: INFO: inconsistent lock state
Product: [Fedora] Fedora Reporter: Peter H. Jones <jones.peter.busi>
Component: kernelAssignee: Stanislaw Gruszka <sgruszka>
Status: CLOSED UPSTREAM QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 22CC: collura, gansalmon, itamar, jonathan, jwboyer, kernel-maint, madhu.chinakonda
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kernel-3.13.5-101.fc19 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-03-25 12:41:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
dmesg output
none
dmesg output containing backtrace
none
dmesg output none

Description Peter H. Jones 2014-01-21 15:59:15 UTC
Created attachment 853351 [details]
dmesg output

Description of problem:
Inconsistent lock state messages in dmesg

Version-Release number of selected component (if applicable):
kernel-3.13.0-0.rc8.git4.1.fc21.x86_64

How reproducible:
Tried only once

Steps to Reproduce:
1. Booted Fedora-Live-Jam-KDE-x86_64-rawhide-20140120.iso
2. Selected Troubleshooting, Start with basic graphics
3.

Actual results:
Color patches, then dark screen

Expected results:
Live-Jam startup

Additional info:
Was able to get startup with CTRL-ALT-F2, login as liveuser, and startx

Comment 1 Stanislaw Gruszka 2014-01-23 20:44:48 UTC
This is because of threadirqs boot option, which make irq handler run from thread context. We could mark usb_hcd_irq with IRQF_NO_THREAD, but most likely various other interrupt handlers need that flag as well to avoid similar LOCKDEP warnings. As threadirq is debug only option to prevent kernel crashes from hard irq context, I think I prefer to just ignore this warning. Eventually make LOCKDEP smarter and do not print messages (or print different messages) if threadirqs is in use.

Comment 2 Peter H. Jones 2014-02-22 20:58:41 UTC
Still seeing this in kernel-3.14.0-0.rc3.git2.1.fc21.x86_64, from booting a copy of Fedora-Live-Jam-KDE-x86_64-rawhide-20140220.iso copied to a USB drive with dd.

Comment 3 Peter H. Jones 2014-02-23 13:13:08 UTC
Created attachment 866661 [details]
dmesg output containing backtrace

Adding call trace from dmesg output for comment 2:

"[    5.613492] usb 1-1: new high-speed USB device number 2 using ehci-pci

[    5.621177] =================================
[    5.621289] [ INFO: inconsistent lock state ]
[    5.621402] 3.14.0-0.rc3.git2.1.fc21.x86_64 #1 Not tainted
[    5.621519] ---------------------------------
[    5.621631] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[    5.621756] swapper/4/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[    5.621872]  (&(&ehci->lock)->rlock){?.-...}, at: [<ffffffff81592f4b>] ehci_hrtimer_func+0x3b/0xf0
[    5.622030] {HARDIRQ-ON-W} state was registered at:
[    5.622030]   [<ffffffff810f93d9>] __lock_acquire+0x659/0x1c40
[    5.622030]   [<ffffffff810fb1b2>] lock_acquire+0xa2/0x1d0
[    5.622030]   [<ffffffff817d888e>] _raw_spin_lock+0x3e/0x80
[    5.622030]   [<ffffffff8159ac41>] ehci_irq+0x41/0x570
[    5.622030]   [<ffffffff815794f6>] usb_hcd_irq+0x26/0x40
[    5.622030]   [<ffffffff8110ed5f>] irq_forced_thread_fn+0x2f/0x60
[    5.622030]   [<ffffffff8110f1cf>] irq_thread+0x13f/0x170
[    5.622030]   [<ffffffff810c432f>] kthread+0xff/0x120
[    5.622030]   [<ffffffff817e373c>] ret_from_fork+0x7c/0xb0
[    5.622030] irq event stamp: 61096
[    5.622030] hardirqs last  enabled at (61093): [<ffffffff8162b890>] cpuidle_enter_state+0x50/0xd0
[    5.622030] hardirqs last disabled at (61094): [<ffffffff817e44ed>] apic_timer_interrupt+0x6d/0x80
[    5.622030] softirqs last  enabled at (61096): [<ffffffff8109be22>] _local_bh_enable+0x22/0x50
[    5.622030] softirqs last disabled at (61095): [<ffffffff8109da20>] irq_enter+0x60/0x90
[    5.622030]
other info that might help us debug this:
[    5.622030]  Possible unsafe locking scenario:

[    5.622030]        CPU0
[    5.622030]        ----
[    5.622030]   lock(&(&ehci->lock)->rlock);
[    5.622030]   <Interrupt>
[    5.622030]     lock(&(&ehci->lock)->rlock);
[    5.622030]
 *** DEADLOCK ***

[    5.622030] no locks held by swapper/4/0.
[    5.622030]
stack backtrace:
[    5.622030] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.14.0-0.rc3.git2.1.fc21.x86_64 #1
[    5.622030] Hardware name: TOSHIBA Qosmio X770/PGRAA, BIOS 1.60 12/22/2011
[    5.622030]  0000000000000000 de3d1b47a65db5ab ffff880236c03c68 ffffffff817cf992
[    5.622030]  ffff8802331bcc50 ffff880236c03cb8 ffffffff817cb22a 0000000000000000
[    5.622030]  ffff880200000000 ffff880200000001 0000000000000002 ffff8802331bcc50
[    5.622030] Call Trace:
[    5.622030]  <IRQ>  [<ffffffff817cf992>] dump_stack+0x4d/0x66
[    5.622030]  [<ffffffff817cb22a>] print_usage_bug+0x1f4/0x205
[    5.622030]  [<ffffffff810f7be0>] ? check_usage_backwards+0x180/0x180
[    5.622030]  [<ffffffff810f8633>] mark_lock+0x223/0x2b0
[    5.622030]  [<ffffffff810f96c2>] __lock_acquire+0x942/0x1c40
[    5.622030]  [<ffffffff81024729>] ? sched_clock+0x9/0x10
[    5.622030]  [<ffffffff810dd1bd>] ? sched_clock_local+0x1d/0x80
[    5.622030]  [<ffffffff810246b5>] ? native_sched_clock+0x35/0xa0
[    5.622030]  [<ffffffff81024729>] ? sched_clock+0x9/0x10
[    5.622030]  [<ffffffff810dd1bd>] ? sched_clock_local+0x1d/0x80
[    5.622030]  [<ffffffff813e1f9d>] ? debug_object_deactivate+0x14d/0x1a0
[    5.622030]  [<ffffffff810fb1b2>] lock_acquire+0xa2/0x1d0
[    5.622030]  [<ffffffff81592f4b>] ? ehci_hrtimer_func+0x3b/0xf0
[    5.622030]  [<ffffffff817d93d6>] _raw_spin_lock_irqsave+0x56/0x90
[    5.622030]  [<ffffffff81592f4b>] ? ehci_hrtimer_func+0x3b/0xf0
[    5.622030]  [<ffffffff81592f4b>] ehci_hrtimer_func+0x3b/0xf0
[    5.622030]  [<ffffffff810c7f26>] __run_hrtimer+0x86/0x440
[    5.622030]  [<ffffffff81592f10>] ? ehci_enable_event+0x80/0x80
[    5.622030]  [<ffffffff810c8d87>] hrtimer_interrupt+0x107/0x260
[    5.622030]  [<ffffffff8104fcb7>] local_apic_timer_interrupt+0x37/0x60
[    5.622030]  [<ffffffff817e5baf>] smp_apic_timer_interrupt+0x3f/0x60
[    5.622030]  [<ffffffff817e44f2>] apic_timer_interrupt+0x72/0x80
[    5.622030]  <EOI>  [<ffffffff8162b894>] ? cpuidle_enter_state+0x54/0xd0
[    5.622030]  [<ffffffff8162b890>] ? cpuidle_enter_state+0x50/0xd0
[    5.622030]  [<ffffffff8162b9c9>] cpuidle_idle_call+0xb9/0x380
[    5.622030]  [<ffffffff8102636e>] arch_cpu_idle+0xe/0x40
[    5.622030]  [<ffffffff8110d385>] cpu_startup_entry+0xf5/0x420
[    5.622030]  [<ffffffff8104dc80>] start_secondary+0x240/0x300
"

Hope it helps.

Comment 4 Stanislaw Gruszka 2014-02-24 07:34:28 UTC
Problem happens because with threadirqs option hrtimers still use softirq context for they callbacks. This should be fixed in hrtimer code, which is not easy. I posted temporal usb code fix for this bug: 

http://marc.info/?l=linux-usb&m=139280201613876&w=2

Josh, please apply this patch as fix for this bug. It is temporal fix, until proper changes in hrtimer code will show up (use thread for hrtimer callbacks if thredirqs option is used).

Comment 5 Josh Boyer 2014-02-24 16:15:17 UTC
Applied in git.  Thanks!

Comment 6 Fedora Update System 2014-02-24 21:11:01 UTC
kernel-3.13.5-200.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/kernel-3.13.5-200.fc20

Comment 7 Fedora Update System 2014-02-26 02:15:04 UTC
kernel-3.13.5-101.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/FEDORA-2014-2887/kernel-3.13.5-101.fc19

Comment 8 Fedora Update System 2014-02-26 13:54:36 UTC
Package kernel-3.13.5-200.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.13.5-200.fc20'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-3094/kernel-3.13.5-200.fc20
then log in and leave karma (feedback).

Comment 9 Fedora Update System 2014-02-28 18:35:38 UTC
kernel-3.13.5-200.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 10 Fedora Update System 2014-03-01 14:04:41 UTC
kernel-3.13.5-101.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 11 Peter H. Jones 2014-03-14 05:18:02 UTC
Created attachment 874247 [details]
dmesg output

Got this bug (re)booting http://kojipkgs.fedoraproject.org//work/tasks/7637/6627637/Fedora-Live-Jam-KDE-x86_64-rawhide-20140312.iso .
I used the TAB key to add a 3 to the end of the boot command line.

The trace information starts around line 945, for about 80 lines.

Kernel version is 3.14.0-0.rc6.git1.1.fc21.x86_64 .

Comment 12 Stanislaw Gruszka 2014-03-14 06:01:30 UTC
This is the same situation, but with different lock:

[   89.664545]   lock(hcd_urb_list_lock);
[   89.667579]   <Interrupt>
[   89.670583]     lock(hcd_urb_list_lock);

I think we need decent fix for that i.e. make hrtimer use threads instead of softirq when threadirqs option is used, otherwise we will get more report of those kind of problems.

Comment 13 Jaroslav Reznik 2015-03-03 16:58:32 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle.
Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22