Bug 1415516 - Possible Nouveau or PM deadlock kworker Xorg [NEEDINFO]
Summary: Possible Nouveau or PM deadlock kworker Xorg
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 25
Hardware: x86_64
OS: Linux
high
high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-01-23 00:03 UTC by Brad Hubbard
Modified: 2019-01-09 12:54 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-04-28 17:08:10 UTC
Type: Bug
jforbes: needinfo?


Attachments (Terms of Use)

Description Brad Hubbard 2017-01-23 00:03:24 UTC
Description of problem:

On boot X does ot start and a kworker thread and Xorg* show d-state in ps output. This could be PM related maybe?

Jan 23 09:28:05 rskikr kernel: sysrq: SysRq : Show Blocked State
Jan 23 09:28:05 rskikr kernel:  task                        PC stack   pid father
Jan 23 09:28:05 rskikr kernel: kworker/3:1     D    0    90      2 0x00000000
Jan 23 09:28:05 rskikr kernel: Workqueue: events output_poll_execute [drm_kms_helper]
Jan 23 09:28:05 rskikr kernel: ffff92575a0a0000 ffff92575a0a0000 ffff92576b411e80 ffff92577e2d9600
Jan 23 09:28:05 rskikr kernel: ffff9257536e0000 ffffae9bc1bc3c58 ffffffff9e817a45 ffffffff9e484805
Jan 23 09:28:05 rskikr kernel: 00ffae9bc1bc3c28 ffff92577e2d9600 0000000000000002 ffff92576b411e80
Jan 23 09:28:05 rskikr kernel: Call Trace:
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e817a45>] ? __schedule+0x225/0x6d0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e484805>] ? acpi_os_signal_semaphore+0x2c/0x38
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e817f26>] schedule+0x36/0x80
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e556183>] rpm_resume+0x123/0x6b0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e0e7270>] ? prepare_to_wait_event+0x100/0x100
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e5575be>] __pm_runtime_resume+0x4e/0x70
Jan 23 09:28:05 rskikr kernel: [<ffffffffc0792060>] nouveau_connector_detect+0x70/0x520 [nouveau]
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e0daad3>] ? set_next_entity+0xc3/0x1a0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e025728>] ? __switch_to+0x2a8/0x5c0
Jan 23 09:28:05 rskikr kernel: [<ffffffffc04bab71>] output_poll_execute+0xe1/0x1e0 [drm_kms_helper]
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e0bc4e4>] process_one_work+0x184/0x430
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e0bc7de>] worker_thread+0x4e/0x490
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e0bc790>] ? process_one_work+0x430/0x430
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e0c2569>] kthread+0xd9/0xf0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e0c2490>] ? kthread_park+0x60/0x60
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e81ced5>] ret_from_fork+0x25/0x30
Jan 23 09:28:05 rskikr kernel: Xorg            D    0  1342      1 0x00000004
Jan 23 09:28:05 rskikr kernel: ffff92575a0a0000 0000000000000000 ffff9257536e0000 ffff92577e2d9600
Jan 23 09:28:05 rskikr kernel: ffff92576c2cbd00 ffffae9bc8c0f938 ffffffff9e817a45 0000000000000000
Jan 23 09:28:05 rskikr kernel: 00ffae9bc8c0f930 ffff92577e2d9600 ffff925765c9b800 ffff9257536e0000
Jan 23 09:28:05 rskikr kernel: Call Trace:
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e817a45>] ? __schedule+0x225/0x6d0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e817f26>] schedule+0x36/0x80
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e8181ee>] schedule_preempt_disabled+0xe/0x10
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e819c5f>] __mutex_lock_slowpath+0xaf/0x120
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e819cef>] mutex_lock+0x1f/0x30
Jan 23 09:28:05 rskikr kernel: [<ffffffffc04bb26e>] drm_kms_helper_poll_enable+0x1e/0x40 [drm_kms_helper]
Jan 23 09:28:05 rskikr kernel: [<ffffffffc078622a>] nouveau_pmops_runtime_resume+0x8a/0x120 [nouveau]
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e44960b>] pci_pm_runtime_resume+0x7b/0xa0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e555af7>] __rpm_callback+0x27/0x70
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e555b64>] rpm_callback+0x24/0x80
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e449590>] ? pci_restore_standard_config+0x40/0x40
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e556504>] rpm_resume+0x4a4/0x6b0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e37ae92>] ? cred_has_capability+0x72/0x120
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e5575be>] __pm_runtime_resume+0x4e/0x70
Jan 23 09:28:05 rskikr kernel: [<ffffffffc078674b>] nouveau_drm_open+0x3b/0x1a0 [nouveau]
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e37afa0>] ? selinux_capable+0x20/0x30
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e3715b8>] ? security_capable+0x48/0x60
Jan 23 09:28:05 rskikr kernel: [<ffffffffc0445b31>] drm_open+0x201/0x3f0 [drm]
Jan 23 09:28:05 rskikr kernel: [<ffffffffc044bd8b>] drm_stub_open+0xbb/0x100 [drm]
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e25e2d0>] chrdev_open+0xb0/0x180
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e25674a>] do_dentry_open+0x20a/0x2f0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e25e220>] ? cdev_put+0x30/0x30
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e257b8c>] vfs_open+0x4c/0x70
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e2654eb>] ? may_open+0x9b/0x100
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e268994>] path_openat+0x664/0x1380
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e26a9b1>] do_filp_open+0x91/0x100
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e278dcf>] ? __alloc_fd+0x3f/0x170
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e257fa0>] do_sys_open+0x130/0x220
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e2580ae>] SyS_open+0x1e/0x20
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e81cc77>] entry_SYSCALL_64_fastpath+0x1a/0xa9
Jan 23 09:28:05 rskikr kernel: Xorg.wrap       D    0  6635      1 0x00000000
Jan 23 09:28:05 rskikr kernel: ffff9257685daec0 0000000000000000 ffff9257649b1e80 ffff92577e399600
Jan 23 09:28:05 rskikr kernel: ffff92576c2f9e80 ffffae9bca6bfb78 ffffffff9e817a45 0000000000000246
Jan 23 09:28:05 rskikr kernel: 0000000000000246 ffff92577e399600 ffffffff9e375236 ffff9257649b1e80
Jan 23 09:28:05 rskikr kernel: Call Trace:
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e817a45>] ? __schedule+0x225/0x6d0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e375236>] ? avc_compute_av+0x146/0x1e0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e817f26>] schedule+0x36/0x80
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e8181ee>] schedule_preempt_disabled+0xe/0x10
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e819c5f>] __mutex_lock_slowpath+0xaf/0x120
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e819cef>] mutex_lock+0x1f/0x30
Jan 23 09:28:05 rskikr kernel: [<ffffffffc044bd0c>] drm_stub_open+0x3c/0x100 [drm]
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e25e2d0>] chrdev_open+0xb0/0x180
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e25674a>] do_dentry_open+0x20a/0x2f0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e25e220>] ? cdev_put+0x30/0x30
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e257b8c>] vfs_open+0x4c/0x70
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e2654eb>] ? may_open+0x9b/0x100
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e268994>] path_openat+0x664/0x1380
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e1fc37d>] ? alloc_set_pte+0x26d/0x630
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e26a9b1>] do_filp_open+0x91/0x100
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e278dcf>] ? __alloc_fd+0x3f/0x170
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e257fa0>] do_sys_open+0x130/0x220
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e2580ae>] SyS_open+0x1e/0x20
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e81cc77>] entry_SYSCALL_64_fastpath+0x1a/0xa9
Jan 23 09:28:05 rskikr kernel: Xorg            D    0  7210   7208 0x00000000
Jan 23 09:28:05 rskikr kernel: ffff92574d2eb740 0000000000000000 ffff925767be9e80 ffff92577e219600
Jan 23 09:28:05 rskikr kernel: ffffffff9ee0e500 ffffae9bc8d6fb78 ffffffff9e817a45 0000000000000246
Jan 23 09:28:05 rskikr kernel: 0000000000000246 ffff92577e219600 ffffffff9e375236 ffff925767be9e80
Jan 23 09:28:05 rskikr kernel: Call Trace:
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e817a45>] ? __schedule+0x225/0x6d0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e375236>] ? avc_compute_av+0x146/0x1e0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e817f26>] schedule+0x36/0x80
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e8181ee>] schedule_preempt_disabled+0xe/0x10
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e819c5f>] __mutex_lock_slowpath+0xaf/0x120
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e819cef>] mutex_lock+0x1f/0x30
Jan 23 09:28:05 rskikr kernel: [<ffffffffc044bd0c>] drm_stub_open+0x3c/0x100 [drm]
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e25e2d0>] chrdev_open+0xb0/0x180
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e25674a>] do_dentry_open+0x20a/0x2f0
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e25e220>] ? cdev_put+0x30/0x30
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e257b8c>] vfs_open+0x4c/0x70
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e2654eb>] ? may_open+0x9b/0x100
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e268994>] path_openat+0x664/0x1380
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e26a9b1>] do_filp_open+0x91/0x100
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e278dcf>] ? __alloc_fd+0x3f/0x170
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e257fa0>] do_sys_open+0x130/0x220
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e2580ae>] SyS_open+0x1e/0x20
Jan 23 09:28:05 rskikr kernel: [<ffffffff9e81cc77>] entry_SYSCALL_64_fastpath+0x1a/0xa9

Version-Release number of selected component (if applicable):
4.9.4-201.fc25.x86_64

How reproducible:
100% so far

Additional info:

lspci hangs like so.

Jan 23 09:45:07 rskikr kernel: lspci           D    0 18220   9526 0x00000004
Jan 23 09:45:07 rskikr kernel: ffff92575e4ee1c0 0000000000000000 ffff92575ce61e80 ffff92577e259600
Jan 23 09:45:07 rskikr kernel: ffff92576c2c8000 ffffae9bcf523cb8 ffffffff9e817a45 000000000000090f
Jan 23 09:45:07 rskikr kernel: 0000000000000001 ffff92577e259600 0000000000000002 ffff92575ce61e80
Jan 23 09:45:07 rskikr kernel: Call Trace:
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e817a45>] ? __schedule+0x225/0x6d0
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e817f26>] schedule+0x36/0x80
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e555daa>] __pm_runtime_barrier+0xaa/0x170
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e0e7270>] ? prepare_to_wait_event+0x100/0x100
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e557686>] pm_runtime_barrier+0x56/0xc0
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e44767b>] pci_config_pm_runtime_get+0x3b/0x60
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e44c5ff>] pci_read_config+0x8f/0x280
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e2e18aa>] sysfs_kf_bin_read+0x4a/0x70
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e2e13e4>] kernfs_fop_read+0xb4/0x190
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e258317>] __vfs_read+0x37/0x150
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e37263b>] ? security_file_permission+0x9b/0xc0
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e259496>] vfs_read+0x96/0x130
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e25ab45>] SyS_pread64+0x95/0xb0
Jan 23 09:45:07 rskikr kernel: [<ffffffff9e81cc77>] entry_SYSCALL_64_fastpath+0x1a/0xa9

W541 

$ lspci -vv -s 00:02.0
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 21cf
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 28
        Region 0: Memory at f0000000 (64-bit, non-prefetchable) [size=4M]
        Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Region 4: I/O ports at 5000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: i915
        Kernel modules: i915

Rebooting into 4.8.15-300 is a workaround.

Comment 1 Brad Hubbard 2017-01-23 00:08:23 UTC
Ignore lspci output above (wrong machine).

$ lspci -vv -s 00:02.0
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06) (prog-if 00 [VGA controller])
        Subsystem: Lenovo Device 221e
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 31
        Region 0: Memory at b3400000 (64-bit, non-prefetchable) [size=4M]
        Region 2: Memory at 80000000 (64-bit, prefetchable) [size=512M]
        Region 4: I/O ports at 5000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: i915
        Kernel modules: i915

Comment 2 Justin M. Forbes 2017-04-11 14:36:50 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 25 kernel bugs.

Fedora 25 has now been rebased to 4.10.9-200.fc25.  Please test this kernel update (or newer) and let us know if you issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 26, and are still experiencing this issue, please change the version to Fedora 26.

If you experience different issues, please open a new bug report for those.

Comment 3 Justin M. Forbes 2017-04-28 17:08:10 UTC
*********** MASS BUG UPDATE **************
This bug is being closed with INSUFFICIENT_DATA as there has not been a response in 2 weeks. If you are still experiencing this issue, please reopen and attach the 
relevant data from the latest kernel you are running and any data that might have been requested previously.


Note You need to log in before you can comment on or make changes to this bug.