Description of problem: possible circular locking dependency detected Version-Release number of selected component (if applicable): 3.18.0-0.rc0.git9.4.fc22.x86_64 How reproducible: 100% Steps to Reproduce: 1. Boot. 2. 3. Actual results: [ 43.050637] ====================================================== [ 43.050639] [ INFO: possible circular locking dependency detected ] [ 43.050642] 3.18.0-0.rc0.git9.4.fc22.x86_64 #1 Not tainted [ 43.050643] ------------------------------------------------------- [ 43.050645] Xorg.bin/1090 is trying to acquire lock: [ 43.050647] (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa00767f0>] __drm_modeset_lock_all+0x90/0x120 [drm] [ 43.050666] but task is already holding lock: [ 43.050668] (crtc_ww_class_acquire){+.+.+.}, at: [<ffffffffa0076921>] drm_modeset_lock_crtc+0x41/0xc0 [drm] [ 43.050681] which lock already depends on the new lock. [ 43.050684] the existing dependency chain (in reverse order) is: [ 43.050685] -> #1 (crtc_ww_class_acquire){+.+.+.}: [ 43.050690] [<ffffffff8110a491>] lock_acquire+0xd1/0x2b0 [ 43.050695] [<ffffffffa007610c>] drm_modeset_acquire_init+0xbc/0xe0 [drm] [ 43.050706] [<ffffffffa00767fa>] __drm_modeset_lock_all+0x9a/0x120 [drm] [ 43.050715] [<ffffffffa0076890>] drm_modeset_lock_all+0x10/0x40 [drm] [ 43.050724] [<ffffffffa0069770>] drm_mode_config_init+0x160/0x240 [drm] [ 43.050734] [<ffffffffa010ed5e>] vmw_kms_init+0x1e/0x70 [vmwgfx] [ 43.050741] [<ffffffffa0111038>] vmw_driver_load+0x8c8/0xd90 [vmwgfx] [ 43.050747] [<ffffffffa0060025>] drm_dev_register+0xb5/0x110 [drm] [ 43.050755] [<ffffffffa00632cd>] drm_get_pci_dev+0x8d/0x200 [drm] [ 43.050763] [<ffffffffa010fda5>] vmw_probe+0x15/0x20 [vmwgfx] [ 43.050769] [<ffffffff81468625>] local_pci_probe+0x45/0xa0 [ 43.050773] [<ffffffff81469b29>] pci_device_probe+0xf9/0x150 [ 43.050775] [<ffffffff8155bb13>] driver_probe_device+0xa3/0x400 [ 43.050780] [<ffffffff8155bf53>] __driver_attach+0xa3/0xb0 [ 43.050782] [<ffffffff815596e3>] bus_for_each_dev+0x73/0xc0 [ 43.050785] [<ffffffff8155b55e>] driver_attach+0x1e/0x20 [ 43.050787] [<ffffffff8155b118>] bus_add_driver+0x188/0x260 [ 43.050790] [<ffffffff8155cdb4>] driver_register+0x64/0xf0 [ 43.050793] [<ffffffff81467eb4>] __pci_register_driver+0x64/0x70 [ 43.050795] [<ffffffffa006353a>] drm_pci_init+0xfa/0x130 [drm] [ 43.050803] [<ffffffffa0139019>] rpc_call_start+0x9/0x20 [sunrpc] [ 43.050818] [<ffffffff81002148>] do_one_initcall+0xd8/0x210 [ 43.050822] [<ffffffff811539f2>] load_module+0x20c2/0x2870 [ 43.050827] [<ffffffff81154287>] SyS_init_module+0xe7/0x140 [ 43.050829] [<ffffffff818643a9>] system_call_fastpath+0x12/0x17 [ 43.050835] -> #0 (&dev->mode_config.mutex){+.+.+.}: [ 43.050838] [<ffffffff81109ac9>] __lock_acquire+0x1c39/0x1d50 [ 43.050841] [<ffffffff8110a491>] lock_acquire+0xd1/0x2b0 [ 43.050844] [<ffffffff8185f02e>] mutex_lock_nested+0x7e/0x440 [ 43.050847] [<ffffffffa00767f0>] __drm_modeset_lock_all+0x90/0x120 [drm] [ 43.050857] [<ffffffffa0076890>] drm_modeset_lock_all+0x10/0x40 [drm] [ 43.050865] [<ffffffffa010e14b>] vmw_du_crtc_cursor_move+0x5b/0xc0 [vmwgfx] [ 43.050870] [<ffffffffa0067cd7>] drm_mode_cursor_common+0x2c7/0x390 [drm] [ 43.050878] [<ffffffffa006b4e0>] drm_mode_cursor_ioctl+0x50/0x70 [drm] [ 43.050888] [<ffffffffa005be0f>] drm_ioctl+0x1df/0x6a0 [drm] [ 43.050895] [<ffffffffa010ff3d>] vmw_generic_ioctl+0x18d/0x2a0 [vmwgfx] [ 43.050901] [<ffffffffa0110085>] vmw_unlocked_ioctl+0x15/0x20 [vmwgfx] [ 43.050905] [<ffffffff8128aa50>] do_vfs_ioctl+0x2f0/0x520 [ 43.050909] [<ffffffff8128ad01>] SyS_ioctl+0x81/0xa0 [ 43.050912] [<ffffffff818643a9>] system_call_fastpath+0x12/0x17 [ 43.050915] other info that might help us debug this: [ 43.050917] Possible unsafe locking scenario: [ 43.050919] CPU0 CPU1 [ 43.050920] ---- ---- [ 43.050921] lock(crtc_ww_class_acquire); [ 43.050923] lock(&dev->mode_config.mutex); [ 43.050925] lock(crtc_ww_class_acquire); [ 43.050927] lock(&dev->mode_config.mutex); [ 43.050930] *** DEADLOCK *** Expected results: Additional info:
[ 43.050932] 1 lock held by Xorg.bin/1090: [ 43.050934] #0: (crtc_ww_class_acquire){+.+.+.}, at: [<ffffffffa0076921>] drm_modeset_lock_crtc+0x41/0xc0 [drm] [ 43.050946] stack backtrace: [ 43.050950] CPU: 0 PID: 1090 Comm: Xorg.bin Not tainted 3.18.0-0.rc0.git9.4.fc22.x86_64 #1 [ 43.050952] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013 [ 43.050954] 0000000000000000 00000000b4807c9b ffff88001146b938 ffffffff8185a15d [ 43.050957] 0000000000000000 ffffffff82c02d30 ffff88001146b988 ffffffff818576dc [ 43.050961] ffff88001145dae8 ffff88001146b9e8 ffff88001145cec0 ffff88001145dab8 [ 43.050964] Call Trace: [ 43.050970] [<ffffffff8185a15d>] dump_stack+0x4e/0x68 [ 43.050973] [<ffffffff818576dc>] print_circular_bug+0x203/0x214 [ 43.050977] [<ffffffff81109ac9>] __lock_acquire+0x1c39/0x1d50 [ 43.050980] [<ffffffff8110a491>] lock_acquire+0xd1/0x2b0 [ 43.050991] [<ffffffffa00767f0>] ? __drm_modeset_lock_all+0x90/0x120 [drm] [ 43.050994] [<ffffffff8185f02e>] mutex_lock_nested+0x7e/0x440 [ 43.051004] [<ffffffffa00767f0>] ? __drm_modeset_lock_all+0x90/0x120 [drm] [ 43.051013] [<ffffffffa00767f0>] ? __drm_modeset_lock_all+0x90/0x120 [drm] [ 43.051016] [<ffffffff8124b9bc>] ? kmem_cache_alloc_trace+0x3cc/0x410 [ 43.051025] [<ffffffffa00767f0>] __drm_modeset_lock_all+0x90/0x120 [drm] [ 43.051034] [<ffffffffa0076890>] drm_modeset_lock_all+0x10/0x40 [drm] [ 43.051048] [<ffffffffa010e14b>] vmw_du_crtc_cursor_move+0x5b/0xc0 [vmwgfx] [ 43.051058] [<ffffffffa0067cd7>] drm_mode_cursor_common+0x2c7/0x390 [drm] [ 43.051062] [<ffffffff8110aa36>] ? lock_release_non_nested+0x3c6/0x3d0 [ 43.051067] [<ffffffff81028c4a>] ? native_sched_clock+0x2a/0xa0 [ 43.051071] [<ffffffff81216cde>] ? might_fault+0x5e/0xc0 [ 43.051082] [<ffffffffa006b4e0>] drm_mode_cursor_ioctl+0x50/0x70 [drm] [ 43.051091] [<ffffffffa005be0f>] drm_ioctl+0x1df/0x6a0 [drm] [ 43.051095] [<ffffffff81028c4a>] ? native_sched_clock+0x2a/0xa0 [ 43.051101] [<ffffffff8139de3a>] ? avc_has_perm+0x15a/0x2f0 [ 43.051105] [<ffffffff8139dd14>] ? avc_has_perm+0x34/0x2f0 [ 43.051120] [<ffffffffa005bc30>] ? drm_getmap+0xf0/0xf0 [drm] [ 43.051127] [<ffffffffa010ff3d>] vmw_generic_ioctl+0x18d/0x2a0 [vmwgfx] [ 43.051133] [<ffffffffa0110085>] vmw_unlocked_ioctl+0x15/0x20 [vmwgfx] [ 43.051136] [<ffffffff8128aa50>] do_vfs_ioctl+0x2f0/0x520 [ 43.051139] [<ffffffff8128ad01>] SyS_ioctl+0x81/0xa0 [ 43.051142] [<ffffffff818643a9>] system_call_fastpath+0x12/0x17
3.18.0-0.rc0.git9.4.fc22.x86_64 - bad 3.18.0-0.rc0.git4.1.fc22.x86_64 - good
Rob, is this due to one of your changes? It appears that Xorg deadlocks on the Vmware platform due to this. /Thomas
(In reply to Thomas Hellström from comment #3) > Rob, is this due to one of your changes? > > It appears that Xorg deadlocks on the Vmware platform due to this. > > /Thomas I'm guessing due to Daniel's trylock changes (cb597bb3).. I wonder how Xorg can deadlock with this, since it is a locking path on driver load vs a runtime path. And I'm not quite seeing yet how mode_config.mutex is acquired *after* the drm_modeset_acquire_init(), since all appears to go via the same __drm_modeset_lock_all() path. Do you have hung-task backtraces from the Xorg deadlock?
Created attachment 951411 [details] log from deadlock Attaching a log file from an Xorg deadlock.
(In reply to Rob Clark from comment #4) > (In reply to Thomas Hellström from comment #3) > > Rob, is this due to one of your changes? > > > > It appears that Xorg deadlocks on the Vmware platform due to this. > > > > /Thomas > > I'm guessing due to Daniel's trylock changes (cb597bb3).. I wonder how Xorg > can deadlock with this, since it is a locking path on driver load vs a > runtime path. And I'm not quite seeing yet how mode_config.mutex is > acquired *after* the drm_modeset_acquire_init(), since all appears to go via > the same __drm_modeset_lock_all() path. > > Do you have hung-task backtraces from the Xorg deadlock? Attached. From a vanilla 3.18.0-rc2 kernel. (Although not Fedora.) Thanks, Thomas
(In reply to Thomas Hellström from comment #6) > (In reply to Rob Clark from comment #4) > > (In reply to Thomas Hellström from comment #3) > > > Rob, is this due to one of your changes? > > > > > > It appears that Xorg deadlocks on the Vmware platform due to this. > > > > > > /Thomas > > > > I'm guessing due to Daniel's trylock changes (cb597bb3).. I wonder how Xorg > > can deadlock with this, since it is a locking path on driver load vs a > > runtime path. And I'm not quite seeing yet how mode_config.mutex is > > acquired *after* the drm_modeset_acquire_init(), since all appears to go via > > the same __drm_modeset_lock_all() path. > > > > Do you have hung-task backtraces from the Xorg deadlock? > > Attached. From a vanilla 3.18.0-rc2 kernel. (Although not Fedora.) > > Thanks, > Thomas I suspect you want something like: ---------------- diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c index d2bc2b0..5be6583 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c @@ -273,7 +273,7 @@ int vmw_du_crtc_cursor_move(struct drm_crtc *crtc, int x, int y) * can do this since the caller in the drm core doesn't check anything * which is protected by any looks. */ - drm_modeset_unlock(&crtc->mutex); + drm_modeset_unlock_crtc(crtc); drm_modeset_lock_all(dev_priv->dev); vmw_cursor_update_position(dev_priv, shown, @@ -281,7 +281,7 @@ int vmw_du_crtc_cursor_move(struct drm_crtc *crtc, int x, int y) du->cursor_y + du->hotspot_y); drm_modeset_unlock_all(dev_priv->dev); - drm_modeset_lock(&crtc->mutex, NULL); + drm_modeset_lock_crtc(crtc); return 0; } ---------------- looks like there might be a couple other spots that want similar treatment.
It might be, but whoever broke the code should really be responsible for fixing it up...
Created attachment 952292 [details] drm/vmwgfx: fix lock breakage After: commit d059f652e73c35678d28d4cd09ab2cec89696af9 Author: Daniel Vetter <daniel.vetter> AuthorDate: Fri Jul 25 18:07:40 2014 +0200 drm: Handle legacy per-crtc locking with full acquire ctx drm_mode_cursor_common() was switched to use drm_modeset_(un)lock_crtc() which uses full aquire ctx. So dropping/reaquiring the lock via drm_modeset_(un)lock() directly isn't the right thing to do, as lockdep kindly points out. The 'FIXME's about sorting out whether vmwgfx *really* needs to lock-all for cursor updates still apply.
fyi, patch posted on dri-devel http://lists.freedesktop.org/archives/dri-devel/2014-October/070910.html
This is fixed with commit 21e88620aa21b48d4f62d29275e3e2944a5ea2b5 which is in the rc3 kernels. Thanks Rob!