Bug 1155825
Summary: possible circular locking dependency detected __drm_modeset_lock_all

| Field | Value |
|---|---|
| Product | [Fedora] Fedora |
| Component | kernel |
| Version | rawhide |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED RAWHIDE |
| Severity | unspecified |
| Priority | unspecified |
| Reporter | Vinson Lee <vlee> |
| Assignee | Rob Clark <rclark> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | brsmith, gansalmon, itamar, jakob, jonathan, kernel-maint, madhu.chinakonda, mchehab, rclark, robdclark, thellstrom, zackr |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2014-11-05 23:58:56 UTC |

Description
Vinson Lee
2014-10-22 23:25:49 UTC
[ 43.050932] 1 lock held by Xorg.bin/1090:
[ 43.050934] #0: (crtc_ww_class_acquire){+.+.+.}, at: [<ffffffffa0076921>] drm_modeset_lock_crtc+0x41/0xc0 [drm]
[ 43.050946] stack backtrace:
[ 43.050950] CPU: 0 PID: 1090 Comm: Xorg.bin Not tainted 3.18.0-0.rc0.git9.4.fc22.x86_64 #1
[ 43.050952] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
[ 43.050954] 0000000000000000 00000000b4807c9b ffff88001146b938 ffffffff8185a15d
[ 43.050957] 0000000000000000 ffffffff82c02d30 ffff88001146b988 ffffffff818576dc
[ 43.050961] ffff88001145dae8 ffff88001146b9e8 ffff88001145cec0 ffff88001145dab8
[ 43.050964] Call Trace:
[ 43.050970] [<ffffffff8185a15d>] dump_stack+0x4e/0x68
[ 43.050973] [<ffffffff818576dc>] print_circular_bug+0x203/0x214
[ 43.050977] [<ffffffff81109ac9>] __lock_acquire+0x1c39/0x1d50
[ 43.050980] [<ffffffff8110a491>] lock_acquire+0xd1/0x2b0
[ 43.050991] [<ffffffffa00767f0>] ? __drm_modeset_lock_all+0x90/0x120 [drm]
[ 43.050994] [<ffffffff8185f02e>] mutex_lock_nested+0x7e/0x440
[ 43.051004] [<ffffffffa00767f0>] ? __drm_modeset_lock_all+0x90/0x120 [drm]
[ 43.051013] [<ffffffffa00767f0>] ? __drm_modeset_lock_all+0x90/0x120 [drm]
[ 43.051016] [<ffffffff8124b9bc>] ? kmem_cache_alloc_trace+0x3cc/0x410
[ 43.051025] [<ffffffffa00767f0>] __drm_modeset_lock_all+0x90/0x120 [drm]
[ 43.051034] [<ffffffffa0076890>] drm_modeset_lock_all+0x10/0x40 [drm]
[ 43.051048] [<ffffffffa010e14b>] vmw_du_crtc_cursor_move+0x5b/0xc0 [vmwgfx]
[ 43.051058] [<ffffffffa0067cd7>] drm_mode_cursor_common+0x2c7/0x390 [drm]
[ 43.051062] [<ffffffff8110aa36>] ? lock_release_non_nested+0x3c6/0x3d0
[ 43.051067] [<ffffffff81028c4a>] ? native_sched_clock+0x2a/0xa0
[ 43.051071] [<ffffffff81216cde>] ? might_fault+0x5e/0xc0
[ 43.051082] [<ffffffffa006b4e0>] drm_mode_cursor_ioctl+0x50/0x70 [drm]
[ 43.051091] [<ffffffffa005be0f>] drm_ioctl+0x1df/0x6a0 [drm]
[ 43.051095] [<ffffffff81028c4a>] ? native_sched_clock+0x2a/0xa0
[ 43.051101] [<ffffffff8139de3a>] ? avc_has_perm+0x15a/0x2f0
[ 43.051105] [<ffffffff8139dd14>] ? avc_has_perm+0x34/0x2f0
[ 43.051120] [<ffffffffa005bc30>] ? drm_getmap+0xf0/0xf0 [drm]
[ 43.051127] [<ffffffffa010ff3d>] vmw_generic_ioctl+0x18d/0x2a0 [vmwgfx]
[ 43.051133] [<ffffffffa0110085>] vmw_unlocked_ioctl+0x15/0x20 [vmwgfx]
[ 43.051136] [<ffffffff8128aa50>] do_vfs_ioctl+0x2f0/0x520
[ 43.051139] [<ffffffff8128ad01>] SyS_ioctl+0x81/0xa0
[ 43.051142] [<ffffffff818643a9>] system_call_fastpath+0x12/0x17

3.18.0-0.rc0.git9.4.fc22.x86_64 - bad
3.18.0-0.rc0.git4.1.fc22.x86_64 - good

Rob, is this due to one of your changes?

It appears that Xorg deadlocks on the Vmware platform due to this.

/Thomas

(In reply to Thomas Hellström from comment #3)
> Rob, is this due to one of your changes?
>
> It appears that Xorg deadlocks on the Vmware platform due to this.
>
> /Thomas

I'm guessing due to Daniel's trylock changes (cb597bb3).. I wonder how Xorg can deadlock with this, since it is a locking path on driver load vs a runtime path. And I'm not quite seeing yet how mode_config.mutex is acquired *after* the drm_modeset_acquire_init(), since all appears to go via the same __drm_modeset_lock_all() path.

Do you have hung-task backtraces from the Xorg deadlock?

Created attachment 951411 [details]
log from deadlock

Attaching a log file from an Xorg deadlock.
(In reply to Rob Clark from comment #4)
> (In reply to Thomas Hellström from comment #3)
> > Rob, is this due to one of your changes?
> >
> > It appears that Xorg deadlocks on the Vmware platform due to this.
> >
> > /Thomas
>
> I'm guessing due to Daniel's trylock changes (cb597bb3).. I wonder how Xorg
> can deadlock with this, since it is a locking path on driver load vs a
> runtime path. And I'm not quite seeing yet how mode_config.mutex is
> acquired *after* the drm_modeset_acquire_init(), since all appears to go via
> the same __drm_modeset_lock_all() path.
>
> Do you have hung-task backtraces from the Xorg deadlock?

Attached. From a vanilla 3.18.0-rc2 kernel. (Although not Fedora.)

Thanks,
Thomas

(In reply to Thomas Hellström from comment #6)
> (In reply to Rob Clark from comment #4)
> > (In reply to Thomas Hellström from comment #3)
> > > Rob, is this due to one of your changes?
> > >
> > > It appears that Xorg deadlocks on the Vmware platform due to this.
> > >
> > > /Thomas
> >
> > I'm guessing due to Daniel's trylock changes (cb597bb3).. I wonder how Xorg
> > can deadlock with this, since it is a locking path on driver load vs a
> > runtime path. And I'm not quite seeing yet how mode_config.mutex is
> > acquired *after* the drm_modeset_acquire_init(), since all appears to go via
> > the same __drm_modeset_lock_all() path.
> >
> > Do you have hung-task backtraces from the Xorg deadlock?
>
> Attached. From a vanilla 3.18.0-rc2 kernel. (Although not Fedora.)
>
> Thanks,
> Thomas

I suspect you want something like:

----------------
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index d2bc2b0..5be6583 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -273,7 +273,7 @@ int vmw_du_crtc_cursor_move(struct drm_crtc *crtc, int x, int y)
 	 * can do this since the caller in the drm core doesn't check anything
 	 * which is protected by any looks.
 	 */
-	drm_modeset_unlock(&crtc->mutex);
+	drm_modeset_unlock_crtc(crtc);
 	drm_modeset_lock_all(dev_priv->dev);
 
 	vmw_cursor_update_position(dev_priv, shown,
@@ -281,7 +281,7 @@ int vmw_du_crtc_cursor_move(struct drm_crtc *crtc, int x, int y)
 				   du->cursor_y + du->hotspot_y);
 
 	drm_modeset_unlock_all(dev_priv->dev);
-	drm_modeset_lock(&crtc->mutex, NULL);
+	drm_modeset_lock_crtc(crtc);
 
 	return 0;
 }
----------------

looks like there might be a couple other spots that want similar treatment.

It might be, but whoever broke the code should really be responsible for fixing it up...

Created attachment 952292 [details]
drm/vmwgfx: fix lock breakage
After:
commit d059f652e73c35678d28d4cd09ab2cec89696af9
Author: Daniel Vetter <daniel.vetter>
AuthorDate: Fri Jul 25 18:07:40 2014 +0200
drm: Handle legacy per-crtc locking with full acquire ctx
drm_mode_cursor_common() was switched to use drm_modeset_(un)lock_crtc()
which uses full acquire ctx. So dropping/reacquiring the lock via
drm_modeset_(un)lock() directly isn't the right thing to do, as lockdep
kindly points out.
The 'FIXME's about sorting out whether vmwgfx *really* needs to lock-all
for cursor updates still apply.
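
For readers unfamiliar with the pairing issue the commit message describes, here is a toy user-space sketch of the bookkeeping mismatch. It is not kernel code: every `toy_*` name is hypothetical, the "lock" is just a flag, and the real `drm_modeset_lock_crtc()` / `drm_modeset_unlock_crtc()` helpers do considerably more. The only point illustrated is that a lock taken through a helper that records an acquire context should be dropped through the matching helper, otherwise the recorded context goes stale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names below are hypothetical, not the DRM API. */
struct toy_ctx  { int nlocked; };
struct toy_lock { bool held; struct toy_ctx *owner; };
struct toy_crtc { struct toy_lock mutex; struct toy_ctx *acquire_ctx; };

static struct toy_ctx toy_ctx_storage;

/* Paired helper: take the lock through a context and record it on the CRTC. */
static void toy_lock_crtc(struct toy_crtc *crtc)
{
    struct toy_ctx *ctx = &toy_ctx_storage;

    ctx->nlocked = 1;
    crtc->mutex.held = true;
    crtc->mutex.owner = ctx;
    crtc->acquire_ctx = ctx;
}

/* Paired helper: drop the lock and clear the recorded context together. */
static void toy_unlock_crtc(struct toy_crtc *crtc)
{
    struct toy_ctx *ctx = crtc->acquire_ctx;

    crtc->mutex.held = false;
    crtc->mutex.owner = NULL;
    ctx->nlocked = 0;
    crtc->acquire_ctx = NULL;
}

/* Raw unlock: releases the lock but knows nothing about the CRTC's context. */
static void toy_unlock_raw(struct toy_lock *lock)
{
    lock->held = false;
    lock->owner = NULL;
}

/* True when the recorded context agrees with the actual lock state. */
static bool toy_consistent(const struct toy_crtc *crtc)
{
    if (crtc->mutex.held)
        return crtc->acquire_ctx != NULL &&
               crtc->mutex.owner == crtc->acquire_ctx;
    return crtc->acquire_ctx == NULL;
}
```

In this model, calling `toy_lock_crtc()` followed by `toy_unlock_raw()` leaves `acquire_ctx` pointing at a context that no longer holds the lock, while the matched `toy_lock_crtc()` / `toy_unlock_crtc()` pair keeps the state consistent. That mirrors, in a very simplified way, why the patch swaps the direct `drm_modeset_(un)lock()` calls for the `_crtc` variants.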

fyi, patch posted on dri-devel http://lists.freedesktop.org/archives/dri-devel/2014-October/070910.html

This is fixed with commit 21e88620aa21b48d4f62d29275e3e2944a5ea2b5 which is in the rc3 kernels.

Thanks Rob!