Bug 1571128
| Summary: | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747 | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Joseph D. Wagner <joe> | ||||
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||
| Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | rawhide | CC: | airlied, bskeggs, ewk, hdegoede, ichavero, itamar, jarodwilson, jcline, jglisse, joe, john.j5live, jonathan, josef, kernel-maint, kraxel, labbott, linville, mchehab, mjg59, patrick, steved | ||||
| Target Milestone: | --- | Keywords: | Reopened | ||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2018-07-10 16:48:09 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Description
Joseph D. Wagner
2018-04-24 07:16:14 UTC
Created attachment 1425829 [details]
oops files
Please retest with the latest rawhide kernel, which has one severe qxl issue fixed.
As far as I know, this is the latest version in the repository. "dnf clean all; dnf -y upgrade" did not install a new kernel. If there is a newer version, please provide a link or push it to the repository.
This issue appears to be resolved in kernel-4.17.0-0.rc2.git0.1.fc29.x86_64, so I'm closing this bug.
I still have the same problem with kernel-4.17.0-0.rc3.git2.1.fc29.x86_64 as a guest of QEMU/KVM running on an up-to-date 64-bit F27 host, using the MATE desktop.
-----
May 4 18:31:03 rawhide kernel: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
May 4 18:31:03 rawhide kernel: in_atomic(): 1, irqs_disabled(): 0, pid: 902, name: Xorg
May 4 18:31:03 rawhide kernel: 4 locks held by Xorg/902:
May 4 18:31:03 rawhide kernel: #0: 000000005e74e4e3 (crtc_ww_class_acquire){+.+.}, at: drm_mode_cursor_common+0x90/0x210 [drm]
May 4 18:31:03 rawhide kernel: #1: 00000000154815bd (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0xfb/0x110 [drm]
May 4 18:31:03 rawhide kernel: #2: 00000000f6ef4033 (reservation_ww_class_acquire){+.+.}, at: qxl_release_reserve_list+0x63/0x150 [qxl]
May 4 18:31:03 rawhide kernel: #3: 00000000256c7d08 (reservation_ww_class_mutex){+.+.}, at: ttm_eu_reserve_buffers+0x349/0x5b0 [ttm]
May 4 18:31:03 rawhide kernel: CPU: 1 PID: 902 Comm: Xorg Tainted: G W 4.17.0-0.rc3.git2.1.fc29.x86_64 #1
May 4 18:31:03 rawhide kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
May 4 18:31:03 rawhide kernel: Call Trace:
May 4 18:31:03 rawhide kernel: dump_stack+0x85/0xc0
May 4 18:31:03 rawhide kernel: ___might_sleep.cold.72+0xac/0xbc
May 4 18:31:03 rawhide kernel: ? __mutex_lock+0x56/0xa10
May 4 18:31:03 rawhide kernel: ? _raw_spin_unlock_irqrestore+0x4b/0x60
May 4 18:31:03 rawhide kernel: ? __slab_free+0x153/0x360
May 4 18:31:03 rawhide kernel: ? debug_check_no_obj_freed+0x123/0x204
May 4 18:31:03 rawhide kernel: ? qxl_surface_evict+0x25/0x60 [qxl]
May 4 18:31:03 rawhide kernel: ? qxl_surface_evict+0x25/0x60 [qxl]
May 4 18:31:03 rawhide kernel: ? qxl_gem_object_free+0x37/0x60 [qxl]
May 4 18:31:03 rawhide kernel: ? qxl_bo_unref+0x1d/0x30 [qxl]
May 4 18:31:03 rawhide kernel: ? qxl_cursor_atomic_update+0x270/0x2b0 [qxl]
May 4 18:31:03 rawhide kernel: ? drm_atomic_helper_commit_planes+0xae/0x210 [drm_kms_helper]
May 4 18:31:03 rawhide kernel: ? drm_atomic_helper_commit_tail+0x26/0x60 [drm_kms_helper]
May 4 18:31:03 rawhide kernel: ? commit_tail+0x59/0x70 [drm_kms_helper]
May 4 18:31:03 rawhide kernel: ? drm_atomic_helper_commit+0xdf/0x150 [drm_kms_helper]
May 4 18:31:03 rawhide kernel: ? drm_atomic_helper_update_plane+0xf1/0x110 [drm_kms_helper]
May 4 18:31:03 rawhide kernel: ? __setplane_internal+0x137/0x260 [drm]
May 4 18:31:03 rawhide kernel: ? drm_internal_framebuffer_create+0x2b6/0x490 [drm]
May 4 18:31:03 rawhide kernel: ? drm_mode_cursor_universal+0xed/0x1f0 [drm]
May 4 18:31:03 rawhide kernel: ? drm_mode_cursor_common+0x19e/0x210 [drm]
May 4 18:31:03 rawhide kernel: ? drm_mode_cursor_ioctl+0x70/0x70 [drm]
May 4 18:31:03 rawhide kernel: ? drm_ioctl_kernel+0x5b/0xb0 [drm]
May 4 18:31:03 rawhide kernel: ? drm_ioctl+0x1b3/0x370 [drm]
May 4 18:31:03 rawhide kernel: ? drm_mode_cursor_ioctl+0x70/0x70 [drm]
May 4 18:31:03 rawhide kernel: ? finish_task_switch+0x98/0x2b0
May 4 18:31:03 rawhide kernel: ? do_vfs_ioctl+0xa5/0x6d0
May 4 18:31:03 rawhide kernel: ? __fget+0x10d/0x1f0
May 4 18:31:03 rawhide kernel: ? ksys_ioctl+0x60/0x90
May 4 18:31:03 rawhide kernel: ? __x64_sys_ioctl+0x16/0x20
May 4 18:31:03 rawhide kernel: ? do_syscall_64+0x60/0x1f0
May 4 18:31:03 rawhide kernel: ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
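In the call trace above, every frame after the first two is prefixed with `?`, which the kernel uses for addresses found on the stack that the unwinder could not verify. A minimal Python sketch for separating the two kinds of frames in a saved log follows. The regular expression and the `split_frames` helper are illustrative assumptions based only on the line format shown here, and the idea that a reporting tool would reject the trace for having so few verified frames is a guess rather than something confirmed in this report.

```python
import re

# Matches a call-trace entry such as "? __mutex_lock+0x56/0xa10" or
# "dump_stack+0x85/0xc0", optionally followed by "[module]".
# Assumption: lines look like the journal output quoted in this report.
FRAME_RE = re.compile(
    r"kernel:\s+(?P<unreliable>\?\s+)?(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def split_frames(log_lines):
    """Return (reliable, unreliable) lists of symbol names from a call trace."""
    reliable, unreliable = [], []
    for line in log_lines:
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a call-trace frame (header, held-lock list, etc.)
        target = unreliable if match.group("unreliable") else reliable
        target.append(match.group("symbol"))
    return reliable, unreliable


sample = [
    "May 4 18:31:03 rawhide kernel: Call Trace:",
    "May 4 18:31:03 rawhide kernel: dump_stack+0x85/0xc0",
    "May 4 18:31:03 rawhide kernel: ___might_sleep.cold.72+0xac/0xbc",
    "May 4 18:31:03 rawhide kernel: ? __mutex_lock+0x56/0xa10",
    "May 4 18:31:03 rawhide kernel: ? qxl_surface_evict+0x25/0x60 [qxl]",
]
reliable, unreliable = split_frames(sample)
print(len(reliable), "reliable,", len(unreliable), "unreliable")
```

Run against the full trace, this leaves only `dump_stack` and `___might_sleep.cold.72` in the reliable list, so the qxl and drm frames that matter for diagnosis all fall in the unverified group.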
This bug came back for me too after upgrading to 4.17.0-0.rc3.git2.1.fc29.x86_64. Why does abrt say "the backtrace does not contain enough meaningful function frames to be reported"? Could this be improved in the future to facilitate better reporting?
Issue is ongoing with 4.17.0-0.rc3.git4.1.fc29.x86_64.
This issue appeared to go away in 4.17.0-0.rc6.git1.1.fc29.x86_64, but it came back in 4.17.0-0.rc6.git2.1.fc29.x86_64. I hope this info helps.
Last version that worked for me: kernel-4.16.0-0.rc6.git0.1.fc29.x86_64
Not yet fixed in kernel-4.17.0-0.rc6.git3.1.fc29.x86_64.
Appears to be fixed in 4.17.0-0.rc7.git0.1.fc29.x86_64. Can anyone confirm?
> Appears to be fixed in 4.17.0-0.rc7.git0.1.fc29.x86_64. Can anyone confirm?
OK for me too, although I'm not in a position to point to the fix in the source code.
Hi Joseph, Patrick,
I believe the reason you're repeatedly seeing it "fixed" is that it's not actually fixed: the rc (rc#.git0) builds turn off debugging options, including CONFIG_LOCKDEP. If you install kernel-debug-4.17.0-0.rc7.git0.1.fc29.x86_64 or kernel-4.17.0-0.rc7.git1.1.fc29, you'll likely still see it.
Thanks for the info, Jeremy. I'll check with the next non-git0 update when available.
I went ahead and set up a VM. It seems pretty easy to reproduce, I don't see a fix submitted upstream, and I think I understand the problem, so I'll see about submitting a patch to fix this.
> I think I understand the problem so I'll see about submitting a patch to fix this.
Would be great. Thanks in advance.
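The point made a few comments up, that rc#.git0 builds ship with debugging options turned off while kernel-debug packages and other git snapshots keep them on, can be sketched as a small Python check. This is only an illustration: the `likely_has_debug_options` helper and its pattern are assumptions inferred from the version strings quoted in this report, not an official description of Fedora's naming scheme.

```python
import re

# Assumed release pattern, inferred only from build strings quoted in this
# report, e.g. "kernel-4.17.0-0.rc7.git0.1.fc29.x86_64".
RELEASE_RE = re.compile(
    r"^(?P<name>kernel(?:-debug)?-)?\d+\.\d+\.\d+-0\.rc\d+\.git(?P<git>\d+)\."
)


def likely_has_debug_options(build):
    """Guess whether a rawhide kernel build has debugging options enabled."""
    match = RELEASE_RE.match(build)
    if match is None:
        raise ValueError(f"unrecognised build string: {build!r}")
    if match.group("name") == "kernel-debug-":
        return True  # explicit debug package
    # Per the comment in this thread, git0 builds turn debugging options off.
    return match.group("git") != "0"


for build in (
    "kernel-4.17.0-0.rc7.git0.1.fc29.x86_64",
    "kernel-debug-4.17.0-0.rc7.git0.1.fc29.x86_64",
    "kernel-4.17.0-0.rc7.git1.1.fc29",
):
    print(build, likely_has_debug_options(build))
```

A build for which this returns False would not be expected to print the warning even if the underlying locking problem is still present, so an apparent fix on such a build is not conclusive.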
Appears to still be a problem on 4.18.0-0.rc0.git10.1.fc29.x86_64.
Hi Joseph,
It looks like the fix is in linux-next. I'll close this bug when it arrives in Linus' tree. I recommend running the non-debug builds (builds with git0 in the release) until then.
It seems effectively fixed in kernel-4.18.0-0.rc3.git3.1.fc29.x86_64. Thanks a lot.