Bug 1571128 - BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
Summary: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-04-24 07:16 UTC by Joseph D. Wagner
Modified: 2018-07-10 23:08 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-10 16:48:09 UTC
Type: Bug
Embargoed:


Attachments
oops files (34.34 KB, application/zip)
2018-04-24 07:22 UTC, Joseph D. Wagner

Description Joseph D. Wagner 2018-04-24 07:16:14 UTC
Description of problem:
kernel-core crashes about every three seconds.

abrt says:
The backtrace does not contain enough meaningful function frames to be reported. It is annoying but it does not necessary signalize a problem with your computer. ABRT will not allow you to create a report in a bug tracking system but you can contact kernel maintainers via e-mail.

Version-Release number of selected component (if applicable):
4.17.0-0.rc1.git3.1.fc29.x86_64

How reproducible:
100%.

Steps to Reproduce:
1. Boot into Xfce.
2. Log on.
3. Watch abrt notifications every few seconds.

I've tried the following to get crash info:
1. Installed kernel-debuginfo.
2. Added crashkernel=128M to grub2.
3. Started kdump.

But I can't get it to produce a backtrace, or even a dump in /var/crash.

Is there anything I can do to get this info to you? Or is it something in the build that is simply missing?
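
One way to check that the crashkernel= and kdump steps above took effect is to look for the argument in /proc/cmdline and to read /sys/kernel/kexec_crash_loaded. The following is only a minimal Python sketch; the helper names are made up for illustration. Keep in mind that kdump normally writes to /var/crash only when the kernel actually panics, so a warning that leaves the system running would not be expected to produce a dump even with a working setup.

```python
from pathlib import Path
from typing import Optional


def crashkernel_arg(cmdline: str) -> Optional[str]:
    """Return the value of crashkernel= from a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            return token.split("=", 1)[1]
    return None


def crash_kernel_loaded(sysfs_path: str = "/sys/kernel/kexec_crash_loaded") -> bool:
    """Return True if the sysfs file exists and reports a loaded capture kernel."""
    path = Path(sysfs_path)
    return path.exists() and path.read_text().strip() == "1"
```

On the affected machine this could be called as crashkernel_arg(Path("/proc/cmdline").read_text()) followed by crash_kernel_loaded().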

Comment 1 Joseph D. Wagner 2018-04-24 07:22:03 UTC
Created attachment 1425829
oops files

Comment 2 Gerd Hoffmann 2018-04-24 13:14:51 UTC
Please retest with the latest rawhide kernel, which has one severe qxl issue fixed.

Comment 3 Joseph D. Wagner 2018-04-25 13:33:57 UTC
As far as I know, this is the latest version in the repository.
"dnf clean all; dnf -y upgrade" did not install a new kernel.

If there is a newer version, please provide a link or push to the repository.

Comment 4 Joseph D. Wagner 2018-04-27 22:53:05 UTC
This issue appears to be resolved in kernel-4.17.0-0.rc2.git0.1.fc29.x86_64, so I'm closing this bug.

Comment 5 Patrick Monnerat 2018-05-04 16:34:41 UTC
I still have the same problem with kernel-4.17.0-0.rc3.git2.1.fc29.x86_64 as a QEMU/KVM guest running on an up-to-date 64-bit F27 host, using the MATE desktop.

-----
May  4 18:31:03 rawhide kernel: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
May  4 18:31:03 rawhide kernel: in_atomic(): 1, irqs_disabled(): 0, pid: 902, name: Xorg
May  4 18:31:03 rawhide kernel: 4 locks held by Xorg/902:
May  4 18:31:03 rawhide kernel: #0: 000000005e74e4e3 (crtc_ww_class_acquire){+.+.}, at: drm_mode_cursor_common+0x90/0x210 [drm]
May  4 18:31:03 rawhide kernel: #1: 00000000154815bd (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0xfb/0x110 [drm]
May  4 18:31:03 rawhide kernel: #2: 00000000f6ef4033 (reservation_ww_class_acquire){+.+.}, at: qxl_release_reserve_list+0x63/0x150 [qxl]
May  4 18:31:03 rawhide kernel: #3: 00000000256c7d08 (reservation_ww_class_mutex){+.+.}, at: ttm_eu_reserve_buffers+0x349/0x5b0 [ttm]
May  4 18:31:03 rawhide kernel: CPU: 1 PID: 902 Comm: Xorg Tainted: G        W         4.17.0-0.rc3.git2.1.fc29.x86_64 #1
May  4 18:31:03 rawhide kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
May  4 18:31:03 rawhide kernel: Call Trace:
May  4 18:31:03 rawhide kernel: dump_stack+0x85/0xc0
May  4 18:31:03 rawhide kernel: ___might_sleep.cold.72+0xac/0xbc
May  4 18:31:03 rawhide kernel: ? __mutex_lock+0x56/0xa10
May  4 18:31:03 rawhide kernel: ? _raw_spin_unlock_irqrestore+0x4b/0x60
May  4 18:31:03 rawhide kernel: ? __slab_free+0x153/0x360
May  4 18:31:03 rawhide kernel: ? debug_check_no_obj_freed+0x123/0x204
May  4 18:31:03 rawhide kernel: ? qxl_surface_evict+0x25/0x60 [qxl]
May  4 18:31:03 rawhide kernel: ? qxl_surface_evict+0x25/0x60 [qxl]
May  4 18:31:03 rawhide kernel: ? qxl_gem_object_free+0x37/0x60 [qxl]
May  4 18:31:03 rawhide kernel: ? qxl_bo_unref+0x1d/0x30 [qxl]
May  4 18:31:03 rawhide kernel: ? qxl_cursor_atomic_update+0x270/0x2b0 [qxl]
May  4 18:31:03 rawhide kernel: ? drm_atomic_helper_commit_planes+0xae/0x210 [drm_kms_helper]
May  4 18:31:03 rawhide kernel: ? drm_atomic_helper_commit_tail+0x26/0x60 [drm_kms_helper]
May  4 18:31:03 rawhide kernel: ? commit_tail+0x59/0x70 [drm_kms_helper]
May  4 18:31:03 rawhide kernel: ? drm_atomic_helper_commit+0xdf/0x150 [drm_kms_helper]
May  4 18:31:03 rawhide kernel: ? drm_atomic_helper_update_plane+0xf1/0x110 [drm_kms_helper]
May  4 18:31:03 rawhide kernel: ? __setplane_internal+0x137/0x260 [drm]
May  4 18:31:03 rawhide kernel: ? drm_internal_framebuffer_create+0x2b6/0x490 [drm]
May  4 18:31:03 rawhide kernel: ? drm_mode_cursor_universal+0xed/0x1f0 [drm]
May  4 18:31:03 rawhide kernel: ? drm_mode_cursor_common+0x19e/0x210 [drm]
May  4 18:31:03 rawhide kernel: ? drm_mode_cursor_ioctl+0x70/0x70 [drm]
May  4 18:31:03 rawhide kernel: ? drm_ioctl_kernel+0x5b/0xb0 [drm]
May  4 18:31:03 rawhide kernel: ? drm_ioctl+0x1b3/0x370 [drm]
May  4 18:31:03 rawhide kernel: ? drm_mode_cursor_ioctl+0x70/0x70 [drm]
May  4 18:31:03 rawhide kernel: ? finish_task_switch+0x98/0x2b0
May  4 18:31:03 rawhide kernel: ? do_vfs_ioctl+0xa5/0x6d0
May  4 18:31:03 rawhide kernel: ? __fget+0x10d/0x1f0
May  4 18:31:03 rawhide kernel: ? ksys_ioctl+0x60/0x90
May  4 18:31:03 rawhide kernel: ? __x64_sys_ioctl+0x16/0x20
May  4 18:31:03 rawhide kernel: ? do_syscall_64+0x60/0x1f0
May  4 18:31:03 rawhide kernel: ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
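
For reading the log above: in_atomic(): 1 with irqs_disabled(): 0 suggests that the task (Xorg, pid 902) was in atomic context with interrupts still enabled when the mutex code was reached, and the qxl frames in the trace hint at where in the driver that happened. The following is a minimal Python sketch for pulling those two pieces out of such a log; the function names and regular expressions are illustrative assumptions, not part of any kernel tooling.

```python
import re
from typing import Dict, List, Optional

# Matches the context line printed with the "sleeping function" warning.
CONTEXT_RE = re.compile(
    r"in_atomic\(\): (?P<in_atomic>\d+), "
    r"irqs_disabled\(\): (?P<irqs_disabled>\d+), "
    r"pid: (?P<pid>\d+), name: (?P<name>\S+)"
)

# Matches a call-trace frame such as "qxl_bo_unref+0x1d/0x30 [qxl]".
FRAME_RE = re.compile(
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)


def parse_context(line: str) -> Optional[Dict[str, object]]:
    """Extract the atomic/irq state, pid and task name from a context line."""
    m = CONTEXT_RE.search(line)
    if m is None:
        return None
    return {
        "in_atomic": int(m["in_atomic"]),
        "irqs_disabled": int(m["irqs_disabled"]),
        "pid": int(m["pid"]),
        "name": m["name"],
    }


def module_frames(lines: List[str], module: str) -> List[str]:
    """Return function names of trace frames that belong to the given module."""
    funcs = []
    for line in lines:
        m = FRAME_RE.search(line)
        if m is not None and m["module"] == module:
            funcs.append(m["func"])
    return funcs
```

Calling module_frames(trace_lines, "qxl") on the call trace lines narrows the output to the driver functions involved.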

Comment 6 Joseph D. Wagner 2018-05-04 17:20:04 UTC
This bug came back for me too after upgrading to 4.17.0-0.rc3.git2.1.fc29.x86_64.

Comment 7 Joseph D. Wagner 2018-05-04 17:21:37 UTC
Why does abrt say "the backtrace does not contain enough meaningful function frames to be reported"? Could this be improved in the future to facilitate better reporting?

Comment 8 Joseph D. Wagner 2018-05-05 19:57:20 UTC
Issue is ongoing with 4.17.0-0.rc3.git4.1.fc29.x86_64.

Comment 9 Joseph D. Wagner 2018-05-26 23:04:30 UTC
This issue appeared to go away for 4.17.0-0.rc6.git1.1.fc29.x86_64, but it came back in 4.17.0-0.rc6.git2.1.fc29.x86_64.

I hope this info helps.

Comment 10 Patrick Monnerat 2018-05-29 18:28:06 UTC
Last version that worked for me: kernel-4.16.0-0.rc6.git0.1.fc29.x86_64
Not yet fixed in kernel-4.17.0-0.rc6.git3.1.fc29.x86_64

Comment 11 Joseph D. Wagner 2018-05-31 09:19:45 UTC
Appears to be fixed in 4.17.0-0.rc7.git0.1.fc29.x86_64. Can anyone confirm?

Comment 12 Patrick Monnerat 2018-05-31 11:10:23 UTC
> Appears to be fixed in 4.17.0-0.rc7.git0.1.fc29.x86_64. Can anyone confirm?

OK for me too, although I'm not in a position to point to the fix in the source code.

Comment 13 Jeremy Cline 2018-05-31 18:03:12 UTC
Hi Joseph, Patrick,

I believe the reason you're repeatedly seeing it "fixed" is that it's not actually fixed: the rc (rc#.git0) builds turn off debugging options, including CONFIG_LOCKDEP. If you install kernel-debug-4.17.0-0.rc7.git0.1.fc29.x86_64 or kernel-4.17.0-0.rc7.git1.1.fc29, you'll likely still see it.
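
The naming rule described above can be sketched in Python as follows. This is only an illustration of the rule as stated in this comment (git0 snapshots of the regular kernel package have debugging options off, while kernel-debug packages and later git snapshots have them on); the function names are hypothetical. The second helper checks the text of a kernel config file, which on Fedora is normally installed as /boot/config-<release>.

```python
import re
from typing import Optional

# The git snapshot number in a rawhide release string, e.g. ".git3." -> 3.
GIT_RE = re.compile(r"\.git(\d+)\.")


def debug_options_likely_enabled(package: str) -> Optional[bool]:
    """Guess from the package name whether debugging options are enabled.

    Returns None when the name carries no git snapshot number, because the
    rule from this comment does not cover that case.
    """
    if package.startswith("kernel-debug-"):
        return True
    m = GIT_RE.search(package)
    if m is None:
        return None
    return int(m.group(1)) != 0


def lockdep_enabled(config_text: str) -> bool:
    """Return True if the given kernel config text sets CONFIG_LOCKDEP=y."""
    return any(line.strip() == "CONFIG_LOCKDEP=y" for line in config_text.splitlines())
```

For example, debug_options_likely_enabled("kernel-4.17.0-0.rc7.git0.1.fc29.x86_64") returns False, which fits the warning not showing up in that build.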

Comment 14 Patrick Monnerat 2018-05-31 18:16:35 UTC
Thanks for the info Jeremy. I'll check with the next non-git0 update when available.

Comment 15 Jeremy Cline 2018-06-01 00:23:50 UTC
I went ahead and set up a VM, and it seems pretty easy to reproduce. I don't see a fix submitted upstream, and I think I understand the problem, so I'll see about submitting a patch to fix this.

Comment 16 Patrick Monnerat 2018-06-01 09:29:36 UTC
> I think I understand the problem so I'll see about submitting a patch to fix this.
Would be great. Thanks in advance.

Comment 17 Joseph D. Wagner 2018-06-18 18:54:03 UTC
Appears to still be a problem on 4.18.0-0.rc0.git10.1.fc29.x86_64.

Comment 18 Jeremy Cline 2018-06-18 19:15:09 UTC
Hi Joseph,

It looks like the fix is in linux-next. I'll close this bug when it arrives in Linus' tree. I recommend running the non-debug builds (builds with git0 in the release) until then.

Comment 19 Patrick Monnerat 2018-07-10 23:08:42 UTC
It does indeed seem to be fixed in kernel-4.18.0-0.rc3.git3.1.fc29.x86_64. Thanks a lot.

