Bug 1270699

Summary: kernel-rt-debug: BUG: sleeping function called from invalid context at kernel/rtmutex.c:729
Product: Red Hat Enterprise Linux 7
Reporter: Nicolas Dichtel <nicolas.dichtel>
Component: kernel-rt
Assignee: Clark Williams <williams>
Status: CLOSED CURRENTRELEASE
QA Contact: Jiri Kastner <jkastner>
Severity: high
Docs Contact:
Priority: unspecified
Version: 7.3
CC: bhu, jean-mickael.guerin, knoel, lgoncalv, nicolas.dichtel, riel, srostedt, vincent.jardin
Target Milestone: rc
Keywords: Reopened
Target Release: 7.3
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1416403
Environment:
Last Closed: 2017-01-25 20:25:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1175461, 1274397, 1282922, 1313485
Attachments:
Description: make kvm async pagefault patch rt-preempt friendly
Flags: none

Description Nicolas Dichtel 2015-10-12 08:01:39 UTC
When using the kernel-rt-debug kernel, I got the following BUG message during login:

Red Hat Enterprise Linux
Kernel 3.10.0-319.rt56.194.el7.x86_64.debug on an x86_64

redhat7 login: root
Password: 
Login incorrect

redhat7 login: root
Password: 
[   38.166957] BUG: sleeping function called from invalid context at kernel/rtmutex.c:729
[   38.166958] in_atomic(): 0, irqs_disabled(): 1, pid: 4490, name: dracut
[   38.166958] INFO: lockdep is turned off.
[   38.166958] irq event stamp: 0
[   38.166960] hardirqs last  enabled at (0): [<          (null)>]           (null)
[   38.166965] hardirqs last disabled at (0): [<ffffffff810737f8>] copy_process.part.26+0x778/0x1a70
[   38.166966] softirqs last  enabled at (0): [<ffffffff810737f8>] copy_process.part.26+0x778/0x1a70
[   38.166967] softirqs last disabled at (0): [<          (null)>]           (null)
[   38.166969] CPU: 1 PID: 4490 Comm: dracut Not tainted 3.10.0-319.rt56.194.el7.x86_64.debug #1
[   38.166969] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
[   38.166971]  ffff880034182830 00000000f9ce5327 ffff880033c73b10 ffffffff816e5622
[   38.166972]  ffff880033c73b38 ffffffff810be42d ffffffff822d1610 0000000000001770
[   38.166973]  ffffffff822d1610 ffff880033c73b58 ffffffff816ec6c4 0000000000001770
[   38.166973] Call Trace:
[   38.166977]  [<ffffffff816e5622>] dump_stack+0x19/0x1b
[   38.166979]  [<ffffffff810be42d>] __might_sleep+0x12d/0x1f0
[   38.166981]  [<ffffffff816ec6c4>] rt_spin_lock+0x24/0x60
[   38.166983]  [<ffffffff8104dbcb>] kvm_async_pf_task_wait+0x8b/0x290
[   38.166986]  [<ffffffff810ad720>] ? wake_up_atomic_t+0x30/0x30
[   38.166988]  [<ffffffff8119f894>] ? __alloc_pages_nodemask+0x214/0xcb0
[   38.166991]  [<ffffffff8135817d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[   38.166992]  [<ffffffff816f11ac>] do_async_page_fault+0xac/0x100
[   38.166994]  [<ffffffff816edd88>] async_page_fault+0x28/0x30
[   38.166995]  [<ffffffff81356725>] ? copy_page_rep+0x5/0x10
[   38.166997]  [<ffffffff811c3581>] ? do_wp_page+0x161/0x8f0
[   38.166998]  [<ffffffff811c645f>] ? handle_mm_fault+0x29f/0xe40
[   38.166999]  [<ffffffff811c67bb>] handle_mm_fault+0x5fb/0xe40
[   38.167002]  [<ffffffff816f1766>] ? __do_page_fault+0x216/0x530
[   38.167003]  [<ffffffff816f17c8>] __do_page_fault+0x278/0x530
[   38.167004]  [<ffffffff816f1b51>] trace_do_page_fault+0x51/0x2f0
[   38.167004]  [<ffffffff816f1129>] do_async_page_fault+0x29/0x100
[   38.167005]  [<ffffffff816edd88>] async_page_fault+0x28/0x30
[   70.982022] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
Last failed login: Mon Oct 12 09:53:16 CEST 2015 on ttyS0
There was 1 failed login attempt since the last successful login.
Last login: Mon Oct 12 09:52:02 on ttyS0
[root@redhat7 ~]#
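
Note on reading the report above: the second line of the message carries the context flags. Here in_atomic() is 0 but irqs_disabled() is 1, and the call trace shows rt_spin_lock being reached from kvm_async_pf_task_wait. On the realtime kernel rt_spin_lock may sleep, which is what __might_sleep flags when interrupts are disabled. The following is a minimal sketch for pulling those flags out of a console log line; the function name parse_context and the regular expression are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Matches the context line printed right after
# "BUG: sleeping function called from invalid context".
CONTEXT_RE = re.compile(
    r"in_atomic\(\): (\d+), irqs_disabled\(\): (\d+), pid: (\d+), name: (\S+)"
)


def parse_context(line):
    """Return the context flags from a might-sleep warning line, or None."""
    m = CONTEXT_RE.search(line)
    if m is None:
        return None
    in_atomic, irqs_disabled, pid, name = m.groups()
    return {
        "in_atomic": int(in_atomic),
        "irqs_disabled": int(irqs_disabled),
        "pid": int(pid),
        "name": name,
    }


# Line copied from the log in this report.
line = "[   38.166958] in_atomic(): 0, irqs_disabled(): 1, pid: 4490, name: dracut"
ctx = parse_context(line)
print(ctx)
```

A non-zero irqs_disabled value together with rt_spin_lock in the trace is the combination to look for when checking other logs for the same problem.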

Comment 2 Clark Williams 2016-02-25 20:24:23 UTC
(In reply to Nicolas Dichtel from comment #0)
> When using the kernel-rt-debug, I got a WARNING during the login:
> 
> Red Hat Enterprise Linux
> Kernel 3.10.0-319.rt56.194.el7.x86_64.debug on an x86_64
> 

Nicolas, did you have any virtual guests running on this system? Or even loaded?

Comment 3 Nicolas Dichtel 2016-02-26 09:00:02 UTC
No. However, the system itself was booted in a VM (QEMU), not on a physical machine.

Comment 6 Rik van Riel 2016-03-08 22:09:41 UTC
Created attachment 1134328
make kvm async pagefault patch rt-preempt friendly

Comment 7 Rik van Riel 2016-03-10 19:21:53 UTC
I have built a test kernel with the patch from comment #6. Does that change resolve the issue?

https://people.redhat.com/riel/.bz1270699/

Comment 8 Nicolas Dichtel 2016-03-17 10:07:08 UTC
It seems that this kernel is not compiled with debug options. I got the initial backtrace only with the debug kernel.

Comment 9 Rik van Riel 2016-03-17 17:16:06 UTC
My apologies, I have uploaded a debug kernel to the same location now.

Does that kernel resolve the problem, or is it still showing up?

Comment 10 Nicolas Dichtel 2016-03-18 15:36:09 UTC
No backtrace on my side, thank you.

Comment 11 Karen Noel 2016-03-21 18:39:25 UTC
Nicolas, 

What versions of the kernel, qemu, and libvirt packages are installed on your host? Thanks.

Comment 12 Nicolas Dichtel 2016-03-22 08:27:25 UTC
$ uname -a
Linux bretzel 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt11-1+deb8u4 (2015-09-19) x86_64 GNU/Linux
$ qemu-system-x86_64 --version
QEMU emulator version 2.3.0, Copyright (c) 2003-2008 Fabrice Bellard

Comment 13 Rik van Riel 2016-04-13 20:31:08 UTC
This patch has been in the RHEL7 realtime kernel tree since kernel-rt-3.10.0-363.rt56.240.el7.
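
To check whether a given kernel-rt build is at or after the build named above, a rough comparison of the numeric fields in the release string may be enough. This is a minimal sketch: the helper names release_key and has_fix are hypothetical, and the tuple comparison is a simplification that assumes the usual kernel-rt release layout rather than full RPM version ordering.

```python
import re

# First kernel-rt build reported in this bug to contain the patch.
FIXED_IN = "3.10.0-363.rt56.240.el7"

# Kernel version, build number, rt patch level, rt build number.
RELEASE_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)-(\d+)\.rt(\d+)\.(\d+)")


def release_key(release):
    """Turn a kernel-rt release string into a tuple of ints for comparison."""
    m = RELEASE_RE.search(release)
    if m is None:
        raise ValueError(f"not a kernel-rt release string: {release!r}")
    return tuple(int(g) for g in m.groups())


def has_fix(release):
    """True if the release sorts at or after the first build with the patch."""
    return release_key(release) >= release_key(FIXED_IN)


# The kernel from the original report predates the fix.
print(has_fix("3.10.0-319.rt56.194.el7.x86_64.debug"))
# The build named in this comment contains it.
print(has_fix("kernel-rt-3.10.0-363.rt56.240.el7"))
```

On a running system the string to pass in would be the output of uname -r (or platform.release() in Python). A non-rt release string such as the Debian host kernel in comment 12 raises ValueError rather than being compared.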