Bug 1361614

Summary: [abrt] BUG: sleeping function called from invalid context at mm/slab.h:391
Product: [Fedora] Fedora Reporter: Joachim Frieben <jfrieben>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 25 CC: awilliam, bugzilla, gansalmon, iliketurtlesbro, itamar, jonathan, juliux.pigface, kernel-maint, madhu.chinakonda, mchehab, michal.jnn
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
URL: https://retrace.fedoraproject.org/faf/reports/bthash/91088472ab3302b5e1d9449eb5841855f7b19096
Whiteboard: abrt_hash:97cb95136ac065e29e3a26bbfdf753cc7e27bcc8;VARIANT_ID=workstation;
Fixed In Version: kernel-4.8.0-0.rc1.git1.1.fc25 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-08-10 20:10:43 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1277285    
Attachments:
Description Flags
File: dmesg none

Description Joachim Frieben 2016-07-29 14:10:12 UTC
Additional info:
reporter:       libreport-2.7.2
BUG: sleeping function called from invalid context at mm/slab.h:391
in_atomic(): 1, irqs_disabled(): 0, pid: 957, name: modprobe
no locks held by modprobe/957.
CPU: 0 PID: 957 Comm: modprobe Not tainted 4.8.0-0.rc0.git2.1.fc25.x86_64 #1
Hardware name: LENOVO 2768W9J/2768W9J, BIOS 7UET94WW (3.24 ) 10/17/2012
 0000000000000286 00000000dec86753 ffff8bc8681eb900 ffffffff85463ef3
 ffff8bc876bdb140 ffffffff85c7c846 ffff8bc8681eb928 ffffffff850de799
 ffffffff85c7c846 0000000000000187 0000000000000000 ffff8bc8681eb950
Call Trace:
 [<ffffffff85463ef3>] dump_stack+0x86/0xc3
 [<ffffffff850de799>] ___might_sleep+0x179/0x230
 [<ffffffff850de899>] __might_sleep+0x49/0x80
 [<ffffffff8526ef96>] kmem_cache_alloc_trace+0x1e6/0x2d0
 [<ffffffff8549ce00>] ? mpi_alloc+0x20/0x80
 [<ffffffff8549ce00>] mpi_alloc+0x20/0x80
 [<ffffffff8549a755>] mpi_read_raw_from_sgl+0xd5/0x1e0
 [<ffffffff85403666>] rsa_verify+0x66/0x100
 [<ffffffff85270bf5>] ? __kmalloc+0x2e5/0x320
 [<ffffffff85404ab9>] pkcs1pad_verify+0x149/0x190
 [<ffffffff8541f379>] public_key_verify_signature+0x1f9/0x290
 [<ffffffff8541f425>] public_key_verify_signature_2+0x15/0x20
 [<ffffffff8541f07c>] verify_signature+0x3c/0x50
 [<ffffffff854212ed>] pkcs7_validate_trust+0x11d/0x230
 [<ffffffff851f518b>] verify_pkcs7_signature+0xbb/0x180
 [<ffffffff8515c24d>] mod_verify_sig+0xdd/0x130
 [<ffffffff85158f8c>] load_module+0x16c/0x2960
 [<ffffffff8524a05c>] ? vmap_page_range_noflush+0x25c/0x360
 [<ffffffff85037e89>] ? sched_clock+0x9/0x10
 [<ffffffff850ea440>] ? sched_clock_cpu+0x90/0xc0
 [<ffffffff85234ba3>] ? __might_fault+0x43/0xa0
 [<ffffffff85234ba3>] ? __might_fault+0x43/0xa0
 [<ffffffff8515b8f2>] SYSC_init_module+0x172/0x1b0
 [<ffffffff8515ba5e>] SyS_init_module+0xe/0x10
 [<ffffffff858f17fc>] entry_SYSCALL_64_fastpath+0x1f/0xbd

Comment 1 Joachim Frieben 2016-07-29 14:10:22 UTC
Created attachment 1185573 [details]
File: dmesg

Comment 2 Michal Jaegermann 2016-07-30 23:09:51 UTC
It looks like I am seeing the same bug with 4.8.0-0.rc0.git3.1.fc26.x86_64, only with a slightly different line number.  Booting that on "Acer Aspire T135/K8VM800MAE, BIOS R01-A3 06/27/2005" I get the following message 33 times:

[    3.231229] BUG: sleeping function called from invalid context at mm/slab.h:393
[    4.296673] BUG: sleeping function called from invalid context at mm/slab.h:393
[    5.436395] BUG: sleeping function called from invalid context at mm/slab.h:393
[   15.887089] BUG: sleeping function called from invalid context at mm/slab.h:393
[   19.339376] BUG: sleeping function called from invalid context at mm/slab.h:393
[   25.687549] BUG: sleeping function called from invalid context at mm/slab.h:393
[   27.185124] BUG: sleeping function called from invalid context at mm/slab.h:393
[   28.711261] BUG: sleeping function called from invalid context at mm/slab.h:393
[   29.893191] BUG: sleeping function called from invalid context at mm/slab.h:393
[   30.928079] BUG: sleeping function called from invalid context at mm/slab.h:393
[   32.197365] BUG: sleeping function called from invalid context at mm/slab.h:393
[   33.697918] BUG: sleeping function called from invalid context at mm/slab.h:393
[   34.732839] BUG: sleeping function called from invalid context at mm/slab.h:393
[   35.842572] BUG: sleeping function called from invalid context at mm/slab.h:393
[   36.857164] BUG: sleeping function called from invalid context at mm/slab.h:393
[   37.915442] BUG: sleeping function called from invalid context at mm/slab.h:393
[   40.191417] BUG: sleeping function called from invalid context at mm/slab.h:393
[   43.596287] BUG: sleeping function called from invalid context at mm/slab.h:393
[   45.108227] BUG: sleeping function called from invalid context at mm/slab.h:393
[   46.673371] BUG: sleeping function called from invalid context at mm/slab.h:393
[   47.997747] BUG: sleeping function called from invalid context at mm/slab.h:393
[   49.103199] BUG: sleeping function called from invalid context at mm/slab.h:393
[   52.045579] BUG: sleeping function called from invalid context at mm/slab.h:393
[   53.343135] BUG: sleeping function called from invalid context at mm/slab.h:393
[   54.683570] BUG: sleeping function called from invalid context at mm/slab.h:393
[   55.722280] BUG: sleeping function called from invalid context at mm/slab.h:393
[   56.861133] BUG: sleeping function called from invalid context at mm/slab.h:393
[   57.877091] BUG: sleeping function called from invalid context at mm/slab.h:393
[   59.019474] BUG: sleeping function called from invalid context at mm/slab.h:393
[   64.576782] BUG: sleeping function called from invalid context at mm/slab.h:393
[   72.171066] BUG: sleeping function called from invalid context at mm/slab.h:393
[   74.057061] BUG: sleeping function called from invalid context at mm/slab.h:393
[   75.641668] BUG: sleeping function called from invalid context at mm/slab.h:393

After all this excitement the boot, surprisingly enough, finishes.

If somebody wants to see the dmesg output, please let me know.  It does not seem to be materially different from what is already here.
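
For anyone who wants to check a saved dmesg for these messages, the count above can be reproduced with a small script. This is only a sketch: the function name `count_sleeping_bugs` and the regular expression are illustrative and not part of any Fedora or kernel tooling, and it assumes the dmesg output is available as plain text.

```python
import re
from collections import Counter

# Matches the "sleeping function called from invalid context" line and
# captures the reported source location (for example mm/slab.h:393).
# The pattern is an assumption based on the messages quoted in this report.
BUG_RE = re.compile(
    r"BUG: sleeping function called from invalid context at (\S+)"
)


def count_sleeping_bugs(dmesg_text):
    """Return a Counter mapping source location -> number of BUG lines."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = BUG_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

With the output saved to a file (the name `dmesg.txt` is hypothetical), `count_sleeping_bugs(open("dmesg.txt").read())` returns one entry per distinct location, which makes it easy to tell the mm/slab.h:391 and mm/slab.h:393 variants apart.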

Comment 3 Michal Jaegermann 2016-07-30 23:20:54 UTC
(In reply to Michal Jaegermann from comment #2)
> It does not
> seem to be materially different from what is already here.

Actually, looking closer, there is a small difference.  Instead of modprobe from the original report I see:

[    3.231229] BUG: sleeping function called from invalid context at mm/slab.h:393
[    3.231369] in_atomic(): 1, irqs_disabled(): 0, pid: 235, name: systemd-udevd
[    3.231459] no locks held by systemd-udevd/235.
[    3.231548] CPU: 0 PID: 235 Comm: systemd-udevd Not tainted 4.8.0-0.rc0.git3.1.fc26.x86_64 #1

This is the case in most of the 33 instances mentioned before.  In some of them modprobe shows up too.

systemd-udev-231-2.fc26.x86_64, kmod-23-1.fc25.x86_64.
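
To see how the reports split between systemd-udevd and modprobe, the `in_atomic()` context lines can be tallied by process name. Again a sketch under assumptions: `tally_by_process` is a hypothetical helper, and the pattern is derived only from the context lines quoted in this report.

```python
import re
from collections import Counter

# Matches the context line printed right after the BUG line, e.g.
#   in_atomic(): 1, irqs_disabled(): 0, pid: 235, name: systemd-udevd
# and captures the process name at the end of the line.
CTX_RE = re.compile(
    r"in_atomic\(\): \d+, irqs_disabled\(\): \d+, pid: \d+, name: (.+?)\s*$"
)


def tally_by_process(dmesg_text):
    """Return a Counter mapping process name -> number of context lines."""
    names = Counter()
    for line in dmesg_text.splitlines():
        match = CTX_RE.search(line)
        if match:
            names[match.group(1)] += 1
    return names
```

Calling `tally_by_process(...).most_common()` on the saved dmesg text lists the most frequently reported process first.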

Comment 4 Adam Williamson 2016-08-03 16:05:32 UTC
I also see this on boot of a system (KVM) freshly installed from today's F25 x86_64 Server DVD nightly:

https://kojipkgs.fedoraproject.org/compose/branched/Fedora-25-20160803.n.0/compose/Server/x86_64/iso/Fedora-Server-dvd-x86_64-25-20160803.n.0.iso

Booting the system normally seems to fail (it never reaches a login prompt), but booting with console=ttyS0 does give a login prompt on the serial console.

All openQA tests seem to have started failing on 2016-07-29. Between 2016-07-25 and 2016-07-29 there were no successful image composes, it seems; on 2016-07-24 most tests were passing. kernel 4.8 packages appeared on 2016-07-28, so this bug looks like a suspect in causing the boot failures, but labbott says it should not, and I *do* see the login prompt on the serial console, so I'll do some more poking around before declaring that this is the culprit for the boot failure.

Comment 5 Chris Murphy 2016-08-03 16:14:13 UTC
In a qemu-kvm VM I have a seemingly fully functional F25 Workstation installation with kernel-4.8.0-0.rc0.git3.1.fc25 with a bunch of these BUG messages. It consistently gets to gdm. But I haven't updated it in perhaps two or three days, so I'd suspect something other than the kernel is causing it to not reach a login prompt.

Comment 6 Adam Williamson 2016-08-03 16:18:58 UTC
Yeah, it seems to be something else: booting a 4.7 kernel stops this bug from appearing, but tty1 still doesn't get a login prompt. Other ttys do, though. So I think it's a bug in systemd or something.

Comment 7 Joachim Frieben 2016-08-03 17:49:58 UTC
(In reply to Adam Williamson from comment #6)
Current Fedora 25 Workstation with kernel-4.8.0-0.rc0.git3.1.fc25 indeed still shows this bug but it is not fatal. It appears in the system output and gets reported by the problem reporting utility.
Booting in permissive mode might allow the user to reach graphical login for further analysis.

Comment 8 Laura Abbott 2016-08-08 12:54:47 UTC
*** Bug 1364714 has been marked as a duplicate of this bug. ***

Comment 9 Adam Williamson 2016-08-10 18:25:06 UTC
Proposing as a freeze exception issue; it seems reasonable to fix this for Alpha as it's pretty visible. Does the kernel team have a plan for which kernel build you want in Alpha?

Comment 10 Chris Murphy 2016-08-10 19:37:50 UTC
Doesn't happen with 4.8.0-0.rc1.git0.1.fc25.x86_64, even if I boot with slub_debug=F.

Comment 11 Adam Williamson 2016-08-10 19:45:38 UTC
Ah, and that one got in under the freeze. So if others can confirm, we can probably just close this.

Comment 12 Adam Williamson 2016-08-10 20:10:43 UTC
Yeah, no trace of this in an install from today's Server netinst with 4.8.0-0.rc1.git0.1, so the fix looks confirmed; let's close it.

Comment 13 Michal Jaegermann 2016-08-10 21:59:02 UTC
(In reply to Chris Murphy from comment #10)
> Doesn't happen with 4.8.0-0.rc1.git0.1.fc25.x86_64,

As expected 4.8.0-0.rc1.git1.1.fc26.x86_64 does not sport this bug either.

Comment 14 Chris Murphy 2016-08-10 22:02:28 UTC
(In reply to Michal Jaegermann from comment #13)
> As expected 4.8.0-0.rc1.git1.1.fc26.x86_64 does not sport this bug either.

Good to know. I was about to test that since it has more debug stuff enabled than slub_debug, and I can't tell if the BUG messages would have appeared anyway without debug stuff enabled.

Comment 15 Joachim Frieben 2016-08-11 04:45:08 UTC
(In reply to Adam Williamson from comment #12)
Kernel 4.8.0-0.rc1.git0.1 is a bad one: on my Lenovo ThinkPad T400, like previous kernels of the 4.8.0 development line, it leads to a kernel panic when shutting down the machine. Kernel 4.8.0-0.rc1.git1.1.fc25 was the first one to fix this issue. It should be included in some later TC if not in the first one.

Comment 16 Adam Williamson 2016-08-11 05:03:51 UTC
That has nothing to do with this bug. You need to file it separately if you want it considered; we cannot track two completely different issues in one bug report.

Comment 17 Joachim Frieben 2016-08-11 05:25:30 UTC
(In reply to Adam Williamson from comment #16)
Of course, but that is why I had not closed the bug report yet; issue filed as bug 1366104.