Bug 795992 - Install DVD won't boot, displays messages: rcu_sched detected stalls on CPUs/tasks
Summary: Install DVD won't boot, displays messages: rcu_sched detected stalls on CPUs/...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 17
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-02-22 00:19 UTC by Tim Flink
Modified: 2012-03-17 21:37 UTC

Fixed In Version: kernel-3.3.0-0.rc4.git1.4.fc17
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-02-28 10:56:03 UTC
Type: ---


Attachments
console log of failed boot attempt with f17 alpha RC3 DVD (447.46 KB, text/x-log)
2012-02-22 00:19 UTC, Tim Flink
console log of boot attempt with updated kernel (69.93 KB, text/plain)
2012-02-22 17:46 UTC, Tim Flink

Description Tim Flink 2012-02-22 00:19:29 UTC
Created attachment 564804 [details]
console log of failed boot attempt with f17 alpha RC3 DVD

When I boot the F17 Alpha RC3 or RC4 DVD on my machine, it periodically prints an error message for every core on my CPU (Intel Core 2 Quad Q6600). I tried leaving it alone for at least 30 minutes, but the boot process never finished.

The error messages that I'm seeing (1 for each core) are similar to:

NMI backtrace for cpu 1
CPU 1 
Modules linked in:

Pid: 0, comm: swapper/1 Tainted: G          I  3.3.0-0.rc3.git7.2.fc17.x86_64 #1 Hewlett-Packard HP xw4600 Workstation/0AA0h
RIP: 0010:[<ffffffff81043ca6>]  [<ffffffff81043ca6>] native_safe_halt+0x6/0x10
RSP: 0018:ffff880118cade18  EFLAGS: 00000206
RAX: ffff880118ca2680 RBX: ffff8801100e2770 RCX: 0000000225c17d03
RDX: ffff880118ca2680 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff880118cade18 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff8801100e2520
R13: 0000000000000001 R14: ffff8801100e2540 R15: 127488014ef52db3
FS:  0000000000000000(0000) GS:ffff88011b000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001c05000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/1 (pid: 0, threadinfo ffff880118cac000, task ffff880118ca2680)
Stack:
 ffff880118cade28 ffffffff813b1061 ffff880118cade38 ffffffff813b109a
 ffff880118cade98 ffffffff813b1111 0000000000000003 000000003b91ebf1
 0000000000000003 000000003b91ebf1 0000000000000000 ffff8801100e2540
Call Trace:
 [<ffffffff813b1061>] acpi_safe_halt+0x2f/0x4d
 [<ffffffff813b109a>] acpi_idle_do_entry+0x1b/0x2b
 [<ffffffff813b1111>] acpi_idle_enter_c1+0x67/0xc9
 [<ffffffff81519c53>] cpuidle_idle_call+0xb3/0x540
 [<ffffffff8101821f>] cpu_idle+0xbf/0x130 
 [<ffffffff8168a5f9>] start_secondary+0x290/0x292
Code: 00 00 00 00 00 55 48 89 e5 fa 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb 5d c3 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 fb f4 <5d> c3 0f 1f 84 00 00 00 00 00 55 48 89 e5 f4 5d c3 66 0f 1f 84 
Call Trace:
 [<ffffffff813b1061>] acpi_safe_halt+0x2f/0x4d
 [<ffffffff813b109a>] acpi_idle_do_entry+0x1b/0x2b
 [<ffffffff813b1111>] acpi_idle_enter_c1+0x67/0xc9
 [<ffffffff81519c53>] cpuidle_idle_call+0xb3/0x540
 [<ffffffff8101821f>] cpu_idle+0xbf/0x130
 [<ffffffff8168a5f9>] start_secondary+0x290/0x292

I've attached the boot log that I grabbed from the serial console with args:
initrd=initrd.img root=live:CDLABEL=Fedora\x2017-Alpha\x20x86_64 rd.luks=0 rd.md=0 rd.dm=0 rd.debug console=tty0 console=ttyS0,38400n8 BOOT_IMAGE=vmlinuz
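For readers unfamiliar with the boot arguments above, here is a minimal Python sketch that splits the command line into parameters and decodes the escaped DVD label. The helper names parse_cmdline and decode_label are made up for illustration, and treating \xNN as a udev-style hex escape is an assumption, not something stated in this report.

```python
import re


def parse_cmdline(cmdline: str) -> dict:
    """Map each kernel parameter to the list of values it was given.

    Bare flags such as rd.debug get the value None. Repeated parameters
    (console= appears twice above) keep every value, in order.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params.setdefault(key, []).append(value if sep else None)
    return params


def decode_label(value: str) -> str:
    r"""Replace \xNN escapes with the character they encode.

    Assumption: the escapes are udev-style, exactly two hex digits each.
    """
    return re.sub(
        r"\\x([0-9a-fA-F]{2})",
        lambda m: chr(int(m.group(1), 16)),
        value,
    )


# The boot arguments quoted in the description, as a raw string.
CMDLINE = (
    r"initrd=initrd.img root=live:CDLABEL=Fedora\x2017-Alpha\x20x86_64 "
    r"rd.luks=0 rd.md=0 rd.dm=0 rd.debug console=tty0 console=ttyS0,38400n8 "
    r"BOOT_IMAGE=vmlinuz"
)

params = parse_cmdline(CMDLINE)
label = decode_label(params["root"][0].split("CDLABEL=", 1)[1])

print(params["console"])  # both consoles: the local tty and the serial port
print(label)
```

The two console= entries are what send kernel output to both the local display and the serial line (ttyS0 at 38400 baud, no parity, 8 data bits) from which the attached log was captured.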

Comment 1 Josh Boyer 2012-02-22 01:29:45 UTC
That's, erm... cute.

We had a report of massive slowness for some people in bug 795050.  Because of that report I dropped an RCU-related patch, and that patch might be what is causing this too.  I will admit that is just a slightly educated guess, but if you could try:

http://koji.fedoraproject.org/koji/taskinfo?taskID=3809140

when it completes we'll know for sure.

Comment 2 Tim Flink 2012-02-22 17:46:51 UTC
Created attachment 565056 [details]
console log of boot attempt with updated kernel

I built a custom boot.iso with the kernel mentioned in comment #1.

I see the same symptoms on boot with a slightly different stack trace. I wonder if the following warning is at all related:

------------[ cut here ]------------
WARNING: at drivers/iommu/dmar.c:492 warn_invalid_dmar+0x92/0xa0()
Hardware name: HP xw4600 Workstation
Your BIOS is broken; DMAR reported at address fed90000 returns all ones!
BIOS vendor: Hewlett-Packard; Ver: 786F3 v01.22; Product Version:  
Modules linked in:
Pid: 0, comm: swapper Not tainted 3.3.0-0.rc3.git7.2.fc17.x86_64 #1
Call Trace:
 [<ffffffff81060bef>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff81060c8f>] warn_slowpath_fmt_taint+0x3f/0x50
 [<ffffffff81044119>] ? native_flush_tlb_single+0x9/0x10
 [<ffffffff81f0aa98>] ? __early_set_fixmap+0x99/0xa0
 [<ffffffff815390f2>] warn_invalid_dmar+0x92/0xa0
 [<ffffffff81f34f9f>] check_zero_address+0xc8/0xf7
 [<ffffffff816ab0df>] ? bad_to_user+0x7f9/0x7f9
 [<ffffffff81f34fe5>] detect_intel_iommu+0x17/0xb9
 [<ffffffff81efe068>] pci_iommu_alloc+0x4a/0x73
 [<ffffffff81f0a857>] mem_init+0x19/0xed
 [<ffffffff816900b5>] ? set_nmi_gate+0x48/0x4a
 [<ffffffff81ef6a3a>] start_kernel+0x1f4/0x407
 [<ffffffff81ef6346>] x86_64_start_reservations+0x131/0x135
 [<ffffffff81ef644a>] x86_64_start_kernel+0x100/0x10f
---[ end trace a7919e7f17c0a725 ]---

Comment 3 Josh Boyer 2012-02-22 18:05:03 UTC
(In reply to comment #2)
> Created attachment 565056 [details]
> console log of boot attempt with updated kernel
> 
> I built a custom boot.iso with the kernel mentioned in comment#1

Erm... I think whatever you did went wrong.  3.3.0-rc3.git7.2.fc17.x86_64 is the kernel you originally had issues with.  The kernel I built in comment #1 is 3.3.0-rc4.git1.4

> I see the same symptoms on boot with a slightly different stack trace. I wonder
> if the following warning is at all related:
> 
> ------------[ cut here ]------------
> WARNING: at drivers/iommu/dmar.c:492 warn_invalid_dmar+0x92/0xa0()
> Hardware name: HP xw4600 Workstation
> Your BIOS is broken; DMAR reported at address fed90000 returns all ones!
> BIOS vendor: Hewlett-Packard; Ver: 786F3 v01.22; Product Version:  
> Modules linked in:

Broken BIOSes are usually pretty crappy.  Look for an update or boot with iommu=off

Comment 4 Tim Flink 2012-02-22 18:46:50 UTC
(In reply to comment #3)
> Erm... I think whatever you did went wrong.  3.3.0-rc3.git7.2.fc17.x86_64 is
> the kernel you originally had issues with.  The kernel I built in comment #1 is 3.3.0-rc4.git1.4

Crap, I didn't notice that. One of these days, I'm going to fix this iso building script to quit when it can't find the updated builds I want.

Will retry, verifying the presence of the updated kernel this time.
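One way to do that verification is to read the kernel release out of the "Pid:" lines in the console log and compare it with the build that was supposed to be on the image. The following is a minimal Python sketch under that assumption. The function names are hypothetical and this is not the ISO building script mentioned above.

```python
import re

# A kernel release followed by the build number, as printed on "Pid:" lines,
# e.g. "... Not tainted 3.3.0-0.rc3.git7.2.fc17.x86_64 #1".
RELEASE_RE = re.compile(r"(\d+\.\d+\.\d+\S*) #\d+")


def kernel_releases(log_text: str) -> set:
    """Collect every kernel release reported on a 'Pid:' line of the log."""
    releases = set()
    for line in log_text.splitlines():
        if line.lstrip().startswith("Pid:"):
            match = RELEASE_RE.search(line)
            if match:
                releases.add(match.group(1))
    return releases


def has_expected_kernel(log_text: str, expected_prefix: str) -> bool:
    """True only if the log names at least one release and all of them
    start with the expected version-release string."""
    releases = kernel_releases(log_text)
    return bool(releases) and all(r.startswith(expected_prefix) for r in releases)


# Two lines taken from the logs quoted in this report.
SAMPLE_LOG = (
    "Pid: 0, comm: swapper/1 Tainted: G          I  "
    "3.3.0-0.rc3.git7.2.fc17.x86_64 #1 Hewlett-Packard HP xw4600 Workstation/0AA0h\n"
    "Pid: 0, comm: swapper Not tainted 3.3.0-0.rc3.git7.2.fc17.x86_64 #1\n"
)

if not has_expected_kernel(SAMPLE_LOG, "3.3.0-0.rc4.git1.4.fc17"):
    print("log was produced by:", sorted(kernel_releases(SAMPLE_LOG)))
```

Run against the log from comment #2, a check like this would have flagged that the image still carried the rc3.git7.2 kernel instead of the rc4.git1.4 build.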

Comment 5 Tim Flink 2012-02-22 19:35:14 UTC
OK, I built another custom boot.iso using the right kernel this time.

I am now able to boot into the installer without issue. The new kernel appears to have fixed the problem I was seeing.

Comment 6 Josh Boyer 2012-02-22 19:53:41 UTC
(In reply to comment #5)
> OK, I built another custom boot.iso using the right kernel this time.
> 
> I am now able to boot into the installer without issue. The new kernel appears
> to have fixed the problem I was seeing.

Thanks Tim.  I'll get this queued up as an update today.

Comment 7 Fedora Update System 2012-02-22 19:58:20 UTC
kernel-3.3.0-0.rc4.git1.4.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kernel-3.3.0-0.rc4.git1.4.fc17

Comment 8 Fedora Update System 2012-02-23 22:31:13 UTC
Package kernel-3.3.0-0.rc4.git1.4.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.3.0-0.rc4.git1.4.fc17'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-2304/kernel-3.3.0-0.rc4.git1.4.fc17
then log in and leave karma (feedback).

Comment 9 Fedora Update System 2012-02-28 10:56:03 UTC
kernel-3.3.0-0.rc4.git1.4.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.
