Bug 242639

Summary: Fedora 7 Dual Core CPU - 1 core soft lockup bug on Boot Intel 965 chipset
Product: [Fedora] Fedora Reporter: Matt Darcy <matt>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: low    
Version: 7 CC: chris.brown, tglx
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: 2.6.23.9-85.fc8 Doc Type: Bug Fix
Last Closed: 2008-01-04 00:59:16 UTC Type: ---
Attachments:
Description Flags
Dmesg output none

Description Matt Darcy 2007-06-05 08:50:45 UTC
Description of problem:
Booting the default installed kernel of Fedora 7 (2.6.21-1.3194) with the
following boot options:
irqpoll agp=off rhgb quiet

I get the following error on boot:

BUG: soft lockup detected on CPU#0!

The boot process resumes after a wait of approximately 4 to 5 minutes and in all
other respects appears to complete normally.

Once in Fedora I can see CPU0 and appear to be able to use it, so I assume the
lockup has cleared by then.

Version-Release number of selected component (if applicable):
Fedora 7 - kernel 2.6.21-1.3194

How reproducible:
Every Time


Steps to Reproduce:
1. Purchase a board with the Intel 965 chipset (in my case an MSI 965 Neo)
2. Fit an Intel E4400 Core 2 Duo CPU and 4 GB of RAM in 2 x 2GB modules
3. Install Fedora 7 (with the workaround for the ICH8 chipset)
4. Add boot options to the kernel for compatibility
5. Boot the kernel and wait for the error message
  
Actual results:
Boots with CPU soft lockup and performance degradation warnings


Expected results:
Normal Boot

Additional info:

Within Dmesg I see lots of warnings and unusual behaviour.

<snip>
BUG: soft lockup detected on CPU#0!

Call Trace:
 <IRQ>  [<ffffffff802ad31e>] softlockup_tick+0xd5/0xe7
 [<ffffffff8028ac35>] update_process_times+0x42/0x68
 [<ffffffff8026e710>] smp_local_timer_interrupt+0x34/0x55
 [<ffffffff8026ee36>] smp_apic_timer_interrupt+0x43/0x5b
 [<ffffffff80257d56>] apic_timer_interrupt+0x66/0x70
 [<ffffffff8020f926>] handle_IRQ_event+0x1a/0x53
 [<ffffffff802aebd2>] handle_edge_irq+0xe4/0x128
 [<ffffffff80265498>] do_IRQ+0xf1/0x15f
 [<ffffffff80257631>] ret_from_intr+0x0/0xa
 <EOI>
</snip>

and even though I'm booting with the "irqpoll" option I still see this warning

irq 19: nobody cared (try booting with the "irqpoll" option)
<snip>

Call Trace:
 <IRQ>  [<ffffffff802adfe6>] __report_bad_irq+0x30/0x72
 [<ffffffff802ae1f5>] note_interrupt+0x1cd/0x20e
 [<ffffffff802aeac6>] handle_fasteoi_irq+0xa9/0xd1
 [<ffffffff80265498>] do_IRQ+0xf1/0x15f
 [<ffffffff80257631>] ret_from_intr+0x0/0xa
 [<ffffffff8020f926>] handle_IRQ_event+0x1a/0x53
 [<ffffffff802aebd2>] handle_edge_irq+0xe4/0x128
 [<ffffffff80265498>] do_IRQ+0xf1/0x15f
 [<ffffffff80257631>] ret_from_intr+0x0/0xa
 <EOI>
handlers:
[<ffffffff803b3b8d>] (usb_hcd_irq+0x0/0x52)
Disabling IRQ #19

</snip>


and 

<snip>
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
BUG: warning at kernel/softirq.c:138/local_bh_enable() (Not tainted)

Call Trace:
 [<ffffffff80229e7b>] local_bh_enable+0x42/0x98
 [<ffffffff8025c008>] cond_resched_softirq+0x35/0x4b
 [<ffffffff8022e9f5>] release_sock+0x59/0xaa
 [<ffffffff802252a6>] tcp_sendmsg+0x9ae/0xab8
 [<ffffffff8025d937>] _spin_lock_bh+0x9/0x19
 [<ffffffff8024297c>] sock_aio_write+0x110/0x128
 [<ffffffff80218c67>] vsnprintf+0x55f/0x5a3
 [<ffffffff8024286c>] sock_aio_write+0x0/0x128
 [<ffffffff802cd13d>] do_sync_readv_writev+0xc0/0x107
 [<ffffffff80293107>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8020c716>] do_sync_read+0xc9/0x10c
 [<ffffffff802cd00d>] rw_copy_check_uvector+0x6c/0xdc
 [<ffffffff802cd5ae>] do_readv_writev+0xd7/0x1ae
 [<ffffffff803e83b6>] move_addr_to_user+0x5d/0x78
 [<ffffffff802cd70c>] sys_writev+0x45/0x93
 [<ffffffff8025711e>] system_call+0x7e/0x83


</snip>

<snip>
Kernel command line: ro root=/dev/md1 irqpoll agp=off rhgb quiet
Misrouted IRQ fixup and polling support enabled
This may significantly impact system performance
</snip>
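
To confirm which options the kernel actually received, the command line can be split into flags and key=value pairs. The following is a minimal Python sketch; the helper name parse_cmdline is hypothetical, and it assumes no option value contains quoted spaces (true for the command line in this report).

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {option: value}; bare flags map to None."""
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


# Command line copied from the dmesg excerpt in this report.
reported = "ro root=/dev/md1 irqpoll agp=off rhgb quiet"
opts = parse_cmdline(reported)

print("irqpoll requested:", "irqpoll" in opts)
print("agp setting:", opts.get("agp"))
```

On a running system the same function could be fed the contents of /proc/cmdline instead of the pasted string.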

I will attach the dmesg output as a file.
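
The call traces quoted above use the kernel's usual "[<address>] symbol+0xoffset/0xsize" layout, so they can be reduced to a list of function names for easier comparison between traces. This is only a rough Python sketch: the regular expression and the helper name parse_frames are assumptions made for illustration, and it has only been checked against the frame format shown in this report.

```python
import re

# One frame looks like: [<ffffffff802ad31e>] softlockup_tick+0xd5/0xe7
# The optional "(" covers the "handlers:" line, where the symbol is parenthesised.
FRAME = re.compile(r"\[<([0-9a-f]+)>\]\s+\(?(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frames(trace):
    """Return (symbol, address, offset, size) tuples for every frame found in trace."""
    frames = []
    for match in FRAME.finditer(trace):
        address, symbol, offset, size = match.groups()
        frames.append((symbol, int(address, 16), int(offset, 16), int(size, 16)))
    return frames


# First three frames of the soft lockup trace from this report.
sample = """ <IRQ>  [<ffffffff802ad31e>] softlockup_tick+0xd5/0xe7
 [<ffffffff8028ac35>] update_process_times+0x42/0x68
 [<ffffffff8026e710>] smp_local_timer_interrupt+0x34/0x55
"""

for symbol, address, offset, size in parse_frames(sample):
    print(f"{symbol:30s} offset {offset:#x} of {size:#x}")
```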

Comment 1 Matt Darcy 2007-06-05 08:50:46 UTC
Created attachment 156188 [details]
Dmesg output

Comment 2 Matt Darcy 2007-06-07 10:35:14 UTC
Looking at this output again, this all appears to be related to IRQ handling.
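
One way to check that impression is to count the IRQ-related messages in the attached dmesg file. The sketch below is a hedged Python example: the pattern names and the scan_dmesg helper are hypothetical, and the patterns only cover the three message types quoted in this report.

```python
import re

# Message formats taken from the excerpts in this report.
PATTERNS = {
    "soft_lockup": re.compile(r"BUG: soft lockup detected on CPU#(\d+)!"),
    "irq_nobody_cared": re.compile(r"irq (\d+): nobody cared"),
    "irq_disabled": re.compile(r"Disabling IRQ #(\d+)"),
}


def scan_dmesg(text):
    """Collect the CPU or IRQ number from every matching line, grouped by message type."""
    hits = {name: [] for name in PATTERNS}
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                hits[name].append(int(match.group(1)))
    return hits


# Lines copied from the description above.
sample = """BUG: soft lockup detected on CPU#0!
irq 19: nobody cared (try booting with the "irqpoll" option)
Disabling IRQ #19
"""

print(scan_dmesg(sample))
```

To run it against a saved log, read the file into a string and pass that to scan_dmesg in place of the sample text.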

Comment 3 Christopher Brown 2007-09-13 21:07:04 UTC
Hello Matt,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try and assist you in resolving it if I can.

There hasn't been much activity on this bug for a while. Could you tell me if
you are still having problems with the latest kernel?

If the problem no longer exists then please close this bug or I'll do so in a
few days if there is no additional information lodged.

Cheers
Chris

Comment 4 Matt Darcy 2008-01-04 00:39:33 UTC
The problem has been resolved by upgrading to Fedora 8.