Bug 240765 - Kernel lockup in mptscsih SCSI driver
Summary: Kernel lockup in mptscsih SCSI driver
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel-xen
Version: 6
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Eduardo Habkost
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-05-21 15:15 UTC by Alec Leamas
Modified: 2009-12-14 20:38 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-02-27 00:13:32 UTC
Type: ---
Embargoed:


Attachments
Excerpts from /var/log/messages (149.11 KB, text/plain)
2007-06-05 09:54 UTC, Alec Leamas
Working grub.conf (1.71 KB, application/x-extension-conf)
2007-06-14 12:53 UTC, Alec Leamas

Description Alec Leamas 2007-05-21 15:15:30 UTC
Description of problem:
 208033

Kernel 2.6.20-1.2948-xen.fc6

How reproducible: Always


Steps to Reproduce:
1. Boot the xen kernel on my system.
2. Run some commands in the GUI; the GUI freezes and is unusable from then on.
3. Log in to a text console (Ctrl-Alt-F3).
<The printouts below were noted by hand, so there might be some errors.>
4. Watch for the printout "mptscsih: Attempting task abort".
5. Wait a few seconds and see the printout "mptscsih: task abort SUCCEEDED".
6. The xen kernel dumps many lines, some of which look like a stack trace.
7. Steps 4-6 repeat, with a few seconds' delay.
8. Only a power cycle restores the system.
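
To quantify the repeating abort cycle in steps 4-6, something like the following could be run over /var/log/messages. It is only a rough sketch: the message wording was noted by hand, so the pattern is deliberately loose, and the sample lines (timestamps, host name) are hypothetical rather than taken from a real log.

```python
import re

# Loose pattern: any mpt* driver prefix followed by "task abort".
# The exact kernel message text is an assumption based on hand-written notes.
ABORT_RE = re.compile(r"mpt\w*:.*task abort", re.IGNORECASE)


def count_task_aborts(lines):
    """Return (attempts, successes) counted from an iterable of log lines."""
    attempts = successes = 0
    for line in lines:
        if not ABORT_RE.search(line):
            continue
        # Treat any abort line mentioning success as a completed abort.
        if "succe" in line.lower():
            successes += 1
        else:
            attempts += 1
    return attempts, successes


# Hypothetical sample lines, modelled on the hand-noted printouts.
SAMPLE_LOG = [
    "May 21 17:00:01 host kernel: mptscsih: Attempting task abort",
    "May 21 17:00:05 host kernel: mptscsih: task abort SUCCEEDED",
    "May 21 17:00:06 host kernel: some unrelated message",
]

if __name__ == "__main__":
    print(count_task_aborts(SAMPLE_LOG))
```

On a real system the lines would come from `open("/var/log/messages")` instead of the sample list.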

This bug might be related to bug 208033, but I'm not sure. It was not present
in the initial releases of FC6, but it has been with me for some time now (I hoped
it would disappear in a later release without a bug report).

The non-xen kernel works just fine, and has done so all along.

Hardware:
SCSI card: LSI Logic 53C1030 Dual U320 (as reported by hwbrowser)
Motherboard: Asus M2n32 Ws Pro

Comment 1 Alec Leamas 2007-05-22 11:48:09 UTC
Some additional info after reading bug 208033 once again:
- Decreasing the transfer speed does not help.
- Nor does adding the mptscsih="width:0 factor:0x0A" option to the kernel.

# cat /proc/mpt/version 
mptlinux-3.04.03
  Fusion MPT base driver
  Fusion MPT SPI host driver

[root@hemulen]# cat /proc/mpt/summary 
ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=11
ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=10
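
Since the IRQ assignment turns out to matter for this kind of lockup, a small sketch for pulling the per-controller fields out of the summary output above may be handy. The function name `parse_mpt_summary` is made up for illustration, and the field layout is assumed from the two lines shown here, not from driver documentation.

```python
def parse_mpt_summary(text):
    """Parse /proc/mpt/summary-style text into {controller: {field: value}}."""
    controllers = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        name = name.strip()
        if not sep or not name.startswith("ioc"):
            continue  # skip blank or unrelated lines
        fields = [f.strip() for f in rest.split(",")]
        info = {"model": fields[0]}  # first field is the chip name
        for field in fields[1:]:
            key, eq, value = field.partition("=")
            if eq:
                info[key] = value
        controllers[name] = info
    return controllers


# The two lines reported in this comment.
SUMMARY = """\
ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=11
ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=10
"""

if __name__ == "__main__":
    for name, info in parse_mpt_summary(SUMMARY).items():
        print(name, info["model"], "IRQ", info["IRQ"])
```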



Comment 2 Stephen Tweedie 2007-05-24 15:07:41 UTC
Can you please try with the latest kernel-xen from updates-testing?  There is a
fix in that which addresses an interrupt-handling bug which could easily result
in what you're seeing.


Comment 3 Alec Leamas 2007-05-25 08:19:23 UTC
I'm writing this answer while running the new kernel, so it's at least better than the old
one. I'll come back when I've tried a little more.

BTW, is there a GPG key for updates-testing? If so, where?

--alec

Comment 4 Alec Leamas 2007-05-25 08:55:58 UTC
Now I've tested a little more, and it turns out I was not so lucky after all
;-)

No, the kernel is not stable. The first time I tested (while writing the previous
message) I was able to start a Windows HVM guest without problems. However, after a few
minutes the system locked up completely and I had to reboot.

I have now restored the SCSI transfer speed to 320 MB/s. After doing this,
the new kernel locked up more or less immediately after logging in. This time
it's not even possible to switch to a text console; the system is just dead.

Sorry for not being able to provide any more precise info...

--alec

Comment 5 Eduardo Habkost 2007-06-04 12:50:47 UTC
Using the new kernel, do you get anything that looks like a stack trace or 
Oops message on the console? If it is not being written to logs, a picture of 
the screen would be useful.

Comment 6 Alec Leamas 2007-06-05 09:52:17 UTC
There was absolutely nothing on the screen; it just froze. I'll attach
/var/log/messages.

Comment 7 Alec Leamas 2007-06-05 09:54:58 UTC
Created attachment 156196
Excerpts from /var/log/messages 

Excerpts while booting released kernel 2.6.20-1.2952-xen.fc6. Ends with a
power-cycle reboot.

Comment 8 Alec Leamas 2007-06-14 12:53:44 UTC
Created attachment 156986
Working grub.conf

Resolved! Adding the noapic boot option to the xen kernel removes the problem.

Notes: I'm not sure whether the bug should just be closed or somehow reassigned to the
installation program. The root cause in my case was the grub.conf generated by
the kernel installation rpm, which didn't copy the noapic option present in the
normal kernel boot line to the xen kernel boot line. Leaving the bug open.

But of course, the real root cause is my limited brain.
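
For anyone wanting to check their own setup for the same omission, here is a minimal sketch that lists the grub.conf boot entries in which no kernel or module line carries a given option. The helper name `stanzas_missing_option` and the sample entries are hypothetical; the attached grub.conf is not reproduced here, and the sketch does not decide whether the option belongs on the hypervisor line or the dom0 kernel line.

```python
def stanzas_missing_option(grub_conf_text, option="noapic"):
    """Return titles of boot entries where no kernel/module line has `option`."""
    missing = []
    title = None
    found = False
    for raw in grub_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        keyword = parts[0]
        if keyword == "title":
            # Close the previous entry before starting a new one.
            if title is not None and not found:
                missing.append(title)
            title = line[len("title"):].strip()
            found = False
        elif keyword in ("kernel", "module") and title is not None:
            if option in parts[1:]:
                found = True
    if title is not None and not found:
        missing.append(title)
    return missing


# Hypothetical grub.conf content, for illustration only.
SAMPLE_CONF = """\
default=0
timeout=5
title Fedora Core (xen, hypothetical entry)
        root (hd0,0)
        kernel /xen.gz
        module /vmlinuz-xen ro root=/dev/sda1
        module /initrd-xen.img
title Fedora Core (non-xen, hypothetical entry)
        root (hd0,0)
        kernel /vmlinuz ro root=/dev/sda1 noapic
        initrd /initrd.img
"""

if __name__ == "__main__":
    for entry in stanzas_missing_option(SAMPLE_CONF):
        print("noapic missing in:", entry)
```

Run against the real file with `stanzas_missing_option(open("/boot/grub/grub.conf").read())`; the path is the usual Fedora location but may differ on other setups.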

Comment 9 Red Hat Bugzilla 2007-07-25 01:41:37 UTC
change QA contact

Comment 10 Chris Lalancette 2008-02-27 00:13:32 UTC
This report targets FC6, which is now end-of-life.

Please re-test against Fedora 7 or later, and if the issue persists, open a new bug.

Thanks


