Bug 240765 - Kernel lockup in mptscsih SCSI driver
Product: Fedora
Classification: Fedora
Component: kernel-xen
Hardware: x86_64 Linux
Priority: medium Severity: high
Assigned To: Eduardo Habkost
QA Contact: Virtualization Bugs
Reported: 2007-05-21 11:15 EDT by Alec Leamas
Modified: 2009-12-14 15:38 EST (History)
Last Closed: 2008-02-26 19:13:32 EST

Attachments
Excerpts from /var/log/messages (149.11 KB, text/plain)
2007-06-05 05:54 EDT, Alec Leamas
Working grub.conf (1.71 KB, application/x-extension-conf)
2007-06-14 08:53 EDT, Alec Leamas

Description Alec Leamas 2007-05-21 11:15:30 EDT
Description of problem:

Kernel 2.6.20-1.2948-xen.fc6

How reproducible: Always

Steps to Reproduce:
1. Boot the xen kernel on my system
2. Run some commands in the GUI; the GUI freezes and is unusable from then on.
3. Log in to a text console (ctrl-alt-F3)
<The printouts below were noted by hand, so there might be some errors.>
4. Watch for the printout "mptscsih: Attempting task abort".
5. Wait a few seconds and see the printout "mptscsih: task abort SUCCEEDED".
6. The Xen kernel dumps many lines, some of which look like a stack trace.
7. Steps 4-6 repeat, with a few seconds' delay.
8. Only a power cycle restores the system.

This bug might be related to bug 208033; I am not sure. It was not present
in the initial releases of FC6, but it has been with me for some time now (I
hoped it would disappear in a later release without a bug report).

The non-Xen kernel works just fine, and has done so all along.

Hardware:
SCSI card: LSI Logic 53C1030 Dual U320 (as reported by hwbrowser)
Motherboard: Asus M2N32 WS Pro
Comment 1 Alec Leamas 2007-05-22 07:48:09 EDT
Some additional info after reading 208333 once again:
- Decreasing the transfer speed does not help
- Adding the mptscsih="width:0 factor:0x0A" option to the kernel does not help either.

# cat /proc/mpt/version 
  Fusion MPT base driver
  Fusion MPT SPI host driver

[root@hemulen]# cat /proc/mpt/summary 
ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=11
ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=10

Comment 2 Stephen Tweedie 2007-05-24 11:07:41 EDT
Can you please try with the latest kernel-xen from updates-testing?  There is a
fix in that which addresses an interrupt-handling bug which could easily result
in what you're seeing.
Comment 3 Alec Leamas 2007-05-25 04:19:23 EDT
I'm writing this reply while running the new kernel, so it's at least better
than the old one. I'll report back when I've tested a little more.

BTW, is there a GPG key for updates-testing? If so, where?

Comment 4 Alec Leamas 2007-05-25 04:55:58 EDT
Now I've tested a little more, and it seems I was just lucky a few minutes
ago ;-)

No, the kernel is not stable. The first time I tested (and wrote the previous
message) I was able to start a Windows HVM guest without problems. However,
after a few minutes the system locked up completely and I had to reboot.

I have now restored the SCSI transfer speed to 320 MB/s. After doing this,
the new kernel locked up more or less immediately after logging in. This time
it's not possible to start any text console; the system is just dead.

Sorry for not being able to provide any more precise info...

Comment 5 Eduardo Habkost 2007-06-04 08:50:47 EDT
Using the new kernel, do you get anything that looks like a stack trace or 
Oops message on the console? If it is not being written to logs, a picture of 
the screen would be useful.
Comment 6 Alec Leamas 2007-06-05 05:52:17 EDT
There was absolutely nothing on the screen; it just froze. I'll attach
excerpts from /var/log/messages.
Comment 7 Alec Leamas 2007-06-05 05:54:58 EDT
Created attachment 156196
Excerpts from /var/log/messages 

Excerpts while booting released kernel 2.6.20-1.2952-xen.fc6. Ends with a
power-cycle reboot.
Comment 8 Alec Leamas 2007-06-14 08:53:44 EDT
Created attachment 156986
Working grub.conf

Resolved! Adding the noapic boot option to the Xen kernel removes the problem.

Notes: I'm not sure whether this bug should just be closed or be reassigned
to the installation program. The root cause in my case was the grub.conf
generated by the kernel installation rpm, which didn't copy the noapic option
present in the normal kernel boot line to the Xen kernel boot line. Leaving
the bug open.

But of course, the real root cause is my limited brain.
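
Editor's note: the mismatch described above (noapic present on the normal kernel line but missing from the Xen kernel line) can be checked mechanically. The following is a minimal sketch only, assuming a GRUB legacy grub.conf in which a Xen stanza loads the hypervisor with "kernel" and the Linux kernel with a "module" line. The function names and the sample text are hypothetical illustrations; the sample is not the attached grub.conf.

```python
def linux_kernel_args(stanza_lines):
    """Return the arguments of the Linux kernel line in one GRUB stanza.

    Assumption: in a Xen stanza the Linux kernel is loaded by a `module`
    line, in a normal stanza by the `kernel` line; either way the image
    path contains "vmlinuz".
    """
    for line in stanza_lines:
        parts = line.split()
        if (len(parts) >= 2 and parts[0] in ("kernel", "module")
                and "vmlinuz" in parts[1]):
            return parts[2:]
    return []


def stanzas_missing_option(grub_conf_text, option):
    """Return the titles of stanzas whose Linux kernel line lacks `option`."""
    stanzas = []  # list of (title, [stanza lines])
    for raw in grub_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("title"):
            stanzas.append((line[len("title"):].strip(), []))
        elif stanzas:
            stanzas[-1][1].append(line)  # lines before the first title are global
    return [title for title, lines in stanzas
            if option not in linux_kernel_args(lines)]


# Hypothetical sample for illustration only (not the attached file).
SAMPLE = """\
default=0
timeout=5
title Fedora Core (2.6.20-1.2948.fc6xen)
    root (hd0,0)
    kernel /xen.gz-2.6.20-1.2948.fc6
    module /vmlinuz-2.6.20-1.2948.fc6xen ro root=LABEL=/
    module /initrd-2.6.20-1.2948.fc6xen.img
title Fedora Core (2.6.20-1.2948.fc6)
    root (hd0,0)
    kernel /vmlinuz-2.6.20-1.2948.fc6 ro root=LABEL=/ noapic
    initrd /initrd-2.6.20-1.2948.fc6.img
"""

if __name__ == "__main__":
    for title in stanzas_missing_option(SAMPLE, "noapic"):
        print("missing noapic:", title)
```

On the sample above, only the Xen stanza is reported. To check a real system, the contents of the actual grub.conf would be passed in place of SAMPLE.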
Comment 9 Red Hat Bugzilla 2007-07-24 21:41:37 EDT
change QA contact
Comment 10 Chris Lalancette 2008-02-26 19:13:32 EST
This report targets FC6, which is now end-of-life.

Please re-test against Fedora 7 or later, and if the issue persists, open a new bug.

