Red Hat Bugzilla – Bug 240765
Kernel lockup in mptscsih SCSI driver
Last modified: 2009-12-14 15:38:11 EST
Description of problem:
How reproducible: Always
Steps to Reproduce:
1. Boot the xen kernel on my system
2. Run some commands in the GUI; the GUI freezes and is unusable from then on.
3. Log in to a text console (ctrl-alt-F3)
<The printouts below were noted by hand, so there might be some errors.>
4. Watch a printout: mptscsih: Attempting task abort
5. Wait a few seconds, see printout: mptscsih: task abort SUCCEEDED.
6. Xen kernel dumps lots of lines; some look like a stack trace?!
7. Steps 4-6 repeat, with a few seconds' delay.
8. A power cycle (nothing else) restores the system.
This bug might be related to bug 208033; I'm not sure. It was not present
in the initial releases of FC6, but it has been with me for some time now (I hoped
it would disappear in a later release without a bug report).
The non-xen kernel works just fine, and has done so all along.
Hw: SCSI card: LSI Logic 53C1030 Dual U320 (as reported by hwbrowser)
Mobo: Asus M2N32 WS Pro
Some additional info after reading 208333 once again:
- Decreasing the transfer speed does not help
- Nor does adding the mptscsih="width:0 factor:0x0A" option to the kernel.
# cat /proc/mpt/version
Fusion MPT base driver
Fusion MPT SPI host driver
[root@hemulen]# cat /proc/mpt/summary
ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=11
ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=10
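To compare the controller details above between the xen and non-xen kernels (for example the IRQ each ioc is assigned), the summary lines can be parsed with a few lines of Python. This is only a sketch: it assumes the "iocN: chip, key=value, ..." layout shown in this report, and parse_mpt_summary and REPORTED_SUMMARY are names made up for illustration, not part of the driver.

```python
import re

# Assumed layout, taken from the two lines quoted in this report:
#   iocN: <chip>, Key=Value, Key=Value, ...
SUMMARY_LINE = re.compile(r"^(?P<ioc>ioc\d+):\s*(?P<chip>[^,]+),\s*(?P<fields>.*)$")


def parse_mpt_summary(text):
    """Map each controller name (e.g. 'ioc0') to a dict of its fields."""
    controllers = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if not match:
            continue  # skip prompts, blank lines, anything unexpected
        info = {"chip": match.group("chip").strip()}
        for field in match.group("fields").split(","):
            key, sep, value = field.strip().partition("=")
            if sep:  # keep only well-formed Key=Value pairs
                info[key] = value
        controllers[match.group("ioc")] = info
    return controllers


# The two summary lines exactly as quoted in this report.
REPORTED_SUMMARY = """\
ioc0: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=11
ioc1: LSI53C1030, FwRev=01032700h, Ports=1, MaxQ=222, IRQ=10
"""

if __name__ == "__main__":
    for name, info in sorted(parse_mpt_summary(REPORTED_SUMMARY).items()):
        print(name, info["chip"], "IRQ", info.get("IRQ"))
```

Running the same script on the output captured under each kernel would show whether the IRQ assignments differ between them.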
Can you please try with the latest kernel-xen from updates-testing? There is a
fix in that which addresses an interrupt-handling bug which could easily result
in what you're seeing.
I'm writing this answer while running the new kernel, so it's at least better than
the old one. I'll report back when I've tested a little more.
BTW, is there a GPG key for updates-testing? If so, where?
Now I've tested a little more, and I shouldn't have been so lucky a few minutes ago.
No, the kernel is not stable. The first time I tested (and wrote the previous
message) I was able to start a Windows HVM guest OK. However, after a few
minutes the system locked up completely, totally dead, and I had to reboot.
I have now restored the SCSI transfer speed to 320 MB/s. After doing this,
the new kernel locked up more or less immediately after logging in. This time
it's not possible to start any text console; it's just dead.
Sorry for not being able to provide any more precise info...
Using the new kernel, do you get anything that looks like a stack trace or
Oops message on the console? If it is not being written to logs, a picture of
the screen would be useful.
There was absolutely nothing on the screen; it just froze. I'll attach
Created attachment 156196
Excerpts from /var/log/messages
Excerpts while booting released kernel 2.6.20-1.2952-xen.fc6. Ends with a
Created attachment 156986
Resolved! Adding the noapic boot option to the xen kernel removes the problem.
Notes: I'm not sure whether this bug should just be closed or somehow be passed
on to the installation program. The root cause in my case was the grub.conf
generated by the kernel installation rpm, which didn't copy the noapic option
present in the normal kernel boot line to the xen kernel boot line. Leaving the
bug open.
But of course, the real root cause is my limited brain.
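For anyone hitting the same thing, a check along these lines can flag boot entries that lack an option present on the others. It is a rough sketch, not a tested tool: it assumes GRUB legacy syntax (stanzas starting with "title", boot arguments on "kernel" or "module" lines), it does not distinguish the hypervisor line from the dom0 kernel line, and the stanzas in HYPOTHETICAL_GRUB_CONF are invented for illustration, not copied from the affected system.

```python
def stanzas_missing_option(grub_conf_text, option="noapic"):
    """Return the titles of boot stanzas in which no kernel/module line
    carries `option` as a separate word."""
    missing = []
    title = None        # title of the stanza currently being read
    has_option = False  # whether that stanza has the option so far

    for raw in grub_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        keyword, args = parts[0], parts[1:]
        if keyword == "title":
            # A new stanza begins: record the verdict on the previous one.
            if title is not None and not has_option:
                missing.append(title)
            title = " ".join(args)
            has_option = False
        elif keyword in ("kernel", "module") and title is not None:
            if option in args:
                has_option = True

    # Record the verdict on the final stanza.
    if title is not None and not has_option:
        missing.append(title)
    return missing


# Invented example: the xen entry lacks noapic, the normal entry has it.
HYPOTHETICAL_GRUB_CONF = """\
default=0
timeout=5
title Fedora Core (hypothetical-xen)
    root (hd0,0)
    kernel /xen.gz-hypothetical
    module /vmlinuz-hypothetical-xen ro root=/dev/sda1
    module /initrd-hypothetical-xen.img
title Fedora Core (hypothetical)
    root (hd0,0)
    kernel /vmlinuz-hypothetical ro root=/dev/sda1 noapic
    initrd /initrd-hypothetical.img
"""

if __name__ == "__main__":
    for entry in stanzas_missing_option(HYPOTHETICAL_GRUB_CONF):
        print("missing noapic:", entry)
```

On a real system the text would be read from the grub.conf file rather than from the embedded string.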
change QA contact
This report targets FC6, which is now end-of-life.
Please re-test against Fedora 7 or later, and if the issue persists, open a new bug.