Bug 42391 - (SCSI AIC7XXX)2.4.3-7 Oops - aic7xxx_mod.o
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Doug Ledford
QA Contact: Brock Organ
 
Reported: 2001-05-26 23:22 UTC by Sam Varshavchik
Modified: 2008-08-01 16:22 UTC

Doc Type: Bug Fix
Last Closed: 2004-09-30 15:39:01 UTC


Attachments
Oops - manually copied by hand. (3.08 KB, text/plain)
2001-05-26 23:23 UTC, Sam Varshavchik
lspci output. (2.91 KB, text/plain)
2001-05-26 23:24 UTC, Sam Varshavchik
Another chipset that OOPses aic7xxx_mod.o (2.61 KB, text/plain)
2001-06-25 02:11 UTC, Sam Varshavchik
I managed to capture this oops on a serial console, and thus get a good ksymoops run. This is aic7xxx_mod in 2.4.7-10smp. (7.68 KB, text/plain)
2001-09-21 03:25 UTC, Sam Varshavchik
The entire 2.4.7-10smp boot sequence, ending with aic7xxx_mod oopsing when kudzu runs. (12.18 KB, text/plain)
2001-09-21 03:27 UTC, Sam Varshavchik

Description Sam Varshavchik 2001-05-26 23:22:47 UTC
I always get an oops when booting 2.4.3-7, both SMP and UP.  The boot hangs
for a few seconds at "Updating /etc/fstab", then oopses.  One time I
received the message "Kernel panic: HOST_MSG_LOOP with invalid SCB bf
In interrupt handler - not syncing", without an oops.

This happens only with the new aic7xxx_mod.o.  I can boot 2.4.3-7 successfully
by using the old aic7xxx.o instead of aic7xxx_mod.o.

Comment 1 Sam Varshavchik 2001-05-26 23:23:49 UTC
Created attachment 19731
Oops - manually copied by hand.

Comment 2 Sam Varshavchik 2001-05-26 23:24:54 UTC
Created attachment 19732
lspci output.

Comment 3 Arjan van de Ven 2001-06-06 09:34:34 UTC
Interesting.. the oops suggests USB is to blame, but WTF does switching 
aic7xxx drivers help :)



Comment 4 Sam Varshavchik 2001-06-06 11:47:50 UTC
This oops is from the SMP kernel, and it oopsed while kudzu was running.
Perhaps there's a race issue that's being hit - there are some calls to aic7xxx
in the stack backtrace.

Comment 5 Pete Zaitcev 2001-06-06 16:33:45 UTC
I would like to see a result of a small experiment:
rename usb-uhci.o to something that prevents it
from loading.

Also... I strongly suggest using a serial console
whenever possible. Typing all of the oops trace
is not an easy job, care for your fingers!


Comment 6 Sam Varshavchik 2001-06-09 00:40:08 UTC
No dice.  aic7xxx_mod.o still oopses even if usb-uhci is not loaded.

I'm going to try to boot up the latest kernel from rawhide...



Comment 7 Sam Varshavchik 2001-06-10 18:20:00 UTC
aic7xxx_mod.o still OOPSes in 2.4.5-0.2.9

aic7xxx.o still works without any problems

aic7xxx_mod always oopses at "updating /etc/fstab", which is when kudzu runs.



Comment 8 Sam Varshavchik 2001-06-25 02:10:36 UTC
Still oopses in 2.4.3-12

The kernel hangs for about 15 seconds before it oopses.  There's a SCSI bus
reset about 5 seconds before the oops.

Looks like what's happening is that something that kudzu's frobbing is causing
aic7xxx_mod.o to do a SCSI bus reset, then oops.

I can also reproduce this on another machine, with slightly different hardware.





Comment 9 Sam Varshavchik 2001-06-25 02:11:19 UTC
Created attachment 21706
Another chipset that OOPses aic7xxx_mod.o

Comment 10 Arjan van de Ven 2001-06-25 09:18:06 UTC
Seems it's a good thing that this driver isn't used as default, and judging
by reports on linux-kernel about 2.4.6preLatest, it still isn't stable enough.

Comment 11 Doug Ledford 2001-08-02 17:31:41 UTC
Adding Justin Gibbs to the Cc: list so that he can comment on the problem since
it's his driver.

Comment 12 Sam Varshavchik 2001-08-05 04:13:58 UTC
Verified that the oops still exists in 2.4.7-0.3

Except that after the oops, the kernel is now printing:

SCSI bus is being reset for host 1 channel 0
SCSI host 1 abort (pid 0) timed out - resetting
SCSI bus is being reset for host 1 channel 0
SCSI host 1 channel 0 reset (pid 0) timed out - trying harder

These messages keep being printed to the console in an infinite loop.



Comment 13 Sam Varshavchik 2001-09-21 03:25:42 UTC
Created attachment 32318
I managed to capture this oops on a serial console, and thus get a good ksymoops run.  This is aic7xxx_mod in 2.4.7-10smp.

Comment 14 Sam Varshavchik 2001-09-21 03:27:43 UTC
Created attachment 32319
The entire 2.4.7-10smp boot sequence, ending with aic7xxx_mod oopsing when kudzu runs.

Comment 15 Justin T. Gibbs 2001-09-21 20:17:57 UTC
From looking at the serial console log, it appears that both the new and old
drivers are getting loaded.  With 6.1.13, only the BAR being actively used
(MEM I/O or PIO) is reserved.  This may allow the old driver to attach should
kudzu decide that the older driver is appropriate.  I have seen on several
installs here that, should you have more than one aic7xxx controller, the
generated /etc/modules.conf can have mixed new and old driver entries.  For
example, if you do an expert install and choose the new driver, you'll get one
entry for it, but an entry specifying the old driver for each additional
controller.  I don't know what would be required to make kudzu do the right
thing as far as sticking with the driver the user chose during install.

Anyway, I would expect that if you cleaned up /etc/modules.conf so that you
don't have two drivers attempting to run the same hardware, you'd have
a better time of it.  It would also be a good idea to upgrade to something
much more recent than 6.1.13.  The driver is currently at 6.2.3.
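
For anyone wanting to check for the mixed-entry situation described above, here
is a rough sketch, not anything shipped with kudzu or the driver.  It assumes
the controller entries in /etc/modules.conf look like
"alias scsi_hostadapter[N] <module>"; the function names and the sample text
are made up for illustration.

```python
import re

# Assumed line shape: "alias scsi_hostadapter[N] <module>".
ALIAS_RE = re.compile(r"^\s*alias\s+(scsi_hostadapter\d*)\s+(\S+)")

OLD_DRIVER = "aic7xxx"      # old driver module name, as used in this report
NEW_DRIVER = "aic7xxx_mod"  # new driver module name, as used in this report


def aic7xxx_aliases(conf_text):
    """Return {alias: module} for aliases that name either aic7xxx driver."""
    found = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        m = ALIAS_RE.match(line)
        if m and m.group(2) in (OLD_DRIVER, NEW_DRIVER):
            found[m.group(1)] = m.group(2)
    return found


def has_mixed_drivers(conf_text):
    """True if both the old and the new driver appear among the aliases."""
    return len(set(aic7xxx_aliases(conf_text).values())) > 1


if __name__ == "__main__":
    # Hypothetical modules.conf contents, for illustration only.
    sample = """\
alias eth0 eepro100
alias scsi_hostadapter aic7xxx_mod
alias scsi_hostadapter1 aic7xxx
"""
    for alias, module in sorted(aic7xxx_aliases(sample).items()):
        print(alias, "->", module)
    print("mixed:", has_mixed_drivers(sample))
```

If it reports mixed entries, the aliases would need to be edited by hand so
that every controller names the same driver.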

Comment 16 Bugzilla owner 2004-09-30 15:39:01 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/


