Bug 42391

Summary: (SCSI AIC7XXX) 2.4.3-7 Oops - aic7xxx_mod.o
Product: [Retired] Red Hat Linux
Reporter: Sam Varshavchik <mrsam>
Component: kernel
Assignee: Doug Ledford <dledford>
Status: CLOSED CURRENTRELEASE
QA Contact: Brock Organ <borgan>
Severity: medium
Priority: medium
Version: 7.1
CC: dledford, gibbs, zaitcev
Hardware: i386   
OS: Linux   
Last Closed: 2004-09-30 15:39:01 UTC
Attachments:
Description (Flags)
Oops - manually copied by hand. (none)
lspci output. (none)
Another chipset that OOPses aic7xxx_mod.o (none)
I managed to capture this oops on a serial console, and thus get a good ksymoops run. This is aic7xxx_mod in 2.4.7-10smp. (none)
The entire 2.4.7-10smp boot sequence, ending with aic7xxx_mod oopsing when kudzu runs. (none)

Description Sam Varshavchik 2001-05-26 23:22:47 UTC
Always get an Oops when booting 2.4.3-7, both SMP and UP.  The boot hangs
for a few seconds at "Updating /etc/fstab", then oopses.  One time I
received a message "Kernel panic: HOST_MSG_LOOP with invalid SCB bf
In interrupt handler - not syncing", without an oops.

This is only with the new aic7xxx_mod.o.  I can boot 2.4.3-7 successfully by
using the old aic7xxx.o instead of aic7xxx_mod.o.

Comment 1 Sam Varshavchik 2001-05-26 23:23:49 UTC
Created attachment 19731 [details]
Oops - manually copied by hand.

Comment 2 Sam Varshavchik 2001-05-26 23:24:54 UTC
Created attachment 19732 [details]
lspci output.

Comment 3 Arjan van de Ven 2001-06-06 09:34:34 UTC
Interesting.. the oops suggests USB is to blame, but WTF does switching 
aic7xxx drivers help :)



Comment 4 Sam Varshavchik 2001-06-06 11:47:50 UTC
This oops is from the SMP kernel, and it oopsed while kudzu was running. 
Perhaps there's a race issue that's being hit - there are some calls to aic7xxx
in the stack backtrace.

Comment 5 Pete Zaitcev 2001-06-06 16:33:45 UTC
I would like to see a result of a small experiment:
rename usb-uhci.o to something that prevents it
from loading.

Also... I strongly suggest using a serial console
whenever possible. Typing all of the oops trace
is not an easy job, care for your fingers!


Comment 6 Sam Varshavchik 2001-06-09 00:40:08 UTC
No dice.  aic7xxx_mod.o still oopses even if usb-uhci is not loaded.

I'm going to try to boot up the latest kernel from rawhide...



Comment 7 Sam Varshavchik 2001-06-10 18:20:00 UTC
aic7xxx_mod.o still OOPSes in 2.4.5-0.2.9

aic7xxx.o still works without any problems

aic7xxx_mod always oopses at "updating /etc/fstab", which is when kudzu runs.



Comment 8 Sam Varshavchik 2001-06-25 02:10:36 UTC
Still oopses in 2.4.3-12

The kernel hangs for about 15 seconds before it oopses.  There's a SCSI bus
reset about 5 seconds before the oops.

Looks like what's happening is that something that kudzu's frobbing is causing
aic7xxx_mod.o to do a SCSI bus reset, then oops.

I can also reproduce this on another machine, with slightly different hardware.





Comment 9 Sam Varshavchik 2001-06-25 02:11:19 UTC
Created attachment 21706 [details]
Another chipset that OOPses aic7xxx_mod.o

Comment 10 Arjan van de Ven 2001-06-25 09:18:06 UTC
Seems it's a good thing that this driver isn't used as default, and judging
by reports on linux-kernel about 2.4.6preLatest, it still isn't stable enough.

Comment 11 Doug Ledford 2001-08-02 17:31:41 UTC
Adding Justin Gibbs to the Cc: list so that he can comment on the problem since
it's his driver.

Comment 12 Sam Varshavchik 2001-08-05 04:13:58 UTC
Verified that the oops still exists in 2.4.7-0.3

Except that after the oops, the kernel now prints:

SCSI bus is being reset for host 1 channel 0
SCSI host 1 abort (pid 0) timed out - resetting
SCSI bus is being reset for host 1 channel 0
SCSI host 1 channel 0 reset (pid 0) timed out - trying harder

These messages keep being printed to the console in an infinite loop.



Comment 13 Sam Varshavchik 2001-09-21 03:25:42 UTC
Created attachment 32318 [details]
I managed to capture this oops on a serial console, and thus get a good ksymoops run.  This is aic7xxx_mod in 2.4.7-10smp.

Comment 14 Sam Varshavchik 2001-09-21 03:27:43 UTC
Created attachment 32319 [details]
The entire 2.4.7-10smp boot sequence, ending with aic7xxx_mod oopsing when kudzu runs.

Comment 15 Justin T. Gibbs 2001-09-21 20:17:57 UTC
From looking at the serial console log, it appears that both the new and old
driver are getting loaded.  With 6.1.13, only the BAR being actively used
(MEM I/O or PIO) is reserved.  This may allow the old driver to attach should
kudzu decide that the older driver is appropriate.  I have seen on several
installs here that, should you have more than one aic7xxx controller, the
generated /etc/modules.conf can have mixed new and old driver entries.  For
example, if you do an expert install and choose the new driver, you'll get one
entry for it, but an entry specifying the old driver for each additional
controller.  I don't know what would be required to make kudzu do the right
thing as far as sticking with the driver the user chose during install.

Anyway, I would expect that if you cleaned up /etc/modules.conf so that you
don't have two drivers attempting to run the same hardware, you'll have
a better time of it.  It would also be a good idea to upgrade to something
much more recent than 6.1.13.  The driver is currently at 6.2.3.
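
For anyone who wants to check for the mixed-driver situation described above, here is a minimal sketch in Python. It assumes the usual "alias scsi_hostadapterN <module>" lines in /etc/modules.conf and uses the module names as they appear in this report (aic7xxx for the old driver, aic7xxx_mod for the new one). The function names are made up for illustration and are not part of kudzu or any Red Hat tool.

```python
import re

# Module names as used in this bug report (assumption: your install uses the same names).
OLD_DRIVER = "aic7xxx"
NEW_DRIVER = "aic7xxx_mod"

# Matches lines such as "alias scsi_hostadapter1 aic7xxx_mod".
ALIAS_RE = re.compile(r"^\s*alias\s+(scsi_hostadapter\d*)\s+(\S+)")


def scsi_aliases(text):
    """Return {alias_name: module} for every scsi_hostadapter alias in text."""
    aliases = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = ALIAS_RE.match(line)
        if match:
            aliases[match.group(1)] = match.group(2)
    return aliases


def has_mixed_aic7xxx(text):
    """True if both the old and the new aic7xxx driver are configured."""
    modules = set(scsi_aliases(text).values())
    return OLD_DRIVER in modules and NEW_DRIVER in modules


# Hypothetical file contents illustrating the mixed case.
SAMPLE = """\
alias scsi_hostadapter aic7xxx_mod
alias scsi_hostadapter1 aic7xxx
"""
```

To run it against a real system, read the file contents (for example open("/etc/modules.conf").read()) and pass them to has_mixed_aic7xxx(). A True result only indicates that both drivers are listed, and it says nothing about which entry kudzu added.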

Comment 16 Bugzilla owner 2004-09-30 15:39:01 UTC
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/