I always get an oops when booting 2.4.3-7, both SMP and UP. The boot hangs for a few seconds at "Updating /etc/fstab", then oopses. One time I received the message "Kernel panic: HOST_MSG_LOOP with invalid SCB bf In interrupt handler - not syncing", without an oops. This happens only with the new aic7xxx_mod.o. I can boot 2.4.3-7 successfully by using the old aic7xxx.o instead of aic7xxx_mod.o.
Created attachment 19731: Oops, copied by hand.
Created attachment 19732: lspci output.
Interesting.. the oops suggests USB is to blame, but WTF does switching aic7xxx drivers help :)
This oops is from the SMP kernel, and it oopsed while kudzu was running. Perhaps there's a race issue that's being hit - there are some calls to aic7xxx in the stack backtrace.
I would like to see a result of a small experiment: rename usb-uhci.o to something that prevents it from loading. Also... I strongly suggest using a serial console whenever possible. Typing all of the oops trace is not an easy job, care for your fingers!
No dice. aic7xxx_mod.o still oopses even if usb-uhci is not loaded. I'm going to try to boot up the latest kernel from rawhide...
aic7xxx_mod.o still oopses in 2.4.5-0.2.9. aic7xxx.o still works without any problems. aic7xxx_mod always oopses at "Updating /etc/fstab", which is when kudzu runs.
Still oopses in 2.4.3-12. The kernel hangs for about 15 seconds before it oopses. There's a SCSI bus reset about 5 seconds before the oops. It looks like something kudzu is frobbing causes aic7xxx_mod.o to do a SCSI bus reset, then oops. I can also reproduce this on another machine, with slightly different hardware.
Created attachment 21706: Another chipset that oopses aic7xxx_mod.o
Seems it's a good thing that this driver isn't used as default, and judging by reports on linux-kernel about 2.4.6preLatest, it still isn't stable enough.
Adding Justin Gibbs to the Cc: list so that he can comment on the problem since it's his driver.
Verified that the oops still exists in 2.4.7-0.3, except that after the oops, the kernel now prints:
    SCSI bus is being reset for host 1 channel 0
    SCSI host 1 abort (pid 0) timed out - resetting
    SCSI bus is being reset for host 1 channel 0
    SCSI host 1 channel 0 reset (pid 0) timed out - trying harder
These messages keep being printed to the console in an infinite loop.
Created attachment 32318: I managed to capture this oops on a serial console, and thus get a good ksymoops run. This is aic7xxx_mod in 2.4.7-10smp.
Created attachment 32319: The entire 2.4.7-10smp boot sequence, ending with aic7xxx_mod oopsing when kudzu runs.
From looking at the serial console log, it appears that both the new and old drivers are getting loaded. With 6.1.13, only the BAR being actively used (MEM I/O or PIO) is reserved. This may allow the old driver to attach should kudzu decide that the older driver is appropriate. I have seen on several installs here that, should you have more than one aic7xxx controller, the generated /etc/modules.conf can have mixed new and old driver entries. For example, if you do an expert install and choose the new driver, you'll get one entry for it, but an entry specifying the old driver for each additional controller. I don't know what would be required to make kudzu do the right thing as far as sticking with the driver the user chose during install. Anyway, I would expect that if you cleaned up /etc/modules.conf so that you don't have two drivers attempting to run the same hardware, you'd have a better time of it. It would also be a good idea to upgrade to something much more recent than 6.1.13. The driver is currently at 6.2.3.
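A quick way to check for the mixed-driver situation described above is to scan the module configuration for SCSI host adapter aliases that name both drivers. The following is only a rough sketch: it assumes the entries take the conventional `alias scsi_hostadapter[N] <module>` form, the function names are made up for illustration, and the sample contents are hypothetical rather than taken from the reporter's machine. The module names follow this report (aic7xxx is the old driver, aic7xxx_mod the new one).

```python
import re

# Driver module names as used in this report: aic7xxx is the old driver,
# aic7xxx_mod is the new one.
AIC7XXX_DRIVERS = {"aic7xxx", "aic7xxx_mod"}

# Assumed alias form: "alias scsi_hostadapter[N] <module>".
ALIAS_RE = re.compile(r"^\s*alias\s+(scsi_hostadapter\d*)\s+(\S+)")


def find_aic7xxx_aliases(text):
    """Return {alias_name: module} for aliases that load an aic7xxx driver."""
    found = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        match = ALIAS_RE.match(line)
        if match and match.group(2) in AIC7XXX_DRIVERS:
            found[match.group(1)] = match.group(2)
    return found


def has_mixed_drivers(text):
    """True if both the old and the new driver are referenced."""
    return len(set(find_aic7xxx_aliases(text).values())) > 1


# Hypothetical file contents illustrating the mixed case.
SAMPLE = """\
alias eth0 eepro100
alias scsi_hostadapter aic7xxx_mod
alias scsi_hostadapter1 aic7xxx
"""

if __name__ == "__main__":
    for alias, module in sorted(find_aic7xxx_aliases(SAMPLE).items()):
        print(alias, "->", module)
    print("mixed drivers:", has_mixed_drivers(SAMPLE))
```

To run it against a real system, read the contents of /etc/modules.conf and pass them to `has_mixed_drivers` in place of `SAMPLE`. If it reports a mix, editing the aliases so that every controller uses the same driver is the cleanup suggested above.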
Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/