From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020827

Description of problem:
There is still no workaround in the Red Hat kernels for the buggy IRQ routing on 440GX+ boards, so people still have problems with many older servers based on this chipset.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Have an Intel 440GX+ board with a single CPU plugged in
2. Install Red Hat
3. Reboot

Actual Results:
Observe the aic7xxx driver failing ...

Expected Results:
Should work.

Additional info:
I was VERY surprised to find that stock 2.4.20-rc2 (config attached) works properly on this fubared board. NONE of the Red Hat supplied SMP kernels from 2.4.2-2 to 2.4.18-18 worked on this particular board, even with all the tips from Google and this Bugzilla. So why does 2.4.20-rc2 work when Red Hat's 2.4.18-18 doesn't?
Created attachment 85666: my 2.4.20-rc2 config
I can attest that 2.4.20-rc4 does in fact work on an L440GX+ board with a single CPU installed. I also have a DAC960 installed, with root on RAID 5. Here is what I did:
1. Boot from the network (DHCP) via "linux rescue"
2. chroot /mnt/sysimage
3. Get linux-2.4.19.tar.gz from kernel.org
4. Get patch-2.4.20-rc4.gz from kernel.org
5. Patch /usr/src/linux-2.4.19 to make 2.4.20-rc4
6. Rename /usr/src/linux-2.4.19 to /usr/src/linux-2.4.20-rc4
7. Recompile the kernel (make oldconfig), using config-2.4.18-18smp as the basis
8. mkinitrd -f /boot/initrd-2.4.20-rc4smp.img 2.4.20-rc4
9. Update lilo.conf, etc.
10. Install and boot
I'd like to report having success with downloading 2.4.20 from kernel.org, copying RedHat's 2.4.18-19 .config, using make menuconfig to set APIC and IOAPIC and building/installing/booting 2.4.20. It seems to work, so far! This is an EMC AV1400 (a 440GX mb). Tim
FWIW, if you recompile the Red Hat kernels with local APIC support for uniprocessors and IO-APIC support for uniprocessors enabled (you do not need to make any other changes), the standard Red Hat kernels work. I even built them as RPMs because I have several machines with the 440 chipset and it is easier for me to keep track that way. Arjan has said in the past that Red Hat will not build its kernels that way because doing so breaks too many laptops.
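For anyone trying the recompile route described above, here is a minimal sketch of a check that a kernel .config has both uniprocessor APIC options switched on. It assumes the 2.4-series option names CONFIG_X86_UP_APIC and CONFIG_X86_UP_IOAPIC, which are not spelled out in this thread, so verify them against your own tree. The function name check_up_apic is made up for illustration.

```shell
#!/bin/sh
# Report whether a kernel .config enables both uniprocessor APIC options.
# $1 is the path to the .config file. Prints one line per option and
# returns non-zero if either option is missing.
# Assumption: option names as used by 2.4-series i386 kernels.
check_up_apic() {
    config_file=$1
    status=0
    for opt in CONFIG_X86_UP_APIC CONFIG_X86_UP_IOAPIC; do
        # Match only an exact "OPTION=y" line, not "# OPTION is not set".
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: NOT enabled"
            status=1
        fi
    done
    return $status
}
```

A possible use would be `check_up_apic /usr/src/linux-2.4/.config` before starting the build (that path is only an example). Note that these are uniprocessor options, so an SMP config is not expected to set them.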
I tried this and I was still unable to get the machine to boot. In my case I do have an additional Adaptec PCI SCSI board installed, maybe that had something to do with it. So if this is the only problem, maybe a switch or something can be added to enable this at boot time? (like "linux apic", but that only seems to work on the BOOT kernel).
What works depends on a lot of random factors. Intel didn't want to provide the information needed to do something about this on the 440GX. That seems to be changing, fingers crossed. I'd be interested to know whether the boot option "pci=biosirq" works.
pci=biosirq doesn't help me on my L440GX+, bios 11.1.
I saw this in an announcement on lkml for a recent ac kernel:

Improved 440GX bios workarounds (Arjan van de Ven)
| Thanks to the guys at Intel for hints on this

Is there any timeline for getting these fixes into either an official errata kernel or rawhide?
We have an L440GX+ board that exhibits this problem. The various "pci=", SMP kernel, and "apic" workarounds did not help. Fortunately, the system we were installing to had a RAID controller (used a different driver), and an IDE CD-ROM. So, we did a "noprobe" install, skipped the "aic7xxx" driver, and got a working system. We then downloaded kernel 2.4.20 (final release, pristine sources), did a custom build, and tried that. It worked perfectly. It appears that those fixes Alan Cox mentioned did the trick.
Actually, we know that you can recompile your kernel such that the bug doesn't show. However, we can't do that since it breaks tons of other machines.
There was once a precompiled kernel available for download for 7.3 to fix this issue. Are you not providing one for 8.0 as well?
There was no such kernel for RHL 7.3, and there won't be one for RHL 8.0. For RHL 9, most of this should just work for anyone who has submitted their BIOS ID to us.
Hmm, how does one tell if "their bios id" was submitted? I would like to have an idea whether this is going to work before doing the upgrade. Is there a list I can compare my BIOS ID against? Is the BIOS ID something I get with dmidecode? If it has not been submitted, is submitting it here sufficient to get it included in future errata kernels?
> How does one tell if "their bios id" was submitted??

If you added your dmidecode output to a bug in Bugzilla, I have added it. dmidecode will show you your BIOS IDs. If the ID starts with L440GX0.86B, it will most likely work.

There is no way to do an erratum for this: if it isn't working you can't even install, so you never get to the point where you could use the erratum.
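The BIOS ID check described above can be sketched as a small shell filter. This is only an illustration: the function name bios_id_matches is made up, and because the layout of dmidecode output differs between versions, the sketch just searches for the L440GX0.86B prefix anywhere in the text rather than parsing a particular field. A match only means the prefix is present; it does not prove that a given kernel handles the board.

```shell
#!/bin/sh
# Read dmidecode output on stdin and succeed (exit status 0) if any line
# contains a BIOS ID with the L440GX0.86B prefix.
bios_id_matches() {
    # The dot is escaped so that it matches a literal "." only.
    grep -q 'L440GX0\.86B'
}
```

It could be used as `dmidecode | bios_id_matches && echo "BIOS ID has the L440GX0.86B prefix"`; dmidecode normally has to be run as root.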
You're correct, it was not 7.3. It was 7.1, where Red Hat provided 3 different boot.img files for us to use instead. So, regarding that, are you still providing those specific boot disks? And please read the old Bugzilla entry 29555, because there were more helpful people there when I had this issue.
Arjan van de Ven/Red Hat: Could you please elaborate on the comment, "Actually we know that you can recompile your kernel such that the bug doesn't show. However we can't do that since it breaks tons of other machines." If you are referring to the "enable APIC on uni-processor" option fix, please note that it did *not* work for our system. See comment #5 of bug 77090 and comment #3 of bug 55358. If you are referring to a *different* kernel recompile fix, please let us know what it is. For completeness, all of the following bugs appear to be related to this one: bug 29555 (RHL 7.1), bug 55358 (RHL 7.2), bug 64971 (RHL 7.3), bug 77090 (RHL 8.0), and bug 77643 (RHL 8.0). The "APIC fix" has worked for some but not everyone.
WHOOHOOOOO!! I just upgraded my 440 to kernel-2.4.20-13.8. It boots and so far runs just fine (uptime=39 minutes but I am VERY hopeful). This is a RHL 8.0 system. Looks like the fixes worked!! Can anyone else verify my experience??
yep, the new 2.4.20 kernel spits out a nice little message saying the bios is known to have issues and talks about using smp as a workaround, but then it goes ahead and actually works. prior to this, the scsi was finding irq 11, but now it finds irq 10 and things actually seem to work. it resync'ed the raid1 mirrors and did all the upgrades (for 7.2) without problem. intel should be ashamed.