Bug 107880
| Summary: | 440GX chipset machine has non working IRQ routing | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Phil Oester <bugzilla> | 
| Component: | kernel | Assignee: | Dave Jones <davej> | 
| Status: | CLOSED NEXTRELEASE | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | medium | ||
| Version: | 2 | CC: | acpi-bugzilla, alan, bwyer, fedora, florin, goeran, jkono, kbakalar, markjx, me, mweiner, mwindham, pfrields, rkypriot, ryan | 
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | i386 | ||
| OS: | Linux | ||
| Doc Type: | Bug Fix | Last Closed: | 2005-04-16 04:15:56 UTC |
| 
        
Description
Phil Oester 2003-10-23 23:53:52 UTC

*** Bug 107893 has been marked as a duplicate of this bug. ***

Created attachment 95449 [details]
console output for successful RHL9.0 boot.iso
Created attachment 95450 [details]
console output for failed Fedora Core 1 test3 boot.iso

Looks like RHL9.0 ran the dmi_scan.c broken_pirq() function:
> *** Possibly defective BIOS detected (irqtable)
which would enable the APIC, which would work were it actually built into the install kernel.

RHL9 then tries to read the PIRQ tables:
> PCI: Error 89 when fetching IRQ routing table.
that fails so it presumably continues w/o attempting any PIRQ programming, which works.

Fedora Core 1 doesn't seem to have broken_pirq() in dmi_scan.c anymore. I guess the only value it would add is it would override somebody foolishly asking for "noapic" on a 440gx with APIC built-in. It was a no-op on the PIC kernel anyway.

FC1 tries to read the PIRQ tables
> PCI: Using IRQ router PIIX/ICH [8086/7110] at 00:12.0
and it is a victim of its success -- as the (undisclosed) 440GX PIRQ router is not compatible with the PIIX.

Too bad we can't tell the kernel to give up on PIRQ routing like it did in RHL9.

Created attachment 95451 [details]
console output for failed Fedora Core 1 test3 boot.iso w/ acpi=force
Unlike RHL9, ACPI is present (but disabled) in the FC1 Test3 install kernel.
Typically acpi=on will enable it, but on the 440GX, the latest BIOS is from
2000 and ACPI disables itself when it sees something that old.
acpi=force (console output attached) gives ACPI a chance to save the day
and program the 440GX PIC, but instead it chokes on the _PRT in the BIOS --
turns out that ignoring ACPI BIOS support from < 2001 was a good idea;-)
There is supposed to be a different workaround for 440GX now; Alan cleaned up the code we had in RHL9.

The Fedora code checks for a 440GX bridge being present and if so it punts back to BIOS IRQ routing. If that isn't working it's a small bug, or perhaps a missing PCI ident in the check code in pci-irq.c. You might want to glance over that routine and the rules it uses to decide the IRQ router isn't valid in the piix router setup function, or I can take a look at it in July 2004 when I'm back...

Len, can you check the pci-irq.c code as suggested by Alan? Thanks!

Created attachment 95560 [details]
proposed patch
Looks like this got hosed by the -ac1 patch. Basically, the comment in pci-irq.c about not touching 440GX is correct, but the code checks for _450GX_. I believe the above patch should fix it, but have not yet rebuilt the isolinux vmlinuz to verify.

Two questions:
1) how do I rebuild the vmlinuz in isolinux? I assumed it would be built by rebuilding the BOOT rpm, but the sizes of initrd/vmlinuz are drastically different
2) since 440gx boxes can't be kickstarted at present, will this be fixed prior to Core 1 release?

Created attachment 95809 [details]
Working patch
I have confirmed that the attached patch allows 440GX boxes to successfully
kickstart.  
For those interested, I did the following to replace the isolinux vmlinuz:
rpm -i gcc32-3.2.3-6.i386.rpm
rpm -i kernel-source-2.4.22-1.2115.nptl.i386.rpm
cd linux-2.4.22-1.2115.nptl
cat 440gx.patch | patch -p1
cp configs/kernel-2.4.22-i586.config .config
make oldconfig && make oldconfig     (don't ask...dumb bug)
make dep && make bzImage
Of course the kernel which gets installed by kickstart won't boot on this box
either, but fortunately I install a new one in %post.
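
The 440GX/450GX mix-up described above can be illustrated with a small stand-alone sketch. This is not the actual pci-irq.c code and not the attached patch: the helper names are hypothetical, and the device IDs are values recalled from the kernel's pci_ids.h for the Intel 82443GX and 82450GX, so treat them as assumptions and verify them against your tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, as recalled from include/linux/pci_ids.h.
 * Verify against your kernel tree before relying on them. */
#define VENDOR_INTEL      0x8086
#define DEVICE_82443GX_0  0x71a0  /* 440GX host bridge */
#define DEVICE_82443GX_2  0x71a2  /* 440GX host bridge, AGP disabled */
#define DEVICE_82450GX    0x84c5  /* 450GX, a different chipset */

/* Hypothetical helper: the check the comment in pci-irq.c describes. */
bool is_440gx_bridge(uint16_t vendor, uint16_t device)
{
    if (vendor != VENDOR_INTEL)
        return false;
    return device == DEVICE_82443GX_0 || device == DEVICE_82443GX_2;
}

/* Hypothetical helper: the check the code effectively performed. */
bool is_450gx_bridge(uint16_t vendor, uint16_t device)
{
    return vendor == VENDOR_INTEL && device == DEVICE_82450GX;
}

/* Returns true when the PIIX router may be programmed, false when the
 * kernel should punt back to BIOS IRQ routing because the bridge
 * matches the quirk check passed in. */
bool piix_router_usable(bool (*is_quirky_bridge)(uint16_t, uint16_t),
                        uint16_t bridge_vendor, uint16_t bridge_device)
{
    return !is_quirky_bridge(bridge_vendor, bridge_device);
}
```

With the 450GX check, a 440GX bridge (0x8086/0x71a0 under the assumptions above) is not recognised, so `piix_router_usable(is_450gx_bridge, ...)` returns true and the incompatible PIIX router gets programmed; with the 440GX check it returns false and BIOS routing is kept.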
*** Bug 110260 has been marked as a duplicate of this bug. ***

*** Bug 109346 has been marked as a duplicate of this bug. ***

This board has a lot of firmware updates; it's worth getting the latest:
http://www.intel.com/support/motherboards/server/l440gx/
-> Software & Drivers -> Non-OS Dependent
get: BIOS, BMC, HSC, FRUSDR

2.4.23-rcX kernel contains a fix, give it a try.

The best "Non expert" way to update to Fedora 1 on such a box btw seems to be the following (a decent net connection helps but you do only need CD 1 for this 8))
    Install RH9 (if not upgrading from it anyway)
    Install fedora-release and yum packages from CD#1 of FC1
    yum update rpm
    rpm --rebuilddb
    yum update krb5-libs e2fsprogs
    yum update
    Then rpm install the RH9 errata kernel

*** Bug 116645 has been marked as a duplicate of this bug. ***

This problem seems to have bitten me again with Fedora Core 2 Test 1. Now the installer doesn't hang, it detects the Adaptec controller and it also detects my Mylex controller. It also detects the RAID 0 drive that is on the Mylex controller when it is still in the console mode. It then brings up X and goes through the various screens, but once it gets to the "Partitioning Drives" section it barfs on a "No valid drives to create filesystems on" error. It doesn't make much sense to me since I can see in the DMESG output that it detected the drive on the mylex card.

my L440GX+ booted 2.6.5 with "acpi=force" both with IOAPIC enabled or with IOAPIC disabled (PIC mode). ACPI/PIC-mode had some complaints, so it may not work in all configs, but it did boot -- which was not the case with earlier kernels.

While the system booted and interrupts worked, the two basic ACPI features didn't work properly. The power button did not register ACPI events. (though it also didn't instantly poweroff the box, which indicates the system _is_ in ACPI mode) Also "init 0" didn't soft power-off the box, but acted like "init 6" and rebooted the system.
I'd be interested in others' experience with "acpi=force" on this type of system.

I tested 4 configurations:
 
SMP ACPI IOAPIC: 
2.6.5 SMP kernel acpi=force 
2.4.26 SMP kernel acpi=force 
 
UP ACPI PIC: 
2.6.5 SMP kernel acpi=force nosmp noapic * 
2.4.26 SMP kernel acpi=force nosmp noapic 
 
In all 4 cases: 
L440GX+ booted off SCSI & talked to ethernet. 
power button press invoked ACPI events as seen in /proc/interrupts 
init 0 gracefully shut down the system and powered off. 
 
two notes: 
1. needed to hack 2.6.5 dmi_scan.c for "noapic" to take effect 
    so I could get into PIC mode using the SMP kernel for this test. 
    otherwise it forces ioapic to be enabled when it sees this machine. 
    2.4.26 did not need this hack since it doesn't have that dmi_scan entry. 
 
2. previously reported 2.6.5 button/init 0 failure turned out to be 
    a configuration problem.  I tested the 2.6 kernel on RHL9 w/ 
    original modutils, so the kernel modules didn't load. 
    So I simply re-built with static modules for the test. 
 
Maybe the 440GX+ is now a candidate for the fabled ACPI "white list" 
to exempt it from disabling ACPI by default on a system older than 2001, 
or we should consider rolling the default cutoff date back a year. 
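
The cutoff-date and "white list" idea discussed above can be sketched as a small decision function. This is only an illustration of the policy as the thread describes it, not the kernel's implementation: the function name, the struct, and the precedence of `acpi=off` over `acpi=force` are assumptions.

```c
#include <stdbool.h>

/* Per the thread: ACPI disables itself on a BIOS older than 2001. */
#define ACPI_CUTOFF_YEAR 2001

/* Hypothetical view of the relevant boot parameters. */
struct acpi_boot_opts {
    bool force;  /* "acpi=force" given on the boot command line */
    bool off;    /* "acpi=off" given on the boot command line */
};

/* Decide whether ACPI would be enabled for a BIOS of the given year.
 * Assumption: an explicit "off" wins over everything else. */
bool acpi_should_enable(int bios_year, bool whitelisted,
                        struct acpi_boot_opts opts)
{
    if (opts.off)
        return false;
    if (opts.force)
        return true;   /* user overrides the age check */
    if (whitelisted)
        return true;   /* known-good old board, exempt from the cutoff */
    return bios_year >= ACPI_CUTOFF_YEAR;
}
```

Under this sketch a 440GX with its year-2000 BIOS gets ACPI only with `acpi=force` or a white-list entry; rolling the cutoff back a year would correspond to lowering `ACPI_CUTOFF_YEAR` to 2000.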
 
Created attachment 99505 [details]
2.6.5 patch
Looks like Linus excluded the 440GX part when he ported Alan's
patch to 2.6 from 2.4.	I'll check this into 2.6.6, as well as delete
the dmi_scan stuff it replaces.
I've installed FC2-test2 on L440GX+ by using linux acpi=force on the ISO boot command line. So it looks like ACPI can be a "workaround" until the PIRQ patch above appears in the non-ACPI kernel -- or you can just stick with ACPI;-)

Fedora Core 1. HP Kayak XU 6/300 dual Pentium II, AIC7880 chip. Same problem. linux acpi=force causes a kernel panic.

acpi=force fixes it for me (FC2test3)

Please note that the actual kernel (not the one that boots up the installer) does not need acpi=force, however that parameter, if present, does not hamper normal functioning. It just doesn't care, as it seems. Only the install CD needs the acpi=force.

From Alan's comment above (#15):
    Install RH9 (if not upgrading from it anyway)
    Install fedora-release and yum packages from CD#1 of FC1
    yum update rpm
    rpm --rebuilddb
    yum update krb5-libs e2fsprogs
    yum update
    Then rpm install the RH9 errata kernel  <<<<<<<
Why revert back to the RH9 errata kernel (or is this still necessary
as of today's date)?
That's from a while ago. It appears that all that is needed is to use acpi=force. At the time that wasn't known. An FC2 errata will probably fix the problem for good.

It is expected that FC1 acpi=force fails on this system -- OS is too old.

It is expected that the workaround is needed only on the install kernel. The install kernel uses PIC mode and PIRQ routing is needed. The installED kernel uses IOAPIC, and PIRQ routing is not needed.

2.6.6 has two things to help PIC-mode on this box:
1. Alan's 440GX PIRQ-disable workaround has been un-commented, to match 2.4.
2. OSDL bug 1581 has been fixed, which makes ACPI work better on a box with a BIOS like this one.

Note that until we roll back the ACPI cutoff date, "acpi=force" will be necessary to enable ACPI on boxes of this vintage.

*** Bug 119748 has been marked as a duplicate of this bug. ***

*** Bug 109693 has been marked as a duplicate of this bug. ***

I'm attempting to upgrade from RH8 to FC2 and encountering this same issue. L440GX+ MB, 2xP3 750, 4xIDE drive, CDROM on internal SCSI. Just upgraded all firmware to latest releases from Intel site. Anaconda gets to 'Loading AIC7XXX driver...' and hangs. Alt-F4 shows it having hit the onboard SCSI and just sits there. I've tried acpi=force as well as noapic and still don't get past this point, although the messages do vary.

Am I missing something here? I'm installing from the FC2 release CD with all of its files dated 5/13/2004. The vmlinuz shipped with this ISO states it is 2.6.5-1.358. Has anyone had any luck with a true L440gx+ booting off of the FC2 release using any of the aforementioned switches? Is there a 2.6.6 install ISO image available that I can try?

people.redhat.com/arjanv has a boot image with newer kernel for C3 and for a problem ASUS board. Try that on the 440GX+ and let me know.

Thanks for the quick response, Alan. I actually have tried that image without success, but I didn't record the results.
I'm currently imaging my drives, which should be done early tomorrow, so I'll give it another try and let you know. One other thing I wanted to try is turning off the Plug-N-Play function in the BIOS. I don't know if that is having an impact or not, but it seems like it would be worth a shot from some other bugs I've read.

Created attachment 101412 [details]
Log from successful boot under RH8
Okay, my backup is done, so I can do some hacking :).
I assume when you were referring to the c3 image, you meant c3boot-2.iso. Given
that, I did a number of tests this morning to see if I could get this beast to
boot under FC2.  Here are the results:
Plug-N-Play OS turned OFF in the BIOS:
- Generic FC2 install disks, acpi=force - fail
- c3boot-2.iso, no parameters - fail (used acpi)
- c3boot-2.iso, pci=noacpi - fail (lots of devices using IRQ 11)
Turned Plug-N-Play back on:
- c3boot-2.iso, no parameters - fail (used acpi)
Note that when I say "used acpi", I'm referring to all of the ACPI: messages
with the associated "Pin A" and "PRQx" messages, like another attachment that
was posted earlier.  If you would like, I can see about trying to post the
boot messages if I can figure out how to get a serial console working on my
machine.
Finally, after a quick (not particularly careful) scan of the various console
messages during my tests, I didn't see the message that RH8 would spit out when
it detected my BIOS.  I'm wondering if the appropriate code path has been
enabled or put back in.  The messages I'm referring to are (and it is in there
twice-see attachment):
*** Possibly defective BIOS detected (irqtable)
*** Many BIOSes matching this signature have incorrect IRQ routing tables.
*** If you see IRQ problems, in particular SCSI resets and hangs at boot
*** contact your hardware vendor and ask about updates.
*** Building an SMP kernel may evade the bug some of the time.
*** Possibly defective BIOS detected (irqtable)
*** Many BIOSes matching this signature have incorrect IRQ routing tables.
*** If you see IRQ problems, in particular SCSI resets and hangs at boot
*** contact your hardware vendor and ask about updates.
*** Building an SMP kernel may evade the bug some of the time.
As always, any assistance would be appreciated, and thanks in advance.
The case in question got fixed (ACPI=force) and also in the later images for non ACPI so the message went away. Now it seems at least one box has other problems.

Next amusement: acpi=off pci=usepirqmask

Okay, I assume you meant with the "c3" image. I tried that and got what looks to be the same results as pci=noacpi. I.e. 3-4 devices using IRQ 11. Hang after scsi0 access during load of aic7xxx driver. Never gets to scsi1. I would be happy to attach console output once I finish hacking my way through my BIOS config, although how I'm going to send an ALT-F4 is beyond me... :)

I just did a "strings" against the vmlinuz that is on the c3boot-2.iso and it claims that it is 2.6.5-1.358. Shouldn't it be 2.6.6?

My error. Arjan backported the fix not forward ported the install image. It should still have worked with acpi=force though.

Okay, just so I'm clear here, according to Comment #26, there were two fixes worked into 2.6.6. Did Arjan backport those into his 2.6.5 kernel for the C3 boards?

I guess my other question here is... is this worth spending a whole lot of time working on from your perspective? I'm willing to hack through this to figure out what the issue is and get it fixed "for the good of Linux," or I can rearrange some hardware in my box (add a PCI IDE controller) so that I can run an IDE CD-ROM drive, and boot up with the noprobe option. Obviously, I'd like to get this fixed, but I'd also like to get to FC2 while I'm out sick from work and have the time to hack with it.

Thanks,
Brett

Created attachment 101427 [details]
L440GX+ with "C3" boot disk, no parameters specified
Okay, I made some progress with my serial console.  I've attached a copy of the
console output if I boot the "C3" (c3boot-2.iso) image specifying nothing.
Created attachment 101428 [details]
L440GX+ with "C3" boot disk, acpi=off pci=usepirqmask specified
Here's a log of the boot with the above parameters specified.  Note that both
hang in the same place (driver loading).
Hm, acpi=force used to work with FC2test3, but it's not working with FC2final. I have no idea what's going on.

Okay... I feel a little better now--I thought it was just my board, or maybe the fact that I have four Ethernet devices along with all of the PIIX4 stuff. I haven't tried FC2test3 on my board.

Sorry, my bad, acpi=force does work with FC2 final version, but only after changing a setting in the BIOS. Unfortunately, I did this a few weeks ago and I can't remember the setting. Sorry. :-(

I tested FC3test1 and it works fine on 440GX. I didn't even have to enter acpi=force, it just booted up fine, it saw the SCSI disks, etc.

Thanks for the bug report. However, Red Hat no longer maintains this version of the product. Please upgrade to the latest version and open a new bug if the problem persists. The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, and if you believe this bug is interesting to them, please report the problem in the bug tracker at: http://bugzilla.fedora.us/

Re-opening from the auto tools. This one does affect FC2 for some users and wasn't marked FC2. Fixed that.

Fedora Core 2 has now reached end of life, and no further updates will be provided by Red Hat. The Fedora legacy project will be producing further kernel updates for security problems only. If this bug has not been fixed in the latest Fedora Core 2 update kernel, please try to reproduce it under Fedora Core 3, and reopen if necessary, changing the product version accordingly. Thank you. |