Bug 469107
Summary: | Fedora 8/9 >= kernels 2.6.26.x fail to shutdown properly with Disabled IRQ# message.
---|---
Product: | [Fedora] Fedora
Component: | kernel
Version: | 9
Hardware: | i386
OS: | Linux
Status: | CLOSED WONTFIX
Severity: | high
Priority: | medium
Reporter: | Reilly Hall <sly.midnight>
Assignee: | Kernel Maintainer List <kernel-maint>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | al.dunsmuir, ctubbsii, kernel-maint, leo, quintela, somlo
Doc Type: | Bug Fix
Last Closed: | 2009-07-14 17:06:39 UTC
Description
Reilly Hall
2008-10-29 21:53:20 UTC
Decided to do some additional testing and verified that it most definitely has something to do with the Intel PRO/1000 GT Gigabit ethernet adapter I have in a PCI slot of my Asus A7N8X-Deluxe. It shares an IRQ with my onboard SATA controller, which would explain why the system appears to hang. The system crashes when a #service network stop is issued or when changing runlevels (say, issuing an #init 1 command). The system boots up just fine and appears to run indefinitely, and network access also appears to work flawlessly. But again, unloading/reloading the network service/module seems to crash the system (most likely due to it sharing the IRQ with my SATA controller). If I don't load the network services (and presumably the e1000 module) at boot time, I CAN successfully shut the system down. Any ideas?
It just occurred to me to include a copy of the output of #lspci -v. I will try to include that as soon as possible. In the meantime I'm going to see if moving the Gigabit card to another PCI slot changes which IRQ it uses so it doesn't share one with the SATA controller, as no setting in the BIOS appears to fix that.
The kernel option "noirqdebug" will stop the interrupt from being disabled. You can add it to the end of the kernel line in /etc/grub.conf.
Thanks, I'm gonna go ahead and try that option tonight. I tried one suggested by the errors that showed up in the syslog the one time I managed to get it displayed before the machine inevitably crashed (though it was never committed to disk, for the obvious reasons)... it mentioned using something like "irqpoll". Needless to say it didn't help. I'll post back the results as quickly as possible.
Oh my GOD, you are my savior! That did the trick. Been putting up with this for like 2+ months now, always hoping the next kernel update would solve that issue.
Eventually some update to the X.org drivers made booting from the last safe 2.6.25.x kernel I have installed infeasible (it would cause X to crash on load; apparently there must be some special new requirement for the Radeons). I will still post my "lspci -v", and I'll see if I can manage to swipe a copy of what gets output to the syslog on a normal kernel oops, prior to the system fully crashing after prolonged disconnect from the hard drive (maybe I can have a mounted USB flash drive ready to go and/or just pipe the output from syslog directly to a file on said USB flash drive). We'll see... again, thanks SOOO much!
Created attachment 322248 [details]
lspci -v of my Asus A7N8X-Deluxe on 11-02-08
Here's that "lspci -v" I promised, in case it's of interest to anyone. Next up is the output of the syslog during a normal crash; that's gonna be tricky though, since it doesn't actually get saved to the hard drive.
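For anyone wanting to confirm the IRQ sharing and the "noirqdebug" setting discussed above without wading through full lspci output, here is a minimal sketch. It assumes a Linux system exposing /proc/interrupts and /proc/cmdline; the function names `irq_line` and `has_kernel_opt` are made up for illustration and are not part of any tool mentioned in this bug.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not from this bug report.

# Print the /proc/interrupts entry for one IRQ number. The device names
# at the end of the line show which drivers share that IRQ.
irq_line() {
    irq="$1"
    file="${2:-/proc/interrupts}"   # optional second argument: alternate file
    awk -v irq="$irq" '$1 == (irq ":")' "$file"
}

# Succeed (exit status 0) if the running kernel was booted with the
# given option, e.g. noirqdebug.
has_kernel_opt() {
    opt="$1"
    file="${2:-/proc/cmdline}"      # optional second argument: alternate file
    tr ' ' '\n' < "$file" | grep -qx -- "$opt"
}

# Usage:
#   irq_line 18
#   has_kernel_opt noirqdebug && echo "noirqdebug is active"
```

If the line printed for the IRQ lists more than one device name, those devices share the interrupt, which is the situation described in the original report.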
Created attachment 322252 [details]
last few lines of syslog output as IRQ #18 is Disabled (nobody cared) 11-02-08
I guess that trick of using a mounted and ready to go USB drive with the syslog contents piped to a file on that drive worked...here are the last few lines from the syslog as I issued a "#service NetworkManager stop" to induce the error before the system totally crashed from not being able to access the hard drive. Hope this helps, it just looks like gibberish to me :S
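The USB-drive trick described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact commands the reporter ran: it assumes the syslog file is /var/log/messages and that the flash drive is already mounted at /mnt/usb, and the function name `capture_log` and the output file name are hypothetical.

```shell
#!/bin/sh
# Sketch only: paths and the function name are assumptions, not taken
# from this bug report.

# Follow a log file in the background and append every new line to a
# file on another device, so the messages survive the root disk
# becoming unreachable. Prints the PID of the background tail.
capture_log() {
    src="$1"    # log file to follow, e.g. /var/log/messages
    dest="$2"   # file on the USB drive
    # -n 0: copy only lines written from now on
    # -F:   keep following by name, even if the log is rotated
    tail -n 0 -F "$src" >> "$dest" 2>/dev/null &
    echo "$!"
}

# Usage (as root, before triggering the failure):
#   pid=$(capture_log /var/log/messages /mnt/usb/irq18-syslog.txt)
#   service NetworkManager stop
#   kill "$pid"    # only reachable if the machine survives
```

Writes to the flash drive may still sit in the page cache when the machine locks up, so mounting the drive with the `sync` mount option should make it more likely that the last lines actually reach the media.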
Just to add another data point, this also happens on F10 (2.6.27.5-117.fc10.i686) when the machine is being shut down (or when NetworkManager is stopped and shuts down eth0). Also using the e1000 driver:
01:0c.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
        Subsystem: Dell Optiplex GX270
        Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 18
        Memory at feae0000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at df40 [size=64]
        Capabilities: [dc] Power Management version 2
        Capabilities: [e4] PCI-X non-bridge device
        Capabilities: [f0] Message Signalled Interrupts: Mask- 64bit+ Count=1/1 Enable-
        Kernel driver in use: e1000
        Kernel modules: e1000
Using "noirqdebug" does make the problem go away.
Thanks Gabriel. While that doesn't make me too enthusiastic to upgrade to F10 on my desktop (I've just barely been testing it in a VM since I was away from home for a month), I guess I'll have to keep that in mind. Many thanks. I'll report what I find as well when I find some time to upgrade and do some testing.
Same problem - also on Dell GX270
Man, I'm slacking. I needed to test this on Fedora 10 on this machine and I still haven't upgraded from 9 yet (on this one machine out of all the ones I have, and it's only affecting this one). I will at least test out the latest F9 kernel without the "noirqdebug" option to see if it still happens and report back here.
I still experience this problem with the latest F10 kernel - 2.6.27.15-170.2.24.fc10.i686
I have very similar problems on Dell GX260 with the e1000 driver for the Intel PRO/1000 Gigabit onboard NIC. In my case, IRQ #18 is normally disabled upon ifdown (with few side effects, so I don't usually care), but lately it has been happening on ifup (VERY VERY VERY BAD, b/c I have no network!). Sometimes I get a stack trace along with the disable message, and sometimes I get only the "Disabling IRQ #18" message.
I've tried everything in the BIOS, but this problem only started happening recently for me, and there hasn't been a BIOS update for this board since 2005, so I don't think the BIOS is the problem. 'sudo ethtool eth0' shows no link detected, unless I run the tool in the very brief moment during dhclient where I get an assigned IP address, in which case it shows there is a link. In fact, DHCP always successfully assigns an address, but the network goes down and IRQ 18 is disabled immediately afterwards. The noirqdebug option does not fix the problem, and indeed reduces performance significantly. I don't get the disable message, but the network also doesn't work.
(In reply to comment #12)
> I have very similar problems...
In my case, the problem was fixed by resorting to NetworkManager (which loads after other startup services... perhaps that has an effect). For me, the network service still causes this problem when NetworkManager is uninstalled.
(In reply to comment #13)
> (In reply to comment #12)
> > I have very similar problems...
>
> In my case, the problem was fixed...
Oh, and the IRQ disable message still appears on shutdown, but at least I get network now.
It's been a while since an update, but the original machine I was experiencing this problem with was temporarily decommissioned a few months ago when I bought parts to build a new primary machine. As soon as I get a new hard drive in the original old machine with the symptoms, I will install the new Fedora 11 (I should have a drive by the time it comes out) and test to see if the same conditions occur.
This message is a reminder that Fedora 9 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 9. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '9'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 9's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 9 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.