Bug 162321 - System hang with kernel-2.6.12-1.1387
Summary: System hang with kernel-2.6.12-1.1387
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 4
Hardware: i686
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Dave Jones
QA Contact: Brian Brock
URL:
Whiteboard:
Duplicates: 162320
Depends On:
Blocks:
 
Reported: 2005-07-02 14:50 UTC by Martin
Modified: 2015-01-04 22:20 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-07-13 18:08:32 UTC
Type: ---
Embargoed:


Attachments
Sample dmesg from problematic computer (14.12 KB, text/plain)
2005-07-04 10:41 UTC, Furry Ball
dmidecode.out (8.11 KB, text/plain)
2005-07-10 13:01 UTC, Martin
acpidmp.out (65.48 KB, text/plain)
2005-07-10 13:03 UTC, Martin

Description Martin 2005-07-02 14:50:40 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; de-DE; rv:1.7.8) Gecko/20050524 Fedora/1.0.4-4 Firefox/1.0.4

Description of problem:
The computer boots up to the message 'setting computer name' and then freezes completely.

Motherboard: MS-6758
dmidecode: https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=103925
acpidmp: https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=103926

Version-Release number of selected component (if applicable):
kernel-2.6.12-1.1387_FC4.i686.rpm

How reproducible:
Always

Steps to Reproduce:
1. Problem occurs every time

Actual Results:  computer boots with 2.6.11-1.1369_FC4 fine

Additional info:

Comment 1 Bojan Smojver 2005-07-03 23:41:46 UTC
I also noticed strange stuff with 1387, as opposed to 1385 (the one from
testing). It appears that something is very wrong with 1387...

The symptoms on my notebook (full hardware spec here:
http://www.rexursive.com/articles/linuxonhpze4201.html) are that when I see the
line "Initializing: storage network audio" it takes ages to get to "done". This
wasn't a problem with any of the previous kernels.

Unfortunately there is nothing in the logs or dmesg that points to any
meaningful error that would be causing this.

Comment 2 Bojan Smojver 2005-07-03 23:43:10 UTC
This bug is also a duplicate of bug #162320.

Comment 3 Warren Togami 2005-07-04 01:29:42 UTC
*** Bug 162320 has been marked as a duplicate of this bug. ***

Comment 4 Rui Miguel Seabra 2005-07-04 09:56:40 UTC
To me, this happened just before setting up LVM modules.

Comment 5 Furry Ball 2005-07-04 10:37:56 UTC
This happens to me as well. It can happen at virtually any stage, but most
often at the one mentioned. I once managed to boot the whole system and even
log in before the lockup. I can't find any useful log entry or the like.

I tried the non-SMP kernel with acpi=off and it didn't fix the issue for me, and
I can't think of anything else useful to try. Reverted back to the .11 kernel.

Comment 6 Furry Ball 2005-07-04 10:41:52 UTC
Created attachment 116324
Sample dmesg from problematic computer

In case it helps to see some basic information about the platform: my board is
an Asus P4P800-VM http://www.asus.com/prog/spec.asp?m=P4P800-VM&langs=01

Comment 7 Bojan Smojver 2005-07-04 11:49:17 UTC
I agree with comment #5 - it happens kind of at random - completely
unpredictable. Actually, I just booted up and the "Initializing hardware..."
thing went on really fast. And I had slowdowns at other boot stages. No idea
what's happening there...



Comment 8 Bojan Smojver 2005-07-04 23:09:25 UTC
Just to preserve sanity, I booted 1385 again (the one based on 2.6.12.1) and it
is quite fine. My bet is on that ACPI patch in 2.6.12.2, which changed IRQ
routing via ACPI. These strange "hangs" have all the hallmarks of undelivered
IRQs...

Comment 9 Per Bjornsson 2005-07-05 17:10:55 UTC
Re Comment #8 : This sounds like a reasonable explanation for what I'm seeing so
I'll add my comments here instead of opening a new report. My computer
consistently hangs on boot with kernel-2.6.12-1.1387 with the default boot
command line; it works when I add "pci=noacpi".

Hardware is a notebook with ATI IGP320M northbridge (Athlon XP-M processor) and
ALi southbridge.

The last boot messages before the hang are:
----
Warning: ATI Radeon IGP Northbridge is not yet fully tested.
ALI15X3: IDE controller at PCI slot 0000:00:0f.0
ACPI: PCI Interrupt 0000:0f.0[A]: no GSI - using IRQ 0
PCI: Setting IRQ 0 as level-triggered
ALI15X3: chipset revision 196
ALI15X3: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x1800-0x1807, BIOS settings: hda:DMA, hdb:PIO
    ide1: BM-DMA at 0x1808-0x180f, BIOS settings: hdc:DMA, hdd:PIO
----

(Hand-copied, sorry for any typos, but I think I got it right.) The "IRQ 0"
sounds fishy to me; the IDE controller is normally on IRQs 14 and 15, I believe,
although I'm not sure what that message actually means. My
/proc/interrupts (now on kernel 1387, booted with pci=noacpi, and I think this is
what it usually looks like) looks like this:

----
           CPU0
  0:     564758          XT-PIC  timer
  1:       1578          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  7:          2          XT-PIC  parport0
  8:          1          XT-PIC  rtc
  9:          1          XT-PIC  acpi
 11:     135957          XT-PIC  ath0, ALI 5451, ehci_hcd:usb1, ohci_hcd:usb2,
ohci_hcd:usb3, yenta, yenta, eth0, radeon@pci:0000:01:05.0
 12:        113          XT-PIC  i8042
 14:      25132          XT-PIC  ide0
 15:       4431          XT-PIC  ide1
NMI:          0
ERR:          0
----

I think this means that something has gone fishy with the interrupt routing?
And the patch in 2.6.12.2 has the comment
---
Linus Torvalds:
ACPI: Make sure we call acpi_register_gsi() even for default PCI interrupt
assignment
---
Sounds related to "no GSI" in my boot messages above?
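
For anyone who wants to check their own interrupt table the same way, here is a minimal sketch in Python. It assumes a single-CPU /proc/interrupts layout like the one above (one count column, then the controller name, then a comma-separated device list). The function names parse_interrupts and unexpected_on_irq0 are made up for illustration, and the "only the timer belongs on IRQ 0" check is just my reading of why the boot message looks suspicious, not a rule taken from the kernel source.

```python
import re

# Matches lines such as "  14:      25132          XT-PIC  ide0".
# Assumption: one CPU column only; SMP tables have one count per CPU.
IRQ_LINE = re.compile(r"\s*(\d+):\s+\d+\s+\S+\s+(.*)")


def parse_interrupts(text):
    """Map IRQ number -> list of device names for /proc/interrupts-style text."""
    irqs = {}
    for line in text.splitlines():
        match = IRQ_LINE.match(line)
        if match is None:
            continue  # header, NMI:, ERR: and blank lines are skipped
        devices = [d.strip() for d in match.group(2).split(",") if d.strip()]
        irqs[int(match.group(1))] = devices
    return irqs


def unexpected_on_irq0(irqs):
    """Return devices sharing IRQ 0 with the timer (hypothetical sanity check)."""
    return [d for d in irqs.get(0, []) if d != "timer"]


SAMPLE = """\
           CPU0
  0:     564758          XT-PIC  timer
 14:      25132          XT-PIC  ide0
 15:       4431          XT-PIC  ide1
NMI:          0
ERR:          0
"""

if __name__ == "__main__":
    table = parse_interrupts(SAMPLE)
    print(table)
    print(unexpected_on_irq0(table))
```

On a live system the text could be read from /proc/interrupts instead of SAMPLE; note that the long IRQ 11 line above is wrapped by Bugzilla and would need to be rejoined first.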


Comment 10 Frank Swasey 2005-07-05 23:51:46 UTC
I am seeing the same symptoms on an ASUS P5A as described in comment #9.
The differences in my case are that the ALI15X3 is chipset revision 193 (instead
of 196) and my ide1 shows hdd:DMA (instead of hdd:PIO).

Modifying the boot parameters to add "pci=noacpi" also gets me past the problem
and allows the kernel to boot and run.
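
As a rough illustration of the workaround, the sketch below appends the parameter to a kernel line of the kind found in a boot loader configuration. The example line (path and other options) is hypothetical and not copied from any machine in this report, and add_boot_param is a helper name invented here; only the "pci=noacpi" parameter itself comes from this thread.

```python
def add_boot_param(kernel_line, param="pci=noacpi"):
    """Return kernel_line with param appended, unless it is already present."""
    if param in kernel_line.split():
        return kernel_line  # already there, leave the line untouched
    return kernel_line.rstrip() + " " + param


if __name__ == "__main__":
    # Hypothetical kernel line, for illustration only.
    line = "kernel /vmlinuz-2.6.12-1.1387_FC4 ro root=LABEL=/"
    print(add_boot_param(line))
```

The same change can of course be made by hand at the boot prompt; the helper only shows that the parameter goes at the end of the kernel command line and should not be added twice.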

Comment 11 Per Bjornsson 2005-07-06 00:10:07 UTC
Re comment #10: I strongly suspect that the reason that I'm seeing "hdd:PIO" is
that this is a notebook with one hard disk (hda) and one CDRW/DVD (hdc), so
there's no hdd drive connected.

Just for clarity, the section corresponding to what I posted above, but grabbed
from my dmesg when booted with pci=noacpi, is
----
Warning: ATI Radeon IGP Northbridge is not yet fully tested.
ALI15X3: IDE controller at PCI slot 0000:00:0f.0
ALI15X3: chipset revision 196
ALI15X3: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0x1800-0x1807, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0x1808-0x180f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
hda: TOSHIBA MK6021GAS, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: TOSHIBA DVD-ROM SD-R2412, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
----
(...boot continues...)
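
Compared with the excerpt from the hanging boot in comment #9, the visible difference is the "no GSI - using IRQ 0" line. A small sketch for spotting that line in a saved boot log follows. The pattern is modelled on my hand-copied message, so the exact wording is an assumption and may need adjusting, and find_no_gsi is a name invented for this example.

```python
import re

# Pattern modelled on the hand-copied boot message; the exact kernel wording
# is an assumption.
NO_GSI = re.compile(
    r"ACPI: PCI Interrupt (\S+)\[([A-D])\]: no GSI - using IRQ (\d+)"
)


def find_no_gsi(log_text):
    """Return (device, pin, irq) tuples for every 'no GSI' fallback in the log."""
    return [
        (m.group(1), m.group(2), int(m.group(3)))
        for m in NO_GSI.finditer(log_text)
    ]


HANGING_BOOT = """\
ALI15X3: IDE controller at PCI slot 0000:00:0f.0
ACPI: PCI Interrupt 0000:0f.0[A]: no GSI - using IRQ 0
PCI: Setting IRQ 0 as level-triggered
"""

WORKING_BOOT = """\
ALI15X3: IDE controller at PCI slot 0000:00:0f.0
ALI15X3: chipset revision 196
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
"""

if __name__ == "__main__":
    print(find_no_gsi(HANGING_BOOT))
    print(find_no_gsi(WORKING_BOOT))
```

An empty result for a given log only means the fallback message was not printed; it does not prove the interrupt routing is correct.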


Comment 12 Bojan Smojver 2005-07-06 04:12:05 UTC
Most likely related to this kernel bug:

http://bugzilla.kernel.org/show_bug.cgi?id=4824

BTW, there is already a fix for the PCI IRQ issue; just scroll down to the end
of that bug report.

Comment 13 Dave Jones 2005-07-07 21:53:57 UTC

*** This bug has been marked as a duplicate of 162269 ***

Comment 14 Martin 2005-07-10 12:59:21 UTC
The problem was fixed for me with kernel-2.6.12-1.1390_FC4, but after updating
the BIOS to V2.4, booting is no longer possible. Now kernel-2.6.11-1.1369_FC4 is
the only working kernel for me :(((

Comment 15 Martin 2005-07-10 13:01:48 UTC
Created attachment 116565
dmidecode.out

Comment 16 Martin 2005-07-10 13:03:41 UTC
Created attachment 116566
acpidmp.out

Comment 17 Martin 2005-07-10 20:36:58 UTC
With the latest FC5 kernel-2.6.12-1.1425_FC5 or a vanilla linux-2.6.13-rc2
kernel, the computer boots. Why not with 2.6.12-1.1387_FC4 and
2.6.12-1.1390_FC4? I get no message in dmesg or /var/log/messages.

Comment 18 John Horne 2005-07-11 12:05:11 UTC
Although comment 13 says this bug is a duplicate of 162269, the use of
'pci=noacpi' or 'acpi=off' does not solve the problem (despite 162269 saying it
does).

I too can only run kernel-2.6.11-1.1369_FC4 as the latest working kernel, all
others fail at setting hostname.

Comment 19 Martin 2005-07-13 18:08:32 UTC
Booting works again now with kernel-2.6.12-1.1394_FC4.

Comment 20 John Horne 2005-07-14 13:22:40 UTC
Yes, 1394 works for me too. Must have been the PIIX change. Many thanks.

