|Summary:||Kernel 42.0.2 hangs because timer not found|
|Product:||Red Hat Enterprise Linux 4||Reporter:||Milan Kerslager <milan.kerslager>|
|Component:||kernel||Assignee:||Jason Baron <jbaron>|
|Status:||CLOSED NOTABUG||QA Contact:||Brian Brock <bbrock>|
|Version:||4.4||CC:||darford, jbaron, jburke, knoel|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2007-03-01 17:13:23 UTC||Type:||---|
Description Milan Kerslager 2006-08-28 20:10:18 UTC
The latest update, 2.6.9-42.0.2.EL, does not boot because the timer is not found. When booting with the suggested apic=debug, the kernel boots normally. The U4 kernel, kernel-2.6.9-42.EL, also boots fine. This is an ASUS A8N-E board with an Athlon 64 3000+ (32-bit kernel). The last messages on the console are (sorry, these were transcribed by hand):
ENABLING IO-APIC IRQs
..TIMER: vector 0x31 pin1=2 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A... failed.
...trying to set up timer as Virtual Wire IRQ.. failed
...trying to set up timer as ExtINT IRQ.. failed :(.
Kernel Panic - not syncing: IO-APIC timer doesn't work! Boot with apic=debug and send report then noapic
# lspci
00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2)
00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Ethernet controller: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 VGA compatible controller: ATI Technologies Inc RV370 5B60 [Radeon X300 (PCIE)]
01:00.1 Display controller: ATI Technologies Inc RV370 [Radeon X300SE]
05:06.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
05:07.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)
Comment 1 Milan Kerslager 2006-08-28 20:16:40 UTC
Created attachment 135079 [details] Output messages of successful 42.0.2 boot with apic=debug
Without apic=debug, the same kernel does not boot.
Comment 2 Jason Baron 2006-09-19 18:46:00 UTC
So this is the x86 UP kernel? Can you double-check that the x86 UP -42 kernel boots correctly? Also, for the 42.0.2 kernel that boots with apic=debug, can you try passing 'noapic'? Thanks.
Comment 3 Milan Kerslager 2006-09-21 17:12:05 UTC
I double-checked that on this UP machine the -42 kernel boots without a problem and the -42.0.2 kernel does not boot (it boots only with the apic=debug or noapic parameter).
Comment 4 Milan Kerslager 2006-09-21 17:15:14 UTC
Created attachment 136884 [details] dmesg output - system boots with 2.6.9-42.EL
Comment 5 Milan Kerslager 2006-09-21 17:16:18 UTC
Created attachment 136885 [details] dmesg output - system boots with 2.6.9-42.0.2.EL and "noapic" parameter
Comment 6 Milan Kerslager 2006-09-21 17:17:10 UTC
Created attachment 136886 [details] dmesg output - system boots with 2.6.9-42.0.2.EL and "apic=debug" parameter
Comment 7 Milan Kerslager 2006-09-21 17:18:16 UTC
Created attachment 136888 [details] dmidecode output
Comment 9 Jason Baron 2006-09-22 16:23:39 UTC
Hmm, very strange, because none of the patches in -42.0.2 seem like they would cause the kernel to fail to boot like this. I'd like us to iterate over the 10 or so patches in the kernel to determine which one is causing this problem. Unfortunately, I don't have time today to compile these 10 kernels for you. I could, though, send you the 10 patches and you could try them yourself. Otherwise, I'll get you the test kernels next week. Thanks.
Comment 10 Milan Kerslager 2006-09-23 18:39:16 UTC
I see the patch differences in the SPEC file (Patch2213 Patch5057 Patch5058 Patch5059 Patch5060 Patch5061 Patch5062 Patch5063 Patch5064). I can build -42 with each of these patches enabled individually. I'll post the results here next week.
Comment 11 Jason Baron 2006-09-25 15:10:55 UTC
ok great! thanks.
Comment 12 Milan Kerslager 2006-10-03 18:04:26 UTC
I'm unable to recompile the kernel. This may be related to a HW problem on my build system. I'll try tomorrow. Sorry for the delay. Babysitting is keeping me a little bit overloaded :-)
Comment 13 Milan Kerslager 2006-10-04 09:13:19 UTC
I created a chroot build environment on another machine. The kernels are being built right now, so I expect to be able to reboot the server multiple times today or tomorrow late in the evening.
Comment 14 Milan Kerslager 2006-10-05 16:42:22 UTC
Kernels have been built with exactly one of the delta patches between the 42.EL and 42.0.2.EL kernels enabled, and all of them boot without the timer problem. Only RH's 42.0.2.EL kernel fails to boot. The booted kernels are named 42.0.0.[0-9].EL and their dmesg output is in the attachment. The .0 kernel has none of the patches (so it is like 42.EL); kernels 1 to 9 each have only one of the 9 patches that differ between 42.EL and 42.0.2.EL enabled. A kernel with all the patches included (i.e. like 42.0.2.EL) has not been tested yet and is being built right now. I'll test it later. So I'm wondering where the problem is. The current build environment is not a RHEL4 system but a fresh, up-to-date CentOS4 (i.e. RHEL4 rebuild) chroot, because of the HW problem on my RHEL4 system. I can create a chroot build environment from RHEL4 packages as well if you wish to compare the resulting kernels. Please tell me if you want me to test anything else.
# cat /proc/version (problematic version)
Linux version 2.6.9-42.0.2.EL (firstname.lastname@example.org) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #1 Thu Aug 17 17:36:53 EDT 2006
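The one-patch-per-kernel test plan described in this comment can be written down as a small build matrix. The following is only a sketch for illustration: the patch tags are copied from comment 10, but the mapping of a given patch to a given test kernel number is an assumption (the thread does not state it), and `build_matrix` is a hypothetical helper, not part of any Red Hat tooling.

```python
# Patch tags copied from comment 10. Which patch went into which numbered
# test kernel is NOT stated in the thread; the order here is an assumption.
DELTA_PATCHES = [
    "Patch2213", "Patch5057", "Patch5058", "Patch5059", "Patch5060",
    "Patch5061", "Patch5062", "Patch5063", "Patch5064",
]


def build_matrix(patches):
    """Return (release, enabled_patches) pairs: a baseline plus one kernel per patch."""
    matrix = [("42.0.0.0.EL", [])]  # baseline without delta patches, like 42.EL
    for index, patch in enumerate(patches, start=1):
        matrix.append((f"42.0.0.{index}.EL", [patch]))
    return matrix


for release, enabled in build_matrix(DELTA_PATCHES):
    print(release, ", ".join(enabled) if enabled else "(no delta patches)")
```

A matrix like this isolates each patch on its own, but it cannot catch a failure that needs two or more patches together, which is why the all-patches build mentioned above is still worth testing.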
Comment 15 Milan Kerslager 2006-10-05 16:48:08 UTC
Created attachment 137830 [details] dmesg output from testing kernels
Comment 16 Jason Baron 2006-10-06 17:17:53 UTC
Very strange. I'm really at a loss to explain this. We just released 42.0.3 yesterday; I wonder if that works. This bug reminds me a lot of bz #203423, where a BIOS upgrade fixed the problem.
Comment 17 Milan Kerslager 2006-10-07 16:17:29 UTC
A BIOS upgrade fixed the problem. There must be a hidden bug in the compiler or something similar. I have the old BIOS saved, so I can do more testing if you want.
Comment 18 Darlene J. Ford 2007-01-07 02:16:09 UTC
This sounds similar to Bug 175784, most recently reported to have occurred on my AMD64 system after a BIOS upgrade. Are there some docs explaining how to boot with the noapic option if I poke around a bit?
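On the noapic question: `noapic` is a kernel command-line parameter, so it goes at the end of the `kernel` line in the boot loader configuration. Assuming the system boots through GRUB legacy, with its configuration in /boot/grub/grub.conf, the parameter can be tried once by pressing `e` on the boot menu entry, appending `noapic` to the kernel line, and pressing `b` to boot. To make it permanent, the same edit is made in the file. The sketch below shows that edit on a made-up entry; `add_kernel_param` is a hypothetical helper, and the root device and paths in the sample are placeholders, not values from this report.

```python
def add_kernel_param(grub_conf_text, param):
    """Append param to every 'kernel' line of a GRUB legacy config, if absent."""
    out = []
    for line in grub_conf_text.splitlines():
        words = line.split()
        if words and words[0] == "kernel" and param not in words[1:]:
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical grub.conf entry; the root device and file paths are placeholders.
SAMPLE = """\
title Red Hat Enterprise Linux (2.6.9-42.0.2.EL)
\troot (hd0,0)
\tkernel /vmlinuz-2.6.9-42.0.2.EL ro root=LABEL=/
\tinitrd /initrd-2.6.9-42.0.2.EL.img
"""

print(add_kernel_param(SAMPLE, "noapic"))
```

The helper only transforms text; on a real system it would be safer to back up the configuration file first and review the result by hand before rebooting.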
Comment 19 Milan Kerslager 2007-02-10 08:23:48 UTC
As the current kernel 2.6.9-42.0.8.EL has no problem and the BIOS update fixed the issue, I suggest closing this bug. I tried, but even though I'm the submitter I'm not allowed to close it... The strange part is that my plain rebuilt kernel worked but RH's kernel did not. So there may be a hidden bug in the build system (maybe already fixed).
Comment 20 Darlene J. Ford 2007-02-10 17:29:09 UTC
(In reply to comment #17)
> BIOS upgrade fix the problem.
> There must be a hidden bug in the compiler or something similar. I have old BIOS
> saved so I'm able to do more tesing if you want to.
I'd like to try to reproduce this result (because it would make my system work under linux again.) Do you remember what Rev BIOS and kernel you tested? I thought I had the latest HP BIOS, but life could get so much better if I'm wrong about this.
Comment 21 Jason Baron 2007-03-01 17:13:23 UTC
OK, thanks Milan. I'm going to close this. Darlene, if you are still seeing an issue, please open a new bug.