Bug 204388

Summary: Kernel 42.0.2 hangs because timer not found
Product: Red Hat Enterprise Linux 4
Reporter: Milan Kerslager <milan.kerslager>
Component: kernel
Assignee: Jason Baron <jbaron>
Status: CLOSED NOTABUG
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 4.4
CC: darford, jbaron, jburke, knoel
Target Milestone: ---
Keywords: Regression
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2007-03-01 17:13:23 UTC
Attachments:
Output messages of successful 42.0.2 boot with apic=debug (flags: none)
dmesg output - system boots with 2.6.9-42.EL (flags: none)
dmesg output - system boots with 2.6.9-42.0.2.EL and "noapic" parameter (flags: none)
dmesg output - system boots with 2.6.9-42.0.2.EL and "apic=debug" parameter (flags: none)
dmidecode output (flags: none)
lspci output (flags: none)
dmesg output from testing kernels (flags: none)

Description Milan Kerslager 2006-08-28 20:10:18 UTC
The latest update, 2.6.9-42.0.2.EL, does not boot because the timer is not
found. When booted with the suggested apic=debug, the kernel boots normally.
The U4 kernel, kernel-2.6.9-42.EL, also boots fine.

This is an ASUS A8N-E board with an Athlon 64 3000+ (32-bit kernel).

The last messages on the console are (sorry, these were copied by hand):

ENABLING IO-APIC IRQs
..TIMER: vector 0x31 pin1=2 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A... failed.
...trying to set up timer as Virtual Wire IRQ.. failed
...trying to set up timer as ExtINT IRQ.. failed :(.
Kernel Panic - not syncing: IO-APIC timer doesn't work! Boot with apic=debug and
send report then noapic

# lspci
00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2)
00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Ethernet controller: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 VGA compatible controller: ATI Technologies Inc RV370 5B60 [Radeon X300 (PCIE)]
01:00.1 Display controller: ATI Technologies Inc RV370 [Radeon X300SE]
05:06.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02)
05:07.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)

Comment 1 Milan Kerslager 2006-08-28 20:16:40 UTC
Created attachment 135079 [details]
Output messages of successful 42.0.2 boot with apic=debug

The same kernel normally does not boot (without apic=debug).

Comment 2 Jason Baron 2006-09-19 18:46:00 UTC
So this is the x86 up kernel? Can you double check that the up x86 -42 kernel
boots correctly? Also, for the 42.0.2 kernel that boots with apic=debug, can you
try passing 'noapic'?

thanks.

Comment 3 Milan Kerslager 2006-09-21 17:12:05 UTC
I double-checked that on this UP machine the -42 kernel boots without a problem
and the -42.0.2 kernel does not boot (it boots only with the apic=debug or
noapic parameter).



Comment 4 Milan Kerslager 2006-09-21 17:15:14 UTC
Created attachment 136884 [details]
dmesg output - system boots with 2.6.9-42.EL

Comment 5 Milan Kerslager 2006-09-21 17:16:18 UTC
Created attachment 136885 [details]
dmesg output - system boots with 2.6.9-42.0.2.EL and "noapic" parameter

Comment 6 Milan Kerslager 2006-09-21 17:17:10 UTC
Created attachment 136886 [details]
dmesg output - system boots with 2.6.9-42.0.2.EL and "apic=debug" parameter

Comment 7 Milan Kerslager 2006-09-21 17:18:16 UTC
Created attachment 136888 [details]
dmidecode output

Comment 8 Milan Kerslager 2006-09-21 17:19:07 UTC
Created attachment 136889 [details]
lspci output

Comment 9 Jason Baron 2006-09-22 16:23:39 UTC
hmmm very strange, b/c none of the patches in -42.0.2 seem like they would cause
the kernel not to boot like this. i'd like for us to iterate over the 10 or so
patches in the kernel to determine which one is causing this
problem...unfortunately i don't have time today to compile these 10 kernels for
you...i could though send you the 10 patches and you could try them yourself.
otherwise, i'll get you the test kernels next week. thanks.

Comment 10 Milan Kerslager 2006-09-23 18:39:16 UTC
I see the patch differences in the SPEC file (Patch2213 Patch5057 Patch5058
Patch5059 Patch5060 Patch5061 Patch5062 Patch5063 Patch5064). I can build -42
with each patch enabled individually. I'll post the results here next week.
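
For reference, the spec comparison described here can be sketched in a few lines of Python. This is only an illustration, assuming the usual "PatchNNNN: file" declarations in an RPM spec file; the spec fragments and patch file names below are hypothetical placeholders, not the real kernel spec contents.

```python
import re

# Matches RPM spec patch declarations such as "Patch5057: some-fix.patch".
PATCH_RE = re.compile(r"^(Patch\d+)\s*:\s*(\S+)", re.MULTILINE)


def patch_tags(spec_text):
    """Return {tag: filename} for every PatchNNNN line in a spec file's text."""
    return dict(PATCH_RE.findall(spec_text))


def added_patches(old_spec, new_spec):
    """Return patch tags present in new_spec but not in old_spec, in numeric order."""
    old, new = patch_tags(old_spec), patch_tags(new_spec)
    return sorted(set(new) - set(old), key=lambda tag: int(tag[len("Patch"):]))


# Hypothetical spec fragments; the file names are placeholders only.
OLD_SPEC = "Patch1: placeholder-a.patch\n"
NEW_SPEC = (
    "Patch1: placeholder-a.patch\n"
    "Patch2213: placeholder-b.patch\n"
    "Patch5057: placeholder-c.patch\n"
)

print(added_patches(OLD_SPEC, NEW_SPEC))
```

Run against the real -42 and -42.0.2 spec files, the same comparison should list the nine tags named above.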

Comment 11 Jason Baron 2006-09-25 15:10:55 UTC
ok great! thanks.

Comment 12 Milan Kerslager 2006-10-03 18:04:26 UTC
I'm unable to recompile the kernel. This may be related to a HW problem on my
build system. I'll try tomorrow. Sorry for the delay. Babysitting is keeping me
a little bit overloaded :-)

Comment 13 Milan Kerslager 2006-10-04 09:13:19 UTC
I created a chroot build environment on another machine. The kernels are being
built right now, so I expect to be able to reboot the server multiple times
today or tomorrow late evening.

Comment 14 Milan Kerslager 2006-10-05 16:42:22 UTC
Kernels each carrying only one of the delta patches between 42.EL and 42.0.2.EL
have been built, and all of them boot without the timer problem. Only RH's
42.0.2.EL kernel fails to boot. The kernels that booted are named
42.0.0.[0-9].EL, and their dmesg output is in the attachment. The .0 kernel has
none of the patches (so it is like 42.EL); kernels 1 to 9 each have exactly one
of the 9 patches that differ between 42.EL and 42.0.2.EL enabled. A kernel with
all the patches included (i.e. like 42.0.2.EL) has not been tested yet and is
being built right now. I'll test it later. So I'm wondering where the problem is.
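
The test matrix described above can be written down as a small Python sketch. The patch tags come from comment 10; which patch went into which numbered kernel is not stated in this report, so the ordering used here is an assumption for illustration only.

```python
# Patch tags listed in comment 10 as the delta between 42.EL and 42.0.2.EL.
DELTA_PATCHES = [
    "Patch2213", "Patch5057", "Patch5058", "Patch5059", "Patch5060",
    "Patch5061", "Patch5062", "Patch5063", "Patch5064",
]


def build_plan(patches):
    """Map each test kernel release to the list of delta patches enabled in it."""
    plan = {"42.0.0.0.EL": []}  # baseline: no delta patches, like 42.EL
    for index, tag in enumerate(patches, start=1):
        # Assumed numbering: kernel N carries only the N-th patch in the list.
        plan[f"42.0.0.{index}.EL"] = [tag]
    return plan


for release, enabled in build_plan(DELTA_PATCHES).items():
    print(release, ", ".join(enabled) or "(no delta patches)")
```

Testing one patch per kernel isolates a single bad patch, but it cannot expose a problem that needs two or more patches together, which is why the build with all patches applied is still worth testing.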

The actual build environment is not a RHEL4 system but a fresh, up-to-date
CentOS4 (i.e. RHEL4 rebuild) chroot, because of the HW problem on my RHEL4
system. I can create a chroot build environment from RHEL4 packages as well if
you wish to compare the resulting kernels.

Please tell me if you want to test anything else.

# cat /proc/version (problematic version)
Linux version 2.6.9-42.0.2.EL (bhcompile.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #1 Thu Aug 17 17:36:53 EDT 2006

Comment 15 Milan Kerslager 2006-10-05 16:48:08 UTC
Created attachment 137830 [details]
dmesg output from testing kernels

Comment 16 Jason Baron 2006-10-06 17:17:53 UTC
very strange. i'm really at a loss to explain this....we just released 42.0.3
yesterday...i wonder if that works...this bug reminds me a lot of bz #203423
where a BIOS upgrade fixed the problem...

Comment 17 Milan Kerslager 2006-10-07 16:17:29 UTC
A BIOS upgrade fixed the problem.

There must be a hidden bug in the compiler or something similar. I have the old
BIOS saved, so I can do more testing if you want.

Comment 18 Darlene J. Ford 2007-01-07 02:16:09 UTC
This sounds similar to Bug 175784, most recently reported to have occurred on my
AMD64 system after a BIOS upgrade.  Are there some docs explaining how to boot
with the noapic option if I poke around a bit?
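
As a rough illustration of what booting with noapic involves: the parameter is appended to the kernel line of the boot loader entry. The Python sketch below assumes a GRUB legacy configuration (typically /boot/grub/grub.conf on RHEL 4); the sample entry and the helper name add_boot_param are hypothetical, and the real file should be backed up before any change.

```python
def add_boot_param(grub_conf, param):
    """Append param to every 'kernel' line of a GRUB legacy config that lacks it."""
    out = []
    for line in grub_conf.splitlines():
        words = line.split()
        # Only touch "kernel ..." lines, and skip ones that already have the parameter.
        if words and words[0] == "kernel" and param not in words[1:]:
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical boot entry, for illustration only.
SAMPLE = (
    "title Example (2.6.9-42.0.2.EL)\n"
    "\troot (hd0,0)\n"
    "\tkernel /vmlinuz-2.6.9-42.0.2.EL ro root=LABEL=/\n"
    "\tinitrd /initrd-2.6.9-42.0.2.EL.img\n"
)

print(add_boot_param(SAMPLE, "noapic"))
```

For a one-off test the same parameter can usually be typed onto the kernel line from the GRUB menu at boot, without editing the file.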

Comment 19 Milan Kerslager 2007-02-10 08:23:48 UTC
As the current kernel 2.6.9-42.0.8.EL has no problem and the BIOS update fixed
the problem, I'm suggesting that this bug be closed. I tried, but even though
I'm the submitter I'm not allowed to close this bug...

The strange part is that my plainly rebuilt kernel worked but RH's kernel did
not. So there may be a hidden bug in the build system (maybe already fixed).

Comment 20 Darlene J. Ford 2007-02-10 17:29:09 UTC
(In reply to comment #17)
> BIOS upgrade fix the problem.
> There must be a hidden bug in the compiler or something similar. I have old
> BIOS saved so I'm able to do more tesing if you want to.

I'd like to try to reproduce this result (because it would make my system work 
under linux again.)  Do you remember what Rev BIOS and kernel you tested?  I 
thought I had the latest HP BIOS, but life could get so much better if I'm 
wrong about this.

Comment 21 Jason Baron 2007-03-01 17:13:23 UTC
ok thanks Milan. I'm going to close this.

Darlene, if you are still seeing an issue please open a new bug.