Bug 435239 - [5.2][kdump] MP-BIOS bug: 8254 timer not connected to IO-APIC
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Ed Pollard
Martin Jenner
Reported: 2008-02-28 00:55 EST by CAI Qian
Modified: 2013-08-05 20:03 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-10-22 06:13:40 EDT

Attachments
boot log in ibm-e326m for capture kernel hangs (7.16 KB, text/plain)
2008-02-28 00:55 EST, CAI Qian
boot log in ibm-e326m for capture kernel works (17.29 KB, text/plain)
2008-02-28 00:56 EST, CAI Qian

Description CAI Qian 2008-02-28 00:55:10 EST
Description of problem:
It has been observed several times on certain IBM x86_64 machines that the capture
kernel hangs at:

SMP alternatives: switching to UP code
ACPI: Core revision 20060707
..MP-BIOS bug: 8254 timer not connected to IO-APIC

With the RHEL5.2-Server-20080225.2 tree, the following RHTS machines are affected,
as far as I have tested:


There is a workaround, though: add "noapic" to the capture kernel command line.
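
A minimal sketch of applying that workaround to a kdump configuration file, assuming the capture kernel options live in a KDUMP_COMMANDLINE_APPEND="..." line of a sysconfig-style file. The key name, the function name add_kernel_option and the sample text are assumptions for illustration, not taken from this report:

```python
import re

def add_kernel_option(config_text, option="noapic",
                      key="KDUMP_COMMANDLINE_APPEND"):
    """Return config_text with `option` added to the quoted value of `key`.

    Assumes a line of the form KEY="opt1 opt2". The key name is an
    assumption about the kdump sysconfig file, not taken from this report.
    """
    pattern = re.compile(r'^(%s=")([^"]*)(")[ \t]*$' % re.escape(key), re.M)

    def repl(match):
        options = match.group(2).split()
        if option not in options:       # do not add the option twice
            options.append(option)
        return match.group(1) + " ".join(options) + match.group(3)

    new_text, count = pattern.subn(repl, config_text)
    if count == 0:                      # key not present: append a new line
        sep = "" if (not config_text or config_text.endswith("\n")) else "\n"
        new_text = config_text + sep + '%s="%s"\n' % (key, option)
    return new_text

# Hypothetical sample content; real files will differ.
sample = 'KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1"\n'
print(add_kernel_option(sample), end="")
```

The function only transforms text; writing the result back and reloading the capture kernel are left out on purpose.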

I have tried different versions of both the kernel (2.6.18-53.el5) and kexec-tools
(1.101-194.4.el5) without success.

Note that the problem is only triggered by certain crash scenarios, for example
a BUG injected by LKDTM (Linux Kernel Dump Test Module) in do_irq(). A simple
"echo c >/proc/sysrq-trigger" works perfectly fine.

Version-Release number of selected component (if applicable):

How reproducible:
always (3 times in a row)

Steps to Reproduce:
1. Reserve one of the affected machines, configure kdump, and boot the
kernel with crashkernel=128M@16M.
2. wget
http://porkchop.devel.redhat.com/qa/rhts/lookaside/ltp-kdump-20080228.tar.gz; cd
kdump/lib/lkdtm; export USE_SYMBOL_NAME=1; make
3. insmod lkdtm.ko cpoint_name=INT_HARDWARE_ENTRY cpoint_type=BUG cpoint_count=05

Actual results:
Capture kernel hangs.

Expected results:
Capture kernel boots successfully.

Additional information:
Boot logs for both the hanging and the working (via sysrq-c) capture kernel have
been attached. Comparing the two files shows some interesting differences:

--- ibm-e326m-hangs.log 2008-02-28 13:32:58.000000000 +0800
+++ ibm-e326m-works.log 2008-02-28 13:44:13.000000000 +0800
-CPU 0: aperture @ 410000000 size 32 MB
+CPU 0: aperture @ 412000000 size 32 MB
 Aperture too small (32 MB)
 No AGP bridge found
 Memory: 118844k/147440k available (2456k kernel code, 12212k reserved, 1242k data, 196k init)
-Calibrating delay using timer specific routine.. 3995.00 BogoMIPS (lpj=1997500)
+Calibrating delay using timer specific routine.. 3994.95 BogoMIPS (lpj=1997477)
 Security Framework v1.0.0 initialized
 SELinux:  Initializing.
 selinux_register_security:  Registering secondary module capability
@@ -128,10 +68,321 @@
 Mount-cache hash table entries: 256
 CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
 CPU: L2 Cache: 1024K (64 bytes/line)
-CPU 0/0 -> Node 0
+CPU 0/1 -> Node 0
 CPU: Physical Processor ID: 0
-CPU: Processor Core ID: 0
+CPU: Processor Core ID: 1
 SMP alternatives: switching to UP code
 ACPI: Core revision 20060707
-..MP-BIOS bug: 8254 timer not connected to IO-APIC
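
A comparison like the one above can be reproduced with a short script. This is only a sketch using Python's difflib; the function name diff_boot_logs and the file labels are made up, and the sample lines are a few of the lines quoted in the diff above rather than the full attachments:

```python
import difflib

def diff_boot_logs(hang_lines, work_lines,
                   hang_name="hangs.log", work_name="works.log"):
    """Return a unified diff (as a list of lines) between two boot logs."""
    return list(difflib.unified_diff(
        hang_lines, work_lines,
        fromfile=hang_name, tofile=work_name,
        lineterm="", n=1))

# A few lines taken from the diff quoted in this report.
hangs = [
    "CPU 0/0 -> Node 0",
    "CPU: Physical Processor ID: 0",
    "CPU: Processor Core ID: 0",
    "ACPI: Core revision 20060707",
    "..MP-BIOS bug: 8254 timer not connected to IO-APIC",
]
works = [
    "CPU 0/1 -> Node 0",
    "CPU: Physical Processor ID: 0",
    "CPU: Processor Core ID: 1",
    "ACPI: Core revision 20060707",
]
for line in diff_boot_logs(hangs, works):
    print(line)
```

With real attachments, the two lists would come from reading each log file and stripping the trailing newlines.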

I have also tried building a new kernel with the "new early apic init patch" from
BZ336371, but it failed to progress any further on ibm-e326m:

  Booting 'Red Hat Enterprise Linux Server (2.6.18-83.el5.earlyapic)'

root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83
kernel /vmlinuz-2.6.18-83.el5.earlyapic ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200
   [Linux-bzImage, setup=0x1e00, size=0x1c411c]
initrd /initrd-2.6.18-83.el5.earlyapic.img
   [Linux-initrd @ 0x37cd3000, 0x31ce14 bytes]
Comment 1 CAI Qian 2008-02-28 00:55:10 EST
Created attachment 296157 [details]
boot log in ibm-e326m for capture kernel hangs
Comment 2 CAI Qian 2008-02-28 00:56:36 EST
Created attachment 296160 [details]
boot log in ibm-e326m for capture kernel works
Comment 3 CAI Qian 2008-03-04 03:18:36 EST
Same problem on ibm-wildhorse-01, but it looks like it failed with a different crash
Comment 4 Jeff Burke 2008-03-31 14:22:06 EDT
   Do you know if this is a regression for 5.1?

Comment 5 CAI Qian 2008-03-31 18:27:50 EDT
Not a regression against 5.1. It works with neither the RHEL5U1 kernel (2.6.18-53.el5)
nor kexec-tools (1.101-194.4.el5).
Comment 6 CAI Qian 2008-04-15 10:44:33 EDT
I can still see this with the -89.el5 kernel:

Total of 1 processors activated (3996.41 BogoMIPS).
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ...  failed.
timer doesn't work through the IO-APIC - disabling NMI Watchdog!
...trying to set up timer as Virtual Wire IRQ... failed.
...trying to set up timer as ExtINT IRQ...
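
When collecting these logs from several machines, the "..TIMER:" line can be pulled apart with a small parser. This is a sketch only; the name parse_timer_line is made up, and it assumes the line always has the exact field order shown above. What the -1 values mean is not explained in this report, so the code just extracts them:

```python
import re

# Matches the "..TIMER:" line format seen in the log above.
TIMER_RE = re.compile(
    r"\.\.TIMER: vector=(0x[0-9a-fA-F]+) "
    r"apic1=(-?\d+) pin1=(-?\d+) apic2=(-?\d+) pin2=(-?\d+)")

def parse_timer_line(line):
    """Return the ..TIMER fields as a dict of ints, or None if no match."""
    m = TIMER_RE.search(line)
    if m is None:
        return None
    fields = {"vector": int(m.group(1), 16)}   # vector is printed in hex
    for name, value in zip(("apic1", "pin1", "apic2", "pin2"),
                           m.groups()[1:]):
        fields[name] = int(value)
    return fields

info = parse_timer_line(
    "..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1")
print(info)
```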

Comment 7 CAI Qian 2008-04-15 10:47:51 EDT
So it looks like i386 is also affected.
Comment 8 Ed Pollard 2008-05-08 10:31:04 EDT
It seems there are several issues with kdump on a few IBM systems. Once I can get
to them reliably again, I am going to start making sure that the firmware is up to
date on all of them. A little searching around the net showed this scenario being
seen a couple of years ago on several different types of systems, and it seems to
have been generally accepted as a BIOS problem, but I won't know until I can get
to the systems.

I am also a bit concerned that this is only being seen in the kdump kernel and
not the boot kernel, if it is indeed BIOS related.
Comment 9 CAI Qian 2008-07-15 05:50:49 EDT
Hi, any update so far? I am wondering if it is possible to update BIOS for those


So I could retest in RHEL5.3.
Comment 10 Ed Pollard 2008-07-15 10:08:49 EDT
I am leaving the office for a few days and plan to do this when I get back.
Sorry for the delay. Morrison should have had its BIOS updated, so you might
give that one a try first.
Comment 11 CAI Qian 2008-07-15 23:38:01 EDT
ibm-morrison.rhts.bos.redhat.com is currently unavailable in RHTS.
Comment 12 CAI Qian 2008-07-16 06:11:20 EDT
Hi Ed, I'll retest once you have had time to update the BIOS on those machines. Thanks!
Comment 13 CAI Qian 2008-10-22 06:13:40 EDT
I'll close this out, as using jprobe() to trigger artificial crashes is probably not a good way to test kdump. I'll create a new kernel module to test those scenarios and open new BZs for any issues found.
