473404 – [5.3] Kdump Kernel Hangs on Dell AMD Machines

Summary: [5.3] Kdump Kernel Hangs on Dell AMD Machines

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.3
Hardware:	All
OS:	Linux
Priority:	low
Severity:	low
Target Milestone:	rc
Target Release:	---
Assignee:	Neil Horman
QA Contact:	Red Hat Kernel QE team
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	525215 526775 527955 533192 5.5TechNotes-Updates

Reported:	2008-11-28 10:49 UTC by Qian Cai
Modified:	2018-12-02 18:26 UTC
CC List:	9 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2010-03-30 07:45:05 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
dell-pem605-01 kdump kernel hangs (3.25 KB, text/plain) 2008-11-28 10:51 UTC, Qian Cai	no flags	Details
dell-pem605-01 kdump kernel hangs with "acpi=off noacpi" (2.61 KB, text/plain) 2008-11-28 10:52 UTC, Qian Cai	no flags	Details
dell-pem605-01 normal kernel boots (15.68 KB, text/plain) 2008-11-28 10:53 UTC, Qian Cai	no flags	Details
dell-pem605-01 dmidecode (14.68 KB, text/plain) 2008-11-28 10:54 UTC, Qian Cai	no flags	Details
dell-pem605-01 cpuinfo (2.41 KB, text/plain) 2008-11-28 10:55 UTC, Qian Cai	no flags	Details
dell-per805-01 kdump kernel hangs (3.55 KB, text/plain) 2008-11-28 10:58 UTC, Qian Cai	no flags	Details
dell-per805-01 kdump kernel hangs with "acpi=off noacpi" (2.86 KB, text/plain) 2008-11-28 11:00 UTC, Qian Cai	no flags	Details
dell-per805-01 normal kernel boots (19.20 KB, text/plain) 2008-11-28 11:00 UTC, Qian Cai	no flags	Details
dell-per805-01 dmidecode (21.28 KB, text/plain) 2008-11-28 11:01 UTC, Qian Cai	no flags	Details
dell-per805-01 cpuinfo (5.47 KB, text/plain) 2008-11-28 11:01 UTC, Qian Cai	no flags	Details
patch to add reserved sections to i386 kexec (3.06 KB, patch) 2009-01-09 20:47 UTC, Neil Horman	no flags	Details \| Diff
new version of patch to add reserved sections to i386 kexec (4.18 KB, patch) 2009-01-12 20:54 UTC, Neil Horman	no flags	Details \| Diff

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2010:0178	0	normal	SHIPPED_LIVE	Important: Red Hat Enterprise Linux 5.5 kernel security and bug fix update	2010-03-29 12:18:21 UTC

Description Qian Cai 2008-11-28 10:49:03 UTC

Description of problem:
Kdump kernel hangs on Dell machines with AMD CPU.

...
Initializing CPU#0
CPU 0 irqstacks, hard=c1359000 soft=c1339000
PID hash table entries: 1024 (order: 10, 4096 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 122328k/146796k available (2134k kernel code, 8648k reserved, 891k data, 228k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
hpet0: at MMIO 0xfed00000 (virtual 0xc9800000), IRQs 2, 8, 31
hpet0: 3 32-bit timers, 25000000 Hz
Using HPET for base-timer

Adding "acpi=off noacpi" or "hpet=off hpet=disabled", kdump kernel hangs at a different place.

...
Initializing CPU#0
CPU 0 irqstacks, hard=c1359000 soft=c1339000
PID hash table entries: 1024 (order: 10, 4096 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 122328k/146796k available (2134k kernel code, 8648k reserved, 891k data, 228k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.

Adding "maxcpus=0" does not help either.

PAE kernel, non-PAE kernel, and Xen Domain 0 kernel are all affected.

Seen it on two machines so far,

dell-pem605-01.rhts.bos.redhat.com
dell-per805-01.rhts.bos.redhat.com

I don't know if it is a regression, because those machines were probably only just added to RHTS, nor whether it affects IA-32 only.

Version-Release number of selected component (if applicable):
kernel-2.6.18-124.el5
kernel-PAE-2.6.18-124.el5
kernel-xen-2.6.18-124.el5
kexec-tools-1.102pre-50.el5

How reproducible:
Around 50% with the bare metal kernel. Here are the test results on those two machines. All testing was done on IA-32.

dell-pem605:
bare metal kernel (sysrq-c):   2 FAIL - 3 PASS
Xen Domain 0 kernel (sysrq-c): 4 FAIL - 0 PASS

dell-per805:
bare metal kernel (sysrq-c):   1 FAIL - 2 PASS
Xen Domain 0 kernel (sysrq-c): 1 FAIL - 2 PASS
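
For reference, the counts above can be totalled per kernel type. This is a minimal sketch of that arithmetic only; the dictionary layout and the name `failure_rate` are illustrative and not part of any test harness used in this report.

```python
# (FAIL, PASS) counts copied from the results listed above
results = {
    ("dell-pem605", "bare metal"): (2, 3),
    ("dell-pem605", "Xen Domain 0"): (4, 0),
    ("dell-per805", "bare metal"): (1, 2),
    ("dell-per805", "Xen Domain 0"): (1, 2),
}


def failure_rate(kernel):
    """Return the fraction of failed runs for one kernel type, across both machines."""
    fails = sum(f for (_, k), (f, _p) in results.items() if k == kernel)
    total = sum(f + p for (_, k), (f, p) in results.items() if k == kernel)
    return fails / total


for kernel in ("bare metal", "Xen Domain 0"):
    print(f"{kernel}: {failure_rate(kernel):.1%}")
```

With so few runs per machine these fractions are only a rough indication of how often the hang occurs.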

Steps to Reproduce:
1. configure kdump with crashkernel=128M@16M
2. echo c >/proc/sysrq-trigger
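
To make the reservation in step 1 concrete, the sketch below converts a `size@offset` value into a physical address range. It is an illustration only: the helper names are made up here, and it handles just the simple `size@offset` form with a K, M or G suffix, which is an assumption about the input rather than a full description of what the kernel accepts.

```python
import re

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '128M' into bytes."""
    match = re.fullmatch(r"(\d+)([KMG])", text)
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def crashkernel_range(value):
    """Return (start, end) in bytes for a 'size@offset' value; end is exclusive."""
    size, offset = value.split("@")
    start = parse_size(offset)
    return start, start + parse_size(size)


start, end = crashkernel_range("128M@16M")
print(f"{start:#010x} - {end:#010x}")
```

The "user-defined physical RAM map" lines in the kdump kernel logs in this report (0000000001000000 - 0000000008f5b000) fall inside the range this computes.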

Comment 1 Qian Cai 2008-11-28 10:51:20 UTC

Created attachment 324968 [details]
dell-pem605-01 kdump kernel hangs

Comment 2 Qian Cai 2008-11-28 10:52:39 UTC

Created attachment 324969 [details]
dell-pem605-01 kdump kernel hangs with "acpi=off noacpi"

Comment 3 Qian Cai 2008-11-28 10:53:29 UTC

Created attachment 324971 [details]
dell-pem605-01 normal kernel boots

Comment 4 Qian Cai 2008-11-28 10:54:34 UTC

Created attachment 324972 [details]
dell-pem605-01 dmidecode

Comment 5 Qian Cai 2008-11-28 10:55:55 UTC

Created attachment 324974 [details]
dell-pem605-01 cpuinfo

Comment 6 Qian Cai 2008-11-28 10:56:27 UTC

# uname -ra
Linux dell-pem605-01.rhts.bos.redhat.com 2.6.18-124.el5PAE #1 SMP Mon Nov 17 17:11:02 EST 2008 i686 athlon i386 GNU/Linux

Comment 7 Qian Cai 2008-11-28 10:58:25 UTC

Created attachment 324975 [details]
dell-per805-01 kdump kernel hangs

Comment 8 Qian Cai 2008-11-28 11:00:16 UTC

Created attachment 324976 [details]
dell-per805-01 kdump kernel hangs with "acpi=off noacpi"

Comment 9 Qian Cai 2008-11-28 11:00:52 UTC

Created attachment 324977 [details]
dell-per805-01 normal kernel boots

Comment 10 Qian Cai 2008-11-28 11:01:24 UTC

Created attachment 324979 [details]
dell-per805-01 dmidecode

Comment 11 Qian Cai 2008-11-28 11:01:49 UTC

Created attachment 324980 [details]
dell-per805-01 cpuinfo

Comment 12 Qian Cai 2008-11-28 11:02:33 UTC

# uname -ra
Linux dell-per805-01.rhts.bos.redhat.com 2.6.18-124.el5PAE #1 SMP Mon Nov 17 17:11:02 EST 2008 i686 athlon i386 GNU/Linux

Comment 18 Qian Cai 2008-12-02 09:45:25 UTC

Adding "noapic noacpi acpi=off" to kdump kernel did not help either.

Comment 19 Neil Horman 2008-12-02 12:18:20 UTC

ok, so this system has never worked.  Is it only with the PAE kernel, or all kernels that it fails?  In fact, why are you running the pae kernel on this system?  Isn't it 64 bit hardware?

Comment 20 Qian Cai 2008-12-03 01:44:58 UTC

As you can read from the bug description, PAE kernel, non-PAE kernel, and Xen Domain 0 kernel are all affected on both IA-32 and x86-64 architectures.

Comment 21 Neil Horman 2008-12-12 16:35:03 UTC

Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
Note that this problem can be workaround if encountered by setting hpet=off in the KDUMP_COMMANDLINE_APPEND variable in /et/csysconfig//kdump.conf

Comment 22 Neil Horman 2008-12-12 20:34:24 UTC

I was talking with dchapman today, who recently found out that on several HP systems which are having a similar problem to this, there are ACPI regions in the e820 map which are not marked as ACPI NVS/DATA, but rather simply 'reserved'.  As kdump ignores reserved sections, it doesn't map them in kdump kernels, causing all sorts of odd behavior.  Explicitly mapping those segments made the systems work.  A similar patch is working here.

Doug is going to push a kexec and/or kernel patch upstream to universally map the reserved areas of RAM.  Since this appears to be the same problem, I'm going to close this as a dup of the bug that he is tracking this in, bz 475843.  Once Doug has his fixes pushed upstream, I'll pull them into the kernel and kexec for RHEL.

*** This bug has been marked as a duplicate of bug 475843 ***

Comment 23 Qian Cai 2008-12-25 10:56:04 UTC

I am afraid this is not the same as bug 475843. According to https://bugzilla.redhat.com/show_bug.cgi?id=475843#c16, the bug should be fixed by using kexec-tools-1.102pre-57.el5. However, I have tried kernel-PAE-2.6.18-128.el5 and kexec-tools-1.102pre-57.el5 here, and it did not solve the problem.

dell-pem605-01.rhts.bos.redhat.com login: SysRq : Trigger a crashdump
Linux version 2.6.18-128.el5PAE (mockbuild.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP8
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000100 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 00000000dfaa0000 (usable)
 BIOS-e820: 00000000dfaa0000 - 00000000dfab6000 (reserved)
 BIOS-e820: 00000000dfab6000 - 00000000dfad5c00 (ACPI data)
 BIOS-e820: 00000000dfad5c00 - 00000000e0000000 (reserved)
 BIOS-e820: 00000000f0000000 - 00000000f8000000 (reserved)
 BIOS-e820: 00000000fe000000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
user-defined physical RAM map:
 user: 0000000000000000 - 00000000000a0000 (usable)
 user: 0000000001000000 - 0000000008f5b000 (usable)
...
Memory: 122328k/146796k available (2119k kernel code, 8612k reserved, 879k data, 228k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
hpet0: at MMIO 0xfed00000 (virtual 0xc9800000), IRQs 2, 8, 31
hpet0: 3 32-bit timers, 25000000 Hz
Using HPET for base-timer

In addition, https://bugzilla.redhat.com/show_bug.cgi?id=475843#c3 says that the problem in that bug was that it was unable to map the ACPI tables. However, the ACPI data region is clearly present in the map, at least for this bug,

BIOS-e820: 00000000dfab6000 - 00000000dfad5c00 (ACPI data)
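
As a reading aid for the map above, the sketch below parses the BIOS-e820 lines from this comment and prints the non-usable regions with their sizes. The regular expression and the name `parse_e820` are illustrative; the end addresses are treated as exclusive, which is an assumption about how the kernel prints them.

```python
import re

# BIOS-e820 lines copied from the boot log in this comment
LOG = """\
 BIOS-e820: 0000000000000100 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 00000000dfaa0000 (usable)
 BIOS-e820: 00000000dfaa0000 - 00000000dfab6000 (reserved)
 BIOS-e820: 00000000dfab6000 - 00000000dfad5c00 (ACPI data)
 BIOS-e820: 00000000dfad5c00 - 00000000e0000000 (reserved)
 BIOS-e820: 00000000f0000000 - 00000000f8000000 (reserved)
 BIOS-e820: 00000000fe000000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
"""

E820_LINE = re.compile(r"BIOS-e820:\s+([0-9a-f]+)\s+-\s+([0-9a-f]+)\s+\((.+)\)")


def parse_e820(log):
    """Return a list of (start, end, kind) tuples, one per BIOS-e820 line."""
    regions = []
    for line in log.splitlines():
        m = E820_LINE.search(line)
        if m:
            regions.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
    return regions


regions = parse_e820(LOG)
for start, end, kind in regions:
    if kind != "usable":
        size_kib = (end - start) // 1024
        print(f"{start:#018x} - {end:#018x}  {kind:<10}  {size_kib} KiB")
```

The output shows one ACPI data region sitting between two regions that the BIOS marks only as reserved.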

Comment 24 Qian Cai 2008-12-25 10:57:42 UTC

(In reply to comment #21)
> Release note added. If any revisions are required, please set the 
> "requires_release_notes" flag to "?" and edit the "Release Notes" field
> accordingly.
> All revisions will be proofread by the Engineering Content Services team.
> 
> New Contents:
> Note that this problem can be workaround if encountered by setting hpet=off in
> the KDUMP_COMMANDLINE_APPEND variable in /et/csysconfig//kdump.conf

This is wrong. As I stated in comment #0,

Adding "acpi=off noacpi" or "hpet=off hpet=disabled", kdump kernel hangs at a
different place.

There is basically no workaround.

Comment 25 Neil Horman 2009-01-06 20:35:22 UTC

Regarding your ACPI comment, it's not that we're not expressly mapping the ACPI regions; in fact we are.  The problem that Doug noted was that sometimes BIOS vendors will place ACPI data (or other ancillary data required to make various bits of hardware function properly) inside areas marked in the e820 tables as reserved (rather than ACPI).  kdump was not mapping these into the kdump kernel, resulting in various odd failures. Cai, do you have the failing machine reserved in rhts at the moment, and if so, which one?  I'd like to poke about on it a bit and verify that we're now reserving all the memory regions properly.  Thanks!

Comment 27 Neil Horman 2009-01-08 20:19:53 UTC

Cai, I've been working on dell-per805-01.rhts.bos.redhat.com, and using kexec-tools-1.102pre-57.el5, I can capture a vmcore no problem.  Can you please confirm?  57 is what's supposed to be shipping with 5.3, so I think, if you're comfortable with this, we should be able to close this, since 57 is the version that starts mapping reserved sections of memory as we should be.

Comment 28 Qian Cai 2009-01-09 09:27:10 UTC

Neil, as you can see from comment #0, it might work sometimes, but the failure rate looks to be around 50%. Just be a little patient. :)

I have just reproduced the problem on the same machine using kexec-tools-1.102pre-57.el5.

# rpm -q kexec-tools
kexec-tools-1.102pre-57.el5

# echo c >/proc/sysrq-trigger

SysRq : Trigger a crashdump
Linux version 2.6.18-128.el5PAE (mockbuild.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.28
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000100 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 00000000cfaa0000 (usable)
 BIOS-e820: 00000000cfaa0000 - 00000000cfab6000 (reserved)
 BIOS-e820: 00000000cfab6000 - 00000000cfad5c00 (ACPI data)
 BIOS-e820: 00000000cfad5c00 - 00000000d0000000 (reserved)
 BIOS-e820: 00000000f0000000 - 00000000f8000000 (reserved)
 BIOS-e820: 00000000fe000000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000130000000 (usable)
user-defined physical RAM map:
 user: 0000000000000000 - 00000000000a0000 (usable)
 user: 0000000001000000 - 0000000008f5b000 (usable)
...
Memory: 122584k/146796k available (2119k kernel code, 8352k reserved, 879k data, 228k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
hpet0: at MMIO 0xfed00000 (virtual 0xc9800000), IRQs 2, 8, 31
hpet0: 3 32-bit timers, 25000000 Hz
Using HPET for base-timer
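
The second figure in the "Memory:" line above can be cross-checked against the user-defined map. The sketch below is plain arithmetic on the end address of the last "user:" range; the 4 KiB page size is an assumption (the usual i386 page size), not something stated in the log.

```python
# Exclusive end address of the last "user:" usable range in the log above
USER_MAP_END = 0x08F5B000
PAGE_SIZE = 4096  # assumption: 4 KiB pages

total_kib = USER_MAP_END // 1024
total_pages = USER_MAP_END // PAGE_SIZE

print(f"top of usable memory: {total_kib}k")
print(f"4 KiB pages below that address: {total_pages}")
print(f"whole MiB below that address: {total_kib // 1024}")
```

The first value matches the 146796k in the "Memory:" line. The other two appear consistent with the "Total pages: 36699" and "143MB LOWMEM available" lines in the longer log in comment 35.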

Comment 29 Neil Horman 2009-01-09 15:42:06 UTC

Interesting, kexec doesn't seem to pick up on adding the reserved areas every time, and that's what corresponds to the HPET hang.  Should be pretty easy to track down.  Odd behavior though, we just parse /proc/iomem to get that info, I wonder what's changing the behavior.
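
To illustrate what parsing /proc/iomem for this information might look like, here is a minimal sketch. It is not the kexec-tools code, which is written in C and is not reproduced in this report. The sample text is a hypothetical excerpt whose addresses loosely follow the e820 map in comment 28, and the function name and the set of region names it looks for are assumptions.

```python
import re

# Hypothetical /proc/iomem excerpt (ranges in this file are inclusive).
SAMPLE_IOMEM = """\
00000000-0009ffff : System RAM
00100000-cfa9ffff : System RAM
  01000000-08ffffff : Crash kernel
cfaa0000-cfab5fff : reserved
cfab6000-cfad5bff : ACPI Tables
cfad5c00-cfffffff : reserved
"""

IOMEM_LINE = re.compile(r"^([0-9a-f]+)-([0-9a-f]+) : (.+)$")


def top_level_ranges(text, wanted=("reserved", "ACPI Tables")):
    """Yield (start, end_inclusive, name) for unindented entries named in `wanted`."""
    for line in text.splitlines():
        # Nested entries are indented, so they fail to match at column 0.
        m = IOMEM_LINE.match(line)
        if m and m.group(3) in wanted:
            yield int(m.group(1), 16), int(m.group(2), 16), m.group(3)


for start, end, name in top_level_ranges(SAMPLE_IOMEM):
    print(f"{start:08x}-{end:08x} {name}")
```

On a real system the text would come from reading /proc/iomem instead of the sample string.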

Comment 30 Neil Horman 2009-01-09 20:42:27 UTC

I think I see at least part of the problem.  The PAE kernel maps some of its RAM to a different location than what we normally find in the original e820 map, and so on kdump reboot we have some RAM remapped in the physical e820 map using an address that is inaccessible until later during the boot.  I need to figure out how to override that.

Comment 31 Neil Horman 2009-01-09 20:47:33 UTC

Created attachment 328586 [details]
patch to add reserved sections to i386 kexec

Additionally, we'll also need this patch to kexec.  Even though this is a 64 bit system, it's running a 32 bit OS, and kexec has a per-arch kexec command line generation setup, so the code we added to x86_64 to add reserved & ACPI e820 sections needs to be copied over to x86.  This patch does that.

So all I need to figure out now is how to fix up the remapped physical memory issue (I think).
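
As a rough illustration of what adding reserved and ACPI sections to a generated command line can look like, the sketch below builds `memmap=` arguments using the kernel's documented markers (`@` for usable RAM, `#` for ACPI data, `$` for reserved). This is not the attached patch: the function name, the input tuples and the choice to print sizes in K are assumptions made for the example.

```python
MARKERS = {"usable": "@", "ACPI data": "#", "reserved": "$"}


def memmap_args(regions):
    """Build a memmap= argument string from (start, size, kind) tuples in bytes."""
    args = ["memmap=exactmap"]
    for start, size, kind in regions:
        if start % 1024 or size % 1024:
            # Sizes are emitted in K, so anything finer would be silently truncated.
            raise ValueError(f"region at {start:#x} is not 1 KiB aligned")
        args.append(f"memmap={size // 1024}K{MARKERS[kind]}{start // 1024}K")
    return " ".join(args)


# Regions taken from the logs in this report: the two "user:" ranges plus the
# ACPI data range and the first reserved range from comment 28.
example = [
    (0x00000000, 0x000A0000, "usable"),
    (0x01000000, 0x08F5B000 - 0x01000000, "usable"),
    (0xCFAA0000, 0xCFAB6000 - 0xCFAA0000, "reserved"),
    (0xCFAB6000, 0xCFAD5C00 - 0xCFAB6000, "ACPI data"),
]
print(memmap_args(example))
```

If such a string is ever passed through a shell, the `$` marker needs quoting.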

Comment 32 Neil Horman 2009-01-12 20:54:13 UTC

Created attachment 328782 [details]
new version of patch to add reserved sections to i386 kexec

grr, it looks like even with the above patch, we still hang on the hpet timer.  I'll need to dig in further to find exactly where we're hanging.

Comment 33 Neil Horman 2009-01-13 20:28:32 UTC

grr, just tracked this down.  We're getting stuck in calibrate_delay.  Given that these are quad core processors on an HT bus, I strongly suspect that this is a duplicate of bz 462519, the fix for which is a much earlier initialization of the APIC, which I am trying to figure out.  I'm going to close this as a dupe of that, and we can re-open if/when I figure out how to handle the APIC movement properly.

*** This bug has been marked as a duplicate of bug 462519 ***

Comment 34 Qian Cai 2009-06-19 15:47:44 UTC

FYI. I have seen the kdump kernel fail on another two Dell machines during RHEL5.4 testing, which look like they have the same issue.

dell-pem805-01.rhts.bos.redhat.com
dell-pem905-01.rhts.bos.redhat.com

Although we can go a little bit further beyond the line of "Using HPET for base-timer" by using,

kernel-2.6.18-153.el5
kexec-tools-1.102pre-73.el5

...
Using HPET for base-timer
Calibrating delay loop (skipped), value calculated using timer frequency.. 4000)
Security Framework v1.0.0 initialized
SELinux:  Initializing.
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0(4) -> Core 2
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Checking 'hlt' instruction... irq 106, desc: c12f2380, depth: 1, count: 0, unha0
->handle_irq():  c104b61e, handle_bad_irq+0x0/0x1a6
->chip(): c1288d80, 0xc1288d80
->action(): 00000000
  IRQ_DISABLED set
unexpected IRQ trap at vector 6a
irq 114, desc: c12f2780, depth: 1, count: 0, unhandled: 0
->handle_irq():  c104b61e, handle_bad_irq+0x0/0x1a6
->chip(): c1288d80, 0xc1288d80
->action(): 00000000
  IRQ_DISABLED set
unexpected IRQ trap at vector 72
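
A note on reading the messages above: the vector values look like the irq numbers printed in hexadecimal. The sketch below checks that against the lines in this log; the regular expressions are illustrative and the first "irq" line is shortened because it is truncated in the capture.

```python
import re

# Lines copied from the log above (first line shortened, it is cut off in the capture)
LOG = """\
irq 106, desc: c12f2380, depth: 1, count: 0
unexpected IRQ trap at vector 6a
irq 114, desc: c12f2780, depth: 1, count: 0, unhandled: 0
unexpected IRQ trap at vector 72
"""

irqs = [int(n) for n in re.findall(r"irq (\d+), desc", LOG)]
vectors = [int(v, 16) for v in re.findall(r"trap at vector ([0-9a-f]+)", LOG)]

for irq, vector in zip(irqs, vectors):
    print(f"irq {irq} <-> vector {vector:#x} ({vector} decimal)")
```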

So, the affected machines in RHTS apparently have increased to 4,

dell-pem805-01.rhts.bos.redhat.com
dell-pem905-01.rhts.bos.redhat.com
dell-pem605-01.rhts.bos.redhat.com
dell-per805-01.rhts.bos.redhat.com

Comment 35 Qian Cai 2009-07-02 14:48:29 UTC

Neil, this looks like it is something different from the 32-bit variant of,

Bug 462519 - Tracking Early Init Apic fix for kdump issues

because the kdump kernel still hangs when using kernel-2.6.18-156.el5 and kexec-tools-1.102pre-75.el5.

Red Hat Enterprise Linux Server release 5.4 Beta (Tikanga)
Kernel 2.6.18-156.el5PAE on an i686

dell-per805-01.rhts.bos.redhat.com login: SysRq : Trigger a crashdump
Linux version 2.6.18-156.el5PAE (mockbuild.redhat.com) (gcc v9
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000010000 - 00000000000a0000 (usable)
 BIOS-e820: 0000000000100000 - 00000000cfaa0000 (usable)
 BIOS-e820: 00000000cfaa0000 - 00000000cfab6000 (reserved)
 BIOS-e820: 00000000cfab6000 - 00000000cfad5c00 (ACPI data)
 BIOS-e820: 00000000cfad5c00 - 00000000d0000000 (reserved)
 BIOS-e820: 00000000f0000000 - 00000000f8000000 (reserved)
 BIOS-e820: 00000000fe000000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000130000000 (usable)
user-defined physical RAM map:
 user: 0000000000000000 - 00000000000a0000 (usable)
 user: 0000000001000000 - 0000000008f5b000 (usable)
0MB HIGHMEM available.
143MB LOWMEM available.
found SMP MP-table at 000fe710
Memory for crash kernel (0x0 to 0x0) notwithin permissible range
disabling kdump
NX (Execute Disable) protection: active
DMI 2.5 present.
Using APIC driver default
ACPI: PM-Timer IO Port: 0x508
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 0:2 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x04] enabled)
Processor #4 0:2 APIC version 16
WARNING: maxcpus limit of 1 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled)
Processor #1 0:2 APIC version 16
WARNING: maxcpus limit of 1 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x05] enabled)
Processor #5 0:2 APIC version 16
WARNING: maxcpus limit of 1 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x02] enabled)
Processor #2 0:2 APIC version 16
WARNING: maxcpus limit of 1 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] enabled)
Processor #6 0:2 APIC version 16
WARNING: maxcpus limit of 1 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x03] enabled)
Processor #3 0:2 APIC version 16
WARNING: maxcpus limit of 1 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x08] lapic_id[0x07] enabled)
Processor #7 0:2 APIC version 16
WARNING: maxcpus limit of 1 reached. Processor ignored.
ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 8, version 17, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x09] address[0xd7ffe000] gsi_base[32])
IOAPIC[1]: apic_id 9, version 17, address 0xd7ffe000, GSI 32-55
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
Enabling APIC mode:  Flat.  Using 2 I/O APICs
ACPI: HPET id: 0x10de8201 base: 0xfed00000
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 10000000 (gap: 08f5b000:f70a5000)
Detected 2300.279 MHz processor.
Built 1 zonelists.  Total pages: 36699
Kernel command line: ro root=/dev/VolGroup00/LogVol00 console=ttyS1,115200  irqK
Misrouted IRQ fixup and polling support enabled
This may significantly impact system performance
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
CPU 0 irqstacks, hard=c135d000 soft=c133d000
PID hash table entries: 1024 (order: 10, 4096 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 122200k/146796k available (2151k kernel code, 8824k reserved, 886k data)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
hpet0: at MMIO 0xfed00000 (virtual 0xc9800000), IRQs 2, 8, 31
hpet0: 3 32-bit timers, 25000000 Hz
Using HPET for base-timer
Calibrating delay loop (skipped), value calculated using timer frequency.. 4600)
Security Framework v1.0.0 initialized
SELinux:  Initializing.
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0(4) -> Core 3
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Checking 'hlt' instruction... 
<hanging ...>

Do you want me to create a new bug or set it to Assigned?

Comment 37 Qian Cai 2009-07-02 14:53:00 UTC

OK, looks like the patch has not been integrated yet. Please disregard comment #35 and #36.

Comment 38 Neil Horman 2009-07-02 15:30:51 UTC

open a new bug please, the above log looks like we were getting farther than we did previously.

Comment 39 Qian Cai 2009-07-10 03:03:15 UTC

OK. A new bug has been filed here,
Bug 510645 - Kdump Kernel Stops on Dell Machines at: Checking 'hlt' instruction

Comment 40 RHEL Program Management 2009-09-25 17:39:55 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 41 Evan McNabb 2009-10-01 17:08:23 UTC

Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1 @@
-Note that this problem can be workaround if encountered by setting hpet=off in the KDUMP_COMMANDLINE_APPEND variable in /et/csysconfig//kdump.conf
+Note that this problem can be workaround if encountered by setting hpet=off in the KDUMP_COMMANDLINE_APPEND variable in /etc/sysconfig/kdump.conf

Comment 42 Don Zickus 2009-10-06 19:36:09 UTC

in kernel-2.6.18-168.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.

Comment 45 Qian Cai 2009-10-08 11:18:31 UTC

Deleted Release Notes Contents.

Old Contents:
Note that this problem can be workaround if encountered by setting hpet=off in the KDUMP_COMMANDLINE_APPEND variable in /etc/sysconfig/kdump.conf

Comment 48 errata-xmlrpc 2010-03-30 07:45:05 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html

Comment 49 Rocky Shi 2010-05-13 01:33:47 UTC

After verification, this issue has been fixed in Red Hat 5.5 32-bit, but it still happens in Red Hat 5.5 64-bit. It needs to be re-opened.

Comment 50 Rocky Shi 2010-05-13 01:36:15 UTC

Add ken in the thread.

Comment 51 Shyam Iyer 2011-06-02 14:47:22 UTC

Just curious if the BIOSes running on these systems are the latest.

-Shyam Iyer
Dell Onsite Engineer
