+++ This bug was initially created as a clone of Bug #215417 +++

Description of problem:
The kexec tools do not pass the ACPI NVS reserved memory area to the kdump kernel as a reserved memory area.

--- Copied text directly from Benjamin Romer's email to the Fastboot mailing list ---

I'd like to submit a patch to kexec that addresses a serious problem with kdump on the Unisys ES7000/600 system. We initially encountered this issue on SUSE's SLES 10 beta distributions.

On the ES7000/600, the ACPI data is located in the 3GB range, and above that is an ACPI NVS region. The problem is that kexec, when loading a dump kernel, does not include the ACPI NVS region in the memory map it provides to the dump kernel. This causes a kernel panic early in the dump kernel's boot process:

Bootdata ok (command line is root=/dev/sda2 showopts console=tty0 console=ttyS0,115200n8 earlyprintk=serial,ttyS0,115200n8 memmap=exactmap memmap=640K@0K memmap=3296K@16384K memmap=61599K@20321K elfcorehdr=20320K memmap=408K#3144128K)
Linux version 2.6.16.14-6-kdump (geeko@buildhost) (gcc version 4.1.0 (SUSE Linux)) #1 Tue May 9 12:09:06 UTC 2006
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000100 - 000000000009e400 (usable)
BIOS-e820: 000000000009e400 - 00000000000a0000 (reserved)
BIOS-e820: 0000000000100000 - 00000000bfe70000 (usable)
BIOS-e820: 00000000bfe70000 - 00000000bfed6000 (ACPI data)
BIOS-e820: 00000000bfed6000 - 00000000bff00000 (ACPI NVS)
BIOS-e820: 00000000bff00000 - 00000000e8000000 (usable)
BIOS-e820: 00000000f8000000 - 00000000fec00000 (reserved)
BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000810000000 (usable)
user-defined physical RAM map:
user: 0000000000000000 - 00000000000a0000 (usable)
user: 0000000001000000 - 0000000001338000 (usable)
user: 00000000013d8400 - 0000000005000000 (usable)
user: 00000000bfe70000 - 00000000bfed6000 (ACPI data)
kernel direct mapping tables up to bfed6000 @ 8000-8000
PANIC: early exception rip 10 error ffffffff8131433b cr2 2b0ed2682180

Call Trace:
<ffffffff8131433b>{reserve_bootmem_core+78}
<ffffffff81312b52>{reserve_bootmem_generic+19}
<ffffffff81310ea7>{smp_scan_config+145}
<ffffffff81310f02>{find_intel_smp+54}
<ffffffff8130b6af>{setup_arch+2158}
<ffffffff813045de>{start_kernel+42}
<ffffffff81304259>{_sinittext+601}
RIP 0x10

We have determined that the cause of this panic is that the kernel attempts to reserve the ACPI NVS region, which is defined by a pointer stored in the ACPI data region, but cannot reserve memory above the maximum usable memory limit. The kernel determines the maximum usable memory by taking the highest address of usable memory specified in the memory map. It therefore sets the limit to 0x5000000, as listed in the map, and then attempts to reserve memory above 0xbfed6000, which triggers the panic.

If kexec is modified to also pass the ACPI NVS region as reserved memory in the memory map, the kernel does not panic. We have tested this on both the ES7000/600 and a Dell server system that exhibited the same problem, and it worked on both. The attached patch file contains the changes that we made, and applies to kexec-tools-1.101.

Version-Release number of selected component (if applicable):
Tested with RHEL5 Beta 2 Milestone 9

How reproducible:
Always on an ES7000

Steps to Reproduce:
1. Set up kdump with the boot parameter crashkernel=64M@16M
2. Load the kdump kernel with
kexec -p /boot/vmlinux-kdump --args-linux --command-line="`cat /proc/cmdline` lpj=1306000 earlyprintk=serial,ttyS0,115200n8" --initrd=/boot/initrd-kdump
3. Issue alt-sysrq-c (or echo c > /proc/sysrq-trigger)

Actual results:
Unfortunately the kernel just hangs. I am attaching the serial console output from early_printk. I'll try modifying the kernel and upgrading the kexec tools to see if I can find out some more information.

Expected results:
Kdump kernel should boot up.

Additional info:
Original patch submission to the fastboot mailing list:
http://lists.osdl.org/pipermail/fastboot/2006-June/003202.html
Inclusion into the kexec tool tree:
http://lists.osdl.org/pipermail/fastboot/2006-July/003412.html
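
For anyone checking the numbers in the description, here is a rough Python sketch of the reasoning. It is not kexec or kernel code. The ranges are copied from the boot log in the description; the helper names (highest_usable_end, memmap_arg) are made up for illustration. The memmap= markers follow the kernel's documented syntax ('@' usable, '#' ACPI data, '$' reserved). Which marker the actual patch emits for the ACPI NVS range is not stated in this report, so the '#' used for it below is an assumption.

```python
# Illustrative only: models the arithmetic in the report, not kexec or the kernel.
KIB = 1024

# Entries from the "user-defined physical RAM map" in the boot log.
USER_MAP = [
    (0x0000000000000000, 0x00000000000A0000, "usable"),
    (0x0000000001000000, 0x0000000001338000, "usable"),
    (0x00000000013D8400, 0x0000000005000000, "usable"),
    (0x00000000BFE70000, 0x00000000BFED6000, "ACPI data"),
]

# ACPI NVS range from the BIOS-e820 map; absent from the user-defined map.
ACPI_NVS = (0x00000000BFED6000, 0x00000000BFF00000)


def highest_usable_end(ranges):
    """Highest end address among ranges marked usable (hypothetical helper)."""
    return max(end for _start, end, kind in ranges if kind == "usable")


def memmap_arg(start, end, marker):
    """Format one memmap= argument in KiB (hypothetical helper).

    marker: '@' usable, '#' ACPI data, '$' reserved.
    """
    return f"memmap={(end - start) // KIB}K{marker}{start // KIB}K"


limit = highest_usable_end(USER_MAP)
print(hex(limit))            # 0x5000000
print(ACPI_NVS[0] >= limit)  # True: the NVS range lies above the usable limit

# Reproduces the ACPI data argument already present on the logged command line.
print(memmap_arg(0xBFE70000, 0xBFED6000, "#"))  # memmap=408K#3144128K

# Assumption: an extra argument for the NVS range could look like this.
print(memmap_arg(*ACPI_NVS, "#"))               # memmap=168K#3144536K
```

The first three usable ranges can be checked the same way, for example memmap_arg(0x13D8400, 0x5000000, "@") gives memmap=61599K@20321K, matching the logged command line.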
Created attachment 141118
Ben's original patch to add ACPI NVS space to the exactmap
Created attachment 141119
kdump kernel hang
fixed in -132.el5. thanks.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux major release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux major release. This request is not yet committed for inclusion.
A package has been built which should help the problem described in this bug report. This report is therefore being closed with a resolution of CURRENTRELEASE. You may reopen this bug report if the solution does not work for you.