Bug 442661
Summary: [5.2][kdump][xen] crash failed to read vmcore from Dom0 Kernel

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Qian Cai <qcai> |
| Component: | kernel-xen | Assignee: | Bill Burns <bburns> |
| Status: | CLOSED ERRATA | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 5.2 | CC: | anderson, ddomingo, duck, nhorman, rlerch, xen-maint |
| Target Milestone: | rc | Keywords: | Regression |
| Target Release: | --- | | |
| Hardware: | i386 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Memory reserved for the kdump kernel was incorrect, resulting in unusable crash dumps. In this update, the memory reservation is now correct, allowing proper crash dumps to be generated. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2009-01-20 20:08:09 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 391221, 448753, 454962 | | |
| Attachments: | | | |
Description
Qian Cai 2008-04-16 04:10:19 UTC

Created attachment 302542 [details]: nec-em18 crash read failure log
Created attachment 302543 [details]: nec-em18 system info
Created attachment 302544 [details]: ibm-defiant crash read failure log
Created attachment 302546 [details]: ibm-defiant system info
In the nec-em18 system:

```
Program Headers:
  Type  Offset              VirtAddr            PhysAddr            FileSiz             MemSiz              Flags  Align
  NOTE  0x0000000000000120  0x0000000000000000  0x0000000000000000  0x0000000000000380  0x0000000000000380         0
  LOAD  0x00000000000004a0  0xffff810000000000  0x0000000000000000  0x00000000000a0000  0x00000000000a0000  RWE    0
  LOAD  0x00000000000a04a0  0xffff810000100000  0x0000000000100000  0x0000000001f00000  0x0000000001f00000  RWE    0
  LOAD  0x0000000001fa04a0  0xffff81000a000000  0x000000000a000000  0x0000000035f70000  0x0000000035f70000  RWE    0
```

The physical addresses that crash can't seem to find in the vmcore file are in the last LOAD segment, which starts at physical address a000000 and has a size of 35f70000. That physical address region is unity-mapped by the dom0 kernel, encompassing the virtual address range from ffff88000a000000 to ffff88003ff70000. All of the failed addresses that crash cannot read from the dumpfile lie near ffff880033xxxxxx, within the 32MB region from ~ffff880033xxxxxx up to (but not including) ffff880035000000.

Similarly, in the ibm-defiant system:

```
Program Headers:
  Type  Offset              VirtAddr            PhysAddr            FileSiz             MemSiz              Flags  Align
  NOTE  0x0000000000000120  0x0000000000000000  0x0000000000000000  0x00000000000006a8  0x00000000000006a8         0
  LOAD  0x00000000000007c8  0xffff810000000000  0x0000000000000000  0x00000000000a0000  0x00000000000a0000  RWE    0
  LOAD  0x00000000000a07c8  0xffff810000100000  0x0000000000100000  0x0000000001f00000  0x0000000001f00000  RWE    0
  LOAD  0x0000000001fa07c8  0xffff81000a000000  0x000000000a000000  0x0000000035fcbc00  0x0000000035fcbc00  RWE    0
```

The physical addresses that crash can't seem to find in the vmcore file are in the last LOAD segment, which starts at physical address a000000 and has a size of 35fcbc00. That physical address region is unity-mapped by the dom0 kernel, encompassing the virtual address range from ffff88000a000000 to ffff88003ffcbc00.
All of the noted addresses that crash cannot read from the dumpfile lie near ffff880034xxxxxx, within the 16MB region from ~ffff880034xxxxxx up to (but not including) ffff880035000000.

If the memory is not in the dumpfile, there's nothing that the crash utility can do about it. But I need the two dumpfiles to examine in order to verify that the memory segments are not contained in the vmcore. Can you make them available to me? Secondly, is this something that has only cropped up using the latest RHEL5.2 beta kernel? Have you successfully created dom0 vmcores on the same systems with earlier RHEL5 kernels?

Sorry -- I was confusing this with another RHEL5.2 xen kdump issue associated with the RHEL5.2 hypervisor memory being relocated (i.e., not the dom0 kernel as above). The PT_LOAD segments in comment #5 describe actual physical (machine) memory as seen by the hypervisor. When the vmcore is used to analyze dom0 linux kernels, all of its pseudo-physical memory references must be translated to machine memory, and then that machine memory page must be found in the vmcore. So, for some reason, in the two dumpfiles above, the crash utility cannot locate the machine memory for the dom0 virtual ranges indicated, which seem to be around the ~ffff880034xxxxxx area. That virtual range is unity-mapped with respect to dom0 to a pseudo-physical region around ~34xxxxxx. The problem at hand is that the "real" machine memory associated with that pseudo-physical memory cannot be determined by the crash utility based upon what's in the vmcore. But again, I need the dumpfiles to verify that the machine memory associated with the dom0's pseudo-physical memory is in fact not in the vmcore. I have other RHEL5.2 2.6.18-89.el5xen x86_64 vmcores that do not exhibit this problem, which is why I wonder whether this same problem occurred on that particular hardware on RHEL5.1.
The RHEL5.2 hypervisor now runs in relocated physical memory, and I wonder whether there might be some type of linkage to that problem. Again, I am sorry for the confusion.

It worked with RHEL5U1 from my test on nec-em18.rhts.boston.redhat.com. vmcores can be found at:

http://porkchop.devel.redhat.com/qa/qa/vmcores/bz442661/vmcore-5.1
http://porkchop.devel.redhat.com/qa/qa/vmcores/bz442661/vmcore-5.2

Thanks -- I'll take a look at the difference between the two. The download is a bit slow -- so for curiosity's sake, can you tell me what the file size is for each of the two vmcores? Since xen/kdump creates a vmcore containing all of physical memory, I believe that they should be essentially the same size.

```
-rw-rw-rw- 1 qcai devqa7 938542240 Apr 18 10:22 vmcore-5.1
-rw-rw-rw- 1 qcai devqa7 938542240 Apr 18 10:13 vmcore-5.2
```

Just to mention that my RHEL5U1 test only replaced the RHEL5U2 kernel-xen and related kernel packages with the RHEL5U1 version (-53.el5), so kexec-tools, crash, etc. all used the RHEL5U2 versions.

(In reply to comment #10)
> Just to mention that my RHEL5U1 test only replaced the RHEL5U2 kernel-xen and
> related kernel packages with the RHEL5U1 version (-53.el5), so kexec-tools,
> crash, etc. all used the RHEL5U2 versions.

OK, thanks.
Looks like i686 is also affected, at least on athlon3.rhts.boston.redhat.com:

http://rhts.redhat.com/testlogs/20853/73762/617308/1.ACS.2008-04-18-13:07:00

```
crash: read error: kernel virtual address: c71be180 type: "kmem_cache_s buffer"
crash: unable to initialize kmem slab cache subsystem
WARNING: cannot access vmalloc'd module memory
crash: read error: kernel virtual address: c75a0000 type: "fill_task_struct"
crash: read error: kernel virtual address: c6236000 type: "fill_thread_info"
crash: read error: kernel virtual address: c75ed000 type: "fill_task_struct"
crash: read error: kernel virtual address: c77fc000 type: "fill_task_struct"
crash: read error: kernel virtual address: c7bbf000 type: "fill_thread_info"
crash: read error: kernel virtual address: c701d000 type: "fill_thread_info"
crash: read error: kernel virtual address: c75a0550 type: "fill_task_struct"
crash: read error: kernel virtual address: c75edaa0 type: "fill_task_struct"
crash: read error: kernel virtual address: c75b9000 type: "fill_thread_info"
crash: read error: kernel virtual address: c7aa8000 type: "fill_thread_info"
crash: read error: kernel virtual address: c75ee000 type: "fill_thread_info"
crash: read error: kernel virtual address: c7556aa0 type: "fill_task_struct"
crash: read error: kernel virtual address: c75a0aa0 type: "fill_task_struct"
crash: read error: kernel virtual address: c75ed550 type: "fill_task_struct"
crash: read error: kernel virtual address: c7556000 type: "fill_task_struct"
crash: read error: kernel virtual address: c7633000 type: "fill_thread_info"
crash: read error: kernel virtual address: c7696000 type: "fill_thread_info"
crash: read error: kernel virtual address: c77fcaa0 type: "fill_task_struct"
crash: read error: kernel virtual address: c77fc550 type: "fill_task_struct"
crash: read error: kernel virtual address: c7bbf000 type: "32-bit KVADDR"
```

This is either a kexec-tools or a xen kernel bug, or perhaps both.
Using the nec-em18.rhts.boston.redhat.com vmcore, the crash session fails during initialization because the memory containing the task_struct of the panic task could not be found in the vmcore. But by entering "crash --no_panic vmlinux vmcore", it skips the panic task determination and makes it to the "crash> " prompt. Once that was done, I could look at the contents of the dom0 kernel's phys_to_machine_mapping[0...end_pfn] array, which is simply a one-to-one mapping of each dom0 pseudo-physical page to the "real" machine physical memory page that backs it. In so doing, I can see that the problem is that there is a 10MB range of machine memory being used by the dom0 kernel that is not contained in the vmcore file.

More specifically, on nec-em18.rhts.boston.redhat.com, the physical memory copied to the vmcore is defined by the contents of each PT_LOAD segment:

```
Program Headers:
  Type  Offset              VirtAddr            PhysAddr            FileSiz             MemSiz              Flags  Align
  NOTE  0x0000000000000120  0x0000000000000000  0x0000000000000000  0x0000000000000380  0x0000000000000380         0
  LOAD  0x00000000000004a0  0xffff810000000000  0x0000000000000000  0x00000000000a0000  0x00000000000a0000  RWE    0
  LOAD  0x00000000000a04a0  0xffff810000100000  0x0000000000100000  0x0000000001f00000  0x0000000001f00000  RWE    0
  LOAD  0x0000000001fa04a0  0xffff81000a000000  0x000000000a000000  0x0000000035f70000  0x0000000035f70000  RWE    0
```

So given the above, these are the 3 physical (LOAD) regions:

```
      start       size         end
  0x0000000      a0000     0xa0000
  0x0100000    1f00000   0x2000000
  0xa000000   35f70000  0x3ff70000
```

However, the phys_to_machine_mapping[] array contains pages in the physical range from 0x9600000 up to 0x9fff000, so whenever crash makes a virtual memory reference to a page backed by anything in that machine memory range, it gets a read error. So on this machine there is a 10MB region starting at 0x9600000, located just before (contiguous to) the region that starts at 0xa000000.
The kernel is obviously using that memory, but when the vmcore is created, it is not aware of that particular range, and doesn't store it in the vmcore. The question is why? How does kexec-tools decide where the PT_LOAD segments are? Does it read /proc/iomem? I believe that /proc/iomem on the dom0 kernel shows machine memory values, and if so, perhaps the RHEL5.2 dom0 kernel exports incorrect values? What has to be done first is to look at /proc/iomem on the live system, and then run crash on the live system as well, to verify that the phys_to_machine_mapping[] array contains the mfn values ranging from 0x9600 to 0x9fff. Do you have access to this or any other "known-to-fail" machine?

Dave

I've attempted an RHTS reserve-workflow for both nec-em18.rhts.boston.redhat.com and ibm-defiant.rhts.boston.redhat.com in case they are available.

...and also for athlon3.rhts.boston.redhat.com

OK, I was able to get nec-em18.rhts.boston.redhat.com via RHTS, and configured the xen kernel the same way, i.e., with 128M@32M. I have verified that the live kernel has been allocated machine (physical) memory from the hypervisor that has been reserved for use by the crash kernel. So it is not a problem with kexec-tools reading incorrectly posted System RAM, but rather a bug in the new xen 3.1.2-based hypervisor. In fact, here is what /proc/iomem shows:

```
# cat /proc/iomem
00000000-0009b7ff : System RAM
0009b800-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000cafff : Video ROM
000cb000-000cbfff : Adapter ROM
000cc000-000ccfff : Adapter ROM
000f0000-000fffff : System ROM
00100000-3ff6ffff : System RAM
  02000000-09ffffff : Crash kernel
  3ee00000-3fdfffff : Hypervisor code and data
    3ee0e080-3ee0e213 : Crash note
    3ee0fd80-3ee0ff6b : Crash note
3ff70000-3ff7afff : ACPI Tables
```

...where there is a large System RAM range starting at 1MB, i.e., from 0x100000 to 0x3ff6ffff.
When the crashkernel=128M@32M parameter is put into place, that region is broken up as seen in the vmcore file: there is a region from 1M to 32M, followed by a crashkernel "hole" from 32M to (32M+128M) or 160M (0x2000000 to 0xa000000). The remainder of machine physical memory starts at 0xa000000 and goes until the end of memory. In any case, the machine physical memory from 0x2000000 to 0xa000000 is reserved for the crashkernel, and cannot be given out by the hypervisor for use by the dom0 kernel. However, the highest 10MB of the 128MB region is in fact being used by the dom0 kernel, as can be seen by dumping the dom0 phys_to_machine_mapping[] array: the entries from 0x9600 through 0x9fff -- which are mfns for physical memory 0x9600000 through 0x9fff000 -- make up the high 10MB of the 128MB region reserved for the crashkernel.

Setting flags. No fix known yet.

Proposed release note: Some systems running the Xen Hypervisor will produce crash dumps that are not readable using the crash utility due to dom0 using some memory that should be reserved for kdump.

added to RHEL5.2 release notes:

<quote>
Some systems using the hypervisor may produce crash dumps that are not readable using the crash utility. This is because dom0 sometimes uses memory regions normally reserved for kdump.
</quote>

please advise if any further revisions are required. thanks!

Bill,

Here's the bug: when kexec_reserve_area() calls reserve_e820_ram(), it's passing the size as the last argument instead of the ending address. Instead of this:

```
if ( !reserve_e820_ram(e820, kdump_start, kdump_size) )
```

it should be:

```
if ( !reserve_e820_ram(e820, kdump_start, kdump_start + kdump_size) )
```

That's why it was setting the end at 0x8000000 instead of 0xa000000, because 0x8000000 is 128M (the size). Staring us right in the face...
Looks to have been introduced by:

xen-x86-make-hv-respect-the-e820-map-16m.patch

(In reply to comment #22)
> Looks to have been introduced by:
>
> xen-x86-make-hv-respect-the-e820-map-16m.patch

Actually, even though the patch above changes the code in question (function names, etc.), it looks like the code had the same problem before. FWIW, I instrumented the HV version just prior to the one that introduced the patch above (2.6.18-68.el5xen), and it also shows the same problem. Again, here's an attempt with 128M@32M:

```
(XEN) boot_e820 RAM map:
(XEN)  0000000000000000 - 00000000000a0000 (reserved)
(XEN)  00000000000f0000 - 0000000000100000 (reserved)
(XEN)  0000000001000000 - 0000000002000000 (usable)
(XEN)  0000000008000000 - 000000003df2b000 (usable)
(XEN)  000000003ee00000 - 000000003ee00000 (usable)
(XEN)  000000003fe00000 - 000000003fe8c000 (usable)
```

The hole starts correctly at 2000000 (32M), but instead of a 128M hole ending at a000000, the next segment starts at 8000000, because the code incorrectly used the 128M "size" value. So technically it doesn't appear to be a regression -- it has apparently never worked correctly.

Fixed in kernel-2.6.18-101.el5. You can download this test kernel from http://people.redhat.com/dzickus/el5

This bug has been marked for inclusion in the Red Hat Enterprise Linux 5.3 Release Notes. To aid in the development of relevant and accurate release notes, please fill out the "Release Notes" field above with the following 4 pieces of information:

Cause: What actions or circumstances cause this bug to present.
Consequence: What happens when the bug presents.
Fix: What was done to fix the bug.
Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
The mempry reserved for the kdump kernel on Xen was incorrect resulting in unusable crash dumps. The patch fixed the memory reservation to be correct and allows proper crash dumps to be generated.

Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:

```
@@ -1 +1 @@
-The mempry reserved for the kdump kernel on Xen was incorrect resulting in unusable crash dumps. The patch fixed the memory reservation to be correct and allows proper crash dumps to be generated.
+Memory reserved for the kdump kernel was incorrect, resulting in unusable crash dumps. In this update, the memory reservation is now correct, allowing proper crash dumps to be generated.
```

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html