Description of problem:
With the 2.6.18-58.el5xen kernel and kexec-tools-1.102pre-8.el5, the dom0 kdump kernel is created with an invalid notes section size.

Version-Release number of selected component (if applicable):
i386
kernel-2.6.18-58.el5xen
kexec-tools-1.102pre-8.el5

How reproducible:
Configure dom0 kdump to use bare-metal 2.6.18-58.el5 kernel as the kdump kernel, and force a kernel crash.

Steps to Reproduce:
1.
2.
3.

Actual results:
The vmcore causes the crash utility to fail with a segmentation violation. Investigating the ELF header with readelf, it shows that the notes section contains an absurdly large number of bytes (0x14228e98 in this example), and then readelf dies with a segmentation violation:

# readelf -a vmcore
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              CORE (Core file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         5
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

There are no sections in this file.

There are no sections in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NOTE           0x0000000000000158 0x0000000000000000 0x0000000000000000
                 0x0000000014228e98 0x0000000014228e98         0
  LOAD           0x0000000014228ff0 0x00000000c0000000 0x0000000000000000
                 0x00000000000a0000 0x00000000000a0000  RWE    0
  LOAD           0x00000000142c8ff0 0x00000000c0100000 0x0000000000100000
                 0x0000000001f00000 0x0000000001f00000  RWE    0
  LOAD           0x00000000161c8ff0 0x00000000ca000000 0x000000000a000000
                 0x000000002e000000 0x000000002e000000  RWE    0
  LOAD           0x00000000441c8ff0 0xffffffffffffffff 0x0000000038000000
                 0x0000000007e8c000 0x0000000007e8c000  RWE    0

There is no dynamic section in this file.

There are no relocations in this file.

There are no unwind sections in this file.

No version information found in this file.

Notes at offset 0x00000158 with length 0x14228e98:
  Owner         Data size       Description
  CORE          0x00000090      NT_PRSTATUS (prstatus structure)
  Xen           0x00000010      Unknown note type: (0x01000002)
  Xen           0x00000024      Unknown note type: (0x01000001)
  CORE          0x00000090      NT_PRSTATUS (prstatus structure)
  Xen           0x00000010      Unknown note type: (0x01000002)
Segmentation fault
#

Expected results:
The vmcore should be similar to that produced for a bare-metal kernel of the same version. For example:

# readelf -a vmcore
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              CORE (Core file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         5
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

There are no sections in this file.

There are no sections in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NOTE           0x0000000000000158 0x0000000000000000 0x0000000000000000
                 0x00000000000004a0 0x00000000000004a0         0
  LOAD           0x00000000000005f8 0x00000000c0000000 0x0000000000000000
                 0x00000000000a0000 0x00000000000a0000  RWE    0
  LOAD           0x00000000000a05f8 0x00000000c0100000 0x0000000000100000
                 0x0000000001f00000 0x0000000001f00000  RWE    0
  LOAD           0x0000000001fa05f8 0x00000000ca000000 0x000000000a000000
                 0x000000002e000000 0x000000002e000000  RWE    0
  LOAD           0x000000002ffa05f8 0xffffffffffffffff 0x0000000038000000
                 0x0000000007e8cc00 0x0000000007e8cc00  RWE    0

There is no dynamic section in this file.

There are no relocations in this file.

There are no unwind sections in this file.

No version information found in this file.

Notes at offset 0x00000158 with length 0x000004a0:
  Owner         Data size       Description
  CORE          0x00000090      NT_PRSTATUS (prstatus structure)
  CORE          0x00000090      NT_PRSTATUS (prstatus structure)
  VMCOREINFO    0x00000340      Unknown note type: (0x00000000)
#

Additional info:
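For anyone triaging a similar vmcore without relying on readelf (which segfaults on the bad file above), here is a minimal Python sketch. It is not part of kexec-tools or crash. It reads only the ELF64 file header and the program headers, and reports any NOTE segment whose FileSiz exceeds a sanity limit. The 1 MiB limit and the function names program_headers() and suspicious_notes() are assumptions chosen for illustration.

```python
import struct

# Little-endian ELF64 layouts, matching the "ELF64 / little endian" header above.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # file header, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")          # program header, 56 bytes
PT_LOAD, PT_NOTE = 1, 4
MAX_NOTE_BYTES = 1 << 20  # assumed sanity limit, not taken from any tool


def program_headers(image):
    """Return a list of (p_type, p_offset, p_filesz) from an ELF64 image."""
    fields = EHDR.unpack_from(image, 0)
    ident, e_phoff = fields[0], fields[5]
    e_phentsize, e_phnum = fields[9], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    headers = []
    for i in range(e_phnum):
        p = PHDR.unpack_from(image, e_phoff + i * e_phentsize)
        headers.append((p[0], p[2], p[5]))  # p_type, p_offset, p_filesz
    return headers


def suspicious_notes(image, limit=MAX_NOTE_BYTES):
    """Return (offset, size) for every PT_NOTE segment larger than limit."""
    return [(off, size)
            for ptype, off, size in program_headers(image)
            if ptype == PT_NOTE and size > limit]
```

In the dumps above the headers occupy only the first 0x158 bytes (64 + 5 * 56), so passing the first 4096 bytes of the file, e.g. open("vmcore", "rb").read(4096), should be enough input. Under these assumptions the 0x14228e98-byte NOTE segment would be flagged, while the 0x4a0-byte one from the bare-metal kernel would not.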
This was split out from a bug that was thought to be two problems. It turns out that the patch initially submitted, which created this problem, was corrupt: a rediff accidentally introduced unexpected changes. I've posted the proper version to RHKL. Once it is integrated, we should retest to see whether this bug still exists.
Yep...
Hi, makedumpfile cannot use the vmcoreinfo note section for xen. Oda-san described this in the following message, but it hasn't been implemented yet: http://lists.infradead.org/pipermail/kexec/2007-September/000790.html How about deleting /sys/kernel/vmcoreinfo when building the dom0 kernel, as in the attached patch? We confirmed that the crash utility can read a vmcore created on the kernel (2.6.18-58.el5xen) with this patch.
Created attachment 290443 [details] delete /sys/kernel/vmcoreinfo on dom0 kernel
I just confirmed that the proper vmcoreinfo patch made it into -64.el5, and confirmed that it fixed the zero length /proc/vmcore issue. I don't have a 386 machine handy at the moment to confirm that this problem is also fixed (they're tied up with other tests). Does anyone getting copied on this bug have a RHEL5 x86 box that they can test this out on quickly?
I just requested a RHEL5 i386 from RHTS. Does the -64.el5 kernel contain/require the patch from Ken'ichi in comment #4 above?
Doh! Sorry, Dave, I could have just done that. I wasn't thinking. Regarding the patch in comment 4, the short answer is: I'm not sure. Given the history of when this bug started, I'd say it's a 50/50 shot as to whether the bogus patch was responsible for this problem, or whether the mere presence of the vmcoreinfo file caused it. Looking at the problem description, my guess would be that the bogus note section will still be created, regardless of whether you use makedumpfile or not. As to whether that will affect the behavior of crash, that likely depends on crash's ability to ignore the vmcoreinfo note section. I'd say a test is worth 1000 guesses. Dave, I was bone-headed about the RHTS thing. If you tell me what system you reserved, I'll go ahead and take care of the testing. Thanks!
I just got the system this morning -- I'm installing the 3 64.el5 i686 kernels and will give them a whirl w/kexec-tools-1.102pre-8.el5.
I got mixed results -- the stock kernels did the right thing, but the dom0 xen kernel still resulted in a vmcore with a bogus notes section: 2.6.18-64.el5: OK 2.6.18-64.el5PAE: OK 2.6.18-64.el5xen: malformed vmcore notes Here are the results for each: (1) Running 2.6.18-64.el5, readelf shows this: # readelf -a vmcore ELF Header: Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 Class: ELF64 Data: 2's complement, little endian Version: 1 (current) OS/ABI: UNIX - System V ABI Version: 0 Type: CORE (Core file) Machine: Intel 80386 Version: 0x1 Entry point address: 0x0 Start of program headers: 64 (bytes into file) Start of section headers: 0 (bytes into file) Flags: 0x0 Size of this header: 64 (bytes) Size of program headers: 56 (bytes) Number of program headers: 5 Size of section headers: 0 (bytes) Number of section headers: 0 Section header string table index: 0 There are no sections in this file. There are no sections in this file. Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align NOTE 0x0000000000000158 0x0000000000000000 0x0000000000000000 0x00000000000004b4 0x00000000000004b4 0 LOAD 0x000000000000060c 0x00000000c0000000 0x0000000000000000 0x00000000000a0000 0x00000000000a0000 RWE 0 LOAD 0x00000000000a060c 0x00000000c0100000 0x0000000000100000 0x0000000000f00000 0x0000000000f00000 RWE 0 LOAD 0x0000000000fa060c 0x00000000c9000000 0x0000000009000000 0x000000002f000000 0x000000002f000000 RWE 0 LOAD 0x000000002ffa060c 0xffffffffffffffff 0x0000000038000000 0x00000000b7fc0000 0x00000000b7fc0000 RWE 0 There is no dynamic section in this file. There are no relocations in this file. There are no unwind sections in this file. No version information found in this file. 
Notes at offset 0x00000158 with length 0x000004b4: Owner Data size Description CORE 0x00000090 NT_PRSTATUS (prstatus structure) CORE 0x00000090 NT_PRSTATUS (prstatus structure) VMCOREINFO 0x00000353 Unknown note type: (0x00000000) and crash works fine: # crash vm* crash 4.0-4.6.1 Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 Red Hat, Inc. Copyright (C) 2004, 2005, 2006 IBM Corporation Copyright (C) 1999-2006 Hewlett-Packard Co Copyright (C) 2005, 2006 Fujitsu Limited Copyright (C) 2006, 2007 VA Linux Systems Japan K.K. Copyright (C) 2005 NEC Corporation Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc. Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc. This program is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Enter "help copying" to see the conditions. This program has absolutely no warranty. Enter "help warranty" for details. GNU gdb 6.1 Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i686-pc-linux-gnu"... 
KERNEL: vmlinux DUMPFILE: vmcore CPUS: 2 DATE: Wed Jan 9 09:41:49 2008 UPTIME: 00:15:35 LOAD AVERAGE: 0.00, 0.09, 0.07 TASKS: 96 NODENAME: dell-pe700-01.rhts.boston.redhat.com RELEASE: 2.6.18-64.el5 VERSION: #1 SMP Mon Jan 7 18:03:30 EST 2008 MACHINE: i686 (3391 Mhz) MEMORY: 3.7 GB PANIC: "SysRq : Trigger a crashdump" PID: 2575 COMMAND: "bash" TASK: f701aaa0 [THREAD_INFO: f6212000] CPU: 0 STATE: TASK_RUNNING (SYSRQ) crash> (2) Running 2.6.18-64.el5PAE, readelf shows this: # readelf -a vmcore ELF Header: Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 Class: ELF64 Data: 2's complement, little endian Version: 1 (current) OS/ABI: UNIX - System V ABI Version: 0 Type: CORE (Core file) Machine: Intel 80386 Version: 0x1 Entry point address: 0x0 Start of program headers: 64 (bytes into file) Start of section headers: 0 (bytes into file) Flags: 0x0 Size of this header: 64 (bytes) Size of program headers: 56 (bytes) Number of program headers: 5 Size of section headers: 0 (bytes) Number of section headers: 0 Section header string table index: 0 There are no sections in this file. There are no sections in this file. Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align NOTE 0x0000000000000158 0x0000000000000000 0x0000000000000000 0x00000000000004c8 0x00000000000004c8 0 LOAD 0x0000000000000620 0x00000000c0000000 0x0000000000000000 0x00000000000a0000 0x00000000000a0000 RWE 0 LOAD 0x00000000000a0620 0x00000000c0100000 0x0000000000100000 0x0000000000f00000 0x0000000000f00000 RWE 0 LOAD 0x0000000000fa0620 0x00000000c9000000 0x0000000009000000 0x000000002f000000 0x000000002f000000 RWE 0 LOAD 0x000000002ffa0620 0xffffffffffffffff 0x0000000038000000 0x00000000b7fc0000 0x00000000b7fc0000 RWE 0 There is no dynamic section in this file. There are no relocations in this file. There are no unwind sections in this file. No version information found in this file. 
Notes at offset 0x00000158 with length 0x000004c8: Owner Data size Description CORE 0x00000090 NT_PRSTATUS (prstatus structure) CORE 0x00000090 NT_PRSTATUS (prstatus structure) VMCOREINFO 0x00000367 Unknown note type: (0x00000000) # and crash runs fine: # crash vm* crash 4.0-4.6.1 Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 Red Hat, Inc. Copyright (C) 2004, 2005, 2006 IBM Corporation Copyright (C) 1999-2006 Hewlett-Packard Co Copyright (C) 2005, 2006 Fujitsu Limited Copyright (C) 2006, 2007 VA Linux Systems Japan K.K. Copyright (C) 2005 NEC Corporation Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc. Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc. This program is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Enter "help copying" to see the conditions. This program has absolutely no warranty. Enter "help warranty" for details. GNU gdb 6.1 Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i686-pc-linux-gnu"... 
KERNEL: vmlinux DUMPFILE: vmcore CPUS: 2 DATE: Wed Jan 9 09:53:25 2008 UPTIME: 00:01:49 LOAD AVERAGE: 0.60, 0.29, 0.10 TASKS: 96 NODENAME: dell-pe700-01.rhts.boston.redhat.com RELEASE: 2.6.18-64.el5PAE VERSION: #1 SMP Mon Jan 7 18:17:59 EST 2008 MACHINE: i686 (3391 Mhz) MEMORY: 3.7 GB PANIC: "SysRq : Trigger a crashdump" PID: 4072 COMMAND: "bash" TASK: f7195000 [THREAD_INFO: f5968000] CPU: 0 STATE: TASK_RUNNING (SYSRQ) crash> (3) Running 2.6.18-64.el5xen (with KDUMP_KERNELVER="2.6.18-64.el5PAE"), readelf dumps garbage when it gets to the vmcoreinfo section: # readelf -a vmcore ELF Header: Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 Class: ELF64 Data: 2's complement, little endian Version: 1 (current) OS/ABI: UNIX - System V ABI Version: 0 Type: CORE (Core file) Machine: Intel 80386 Version: 0x1 Entry point address: 0x0 Start of program headers: 64 (bytes into file) Start of section headers: 0 (bytes into file) Flags: 0x0 Size of this header: 64 (bytes) Size of program headers: 56 (bytes) Number of program headers: 5 Size of section headers: 0 (bytes) Number of section headers: 0 Section header string table index: 0 There are no sections in this file. There are no sections in this file. Program Headers: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align NOTE 0x0000000000000158 0x0000000000000000 0x0000000000000000 0x000000002444c8e0 0x000000002444c8e0 0 LOAD 0x000000002444ca38 0x00000000c0000000 0x0000000000000000 0x00000000000a0000 0x00000000000a0000 RWE 0 LOAD 0x00000000244eca38 0x00000000c0100000 0x0000000000100000 0x0000000000f00000 0x0000000000f00000 RWE 0 LOAD 0x00000000253eca38 0x00000000c9000000 0x0000000009000000 0x000000002f000000 0x000000002f000000 RWE 0 LOAD 0x00000000543eca38 0xffffffffffffffff 0x0000000038000000 0x00000000b7fc0000 0x00000000b7fc0000 RWE 0 There is no dynamic section in this file. There are no relocations in this file. There are no unwind sections in this file. No version information found in this file. 
Notes at offset 0x00000158 with length 0x2444c8e0: Owner Data size Description CORE 0x00000090 NT_PRSTATUS (prstatus structure) Xen 0x00000010 Unknown note type: (0x01000002) Xen 0x00000024 Unknown note type: (0x01000001) CORE 0x00000090 NT_PRSTATUS (prstatus structure) Xen 0x00000010 Unknown note type: (0x01000002) ���3���D$��$��t 0x0000000c Unknown note type: (0x24448900) # and not surprisingly, crash fails as well: # /usr/tmp/crash*12/crash vm* crash 4.0-4.6.1 Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 Red Hat, Inc. Copyright (C) 2004, 2005, 2006 IBM Corporation Copyright (C) 1999-2006 Hewlett-Packard Co Copyright (C) 2005, 2006 Fujitsu Limited Copyright (C) 2006, 2007 VA Linux Systems Japan K.K. Copyright (C) 2005 NEC Corporation Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc. Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc. This program is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Enter "help copying" to see the conditions. This program has absolutely no warranty. Enter "help warranty" for details. Segmentation fault # I wouldn't think there would be any difference using the non-PAE capture kernel for the dom0 kernel, but I didn't test it. The RHTS machine is: dell-pe700-01.rhts.boston.redhat.com
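The garbage entry at the end of the dom0 notes listing is what a note walker produces when it trusts the segment length and keeps decoding past the last real record. As an illustration only (this is not how readelf or crash are implemented), here is a Python sketch of a bounds-checked walk over ELF note records. Each record is a 12-byte header (namesz, descsz, type) followed by the name and the descriptor, each padded to 4 bytes. The helper names are hypothetical, and treating a zero-length name as the end marker is an assumption of this sketch.

```python
import struct

NHDR = struct.Struct("<III")  # namesz, descsz, type (little endian)


def align4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def walk_notes(buf):
    """Yield (name, type, descsz); stop at the first record that does not fit."""
    pos = 0
    while pos + NHDR.size <= len(buf):
        namesz, descsz, ntype = NHDR.unpack_from(buf, pos)
        end = pos + NHDR.size + align4(namesz) + align4(descsz)
        if namesz == 0 or end > len(buf):
            break  # empty or out-of-bounds record: treat as end of notes
        start = pos + NHDR.size
        name = buf[start:start + namesz].rstrip(b"\0")
        yield name.decode("ascii", "replace"), ntype, descsz
        pos = end


def make_note(name, ntype, desc):
    """Build one note record (hypothetical helper, used only for the example)."""
    raw = name.encode("ascii") + b"\0"
    return (NHDR.pack(len(raw), len(desc), ntype)
            + raw.ljust(align4(len(raw)), b"\0")
            + desc.ljust(align4(len(desc)), b"\0"))
```

The same arithmetic accounts for the lengths reported by the working kernels: two CORE notes of 12 + 8 + 0x90 bytes each plus one VMCOREINFO note of 12 + 12 + 0x340 bytes add up to 0x4a0, the NOTE FileSiz in the bare-metal example from the description. With the 0x353 and 0x367 byte VMCOREINFO descriptors (padded to 0x354 and 0x368), the totals are 0x4b4 and 0x4c8, matching cases (1) and (2) above.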
> The RHTS machine is: dell-pe700-01.rhts.boston.redhat.com Neil, do you want me to hold onto this machine for you?
Dave, thank you for the offer, but I'll check my own out. I just need to build Ken'ichi's patch into a kernel.
OK, I just confirmed that with Ken'ichi's patch from comment #4, the dom0 kernels do the right thing on a crash. I'll post this shortly.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
in 2.6.18-71.el5. You can download this test kernel from http://people.redhat.com/dzickus/el5
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0314.html