Bug 423731 - i386 dom0 kdump vmcore file created with bogus notes section
Summary: i386 dom0 kdump vmcore file created with bogus notes section
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.1
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Neil Horman
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 223925
Reported: 2007-12-13 17:28 UTC by Dave Anderson
Modified: 2008-05-21 15:03 UTC (History)

Fixed In Version: RHBA-2008-0314
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-05-21 15:03:52 UTC
Target Upstream Version:
Embargoed:


Attachments
delete /sys/kernel/vmcoreinfo on dom0 kernel (811 bytes, patch)
2007-12-27 08:05 UTC, Ken'ichi Ohmichi


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2008:0314 0 normal SHIPPED_LIVE Updated kernel packages for Red Hat Enterprise Linux 5.2 2008-05-20 18:43:34 UTC

Description Dave Anderson 2007-12-13 17:28:17 UTC
Description of problem:

With the 2.6.18-58.el5xen kernel and kexec-tools-1.102pre-8.el5,
the dom0 kdump vmcore is created with an invalid notes section
size.

Version-Release number of selected component (if applicable):

i386 kernel-2.6.18-58.el5xen
kexec-tools-1.102pre-8.el5

How reproducible / Steps to Reproduce:

1. Configure dom0 kdump to use the bare-metal 2.6.18-58.el5 kernel
   as the kdump kernel.
2. Force a kernel crash.
  
Actual results:

The vmcore causes the crash utility to fail with a segmentation violation.

Examining the ELF header with readelf shows that the notes section
claims an absurdly large size (0x14228e98 bytes in this example),
after which readelf itself dies with a segmentation violation:

# readelf -a vmcore
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              CORE (Core file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         5
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

There are no sections in this file.

There are no sections in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NOTE           0x0000000000000158 0x0000000000000000 0x0000000000000000
                 0x0000000014228e98 0x0000000014228e98         0
  LOAD           0x0000000014228ff0 0x00000000c0000000 0x0000000000000000
                 0x00000000000a0000 0x00000000000a0000  RWE    0
  LOAD           0x00000000142c8ff0 0x00000000c0100000 0x0000000000100000
                 0x0000000001f00000 0x0000000001f00000  RWE    0
  LOAD           0x00000000161c8ff0 0x00000000ca000000 0x000000000a000000
                 0x000000002e000000 0x000000002e000000  RWE    0
  LOAD           0x00000000441c8ff0 0xffffffffffffffff 0x0000000038000000
                 0x0000000007e8c000 0x0000000007e8c000  RWE    0

There is no dynamic section in this file.

There are no relocations in this file.

There are no unwind sections in this file.

No version information found in this file.

Notes at offset 0x00000158 with length 0x14228e98:
  Owner         Data size       Description
  CORE          0x00000090      NT_PRSTATUS (prstatus structure)
  Xen           0x00000010      Unknown note type: (0x01000002)
  Xen           0x00000024      Unknown note type: (0x01000001)
  CORE          0x00000090      NT_PRSTATUS (prstatus structure)
  Xen           0x00000010      Unknown note type: (0x01000002)
Segmentation fault
#
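
The check that readelf did not survive can be sketched in a few lines of Python. This is a minimal illustration, not part of any fix: the function name find_suspect_notes and the 1 MiB threshold are assumptions made here, and the sketch only handles little-endian ELF64 images such as the vmcore shown above.

```python
import struct
import sys

PT_NOTE = 4
MAX_SANE_NOTE_BYTES = 1 << 20  # assumed threshold (1 MiB); not taken from any tool


def find_suspect_notes(image: bytes, limit: int = MAX_SANE_NOTE_BYTES):
    """Return (p_offset, p_filesz) for every PT_NOTE segment larger than limit."""
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    # ELF64 header fields: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", image, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x36)
    suspects = []
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Elf64_Phdr starts with: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz.
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", image, base
        )
        if p_type == PT_NOTE and p_filesz > limit:
            suspects.append((p_offset, p_filesz))
    return suspects


if __name__ == "__main__" and len(sys.argv) > 1:
    # Only the headers are needed, so read the first page rather than the whole dump.
    with open(sys.argv[1], "rb") as f:
        for offset, size in find_suspect_notes(f.read(4096)):
            print(f"suspicious PT_NOTE at {offset:#x}: {size:#x} bytes")
```

Run against the vmcore above, a check like this would be expected to flag the NOTE header at offset 0x158, since 0x14228e98 is far above the threshold.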

Expected results:

The vmcore should be similar to that produced for a bare-metal kernel
of the same version.  For example:

# readelf -a vmcore
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              CORE (Core file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         5
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

There are no sections in this file.

There are no sections in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  NOTE           0x0000000000000158 0x0000000000000000 0x0000000000000000
                 0x00000000000004a0 0x00000000000004a0         0
  LOAD           0x00000000000005f8 0x00000000c0000000 0x0000000000000000
                 0x00000000000a0000 0x00000000000a0000  RWE    0
  LOAD           0x00000000000a05f8 0x00000000c0100000 0x0000000000100000
                 0x0000000001f00000 0x0000000001f00000  RWE    0
  LOAD           0x0000000001fa05f8 0x00000000ca000000 0x000000000a000000
                 0x000000002e000000 0x000000002e000000  RWE    0
  LOAD           0x000000002ffa05f8 0xffffffffffffffff 0x0000000038000000
                 0x0000000007e8cc00 0x0000000007e8cc00  RWE    0

There is no dynamic section in this file.

There are no relocations in this file.

There are no unwind sections in this file.

No version information found in this file.

Notes at offset 0x00000158 with length 0x000004a0:
  Owner         Data size       Description
  CORE          0x00000090      NT_PRSTATUS (prstatus structure)
  CORE          0x00000090      NT_PRSTATUS (prstatus structure)
  VMCOREINFO            0x00000340      Unknown note type: (0x00000000)
#
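
The two note listings can be compared arithmetically. The sketch below assumes the usual ELF note layout (a 12-byte header, then the NUL-terminated owner name and the descriptor, each padded to 4 bytes), which is consistent with the numbers readelf printed; the helper names are made up for this illustration.

```python
def align4(n: int) -> int:
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def note_size(owner: str, descsz: int) -> int:
    """Bytes occupied by one note: 12-byte header + padded name + padded descriptor."""
    namesz = len(owner) + 1  # the owner string is stored with its trailing NUL
    return 12 + align4(namesz) + align4(descsz)


# Notes listed for the bare-metal (expected) vmcore.
expected = [("CORE", 0x90), ("CORE", 0x90), ("VMCOREINFO", 0x340)]
# Notes readelf managed to list for the dom0 vmcore before it crashed.
dom0_listed = [("CORE", 0x90), ("Xen", 0x10), ("Xen", 0x24), ("CORE", 0x90), ("Xen", 0x10)]

expected_total = sum(note_size(owner, size) for owner, size in expected)
dom0_total = sum(note_size(owner, size) for owner, size in dom0_listed)

print(hex(expected_total))  # 0x4a0, matching the NOTE FileSiz of the good vmcore
print(hex(dom0_total))      # 0x1bc, a tiny fraction of the claimed 0x14228e98
```

So the bare-metal notes add up exactly to the advertised length, while the valid-looking dom0 notes account for only 0x1bc bytes of a segment that claims to be about 322 MB long.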


Additional info:

Comment 1 Neil Horman 2007-12-17 21:16:38 UTC
This was split out from a bug that was thought to be two problems.  It turns out
that the patch initially submitted, which created this problem, was
corrupt: a rediff accidentally introduced unexpected changes.  I've posted the
proper version to RHKL.  Once it is integrated we should retest to see whether
this bug still exists.

Comment 2 Dave Anderson 2007-12-17 21:25:17 UTC
Yep...

Comment 3 Ken'ichi Ohmichi 2007-12-27 08:03:58 UTC
Hi,
makedumpfile cannot use the vmcoreinfo note section for xen.
Oda-san described this in the following message, but it hasn't been implemented yet:
http://lists.infradead.org/pipermail/kexec/2007-September/000790.html

How about removing /sys/kernel/vmcoreinfo when building the dom0 kernel, as in
the attached patch? We confirmed that the crash utility can read a
vmcore created on the 2.6.18-58.el5xen kernel with this patch applied.
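
For illustration only, here is how a userspace tool might cope with the file being absent. The attached patch itself is kernel code and is not reproduced here. The sketch assumes the file holds two hexadecimal fields (address and size of the vmcoreinfo note), as in mainline kernels; that format, the function names, and the sample values in the usage note are assumptions, not taken from this report.

```python
import os


def parse_vmcoreinfo_sysfs(text: str):
    """Parse '<hex address> <hex size>' into a pair of integers (assumed format)."""
    addr_str, size_str = text.split()
    return int(addr_str, 16), int(size_str, 16)


def vmcoreinfo_location(path: str = "/sys/kernel/vmcoreinfo"):
    """Return (address, size), or None when the kernel does not export the file."""
    if not os.path.exists(path):
        # With the proposed change, a dom0 kernel would take this branch, so no
        # vmcoreinfo note would be requested for the dump.
        return None
    with open(path) as f:
        return parse_vmcoreinfo_sysfs(f.read())
```

A call such as parse_vmcoreinfo_sysfs("1a2b3000 1000\n") returns the pair (0x1a2b3000, 0x1000); those sample values are hypothetical.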

Comment 4 Ken'ichi Ohmichi 2007-12-27 08:05:53 UTC
Created attachment 290443 [details]
delete /sys/kernel/vmcoreinfo on dom0 kernel

Comment 5 Neil Horman 2008-01-08 21:25:44 UTC
I just confirmed that the proper vmcoreinfo patch made it into -64.el5, and
confirmed that it fixed the zero length /proc/vmcore issue.  I don't have a 386
machine handy at the moment to confirm that this problem is also fixed (they're
tied up with other tests).  Does anyone getting copied on this bug have a RHEL5
x86 box that they can test this out on quickly?

Comment 6 Dave Anderson 2008-01-08 21:43:22 UTC
I just requested a RHEL5 i386 from RHTS.

Does the -64.el5 kernel contain/require the patch from Ken'ichi
in comment #4 above? 

Comment 7 Neil Horman 2008-01-08 23:50:29 UTC
Doh!  Sorry, Dave, I could have just done that.  I wasn't thinking.

Regarding the patch in comment 4, the short answer is: I'm not sure.  Given the
history of when this bug started, I'd say it's a 50/50 shot as to whether the
bogus patch was responsible for this problem, or whether the mere presence of the
vmcoreinfo file caused it.  Looking at the problem description, my
guess would be that the bogus note section will still be created, regardless of
whether you use makedumpfile or not.  As to whether that will affect the behavior of
crash, that likely depends on crash's ability to ignore the vmcoreinfo note section.

I'd say a test is worth 1000 guesses.  Dave, I was bone-headed about the RHTS
thing.  If you tell me what system you reserved, I'll go ahead and take care of
the testing.  Thanks!

Comment 8 Dave Anderson 2008-01-09 13:57:12 UTC
I just got the system this morning -- I'm installing the 3 64.el5 
i686 kernels and will give them a whirl w/kexec-tools-1.102pre-8.el5.




Comment 9 Dave Anderson 2008-01-09 15:35:56 UTC
 
I got mixed results -- the stock kernels did the right
thing, but the dom0 xen kernel still resulted in a
vmcore with a bogus notes section:

  2.6.18-64.el5: OK
  2.6.18-64.el5PAE: OK
  2.6.18-64.el5xen: malformed vmcore notes

Here are the results for each:

(1) Running 2.6.18-64.el5, readelf shows this:

  # readelf -a vmcore
  ELF Header:
    Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
    Class:                             ELF64
    Data:                              2's complement, little endian
    Version:                           1 (current)
    OS/ABI:                            UNIX - System V
    ABI Version:                       0
    Type:                              CORE (Core file)
    Machine:                           Intel 80386
    Version:                           0x1
    Entry point address:               0x0
    Start of program headers:          64 (bytes into file)
    Start of section headers:          0 (bytes into file)
    Flags:                             0x0
    Size of this header:               64 (bytes)
    Size of program headers:           56 (bytes)
    Number of program headers:         5
    Size of section headers:           0 (bytes)
    Number of section headers:         0
    Section header string table index: 0
  
  There are no sections in this file.
  
  There are no sections in this file.
  
  Program Headers:
    Type           Offset             VirtAddr           PhysAddr
                   FileSiz            MemSiz              Flags  Align
    NOTE           0x0000000000000158 0x0000000000000000 0x0000000000000000
                   0x00000000000004b4 0x00000000000004b4         0
    LOAD           0x000000000000060c 0x00000000c0000000 0x0000000000000000
                   0x00000000000a0000 0x00000000000a0000  RWE    0
    LOAD           0x00000000000a060c 0x00000000c0100000 0x0000000000100000
                   0x0000000000f00000 0x0000000000f00000  RWE    0
    LOAD           0x0000000000fa060c 0x00000000c9000000 0x0000000009000000
                   0x000000002f000000 0x000000002f000000  RWE    0
    LOAD           0x000000002ffa060c 0xffffffffffffffff 0x0000000038000000
                   0x00000000b7fc0000 0x00000000b7fc0000  RWE    0
  
  There is no dynamic section in this file.
  
  There are no relocations in this file.
  
  There are no unwind sections in this file.
  
  No version information found in this file.
  
  Notes at offset 0x00000158 with length 0x000004b4:
    Owner         Data size       Description
    CORE          0x00000090      NT_PRSTATUS (prstatus structure)
    CORE          0x00000090      NT_PRSTATUS (prstatus structure)
    VMCOREINFO            0x00000353      Unknown note type: (0x00000000)

and crash works fine:

  # crash vm*

  crash 4.0-4.6.1
  Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007  Red Hat, Inc.
  Copyright (C) 2004, 2005, 2006  IBM Corporation
  Copyright (C) 1999-2006  Hewlett-Packard Co
  Copyright (C) 2005, 2006  Fujitsu Limited
  Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
  Copyright (C) 2005  NEC Corporation
  Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
  Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
  This program is free software, covered by the GNU General Public License,
  and you are welcome to change it and/or distribute copies of it under
  certain conditions.  Enter "help copying" to see the conditions.
  This program has absolutely no warranty.  Enter "help warranty" for details.
   
  GNU gdb 6.1
  Copyright 2004 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.
  There is absolutely no warranty for GDB.  Type "show warranty" for details.
  This GDB was configured as "i686-pc-linux-gnu"...
  
        KERNEL: vmlinux                           
      DUMPFILE: vmcore
          CPUS: 2
          DATE: Wed Jan  9 09:41:49 2008
        UPTIME: 00:15:35
  LOAD AVERAGE: 0.00, 0.09, 0.07
         TASKS: 96
      NODENAME: dell-pe700-01.rhts.boston.redhat.com
       RELEASE: 2.6.18-64.el5
       VERSION: #1 SMP Mon Jan 7 18:03:30 EST 2008
       MACHINE: i686  (3391 Mhz)
        MEMORY: 3.7 GB
         PANIC: "SysRq : Trigger a crashdump"
           PID: 2575
       COMMAND: "bash"
          TASK: f701aaa0  [THREAD_INFO: f6212000]
           CPU: 0
         STATE: TASK_RUNNING (SYSRQ)
  
  crash>
  
  
(2) Running 2.6.18-64.el5PAE, readelf shows this:
  
  # readelf -a vmcore
  ELF Header:
    Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
    Class:                             ELF64
    Data:                              2's complement, little endian
    Version:                           1 (current)
    OS/ABI:                            UNIX - System V
    ABI Version:                       0
    Type:                              CORE (Core file)
    Machine:                           Intel 80386
    Version:                           0x1
    Entry point address:               0x0
    Start of program headers:          64 (bytes into file)
    Start of section headers:          0 (bytes into file)
    Flags:                             0x0
    Size of this header:               64 (bytes)
    Size of program headers:           56 (bytes)
    Number of program headers:         5
    Size of section headers:           0 (bytes)
    Number of section headers:         0
    Section header string table index: 0
  
  There are no sections in this file.
  
  There are no sections in this file.
  
  Program Headers:
    Type           Offset             VirtAddr           PhysAddr
                   FileSiz            MemSiz              Flags  Align
    NOTE           0x0000000000000158 0x0000000000000000 0x0000000000000000
                   0x00000000000004c8 0x00000000000004c8         0
    LOAD           0x0000000000000620 0x00000000c0000000 0x0000000000000000
                   0x00000000000a0000 0x00000000000a0000  RWE    0
    LOAD           0x00000000000a0620 0x00000000c0100000 0x0000000000100000
                   0x0000000000f00000 0x0000000000f00000  RWE    0
    LOAD           0x0000000000fa0620 0x00000000c9000000 0x0000000009000000
                   0x000000002f000000 0x000000002f000000  RWE    0
    LOAD           0x000000002ffa0620 0xffffffffffffffff 0x0000000038000000
                   0x00000000b7fc0000 0x00000000b7fc0000  RWE    0
  
  There is no dynamic section in this file.
  
  There are no relocations in this file.
  
  There are no unwind sections in this file.
  
  No version information found in this file.
  
  Notes at offset 0x00000158 with length 0x000004c8:
    Owner         Data size       Description
    CORE          0x00000090      NT_PRSTATUS (prstatus structure)
    CORE          0x00000090      NT_PRSTATUS (prstatus structure)
    VMCOREINFO            0x00000367      Unknown note type: (0x00000000)
  # 
  
and crash runs fine:

  # crash vm*
  
  crash 4.0-4.6.1
  Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007  Red Hat, Inc.
  Copyright (C) 2004, 2005, 2006  IBM Corporation
  Copyright (C) 1999-2006  Hewlett-Packard Co
  Copyright (C) 2005, 2006  Fujitsu Limited
  Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
  Copyright (C) 2005  NEC Corporation
  Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
  Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
  This program is free software, covered by the GNU General Public License,
  and you are welcome to change it and/or distribute copies of it under
  certain conditions.  Enter "help copying" to see the conditions.
  This program has absolutely no warranty.  Enter "help warranty" for details.
   
  GNU gdb 6.1
  Copyright 2004 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.
  There is absolutely no warranty for GDB.  Type "show warranty" for details.
  This GDB was configured as "i686-pc-linux-gnu"...
  
        KERNEL: vmlinux                           
      DUMPFILE: vmcore
          CPUS: 2
          DATE: Wed Jan  9 09:53:25 2008
        UPTIME: 00:01:49
  LOAD AVERAGE: 0.60, 0.29, 0.10
         TASKS: 96
      NODENAME: dell-pe700-01.rhts.boston.redhat.com
       RELEASE: 2.6.18-64.el5PAE
       VERSION: #1 SMP Mon Jan 7 18:17:59 EST 2008
       MACHINE: i686  (3391 Mhz)
        MEMORY: 3.7 GB
         PANIC: "SysRq : Trigger a crashdump"
           PID: 4072
       COMMAND: "bash"
          TASK: f7195000  [THREAD_INFO: f5968000]
           CPU: 0
         STATE: TASK_RUNNING (SYSRQ)
  
  crash> 
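
As a cross-check on the two good dumps above, the file offsets in the program headers can be verified by hand: the notes start right after the ELF header and program header table, and each segment starts where the previous one ends. A small sketch of that arithmetic, with the values copied from the readelf output and a helper name invented for this illustration:

```python
EHDR_SIZE = 64   # "Size of this header"
PHDR_SIZE = 56   # "Size of program headers"
PHDR_COUNT = 5   # "Number of program headers"

NOTE_OFFSET = EHDR_SIZE + PHDR_SIZE * PHDR_COUNT  # 0x158


def is_packed(segments):
    """True if every (offset, filesz) pair starts exactly where the previous one ends."""
    return all(
        off + size == next_off
        for (off, size), (next_off, _) in zip(segments, segments[1:])
    )


# (file offset, FileSiz) for the NOTE and the first three LOAD headers.
el5 = [(0x158, 0x4b4), (0x60c, 0xa0000), (0xa060c, 0xf00000), (0xfa060c, 0x2f000000)]
el5_pae = [(0x158, 0x4c8), (0x620, 0xa0000), (0xa0620, 0xf00000), (0xfa0620, 0x2f000000)]

print(hex(NOTE_OFFSET))                    # 0x158
print(is_packed(el5), is_packed(el5_pae))  # True True
```

The same rule explains why an oversized NOTE segment pushes the first LOAD segment hundreds of megabytes into the file.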
  

(3) Running 2.6.18-64.el5xen (with KDUMP_KERNELVER="2.6.18-64.el5PAE"),
    readelf dumps garbage when it gets to the vmcoreinfo section:

  # readelf -a vmcore
  ELF Header:
    Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
    Class:                             ELF64
    Data:                              2's complement, little endian
    Version:                           1 (current)
    OS/ABI:                            UNIX - System V
    ABI Version:                       0
    Type:                              CORE (Core file)
    Machine:                           Intel 80386
    Version:                           0x1
    Entry point address:               0x0
    Start of program headers:          64 (bytes into file)
    Start of section headers:          0 (bytes into file)
    Flags:                             0x0
    Size of this header:               64 (bytes)
    Size of program headers:           56 (bytes)
    Number of program headers:         5
    Size of section headers:           0 (bytes)
    Number of section headers:         0
    Section header string table index: 0
  
  There are no sections in this file.
  
  There are no sections in this file.
  
  Program Headers:
    Type           Offset             VirtAddr           PhysAddr
                   FileSiz            MemSiz              Flags  Align
    NOTE           0x0000000000000158 0x0000000000000000 0x0000000000000000
                   0x000000002444c8e0 0x000000002444c8e0         0
    LOAD           0x000000002444ca38 0x00000000c0000000 0x0000000000000000
                   0x00000000000a0000 0x00000000000a0000  RWE    0
    LOAD           0x00000000244eca38 0x00000000c0100000 0x0000000000100000
                   0x0000000000f00000 0x0000000000f00000  RWE    0
    LOAD           0x00000000253eca38 0x00000000c9000000 0x0000000009000000
                   0x000000002f000000 0x000000002f000000  RWE    0
    LOAD           0x00000000543eca38 0xffffffffffffffff 0x0000000038000000
                   0x00000000b7fc0000 0x00000000b7fc0000  RWE    0
  
  There is no dynamic section in this file.
  
  There are no relocations in this file.
  
  There are no unwind sections in this file.
  
  No version information found in this file.
  
  Notes at offset 0x00000158 with length 0x2444c8e0:
    Owner         Data size       Description
    CORE          0x00000090      NT_PRSTATUS (prstatus structure)
    Xen           0x00000010      Unknown note type: (0x01000002)
    Xen           0x00000024      Unknown note type: (0x01000001)
    CORE          0x00000090      NT_PRSTATUS (prstatus structure)
    Xen           0x00000010      Unknown note type: (0x01000002)
  ���3���D$��$��t              0x0000000c      Unknown note type: (0x24448900)
  #
  
and not surprisingly, crash fails as well:
  
  # /usr/tmp/crash*12/crash vm*
  
  crash 4.0-4.6.1
  Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008  Red Hat, Inc.
  Copyright (C) 2004, 2005, 2006  IBM Corporation
  Copyright (C) 1999-2006  Hewlett-Packard Co
  Copyright (C) 2005, 2006  Fujitsu Limited
  Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
  Copyright (C) 2005  NEC Corporation
  Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
  Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
  This program is free software, covered by the GNU General Public License,
  and you are welcome to change it and/or distribute copies of it under
  certain conditions.  Enter "help copying" to see the conditions.
  This program has absolutely no warranty.  Enter "help warranty" for details.
   
  Segmentation fault
  #
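
Both tools fall over once they walk past the last valid note in case (3). A defensive note walker can be sketched as follows; this is not how readelf or crash are implemented. It assumes 4-byte note padding and treats an empty, oversized, or non-printable owner name as the point where valid notes end. The names walk_notes and make_note and the max_name limit are assumptions for this example.

```python
import struct


def align4(n: int) -> int:
    return (n + 3) & ~3


def make_note(owner: bytes, ntype: int, desc: bytes) -> bytes:
    """Build one note (helper used only to construct test input)."""
    name = owner + b"\0"
    return (
        struct.pack("<III", len(name), len(desc), ntype)
        + name.ljust(align4(len(name)), b"\0")
        + desc.ljust(align4(len(desc)), b"\0")
    )


def walk_notes(buf: bytes, max_name: int = 64):
    """Return (notes, stopped_at); stopped_at is None if the buffer parsed cleanly."""
    notes, pos = [], 0
    while pos + 12 <= len(buf):
        namesz, descsz, ntype = struct.unpack_from("<III", buf, pos)
        next_pos = pos + 12 + align4(namesz) + align4(descsz)
        # Reject headers that are empty, oversized, or run past the buffer.
        if namesz == 0 or namesz > max_name or next_pos > len(buf):
            return notes, pos
        raw = buf[pos + 12 : pos + 12 + namesz]
        owner = raw[:-1]
        # The owner must be NUL-terminated printable ASCII.
        if raw[-1:] != b"\0" or not all(0x20 <= c < 0x7F for c in owner):
            return notes, pos
        notes.append((owner.decode("ascii"), ntype, descsz))
        pos = next_pos
    return notes, (None if pos == len(buf) else pos)
```

Given two valid notes followed by random bytes, walk_notes returns the two notes plus the offset at which parsing stopped, instead of running off into garbage.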

I wouldn't think there would be any difference using the non-PAE
capture kernel for the dom0 kernel, but I didn't test it.

The RHTS machine is: dell-pe700-01.rhts.boston.redhat.com


Comment 10 Dave Anderson 2008-01-09 19:39:41 UTC
> The RHTS machine is: dell-pe700-01.rhts.boston.redhat.com

Neil, do you want me to hold onto this machine for you?


Comment 11 Neil Horman 2008-01-09 20:07:56 UTC
Dave, thank you for the offer, but I'll check my own out.  I just need to build
Ken'ichi's patch into a kernel. 

Comment 12 Neil Horman 2008-01-10 18:31:21 UTC
OK, I just confirmed that with Ken'ichi's patch from comment #4, the dom0
kernels do the right thing on a crash.  I'll post this shortly.

Comment 13 RHEL Program Management 2008-01-14 07:36:44 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 14 Don Zickus 2008-01-21 17:30:11 UTC
in 2.6.18-71.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 17 errata-xmlrpc 2008-05-21 15:03:52 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html


