Bug 478943 - crash fails to read el4u7 pv core dump files collected from xm dump-core
Summary: crash fails to read el4u7 pv core dump files collected from xm dump-core
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: crash
Version: 5.3
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Dave Anderson
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-01-06 05:38 UTC by Joe Jin
Modified: 2010-10-06 12:38 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-10-06 12:38:55 UTC
Target Upstream Version:
Embargoed:



Description Joe Jin 2009-01-06 05:38:50 UTC
Description of problem:

A running PV guest on an Oracle VM server crashed, and the vmcore was
dumped with the xm dump-core command. Trying to analyze it with crash
produced the following error message:

# crash /usr/lib/debug/lib/modules/2.6.9-78.0.5.0.1.ELxenU/vmlinux 7562549-vmcore-1 -d3

crash 4.0-7.5
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
crash: 7562549-vmcore-1: not a netdump ELF dumpfile
crash: 7562549-vmcore-1: not a kdump ELF dumpfile
        flags: 109 (XENDUMP_LOCAL|XC_CORE_ELF|XC_CORE_P2M_CREATE)
          xfd: 3
    page_size: 4096
          ofp: 0
         page: 88fd058
     panic_pc: 0
     panic_sp: 0
     accesses: 0
   cache_hits: 0 
     last_pfn: -1
    redundant: 0 
    poc[5000]: 88fe060 (none used)

      xc_save:
                  nr_pfns: 0 (0x0)
            vmconfig_size: 0 (0x0)
             vmconfig_buf: 0
           p2m_frame_list: 0 (none)
                 pfns_not: 0
          pfns_not_offset: 0
         vcpu_ctxt_offset: 0
  shared_info_page_offset: 0
          region_pfn_type: 0
              batch_count: 0
            batch_offsets: 0 (none)
             ia64_version: 0
        ia64_page_offsets: 0 (none)

      xc_core:
                   header:
                xch_magic: f00febed (XC_CORE_MAGIC)
             xch_nr_vcpus: 1
             xch_nr_pages: 91762 (0x16672)
          xch_ctxt_offset: 1892 (0x764)
         xch_index_offset: 8788 (0x2254)
         xch_pages_offset: 1478656 (0x169000)
                elf_class: ELFCLASS64
        elf_strtab_offset: 377335808 (0x167db000)
           format_version: 0000000000000001
       shared_info_offset: 4692 (0x1254)
       elf_index_pfn[128]: 
0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 0:-1 
               last_batch:
                    index: 0 (0 - 0)
                 accesses: 0
               duplicates: 0 
                    elf32: 0
                    elf64: 88fcaa8
               p2m_frames: 0
     p2m_frame_index_list: 

Elf64_Ehdr:
                e_ident: \177ELF
      e_ident[EI_CLASS]: 2 (ELFCLASS64)
       e_ident[EI_DATA]: 1 (ELFDATA2LSB)
    e_ident[EI_VERSION]: 1 (EV_CURRENT)
      e_ident[EI_OSABI]: 0 (ELFOSABI_SYSV)
 e_ident[EI_ABIVERSION]: 1
                 e_type: 4 (ET_CORE)
              e_machine: 3 (EM_386)
              e_version: 1 (EV_CURRENT)
                e_entry: 0
                e_phoff: 0
                e_shoff: 40
                e_flags: 0
               e_ehsize: 40
            e_phentsize: 38
                e_phnum: 0
            e_shentsize: 40
                e_shnum: 7
             e_shstrndx: 1

Elf64_Shdr:
                sh_name: 0 ""
                sh_type: 0 (SHT_NULL)
               sh_flags: 0
                sh_addr: 0
              sh_offset: 0
                sh_size: 0
                sh_link: 0
                sh_info: 0
           sh_addralign: 0
             sh_entsize: 0

Elf64_Shdr:
                sh_name: 1 ".shstrtab"
                sh_type: 3 (SHT_STRTAB)
               sh_flags: 0
                sh_addr: 0
              sh_offset: 167db000
                sh_size: 48
                sh_link: 0
                sh_info: 0
           sh_addralign: 0
             sh_entsize: 0
                         .shstrtab
                         .note.Xen
                         .xen_prstatus
                         .xen_shared_info
                         .xen_p2m
                         .xen_pages
                         
Elf64_Shdr:
                sh_name: b ".note.Xen"
                sh_type: 7 (SHT_NOTE)
               sh_flags: 0
                sh_addr: 0
              sh_offset: 200
                sh_size: 564
                sh_link: 0
                sh_info: 0
           sh_addralign: 0
             sh_entsize: 0
                 namesz: 4
                  descz: 0
                   type: 2000000 (XEN_ELFNOTE_DUMPCORE_NONE)
                   name: Xen
                         (empty)
                 namesz: 4
                  descz: 32
                   type: 2000001 (XEN_ELFNOTE_DUMPCORE_HEADER)
                   name: Xen
                         00000000f00febed 0000000000000001 
                         0000000000016672 0000000000001000 
                 namesz: 4
                  descz: 1276
                   type: 2000002 (XEN_ELFNOTE_DUMPCORE_XEN_VERSION)
                   name: Xen
                         0000000000000003 0000000000000001 
                         ffff82840500342e 0000000000000001 
                         7372657620636367 2e312e34206e6f69 
                         3130373030322031 2064655228203530 
                         2e312e3420746148 0000002932352d31 
                         0000000000000001 ffff828c80131b61 
                         ffff8200746f6f72 ffff82840589aa30 
                         6f2e706368647375 726f63656c636172 
                         ffff006d6f632e70 000000021d1b6061 
                         2074634f206e6f4d 36303a3630203732 
                         205444502033303a ffff820038303032 
                         2d302e332d6e6578 782034365f363878 
                         782d302e332d6e65 00207032335f3638 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         616c696176616e75 7820343600656c62 
                         782d302e332d6e65 00207032335f3638 
                         0000000000000000 0000000000000000 
                         0000000000000000 0000000000000000 
                         00001000ff400000 
                 namesz: 4
                  descz: 8
                   type: 2000003 (XEN_ELFNOTE_DUMPCORE_FORMAT_VERSION)
                   name: Xen
                         0000000000000001 

Elf64_Shdr:
                sh_name: 15 ".xen_prstatus"
                sh_type: 1 (SHT_PROGBITS)
               sh_flags: 0
                sh_addr: 0
              sh_offset: 764
                sh_size: af0
                sh_link: 0
                sh_info: 0
           sh_addralign: 8
             sh_entsize: af0

Elf64_Shdr:
                sh_name: 23 ".xen_shared_info"
                sh_type: 1 (SHT_PROGBITS)
               sh_flags: 0
                sh_addr: 0
              sh_offset: 1254
                sh_size: 1000
                sh_link: 0
                sh_info: 0
           sh_addralign: 4
             sh_entsize: 1000

Elf64_Shdr:
                sh_name: 34 ".xen_p2m"
                sh_type: 1 (SHT_PROGBITS)
               sh_flags: 0
                sh_addr: 0
              sh_offset: 2254
                sh_size: 166720
                sh_link: 0
                sh_info: 0
           sh_addralign: 4
             sh_entsize: 10

Elf64_Shdr:
                sh_name: 3d ".xen_pages"
                sh_type: 1 (SHT_PROGBITS)
               sh_flags: 0
                sh_addr: 0
              sh_offset: 169000
                sh_size: 16672000
                sh_link: 0
                sh_info: 0
           sh_addralign: 1000
             sh_entsize: 1000

cannot determine relocation value: not a live system
gdb /usr/lib/debug/lib/modules/2.6.9-78.0.5.0.1.ELxenU/vmlinux 
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...

MEMBER_OFFSET(vcpu_guest_context, ctrlreg): 2716
ctrlreg[0]: 8005003b
ctrlreg[1]: 0
ctrlreg[2]: b7a26000
ctrlreg[3]: ffb002 -> mfn: 200ffb
ctrlreg[4]: 0
ctrlreg[5]: 0
ctrlreg[6]: 0
ctrlreg[7]: 0
crash: cannot find mfn 2101243 (0x200ffb) in page index

crash: cannot read/find cr3 page
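
As a sanity check on the dump file itself, the section headers in the debug
output above can be tested for internal consistency. The following is a
minimal sketch in Python and is not part of crash; the offsets and sizes are
copied from the Elf64_Shdr listing, and the helper names (align_up,
layout_problems) are made up for illustration. It checks that each section
starts where the previous one ends (page-aligned for .xen_pages) and that the
.xen_p2m and .xen_pages sizes agree with xch_nr_pages.

```python
# Values copied from the crash -d3 output in this report (hex as printed).
PAGE_SIZE = 0x1000        # page_size: 4096
NR_PAGES = 0x16672        # xch_nr_pages: 91762
P2M_ENTRY_SIZE = 0x10     # sh_entsize of .xen_p2m

# (name, sh_offset, sh_size), listed in file order.
SECTIONS = [
    (".note.Xen", 0x200, 0x564),
    (".xen_prstatus", 0x764, 0xAF0),
    (".xen_shared_info", 0x1254, 0x1000),
    (".xen_p2m", 0x2254, 0x166720),
    (".xen_pages", 0x169000, 0x16672000),
    (".shstrtab", 0x167DB000, 0x48),
]


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def layout_problems(sections):
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    sizes = {name: size for name, _, size in sections}
    if sizes[".xen_p2m"] != NR_PAGES * P2M_ENTRY_SIZE:
        problems.append(".xen_p2m size does not match xch_nr_pages")
    if sizes[".xen_pages"] != NR_PAGES * PAGE_SIZE:
        problems.append(".xen_pages size does not match xch_nr_pages")
    for (name, off, size), (nxt, nxt_off, _) in zip(sections, sections[1:]):
        end = off + size
        # .xen_pages is the only section here with sh_addralign 0x1000.
        expected = align_up(end, PAGE_SIZE) if nxt == ".xen_pages" else end
        if expected != nxt_off:
            problems.append("%s does not start right after %s" % (nxt, name))
    return problems


problems = layout_problems(SECTIONS)
print(problems if problems else "section layout is consistent")
```

With the values from this report the check finds no inconsistencies, which
suggests the file is structurally intact and that the failure is in the cr3
lookup rather than in a truncated or malformed dump.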


The error message resembles the one reported in bug 233151, but that bug was
against el5, and I am not sure whether el4 and el5 differ in this respect.
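
When comparing environments, it may help to read the text embedded in the
XEN_ELFNOTE_DUMPCORE_XEN_VERSION descriptor above. The sketch below is
Python and rests on one assumption: that crash printed the descriptor as
64-bit words read in little-endian order, so repacking each word
little-endian recovers the original bytes. The words are copied from the
output; the function name decode_words is made up for illustration.

```python
import struct

# Copied from the XEN_VERSION note in the report (compiler string).
COMPILER_WORDS = [
    "7372657620636367", "2e312e34206e6f69",
    "3130373030322031", "2064655228203530",
    "2e312e3420746148", "0000002932352d31",
]

# Copied from the same note (capabilities string).
CAPABILITY_WORDS = [
    "2d302e332d6e6578", "782034365f363878",
    "782d302e332d6e65", "00207032335f3638",
]


def decode_words(words):
    """Repack hex words as little-endian bytes and cut at the first NUL."""
    raw = b"".join(struct.pack("<Q", int(word, 16)) for word in words)
    return raw.split(b"\0", 1)[0].decode("ascii")


print(decode_words(COMPILER_WORDS))
print(decode_words(CAPABILITY_WORDS))
```

Under that assumption the strings decode to a gcc 4.1.1 (Red Hat 4.1.1-52)
build and to the capabilities "xen-3.0-x86_64 xen-3.0-x86_32p", which would
be consistent with a 64-bit hypervisor hosting a 32-bit PAE guest and with
the ELFCLASS64 / EM_386 combination in the ELF header.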

Comment 1 Joe Jin 2009-01-06 05:45:54 UTC
BTW: If the guest OS is working fine and the core is dumped live, crash works fine with the vmcore.

Comment 2 Dave Anderson 2009-01-06 13:42:38 UTC
Can you please make the vmcore and vmlinux available to download?

(There's not much I can do without them.)

Comment 3 Joe Jin 2009-01-08 00:53:44 UTC
Unfortunately, our policy does not allow me to provide the vmcore and vmlinux,
but you can get the crash info from bug 249867.

The guest OS kernel crashed during a memory operation, in xen_destroy_contiguous_region().

In the el4u7 kernel, destroying a region takes two steps:
1. Zap current PTEs, giving away the underlying pages.
2. Map new pages in place of old pages.

In this case the kernel crashed at step 2, which means the hypervisor had
unmapped some pages but had not yet mapped new pages in their place.
My guess is that the failure at step 2 left the cr3 page outside the guest OS memory context.

Comment 4 Joe Jin 2009-01-08 02:11:45 UTC
I have reproduced it as follows:

At step 2, before the call to HYPERVISOR_memory_op(), call BUG() directly and then create the vmcore.
The crash utility then reports "crash: cannot read/find cr3 page".

Comment 5 Joe Jin 2009-01-08 03:04:39 UTC
Update:

If a panic is triggered before step 1, the vmcore works fine.
If a panic is triggered after step 1 and before step 2, the vmcore does not work.

I think the root cause is that the pages have been detached from the guest but the new mapping has not been built yet.

A question: if ctrlreg[3] -> cr3 cannot be resolved, crash will not work, right?

Thanks,
Joe

Comment 6 Dave Anderson 2009-01-08 13:47:15 UTC
> A question: if ctrlreg[3] -> cr3 cannot be resolved, crash will not work, right?

Unfortunately that is correct -- all reads from the vmcore are based upon
being able to translate kernel-virtual-addresses to pseudo-physical-addresses
to machine-addresses.  Without the top of the "page directory tree" cr3 page,
the translation is impossible.
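
The first step of that translation, which is the one that fails in the
description, can be sketched in a few lines of Python. This is an
illustration under stated assumptions, not crash's actual code: the cr3
decoding follows the packing Xen uses for 32-bit PAE guests (the frame
number rotated left by 12 bits so that frames above 4GB still fit in a
32-bit register), the page index is assumed to be a list of (pfn, mfn)
pairs like the 16-byte entries of the .xen_p2m section, and the index
contents below are made up.

```python
def xen_cr3_to_mfn_pae(cr3):
    """Decode a 32-bit PAE guest cr3 value into a machine frame number."""
    cr3 &= 0xFFFFFFFF
    return ((cr3 >> 12) | (cr3 << 20)) & 0xFFFFFFFF


def find_pfn_for_mfn(page_index, mfn):
    """Return the pfn whose entry maps to mfn, or None if the dump lacks it."""
    for pfn, entry_mfn in page_index:
        if entry_mfn == mfn:
            return pfn
    return None


# ctrlreg[3] as printed in the report.
cr3_mfn = xen_cr3_to_mfn_pae(0xFFB002)

# Hypothetical index: made-up entries that deliberately lack the cr3 frame,
# mimicking a dump taken after the page was given away.
toy_index = [(0x0, 0x1F4A00), (0x1, 0x1F4A01), (0x2, 0x1F4A02)]

message = None
if find_pfn_for_mfn(toy_index, cr3_mfn) is None:
    message = "cannot find mfn %d (0x%x) in page index" % (cr3_mfn, cr3_mfn)
print(message)
```

Decoding ffb002 this way gives mfn 200ffb, matching the "ctrlreg[3]: ffb002
-> mfn: 200ffb" line in the report. If that frame has no entry in the dump's
page index, there is no pfn to read the top-level page table from, and every
later virtual-to-physical lookup has nothing to start with.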

Comment 7 Dave Anderson 2009-01-08 17:15:26 UTC
And BTW, please give my regards to Deepak...

