Bug 593285 - crash unable to open core created via "virsh dump" of KVM guest
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: crash
Version: 6.0
Hardware: All Linux
Priority: low, Severity: medium
Target Milestone: rc
Assigned To: Dave Anderson
Reported: 2010-05-18 08:36 EDT by Jeff Layton
Modified: 2014-06-18 03:40 EDT
Fixed In Version: crash-5.0.0-15.el6
Doc Type: Bug Fix
Last Closed: 2010-11-10 15:03:40 EST


Description Jeff Layton 2010-05-18 08:36:27 EDT
I've got a core dump that I created using "virsh dump" on a KVM guest:

# file /tmp/rhel6.core 
/tmp/rhel6.core: QEMU's suspend to disk image

...the kernel that the core is from is:

# uname -r
2.6.32-26.el6.jtltest.005.x86_64.debug

...when I try to open that file with crash, I get these errors. The vmlinux is the correct one for the core:

---------------[snip]----------------
# crash /usr/lib/debug/lib/modules/2.6.32-26.el6.jtltest.005.x86_64.debug/vmlinux /tmp/rhel6.core 

crash 5.0.3
Copyright (C) 2002-2010  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
GNU gdb (GDB) 7.0                               
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

crash: /tmp/rhel6.core: fwrite: No such file or directory
crash: read error: kernel virtual address: ffffff81905bc0ff  type: "possible"
WARNING: cannot read cpu_possible_map
crash: /tmp/rhel6.core: fwrite: No such file or directory
crash: read error: kernel virtual address: ffffff81905fc0ff  type: "present"
WARNING: cannot read cpu_present_map
crash: /tmp/rhel6.core: fwrite: No such file or directory
crash: read error: kernel virtual address: ffffff81905dc0ff  type: "online"
WARNING: cannot read cpu_online_map
WARNING: cannot read linux_banner string
crash: /usr/lib/debug/lib/modules/2 and /tmp/rhel6.core do not match!

Usage:
  crash [-h [opt]][-v][-s][-i file][-d num] [-S] [mapfile] [namelist] [dumpfile]

Enter "crash -h" for details.

---------------[snip]----------------

...the above was with crash 5.0.3 (built using the SRPM on Dave A's people page), but I get the same results with the 5.0.0 version that's in the latest RHEL6 repo. The kernel here is a test kernel that is based on 2.6.32-26.el6, mostly with some NFS and CIFS patches layered on it (nothing that really touches the core vmlinux at all).
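A side note on the "file" output in the description: it reports "QEMU's suspend to disk image" because the dump starts with the QEMU savevm magic number. The sketch below shows one way to make the same check by hand. It assumes the magic is the big-endian 32-bit value 0x5145564d (the ASCII bytes "QEVM"); has_qemu_vm_magic() is an illustrative helper, not a function from crash or QEMU.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed savevm magic: the bytes 'Q' 'E' 'V' 'M', stored big-endian. */
#define QEMU_VM_FILE_MAGIC 0x5145564dU

/*
 * Hypothetical helper: returns 1 if the stream starts with the magic,
 * 0 if it does not, and -1 if fewer than four bytes could be read.
 */
int has_qemu_vm_magic(FILE *fp)
{
    unsigned char b[4];
    uint32_t magic;

    assert(fp != NULL);
    rewind(fp);
    if (fread(b, 1, sizeof b, fp) != sizeof b)
        return -1;

    /* Assemble the value byte by byte so host endianness does not matter. */
    magic = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
            ((uint32_t)b[2] << 8) | (uint32_t)b[3];
    return magic == QEMU_VM_FILE_MAGIC;
}
```

A check like this only says what kind of file it is; it says nothing about whether the rest of the image can be parsed.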
Comment 2 Dave Anderson 2010-05-18 08:52:30 EDT
Can I get the vmlinux and rhel6.core pair please?
Comment 5 Dave Anderson 2010-05-18 09:14:04 EDT
Thanks Jeff, I'm copying them now. 

Also it would be helpful if you could run crash live on the guest 
(running the same kernel), and then post the output of this command:

 crash> help -m | grep phys_base

Also, this is of interest:

> crash: /tmp/rhel6.core: fwrite: No such file or directory

The fwrite() is actually writing to a temporary file created with
tmpfile(), and should never fail.  Is it possible that the filesystem
containing /tmp is full?
Comment 6 Jeff Layton 2010-05-18 09:42:36 EDT
Ok, it fired up just fine on the live guest:

crash> help -m | grep phys_base  
                phys_base: 0


.../tmp is just a subdir of root on this guest and has tons of free space:

# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/VolGroup-lv_root
                      28423176   5958696  21020640  23% /

...permissions should be ok:
# stat /tmp
  File: `/tmp'
  Size: 4096      	Blocks: 8          IO Block: 4096   directory
Device: fd00h/64768d	Inode: 915713      Links: 3
Access: (1777/drwxrwxrwt)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2010-05-18 09:36:48.925417562 -0400
Modify: 2010-05-18 09:37:10.724391881 -0400
Change: 2010-05-18 09:37:10.724391881 -0400

...I also put selinux in permissive mode temporarily to see if that was affecting it, but it didn't behave any differently.
Comment 7 Dave Anderson 2010-05-18 09:59:12 EDT
Hmmm, works for me using my 5.0.3-plus-a-bunch-of-patches build, but many of
the patches are KVM updates:

$ /var/CVS/crash/crash burn/2.6.32-26.el6.jtltest.005_virsh_dump/rhel6.core  burn/2.6.32-26.el6.jtltest.005_virsh_dump/vmlinux

crash 5.0.3p11
Copyright (C) 2002-2010  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
GNU gdb (GDB) 7.0                               
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: burn/2.6.32-26.el6.jtltest.005_virsh_dump/vmlinux
    DUMPFILE: burn/2.6.32-26.el6.jtltest.005_virsh_dump/rhel6.core
        CPUS: 4
        DATE: Tue May 18 07:51:25 2010
      UPTIME: 00:21:54
LOAD AVERAGE: 0.00, 0.02, 0.00
       TASKS: 133
    NODENAME: dhcp231-227.rdu.redhat.com
     RELEASE: 2.6.32-26.el6.jtltest.005.x86_64.debug
     VERSION: #1 SMP Mon May 17 11:05:55 EDT 2010
     MACHINE: x86_64  (2992 Mhz)
      MEMORY: 1 GB
       PANIC: ""
         PID: 0
     COMMAND: "swapper"
        TASK: ffffffff817829e0  (1 of 4)  [THREAD_INFO: ffffffff8175e000]
         CPU: 0
       STATE: TASK_RUNNING (ACTIVE)
     WARNING: panic task not found

crash> 

Can you try this with your version of crash:

 # crash --machdep phys_base=0 vmlinux rhel6.core
Comment 8 Dave Anderson 2010-05-18 10:12:50 EDT
> Can you try this with your version of crash:
>
> # crash --machdep phys_base=0 vmlinux rhel6.core 

I tried it with crash-5.0.3, and using the --machdep argument 
works for me:

$ crash burn/2.6.32-26.el6.jtltest.005_virsh_dump/rhel6.core  burn/2.6.32-26.el6.jtltest.005_virsh_dump/vmlinux --machdep phys_base=0

crash 5.0.3
Copyright (C) 2002-2010  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
NOTE: setting phys_base to: 0x0                 

GNU gdb (GDB) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: burn/2.6.32-26.el6.jtltest.005_virsh_dump/vmlinux
    DUMPFILE: burn/2.6.32-26.el6.jtltest.005_virsh_dump/rhel6.core
        CPUS: 4
        DATE: Tue May 18 07:51:25 2010
      UPTIME: 00:21:54
LOAD AVERAGE: 0.00, 0.02, 0.00
       TASKS: 133
    NODENAME: dhcp231-227.rdu.redhat.com
     RELEASE: 2.6.32-26.el6.jtltest.005.x86_64.debug
     VERSION: #1 SMP Mon May 17 11:05:55 EDT 2010
     MACHINE: x86_64  (2992 Mhz)
      MEMORY: 1 GB
       PANIC: ""
         PID: 0
     COMMAND: "swapper"
        TASK: ffffffff817829e0  (1 of 4)  [THREAD_INFO: ffffffff8175e000]
         CPU: 0
       STATE: TASK_RUNNING (ACTIVE)
     WARNING: panic task not found

crash> 

I *think* this workaround should also work with the RHEL6 version
of crash, but I don't have a RHEL6 x86_64 on-hand at the moment,
and I can't test it just yet.
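
For readers wondering why a single phys_base value decides whether the session comes up: on x86_64, an address in the kernel text mapping is translated to a physical address by subtracting the base of that mapping and adding phys_base. The sketch below illustrates only that arithmetic. It assumes the standard mapping base of 0xffffffff80000000; ktext_vtop() and START_KERNEL_MAP are illustrative names rather than crash's own code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed base of the x86_64 kernel text mapping on standard kernels. */
#define START_KERNEL_MAP 0xffffffff80000000ULL

/*
 * Hypothetical helper: translate a kernel-text virtual address to a
 * physical address for a given phys_base.
 */
uint64_t ktext_vtop(uint64_t vaddr, uint64_t phys_base)
{
    /* Only addresses inside the kernel text mapping are handled here. */
    assert(vaddr >= START_KERNEL_MAP);
    return vaddr - START_KERNEL_MAP + phys_base;
}
```

With phys_base=0, the swapper task address ffffffff817829e0 from the banner above comes out as physical 0x17829e0. Any other phys_base shifts every such lookup by the same amount, so reads would land on the wrong pages of the dumpfile or on no page at all.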
Comment 9 Jeff Layton 2010-05-18 10:17:06 EDT
> # crash --machdep phys_base=0 vmlinux rhel6.core 

Yep, works for me too. Thanks for the workaround!
Comment 10 Dave Anderson 2010-05-18 10:29:35 EDT
OK, I know what the fix is for the problem that the --machdep workaround
addresses.

The original failure to write to the tmpfile() is perplexing, though.
It's creating a temporary file that maps kernel-memory-to-dumpfile-offset
because KVM saveVM files are not designed to be "random-access", so the
whole damn file needs to be "scanned" during initialization to find out
where things are.  Anyway, if the creation of that tmpfile fails, the crash
session probably ought to be killed immediately.  But I never figured it
would ever fail, and I currently just let it print an error message and
continue.

So I'm curious -- if you reproduce the original issue, i.e., where your
rhel6.core file is located in /tmp, do you still see the "fwrite" error
if you do *not* use the --machdep workaround?
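
The "kill the session immediately" idea from this comment could look roughly like the sketch below: create the scratch map with tmpfile(), stop at once if that fails, and record one dumpfile offset per guest page while scanning. This is not crash's code. The function names, the MAP_SLOT() layout (one off_t per 4 KiB page) and the error handling are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Assumed layout: one off_t slot per 4 KiB guest page. */
#define MAP_SLOT(physaddr) ((long)(((physaddr) >> 12) * sizeof(off_t)))

/*
 * Hypothetical helper: create the scratch map. Every later read depends
 * on it, so exit right away instead of printing a message and carrying on.
 */
FILE *create_memfile_or_die(const char *dumpfile)
{
    FILE *mem = tmpfile();

    if (mem == NULL) {
        fprintf(stderr, "%s: tmpfile: %s\n", dumpfile, strerror(errno));
        exit(EXIT_FAILURE);
    }
    return mem;
}

/*
 * Hypothetical helper: record where the page containing physaddr lives
 * in the dumpfile. Returns 0 on success, -1 on failure.
 */
int store_page_offset(FILE *mem, const char *dumpfile,
                      uint64_t physaddr, off_t dump_offset)
{
    assert(mem != NULL);

    if (fseek(mem, MAP_SLOT(physaddr), SEEK_SET) < 0) {
        fprintf(stderr, "%s: fseek: %s\n", dumpfile, strerror(errno));
        return -1;
    }
    if (fwrite(&dump_offset, sizeof(off_t), 1, mem) != 1) {
        fprintf(stderr, "%s: fwrite: %s\n", dumpfile, strerror(errno));
        return -1;
    }
    return 0;
}
```

Exiting inside the helper keeps the callers simple; a caller that needs to clean up first would want an error return instead.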
Comment 11 RHEL Product and Program Management 2010-05-18 10:35:13 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.
Comment 12 Jeff Layton 2010-05-18 10:39:48 EDT
Actually, when I tested it, I used my original command and just added the option:

# crash --machdep phys_base=0 /usr/lib/debug/lib/modules/2.6.32-26.el6.jtltest.005.x86_64.debug/vmlinux /tmp/rhel6.core 

...that's what worked fine.
Comment 13 Dave Anderson 2010-05-18 10:50:49 EDT
Right -- but can you try it again *without* the --machdep workaround to see
if you can reproduce this again: 

> crash: /tmp/rhel6.core: fwrite: No such file or directory
Comment 14 Dave Anderson 2010-05-18 12:00:49 EDT
NEEDINFO re: comment #13
Comment 15 Jeff Layton 2010-05-18 12:13:13 EDT
Oh yes:

# crash /usr/lib/debug/lib/modules/2.6.32-26.el6.jtltest.005.x86_64.debug/vmlinux /tmp/rhel6.core

...
crash: /tmp/rhel6.core: fwrite: No such file or directory
crash: read error: kernel virtual address: ffffff81905bc0ff  type: "possible"
WARNING: cannot read cpu_possible_map
crash: /tmp/rhel6.core: fwrite: No such file or directory
crash: read error: kernel virtual address: ffffff81905fc0ff  type: "present"
WARNING: cannot read cpu_present_map
crash: /tmp/rhel6.core: fwrite: No such file or directory
crash: read error: kernel virtual address: ffffff81905dc0ff  type: "online"
WARNING: cannot read cpu_online_map
WARNING: cannot read linux_banner string
crash: /usr/lib/debug/lib/modules/2 and /tmp/rhel6.core do not match!
...

...but this is pretty much the same thing I did when I originally reported it. I'm not doing anything different, so maybe I'm missing the point of what you're asking?
Comment 16 Dave Anderson 2010-05-18 13:48:21 EDT
> ...but this is pretty much the same thing I did when I originally
> reported it.  I'm not doing anything different, so maybe I'm missing
> the point of what you're asking?

Because I couldn't understand the "fwrite" errors on the tmpfile()-
generated file...

But now I see that it's a cut-and-paste error, which I have also
subsequently fixed.

I have two functions that deal with the tmpfile() memory map, a "store"
function that's only used during that "scanning..." part early on, and
then a "load" function that's used during every read of the dumpfile.
And the "fwrite" error message that's in the "load" function was 
cut-and-pasted from the "store" function:

int
load_memfile_offset(uint64_t physaddr, off_t *entry_ptr)
{
        if (fseek(kvm->mem, MEMFILE_OFFSET(physaddr), SEEK_SET) < 0) {
                error(INFO, "%s: fseek: %s\n", pc->dumpfile, strerror(errno));
                return SEEK_ERROR;
        }

        if (fread((entry_ptr), sizeof(off_t), 1, kvm->mem) != 1) {
                error(INFO, "%s: fwrite: %s\n", pc->dumpfile, strerror(errno));
                return READ_ERROR;
        }

        return 0;
}

So it's actually failing a read, probably due to a miscalculation of
a physical address based upon the faulty "phys_base" that's being used.
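
For comparison, a corrected "load" function might look like the sketch below. It is a standalone approximation, not the actual patch: the stream and dumpfile name are passed as parameters instead of coming from kvm->mem and pc->dumpfile, error() is replaced by fprintf(), and the MEMFILE_OFFSET() layout and return codes are stand-ins. It also separates end-of-file from a real I/O error, because fread() hitting EOF does not set errno, which is probably why the original message showed an unrelated "No such file or directory".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Stand-ins for crash internals; the values and layout are assumptions. */
#define SEEK_ERROR (-1)
#define READ_ERROR (-2)
#define MEMFILE_OFFSET(addr) ((long)(((addr) >> 12) * sizeof(off_t)))

/*
 * Hypothetical rewrite of the "load" side: look up the dumpfile offset
 * recorded for the page that contains physaddr.
 */
int load_memfile_offset_sketch(FILE *mem, const char *dumpfile,
                               uint64_t physaddr, off_t *entry_ptr)
{
    assert(mem != NULL && entry_ptr != NULL);

    if (fseek(mem, MEMFILE_OFFSET(physaddr), SEEK_SET) < 0) {
        fprintf(stderr, "%s: fseek: %s\n", dumpfile, strerror(errno));
        return SEEK_ERROR;
    }

    if (fread(entry_ptr, sizeof(off_t), 1, mem) != 1) {
        /* Name the call that failed; at EOF errno carries no information. */
        fprintf(stderr, "%s: fread: %s\n", dumpfile,
                feof(mem) ? "offset is past the end of the memory map"
                          : strerror(errno));
        return READ_ERROR;
    }

    return 0;
}
```

With a message like this, a lookup for a physical address that was never stored shows up as a failed read of the memory map, which points more directly at a bad address calculation.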

So it's a false alarm on my part -- please forgive me...

Thanks Jeff
Comment 17 Jeff Layton 2010-05-18 15:33:57 EDT
No worries. Glad you found the bug.
Comment 19 Dave Anderson 2010-05-21 15:39:39 EDT
QA assist:

Using the supplied dumpfile, crash 5.0.0-14.el6 fails to initialize:

  # crash vmlinux rhel6.core
  
  crash 5.0.0-14.el6
  Copyright (C) 2002-2010  Red Hat, Inc.
  Copyright (C) 2004, 2005, 2006  IBM Corporation
  Copyright (C) 1999-2006  Hewlett-Packard Co
  Copyright (C) 2005, 2006  Fujitsu Limited
  Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
  Copyright (C) 2005  NEC Corporation
  Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
  Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
  This program is free software, covered by the GNU General Public License,
  and you are welcome to change it and/or distribute copies of it under
  certain conditions.  Enter "help copying" to see the conditions.
  This program has absolutely no warranty.  Enter "help warranty" for details.
   
  GNU gdb (GDB) 7.0                               
  Copyright (C) 2009 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-unknown-linux-gnu"...
  
  crash: rhel6.core: fwrite: No such file or directory
  crash: read error: kernel virtual address: ffffff81905bc0ff  type: "possible"
  WARNING: cannot read cpu_possible_map
  crash: rhel6.core: fwrite: No such file or directory
  crash: read error: kernel virtual address: ffffff81905fc0ff  type: "present"
  WARNING: cannot read cpu_present_map
  crash: rhel6.core: fwrite: No such file or directory
  crash: read error: kernel virtual address: ffffff81905dc0ff  type: "online"
  WARNING: cannot read cpu_online_map
  WARNING: cannot read linux_banner string
  crash: vmlinux and rhel6.core do not match!
  
  Usage:
    crash [-h [opt]][-v][-s][-i file][-d num] [-S] [mapfile] [namelist] [dumpfile]
  
  Enter "crash -h" for details.
  #
  
Crash version 5.0.0-15.el6 comes up as expected:
  
  # crash vmlinux rhel6.core 
  
  crash 5.0.0-15.el6
  Copyright (C) 2002-2010  Red Hat, Inc.
  Copyright (C) 2004, 2005, 2006  IBM Corporation
  Copyright (C) 1999-2006  Hewlett-Packard Co
  Copyright (C) 2005, 2006  Fujitsu Limited
  Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
  Copyright (C) 2005  NEC Corporation
  Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
  Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
  This program is free software, covered by the GNU General Public License,
  and you are welcome to change it and/or distribute copies of it under
  certain conditions.  Enter "help copying" to see the conditions.
  This program has absolutely no warranty.  Enter "help warranty" for details.
   
  GNU gdb (GDB) 7.0                               
  Copyright (C) 2009 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-unknown-linux-gnu"...
  
        KERNEL: vmlinux                           
      DUMPFILE: rhel6.core
          CPUS: 4
          DATE: Tue May 18 07:51:25 2010
        UPTIME: 00:21:54
  LOAD AVERAGE: 0.00, 0.02, 0.00
         TASKS: 133
      NODENAME: dhcp231-227.rdu.redhat.com
       RELEASE: 2.6.32-26.el6.jtltest.005.x86_64.debug
       VERSION: #1 SMP Mon May 17 11:05:55 EDT 2010
       MACHINE: x86_64  (2992 Mhz)
        MEMORY: 1 GB
         PANIC: ""
           PID: 0
       COMMAND: "swapper"
          TASK: ffffffff817829e0  (1 of 4)  [THREAD_INFO: ffffffff8175e000]
           CPU: 0
         STATE: TASK_RUNNING (ACTIVE)
       WARNING: panic task not found
  
  crash>
Comment 21 Han Pingtian 2010-05-28 06:14:56 EDT
I hit an 'initialization failed' error on a virsh-dumped KVM guest core, which comes from kernel 2.6.32-30.el6.x86_64:
[root@rhel6 images]# crash -d 99 /usr/lib/debug/lib/modules/2.6.32-30.el6.x86_64/vmlinux rhel6.core 

crash 5.0.0-16.el6
Copyright (C) 2002-2010  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
rhel6.core: QEMU_VM_FILE_MAGIC
please wait... (scanning KVM dumpfile)
                                                
crash: rhel6.core: initialization failed

This is maybe a libvirt bug?
Comment 22 Han Pingtian 2010-05-28 06:47:41 EDT
The core is here: http://lacrosse.corp.redhat.com/~phan/kvm_guest_core/rhel6.core
Comment 23 Dave Anderson 2010-06-01 09:11:31 EDT
I cannot access the dumpfile:

# wget http://lacrosse.corp.redhat.com/~phan/kvm_guest_core/rhel6.core 
--09:06:55--  http://lacrosse.corp.redhat.com/~phan/kvm_guest_core/rhel6.core
Resolving lacrosse.corp.redhat.com... 10.11.255.147
Connecting to lacrosse.corp.redhat.com|10.11.255.147|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
09:06:55 ERROR 403: Forbidden.
Comment 24 Dave Anderson 2010-06-02 10:51:42 EDT
(In reply to comment #21)
> I hit an 'initialization failed' error on a virsh-dumped KVM guest core, which
> comes from kernel 2.6.32-30.el6.x86_64:
> [root@rhel6 images]# crash -d 99
> /usr/lib/debug/lib/modules/2.6.32-30.el6.x86_64/vmlinux rhel6.core 
> 
> crash 5.0.0-16.el6
> Copyright (C) 2002-2010  Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006  IBM Corporation
> Copyright (C) 1999-2006  Hewlett-Packard Co
> Copyright (C) 2005, 2006  Fujitsu Limited
> Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
> Copyright (C) 2005  NEC Corporation
> Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions.  Enter "help copying" to see the conditions.
> This program has absolutely no warranty.  Enter "help warranty" for details.
> 
> rhel6.core: QEMU_VM_FILE_MAGIC
> please wait... (scanning KVM dumpfile)
> 
> crash: rhel6.core: initialization failed
> 
> This is maybe a libvirt bug?    

This is a new issue, different from the one that the 5.0.0-15.el6 fix
for this BZ addressed.  The new issue is being tracked here:
 
 Bug 597187 - guest core by virsh dump cannot be analysed by crash:
              initialization failed 
 https://bugzilla.redhat.com/show_bug.cgi?id=597187
Comment 25 releng-rhel@redhat.com 2010-11-10 15:03:40 EST
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.
