Bug 715070

Summary: crash: RHEL5 "virsh dump" support fails if guest has >= 3GB memory
Product: Red Hat Enterprise Linux 5 Reporter: Dave Anderson <anderson>
Component: crash Assignee: Dave Anderson <anderson>
Status: CLOSED ERRATA QA Contact: Han Pingtian <phan>
Severity: high Docs Contact:
Priority: high    
Version: 5.8 CC: caiqian, dwu, phan
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: crash-5.1.8-1.el5 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-02-21 00:52:56 EST Type: ---

Description Dave Anderson 2011-06-21 15:06:29 EDT
Description of problem:

The crash utility does not correctly support dumpfiles
created by "virsh dump" on a RHEL5 host if the guest
kernel was configured with 3GB or more of physical memory.

The issue is the I/O hole that is created for the guest.
When a guest kernel is run on a RHEL5 host, the I/O hole
is located between 0xc0000000 and 0x100000000 (3gb - 4gb).
When a guest kernel is run on a RHEL6 host, the I/O hole
is located between 0xe0000000 and 0x100000000 (3.5gb - 4gb).  
When support was originally added to the crash utility 
for "virsh dump" dumpfiles, it was done presuming the 
RHEL6 512mb I/O hole.

As a result, if a guest kernel on a RHEL5 host contains
more than 3gb of physical memory, all memory above that
threshold gets incorrectly accessed from the dumpfile,
causing the crash utility to spew numerous error messages
and then fail during initialization.
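
The effect of assuming the wrong I/O hole can be sketched as follows.
This is only an illustration, not the crash utility's code: it assumes
the guest RAM is stored as one contiguous image in which memory above
4GB follows directly after the memory below the hole. The function
name ram_offset() and the constants are hypothetical.

```python
GB = 1 << 30
FOUR_GB = 4 * GB

# I/O hole start addresses taken from the description above.
RHEL5_HOLE_START = 0xC0000000  # hole covers 3GB - 4GB (1GB)
RHEL6_HOLE_START = 0xE0000000  # hole covers 3.5GB - 4GB (512MB)


def ram_offset(paddr, hole_start):
    """Translate a guest physical address into an offset within a
    contiguous RAM image, given an I/O hole of [hole_start, 4GB).

    Assumption: RAM above 4GB is stored immediately after the RAM
    that sits below the hole, so it is shifted down by the hole size.
    """
    if paddr < hole_start:
        return paddr
    if paddr < FOUR_GB:
        raise ValueError("address 0x%x lies inside the I/O hole" % paddr)
    return paddr - (FOUR_GB - hole_start)


if __name__ == "__main__":
    paddr = FOUR_GB  # first byte of guest memory above the hole
    right = ram_offset(paddr, RHEL5_HOLE_START)
    wrong = ram_offset(paddr, RHEL6_HOLE_START)
    print("correct offset (1GB hole):   0x%x" % right)
    print("assumed offset (512MB hole): 0x%x" % wrong)
    print("error: %d MB" % ((wrong - right) // (1 << 20)))
```

Under these assumptions, every address above 4GB is resolved 512MB too
far into the RAM image when the RHEL6 hole is presumed for a dumpfile
created on a RHEL5 host, and the topmost addresses land beyond the end
of the image altogether.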

Version-Release number of selected component (if applicable):

crash-4.1.2-8.el5
Any guest kernel with 3gb or more of memory run on a RHEL5 KVM host.

How reproducible:
Always.

Steps to Reproduce:
1. Create a guest with 4GB of memory on a RHEL5 KVM host
2. virsh dump the guest
3. run crash against the guest dumpfile
  
Actual results:

The original report involved a RHEL4 guest with 4GB of memory
running on a RHEL5 host.  When its dumpfile is analyzed with the
RHEL5 version of the crash utility, the session fails like so:

$ crash vmlinux 390745-rh4-malongo.dmp

crash 4.1.2-8.el5
Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
GNU gdb 6.1                                     
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

please wait... (gathering kmem slab cache data)
crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 10137a44980  type: "kmem_cache buffer"

crash: unable to initialize kmem slab cache subsystem


crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: load_memfile_offset: read: Inappropriate ioctl for device
please wait... (gathering module symbol data)
crash: load_memfile_offset: read: Inappropriate ioctl for device

WARNING: cannot access vmalloc'd module memory

please wait... (gathering task table data)
crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: cannot read pid_hash node

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 1013adc4000  type: "fill_thread_info"

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 1013add8000  type: "fill_thread_info"

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 101382ea000  type: "fill_thread_info"

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 1013add2000  type: "fill_thread_info"

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 1013adce000  type: "fill_thread_info"

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 101367ac000  type: "fill_thread_info"

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 1013a2d0000  type: "fill_thread_info"

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 101371ba000  type: "fill_thread_info"

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 101395ae000  type: "fill_thread_info"

crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 1013add0000  type: "fill_thread_info"
please wait... (determining panic task)         
WARNING: active task 101398c0030 on cpu 2 not found in PID hash


crash: load_memfile_offset: read: Inappropriate ioctl for device

crash: read error: kernel virtual address: 101398c0030  type: "fill_task_struct"

crash: task does not exist: 101398c0030

$ 

Expected results:

The fix was introduced upstream in crash version 5.1.3:

 - Fix to more correctly determine the KVM I/O hole size and location.
   The I/O hole size to this point in time is either 1GB or 512MB, but
   its setting is hardwired into the Qemu code that was used to create 
   the dumpfile.  The dumpfile is a "savevm" file that is designed to be
   used for guest migration, and since inter-version save/load is not 
   supported, the I/O hole information does not have to be encoded into
   the dumpfile.  Without the patch, the I/O hole for dumpfiles created by
   older Qemu versions was not being set to 1GB, so if the KVM guest was
   configured with more than 3GB of memory, the crash session would 
   typically display numerous "read error" messages during session 
   initialization.
   (anderson@redhat.com)
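
To illustrate why only guests with more than 3GB are affected, here is
a rough sketch of the guest physical memory layout for the two hole
sizes named in the changelog entry. It is not taken from the crash or
Qemu sources: guest_ram_ranges() is a hypothetical helper, and it
assumes that any RAM which does not fit below the hole is placed
starting at the 4GB boundary.

```python
GB = 1 << 30
FOUR_GB = 4 * GB


def guest_ram_ranges(ram_size, hole_size):
    """Return the guest physical [start, end) ranges that hold RAM.

    Assumption: the I/O hole ends at 4GB, and RAM that does not fit
    below the hole is relocated to start at the 4GB boundary.
    """
    hole_start = FOUR_GB - hole_size
    if ram_size <= hole_start:
        return [(0, ram_size)]
    above = ram_size - hole_start
    return [(0, hole_start), (FOUR_GB, FOUR_GB + above)]


if __name__ == "__main__":
    for ram in (3 * GB, 4 * GB):
        for hole in (1 * GB, GB // 2):
            ranges = guest_ram_ranges(ram, hole)
            text = ", ".join("0x%x-0x%x" % r for r in ranges)
            print("%dGB guest, %4dMB hole: %s"
                  % (ram // GB, hole // (1 << 20), text))
```

With 3GB of RAM both hole sizes yield the same single range, so the
wrong assumption is harmless. With 4GB the two layouts differ, which
is the case in which the read errors appear.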

Here is the output using the same dumpfile with the current upstream
version of crash:

$ crash vmlinux 390745-rh4-malongo.dmp

crash 5.1.6
Copyright (C) 2002-2011  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 
GNU gdb (GDB) 7.0                      
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

      KERNEL: vmlinux                           
    DUMPFILE: 390745-rh4-malongo.dmp
        CPUS: 4
        DATE: Fri Feb 11 14:19:56 2011
      UPTIME: 00:34:04
LOAD AVERAGE: 2.70, 0.97, 0.54
       TASKS: 85
    NODENAME: smart02
     RELEASE: 2.6.9-89.33.1.ELsmp
     VERSION: #1 SMP Mon Nov 15 19:02:30 EST 2010
     MACHINE: x86_64  (2800 Mhz)
      MEMORY: 4 GB
       PANIC: ""
         PID: 0
     COMMAND: "swapper"
        TASK: ffffffff803e2b00  (1 of 4)  [THREAD_INFO: ffffffff80508000]
         CPU: 0
       STATE: TASK_RUNNING 
     WARNING: panic task not found

crash> 

Additional info:
Comment 7 errata-xmlrpc 2012-02-21 00:52:56 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0203.html