Bug 221273 - crash on live RHEL4 x86_64 domU fails to start
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.5
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Assigned To: Chris Lalancette
QA Contact: Brian Brock
Depends On:
Blocks:
Reported: 2007-01-03 09:36 EST by Jeff Layton
Modified: 2014-06-18 03:35 EDT (History)
Fixed In Version: RHBA-2007-0304
Doc Type: Bug Fix
Last Closed: 2007-05-08 00:35:55 EDT

Attachments
Fix for page_is_ram for x86_64 live crash (815 bytes, patch)
2007-01-03 13:19 EST, Chris Lalancette

Description Jeff Layton 2007-01-03 09:36:16 EST
When I run crash on a live x86_64 domU kernel, it fails to start with the error:

crash: read error: kernel virtual address: ffffffff8038db08  type: "phys_to_machine_mapping"

I'm using:

crash-4.0-3.16
kernel-xenU-2.6.9-42.36.EL.TEST.bz184549.3 (patched -42.36 kernel)

On a RHEL5 domU on the same physical machine, it works fine. crash also seems to
work ok on a RHEL4 domU on an i686 machine.
Comment 1 Dave Anderson 2007-01-03 09:49:21 EST
Three questions first:

1. Did this work OK on earlier kernels?

2. Does the same thing happen when you xendump it?

3. What's the output of "crash -d7"?

Comment 2 Jeff Layton 2007-01-03 09:56:34 EST
Here's the crash -d7 output. The other 2 questions I'm not sure of:

1) I haven't tested earlier kernels. Do you have a suggested one I should test?
2) by xendump do you mean an "xm save" or something else?

----------[snip]-----------

# crash -d7

crash 4.0-3.16
Copyright (C) 2002, 2003, 2004, 2005, 2006  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006  Fujitsu Limited
Copyright (C) 2006  VA Linux Systems Japan K.K.
Copyright (C) 2005  NEC Corporation
Copyright (C) 1999, 2002  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.
 

find_booted_kernel: search for [Linux version 2.6.9-42.36.EL.TEST.bz184549.3xenU
(root@dhcp231-89.rdu.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3))
#1 SMP Wed Jan 3 08:23:41 EST 2007
]
mount_points[0]: / (947210)
mount_points[1]: /proc (947230)
mount_points[2]: /dev (947250)
mount_points[3]: / (947270)
mount_points[4]: /dev (947290)
mount_points[5]: /selinux (9472b0)
mount_points[6]: /proc (9472d0)
mount_points[7]: /sys (9472f0)
mount_points[8]: /dev/pts (947310)
mount_points[9]: /dev/shm (947330)
mount_points[10]: /proc/sys/fs/binfmt_misc (947350)
mount_points[11]: /var/lib/nfs/rpc_pipefs (947390)
searchdirs[8]: /usr/lib/debug/lib/modules/2.6.9-42.36.EL.TEST.bz184549.3xenU/
searchdirs[0]: /usr/src/linux/
searchdirs[1]: /boot/
searchdirs[2]: /boot/efi/redhat
searchdirs[3]: /boot/efi/EFI/redhat
searchdirs[4]: /
searchdirs[5]: /usr/src/debug/
searchdirs[6]: /usr/src/redhat/BUILD/kernel-2.6.9/linux/
searchdirs[7]: /usr/src/redhat/BUILD/kernel-2.6.9/linux-2.6.9/
find_booted_kernel: check:
/usr/lib/debug/lib/modules/2.6.9-42.36.EL.TEST.bz184549.3xenU/vmlinux
find_booted_kernel: found:
/usr/lib/debug/lib/modules/2.6.9-42.36.EL.TEST.bz184549.3xenU/vmlinux
get_live_memory_source: /dev/crash
/sbin/modprobe crash
"crash" module loaded: [crash][7745][0]
/proc/misc: 62 crash => 10/62
/proc/version:
Linux version 2.6.9-42.36.EL.TEST.bz184549.3xenU
(root@dhcp231-89.rdu.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3))
#1 SMP Wed Jan 3 08:23:41 EST 2007
/usr/lib/debug/lib/modules/2.6.9-42.36.EL.TEST.bz184549.3xenU/vmlinux:
Linux version 2.6.9-42.36.EL.TEST.bz184549.3xenU
(root@dhcp231-89.rdu.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3))
#1 SMP Wed Jan 3 08:23:41 EST 2007
gdb /usr/lib/debug/lib/modules/2.6.9-42.36.EL.TEST.bz184549.3xenU/vmlinux 
GNU gdb 6.1
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

<readmem: ffffffff8038db08, KVADDR, "phys_to_machine_mapping", 8, (FOE), 8f12d8>
crash: read error: kernel virtual address: ffffffff8038db08  type: "phys_to_machine_mapping"
"crash" module loaded: [crash][7745][0]
/sbin/rmmod crash

Comment 3 Dave Anderson 2007-01-03 10:11:46 EST
> 1) I haven't tested earlier kernels. Do you have a suggested one I should test?

Apparently crash was never tested on live x86_64 domU kernels, only
on xendump and "xm save" dumpfiles.  So I'm presuming that it doesn't
work with any "stock" RHEL4 x86_64 domU kernel?  

> 2) by xendump do you mean an "xm save" or something else?

Alt-sysrq-c from the domU or "xm dump-core" from the dom0.

Comment 4 Jeff Layton 2007-01-03 10:24:33 EST
If I do an "xm dump-core" and then run crash against the corefile with the same
debuginfo it works fine.

The patches in my kernel are all NFS-related, so I doubt they are affecting
this, but I'll test with a stock -42.36 kernel and report back.
Comment 5 Jeff Layton 2007-01-03 10:32:35 EST
I get the same results with a stock -42.36 kernel as well:

crash: read error: kernel virtual address: ffffffff8038db08  type: "phys_to_machine_mapping"
Comment 6 Dave Anderson 2007-01-03 10:54:58 EST
Ok thanks.  That's the very first read attempted, so
we're not getting too far are we?

It's not entirely clear at this point whether this is a crash
utility or a /dev/crash driver (kernel) issue.

If you do a "dmesg" just after the crash utility attempt fails,
are there any messages from the /dev/crash driver?

Chris, do you have a set-up that I can work/squat on?

 
Comment 7 Jeff Layton 2007-01-03 10:57:07 EST
Yep, there are some messages that go to the ring buffer when it fails:

crash memory driver: version 1.0
crash memory driver: !page_is_ram(pfn: 38d)
Comment 8 Dave Anderson 2007-01-03 11:34:18 EST
Ok good -- this is a RHEL4 x86_64 xen kernel issue with page_is_ram().

In RHEL5, it's defined in arch/x86_64/mm/init-xen.c as:

  static inline int page_is_ram (unsigned long pagenr)
  {
          return 1;
  }
  EXPORT_SYMBOL_GPL(page_is_ram);

Chris is going to address this in his kernel one way or another.

Jeff, you did absolutely confirm that the i686 RHEL4 xenU kernel
runs crash OK on a live system, right?

Comment 9 Jeff Layton 2007-01-03 11:40:38 EST
Yes, it definitely works on i686. Though there, I'm using an older rev of crash:

crash 4.0-3.4

...I'm presuming that doesn't matter though.

Comment 10 Dave Anderson 2007-01-03 11:47:01 EST
It shouldn't -- but the version of crash that I'm currently pushing
through the RHEL4-U5 errata process is 4.0-3.9, which you can grab from:

  porkchop:/mnt/redhat/brewroot/packages/crash/4.0/3.9
Comment 11 Jeff Layton 2007-01-03 12:06:17 EST
i686 still works with 4.0-3.9
Comment 12 Chris Lalancette 2007-01-03 13:19:56 EST
Created attachment 144723
Fix for page_is_ram for x86_64 live crash
Comment 17 Jason Baron 2007-01-10 14:16:04 EST
Committed in stream U5 build 42.40. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/
Comment 18 RHEL Product and Program Management 2007-01-18 10:07:31 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 19 Jay Turner 2007-01-18 15:02:31 EST
QE ack for RHEL4.5.
Comment 22 Red Hat Bugzilla 2007-05-08 00:35:55 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0304.html
